Bug introduced in 11.0.1 or earlier and fixed in 12.0.
MMA 11.0.1.0 (Windows 10, 64-bit)
I see that "JSON" import can be buggy (154245), so my question is simple: can the following behaviour be reproduced and, if so, is it a bug? (Apologies if I've missed a duplicate.)
Import fails as soon as the JSON contains a number longer than 19 digits. This affects different kinds of import, including URLExecute, which is where I first noticed the errors.
1) The following works fine.
str19 = "{\"id\":19,\"number\":1234567890123456789}";
ImportString[str19, #] & /@ {"JSON", "RawJSON"}
{{"id" -> 19, "number" -> 1234567890123456789}, <|"id" -> 19, "number" -> 1234567890123456789|>}
2) With just a 0 appended to the number, so that it is now 20 digits long, the import fails.
str20 = "{\"id\":20,\"number\":12345678901234567890}";
ImportString[str20, #] & /@ {"JSON", "RawJSON"}
Import::fmterr: Cannot import data as JSON format.
Import::mnumber: The value 12345678901234567890 cannot be coerced into a machine number.
Import::jsonhintposandchar: An error occurred near character '}', at line 1:40
{$Failed, $Failed}
Answer
This was a bug, fixed in version 12 (some portion of the fix may have been in later incremental updates of 11.3).
The root of the issue was that these numbers did not pass MachineNumberQ, which in some way broke the JSON parser. I got in touch, and after a little confusion about the JSON spec, which allows numbers of arbitrary precision, it was accepted as a bug. The issue now appears to have been completely fixed in version 12 for JSON, RawJSON and JavaScriptExpression.
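One plausible reading of the mnumber message (an assumption, not something confirmed above) is that the old parser tried to fit every integer into a signed 64-bit machine integer. The Python sketch below only checks the two numbers from the question against that limit; the helper name fits_int64 is made up for illustration.

```python
# Assumption: "machine number" for integers means a signed 64-bit value.
INT64_MAX = 2**63 - 1  # 9223372036854775807, itself a 19-digit number


def fits_int64(n: int) -> bool:
    """Return True if n is representable as a signed 64-bit integer."""
    return -(2**63) <= n <= INT64_MAX


print(fits_int64(1234567890123456789))   # the 19-digit value from case 1
print(fits_int64(12345678901234567890))  # the 20-digit value from case 2
```

If this reading is right, the real cutoff would be 2^63 - 1 rather than a digit count, so 19-digit values above 9223372036854775807 would presumably fail as well. That has not been checked here.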
If you are running an older version, one workaround is to use ExternalEvaluate to parse the JSON in Python or Node.js, or to use J/Link or the like to parse it in some other language. For small files, it may be as simple as using StringReplace to convert the text to Association syntax (that is, <| and |> in place of { and }, and -> in place of :) and then running ToExpression on the resulting string, or using something like TextCases["Number"].
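For the external-parser route, the Python side can be very small, because Python's standard json module reads integers at arbitrary precision. This is a sketch of that side only; the function name parse_json_exact is made up for illustration, and passing the result back through ExternalEvaluate on an affected version is not shown and has not been tested.

```python
import json


def parse_json_exact(text: str):
    """Parse JSON text. Integers keep their full value because
    Python ints are arbitrary precision."""
    return json.loads(text)


# The failing 20-digit example from the question.
str20 = '{"id":20,"number":12345678901234567890}'
result = parse_json_exact(str20)

print(result["number"])                 # prints 12345678901234567890
print(type(result["number"]).__name__)  # prints int
```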