<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<font size="4"><font face="monospace">As you can probably imagine,
I've been thinking about these topics for quite a while, ever
since we started working on records and pattern matching. It
sounds like a lot of your thoughts have followed a similar arc
to ours. <br>
<br>
I'll share with you some of our thoughts, but I can't
engage in a detailed back-and-forth right now -- we have too
many other things going on, and this isn't yet on the front
burner. I think there's a right time for this work, and we're
not quite there yet, but we'll get there soon enough and we'll
pick up the ball again then. <br>
<br>
To the existential question: yes, there should be a simpler,
built-in way to parse JSON. And, as you observe, the railroad
diagram in the JSON spec is a graphical description of an
algebraic data type. One of the great simplifying effects of
having algebraic data types (records + sealed classes) in the
language is that many data modeling problems collapse down to
the point where considerably less creativity is required of an
API. Here's the JSON API one can write after literally only 30
seconds of thought: <br>
<br>
<blockquote type="cite">
<pre>sealed interface JsonValue {
    record JsonString(String s) implements JsonValue { }
    record JsonNumber(double d) implements JsonValue { }
    record JsonNull() implements JsonValue { }
    record JsonBoolean(boolean b) implements JsonValue { }
    record JsonArray(List<JsonValue> values) implements JsonValue { }
    record JsonObject(Map<String, JsonValue> pairs) implements JsonValue { }
}</pre>
</blockquote>
<br>
It matches the JSON spec almost literally, and you can use
pattern matching to parse a document. (OK, there's some tiny
bit of creativity here in that True/False have been collapsed to
a single JsonBoolean type, but you get my point.) <br>
<br>
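A minimal sketch of what that pattern matching might look like, assuming Java 21 record patterns. Only the JsonValue hierarchy comes from the text above; the JsonSum class and its sumNumbers method are hypothetical names used for illustration: <br>
<pre>
```java
import java.util.List;
import java.util.Map;

// Hypothetical helper class; only the JsonValue hierarchy comes from the message above.
final class JsonSum {
    // Adds up every number in a document, however deeply it is nested.
    static double sumNumbers(JsonValue v) {
        // No default branch: JsonValue is sealed, so the compiler checks exhaustiveness.
        return switch (v) {
            case JsonValue.JsonNumber(double d) -> d;
            case JsonValue.JsonArray(var values) ->
                values.stream().mapToDouble(JsonSum::sumNumbers).sum();
            case JsonValue.JsonObject(var pairs) ->
                pairs.values().stream().mapToDouble(JsonSum::sumNumbers).sum();
            case JsonValue.JsonString s -> 0.0;
            case JsonValue.JsonBoolean b -> 0.0;
            case JsonValue.JsonNull n -> 0.0;
        };
    }

    public static void main(String[] args) {
        JsonValue doc = new JsonValue.JsonArray(List.of(
            new JsonValue.JsonNumber(1),
            new JsonValue.JsonObject(Map.of("a", new JsonValue.JsonNumber(2.5))),
            new JsonValue.JsonString("x")));
        System.out.println(sumNumbers(doc));
    }
}

sealed interface JsonValue {
    record JsonString(String s) implements JsonValue { }
    record JsonNumber(double d) implements JsonValue { }
    record JsonNull() implements JsonValue { }
    record JsonBoolean(boolean b) implements JsonValue { }
    record JsonArray(List<JsonValue> values) implements JsonValue { }
    record JsonObject(Map<String, JsonValue> pairs) implements JsonValue { }
}
```
</pre>
Adding a seventh case to the sealed interface would turn every such switch into a compile error until it is handled, which is much of the appeal. <br>
<br>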
But we're not quite ready to put this API into the JDK, because
the language isn't *quite* there yet. Records give you nice
pattern matching, but they come at a cost: they're very specific
and have rigid ideas about initialization, which ripples into a
number of constraints on an implementation (for example, it is
much harder to parse lazily). So we're waiting until we have
deconstruction patterns (next up on the patterns parade) so that
the records above can be interfaces and still support pattern
matching (with more flexibility in implementation, including
using value classes when they arrive). It's not a long hop, though. <br>
<br>
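To illustrate the initialization constraint: a record's canonical constructor needs its component values up front, so a JsonNumber(double d) record has to convert the digits before it can exist. A rough sketch of the interface-based alternative follows. The names LazySketch, LazyNumber and value() are hypothetical, and it uses today's type patterns, since deconstruction patterns for interfaces do not exist yet: <br>
<pre>
```java
// Hypothetical sketch; none of these names come from a proposed JDK API.
final class LazySketch {
    // Keeps the raw source text and converts it only when value() is first called.
    // Not thread-safe; a real implementation would need to decide how to publish the cache.
    static final class LazyNumber implements JsonNumber {
        private final String raw;
        private Double parsed; // null until first use

        LazyNumber(String raw) {
            this.raw = raw;
        }

        @Override
        public double value() {
            if (parsed == null) {
                parsed = Double.parseDouble(raw);
            }
            return parsed;
        }
    }

    // Type patterns already work against interfaces; what is missing today is
    // destructuring, that is, binding the number directly in the case label.
    static String describe(JsonValue v) {
        return switch (v) {
            case JsonNumber n -> "number " + n.value();
            case JsonString s -> "string " + s.value();
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new LazyNumber("1.5")));
    }
}

// Only two cases are shown to keep the sketch short.
sealed interface JsonValue permits JsonNumber, JsonString { }
non-sealed interface JsonNumber extends JsonValue { double value(); }
non-sealed interface JsonString extends JsonValue { String value(); }
```
</pre>
Constructing a LazyNumber does no parsing at all, so malformed digits surface only when value() is called; whether that trade is acceptable is exactly the kind of implementation choice records rule out. <br>
<br>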
I agree with your assessment of streaming models; for documents
too large to fit into memory, we'll let someone else provide a
specialized solution. Streaming and fully-materialized-tree are
not the only two options; there are plenty of points in the
middle. <br>
<br>
As to API idioms, these can be layered. The lazy-tree model
outlined above can be a foundation for data binding, dynamic
mapping to records, jsonpath, etc. But once you've made the
streaming-vs-materialized choice in favor of materialized, it's
hard to imagine not having something like the above at the base
of the tower. <br>
<br>
The question you raise about error handling is one that infuses
pattern matching in general. Pattern matching allows us to
collapse what would be a thousand questions -- "does key X
exist? is it mapped to a number? is the number in the range of
byte?" -- each with their own failure-handling path, into a
single question. That's great for reliable and readable code,
but it does make errors more opaque, because it is more like the
red "check engine" light on your dashboard. (Something like
JSONPath could generate better error messages since you've given
it a declarative description of an assumed structural
invariant.) But, imperative code that has to treat each
structural assumption as a possible control-flow point is a
disaster; we've seen too much code like this already. <br>
<br>
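As a sketch of that collapse, assuming the JsonValue records from earlier in this message and Java 21 record patterns. The SingleQuestion class, the ageAsByte helper and the "age" key are invented for illustration: <br>
<pre>
```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper class; the "age" key is an invented example.
final class SingleQuestion {
    // Answers three questions at once: does the key exist, is it mapped to a
    // number, and does that number fit in a byte?
    static Optional<Byte> ageAsByte(JsonValue doc) {
        if (doc instanceof JsonValue.JsonObject(var pairs)
                && pairs.get("age") instanceof JsonValue.JsonNumber(double d)
                && d == (byte) d) {
            return Optional.of((byte) d);
        }
        // Every kind of failure lands here, which is the "check engine" light:
        // the caller cannot tell which assumption was violated.
        return Optional.empty();
    }

    public static void main(String[] args) {
        JsonValue doc = new JsonValue.JsonObject(Map.of("age", new JsonValue.JsonNumber(42)));
        System.out.println(ageAsByte(doc));
    }
}

sealed interface JsonValue {
    record JsonString(String s) implements JsonValue { }
    record JsonNumber(double d) implements JsonValue { }
    record JsonNull() implements JsonValue { }
    record JsonBoolean(boolean b) implements JsonValue { }
    record JsonArray(List<JsonValue> values) implements JsonValue { }
    record JsonObject(Map<String, JsonValue> pairs) implements JsonValue { }
}
```
</pre>
<br>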
The ecosystem is big enough that there will be lots of people
with strong opinions that "X is the only sensible way to do it"
(we've already seen X=databinding on this thread), but the
reality is that there are multiple overlapping audiences here,
and we have to be clear which audiences we are prioritizing. We
can have that debate when the time is right. <br>
<br>
So, we'll get there, but we're waiting for one or two more bits
of language evolution to give us the substrate for the API that
feels right. <br>
<br>
Hope this helps,<br>
-Brian<br>
<br>
</font></font><br>
<div class="moz-cite-prefix">On 12/15/2022 3:30 PM, Ethan McCue
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CA+NR86hGVP2F948jFH3qo7fDQsc-W6HP-1YK6LVSCJTL5yQ+mA@mail.gmail.com">
<div dir="ltr"><font face="monospace">I'm writing this to drive
some forward motion and to nerd-snipe those who know better
than I do into putting their thoughts into words.<br>
<br>
There are three ways to process JSON[1]<br>
- Streaming (Push or Pull)<br>
- Traversing a Tree (Realized or Lazy)<br>
- Declarative Databind (N ways)<br>
<br>
Of these, JEP-198 explicitly ruled out providing "JAXB style
type safe data binding."<br>
<br>
No justification is given, but if I had to insert my own:
mapping the Json model to/from the Java/JVM object model is a
cursed combo of<br>
- Huge possible design space<br>
- Unpalatably large surface for backwards compatibility<br>
- Serialization! Boo![2]<br>
<br>
So for an artifact like the JDK, it probably doesn't make
sense to include. That tracks.<br>
It won't make everyone happy, people like databind APIs, but
it tracks.<br>
<br>
So for the "read flow" these are the things to figure out.<br>
<br>
| Should Provide? | Intended User(s) |<br>
----------------+-----------------+------------------+<br>
Streaming Push | | |<br>
----------------+-----------------+------------------+<br>
Streaming Pull | | |<br>
----------------+-----------------+------------------+<br>
Realized Tree | | |<br>
----------------+-----------------+------------------+<br>
Lazy Tree | | |<br>
----------------+-----------------+------------------+<br>
<br>
At which point, we should talk about what "meets needs of Java
developers using JSON" implies.<br>
<br>
JSON is ubiquitous. Most kinds of software us schmucks write
could have a reason to interact with it.<br>
The full set of "user personas" therefore isn't practical for
me to talk about.[3]<br>
<br>
JSON documents, however, are not so varied.<br>
<br>
- There are small ones (1-10kb)<br>
- There are medium ones (10-1000kb)<br>
- There are big ones (1000kb-???)<br>
<br>
- There are shallow ones<br>
- There are deep ones<br>
<br>
So that feels like an easier direction to talk about it from.<br>
<br>
<br>
This repo[4] has some convenient toy examples of how some of
those APIs look in libraries<br>
in the ecosystem. Specifically the Streaming Pull and Realized
Tree models.<br>
<br>
User r = new User();<br>
while (true) {<br>
JsonToken token = reader.peek();<br>
switch (token) {<br>
case BEGIN_OBJECT:<br>
reader.beginObject();<br>
break;<br>
case END_OBJECT:<br>
reader.endObject();<br>
return r;<br>
case NAME:<br>
String fieldname = reader.nextName();<br>
switch (fieldname) {<br>
case "id":<br>
r.setId(reader.nextString());<br>
break;<br>
case "index":<br>
r.setIndex(reader.nextInt());<br>
break;<br>
...<br>
case "friends":<br>
r.setFriends(new
ArrayList<>());<br>
Friend f = null;<br>
carryOn = true;<br>
while (carryOn) {<br>
token = reader.peek();<br>
switch (token) {<br>
case BEGIN_ARRAY:<br>
reader.beginArray();<br>
break;<br>
case END_ARRAY:<br>
reader.endArray();<br>
carryOn = false;<br>
break;<br>
case BEGIN_OBJECT:<br>
reader.beginObject();<br>
f = new Friend();<br>
break;<br>
case END_OBJECT:<br>
reader.endObject();<br>
r.getFriends().add(f);<br>
break;<br>
case NAME:<br>
String fn =
reader.nextName();<br>
switch (fn) {<br>
case "id":<br>
f.setId(reader.nextString());<br>
break;<br>
case "name":<br>
f.setName(reader.nextString());<br>
break;<br>
}<br>
break;<br>
}<br>
}<br>
break;<br>
}<br>
}<br>
<br>
I think it's not hard to argue that the streaming APIs are
brutalist. The above is Gson, but Jackson, Moshi, etc.<br>
seem at least morally equivalent.<br>
<br>
It's hard to write, hard to write *correctly*, and there is a
curious propensity towards pairing it<br>
with anemic, mutable models.<br>
<br>
That being said, it handles big documents and deep documents
really well. It also performs<br>
pretty darn well and is good enough as a "fallback" when the
intended user experience <br>
is through something like databind.<br>
<br>
So what could we do meaningfully better with the language we
have today/will have tomorrow?<br>
<br>
- Sealed interfaces + Pattern matching could give a nicer
model for tokens<br>
<br>
sealed interface JsonToken {<br>
record Field(String name) implements JsonToken {}<br>
record BeginArray() implements JsonToken {}<br>
record EndArray() implements JsonToken {}<br>
record BeginObject() implements JsonToken {}<br>
record EndObject() implements JsonToken {}<br>
// ...<br>
}<br>
<br>
// ...<br>
<br>
User r = new User();<br>
while (true) {<br>
JsonToken token = reader.peek();<br>
switch (token) {<br>
case BeginObject __:<br>
reader.beginObject();<br>
break;<br>
case EndObject __:<br>
reader.endObject();<br>
return r;<br>
case Field("id"):<br>
r.setId(reader.nextString());<br>
break;<br>
case Field("index"):<br>
r.setIndex(reader.nextInt());<br>
break;<br>
<br>
// ...<br>
<br>
case Field("friends"):<br>
r.setFriends(new ArrayList<>());<br>
Friend f = null;<br>
carryOn = true;<br>
while (carryOn) {<br>
token = reader.peek();<br>
switch (token) {<br>
// ...<br>
<br>
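Note that `case Field("id"):` does not compile on Java 21, since record patterns cannot nest constants. A sketch of the closest form that does compile today, using guards; the TokenSwitch class and classify method are made-up names, and JsonToken repeats the interface above so the example stands alone:<br>
<pre>
```java
// Hypothetical helper class; JsonToken mirrors the sealed interface sketched above.
final class TokenSwitch {
    static String classify(JsonToken token) {
        return switch (token) {
            case JsonToken.BeginObject b -> "begin object";
            case JsonToken.EndObject e -> "end object";
            case JsonToken.BeginArray b -> "begin array";
            case JsonToken.EndArray e -> "end array";
            // A guard stands in for the constant pattern Field("id").
            case JsonToken.Field(String name) when "id".equals(name) -> "id field";
            case JsonToken.Field(String name) when "index".equals(name) -> "index field";
            // The unguarded case keeps the switch exhaustive.
            case JsonToken.Field(String name) -> "other field: " + name;
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new JsonToken.Field("id")));
        System.out.println(classify(new JsonToken.BeginArray()));
    }
}

sealed interface JsonToken {
    record Field(String name) implements JsonToken {}
    record BeginArray() implements JsonToken {}
    record EndArray() implements JsonToken {}
    record BeginObject() implements JsonToken {}
    record EndObject() implements JsonToken {}
}
```
</pre>
<br>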
- Value classes can make it all more efficient<br>
<br>
sealed interface JsonToken {<br>
value record Field(String name) implements
JsonToken {}<br>
value record BeginArray() implements JsonToken {}<br>
value record EndArray() implements JsonToken {}<br>
value record BeginObject() implements JsonToken {}<br>
value record EndObject() implements JsonToken {}<br>
// ...<br>
}<br>
<br>
- (Fun One) We can transform a simpler-to-write push parser
into a pull parser with Coroutines<br>
<br>
This is just a toy we could play with while making
something in the JDK. I'm pretty sure<br>
we could make a parser which feeds into something like<br>
<br>
interface Listener {<br>
void onObjectStart();<br>
void onObjectEnd();<br>
void onArrayStart();<br>
void onArrayEnd();<br>
void onField(String name);<br>
// ...<br>
}<br>
<br>
and invert a loop like<br>
<br>
while (true) {<br>
char c = next();<br>
switch (c) {<br>
case '{':<br>
listener.onObjectStart();<br>
// ...<br>
// ...<br>
}<br>
}<br>
<br>
by putting a Coroutine.yield in the callback.<br>
<br>
That might be a meaningful simplification in code
structure, I don't know enough to say.<br>
<br>
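Without a public coroutine API to lean on, the same inversion can be approximated today with a helper thread and a handoff queue. This is only a sketch under that substitution: PushToPull, Event, pushParse and pull are made-up names, and the toy parser only looks at brackets (it would misread braces inside strings).<br>
<pre>
```java
import java.util.concurrent.SynchronousQueue;
import java.util.function.Supplier;

// Hypothetical sketch: a thread plus a handoff queue stands in for a coroutine.
final class PushToPull {
    enum Event { OBJECT_START, OBJECT_END, ARRAY_START, ARRAY_END, END_OF_INPUT }

    interface Listener {
        void onObjectStart();
        void onObjectEnd();
        void onArrayStart();
        void onArrayEnd();
    }

    // Toy push parser: reacts to structural characters and ignores everything else.
    static void pushParse(String text, Listener listener) {
        for (int i = 0; i < text.length(); i++) {
            switch (text.charAt(i)) {
                case '{' -> listener.onObjectStart();
                case '}' -> listener.onObjectEnd();
                case '[' -> listener.onArrayStart();
                case ']' -> listener.onArrayEnd();
                default -> { }
            }
        }
    }

    // Each get() on the result lets the push parser advance by exactly one event.
    // Callers must stop after END_OF_INPUT; a further get() would block forever.
    static Supplier<Event> pull(String text) {
        SynchronousQueue<Event> handoff = new SynchronousQueue<>();
        Thread producer = new Thread(() -> {
            pushParse(text, new Listener() {
                @Override public void onObjectStart() { put(handoff, Event.OBJECT_START); }
                @Override public void onObjectEnd() { put(handoff, Event.OBJECT_END); }
                @Override public void onArrayStart() { put(handoff, Event.ARRAY_START); }
                @Override public void onArrayEnd() { put(handoff, Event.ARRAY_END); }
            });
            put(handoff, Event.END_OF_INPUT);
        });
        // Daemon, so an abandoned reader does not keep the JVM alive.
        producer.setDaemon(true);
        producer.start();
        return () -> take(handoff);
    }

    // Blocks until the consumer asks for the next event: this is the "yield".
    private static void put(SynchronousQueue<Event> queue, Event event) {
        try {
            queue.put(event);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static Event take(SynchronousQueue<Event> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Supplier<Event> reader = pull("{\"tags\": []}");
        for (Event e = reader.get(); e != Event.END_OF_INPUT; e = reader.get()) {
            System.out.println(e);
        }
    }
}
```
</pre>
The parser loop itself stays in push style; only the listener knows it is being suspended between events.<br>
<br>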
But, I think there are some hard questions like<br>
<br>
- Is the intent[5] to make a backing parser for ecosystem
databind APIs?<br>
- Is the intent that users who want to handle big/deep
documents fall back to this?<br>
- Are those new language features / conveniences enough to
offset the cost of committing to a new api?<br>
- To whom exactly does a low level api provide value?<br>
- What benefit is standardization in the JDK?<br>
<br>
and just generally - who would be the consumer(s) of this?<br>
<br>
The other kind of API still on the table is a Tree. There are
two ways to handle this<br>
<br>
1. Load it into `Object`. Use a bunch of instanceof
checks/casts to confirm what it actually is.<br>
<br>
Object v;<br>
User u = new User();<br>
<br>
if ((v = jso.get("id")) != null) {<br>
u.setId((String) v);<br>
}<br>
if ((v = jso.get("index")) != null) {<br>
u.setIndex(((Long) v).intValue());<br>
}<br>
if ((v = jso.get("guid")) != null) {<br>
u.setGuid((String) v);<br>
}<br>
if ((v = jso.get("isActive")) != null) {<br>
u.setIsActive(((Boolean) v));<br>
}<br>
if ((v = jso.get("balance")) != null) {<br>
u.setBalance((String) v);<br>
}<br>
// ...<br>
if ((v = jso.get("latitude")) != null) {<br>
u.setLatitude(v instanceof BigDecimal ?
((BigDecimal) v).doubleValue() : (Double) v);<br>
}<br>
if ((v = jso.get("longitude")) != null) {<br>
u.setLongitude(v instanceof BigDecimal ?
((BigDecimal) v).doubleValue() : (Double) v);<br>
}<br>
if ((v = jso.get("greeting")) != null) {<br>
u.setGreeting((String) v);<br>
}<br>
if ((v = jso.get("favoriteFruit")) != null) {<br>
u.setFavoriteFruit((String) v);<br>
}<br>
if ((v = jso.get("tags")) != null) {<br>
List<Object> jsonarr = (List<Object>)
v;<br>
u.setTags(new ArrayList<>());<br>
for (Object vi : jsonarr) {<br>
u.getTags().add((String) vi);<br>
}<br>
}<br>
if ((v = jso.get("friends")) != null) {<br>
List<Object> jsonarr = (List<Object>)
v;<br>
u.setFriends(new ArrayList<>());<br>
for (Object vi : jsonarr) {<br>
Map<String, Object> jso0 =
(Map<String, Object>) vi;<br>
Friend f = new Friend();<br>
f.setId((String) jso0.get("id"));<br>
f.setName((String) jso0.get("name"));<br>
u.getFriends().add(f);<br>
}<br>
}<br>
<br>
2. Have an explicit model for Json, and helper methods that do
said casts[6]<br>
<br>
<br>
this.setSiteSetting(readFromJson(jsonObject.getJsonObject("site")));<br>
JsonArray groups = jsonObject.getJsonArray("group");<br>
if(groups != null)<br>
{<br>
int len = groups.size();<br>
for(int i=0; i<len; i++)<br>
{<br>
JsonObject grp = groups.getJsonObject(i);<br>
SNMPSetting grpSetting = readFromJson(grp);<br>
String grpName = grp.getString("dbgroup", null);<br>
if(grpName != null && grpSetting != null)<br>
this.groupSettings.put(grpName, grpSetting);<br>
}<br>
}<br>
JsonArray hosts = jsonObject.getJsonArray("host");<br>
if(hosts != null)<br>
{<br>
int len = hosts.size();<br>
for(int i=0; i<len; i++)<br>
{<br>
JsonObject host = hosts.getJsonObject(i);<br>
SNMPSetting hostSetting = readFromJson(host);<br>
String hostName = host.getString("dbhost", null);<br>
if(hostName != null && hostSetting != null)<br>
this.hostSettings.put(hostName, hostSetting);<br>
}<br>
}<br>
<br>
I think what has become easier to represent in the language
nowadays is that explicit model for Json.<br>
It's the 101 lesson of sealed interfaces.[7] It feels nice and
clean.<br>
<br>
sealed interface Json {<br>
final class Null implements Json {}<br>
final class True implements Json {}<br>
final class False implements Json {}<br>
final class Array implements Json {}<br>
final class Object implements Json {}<br>
final class String implements Json {}<br>
final class Number implements Json {}<br>
}<br>
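<br>
As a minimal sketch of how that model might look once each case carries a payload (assuming Java 21 for pattern-matching switch): the names Bool, Str, Num, Arr and Obj are hypothetical, chosen to avoid shadowing java.lang.String and java.lang.Object, and True/False are folded into one Bool record for brevity.<br>

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;

// Hypothetical variant of the Json model above, with payloads added.
sealed interface Json {
    record Null() implements Json {}
    record Bool(boolean value) implements Json {}
    record Str(String value) implements Json {}
    record Num(BigDecimal value) implements Json {}
    record Arr(List<Json> items) implements Json {}
    record Obj(Map<String, Json> fields) implements Json {}
}

final class JsonKinds {
    // Exhaustive switch: no default branch is needed because Json is sealed.
    static String kind(Json json) {
        return switch (json) {
            case Json.Null n -> "null";
            case Json.Bool b -> "boolean";
            case Json.Str s -> "string";
            case Json.Num n -> "number";
            case Json.Arr a -> "array of " + a.items().size();
            case Json.Obj o -> "object with " + o.fields().size() + " fields";
        };
    }
}
```

If a new case is added to Json, the switch stops compiling until it is handled.<br>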
<br>
And the cast-and-check approach is now more viable on account
of pattern matching.<br>
<br>
if (jso.get("id") instanceof String v) {<br>
u.setId(v);<br>
}<br>
if (jso.get("index") instanceof Long v) {<br>
u.setIndex(v.intValue());<br>
}<br>
if (jso.get("guid") instanceof String v) {<br>
u.setGuid(v);<br>
}<br>
<br>
// or <br>
<br>
if (jso.get("id") instanceof String id &&<br>
jso.get("index") instanceof Long index
&&<br>
jso.get("guid") instanceof String guid) {<br>
return new User(id, index, guid, ...); // look ma,
no setters!<br>
}<br>
<br>
<br>
And on the horizon, again, is value types.<br>
<br>
But there are problems with this approach beyond the
performance implications of loading into<br>
a tree.<br>
<br>
For one, all the code samples above have different behaviors
around null values and missing keys<br>
that are not obvious at first glance.<br>
<br>
This won't accept any null or missing fields<br>
<br>
if (jso.get("id") instanceof String id &&<br>
jso.get("index") instanceof Long index
&&<br>
jso.get("guid") instanceof String guid) {<br>
return new User(id, index, guid, ...);<br>
}<br>
<br>
This will accept individual null or missing fields, but also
will silently ignore <br>
fields with incorrect types<br>
<br>
if (jso.get("id") instanceof String v) {<br>
u.setId(v);<br>
}<br>
if (jso.get("index") instanceof Long v) {<br>
u.setIndex(v.intValue());<br>
}<br>
if (jso.get("guid") instanceof String v) {<br>
u.setGuid(v);<br>
}<br>
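<br>
To make the difference concrete, here is a small sketch over a plain Map<String, Object> tree. The class and method names are hypothetical. It shows that the per-field instanceof style cannot tell a missing key, an explicit null, and a wrong type apart, and that separating them takes an explicit containsKey check.<br>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NullVsMissing {
    // Per-field style: any failed instanceof silently keeps the default.
    static String idOrDefault(Map<String, Object> jso) {
        if (jso.get("id") instanceof String v) {
            return v;
        }
        return "(unset)";
    }

    // Telling the cases apart takes an explicit containsKey check.
    static String classify(Map<String, Object> jso) {
        if (!jso.containsKey("id")) {
            return "missing";
        }
        Object v = jso.get("id");
        if (v == null) {
            return "null";
        }
        return v instanceof String ? "string" : "wrong type";
    }

    public static void main(String[] args) {
        Map<String, Object> missing = new HashMap<>();
        Map<String, Object> explicitNull = new HashMap<>();
        explicitNull.put("id", null);
        Map<String, Object> wrongType = new HashMap<>();
        wrongType.put("id", 5L);

        for (Map<String, Object> jso : List.of(missing, explicitNull, wrongType)) {
            System.out.println(idOrDefault(jso) + " / " + classify(jso));
        }
    }
}
```

All three inputs come back as the default from idOrDefault, while classify reports them as three different situations.<br>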
<br>
And, compared to databind where there is information about the
expected structure of the document<br>
and it's the job of the framework to assert that, I posit that
the errors that would be encountered<br>
when writing code against this would be more like <br>
<br>
"something wrong with user" <br>
<br>
than <br>
<br>
"problem at users[5].name, expected string or null. got 5"<br>
<br>
Which feels less than ideal.<br>
<br>
<br>
One approach I find promising is something close to what Elm
does with its decoders[8]. Not just combining assertion<br>
and binding like what pattern matching with records allows,
but including a scheme for bubbling/nesting errors.<br>
<br>
static String string(Json json) throws
JsonDecodingException {<br>
if (!(json instanceof Json.String jsonString)) {<br>
throw JsonDecodingException.of(<br>
"expected a string",<br>
json<br>
);<br>
} else {<br>
return jsonString.value();<br>
}<br>
}<br>
<br>
static <T> T field(Json json, String fieldName,
Decoder<? extends T> valueDecoder) throws
JsonDecodingException {<br>
var jsonObject = object(json);<br>
var value = jsonObject.get(fieldName);<br>
if (value == null) {<br>
throw JsonDecodingException.atField(<br>
fieldName,<br>
JsonDecodingException.of(<br>
"no value for field",<br>
json<br>
)<br>
);<br>
}<br>
else {<br>
try {<br>
return valueDecoder.decode(value);<br>
} catch (JsonDecodingException e) {<br>
throw JsonDecodingException.atField(<br>
fieldName,<br>
e<br>
);<br>
} catch (Exception e) {<br>
throw JsonDecodingException.atField(fieldName,
JsonDecodingException.of(e, value));<br>
}<br>
}<br>
}<br>
<br>
Which I think has some benefits over the ways I've seen of
working with trees.<br>
<br>
<br>
<br>
- It is declarative enough that folks who prefer databind
might be happy enough.<br>
<br>
static User fromJson(Json json) {<br>
return new User(<br>
Decoder.field(json, "id", Decoder::string),<br>
Decoder.field(json, "index", Decoder::long_),<br>
Decoder.field(json, "guid", Decoder::string)<br>
);<br>
}<br>
<br>
// ...<br>
<br>
List<User> users = Decoder.array(json,
User::fromJson);<br>
<br>
- Handling null and optional fields could be less easily
conflated<br>
<br>
Decoder.field(json, "id", Decoder::string);<br>
<br>
Decoder.nullableField(json, "id", Decoder::string);<br>
<br>
Decoder.optionalField(json, "id", Decoder::string);<br>
<br>
Decoder.optionalNullableField(json, "id",
Decoder::string);<br>
<br>
<br>
- It composes well with user defined classes<br>
<br>
record Guid(String value) {<br>
Guid {<br>
// some assertions on the structure of value<br>
}<br>
}<br>
<br>
Decoder.field(json, "guid", guid -> new
Guid(Decoder.string(guid)));<br>
<br>
// or even<br>
<br>
record Guid(String value) {<br>
Guid {<br>
// some assertions on the structure of value<br>
}<br>
<br>
static Guid fromJson(Json json) {<br>
return new Guid(Decoder.string(json));<br>
}<br>
}<br>
<br>
Decoder.field(json, "guid", Guid::fromJson);<br>
<br>
<br>
- When something goes wrong, the API can handle the fiddliness
of capturing information for feedback.<br>
<br>
In the code I've sketched out it's just what field/index
things went wrong at. Potentially<br>
capturing metadata like row/col numbers of the source
would be sensible too.<br>
<br>
It's just not reasonable to expect devs to do extra work to
get that and it's really nice to give it.<br>
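<br>
The points above about required versus optional fields and about field/index error paths can be sketched together. This is a self-contained illustration under assumptions, not the API of any existing library: the Json record names, the unchecked JsonDecodingException with its within method, and the use of java.util.function.Function in place of a Decoder interface are all hypothetical simplifications.<br>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Minimal stand-in for the Json model (hypothetical names).
sealed interface Json {
    record Null() implements Json {}
    record Str(String value) implements Json {}
    record Num(long value) implements Json {}
    record Arr(List<Json> items) implements Json {}
    record Obj(Map<String, Json> fields) implements Json {}
}

// Unchecked for brevity; carries the path from the root to the failing value.
final class JsonDecodingException extends RuntimeException {
    final String path;
    final String reason;

    JsonDecodingException(String path, String reason) {
        super("problem at " + (path.isEmpty() ? "(root)" : path) + ": " + reason);
        this.path = path;
        this.reason = reason;
    }

    // Prefix one more step onto the path as the error bubbles outward.
    JsonDecodingException within(String step) {
        String sep = (path.isEmpty() || path.startsWith("[")) ? "" : ".";
        return new JsonDecodingException(step + sep + path, reason);
    }
}

final class Decoder {
    static String string(Json json) {
        if (json instanceof Json.Str s) {
            return s.value();
        }
        throw new JsonDecodingException("", "expected a string, got " + json);
    }

    // Key may be absent (empty result); if present, the value must decode.
    static <T> Optional<T> optionalField(Json json, String name, Function<Json, ? extends T> decoder) {
        if (!(json instanceof Json.Obj obj)) {
            throw new JsonDecodingException("", "expected an object, got " + json);
        }
        Json value = obj.fields().get(name);
        if (value == null) {
            return Optional.empty();
        }
        try {
            T decoded = decoder.apply(value);
            return Optional.of(decoded);
        } catch (JsonDecodingException e) {
            throw e.within(name);
        }
    }

    // Key must be present; an explicit JSON null is left to the value decoder.
    static <T> T field(Json json, String name, Function<Json, ? extends T> decoder) {
        Optional<T> found = optionalField(json, name, decoder);
        return found.orElseThrow(() -> new JsonDecodingException(name, "no value for field"));
    }

    // Decodes each element, tagging any failure with its index.
    static <T> List<T> array(Json json, Function<Json, ? extends T> decoder) {
        if (!(json instanceof Json.Arr arr)) {
            throw new JsonDecodingException("", "expected an array, got " + json);
        }
        List<T> out = new ArrayList<>();
        for (int i = 0; i < arr.items().size(); i++) {
            try {
                out.add(decoder.apply(arr.items().get(i)));
            } catch (JsonDecodingException e) {
                throw e.within("[" + i + "]");
            }
        }
        return out;
    }
}
```

With this shape, decoding the name of every element of a users array fails with a path such as users[1].name when the second user has a number there, and the caller did nothing extra to get it.<br>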
<br>
There are also some downsides like<br>
<br>
- I do not know how compatible it would be with lazy trees.<br>
<br>
Lazy trees being the only way that a tree api could
handle big or deep documents.<br>
The general concept as applied in libraries like
json-tree[9] is to navigate without<br>
doing any work, and that clashes with wanting to
instanceof check the info at the<br>
current path.<br>
<br>
- It *almost* gives enough information to be a general schema
approach<br>
<br>
If one field fails, this model throws an exception
immediately. If an API should<br>
return "errors": [...], that is inconvenient to construct.<br>
<br>
- None of the existing popular libraries are doing this<br>
<br>
The only mechanism strictly required to give
this sort of API is lambdas. Those have<br>
been out for a decade. Yes, sealed interfaces make the
data model prettier, but in concept you<br>
can build the same thing on top of anything.<br>
<br>
I could argue that this is because of "cultural momentum"
of databind or some other reason,<br>
but the fact remains that it isn't a proven out approach.<br>
<br>
Writing Json libraries is a todo list[10]. There are a
lot of bad ideas and this might be one of them.<br>
<br>
- Performance impact of so many instanceof checks<br>
<br>
I've gotten a 4.2% slowdown compared to the "regular" tree
code without the repeated casts.<br>
<br>
But that was with a parser that is 5x slower than
Jackson's (using the same benchmark project as for the
snippets).<br>
I think there could be reason to believe that the JIT does
well enough with repeated instanceof<br>
checks to consider it.<br>
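<br>
On the "errors": [...] downside above: one possible workaround, sketched here purely as an assumption and not as part of the decoder design itself, is a small collector that runs each field decoder, records failures, and keeps going. ErrorCollector, attempt, asString and asLong are hypothetical names, and the tree is a plain Map<String, Object> to keep the example short.<br>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: collects every field failure instead of stopping at the first.
final class ErrorCollector {
    private final List<String> errors = new ArrayList<>();

    // Runs one field decoder; on failure records the message and yields null.
    <T> T attempt(Map<String, Object> obj, String field, Function<Object, T> decoder) {
        try {
            return decoder.apply(obj.get(field));
        } catch (IllegalArgumentException e) {
            errors.add(field + ": " + e.getMessage());
            return null;
        }
    }

    List<String> errors() {
        return List.copyOf(errors);
    }

    static String asString(Object v) {
        if (v instanceof String s) {
            return s;
        }
        throw new IllegalArgumentException("expected string, got " + v);
    }

    static Long asLong(Object v) {
        if (v instanceof Long l) {
            return l;
        }
        throw new IllegalArgumentException("expected long, got " + v);
    }
}
```

The cost is visible in the signature: every attempt can yield null, so the caller has to check errors() before trusting any of the decoded values. That is the inconvenience the bullet describes.<br>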
<br>
<br>
My current thinking is that - despite not solving for large or
deep documents - starting with a really "dumb" realized tree
api<br>
might be the right place to start for the read side of a
potential incubator module.<br>
<br>
But regardless - this feels like a good time to start more
concrete conversations. I feel I should cap this email since
I've reached the point of decoherence and haven't even
mentioned the write side of things.<br>
<br>
<br>
<br>
<br>
[1]: <a href="http://www.cowtowncoder.com/blog/archives/2009/01/entry_131.html" moz-do-not-send="true" class="moz-txt-link-freetext">http://www.cowtowncoder.com/blog/archives/2009/01/entry_131.html</a><br>
[2]: <a href="https://security.snyk.io/vuln/maven?search=jackson-databind" moz-do-not-send="true" class="moz-txt-link-freetext">https://security.snyk.io/vuln/maven?search=jackson-databind</a><br>
[3]: I only know like 8 people<br>
[4]: <a href="https://github.com/fabienrenaud/java-json-benchmark/blob/master/src/main/java/com/github/fabienrenaud/jjb/stream/UsersStreamDeserializer.java" moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/fabienrenaud/java-json-benchmark/blob/master/src/main/java/com/github/fabienrenaud/jjb/stream/UsersStreamDeserializer.java</a><br>
[5]: When I say "intent", I do so knowing full well no one has
been actively thinking of this for an entire Game of Thrones <br>
[6]: <a href="https://github.com/yahoo/mysql_perf_analyzer/blob/master/myperf/src/main/java/com/yahoo/dba/perf/myperf/common/SNMPSettings.java" moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/yahoo/mysql_perf_analyzer/blob/master/myperf/src/main/java/com/yahoo/dba/perf/myperf/common/SNMPSettings.java</a><br>
[7]: <a href="https://www.infoq.com/articles/data-oriented-programming-java/" moz-do-not-send="true" class="moz-txt-link-freetext">https://www.infoq.com/articles/data-oriented-programming-java/</a><br>
[8]: <a href="https://package.elm-lang.org/packages/elm/json/latest/Json-Decode" moz-do-not-send="true" class="moz-txt-link-freetext">https://package.elm-lang.org/packages/elm/json/latest/Json-Decode</a><br>
[9]: <a href="https://github.com/jbee/json-tree" moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/jbee/json-tree</a><br>
[10]: <a href="https://stackoverflow.com/a/14442630/2948173" moz-do-not-send="true" class="moz-txt-link-freetext">https://stackoverflow.com/a/14442630/2948173</a><br>
[11]: In 30 days JEP-198 will be recognizably PI days old
for the 2nd time in its history.<br>
[12]: To me, the fact that it is still an open JEP is more a
social convenience than anything. I could just as easily
be writing this exact same email about TOML. </font><br>
</div>
</blockquote>
<br>
</body>
</html>