SimpleIO in JEP draft 8323335

Brian Goetz brian.goetz at oracle.com
Mon Feb 19 17:06:51 UTC 2024


There's a reason there are so many opinions here: the goals are in 
conflict.  Everyone wants simplicity, but people don't agree on what 
"simple" means.  (Cue the jokes about "I would simply not write programs 
with bugs.")

Yes, getting numbers from the user is a basic task.  But it is not, in 
any way, simple!  Because reading numbers from the input is invariably 
complected with discarding things that are "acceptably non-numbery" 
(e.g., whitespace), which is neither simple nor usually terribly well 
documented.  We've all encountered the problem in many language runtimes 
where reading a number using the "friendly way" leaves the input in a 
state that requires fixing or yields surprises for the next operation.
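
A small illustration of that surprise, using java.util.Scanner (the 
class and method names below are made up for the example; the input is 
supplied as a string so it runs without a terminal):

```java
import java.util.Scanner;

final class ScannerNewlineDemo {
    // Reads an int, then a line, from the same Scanner and returns the line.
    static String lineAfterInt(String input) {
        Scanner in = new Scanner(input);
        int age = in.nextInt();   // consumes "42" but not the newline after it
        return in.nextLine();     // returns what is left of the first line
    }

    public static void main(String[] args) {
        // The user typed "42", Enter, "Alice", Enter.
        String leftover = lineAfterInt("42\nAlice\n");
        System.out.println("[" + leftover + "]"); // prints "[]", not "[Alice]"
    }
}
```

The "friendly" nextInt() leaves the line terminator behind, so the next 
nextLine() returns an empty string instead of the name.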

This is because reading a number from an input stream is not any sort of 
primitive; it is the composite of reading from the input, deciding what 
to skip, deciding when to stop reading, converting to another type, 
deciding what state to leave the input stream in, and deciding what to 
do if no number could be found (or if the number was too big to fit into 
an int, etc.).  This is not^3 simple!
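
As a rough sketch of that composite, here is one way the steps could be 
spelled out in Java.  Everything here is an assumption made for 
illustration (the helper name `readInt`, the choice of OptionalInt for 
"no number", stripping whitespace, consuming the whole line); a 
different API could reasonably make each of these choices differently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.OptionalInt;

final class ReadIntSteps {
    // Hypothetical helper: each commented step is a separate policy decision.
    static OptionalInt readInt(BufferedReader in) {
        String line;
        try {
            line = in.readLine();          // read from the input; stop at end of line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (line == null) {
            return OptionalInt.empty();    // end of input: nothing to read
        }
        String text = line.strip();        // what to skip: surrounding whitespace
        try {
            return OptionalInt.of(Integer.parseInt(text)); // convert to another type
        } catch (NumberFormatException e) {
            return OptionalInt.empty();    // not a number, or too big for an int
        }
        // State left behind: the whole line, terminator included, is consumed.
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("  42 \nabc\n99999999999\n"));
        System.out.println(readInt(in)); // a value
        System.out.println(readInt(in)); // empty: "abc" is not a number
        System.out.println(readInt(in)); // empty: does not fit in an int
    }
}
```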

C starts with a simple and principled answer, which is that the IO 
primitive is getchar() and putchar().  Reading or writing one character 
is unquestionably a primitive.  (But also, unless you are writing `cat`, 
no one wants to program with getchar and putchar, because it's too 
primitive.)

One can make a reasonable case for "write a line / read a line" being 
sensible primitives.  They are simple enough: no parsing, no deciding 
what to throw away, no possible errors other than EOF, it is clear what 
state you leave the stream in.  These may not be what the student wants, 
but they are primitives a student can deal with without having to 
understand parsing and error handling and statefulness yet.

     String s = getALine();
     printALine(s);

is a program every student can reason about.
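
For concreteness, here is a minimal runnable version of that program 
built on BufferedReader.  The names getALine and printALine are the 
placeholders from above, not real JDK methods, and returning null at 
end of input is just one possible choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

final class EchoLine {
    // One shared reader, created on first use, so buffered input is not lost between calls.
    private static BufferedReader stdin;

    // Placeholder name from the text: returns the next line, or null at end of input.
    static String getALine() {
        if (stdin == null) {
            stdin = new BufferedReader(new InputStreamReader(System.in));
        }
        try {
            return stdin.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Placeholder name from the text: writes the string followed by a line terminator.
    static void printALine(String s) {
        System.out.println(s);
    }

    public static void main(String[] args) {
        String s = getALine();
        printALine(s);
    }
}
```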

It is true that dealing in strings, while honest and simple, is not 
always what the student wants.  But herein lies the strongest argument 
for not trying to reinvent Scanner here: the ability to read numbers 
makes the complexity of the problem, and hence of the API, much, much 
bigger.  (Scanner was very well intentioned, and was not written by 
children, and yet none of us want to use it.  That's a sign that a 
one-size-fits-all magic input processing system is harder than it looks, 
and, for something explicitly aimed at beginners, that is a double 
warning sign.)

I could imagine someone suggesting "why don't you just add 
`readLineAsInt`".  But what would happen next?  Well, there would be a 
million requests (including from folks like Cay) of "you should add X", 
and then the result is a mishmash jumble of an API (that's already 
terrible), but worse, it's an onramp that leads to nowhere.  Once the 
user's needs are slightly more complicated, they are nowhere.

Remi has it absolutely right (yes, I really said that) with

> The classical program is:
>    input -> strings -> objects -> strings -> output

We do not do users a favor by blurring the distinction between "input -> 
string" and "string -> object", and because the latter is so much more 
open-ended than the former, the latter infects the former with its 
complexity if we try.
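
A sketch of that pipeline with the stages kept separate might look like 
the following.  The Temperature record and the parse/format helpers are 
invented for the example, and a string literal stands in for the line 
that the input step would deliver.

```java
import java.util.Locale;

final class Pipeline {
    // Hypothetical domain object for the "objects" stage.
    record Temperature(double celsius) {
        double fahrenheit() {
            return celsius * 9.0 / 5.0 + 32.0;
        }
    }

    // string -> object: all parsing decisions live here, not in the IO step.
    static Temperature parse(String line) {
        return new Temperature(Double.parseDouble(line.strip()));
    }

    // object -> string: formatting is likewise separate from output.
    static String format(Temperature t) {
        return String.format(Locale.ROOT, "%.1f C = %.1f F", t.celsius(), t.fahrenheit());
    }

    public static void main(String[] args) {
        String line = " 100 ";                 // input -> string (stand-in for a read line)
        Temperature t = parse(line);           // string -> object
        String text = format(t);               // object -> string
        System.out.println(text);              // string -> output
    }
}
```

Because parse and format never touch the input or output streams, they 
can be replaced or taught independently of how the line was obtained.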

Is this simple API the most wonderful, be-all of APIs?  Of course not.  
But it is a sensible set of primitives that users can understand and 
*build on* in a transparent way.

Some teachers may immediately reach for teaching Integer::parseInt; 
that's a reasonable strategy: it exposes students to the question of 
"what happens when preconditions fail", and the two compose just fine.  
But maybe you don't like Integer::parseInt for some reason.  Another way 
to teach this is to have them write it themselves.  This will expose 
them to all sorts of interesting questions (what about whitespace? what 
about double negatives?), but of course it also throws them in the deep 
end of the pool.  But SimpleIO::readMeALinePlease is agnostic; it works 
with both approaches.
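
To give a flavor of the "write it themselves" route, here is one 
possible student solution.  It is only a sketch: the name parseDecimal 
is made up, and the policies it picks (strip surrounding whitespace, 
allow a single leading sign, reject "--5", reject values outside the 
int range) are exactly the kind of decisions the exercise forces into 
the open.

```java
final class HandRolledParse {
    // Hypothetical exercise solution: parse an optionally signed decimal int.
    static int parseDecimal(String s) {
        String t = s.strip();                      // policy: ignore surrounding whitespace
        if (t.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int i = 0;
        boolean negative = false;
        char first = t.charAt(0);
        if (first == '-' || first == '+') {        // policy: at most one leading sign
            negative = (first == '-');
            i = 1;
        }
        if (i == t.length()) {
            throw new NumberFormatException("no digits in: " + s);
        }
        long value = 0;                            // long, so int overflow can be detected
        for (; i < t.length(); i++) {
            char c = t.charAt(i);
            if (c < '0' || c > '9') {              // a second '-' lands here and is rejected
                throw new NumberFormatException("unexpected '" + c + "' in: " + s);
            }
            value = value * 10 + (c - '0');
            if (value > (long) Integer.MAX_VALUE + 1) {
                throw new NumberFormatException("out of range: " + s);
            }
        }
        if (negative) {
            value = -value;
        }
        if (value > Integer.MAX_VALUE) {           // e.g. "2147483648" without a minus sign
            throw new NumberFormatException("out of range: " + s);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(parseDecimal(" 42 "));   // prints 42
        System.out.println(parseDecimal("-17"));    // prints -17
    }
}
```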

Could the JDK use some better tools for parsing?  Sure; pattern matching 
has a role to play here, a `String::unformat` would be really cool, and 
I love parser combinators.  All of this can happen in the future, and 
none of it would make this API look like yet another white elephant 
like Scanner, because it focuses purely on the basics.


On 2/19/2024 7:25 AM, Remi Forax wrote:
> I agree with Brian here,
> as a teacher, you have to talk about parsing and formatting, those 
> should not be hidden.
>
> The classical program is:
>    input -> strings -> objects -> strings -> output
>
> Rémi
>
> ------------------------------------------------------------------------
>
>     *From: *"Tagir Valeev" <amaembo at gmail.com>
>     *To: *"Cay Horstmann" <cay at horstmann.com>
>     *Cc: *"Brian Goetz" <brian.goetz at oracle.com>, "amber-dev"
>     <amber-dev at openjdk.org>
>     *Sent: *Monday, February 19, 2024 10:09:35 AM
>     *Subject: *Re: SimpleIO in JEP draft 8323335
>
>     I agree that simple methods to get numeric input are essential for
>     beginners. They should not be distracted with a complex ceremony.
>     Instead, they should be able to learn control flow statements and
>     simple algorithms as soon as possible, having a simple way to get
>     numbers from the user.
>     With best regards,
>     Tagir Valeev.
>
>     On Mon, Feb 19, 2024 at 9:10 AM Cay Horstmann <cay at horstmann.com>
>     wrote:
>
>         Yes, that's what I am saying. If scanners live in vain, stick
>         with a subset of the Console methods. Use its readLine. Make
>         it so that SimpleIO uses System.console(). And add print and
>         println to Console.
>
>         The JEP talks about being able to start programming without
>         having to know about static methods. How does a beginner read
>         a number? With Integer.parseInt(readLine(prompt))?
>
>         What about locales? Is print/println localized? Console.printf
>         is. If so, how are beginners from around the world supposed to
>         read localized numbers? With
>         NumberFormat.getInstance().parse(readLine(prompt))?
>
>         Adding localized readInt/readDouble to SimpleIO might do the
>         trick. Do they consume the trailing newline? (The equivalent
>         Scanner methods don't, which is definitely a sharp edge for
>         beginners.)
>
>         On 18/02/2024 23.08, Brian Goetz wrote:
>         > OK, so is this really just that you are bikeshedding
>         the name?  Renaming `input` to `readLine`?
>         >
>         > This is a perfectly reasonable naming choice, of course, but
>         also, not what you suggested the first time around:
>         >
>         >  > ... "a third API" ...
>         >
>         >  > ... "there are two feasible directions" ...
>         >
>         > So what exactly are you suggesting?
>         >
>         >
>         >
>         > On 2/18/2024 5:03 PM, Cay Horstmann wrote:
>         >> Like I said, either the scanner methods or the console
>         methods are fine.
>         >>
>         >> I am of course aware of the utility/complexity of Scanner,
>         and can understand the motivation to have a simpler/feebler
>         behavior in SimpleIO. Like the one in Console.
>         >>
>         >> You don't have to "get a console". A SimpleIO.readLine
>         method can just invoke readLine on the system console.
>         >>
>         >> My objection is to add yet another "input" method into the
>         mix. "input" is weak. Does it read a token or the entire line?
>         Does it consume the newline? And if it does just what readLine
>         does, why another method name? Because "input" is three
>         characters fewer? Let's not count characters.
>         >>
>         >> On 18/02/2024 22.43, Brian Goetz wrote:
>         >>> I think you are counting characters and not counting concepts.
>         >>>
>         >>> Scanner has a ton of complexity in it that can easily trip
>         up beginners.  The main sin (though there are others) is that
>         input and parsing are complected (e.g., nextInt), which only
>         causes more problems (e.g., end of line issues.)   Reading
>         from the console is clearly a () -> String operation.  The
>         input() method does one thing, which is get a line of text. 
>         That's simple.
>         >>>
>         >>> Integer.parseInt (or, soon, patterns that match against
>         string and bind an int) also does one thing: convert a string
>         to an int.  It may seem verbose to have to do both explicitly,
>         but it allows each of these operations to be simple, and it is
>         perfectly obvious what is going on. On the other hand, Scanner
>         is a world of complexity on its own.
>         >>>
>         >>> Console::readLine is nice, but first you have to get a
>         Console. ("Why can I print something without having to get
>         some magic helper object, but I can't do the same for
>         reading?")  What we're optimizing for here is conceptual
>         simplicity; the simplest possible input method is the inverse
>         of println.  The fact that input has to be validated is a fact
>         of life; we can treat validation separately from IO (and we
>         should), and it gets simpler when you do.
>         >>>
>         >>> On 2/18/2024 4:12 PM, Cay Horstmann wrote:
>         >>>> I would like to comment on the simplicity of
>         https://openjdk.org/jeps/8323335 for beginning students.
>         >>>>
>         >>>> I am the author of college texts for introductory
>         programming. Like other authors, I introduce the Scanner class
>         (and not Console) for reading user input. Given that students
>         already know about System.out, it is simpler to call
>         >>>>
>         >>>> System.out.print("How old are you? ");
>         >>>> int x = in.nextInt(); // in is a Scanner
>         >>>>
>         >>>> than
>         >>>>
>         >>>> int x = Integer.parseInt(console.readLine("How old are
>         you? "));
>         >>>>
>         >>>> or with the JEP draft:
>         >>>>
>         >>>> int x = Integer.parseInt(input("How old are you? "));
>         >>>>
>         >>>> Then again, having a prompt string is nice too, so I
>         could imagine using the Console API with Integer.parseInt and
>         Double.parseDouble, instead of Scanner.nextInt/nextDouble.
>         >>>>
>         >>>> But why have a third API, i.e. "input"?
>         >>>>
>         >>>> I think there are two feasible directions. Either embrace
>         the Scanner API and next/nextInt/nextDouble/nextLine, or the
>         Console API and readLine. Adding "input" into the mix is just
>         clutter, and ambiguous clutter at that. At least readLine
>         makes it clear that the entire line is consumed.
>         >>>>
>         >>>> Cheers,
>         >>>>
>         >>>> Cay
>         >>>>
>         >>>> --
>         >>>>
>         >>>> Cay S. Horstmann |
>         http://horstmann.com
>         | mailto:cay at horstmann.com
>         >>>
>         >>
>         >
>
>         -- 
>
>         --
>
>         Cay S. Horstmann | http://horstmann.com
>         | mailto:cay at horstmann.com
>
>