RFR: 8330532: Improve line-oriented text parsing in HotSpot

John R Rose jrose at openjdk.org
Mon Apr 22 22:06:29 UTC 2024


On Mon, 22 Apr 2024 20:52:13 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text)
>> 
>> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example:
>> 
>> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169
>> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066
>> 
>> Common problems:
>> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors.
>> - There's ad-hoc code that deals with `FILE*` differently than from memory.
>> 
>> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs.
>> 
>> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it.
>> 
>> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.)
>> 
>> Credit:
>> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html .
>> 
>> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR).
>
> src/hotspot/share/utilities/istream.hpp line 150:
> 
>> 148:     assert(_buffer_size == 0 || _next <= _buffer_size, "");
>> 149:     return true;
>> 150:   }
> 
> Please add message, even if only "invariant".

No, that’s not necessary.  There are many, many empty assert strings in HotSpot.  If there’s no message, it means “check the code logic here”.  You don’t need to say “invariant” or “sanity” or “must be” as a redundant means of conveying that message.  Although, some authors do this.  But if there are a long string of asserts, saying “invariant” that many times is simply noise.

> src/hotspot/share/utilities/istream.hpp line 207:
> 
>> 205:       const_cast<inputStream*>(this)->fill_buffer();
>> 206:     }
>> 207:   }
> 
> Why `const_cast` and assign this method `const` when it clearly is not? This shouldn't be `const`, is my point.

The method is const because the logical state of the stream is invariant, as visible to the API user.  If the implementation needs an invisible internal state change, it needs a const-cast (or mutable field, sometimes, will work).  If `preload` were made non-const as you suggest, we’d need to move the const-casting elsewhere, and it would be less clear that `preload` preserves API-visible state.  So the code, as it is, is the most convenient place to put the const-cast, as an internal implementation decision.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575402794
PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575401801


More information about the hotspot-dev mailing list