RFR: 8301065: Handle control characters in java_lang_String::print [v2]
Ioi Lam
iklam at openjdk.org
Mon Apr 17 04:39:46 UTC 2023
On Fri, 14 Apr 2023 20:16:18 GMT, Kevin Walls <kevinw at openjdk.org> wrote:
>> src/hotspot/share/classfile/javaClasses.cpp line 768:
>>
>>> 766: for (int index = 0; index < length; index++) {
>>> 767: jchar c = value->byte_at(index);
>>> 768: if (c < ' ') {
>>
>> `byte_at()` will give the wrong result if `is_latin1` is false.
>>
>> Maybe we should use `java_lang_String::as_quoted_ascii()` instead?
>
> At first I read that comment as pointing out that this will not print all characters. But that is the existing behaviour, the existing java_lang_String::print does not handle extended/non-ASCII characters, it just doesn't print them.
>
> This is called only at debug time, including hs_err reporting, so expect nobody yet thought that lack of the extended characters was a problem (if it is, maybe we should handle it in a separate change).
>
> But it should be better using the existing condition to fetch a char or byte, then check if that's a low/control character value.
>
> I looked at using java_lang_String::as_quoted_ascii(), and then we see escaped versions of extended characters, as well as control characters. At least we do see the extended characters, rather than nothing, but a concern might be that as_quoted_ascii makes an allocation, and this is called at debug time, maybe after a crash.
>
> I also looked at the UNICODE::as_utf8 conversion methods, they all require a resource area allocation at some point.
>
> Separately I tried a version which made an allocation with NEW_RESOURCE_ARRAY_RETURN_NULL (which as_quoted_ascii does) and called UNICODE::as_utf8 to get a utf8 string, then prints that (escaping the low-numbered characters). That works and shows extended characters, but the allocation might be a concern. I was thinking better to get this check done, and visit extended characters separately if there's interest.
This new version looks good to me. I agree that the extra allocation in UNICODE::as_utf8 might be a concern.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/12190#discussion_r1168150388
More information about the hotspot-runtime-dev
mailing list