RFR: 8301065: Handle control characters in java_lang_String::print [v2]
David Holmes
dholmes at openjdk.org
Mon Apr 17 05:16:38 UTC 2023
On Fri, 14 Apr 2023 20:21:34 GMT, Kevin Walls <kevinw at openjdk.org> wrote:
>> Change to avoid printing raw control characters in java_lang_String::print.
>> Usually called in debug printing and error reporting. One usage from debug logging for modules.
>>
>> Format as \x followed by two hex digits, e.g. \x0A
>>
>> Small change, in a routine with few callers. It may make two calls to value->byte_at(); I left that in because it reads more clearly, and this routine is not called in time-critical situations.
>>
>> The error reporting and debug.cpp usages I can test manually, with some trial and error, trying to get the register info to contain a relevant value. This same String printing routine is used for showing register contents or stack slot mappings, both for a String and for a class containing a String.
>>
>> Before the change: (newlines and null embedded in String)
>>
>> R11=0x000000011f74e568 is an oop: java.lang.String
>> {0x000000011f74e568} - klass: 'java/lang/String'
>> - string: "this is my string
>> ^@and also: this is more of my string
>> "
>> - ---- fields (total size 3 words):
>> - private 'hash' 'I' @12 0 (0x00000000)
>>
>> With the change:
>>
>> {0x000000011faf5bf0} - klass: 'Test'
>> - ---- fields (total size 3 words):
>> - private 'lastChar' 'C' @12 10 (0x000a)
>> - private 'myString' 'Ljava/lang/String;' @16 "this is my string\x0A\x00and also: this is more of my string\x0A"{0x000000011faf5ca8} (0x23f5eb95)
>> R15=0x00007f928002a260 is a thread
>
> Kevin Walls has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - update: get char or byte for control char check
> - Merge branch 'master' into escape_objectprinting
> - 8301065: Handle control characters in java_lang_String::print
Escaping the control characters this way seems quite reasonable to me. I assume we never intentionally include them, but when tracking down a bug we might be printing a corrupt string, and escaping then helps show the actual character content. Arguably, in such a situation all non-printable characters should be escaped, so a second RFE could handle that generalization.
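
For illustration only, the escaping described in the PR text (a control character printed as \x followed by two hex digits) could be sketched in standalone C++ roughly as below. This is not the HotSpot code: the helper name `escape_control_chars`, the use of `std::string`, and the `c < 0x20` cutoff are all assumptions made for the sketch; the actual change works on the String's value array and an outputStream.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of HotSpot): returns a copy of `in` with
// every byte below 0x20 replaced by "\xNN", where NN is two uppercase hex
// digits. The 0x20 cutoff is an assumption about what counts as a
// "control character" here.
std::string escape_control_chars(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in) {
    if (c < 0x20) {
      char buf[5];  // backslash, 'x', two hex digits, terminating NUL
      std::snprintf(buf, sizeof(buf), "\\x%02X", static_cast<unsigned>(c));
      out += buf;
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}
```

Widening the `c < 0x20` test (for example, to also cover 0x7F and above) would be one way to approach the generalization to all non-printable characters; which characters to include would be a decision for that follow-up RFE.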
Thanks.
-------------
Marked as reviewed by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/12190#pullrequestreview-1387261325