RFR: 8274329: Fix non-portable HotSpot code in MethodMatcher::parse_method_pattern

Ioi Lam iklam at openjdk.java.net
Thu Sep 30 16:23:33 UTC 2021


On Thu, 30 Sep 2021 08:31:51 GMT, Jie Fu <jiefu at openjdk.org> wrote:

> > `RANGEBASE` was added by [JDK-6500501](https://bugs.openjdk.java.net/browse/JDK-6500501) and later was modified by [JDK-8027829](https://bugs.openjdk.java.net/browse/JDK-8027829)
> > Note the original comment from 6500501:
> > ```
> > // The characters allowed in a class or method name.  All characters > 0x7f
> > // are allowed in order to handle obfuscated class files (e.g. Volano)
> > ```
> 
> Thanks @vnkozlov for your very helpful comments.
> 
> I have one question: how can we specify non-ASCII chars and non-printable ASCII chars through `-XX:CompileCommand`?
> 
> I just learned from https://bugs.openjdk.java.net/browse/JDK-8027829 that we can use unicode like `\uxxxx`. But it doesn't work in my experiments.
> 
> My example was made from: https://bugs.openjdk.java.net/secure/attachment/17128/UnicodeIdentifierTest.java
> 
> ```
> public class UnicodeIdentifierTest {
>     public static void main(String args[]) {
>         System.out.println("Can I use \\u0001 in identifier name? " +
>                            (Character.isJavaIdentifierPart(1) ? "yes" : "no"));
>         for (int i = 0; i < 100000; i++ )
>         methodWithUnicode\u0001Char();
> 
>         System.out.println("Can I use \\u00aa in identifier name? " +
>                            (Character.isJavaIdentifierPart(0xaa) ? "yes" : "no"));
>         for (int i = 0; i < 100000; i++ )
>         methodWithUnicode\u00aaChar();
> 
>         System.out.println("Can I use \\u006b in identifier name? " +
>                            (Character.isJavaIdentifierPart(0x6b) ? "yes" : "no"));
>         for (int i = 0; i < 100000; i++ )
>         methodWithUnicode\u006bChar();
> 
>     }
>     public static int a = 0;
>     public static void methodWithUnicode\u0001Char() {
>         a++;
>     }
> 
>     public static void methodWithUnicode\u00aaChar() {
>         a++;
>     }
> 
>     public static void methodWithUnicode\u006bChar() {
>         a++;
>     }
> }
> ```
> 
> And I tried to exclude some specific methods like this
> 
> ```
> ${JDK}/bin/java \
>    -XX:+PrintCompilation \
>    -XX:CompileCommand=exclude,`echo -e "UnicodeIdentifierTest::methodWithUnicode\u0001Char"` \
>    -XX:CompileCommand=exclude,`echo -e "UnicodeIdentifierTest.methodWithUnicode\u0001Char"` \
>    -XX:CompileCommand=exclude,"UnicodeIdentifierTest.methodWithUnicode\u0001Char" \
>    -XX:CompileCommand=exclude,'UnicodeIdentifierTest.methodWithUnicode\u0001Char' \
>    -XX:CompileCommand=exclude,UnicodeIdentifierTest.methodWithUnicode\u0001Char \
>    -XX:CompileCommand=exclude,`echo -e "UnicodeIdentifierTest::methodWithUnicode\u00aaChar"` \
>    -XX:CompileCommand=exclude,`echo -e "UnicodeIdentifierTest.methodWithUnicode\u00aaChar"` \
>    -XX:CompileCommand=exclude,"UnicodeIdentifierTest.methodWithUnicode\u00aaChar" \
>    -XX:CompileCommand=exclude,'UnicodeIdentifierTest.methodWithUnicode\u00aaChar' \
>    -XX:CompileCommand=exclude,UnicodeIdentifierTest.methodWithUnicode\u00aaChar \
>    -XX:CompileCommand=exclude,`echo -e "UnicodeIdentifierTest::methodWithUnicode\u006bChar"` \
>    -XX:CompileCommand=exclude,`echo -e "UnicodeIdentifierTest.methodWithUnicode\u006bChar"` \
>    -XX:CompileCommand=exclude,"UnicodeIdentifierTest.methodWithUnicode\u006bChar" \
>    -XX:CompileCommand=exclude,'UnicodeIdentifierTest.methodWithUnicode\u006bChar' \
>    -XX:CompileCommand=exclude,UnicodeIdentifierTest.methodWithUnicode\u006bChar \
>    ${TEST}
> ```
> 
> But none of them worked.
> 
> So if there is no other way to specify non-ASCII chars, it seems safe to remove the non-ASCII code.
> 
> If I missed something, please let me know. Thanks.

(The Chinese characters in this comment may not be displayed properly inside an e-mail reader. Please see this comment on GitHub https://github.com/openjdk/jdk/pull/5704)

-XX:CompileCommand does not process \uxxxx sequences. However, if your shell's locale is UTF-8, you can enter the characters directly on the command line, without \u escapes, like this:


public class CJK {
    public static void main(String args[]) {
        \u722a\u54c7();
    }

    static void \u722a\u54c7() { // Chinese word for "Java"
        Thread.dumpStack();
    }
}
=======
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

$ javac CJK.java
$ java -Xcomp -XX:-BackgroundCompilation -XX:CompileCommand='compileonly,*::爪哇' -XX:+PrintCompilation -cp . CJK > log.txt
java.lang.Exception: Stack trace
	at java.base/java.lang.Thread.dumpStack(Thread.java:1380)
	at CJK.爪哇  (CJK.java:7)
	at CJK.main(CJK.java:3)
$ grep '^   ' log.txt
     53    1    b  3       CJK::\u722a\u54c7 (4 bytes)
     53    2    b  4       CJK::\u722a\u54c7 (4 bytes)
     53    1       3       CJK::\u722a\u54c7 (4 bytes)   made not entrant
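
To illustrate why the escaped form cannot match, here is a minimal sketch. javac translates Unicode escapes in source, so the method name stored in the class file is the two actual characters, while an escape typed on the command line stays as 12 plain ASCII characters. The class `NameBytes` and the helper `utf8Hex` are hypothetical names made up for this illustration, and the sketch assumes the shell passes the pattern to the VM as UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the JDK or of this PR.
public class NameBytes {
    // Render the UTF-8 encoding of a string as space-separated hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // javac translates these escapes, so this is the two-character method name.
        String translated = "\u722a\u54c7";
        // Doubled backslashes keep the escape text literal, as a shell would pass it.
        String literal = "\\u722a\\u54c7";

        System.out.println(utf8Hex(translated));        // e7 88 aa e5 93 87
        System.out.println(literal.length());           // 12
        System.out.println(translated.equals(literal)); // false
    }
}
```

In a UTF-8 locale, typing the two characters directly sends the six bytes printed on the first line, which is presumably why the unescaped pattern matches while the escaped one does not.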

-------------

PR: https://git.openjdk.java.net/jdk/pull/5704


