From maurizio.cimadamore at oracle.com Wed May 1 07:59:47 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 1 May 2019 08:59:47 +0100 Subject: [foreign] Supporting different versions of LLVM In-Reply-To: <65236dd243d6a0530d60816a9b92fc0b@xs4all.nl> References: <65236dd243d6a0530d60816a9b92fc0b@xs4all.nl> Message-ID: Hi Jorn, thanks for following up issues with the LLVM community. (another area where it would be nice to get some help would be in macro expression evaluation - there's an expression evaluation API but doesn't work on macros). I think that, as developers it would be relatively easy to update our versions of LLVM. The bigger issue would be in trying and adjusting our scripts for the EA builds - while on Linux we build from sources, I believe for other platforms we just use ready-made binary snapshots, so if we bump clang, these scripts would need to be updated too. When the time comes, I can ping Erik from the infra team and see if we can bump LLVM. Maurizio On 29/04/2019 12:44, Jorn Vernee wrote: > Hi, > > Over the weekend I have been looking into a fix for: > https://bugs.openjdk.java.net/browse/JDK-8223031 > > The underlying problem is that the behaviour of one of the libclang > function, namely `clang_Cursor_isAnonymous`, was changed to do > something very different [1], and I haven't been able to find a good > workaround that covers all needed cases. > > So, instead I've submitted a patch to clang to add a new function that > has the old behaviour [2], which is currently under review and seems > to be a-go for commit soon. With a small patch on our side we can make > our code work with the new function [3]. > > While I was at it I also submitted a patch for the incomplete array > problem I worked around before [4]. > > Hopefully these patches will make it into the next LLVM release. But, > it brings up the question of which LLVM versions should be supported > by us. 
Currently I'm locally using version 7.0.1 which works fine, and
> version 8.0.0 is definitely not supported due to the
> `clang_Cursor_isAnonymous` change. Once a version is released that has
> [2], we can support that version as well, and with a version that
> supports [4] we can also remove the incomplete array problem
> workaround (which is pretty extensive). Though the latter would mean
> dropping support for earlier LLVM versions as well. I think as long as
> there's a binary distribution available this will not be a problem. We
> can update the lib-clang autoconf script to check the LLVM version,
> and warn about any that are not supported.
>
> How does that sound?
>
> Thanks,
> Jorn
>
> [1] : https://reviews.llvm.org/D54996
> [2] : https://reviews.llvm.org/D61232
> [3] :
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223031/webrev.00/
> [4] : https://reviews.llvm.org/D61239

From maurizio.cimadamore at oracle.com Wed May 1 10:39:51 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 1 May 2019 11:39:51 +0100
Subject: [foreign] jextract issue with system headers
Message-ID: <14fa63ac-7ec2-f18f-a837-f7feb6722c82@oracle.com>

Hi,
I was trying to rerun some examples I had with the new generation scheme. One of them has failed - it is a simple getpid test, that is, an extraction of the 'unistd.h' file:

$ jextract -t stdlib -d classes /usr/include/unistd.h
java.lang.RuntimeException: In memory compilation failed:
/stdlib/x86_64_linux_gnu/bits/confname_h.java:2451: error: method _PC_LINK_MAX() is already defined in interface confname_h
    public int _PC_LINK_MAX();
               ^
/stdlib/x86_64_linux_gnu/bits/confname_h.java:2459: error: method _PC_MAX_CANON() is already defined in interface confname_h
    public int _PC_MAX_CANON();
               ^
/stdlib/x86_64_linux_gnu/bits/confname_h.java:2467: error: method _PC_MAX_INPUT() is already defined in interface confname_h
    public int _PC_MAX_INPUT();

The output goes on with 300 errors.

Turns out that jextract is generating duplicate constants, one coming from the enum constant and one coming from a #define with same name - the pattern in the header is as follows:

/* Values for the NAME argument to `pathconf' and `fpathconf'. */
enum
  {
    _PC_LINK_MAX,
#define _PC_LINK_MAX                    _PC_LINK_MAX
    _PC_MAX_CANON,
#define _PC_MAX_CANON                   _PC_MAX_CANON
    _PC_MAX_INPUT,
...
}

Should jextract check for duplicates before emitting a constant with same name? Or should we use some kind of mangling to distinguish between the #defined constants and the enum ones? I think I'd rather see the former in the short term.

Maurizio

From jbvernee at xs4all.nl Wed May 1 10:59:42 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Wed, 01 May 2019 12:59:42 +0200
Subject: [foreign] jextract issue with system headers
In-Reply-To: <14fa63ac-7ec2-f18f-a837-f7feb6722c82@oracle.com>
References: <14fa63ac-7ec2-f18f-a837-f7feb6722c82@oracle.com>
Message-ID:

Interesting. I think the defines are for use with #ifdef

Checking for duplicates seems like a good choice. I think discarding the macro constant, and keeping the one from the enum is fine in this case.

Jorn

Maurizio Cimadamore wrote on 2019-05-01 12:39:
> Hi,
> I was trying to rerun some examples I had with the new generation
> scheme. One of them has failed - it is a simple getpid test, that is,
> an extraction of the 'unistd.h' file:
>
> $ jextract -t stdlib -d classes /usr/include/unistd.h
> java.lang.RuntimeException: In memory compilation failed:
> /stdlib/x86_64_linux_gnu/bits/confname_h.java:2451: error: method
> _PC_LINK_MAX() is already defined in interface confname_h
>     public int _PC_LINK_MAX();
>                ^
> /stdlib/x86_64_linux_gnu/bits/confname_h.java:2459: error: method
> _PC_MAX_CANON() is already defined in interface confname_h
>     public int _PC_MAX_CANON();
>                ^
> /stdlib/x86_64_linux_gnu/bits/confname_h.java:2467: error: method
> _PC_MAX_INPUT() is already defined in interface confname_h
>     public int _PC_MAX_INPUT();
>
> The output goes on with 300 errors.
>
> Turns out that jextract is generating duplicate constants, one coming
> from the enum constant and one coming from a #define with same name -
> the pattern in the header is as follows:
>
> /* Values for the NAME argument to `pathconf' and `fpathconf'. */
> enum
>   {
>     _PC_LINK_MAX,
> #define _PC_LINK_MAX                    _PC_LINK_MAX
>     _PC_MAX_CANON,
> #define _PC_MAX_CANON                   _PC_MAX_CANON
>     _PC_MAX_INPUT,
> ...
> }
>
>
> Should jextract check for duplicates before emitting a constant with
> same name? Or should we use some kind of mangling to distinguish
> between the #defined constants and the enum ones? I think I'd rather
> see the former in the short term.
>
> Maurizio

From jbvernee at xs4all.nl Wed May 1 13:06:46 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Wed, 01 May 2019 15:06:46 +0200
Subject: [foreign] RFR 8223185: Consolidate jextract output logic in Writer
Message-ID: <21f193d28b1fc46e5ee99c768747a904@xs4all.nl>

Hi,

I have been experimenting with a plugin based on jextract's new source code generation, for my earlier idea to generate LayoutType constants in various generated classes [1].

I ran into a problem using --src-dump-dir. Since the dumping is done inline in the JavaSourceFactory* classes this meant the changes from the plugin were not being shown in the output.

It seems good to move the source output logic to Writer, since then we can do the source output once and for all in Main, after the jextract run. I've also moved the JModWriter and JarWriter calls into Writer, so that we have 1 main interface for doing output (Writer).
Please review the following:

Bug: https://bugs.openjdk.java.net/browse/JDK-8223185
Webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223185/webrev.00/

Thanks,
Jorn

[1] : https://bugs.openjdk.java.net/browse/JDK-8220063

From henry.jen at oracle.com Wed May 1 19:43:23 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Wed, 1 May 2019 12:43:23 -0700
Subject: [foreign] RFR:8223105
Message-ID: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com>

Hi,

Please review a webrev[1] for JDK-8223105[2]. This is a gcc extension[3] (also supported by clang) and likely observed in system headers.
I don't find same feature for Windows, so I cannot have the test case for Windows.

A different approach to enable alias is used for Windows (and is a portable way) using macro, but this is not the same and generated Java interface will have real name for it. Perhaps that can be a separate RFE to add alias when a macro definition is a single identifier match to a variable or function.

Cheers,
Henry

[1] http://cr.openjdk.java.net/~henryjen/panama/8223105/0/webrev/
[2] https://bugs.openjdk.java.net/browse/JDK-8223105
[3] https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Asm-Labels.html

From maurizio.cimadamore at oracle.com Thu May 2 10:27:37 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 2 May 2019 11:27:37 +0100
Subject: [foreign] RFR:8223105
In-Reply-To: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com>
References: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com>
Message-ID:

Overall looks good!

Curious: why were all decls in DuplicateDeclarationHandler changed from List to ArrayList ?

JavaSourceFactory: I'm not super convinced of the logic here - it seems like in some cases (e.g. functions) we build a layout with the right 'name' annotation on it and then we lean on that layout to extract the name info. In other cases, e.g. global variables, we leave the layout as is, and we then have to tweak it on the fly while generating the code.
I believe it would be better/more uniform if the 'name' annotation were injected on all layouts when the tree is created? If you go down that path, I believe that would make the addition of the VarTree::label method useless?

FunctionTree:

return label.isPresent() ? fn.withAnnotation("name", label.get()) : fn;

This code could use the static NAME field in Layout.

Maurizio

On 01/05/2019 20:43, Henry Jen wrote:
> Hi,
>
> Please review a webrev[1] for JDK-8223105[2]. This is a gcc extension[3] (also supported by clang) and likely observed in system headers.
> I don't find same feature for Windows, so I cannot have the test case for Windows.
>
> A different approach to enable alias is used for Windows (and is a portable way) using macro, but this is not the same and generated Java interface will have real name for it. Perhaps that can be a separate RFE to add alias when a macro definition is a single identifier match to a variable or function.
>
> Cheers,
> Henry
>
>
> [1] http://cr.openjdk.java.net/~henryjen/panama/8223105/0/webrev/
> [2] https://bugs.openjdk.java.net/browse/JDK-8223105
> [3] https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Asm-Labels.html

From maurizio.cimadamore at oracle.com Thu May 2 10:34:59 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 2 May 2019 11:34:59 +0100
Subject: [foreign] RFR 8223185: Consolidate jextract output logic in Writer
In-Reply-To: <21f193d28b1fc46e5ee99c768747a904@xs4all.nl>
References: <21f193d28b1fc46e5ee99c768747a904@xs4all.nl>
Message-ID:

Overall I like it. The only thing I don't like much is that Main seems to be in control of whether classes are being compiled (see call to Writer::compileClasses). I think it would be better if that was hidden inside Writer itself - e.g. in order to write classfiles, jars or jmod, first the internals will have to call 'compileClasses'. That way, Main would be free from any ordering issue.
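
To make the ordering point concrete, here is a rough sketch of a writer whose output methods trigger compilation themselves. It is not the code from the webrev: `SketchWriter`, its methods and the `compileCount` field are hypothetical names, and the "compilation" is a placeholder that just turns source text into bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for jextract's Writer; names and behaviour are assumptions.
class SketchWriter {
    private final Map<String, String> sources = new LinkedHashMap<>(); // class name -> source text
    private Map<String, byte[]> classes;   // null until the sources have been compiled
    int compileCount = 0;                  // only here to show that compilation runs once

    void addSource(String className, String source) {
        sources.put(className, source);
        classes = null;                    // new source invalidates earlier results
    }

    // Private on purpose: callers cannot forget it or call it in the wrong order.
    private Map<String, byte[]> compiled() {
        if (classes == null) {
            compileCount++;
            Map<String, byte[]> out = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : sources.entrySet()) {
                // Placeholder for the real in-memory javac step.
                out.put(e.getKey(), e.getValue().getBytes(StandardCharsets.UTF_8));
            }
            classes = out;
        }
        return classes;
    }

    // Returns the names of the class files that would be written.
    List<String> writeClassFiles() {
        List<String> written = new ArrayList<>();
        for (String name : compiled().keySet()) {
            written.add(name + ".class");
        }
        return written;
    }

    // Returns the jar entries that would be written.
    List<String> writeJar() {
        List<String> entries = new ArrayList<>();
        for (String name : compiled().keySet()) {
            entries.add("jar:" + name + ".class");
        }
        return entries;
    }
}
```

With this shape a caller such as Main can request jar and class file output in any order, and compilation still happens exactly once, on first use.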
Maurizio

On 01/05/2019 14:06, Jorn Vernee wrote:
> Hi,
>
> I have been experimenting with a plugin based on jextract's new source
> code generation, for my earlier idea to generate LayoutType constants
> in various generated classes [1].
>
> I ran into a problem using --src-dump-dir. Since the dumping is done
> inline in the JavaSourceFactory* classes this meant the changes
> from the plugin were not being shown in the output.
>
> It seems good to move the source output logic to Writer, since then we
> can do the source output once and for all in Main, after the jextract
> run. I've also moved the JModWriter and JarWriter calls into Writer,
> so that we have 1 main interface for doing output (Writer).
>
> Please review the following:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8223185
> Webrev:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223185/webrev.00/
>
> Thanks,
> Jorn
>
> [1] : https://bugs.openjdk.java.net/browse/JDK-8220063

From maurizio.cimadamore at oracle.com Thu May 2 10:50:43 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 2 May 2019 11:50:43 +0100
Subject: [foreign] does static forwarder generation has to depend on -l presence?
Message-ID:

Hi,
as I was playing with some stdlib examples, I noted that jextract was not generating any static forwarder for me. I was then reminded that static forwarders are only emitted if some library is passed via the -l option.

Is this too strict? After all, we have a 'default' Library that we can use as resolution context, which will pick up anything loaded by the VM (which also includes things in paths specified in sys variables like LD_LIBRARY_PATH). So, would it make sense to relax static forwarder generation so that, in the absence of -l, it will just use the default library?

That would improve usability in certain cases.
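
The relaxation asked about here could look roughly like the following. This is only a sketch under stated assumptions: `ForwarderPolicy`, `DEFAULT_LIBRARY` and the three methods are hypothetical and do not mirror jextract's real option handling; the string stands in for the VM-wide default Library.

```java
import java.util.List;

// Hypothetical helper, not jextract code; all names are assumptions.
class ForwarderPolicy {
    // Placeholder for the 'default' Library that resolves anything loaded by the VM.
    static final String DEFAULT_LIBRARY = "<default>";

    // Behaviour described above: forwarders are emitted only if -l was given.
    static boolean strict(List<String> libs, boolean requested) {
        return requested && !libs.isEmpty();
    }

    // Relaxed behaviour: emit forwarders unless the user turned them off.
    static boolean relaxed(boolean requested) {
        return requested;
    }

    // Resolution context: the -l libraries if any, else the default library.
    static List<String> resolutionContext(List<String> libs) {
        return libs.isEmpty() ? List.of(DEFAULT_LIBRARY) : List.copyOf(libs);
    }
}
```

Under the strict rule a run without -l never gets a forwarder; under the relaxed rule it does, and symbol lookup simply falls back to the default library.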
Maurizio

From jbvernee at xs4all.nl Thu May 2 11:21:25 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Thu, 02 May 2019 13:21:25 +0200
Subject: [foreign] RFR 8223185: Consolidate jextract output logic in Writer
In-Reply-To:
References: <21f193d28b1fc46e5ee99c768747a904@xs4all.nl>
Message-ID:

Thanks, I had my doubts about that myself.

Updated webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223185/webrev.01/

FWIW; I thought the compilation was a pretty big side-effect to have, so it should rather be triggered explicitly. But, I didn't really have a strong opinion either way.

Jorn

Maurizio Cimadamore wrote on 2019-05-02 12:34:
> Overall I like it. The only thing I don't like much is that Main seems
> to be in control of whether classes are being compiled (see call to
> Writer::compileClasses). I think it would be better if that was hidden
> inside Writer itself - e.g. in order to write classfiles, jars or
> jmod, first the internals will have to call 'compileClasses'. That
> way, Main would be free from any ordering issue.
>
> Maurizio
>
> On 01/05/2019 14:06, Jorn Vernee wrote:
>> Hi,
>>
>> I have been experimenting with a plugin based on jextract's new source
>> code generation, for my earlier idea to generate LayoutType constants
>> in various generated classes [1].
>>
>> I ran into a problem using --src-dump-dir. Since the dumping is done
>> inline in the JavaSourceFactory* classes this meant the changes
>> from the plugin were not being shown in the output.
>>
>> It seems good to move the source output logic to Writer, since then we
>> can do the source output once and for all in Main, after the jextract
>> run. I've also moved the JModWriter and JarWriter calls into Writer,
>> so that we have 1 main interface for doing output (Writer).
>> >> Please review the following: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8223185 >> Webrev: >> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223185/webrev.00/ >> >> Thanks, >> Jorn >> >> [1] : https://bugs.openjdk.java.net/browse/JDK-8220063 From maurizio.cimadamore at oracle.com Thu May 2 11:27:39 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 2 May 2019 12:27:39 +0100 Subject: [foreign] RFR 8223185: Consolidate jextract output logic in Writer In-Reply-To: References: <21f193d28b1fc46e5ee99c768747a904@xs4all.nl> Message-ID: Looks good! Yes, compilation is important - but I always look at the code from the perspective of: what would happen in 5 months when we update the code to do something else? The fact that we have an ordering dependency between Main and Writer is likely going to bite once we all forgot about this (and we will :-)). Maurizio On 02/05/2019 12:21, Jorn Vernee wrote: > Thanks, I had my doubts about that myself. > > Updated webrev: > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223185/webrev.01/ > > FWIW; I thought the compilation was a pretty big side-effect to have, > so it should rather be triggered explicitly. But, I didn't really have > a strong opinion either way. > > Jorn > > Maurizio Cimadamore schreef op 2019-05-02 12:34: >> Overall I like it. The only thing I don't like much is that Main seems >> to be in control of whether classes are being compiled (see call to >> Writer::compileClasses). I think it would be better if that was hidden >> inside Writer itself - e.g. in order to write classfiles, jars or >> jmod, first the internals will have to call 'compileClasses'. That >> way, Main would be free from any ordering issue. >> >> Maurizio >> >> On 01/05/2019 14:06, Jorn Vernee wrote: >>> Hi, >>> >>> I have been experimenting with a plugin based on jextract's new >>> source code generation, for my earlier idea to generate LayoutType >>> constants in various generated classes [1]. 
>>>
>>> I ran into a problem using --src-dump-dir. Since the dumping is done
>>> inline in the JavaSourceFactory* classes this meant the changes
>>> from the plugin were not being shown in the output.
>>>
>>> It seems good to move the source output logic to Writer, since then
>>> we can do the source output once and for all in Main, after the
>>> jextract run. I've also moved the JModWriter and JarWriter calls
>>> into Writer, so that we have 1 main interface for doing output
>>> (Writer).
>>>
>>> Please review the following:
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223185
>>> Webrev:
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8223185/webrev.00/
>>>
>>> Thanks,
>>> Jorn
>>>
>>> [1] : https://bugs.openjdk.java.net/browse/JDK-8220063

From jbvernee at xs4all.nl Thu May 2 11:30:21 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Thu, 02 May 2019 13:30:21 +0200
Subject: [foreign] does static forwarder generation has to depend on -l presence?
In-Reply-To:
References:
Message-ID:

I think there was some caution when they were added, but over time we have started relying on static forwarders more and more. I think it makes sense to always generate the static forwarders. The generation can always be turned off again using the command line option if needed.

Jorn

Maurizio Cimadamore wrote on 2019-05-02 12:50:
> Hi,
> as I was playing with some stdlib examples, I noted that jextract was
> not generating any static forwarder for me. I was then reminded that
> static forwarders are only emitted if some library is passed via the
> -l option.
>
> Is this too strict? After all, we have a 'default' Library that we can
> use as resolution context, which will pick up anything loaded by the
> VM (which also includes things in paths specified in sys variables
> like LD_LIBRARY_PATH). So, would it make sense to relax static
> forwarder generation so that, in the absence of -l, it will just use
> the default library?
> > That would improve usability in certain cases. > > Maurizio From jbvernee at xs4all.nl Thu May 2 11:32:19 2019 From: jbvernee at xs4all.nl (jbvernee at xs4all.nl) Date: Thu, 02 May 2019 11:32:19 +0000 Subject: hg: panama/dev: 8223185: Consolidate jextract output logic in Writer Message-ID: <201905021132.x42BWKTd011699@aojmv0008.oracle.com> Changeset: c462b33549f4 Author: jvernee Date: 2019-05-02 13:31 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/c462b33549f4 8223185: Consolidate jextract output logic in Writer Reviewed-by: mcimadamore ! src/jdk.jextract/share/classes/com/sun/tools/jextract/HeaderFile.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactory.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactoryExt.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JextractTool.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Options.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Writer.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/resources/Messages.properties ! test/jdk/com/sun/tools/jextract/Runner.java From sundararajan.athijegannathan at oracle.com Thu May 2 11:53:54 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 02 May 2019 17:23:54 +0530 Subject: [foreign] does static forwarder generation has to depend on -l presence? In-Reply-To: References: Message-ID: <5CCADA52.7000007@oracle.com> Default library on Windows is a bit costly, right? i.e., we have to walk all loaded modules? The API used is meant for debug use if I recall. -Sundar On 02/05/19, 5:00 PM, Jorn Vernee wrote: > I think there was some caution when they were added, but over time we > have started relying on static forwarders more and more. I think it > makes sense to always generate the static forwarders. 
The generation > can always be turned off again using the command line option if needed. > > Jorn > > Maurizio Cimadamore schreef op 2019-05-02 12:50: >> Hi, >> as I was playing with some stdlib examples, I noted that jextract was >> not generating any static forwarder for me. I was then reminded that >> static forwarders are only emitted if some library is passed via the >> -l option. >> >> Is this too strict? After all, we have a 'default' Library that we can >> use as resolution context, which will pick up anything loaded by the >> VM (which also includes things in paths specified in sys variables >> like LD_LIBRARY_PATH). So, would it make sense to relax static >> forwarder generation so that, in the absence of -l, it will just use >> the default library? >> >> That would improve usability in certain cases. >> >> Maurizio From jbvernee at xs4all.nl Thu May 2 12:18:53 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 02 May 2019 14:18:53 +0200 Subject: [foreign] does static forwarder generation has to depend on -l presence? In-Reply-To: <5CCADA52.7000007@oracle.com> References: <5CCADA52.7000007@oracle.com> Message-ID: Yes, we have to walk all modules loaded by the process. But, I imagine the internal implementation of RTLD_DEFAULT is pretty similar? The LD_LIBRARY_PATH behavior doesn't apply to Windows either, since it's an implementation detail of dlsym, so using the default library really only gives someone access to libraries loaded previously by some other mechanism, e.g. System.loadLibrary. I think it's not needed to worry about performance at this point. If better performance is needed the -l option can be used as well. Jorn Sundararajan Athijegannathan schreef op 2019-05-02 13:53: > Default library on Windows is a bit costly, right? i.e., we have to > walk all loaded modules? The API used is meant for debug use if I > recall. 
> > -Sundar > > On 02/05/19, 5:00 PM, Jorn Vernee wrote: >> I think there was some caution when they were added, but over time we >> have started relying on static forwarders more and more. I think it >> makes sense to always generate the static forwarders. The generation >> can always be turned off again using the command line option if >> needed. >> >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-02 12:50: >>> Hi, >>> as I was playing with some stdlib examples, I noted that jextract was >>> not generating any static forwarder for me. I was then reminded that >>> static forwarders are only emitted if some library is passed via the >>> -l option. >>> >>> Is this too strict? After all, we have a 'default' Library that we >>> can >>> use as resolution context, which will pick up anything loaded by the >>> VM (which also includes things in paths specified in sys variables >>> like LD_LIBRARY_PATH). So, would it make sense to relax static >>> forwarder generation so that, in the absence of -l, it will just use >>> the default library? >>> >>> That would improve usability in certain cases. >>> >>> Maurizio From sundararajan.athijegannathan at oracle.com Thu May 2 12:59:08 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 02 May 2019 18:29:08 +0530 Subject: is this the issue that you faced with latest llvm? Message-ID: <5CCAE99C.9060501@oracle.com> Hi Jorn, Jim Laskey is facing this issue -> https://bugs.openjdk.java.net/browse/JDK-8223238 Is this the same jextract issue you faced with the latest llvm? or something else? Thanks -Sundar From jbvernee at xs4all.nl Thu May 2 13:06:46 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 02 May 2019 15:06:46 +0200 Subject: is this the issue that you faced with latest llvm? In-Reply-To: <5CCAE99C.9060501@oracle.com> References: <5CCAE99C.9060501@oracle.com> Message-ID: <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl> Hi Sundar, No, this looks like another issue. 
The issue I found did not cause a StackOverflowError. It only caused struct/union classes to not be generated. Cheers, Jorn Sundararajan Athijegannathan schreef op 2019-05-02 14:59: > Hi Jorn, > > Jim Laskey is facing this issue -> > https://bugs.openjdk.java.net/browse/JDK-8223238 > > Is this the same jextract issue you faced with the latest llvm? or > something else? > > Thanks > -Sundar From jbvernee at xs4all.nl Thu May 2 13:08:03 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 02 May 2019 15:08:03 +0200 Subject: [foreign] does static forwarder generation has to depend on -l presence? In-Reply-To: References: <5CCADA52.7000007@oracle.com> Message-ID: <122d2185b5127e6dcca07007e3dd4ba0@xs4all.nl> Thinking some more about this; Note that the presence of static forwarders doesn't change the use of the default library. Currently, if -l is not specified, the default library will still be used by the binder for the header class if no libraries attribute is specified in NativeHeader. Alternatively, we could also do away with the native RTLD_DEFAULT and EnumProcessModules lookup and instead implement the default library behaviour on the Java side as a SymbolLookup instance that iterates a set of libraries e.g. taken from LD_LIBRARY_PATH, and perhaps others, and tries the symbol lookup in each one. This ensures the behaviour is more consistent across platforms. Jorn Jorn Vernee schreef op 2019-05-02 14:18: > Yes, we have to walk all modules loaded by the process. But, I imagine > the internal implementation of RTLD_DEFAULT is pretty similar? > > The LD_LIBRARY_PATH behavior doesn't apply to Windows either, since > it's an implementation detail of dlsym, so using the default library > really only gives someone access to libraries loaded previously by > some other mechanism, e.g. System.loadLibrary. > > I think it's not needed to worry about performance at this point. If > better performance is needed the -l option can be used as well. 
> > Jorn > > Sundararajan Athijegannathan schreef op 2019-05-02 13:53: >> Default library on Windows is a bit costly, right? i.e., we have to >> walk all loaded modules? The API used is meant for debug use if I >> recall. >> >> -Sundar >> >> On 02/05/19, 5:00 PM, Jorn Vernee wrote: >>> I think there was some caution when they were added, but over time we >>> have started relying on static forwarders more and more. I think it >>> makes sense to always generate the static forwarders. The generation >>> can always be turned off again using the command line option if >>> needed. >>> >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-02 12:50: >>>> Hi, >>>> as I was playing with some stdlib examples, I noted that jextract >>>> was >>>> not generating any static forwarder for me. I was then reminded that >>>> static forwarders are only emitted if some library is passed via the >>>> -l option. >>>> >>>> Is this too strict? After all, we have a 'default' Library that we >>>> can >>>> use as resolution context, which will pick up anything loaded by the >>>> VM (which also includes things in paths specified in sys variables >>>> like LD_LIBRARY_PATH). So, would it make sense to relax static >>>> forwarder generation so that, in the absence of -l, it will just use >>>> the default library? >>>> >>>> That would improve usability in certain cases. >>>> >>>> Maurizio From sundararajan.athijegannathan at oracle.com Thu May 2 13:14:23 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 02 May 2019 18:44:23 +0530 Subject: [foreign] does static forwarder generation has to depend on -l presence? 
In-Reply-To: <122d2185b5127e6dcca07007e3dd4ba0@xs4all.nl> References: <5CCADA52.7000007@oracle.com> <122d2185b5127e6dcca07007e3dd4ba0@xs4all.nl> Message-ID: <5CCAED2F.601@oracle.com> Maurizio convinced me offline ;) I'll do change static forwarder generation (generate regardless of -l) Thanks, -Sundar On 02/05/19, 6:38 PM, Jorn Vernee wrote: > Thinking some more about this; Note that the presence of static > forwarders doesn't change the use of the default library. Currently, > if -l is not specified, the default library will still be used by the > binder for the header class if no libraries attribute is specified in > NativeHeader. > > Alternatively, we could also do away with the native RTLD_DEFAULT and > EnumProcessModules lookup and instead implement the default library > behaviour on the Java side as a SymbolLookup instance that iterates a > set of libraries e.g. taken from LD_LIBRARY_PATH, and perhaps others, > and tries the symbol lookup in each one. This ensures the behaviour is > more consistent across platforms. > > Jorn > > Jorn Vernee schreef op 2019-05-02 14:18: >> Yes, we have to walk all modules loaded by the process. But, I imagine >> the internal implementation of RTLD_DEFAULT is pretty similar? >> >> The LD_LIBRARY_PATH behavior doesn't apply to Windows either, since >> it's an implementation detail of dlsym, so using the default library >> really only gives someone access to libraries loaded previously by >> some other mechanism, e.g. System.loadLibrary. >> >> I think it's not needed to worry about performance at this point. If >> better performance is needed the -l option can be used as well. >> >> Jorn >> >> Sundararajan Athijegannathan schreef op 2019-05-02 13:53: >>> Default library on Windows is a bit costly, right? i.e., we have to >>> walk all loaded modules? The API used is meant for debug use if I >>> recall. 
>>> >>> -Sundar >>> >>> On 02/05/19, 5:00 PM, Jorn Vernee wrote: >>>> I think there was some caution when they were added, but over time >>>> we have started relying on static forwarders more and more. I think >>>> it makes sense to always generate the static forwarders. The >>>> generation can always be turned off again using the command line >>>> option if needed. >>>> >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-02 12:50: >>>>> Hi, >>>>> as I was playing with some stdlib examples, I noted that jextract was >>>>> not generating any static forwarder for me. I was then reminded that >>>>> static forwarders are only emitted if some library is passed via the >>>>> -l option. >>>>> >>>>> Is this too strict? After all, we have a 'default' Library that we >>>>> can >>>>> use as resolution context, which will pick up anything loaded by the >>>>> VM (which also includes things in paths specified in sys variables >>>>> like LD_LIBRARY_PATH). So, would it make sense to relax static >>>>> forwarder generation so that, in the absence of -l, it will just use >>>>> the default library? >>>>> >>>>> That would improve usability in certain cases. >>>>> >>>>> Maurizio From sundararajan.athijegannathan at oracle.com Thu May 2 14:47:17 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 02 May 2019 20:17:17 +0530 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option Message-ID: <5CCB02F5.6030500@oracle.com> Please review. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ Thanks, -Sundar From henry.jen at oracle.com Thu May 2 15:09:52 2019 From: henry.jen at oracle.com (Henry Jen) Date: Thu, 2 May 2019 08:09:52 -0700 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: <5CCB02F5.6030500@oracle.com> References: <5CCB02F5.6030500@oracle.com> Message-ID: I was thinking about this because I need static forwarder with default library. I think it's better to have static forwarder initially based on if -l is provided, or explicit set to true assuming the default library should be enough for binding. Cheers, Henry
diff -r c6aa368eeed0 -r 237718a86bbe src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java
--- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java Wed May 01 23:00:41 2019 -0700
+++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java Wed May 01 23:14:20 2019 -0700
@@ -117,11 +117,11 @@
         }

         // generate static forwarder class if user specified -l option
-        boolean staticForwarder = true;
+        boolean staticForwarder = options.has("l");
         if (options.has("static-forwarder")) {
             staticForwarder = (boolean)options.valueOf("static-forwarder");
         }
-        builder.setGenStaticForwarder(staticForwarder && options.has("l"));
+        builder.setGenStaticForwarder(staticForwarder);

         boolean recordLibraryPath = options.has("record-library-path");
         if (recordLibraryPath) {
> On May 2, 2019, at 7:47 AM, Sundararajan Athijegannathan wrote: > > Please review.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 > Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ > > Thanks, > -Sundar From maurizio.cimadamore at oracle.com Thu May 2 15:17:18 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 2 May 2019 16:17:18 +0100 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: References: <5CCB02F5.6030500@oracle.com> Message-ID: <3c04eeb9-abf0-246d-251a-1a5bee16a6cd@oracle.com> Honestly this looks good as is. Henry do you have some specific case in mind you'd like to prevent? If this is just 'fear of misuse', I suggest let's try it out and see where we land Maurizio On 02/05/2019 16:09, Henry Jen wrote: > I was thinking about this because I need static forwarder with default library. I think it's better to have static forwarder initially based on if -l is provided, or explicit set to true assuming the default library should be enough for binding. > > Cheers, > Henry >
> diff -r c6aa368eeed0 -r 237718a86bbe src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java
> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java Wed May 01 23:00:41 2019 -0700
> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java Wed May 01 23:14:20 2019 -0700
> @@ -117,11 +117,11 @@
>          }
>
>          // generate static forwarder class if user specified -l option
> -        boolean staticForwarder = true;
> +        boolean staticForwarder = options.has("l");
>          if (options.has("static-forwarder")) {
>              staticForwarder = (boolean)options.valueOf("static-forwarder");
>          }
> -        builder.setGenStaticForwarder(staticForwarder && options.has("l"));
> +        builder.setGenStaticForwarder(staticForwarder);
>
>          boolean recordLibraryPath = options.has("record-library-path");
>          if (recordLibraryPath) {
> > >> On May 2, 2019, at 7:47 AM, Sundararajan Athijegannathan wrote: >> >> Please review.
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 >> Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ >> >> Thanks, >> -Sundar From jbvernee at xs4all.nl Thu May 2 15:23:07 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 02 May 2019 17:23:07 +0200 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: <5CCB02F5.6030500@oracle.com> References: <5CCB02F5.6030500@oracle.com> Message-ID: <0afefd02f2943575110d8fd20b996a41@xs4all.nl> Regarding the change in Messages.properties; For the Writer patch I just pushed, there were 3 `cannot.write.xyz.file` already. So, I replaced them with a `cannot.write.file` where the `xyz` is passed in as a format argument (and this allowed some minor simplification of the code in Main as well). It looks like I missed one use sites of `cannot.write.class.file` in JavaSourceFactory. Should this also use `cannot.write.file` and pass in "class" as the first formatting argument, or are more specific message formats preferred? Thanks, Jorn Sundararajan Athijegannathan schreef op 2019-05-02 16:47: > Please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 > Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ > > Thanks, > -Sundar From sundararajan.athijegannathan at oracle.com Thu May 2 15:37:58 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 02 May 2019 21:07:58 +0530 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: <0afefd02f2943575110d8fd20b996a41@xs4all.nl> References: <5CCB02F5.6030500@oracle.com> <0afefd02f2943575110d8fd20b996a41@xs4all.nl> Message-ID: <5CCB0ED6.3080707@oracle.com> I don't mind either way - I had to add resource to avoid a test failure. 
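To illustrate the `cannot.write.file` consolidation Jorn asks about, here is a minimal sketch that assumes the resource is a java.text.MessageFormat-style pattern. The pattern text, class name and method name are hypothetical; they are not copied from Messages.properties or from Main.

```java
import java.text.MessageFormat;

// Hypothetical helper; the real jextract resource text may differ.
final class WriteErrors {
    // One generic pattern replaces cannot.write.class.file,
    // cannot.write.jar.file, and similar per-kind keys:
    // {0} is the kind of file, {1} is the path.
    static final String CANNOT_WRITE_FILE = "cannot write {0} file: {1}";

    static String cannotWrite(String kind, String path) {
        return MessageFormat.format(CANNOT_WRITE_FILE, kind, path);
    }
}
```

Passing "class" as the first argument yields the same text a dedicated key would, which is why a single key can serve every call site.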
-Sundar On 02/05/19, 8:53 PM, Jorn Vernee wrote: > Regarding the change in Messages.properties; > > For the Writer patch I just pushed, there were 3 > `cannot.write.xyz.file` already. So, I replaced them with a > `cannot.write.file` where the `xyz` is passed in as a format argument > (and this allowed some minor simplification of the code in Main as well). > > It looks like I missed one use sites of `cannot.write.class.file` in > JavaSourceFactory. Should this also use `cannot.write.file` and pass > in "class" as the first formatting argument, or are more specific > message formats preferred? > > Thanks, > Jorn > > Sundararajan Athijegannathan schreef op 2019-05-02 16:47: >> Please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 >> Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ >> >> Thanks, >> -Sundar From sundararajan.athijegannathan at oracle.com Thu May 2 16:04:40 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 02 May 2019 21:34:40 +0530 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: <5CCB0ED6.3080707@oracle.com> References: <5CCB02F5.6030500@oracle.com> <0afefd02f2943575110d8fd20b996a41@xs4all.nl> <5CCB0ED6.3080707@oracle.com> Message-ID: <5CCB1518.4030900@oracle.com> Updated: https://cr.openjdk.java.net/~sundar/8223247/webrev.01/ -Sundar On 02/05/19, 9:07 PM, Sundararajan Athijegannathan wrote: > I don't mind either way - I had to add resource to avoid a test failure. > > -Sundar > > On 02/05/19, 8:53 PM, Jorn Vernee wrote: >> Regarding the change in Messages.properties; >> >> For the Writer patch I just pushed, there were 3 >> `cannot.write.xyz.file` already. So, I replaced them with a >> `cannot.write.file` where the `xyz` is passed in as a format argument >> (and this allowed some minor simplification of the code in Main as >> well). 
>> >> It looks like I missed one use sites of `cannot.write.class.file` in >> JavaSourceFactory. Should this also use `cannot.write.file` and pass >> in "class" as the first formatting argument, or are more specific >> message formats preferred? >> >> Thanks, >> Jorn >> >> Sundararajan Athijegannathan schreef op 2019-05-02 16:47: >>> Please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 >>> Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ >>> >>> Thanks, >>> -Sundar From jbvernee at xs4all.nl Thu May 2 16:17:39 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 02 May 2019 18:17:39 +0200 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: <5CCB1518.4030900@oracle.com> References: <5CCB02F5.6030500@oracle.com> <0afefd02f2943575110d8fd20b996a41@xs4all.nl> <5CCB0ED6.3080707@oracle.com> <5CCB1518.4030900@oracle.com> Message-ID: <48a8679b8178d9cdbfa42c41630bc4c5@xs4all.nl> Looks good! Cheers, Jorn Sundararajan Athijegannathan schreef op 2019-05-02 18:04: > Updated: https://cr.openjdk.java.net/~sundar/8223247/webrev.01/ > > -Sundar > > On 02/05/19, 9:07 PM, Sundararajan Athijegannathan wrote: >> I don't mind either way - I had to add resource to avoid a test >> failure. >> >> -Sundar >> >> On 02/05/19, 8:53 PM, Jorn Vernee wrote: >>> Regarding the change in Messages.properties; >>> >>> For the Writer patch I just pushed, there were 3 >>> `cannot.write.xyz.file` already. So, I replaced them with a >>> `cannot.write.file` where the `xyz` is passed in as a format argument >>> (and this allowed some minor simplification of the code in Main as >>> well). >>> >>> It looks like I missed one use sites of `cannot.write.class.file` in >>> JavaSourceFactory. Should this also use `cannot.write.file` and pass >>> in "class" as the first formatting argument, or are more specific >>> message formats preferred? 
>>> >>> Thanks, >>> Jorn >>> >>> Sundararajan Athijegannathan schreef op 2019-05-02 16:47: >>>> Please review. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 >>>> Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ >>>> >>>> Thanks, >>>> -Sundar From sundararajan.athijegannathan at oracle.com Thu May 2 16:26:53 2019 From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com) Date: Thu, 02 May 2019 16:26:53 +0000 Subject: hg: panama/dev: 8223247: jextract should generate static forwarder regardless of -l option Message-ID: <201905021626.x42GQsBD011617@aojmv0008.oracle.com> Changeset: 5bd15b90f1bf Author: sundar Date: 2019-05-02 22:00 +0530 URL: http://hg.openjdk.java.net/panama/dev/rev/5bd15b90f1bf 8223247: jextract should generate static forwarder regardless of -l option Reviewed-by: mcimadamore, jvernee ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactory.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java ! test/jdk/com/sun/tools/jextract/BadBitfieldTest.java ! test/jdk/com/sun/tools/jextract/systemHeaders/SystemHeadersTest.java From henry.jen at oracle.com Thu May 2 16:55:45 2019 From: henry.jen at oracle.com (Henry Jen) Date: Thu, 2 May 2019 09:55:45 -0700 Subject: [foreign] RFR 8223247: jextract should generate static forwarder regardless of -l option In-Reply-To: <3c04eeb9-abf0-246d-251a-1a5bee16a6cd@oracle.com> References: <5CCB02F5.6030500@oracle.com> <3c04eeb9-abf0-246d-251a-1a5bee16a6cd@oracle.com> Message-ID: <0D400DFB-7B5C-4562-80E7-856794200AAC@oracle.com> Since the static forwarder will try to bind, it must have a library. This simply makes the intention of using the default library explicit, as it seems we are preferring explicitness. Either way will work fine, I think. It is just that some may simply "forget" to make sure the library is needed. Cheers, Henry > On May 2, 2019, at 8:17 AM, Maurizio Cimadamore wrote: > > Honestly this looks good as is.
Henry do you have some specific case in mind you'd like to prevent? > > If this is just 'fear of misuse', I suggest let's try it out and see where we land > > Maurizio > > On 02/05/2019 16:09, Henry Jen wrote: >> I was thinking about this because I need static forwarder with default library. I think it's better to have static forwarder initially based on if -l is provided, or explicit set to true assuming the default library should be enough for binding. >> >> Cheers, >> Henry >>
>> diff -r c6aa368eeed0 -r 237718a86bbe src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java
>> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java Wed May 01 23:00:41 2019 -0700
>> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java Wed May 01 23:14:20 2019 -0700
>> @@ -117,11 +117,11 @@
>>          }
>>
>>          // generate static forwarder class if user specified -l option
>> -        boolean staticForwarder = true;
>> +        boolean staticForwarder = options.has("l");
>>          if (options.has("static-forwarder")) {
>>              staticForwarder = (boolean)options.valueOf("static-forwarder");
>>          }
>> -        builder.setGenStaticForwarder(staticForwarder && options.has("l"));
>> +        builder.setGenStaticForwarder(staticForwarder);
>>
>>          boolean recordLibraryPath = options.has("record-library-path");
>>          if (recordLibraryPath) {
>> >> >>> On May 2, 2019, at 7:47 AM, Sundararajan Athijegannathan wrote: >>> >>> Please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223247 >>> Webrev: https://cr.openjdk.java.net/~sundar/8223247/webrev.00/ >>> >>> Thanks, >>> -Sundar From henry.jen at oracle.com Thu May 2 17:08:50 2019 From: henry.jen at oracle.com (Henry Jen) Date: Thu, 2 May 2019 10:08:50 -0700 Subject: [foreign] RFR:8223105 In-Reply-To: References: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com> Message-ID: <4FBC155B-A0F0-4ABC-AA7E-3DFB80C8F0DD@oracle.com> > On May 2, 2019, at 3:27 AM, Maurizio Cimadamore wrote: > > Overall looks good!
> > Curious: why were all decls in DuplicateDeclarationHandler changed from List to ArrayList ? I tried to save the order of encounter and use set when duplicate is encountered again. Don't think that matters though. > > JavaSourceFactory: I'm not super convinced of the logic here - it seems like in some cases (e.g. functions) we build a layout with the right 'name' annotation on it and then we lean on that layout to extract the name info. In other cases, e.g. global variables, we leave the layout as is, and we then have to tweak it on the fly while generating the code. I believe it would be better/more uniform if the 'name' annotation would be injected on all layouts when the tree is created? If you go down that path, I believe that would make the addition of the VarTree::label method useless? > > FunctionTree: > > return label.isPresent() ? fn.withAnnotation("name", label.get()) : fn; > > > > This code could use the static NAME field in Layout. > I noticed that inconsistency, but because the static name field of VarTree is use for the accessor function name, we cannot use that field. But we can put in the name annotation of layout. The reason I didn't go with that is because I am not sure if current behavior(layout without name annotation) is for a reason. Cheers, Henry > Maurizio > > > On 01/05/2019 20:43, Henry Jen wrote: >> Hi, >> >> Please review a webrev[1] for JDK-8223105[2]. This is an gcc extension[3](also supported by clang) and likely observed in system headers. >> I don't find same feature for Windows, so I cannot have the test case for Windows. >> >> A different approach to enable alias is used for Windows(and is a portable way) using macro, but this is not the same and generated Java interface will have real name for it. Perhaps that can be a separate RFE to add alias when a macro definition is a single identifier match to a variable or function.
>> >> Cheers, >> Henry >> >> >> [1] >> http://cr.openjdk.java.net/~henryjen/panama/8223105/0/webrev/ >> >> [2] >> https://bugs.openjdk.java.net/browse/JDK-8223105 >> >> [3] >> https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Asm-Labels.html From maurizio.cimadamore at oracle.com Thu May 2 17:36:09 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 2 May 2019 18:36:09 +0100 Subject: [foreign] RFR:8223105 In-Reply-To: <4FBC155B-A0F0-4ABC-AA7E-3DFB80C8F0DD@oracle.com> References: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com> <4FBC155B-A0F0-4ABC-AA7E-3DFB80C8F0DD@oracle.com> Message-ID: On 02/05/2019 18:08, Henry Jen wrote: > I noticed that inconsistency, but because the static name field of VarTree is use for the accessor function name, we cannot use that field. But we can put in the name annotation of layout. The reason I didn't go with that is because I am not sure if current behavior(layout without name annotation) is for a reason. I think that, whenever the underlying name doesn't match the cursor spelling, it is reasonable to have a name annotation embedded in the layout. Currently we do not add any name annotation to the global var layouts simply because we assume that cursor spelling == name. That might be a fine assumption, but the _asm_ label changes that. I think where I'd like to land is a place where these things are handled in an uniform way - so, the layout associated to a FunctionTree/VarTree can either: * have no name annotation - in which case: cursor spelling == Java member name == native symbol name * have a name annotation - in which case: cursor spelling == Java member name != native symbol name There's a catch though; when we emit layouts for global variables, we always need a name annotation, no matter what (that's how we can associate global layouts to global symbols), e.g. : @NativeHeader(globals = { "i32(errno)", "u64(environ):u64:v"} ) This introduces a slight asymmetry w.r.t.
functions, because, essentially, globals always need some name annotation (to be referenced by accessors) whereas functions don't. That said, I think that adding a name annotation on the global VarTree layout which takes into account the label thingie is likely the right thing to do. That way, globals will always come equipped with the 'right' name annotation, which can then be used when generating both the toplevel annotation and the associated accessors. Maurizio From henry.jen at oracle.com Thu May 2 19:28:51 2019 From: henry.jen at oracle.com (Henry Jen) Date: Thu, 2 May 2019 12:28:51 -0700 Subject: [foreign] RFR:8223105 In-Reply-To: References: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com> <4FBC155B-A0F0-4ABC-AA7E-3DFB80C8F0DD@oracle.com> Message-ID: Updated webrev[1]. Removed label() method from FunctionTree and VarTree, VarTree layout() will always have name, and JavaSourceFactory simply use that layout. Cheers, Henry [1] http://cr.openjdk.java.net/~henryjen/panama/8223105/1/webrev/ > On May 2, 2019, at 10:36 AM, Maurizio Cimadamore wrote: > > > On 02/05/2019 18:08, Henry Jen wrote: >> I noticed that inconsistency, but because the static name field of VarTree is use for the accessor function name, we cannot use that field. But we can put in the name annotation of layout. The reason I didn't go with that is because I am not sure if current behavior(layout without name annotation) is for a reason. > > I think that, whenever the underlying name doesn't match the cursor spelling, it is reasonable to have a name annotation embedded in the layout. Currently we do not add any name annotation to the global var layouts simply because we assume that cursor spelling == name. That might be a fine assumption, but the _asm_ label changes that.
> > I think where I'd like to land is a place where these things are handled in an uniform way - so, the layout associated to a FunctionTree/VarTree can either: > > * have no name annotation - in which case: cursor spelling == Java member name == native symbol name > * have a name annotation - in which case: cursor spelling == Java member name != native symbol name > > There's a catch though; when we emit layouts for global variables, we always need a name annotation, no matter what (that's how we can associate global layouts to global symbols), e.g. : > > @NativeHeader(globals = { > "i32(errno)", > "u64(environ):u64:v"} > ) > > This introduces a slight asymmetry w.r.t. functions, because, essentially, globals always need some name annotation (to be referenced by accessors) whereas functions don't. > > That said, I think that adding a name annotation on the global VarTree layout which takes into account the label thingie is likely the right thing to do. That way, globals will always come equipped with the 'right' name annotation, which can then be used when generating both the toplevel annotation and the associated accessors. > > Maurizio > From maurizio.cimadamore at oracle.com Thu May 2 20:16:05 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 2 May 2019 21:16:05 +0100 Subject: [foreign] RFR:8223105 In-Reply-To: References: <62FFC162-2AB4-4BCC-9576-27D2A242AF9D@oracle.com> <4FBC155B-A0F0-4ABC-AA7E-3DFB80C8F0DD@oracle.com> Message-ID: <1ab96ec0-b2f1-8214-26f2-89478d6afc36@oracle.com> Looks good! Maurizio On 02/05/2019 20:28, Henry Jen wrote: > Updated webrev[1]. > > Removed label() method from FunctionTree and VarTree, VarTree layout() will always have name, and JavaSourceFactory simply use that layout. 
> > Cheers, > Henry > > [1] http://cr.openjdk.java.net/~henryjen/panama/8223105/1/webrev/ > >> On May 2, 2019, at 10:36 AM, Maurizio Cimadamore wrote: >> >> >> On 02/05/2019 18:08, Henry Jen wrote: >>> I noticed that inconsistency, but because the static name field of VarTree is use for the accessor function name, we cannot use that field. But we can put in the name annotation of layout. The reason I didn't go with that is because I am not sure if current behavior(layout without name annotation) is for a reason. >> I think that, whenever the underlying name doesn't match the cursor spelling, it is reasonable to have a name annotation embedded in the layout. Currently we do not add any name annotation to the global var layouts simply because we assume that cursor spelling == name. That might be a fine assumption, but the _asm_ label changes that. >> >> I think where I'd like to land is a place where these things are handled in an uniform way - so, the layout associated to a FunctionTree/VarTree can either: >> >> * have no name annotation - in which case: cursor spelling == Java member name == native symbol name >> * have a name annotation - in which case: cursor spelling == Java member name != native symbol name >> >> There's a catch though; when we emit layouts for global variables, we always need a name annotation, no matter what (that's how we can associate global layouts to global symbols), e.g. : >> >> @NativeHeader(globals = { >> "i32(errno)", >> "u64(environ):u64:v"} >> ) >> >> This introduces a slight asymmetry w.r.t. functions, because, essentially, globals always need some name annotation (to be referenced by accessors) whereas functions don't. >> >> That said, I think that adding a name annotation on the global VarTree layout which takes into account the label thingie is likely the right thing to do.
That way, globals will always come equipped with the 'right' name annotation, which can then be used when generating both the toplevel annotation and the associated accessors. >> >> Maurizio >> From henry.jen at oracle.com Thu May 2 23:38:16 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Thu, 02 May 2019 23:38:16 +0000 Subject: hg: panama/dev: 8223105: Support symbols in header files are alias for different name in share library Message-ID: <201905022338.x42NcHod009769@aojmv0008.oracle.com> Changeset: c0d53152a71d Author: henryjen Date: 2019-05-02 16:12 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/c0d53152a71d 8223105: Support symbols in header files are alias for different name in share library Reviewed-by: mcimadamore ! src/java.base/share/classes/java/foreign/layout/Function.java ! src/java.base/share/classes/java/foreign/memory/LayoutType.java ! src/java.base/share/classes/jdk/internal/foreign/HeaderImplGenerator.java ! src/java.base/share/classes/jdk/internal/foreign/Util.java ! src/java.base/share/classes/jdk/internal/foreign/memory/DescriptorParser.java ! src/jdk.internal.clang/share/classes/jdk/internal/clang/CursorKind.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/DuplicateDeclarationHandler.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Filters.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactory.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/FunctionTree.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/Tree.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/TreeMaker.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/VarTree.java ! 
test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/CursorKind.java + test/jdk/com/sun/tools/jextract/test8223105/Test8223105A.java + test/jdk/com/sun/tools/jextract/test8223105/Test8223105B.java + test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java + test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java + test/jdk/com/sun/tools/jextract/test8223105/libAsmSymbol.c + test/jdk/com/sun/tools/jextract/test8223105/libAsmSymbol.h From sundararajan.athijegannathan at oracle.com Fri May 3 09:17:32 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Fri, 03 May 2019 14:47:32 +0530 Subject: [foreign] RFR 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name Message-ID: <5CCC072C.2070903@oracle.com> Please review. Bug: https://bugs.openjdk.java.net/browse/JDK-8223289 Webrev: https://cr.openjdk.java.net/~sundar/8223289/webrev.00/ Thanks, -Sundar From maurizio.cimadamore at oracle.com Fri May 3 09:48:15 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 3 May 2019 10:48:15 +0100 Subject: [foreign] RFR 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name In-Reply-To: <5CCC072C.2070903@oracle.com> References: <5CCC072C.2070903@oracle.com> Message-ID: <4de36781-9f76-4a67-2651-4c1b9f85cece@oracle.com> Hi Sundar, the patch solves the problem I had with unistd.h - thanks! I noted that the code only applies the filtering logic if the enum is anonymous - but I think the non-static binding always lift enum constants (anon or not) to the header class, which, I think, means that all enum constants should be treated as 'global' ? Maurizio On 03/05/2019 10:17, Sundararajan Athijegannathan wrote: > Please review. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8223289 > Webrev: https://cr.openjdk.java.net/~sundar/8223289/webrev.00/ > > Thanks, > -Sundar From sundararajan.athijegannathan at oracle.com Fri May 3 10:04:40 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Fri, 03 May 2019 15:34:40 +0530 Subject: [foreign] RFR 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name In-Reply-To: <4de36781-9f76-4a67-2651-4c1b9f85cece@oracle.com> References: <5CCC072C.2070903@oracle.com> <4de36781-9f76-4a67-2651-4c1b9f85cece@oracle.com> Message-ID: <5CCC1238.2050906@oracle.com> Hi, Thanks. You're right, header interface lifts all enum constants. Updated: https://cr.openjdk.java.net/~sundar/8223289/webrev.01/ Thanks, -Sundar On 03/05/19, 3:18 PM, Maurizio Cimadamore wrote: > Hi Sundar, > the patch solves the problem I had with unistd.h - thanks! > > I noted that the code only applies the filtering logic if the enum is > anonymous - but I think the non-static binding always lift enum > constants (anon or not) to the header class, which, I think, means > that all enum constants should be treated as 'global' ? > > Maurizio > > On 03/05/2019 10:17, Sundararajan Athijegannathan wrote: >> Please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8223289 >> Webrev: https://cr.openjdk.java.net/~sundar/8223289/webrev.00/ >> >> Thanks, >> -Sundar From maurizio.cimadamore at oracle.com Fri May 3 10:03:59 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 3 May 2019 03:03:59 -0700 (PDT) Subject: is this the issue that you faced with latest llvm? In-Reply-To: <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl> References: <5CCAE99C.9060501@oracle.com> <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl> Message-ID: I get the same with LLVM 8 on Ubuntu 16.04 -- e.g. 
missing structs leading to errors like: /usr/include/_G_config_h.java:87: error: cannot find symbol public void __state$set(usr.include.wchar_h.__mbstate_t value); I think at this point in time it's safe to assume that the latest LLVM supported by Panama is LLVM 7. Maurizio On 02/05/2019 14:06, Jorn Vernee wrote: > Hi Sundar, > > No, this looks like another issue. The issue I found did not cause a > StackOverflowError. It only caused struct/union classes to not be > generated. > > Cheers, > Jorn > > Sundararajan Athijegannathan schreef op 2019-05-02 14:59: >> Hi Jorn, >> >> Jim Laskey is facing this issue -> >> https://bugs.openjdk.java.net/browse/JDK-8223238 >> >> Is this the same jextract issue you faced with the latest llvm? or >> something else? >> >> Thanks >> -Sundar From maurizio.cimadamore at oracle.com Fri May 3 10:05:11 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 3 May 2019 11:05:11 +0100 Subject: [foreign] RFR 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name In-Reply-To: <5CCC1238.2050906@oracle.com> References: <5CCC072C.2070903@oracle.com> <4de36781-9f76-4a67-2651-4c1b9f85cece@oracle.com> <5CCC1238.2050906@oracle.com> Message-ID: Looks good! Maurizio On 03/05/2019 11:04, Sundararajan Athijegannathan wrote: > Hi, > > Thanks. You're right, header interface lifts all enum constants. > Updated: https://cr.openjdk.java.net/~sundar/8223289/webrev.01/ > > Thanks, > -Sundar > > On 03/05/19, 3:18 PM, Maurizio Cimadamore wrote: >> Hi Sundar, >> the patch solves the problem I had with unistd.h - thanks! >> >> I noted that the code only applies the filtering logic if the enum is >> anonymous - but I think the non-static binding always lift enum >> constants (anon or not) to the header class, which, I think, means >> that all enum constants should be treated as 'global' ? >> >> Maurizio >> >> On 03/05/2019 10:17, Sundararajan Athijegannathan wrote: >>> Please review. 
>>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223289 >>> Webrev: https://cr.openjdk.java.net/~sundar/8223289/webrev.00/ >>> >>> Thanks, >>> -Sundar From james.laskey at oracle.com Fri May 3 10:18:55 2019 From: james.laskey at oracle.com (James Laskey) Date: Fri, 3 May 2019 07:18:55 -0300 Subject: is this the issue that you faced with latest llvm? In-Reply-To: References: <5CCAE99C.9060501@oracle.com> <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl> Message-ID: Could we have an option to point to a specific version of LLVM? Sent from my iPhone > On May 3, 2019, at 7:03 AM, Maurizio Cimadamore wrote: > > I get the same with LLVM 8 on Ubuntu 16.04 -- e.g. missing structs leading to errors like: > > /usr/include/_G_config_h.java:87: error: cannot find symbol > public void __state$set(usr.include.wchar_h.__mbstate_t value); > > > > > I think at this point in time it's safe to assume that the latest LLVM supported by Panama is LLVM 7. > > Maurizio > >> On 02/05/2019 14:06, Jorn Vernee wrote: >> Hi Sundar, >> >> No, this looks like another issue. The issue I found did not cause a StackOverflowError. It only caused struct/union classes to not be generated. >> >> Cheers, >> Jorn >> >> Sundararajan Athijegannathan schreef op 2019-05-02 14:59: >>> Hi Jorn, >>> >>> Jim Laskey is facing this issue -> >>> https://bugs.openjdk.java.net/browse/JDK-8223238 >>> >>> Is this the same jextract issue you faced with the latest llvm? or >>> something else? 
>>> >>> Thanks >>> -Sundar From sundararajan.athijegannathan at oracle.com Fri May 3 10:24:06 2019 From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com) Date: Fri, 03 May 2019 10:24:06 +0000 Subject: hg: panama/dev: 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name Message-ID: <201905031024.x43AO7h1023801@aojmv0008.oracle.com> Changeset: d0df5d75b402 Author: sundar Date: 2019-05-03 15:57 +0530 URL: http://hg.openjdk.java.net/panama/dev/rev/d0df5d75b402 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name Reviewed-by: mcimadamore ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactory.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactoryExt.java ! test/jdk/com/sun/tools/jextract/duplicatedecls.h From maurizio.cimadamore at oracle.com Fri May 3 11:27:37 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 3 May 2019 12:27:37 +0100 Subject: is this the issue that you faced with latest llvm? In-Reply-To: References: <5CCAE99C.9060501@oracle.com> <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl> Message-ID: <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com> You mean fail configure step if wrong version is used? If so, that would be desirable, yes Maurizio On 03/05/2019 11:18, James Laskey wrote: > Could we have an option to point to a specific version of LLVM? > > Sent from my iPhone > >> On May 3, 2019, at 7:03 AM, Maurizio Cimadamore wrote: >> >> I get the same with LLVM 8 on Ubuntu 16.04 -- e.g. missing structs leading to errors like: >> >> /usr/include/_G_config_h.java:87: error: cannot find symbol >> public void __state$set(usr.include.wchar_h.__mbstate_t value); >> >> >> >> >> I think at this point in time it's safe to assume that the latest LLVM supported by Panama is LLVM 7. 
>>
>> Maurizio
>>
>>> On 02/05/2019 14:06, Jorn Vernee wrote:
>>> Hi Sundar,
>>>
>>> No, this looks like another issue. The issue I found did not cause a StackOverflowError. It only caused struct/union classes to not be generated.
>>>
>>> Cheers,
>>> Jorn
>>>
>>> Sundararajan Athijegannathan wrote on 2019-05-02 14:59:
>>>> Hi Jorn,
>>>>
>>>> Jim Laskey is facing this issue ->
>>>> https://bugs.openjdk.java.net/browse/JDK-8223238
>>>>
>>>> Is this the same jextract issue you faced with the latest llvm? or
>>>> something else?
>>>>
>>>> Thanks
>>>> -Sundar

From jbvernee at xs4all.nl Fri May 3 11:51:18 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 03 May 2019 13:51:18 +0200
Subject: is this the issue that you faced with latest llvm?
In-Reply-To: <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
Message-ID: <9fae9bec806174fc33a9c7b5bb5f1d6b@xs4all.nl>

Note that there are several extra configure options to specify the LLVM directories:

  --with-libclang-lib=
                          Specify where to find libclang binary, so/dylib/lib
  --with-libclang-include=
                          Specify where to find libclang header files,
                          clang-c/Index.h
  --with-libclang-include-aux=
                          Specify where to find libclang auxiliary header
                          files, lib/clang//include/stddef.h
  --with-libclang-bin=
                          Specify where to find clang binary, libclang.dll

While we don't have an option to select an LLVM version (yet?), explicitly setting some of these might have the same effect.

Jorn

Maurizio Cimadamore wrote on 2019-05-03 13:27:
> You mean fail configure step if wrong version is used? If so, that
> would be desirable, yes
>
> Maurizio
>
> On 03/05/2019 11:18, James Laskey wrote:
>> Could we have an option to point to a specific version of LLVM?
>>
>> Sent from my iPhone
>>
>>> On May 3, 2019, at 7:03 AM, Maurizio Cimadamore
>>> wrote:
>>>
>>> I get the same with LLVM 8 on Ubuntu 16.04 -- e.g.
missing structs
>>> leading to errors like:
>>>
>>> /usr/include/_G_config_h.java:87: error: cannot find symbol
>>> public void __state$set(usr.include.wchar_h.__mbstate_t
>>> value);
>>>
>>>
>>>
>>>
>>> I think at this point in time it's safe to assume that the latest
>>> LLVM supported by Panama is LLVM 7.
>>>
>>> Maurizio
>>>
>>>> On 02/05/2019 14:06, Jorn Vernee wrote:
>>>> Hi Sundar,
>>>>
>>>> No, this looks like another issue. The issue I found did not cause a
>>>> StackOverflowError. It only caused struct/union classes to not be
>>>> generated.
>>>>
>>>> Cheers,
>>>> Jorn
>>>>
>>>> Sundararajan Athijegannathan wrote on 2019-05-02 14:59:
>>>>> Hi Jorn,
>>>>>
>>>>> Jim Laskey is facing this issue ->
>>>>> https://bugs.openjdk.java.net/browse/JDK-8223238
>>>>>
>>>>> Is this the same jextract issue you faced with the latest llvm? or
>>>>> something else?
>>>>>
>>>>> Thanks
>>>>> -Sundar

From james.laskey at oracle.com Fri May 3 11:57:26 2019
From: james.laskey at oracle.com (Jim Laskey)
Date: Fri, 3 May 2019 08:57:26 -0300
Subject: is this the issue that you faced with latest llvm?
In-Reply-To: <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
Message-ID:

Actually more general. We have several problems currently, all based on the fact that we make way too many assumptions about the llvm configuration.

0. --with-libclang=/usr/local is too generalized. You should be able to specify --with-libclang=/usr/local/lib/clang/9.0.0/.

1. llvm is not always in a fixed location, whether it be per platform or whether it be a local build.

2. The current lookup of llvm is very fragile, taking the highest platform ordered entry in the release directory. E.g., it will find 9.0.0 but will mess up when 10.0.0 is installed.

3. Wrongly assumes naiveté on the user's part. Maybe jextract doesn't work on 8 and 9, but if the user wants to try fixes in 9 or 10, they should be allowed to do so.

4. The shipping jdk jextract configuration may be different than what the user has installed.

Allowing the specification of llvm locale on jextract itself would resolve these issues.

-- Jim

> On May 3, 2019, at 8:27 AM, Maurizio Cimadamore wrote:
>
> You mean fail configure step if wrong version is used? If so, that would be desirable, yes
>
> Maurizio
>
> On 03/05/2019 11:18, James Laskey wrote:
>> Could we have an option to point to a specific version of LLVM?
>>
>> Sent from my iPhone
>>
>>> On May 3, 2019, at 7:03 AM, Maurizio Cimadamore wrote:
>>>
>>> I get the same with LLVM 8 on Ubuntu 16.04 -- e.g. missing structs leading to errors like:
>>>
>>> /usr/include/_G_config_h.java:87: error: cannot find symbol
>>> public void __state$set(usr.include.wchar_h.__mbstate_t value);
>>>
>>>
>>>
>>>
>>> I think at this point in time it's safe to assume that the latest LLVM supported by Panama is LLVM 7.
>>>
>>> Maurizio
>>>
>>>> On 02/05/2019 14:06, Jorn Vernee wrote:
>>>> Hi Sundar,
>>>>
>>>> No, this looks like another issue. The issue I found did not cause a StackOverflowError. It only caused struct/union classes to not be generated.
>>>>
>>>> Cheers,
>>>> Jorn
>>>>
>>>> Sundararajan Athijegannathan wrote on 2019-05-02 14:59:
>>>>> Hi Jorn,
>>>>>
>>>>> Jim Laskey is facing this issue ->
>>>>> https://bugs.openjdk.java.net/browse/JDK-8223238
>>>>>
>>>>> Is this the same jextract issue you faced with the latest llvm? or
>>>>> something else?
>>>>>
>>>>> Thanks
>>>>> -Sundar

From maurizio.cimadamore at oracle.com Fri May 3 12:30:51 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 13:30:51 +0100
Subject: is this the issue that you faced with latest llvm?
In-Reply-To:
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
Message-ID:

The current system is mostly designed to work with prebuilt binaries here:

http://releases.llvm.org/download.html

You can point the Panama JDK to any binary snapshot you want in there and it just works (I tried earlier with several versions to see if I was able to reproduce some issues).

If you want the Panama build to work with a system installation of clang/LLVM, you need to use separate configure options to tell the build where to find

1) the libclang includes
2) the libclang.so lib
3) the aux include files shipped with LLVM

$ sh configure --help | grep clang
  --with-libclang=
                          Specify path of llvm installation containing
                          libclang. Pre-built llvm binary can be downloaded
                          from http://llvm.org/releases/download.html
  --with-libclang-lib=
                          Specify where to find libclang binary, so/dylib/lib
  --with-libclang-include=
                          Specify where to find libclang header files,
                          clang-c/Index.h
  --with-libclang-include-aux=
                          Specify where to find libclang auxiliary header
                          files, lib/clang//include/stddef.h
  --with-libclang-bin=
                          Specify where to find clang binary, libclang.dll

While this can be improved, it seems somewhat documented, and has worked for us so far.

Doing --with-libclang=/usr/local/lib/clang/9.0.0/ doesn't really make sense in the current system.

Maurizio

On 03/05/2019 12:57, Jim Laskey wrote:
> Allowing the specification of llvm locale on jextract itself would resolve these issues.

From jbvernee at xs4all.nl Fri May 3 13:25:10 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 03 May 2019 15:25:10 +0200
Subject: is this the issue that you faced with latest llvm?
In-Reply-To:
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
Message-ID: <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>

I've made a quick patch that does some basic version checking and also allows selection of a specific LLVM version: http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.00/

This picks a version matching '7' by default, and allows specifying another version using e.g. --with-libclang-version=8, but also supports more complex version strings like 8.0.1 (using grep). If the selected version cannot be found, a config error is emitted. The version is ignored if --with-libclang-include-aux is specified manually (which is AFAIK the only place where the version matters).

Jorn

Maurizio Cimadamore wrote on 2019-05-03 14:30:
> The current system is mostly designed to work with prebuilt binaries
> here:
>
> http://releases.llvm.org/download.html
>
> You can point the Panama JDK to any binary snapshot you want in there
> and it just works (I tried earlier with several versions to see if I
> was able to reproduce some issues).
>
> If you want the Panama build to work with a system installation of
> clang/LLVM, you need to use separate configure options to tell the
> build where to find
>
> 1) the libclang includes
> 2) the libclang.so lib
> 3) the aux include files shipped with LLVM
>
> $ sh configure --help | grep clang
>   --with-libclang=
>                           Specify path of llvm installation containing
>                           libclang. Pre-built llvm binary can be downloaded
>                           from http://llvm.org/releases/download.html
>   --with-libclang-lib=
>                           Specify where to find libclang binary, so/dylib/lib
>   --with-libclang-include=
>                           Specify where to find libclang header files,
>                           clang-c/Index.h
>
--with-libclang-include-aux=
>                           Specify where to find libclang auxiliary header
>                           files, lib/clang//include/stddef.h
>   --with-libclang-bin=
>                           Specify where to find clang binary, libclang.dll
>
> While this can be improved, it seems somewhat documented, and has
> worked for us so far.
>
> Doing --with-libclang=/usr/local/lib/clang/9.0.0/ doesn't really make
> sense in the current system.
>
> Maurizio
>
> On 03/05/2019 12:57, Jim Laskey wrote:
>> Allowing the specification of llvm locale on jextract itself would
>> resolve these issues.

From maurizio.cimadamore at oracle.com Fri May 3 13:27:46 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 14:27:46 +0100
Subject: is this the issue that you faced with latest llvm?
In-Reply-To: <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
Message-ID:

I'll try this with our internal build infra used for EA first... :-)

Maurizio

On 03/05/2019 14:25, Jorn Vernee wrote:
> I've made a quick patch that does some basic version checking and also
> allows selection of a specific LLVM version:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.00/
>
> This picks a version matching '7' by default, and allows specifying
> another version using e.g. --with-libclang-version=8, but also
> supports more complex version strings like 8.0.1 (using grep). If the
> selected version can not be found a config error is emitted. The
> version is ignored if --with-libclang-include-aux is specified
> manually (which is AFAIK the only place where the version matters).
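
The version lookup described in the quoted patch summary above can be sketched roughly as follows. This is not the code from the webrev: the function name select_clang_version, the assumed lib/clang/<version> directory layout, and the use of grep -E with a lexicographic sort are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, NOT the code from the webrev: pick the clang resource
# directory under <llvm>/lib/clang whose name matches a requested version.
select_clang_version() {
  clang_dir=$1   # e.g. /opt/llvm/lib/clang (assumed layout)
  wanted=$2      # e.g. 7 or 8.0.1
  # Escape dots so that "8.0.1" is matched literally rather than as a regex.
  pattern=$(printf '%s' "$wanted" | sed 's/\./\\./g')
  # POSIX extended regex (-E): accept the exact name or a dotted prefix,
  # so that "1" does not accidentally select "10.0.0".
  # The lexicographic sort is a simplification (it would rank 7.9 above 7.10).
  match=$(ls "$clang_dir" 2>/dev/null | grep -E "^${pattern}(\.|\$)" | sort | tail -n 1)
  if [ -z "$match" ]; then
    echo "error: no libclang version matching '$wanted' in $clang_dir" >&2
    return 1
  fi
  printf '%s\n' "$match"
}
```

For example, select_clang_version /opt/llvm/lib/clang 7 would print the name of a 7.x.y subdirectory if one exists and fail with an error message otherwise; /opt/llvm is a placeholder path.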

From maurizio.cimadamore at oracle.com Fri May 3 14:38:00 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 15:38:00 +0100
Subject: is this the issue that you faced with latest llvm?
In-Reply-To:
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
Message-ID: <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>

This seems to fail on macOS:

jib > checking libclang version to be used... 7 (default)
jib > checking libclang auxiliary include path... usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
jib > [-e pattern] [-f file] [--binary-files=value] [--color=when]
jib > [--context[=num]] [--directories=action] [--label] [--line-buffered]
jib > [--null] [pattern] [file ...]
jib > configure: error: Can not find libclang version matching the specified version: '7' in
jib > /scratch/mesos/jib-master/install/jpg/infra/builddeps/libclang-macosx_x64/7.0.0+1.0/libclang-macosx_x64-7.0.0+1.0.tar.gz/lib/clang//7.0.0
jib > /scratch/mesos/slaves/df27b84d-b5c1-4760-9e48-df95fd33274c-S783/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c479931a-1383-4646-a0d0-8e7355fd9205/runs/aefb5cb9-ffe4-4600-b811-dfe0055e18c6/workspace/build/.configure-support/generated-configure.sh: line 82: 5: Bad file descriptor
jib > configure exiting with result code 1

Seems like an issue with grep?

Maurizio

On 03/05/2019 14:27, Maurizio Cimadamore wrote:
> I'll try this with our internal build infra used for EA first... :-)
>
> Maurizio
>
>
> On 03/05/2019 14:25, Jorn Vernee wrote:
>> I've made a quick patch that does some basic version checking and
>> also allows selection of a specific LLVM version:
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.00/
>>
>> This picks a version matching '7' by default, and allows specifying
>> another version using e.g.
--with-libclang-version=8, but also
>> supports more complex version strings like 8.0.1 (using grep). If the
>> selected version can not be found a config error is emitted. The
>> version is ignored if --with-libclang-include-aux is specified
>> manually (which is AFAIK the only place where the version matters).
>>
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-05-03 14:30:
>>> The current system is mostly designed to work with prebuilt binaries
>>> here:
>>>
>>> http://releases.llvm.org/download.html
>>>
>>> You can point the Panama JDK to any binary snapshot you want in there
>>> and it just works (I tried earlier with several versions to see if I
>>> was able to reproduce some issues).
>>>
>>> If you want the Panama build to work with a system installation of
>>> clang/LLVM, you need to use separate configure options to tell the
>>> build where to find
>>>
>>> 1) the libclang includes
>>> 2) the libclang.so lib
>>> 3) the aux include files shipped with LLVM
>>>
>>> $ sh configure --help | grep clang
>>>   --with-libclang=
>>>                           Specify path of llvm installation containing
>>>                           libclang. Pre-built llvm binary can be downloaded
>>>                           from http://llvm.org/releases/download.html
>>>   --with-libclang-lib=
>>>                           Specify where to find libclang binary, so/dylib/lib
>>>   --with-libclang-include=
>>>                           Specify where to find libclang header files,
>>>                           clang-c/Index.h
>>>   --with-libclang-include-aux=
>>>                           Specify where to find libclang auxiliary header
>>>                           files, lib/clang//include/stddef.h
>>>   --with-libclang-bin=
>>>                           Specify where to find clang binary, libclang.dll
>>>
>>> While this can be improved, it seems somewhat documented, and has
>>> worked for us so far.
>>>
>>> Doing --with-libclang=/usr/local/lib/clang/9.0.0/ doesn't really make
>>> sense in the current system.
>>>
>>> Maurizio
>>>
>>> On 03/05/2019 12:57, Jim Laskey wrote:
>>>> Allowing the specification of llvm locale on jextract itself would
>>>> resolve these issues.

From maurizio.cimadamore at oracle.com Fri May 3 14:41:03 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 15:41:03 +0100
Subject: is this the issue that you faced with latest llvm?
In-Reply-To: <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
 <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>
Message-ID: <8f10f7a0-d048-3e32-8ffd-5687bef58f03@oracle.com>

On 03/05/2019 15:38, Maurizio Cimadamore wrote:
> seems like an issue with grep?

Yeah

       -P, --perl-regexp
              Interpret the pattern as a Perl-compatible regular expression
              (PCRE). This is highly experimental and grep -P may warn of
              unimplemented features.

Why did you go for Perl-style?

Maurizio

From jbvernee at xs4all.nl Fri May 3 14:45:57 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 03 May 2019 16:45:57 +0200
Subject: is this the issue that you faced with latest llvm?
In-Reply-To: <8f10f7a0-d048-3e32-8ffd-5687bef58f03@oracle.com>
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
 <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>
 <8f10f7a0-d048-3e32-8ffd-5687bef58f03@oracle.com>
Message-ID: <65496dfa87e34692394ccdf9c9076c0f@xs4all.nl>

Ah, sorry. This was left over from when I was testing with allowing multiple different versions, e.g. `grep -P "^(10|9|8)"` or similar.
The default grep regex didn't seem to support that syntax, so I went with Perl regex, and didn't realize this wouldn't work everywhere.

The -P option can just be removed for the current implementation. Maybe if we want to support multiple different versions we will have to look into that again.

Jorn

Maurizio Cimadamore wrote on 2019-05-03 16:41:
> On 03/05/2019 15:38, Maurizio Cimadamore wrote:
>> seems like an issue with grep?
>
> Yeah
>
>        -P, --perl-regexp
>               Interpret the pattern as a Perl-compatible regular expression
>               (PCRE). This is highly experimental and grep -P may warn of
>               unimplemented features.
>
> Why did you go for Perl-style?
>
> Maurizio

From jbvernee at xs4all.nl Fri May 3 14:49:48 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 03 May 2019 16:49:48 +0200
Subject: is this the issue that you faced with latest llvm?
In-Reply-To: <65496dfa87e34692394ccdf9c9076c0f@xs4all.nl>
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
 <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>
 <8f10f7a0-d048-3e32-8ffd-5687bef58f03@oracle.com>
 <65496dfa87e34692394ccdf9c9076c0f@xs4all.nl>
Message-ID:

Updated webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.01/

Cheers,
Jorn

Jorn Vernee wrote on 2019-05-03 16:45:
> Ah, sorry. This was left over from when I was testing with allowing
> multiple different versions, e.g. `grep -P "^(10|9|8)"` or similar. The
> default grep regex didn't seem to support that syntax, so I went with
> Perl regex, and didn't realize this wouldn't work everywhere.
>
> The -P option can just be removed for the current implementation.
> Maybe if we want to support multiple different versions we will have to
> look into that again.
>
> Jorn
>
> Maurizio Cimadamore wrote on 2019-05-03 16:41:
>> On 03/05/2019 15:38, Maurizio Cimadamore wrote:
>>> seems like an issue with grep?
>>
>> Yeah
>>
>>        -P, --perl-regexp
>>               Interpret the pattern as a Perl-compatible regular expression
>>               (PCRE). This is highly experimental and grep -P may warn of
>>               unimplemented features.
>>
>> Why did you go for Perl-style?
>>
>> Maurizio

From maurizio.cimadamore at oracle.com Fri May 3 14:56:55 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 15:56:55 +0100
Subject: is this the issue that you faced with latest llvm?
In-Reply-To:
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
 <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>
 <8f10f7a0-d048-3e32-8ffd-5687bef58f03@oracle.com>
 <65496dfa87e34692394ccdf9c9076c0f@xs4all.nl>
Message-ID:

Thanks - I already tweaked that locally and triggered another build/test cycle

Maurizio

On 03/05/2019 15:49, Jorn Vernee wrote:
> Updated webrev:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.01/
>
> Cheers,
> Jorn
>
> Jorn Vernee wrote on 2019-05-03 16:45:
>> Ah, sorry. This was left over from when I was testing with allowing
>> multiple different versions, e.g. `grep -P "^(10|9|8)"` or similar. The
>> default grep regex didn't seem to support that syntax, so I went with
>> Perl regex, and didn't realize this wouldn't work everywhere.
>>
>> The -P option can just be removed for the current implementation.
>> Maybe if we want to support multiple different versions we will have to
>> look into that again.
>>
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-05-03 16:41:
>>> On 03/05/2019 15:38, Maurizio Cimadamore wrote:
>>>> seems like an issue with grep?
>>>
>>> Yeah
>>>
>>>        -P, --perl-regexp
>>>               Interpret the pattern as a
Perl-compatible regular
>>> expression
>>>               (PCRE). This is highly experimental and grep -P
>>> may warn of
>>>               unimplemented features.
>>>
>>> Why did you go for Perl-style?
>>>
>>> Maurizio

From jbvernee at xs4all.nl Fri May 3 15:00:14 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 03 May 2019 17:00:14 +0200
Subject: is this the issue that you faced with latest llvm?
In-Reply-To:
References: <5CCAE99C.9060501@oracle.com>
 <9d9d6c04a78a30b3c49fde915accba98@xs4all.nl>
 <62851261-d4d8-d01d-fa6b-80aad9cc2a63@oracle.com>
 <8cb78f1e5326a81faef26ba43c187244@xs4all.nl>
 <873982f6-95fb-67db-0b48-5a02401da62f@oracle.com>
 <8f10f7a0-d048-3e32-8ffd-5687bef58f03@oracle.com>
 <65496dfa87e34692394ccdf9c9076c0f@xs4all.nl>
Message-ID:

Oh, one more thing I realized: if we want to do things The Right Way™, we should use the tool variables defined in basics.m4 instead of hard-coding the tool name, since there might be some extra logic to find the tool (like when the tool path is manually specified).

Updated webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.02/

So, this includes basics.m4, and then uses `$LS` instead of `ls` etc.

Cheers,
Jorn

Jorn Vernee wrote on 2019-05-03 16:49:
> Updated webrev:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/llvmconf/webrev.01/
>
> Cheers,
> Jorn
>
> Jorn Vernee wrote on 2019-05-03 16:45:
>> Ah, sorry. This was left over from when I was testing with allowing
>> multiple different versions, e.g. `grep -P "^(10|9|8)"` or similar. The
>> default grep regex didn't seem to support that syntax, so I went with
>> Perl regex, and didn't realize this wouldn't work everywhere.
>>
>> The -P option can just be removed for the current implementation.
>> Maybe if we want to support multiple different versions we will have to
>> look into that again.
>>
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-05-03 16:41:
>>> On 03/05/2019 15:38, Maurizio Cimadamore wrote:
>>>> seems like an issue with grep?
>>>
>>> Yeah
>>>
>>>        -P, --perl-regexp
>>>               Interpret the pattern as a Perl-compatible regular expression
>>>               (PCRE). This is highly experimental and grep -P may warn of
>>>               unimplemented features.
>>>
>>> Why did you go for Perl-style?
>>>
>>> Maurizio

From henry.jen at oracle.com Fri May 3 15:33:42 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Fri, 3 May 2019 08:33:42 -0700
Subject: [foreign] RFR 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name
In-Reply-To:
References: <5CCC072C.2070903@oracle.com>
 <4de36781-9f76-4a67-2651-4c1b9f85cece@oracle.com>
 <5CCC1238.2050906@oracle.com>
Message-ID:

The test case makes sure jextract won't error out, but we are not clear on what to expect of the result. What definition would prevail for duplicate symbols? Undetermined/encounter order? Does it match the C result?

Cheers,
Henry

> On May 3, 2019, at 3:05 AM, Maurizio Cimadamore wrote:
>
> Looks good!
>
> Maurizio
>
> On 03/05/2019 11:04, Sundararajan Athijegannathan wrote:
>> Hi,
>>
>> Thanks. You're right, header interface lifts all enum constants. Updated: https://cr.openjdk.java.net/~sundar/8223289/webrev.01/
>>
>> Thanks,
>> -Sundar
>>
>> On 03/05/19, 3:18 PM, Maurizio Cimadamore wrote:
>>> Hi Sundar,
>>> the patch solves the problem I had with unistd.h - thanks!
>>>
>>> I noted that the code only applies the filtering logic if the enum is anonymous - but I think the non-static binding always lifts enum constants (anon or not) to the header class, which, I think, means that all enum constants should be treated as 'global'?
>>>
>>> Maurizio
>>>
>>> On 03/05/2019 10:17, Sundararajan Athijegannathan wrote:
>>>> Please review.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223289
>>>> Webrev: https://cr.openjdk.java.net/~sundar/8223289/webrev.00/
>>>>
>>>> Thanks,
>>>> -Sundar

From sundararajan.athijegannathan at oracle.com Fri May 3 16:41:38 2019
From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan)
Date: Fri, 03 May 2019 22:11:38 +0530
Subject: [foreign] RFR 8223289: jextract fails when proprocessor macro and anonymous enum constant have the same name
In-Reply-To:
References: <5CCC072C.2070903@oracle.com>
 <4de36781-9f76-4a67-2651-4c1b9f85cece@oracle.com>
 <5CCC1238.2050906@oracle.com>
Message-ID: <5CCC6F42.1010709@oracle.com>

No, it won't match the C result, because C preprocessor directives are effective from the point of the #define. Before the #define, the enum value is effective. This was done to unblock the Linux sample failures (basically anything that includes a system header like unistd.h on Linux). In that case, at least, the preprocessor macro and the enum value are the same, so it shouldn't matter there. The general case of a macro conflicting with something defined before it (with a *different* value) is not implemented.

-Sundar

On 03/05/19, 9:03 PM, Henry Jen wrote:
> The test case makes sure jextract won't error out, but we are not clear on what to expect of the result. What definition would prevail for duplicate symbols? Undetermined/encounter order? Does it match the C result?
>
> Cheers,
> Henry
>
>> On May 3, 2019, at 3:05 AM, Maurizio Cimadamore wrote:
>>
>> Looks good!
>>
>> Maurizio
>>
>> On 03/05/2019 11:04, Sundararajan Athijegannathan wrote:
>>> Hi,
>>>
>>> Thanks. You're right, header interface lifts all enum constants. Updated: https://cr.openjdk.java.net/~sundar/8223289/webrev.01/
>>>
>>> Thanks,
>>> -Sundar
>>>
>>> On 03/05/19, 3:18 PM, Maurizio Cimadamore wrote:
>>>> Hi Sundar,
>>>> the patch solves the problem I had with unistd.h - thanks!
>>>>
>>>> I noted that the code only applies the filtering logic if the enum is anonymous - but I think the non-static binding always lifts enum constants (anon or not) to the header class, which, I think, means that all enum constants should be treated as 'global'?
>>>>
>>>> Maurizio
>>>>
>>>> On 03/05/2019 10:17, Sundararajan Athijegannathan wrote:
>>>>> Please review.
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223289
>>>>> Webrev: https://cr.openjdk.java.net/~sundar/8223289/webrev.00/
>>>>>
>>>>> Thanks,
>>>>> -Sundar

From henry.jen at oracle.com Fri May 3 16:43:47 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Fri, 3 May 2019 09:43:47 -0700
Subject: RFR: Improve missing symbols handling
Message-ID: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com>

Hi,

Please review a webrev[1] that adds the missing --missing-symbols warn support, and turns on symbol checking against the default library by default.

This is kind of a follow-up to JDK-8223247, as that simply assumes we are using the default libraries within the JVM if no -l option is provided. This webrev now will

- Same behavior as before if both -l and -L are provided.
- Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library.
- Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but shows warnings to inform the user about potential missing libraries.

To be 100% compatible with before, use '--missing-symbols ignore'

Thoughts?

Cheers,
Henry

[1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/

From maurizio.cimadamore at oracle.com Fri May 3 16:46:32 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 17:46:32 +0100
Subject: [foreign] RFR 8222765: Implement foreign memory access through VarHandle
In-Reply-To: <2152fa1c-4f0d-5bb2-b275-afa94ace5c0b@oracle.com>
References: <4c9b23b5-efbd-72a7-8df8-f767f6bf5b9c@oracle.com>
 <129FF4F0-CD00-4D2A-B8B1-4951AD359F70@oracle.com>
 <5cb7699c-7fcf-e6c1-a836-92bad98e43e9@oracle.com>
 <7d5a4e02cea9f2067ca03ff185b6aff8@xs4all.nl>
 <1f8028e3-86b7-3c59-2dcb-a13153af9cf8@oracle.com>
 <508c9eb8-70da-f360-fa07-db3a6f2a72e0@oracle.com>
 <19794f1765ba5fce707754a621b3c26a@xs4all.nl>
 <2152fa1c-4f0d-5bb2-b275-afa94ace5c0b@oracle.com>
Message-ID:

Here's a new revision based on some comments from Jorn:

* revised VarHandle templating logic - now there's a single VarHandleMemoryAddressBase declared in j.l.i package
* added more carrier checks to VarHandles::makeMemoryAddressViewHandle (and test)
* made layout lookup logic uniform between values and group (before, group lookup would not check if the group itself matched the condition)

http://cr.openjdk.java.net/~mcimadamore/panama/8222765_v3/webrev/

Maurizio

On 19/04/2019 21:48, Maurizio Cimadamore wrote:
> Fixed the template:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8222765_v2/
>
> I'd prefer to address the API issues in follow up issues, as we keep
> going through the API (also with Brian and others).
>
> Cheers
> Maurizio
>
> On 19/04/2019 20:07, Jorn Vernee wrote:
>> AFAIK it ensures that there is a unique name for the variables used
>> in the macro.
>>
>> e.g. if t = Byte then $1_Type expands to
>> VAR_HANDLE_BYTE_ARRAY_Byte_Type, as variable name. But, this
>> currently conflicts with the GenerateVarHandleByteArray macro for
>> some types.
>>
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-04-19 19:55:
>>> After a closer look, it seems like the first parameter to the macro is
>>> not used. The macro GenerateVarHandleXYZ does, as its first thing:
>>>
>>> $1_Type := $2
>>>
>>> And then keeps using $1_Type after that.
>>>
>>> Maybe the $1 parameter is some remains of older code.
>>>
>>> Maurizio
>>> On 19/04/2019 17:42, Maurizio Cimadamore wrote:
>>>
>>>> On 19/04/2019 16:46, Jorn Vernee wrote:
>>>>
>>>>> - In GensrcVarHandles.gmk you seem to have a mistake in the
>>>>> generation command:
>>>>>
>>>>> 274 # List the types to generate source for, with capitalized
>>>>> first letter
>>>>> 275 VARHANDLES_MEMORY_ADDRESS_TYPES := Byte Short Char Int Long
>>>>> Float Double
>>>>> 276 $(foreach t, $(VARHANDLES_MEMORY_ADDRESS_TYPES), \
>>>>> 277   $(eval $(call
>>>>> GenerateVarHandleMemoryAddress,VAR_HANDLE_BYTE_ARRAY_$t,$t)))
>>>>>
>>>>> You still have to change the name VAR_HANDLE_BYTE_ARRAY_ to
>>>>> something else (probably forgotten when copy pasting?)
>>>> Good point, I wonder why it still works? I'll take another look.
>>>> The right classes are definitively generated.

From maurizio.cimadamore at oracle.com Fri May 3 17:00:31 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 18:00:31 +0100
Subject: RFR: Improve missing symbols handling
In-Reply-To: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com>
References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com>
Message-ID: <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com>

Seems like an useful follow up. Just to make sure I understand, if -l is specified, we get same behavior as before. If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right?

Maurizio

On 03/05/2019 17:43, Henry Jen wrote:
> Hi,
>
> Please review a webrev[1] that add the missing --missing-symbols warn support, and turn on symbol checking against default library by default.
>
> This is kind of a follow up to JDK-8223247, as that simply assume we are using the default libraries within JVM if no -l option is provided. This webrev now will
>
> - Same behavior as before if both -l and -L are provided.
> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library.
> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but show warnings to inform user about potential missing libraries.
>
> To be 100% compatible with before, use '--missing-symbols ignore'
>
> Thoughts?
>
> Cheers,
> Henry
>
> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/

From henry.jen at oracle.com Fri May 3 17:02:16 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Fri, 3 May 2019 10:02:16 -0700
Subject: RFR: Improve missing symbols handling
In-Reply-To: <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com>
References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com>
Message-ID: <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com>

Correct.

Cheers,
Henry

> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote:
>
> Seems like an useful follow up. Just to make sure I understand, if -l is specified, we get same behavior as before. If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right?
>
> Maurizio
>
>
>
>
>
> On 03/05/2019 17:43, Henry Jen wrote:
>> Hi,
>>
>> Please review a webrev[1] that add the missing --missing-symbols warn support, and turn on symbol checking against default library by default.
>>
>> This is kind of a follow up to JDK-8223247, as that simply assume we are using the default libraries within JVM if no -l option is provided. This webrev now will
>>
>> - Same behavior as before if both -l and -L are provided.
>> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library.
>> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but show warnings to inform user about potential missing libraries.
>>
>> To be 100% compatible with before, use '--missing-symbols ignore'
>>
>> Thoughts?
>>
>> Cheers,
>> Henry
>>
>> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/

From henry.jen at oracle.com Fri May 3 17:12:28 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Fri, 3 May 2019 10:12:28 -0700
Subject: RFR: Improve missing symbols handling
In-Reply-To: <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com>
References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com> <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com>
Message-ID:

The current implementation behave correctly as expected, but reading LibraryLookupFilter feels wrong. It depends on the fact that linkCheckPaths is initialized to java.library.path when -l is specified but not -L in Main.java.

At least we can do is to add following change,

--- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
+++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
@@ -86,8 +86,9 @@
     }

     private void initSymChecker(List linkCheckPaths) {
-        if (!libraryNames.isEmpty() && !linkCheckPaths.isEmpty()) {
+        if (!libraryNames.isEmpty()) {
             try {
+                assert !linkCheckPaths.isEmpty();
                 Library[] libs = loadLibraries(MethodHandles.lookup(),
                     linkCheckPaths.toArray(new String[0]),
                     libraryNames.toArray(new String[0]));

Cheers,
Henry

> On May 3, 2019, at 10:02 AM, Henry Jen wrote:
>
> Correct.
>
> Cheers,
> Henry
>
>
>> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote:
>>
>> Seems like an useful follow up. Just to make sure I understand, if -l is specified, we get same behavior as before. If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right?
>>
>> Maurizio
>>
>>
>>
>>
>>
>> On 03/05/2019 17:43, Henry Jen wrote:
>>> Hi,
>>>
>>> Please review a webrev[1] that add the missing --missing-symbols warn support, and turn on symbol checking against default library by default.
>>>
>>> This is kind of a follow up to JDK-8223247, as that simply assume we are using the default libraries within JVM if no -l option is provided. This webrev now will
>>>
>>> - Same behavior as before if both -l and -L are provided.
>>> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library.
>>> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but show warnings to inform user about potential missing libraries.
>>>
>>> To be 100% compatible with before, use '--missing-symbols ignore'
>>>
>>> Thoughts?
>>>
>>> Cheers,
>>> Henry
>>>
>>> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/
>

From maurizio.cimadamore at oracle.com Fri May 3 20:34:31 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 3 May 2019 21:34:31 +0100
Subject: RFR: Improve missing symbols handling
In-Reply-To:
References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com> <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com>
Message-ID: <9f55011d-3de6-3c40-e786-bc2770bdbdef@oracle.com>

If I understand what you are saying, you just want to make explicit the fact that if -l has been set, _some_ library will always be set, either explicitly (via -L) or implicitly (inferred from java.library.path).

If so, this seems like a good change.

Maurizio

On 03/05/2019 18:12, Henry Jen wrote:
> The current implementation behave correctly as expected, but reading LibraryLookupFilter feels wrong. It depends on the fact that linkCheckPaths is initialized to java.library.path when -l is specified but not -L in Main.java.
>
> At least we can do is to add following change,
>
> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
> @@ -86,8 +86,9 @@
>      }
>
>      private void initSymChecker(List linkCheckPaths) {
> -        if (!libraryNames.isEmpty() && !linkCheckPaths.isEmpty()) {
> +        if (!libraryNames.isEmpty()) {
>              try {
> +                assert !linkCheckPaths.isEmpty();
>                  Library[] libs = loadLibraries(MethodHandles.lookup(),
>                      linkCheckPaths.toArray(new String[0]),
>                      libraryNames.toArray(new String[0]));
>
> Cheers,
> Henry
>
>> On May 3, 2019, at 10:02 AM, Henry Jen wrote:
>>
>> Correct.
>>
>> Cheers,
>> Henry
>>
>>
>>> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote:
>>>
>>> Seems like an useful follow up. Just to make sure I understand, if -l is specified, we get same behavior as before.
If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right?
>>>
>>> Maurizio
>>>
>>>
>>>
>>>
>>>
>>> On 03/05/2019 17:43, Henry Jen wrote:
>>>> Hi,
>>>>
>>>> Please review a webrev[1] that add the missing --missing-symbols warn support, and turn on symbol checking against default library by default.
>>>>
>>>> This is kind of a follow up to JDK-8223247, as that simply assume we are using the default libraries within JVM if no -l option is provided. This webrev now will
>>>>
>>>> - Same behavior as before if both -l and -L are provided.
>>>> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library.
>>>> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but show warnings to inform user about potential missing libraries.
>>>>
>>>> To be 100% compatible with before, use '--missing-symbols ignore'
>>>>
>>>> Thoughts?
>>>>
>>>> Cheers,
>>>> Henry
>>>>
>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/

From shravya.rukmannagari at intel.com Fri May 3 21:13:53 2019
From: shravya.rukmannagari at intel.com (shravya.rukmannagari at intel.com)
Date: Fri, 03 May 2019 21:13:53 +0000
Subject: hg: panama/dev: JDK-8223330: Test jdk/incubator/vector/VectorReshapeTests.java failed on fastdebug build
Message-ID: <201905032113.x43LDrZL009708@aojmv0008.oracle.com>

Changeset: 604b20b0ca4a
Author: srukmannagar
Date: 2019-05-03 05:47 -0700
URL: http://hg.openjdk.java.net/panama/dev/rev/604b20b0ca4a

JDK-8223330: Test jdk/incubator/vector/VectorReshapeTests.java failed on fastdebug build
Summary: Fix for VectorReshapeTests in VectorAPI

! changeset.log
! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java
!
src/jdk.incubator.vector/share/classes/jdk/incubator/vector/DoubleVector.java
! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/FloatVector.java
! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java
! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/LongVector.java
! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ShortVector.java
! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X-Vector.java.template

From maurizio.cimadamore at oracle.com Fri May 3 21:19:34 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Fri, 03 May 2019 21:19:34 +0000
Subject: hg: panama/dev: Automatic merge with vectorIntrinsics
Message-ID: <201905032119.x43LJZ7f011774@aojmv0008.oracle.com>

Changeset: 71c80ee9f770
Author: mcimadamore
Date: 2019-05-03 23:19 +0200
URL: http://hg.openjdk.java.net/panama/dev/rev/71c80ee9f770

Automatic merge with vectorIntrinsics

From john.r.rose at oracle.com Sat May 4 01:26:05 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 3 May 2019 18:26:05 -0700
Subject: how to spin a Vector API java doc?
Message-ID:

Hey Kishor, what command are you using to build the sample javadocs?

This is my best guess so far:

$ make docs-jdk-api-javadoc-only JDK_MODULES=jdk.incubator.vector

It fails but maybe because I don't have a clean build.

-- John

From kishor.kharbas at intel.com Sat May 4 01:32:58 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Sat, 4 May 2019 01:32:58 +0000
Subject: how to spin a Vector API java doc?
In-Reply-To:
References:
Message-ID:

Hi John,

I use 'make docs-jdk-api'.

I also replaced Javadoc option '--override-methods=summary' with '--override-methods=detail'. Without it overriding methods with covariant return type were not listed in the derived class.
Maybe I am facing this bug - https://bugs.openjdk.java.net/browse/JDK-8219147

Thanks,
Kishor

> -----Original Message-----
> From: John Rose [mailto:john.r.rose at oracle.com]
> Sent: Friday, May 3, 2019 6:26 PM
> To: Kharbas, Kishor
> Cc: panama-dev at openjdk.java.net
> Subject: how to spin a Vector API java doc?
>
> Hey Kishor, what command are you using to build the sample javadocs?
>
> This is my best guess so far:
>
> $ make docs-jdk-api-javadoc-only JDK_MODULES=jdk.incubator.vector
>
> It fails but maybe because I don't have a clean build.
>
> -- John

From kishor.kharbas at intel.com Sat May 4 01:34:48 2019
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Sat, 4 May 2019 01:34:48 +0000
Subject: how to spin a Vector API java doc?
References:
Message-ID:

I replaced the option in latest Javadoc which is at http://cr.openjdk.java.net/~kkharbas/vector-api/javadoc/VectorApi-JavaDoc.06/jdk.incubator.vector/module-summary.html

Just FYI : I am just finishing up creating CSR.

> -----Original Message-----
> From: Kharbas, Kishor
> Sent: Friday, May 3, 2019 6:33 PM
> To: 'John Rose'
> Cc: panama-dev at openjdk.java.net; Kharbas, Kishor
> Subject: RE: how to spin a Vector API java doc?
>
> Hi John,
>
> I use 'make docs-jdk-api'.
>
> I also replaced Javadoc option '--override-methods=summary' with
> '--override-methods=detail'. Without it overriding methods with covariant
> return type were not listed in the derived class.
> Maybe I am facing this bug - https://bugs.openjdk.java.net/browse/JDK-8219147
>
> Thanks,
> Kishor
>
> > -----Original Message-----
> > From: John Rose [mailto:john.r.rose at oracle.com]
> > Sent: Friday, May 3, 2019 6:26 PM
> > To: Kharbas, Kishor
> > Cc: panama-dev at openjdk.java.net
> > Subject: how to spin a Vector API java doc?
> >
> > Hey Kishor, what command are you using to build the sample javadocs?
> >
> > This is my best guess so far:
> >
> > $ make docs-jdk-api-javadoc-only JDK_MODULES=jdk.incubator.vector
> >
> > It fails but maybe because I don't have a clean build.
> >
> > -- John

From henry.jen at oracle.com Mon May 6 21:15:27 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Mon, 6 May 2019 14:15:27 -0700
Subject: [foreign] RFR 8223413: Improve missing symbols handling in jextract
In-Reply-To: <9f55011d-3de6-3c40-e786-bc2770bdbdef@oracle.com>
References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com> <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com> <9f55011d-3de6-3c40-e786-bc2770bdbdef@oracle.com>
Message-ID:

Bug[1] created, and official webrev[2] review request.

[1] https://bugs.openjdk.java.net/browse/JDK-8223413
[2] http://cr.openjdk.java.net/~henryjen/panama/8223413/0/webrev/

Cheers,
Henry

> On May 3, 2019, at 1:34 PM, Maurizio Cimadamore wrote:
>
> If I understand what you are saying, you just want to make explicit the fact that if -l has been set, _some_ library will always be set, either explicitly (via -L) or implicitly (inferred from java.library.path).
>
> If so, this seems like a good change.
>
> Maurizio
>
> On 03/05/2019 18:12, Henry Jen wrote:
>> The current implementation behave correctly as expected, but reading LibraryLookupFilter feels wrong. It depends on the fact that linkCheckPaths is initialized to java.library.path when -l is specified but not -L in Main.java.
>>
>> At least we can do is to add following change,
>>
>> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
>> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
>> @@ -86,8 +86,9 @@
>>      }
>>
>>      private void initSymChecker(List linkCheckPaths) {
>> -        if (!libraryNames.isEmpty() && !linkCheckPaths.isEmpty()) {
>> +        if (!libraryNames.isEmpty()) {
>>              try {
>> +                assert !linkCheckPaths.isEmpty();
>>                  Library[] libs = loadLibraries(MethodHandles.lookup(),
>>                      linkCheckPaths.toArray(new String[0]),
>>                      libraryNames.toArray(new String[0]));
>>
>> Cheers,
>> Henry
>>
>>> On May 3, 2019, at 10:02 AM, Henry Jen wrote:
>>>
>>> Correct.
>>>
>>> Cheers,
>>> Henry
>>>
>>>
>>>> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote:
>>>>
>>>> Seems like an useful follow up. Just to make sure I understand, if -l is specified, we get same behavior as before. If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right?
>>>>
>>>> Maurizio
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 03/05/2019 17:43, Henry Jen wrote:
>>>>> Hi,
>>>>>
>>>>> Please review a webrev[1] that add the missing --missing-symbols warn support, and turn on symbol checking against default library by default.
>>>>>
>>>>> This is kind of a follow up to JDK-8223247, as that simply assume we are using the default libraries within JVM if no -l option is provided. This webrev now will
>>>>>
>>>>> - Same behavior as before if both -l and -L are provided.
>>>>> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library.
>>>>> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but show warnings to inform user about potential missing libraries.
>>>>>
>>>>> To be 100% compatible with before, use '--missing-symbols ignore'
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/

From maurizio.cimadamore at oracle.com Tue May 7 09:59:27 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Tue, 07 May 2019 09:59:27 +0000
Subject: hg: panama/dev: 8222765: Implement foreign memory access through VarHandle
Message-ID: <201905070959.x479xRrZ002370@aojmv0008.oracle.com>

Changeset: 9b19c6615cb2
Author: mcimadamore
Date: 2019-05-07 10:58 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/9b19c6615cb2

8222765: Implement foreign memory access through VarHandle

! make/gensrc/GensrcVarHandles.gmk
! src/hotspot/share/ci/ciField.cpp
+ src/java.base/share/classes/java/foreign/layout/AbstractDescriptor.java
+ src/java.base/share/classes/java/foreign/layout/Address.java
+ src/java.base/share/classes/java/foreign/layout/Descriptor.java
+ src/java.base/share/classes/java/foreign/layout/Function.java
+ src/java.base/share/classes/java/foreign/layout/Group.java
+ src/java.base/share/classes/java/foreign/layout/Layout.java
+ src/java.base/share/classes/java/foreign/layout/LayoutPath.java
+ src/java.base/share/classes/java/foreign/layout/Padding.java
+ src/java.base/share/classes/java/foreign/layout/Sequence.java
+ src/java.base/share/classes/java/foreign/layout/Unresolved.java
+ src/java.base/share/classes/java/foreign/layout/Value.java
+ src/java.base/share/classes/java/foreign/memory/MemoryAddress.java
+ src/java.base/share/classes/java/foreign/memory/MemoryScope.java
+ src/java.base/share/classes/java/lang/invoke/AddressVarHandleGenerator.java
! src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
+ src/java.base/share/classes/java/lang/invoke/VarHandleMemoryAddressBase.java
!
src/java.base/share/classes/java/lang/invoke/VarHandles.java + src/java.base/share/classes/java/lang/invoke/X-VarHandleMemoryAddressView.java.template ! src/java.base/share/classes/java/nio/Buffer.java ! src/java.base/share/classes/jdk/internal/access/JavaLangInvokeAccess.java ! src/java.base/share/classes/jdk/internal/access/JavaNioAccess.java + src/java.base/share/classes/jdk/internal/foreign/LayoutPathsImpl.java + src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java + src/java.base/share/classes/jdk/internal/foreign/MemoryBoundInfo.java + src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java ! src/java.base/share/classes/module-info.java ! test/jdk/TEST.groups + test/jdk/java/foreign/TestMemoryAccess.java From maurizio.cimadamore at oracle.com Tue May 7 10:00:04 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 7 May 2019 11:00:04 +0100 Subject: [foreign] RFR 8222765: Implement foreign memory access through VarHandle In-Reply-To: References: <4c9b23b5-efbd-72a7-8df8-f767f6bf5b9c@oracle.com> <129FF4F0-CD00-4D2A-B8B1-4951AD359F70@oracle.com> <5cb7699c-7fcf-e6c1-a836-92bad98e43e9@oracle.com> <7d5a4e02cea9f2067ca03ff185b6aff8@xs4all.nl> <1f8028e3-86b7-3c59-2dcb-a13153af9cf8@oracle.com> <508c9eb8-70da-f360-fa07-db3a6f2a72e0@oracle.com> <19794f1765ba5fce707754a621b3c26a@xs4all.nl> <2152fa1c-4f0d-5bb2-b275-afa94ace5c0b@oracle.com> Message-ID: <662c6726-e7e5-efc6-4ba8-36257bbec7c5@oracle.com> Just pushed this, under the branch name 'foreign-memaccess'. 
Maurizio

On 03/05/2019 17:46, Maurizio Cimadamore wrote:
> Here's a new revision based on some comments from Jorn:
>
> * revised VarHandle templating logic - now there's a single
> VarHandleMemoryAddressBase declared in j.l.i package
>
> * added more carrier checks to VarHandles::makeMemoryAddressViewHandle
> (and test)
>
> * made layout lookup logic uniform between values and group (before,
> group lookup would not check if the group itself matched the condition)
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8222765_v3/webrev/
>
> Maurizio
>
>
> On 19/04/2019 21:48, Maurizio Cimadamore wrote:
>> Fixed the template:
>>
>> http://cr.openjdk.java.net/~mcimadamore/panama/8222765_v2/
>>
>> I'd prefer to address the API issues in follow up issues, as we keep
>> going through the API (also with Brian and others).
>>
>> Cheers
>> Maurizio
>>
>> On 19/04/2019 20:07, Jorn Vernee wrote:
>>> AFAIK it ensures that there is a unique name for the variables used
>>> in the macro.
>>>
>>> e.g. if t = Byte then $1_Type expands to
>>> VAR_HANDLE_BYTE_ARRAY_Byte_Type, as variable name. But, this
>>> currently conflicts with the GenerateVarHandleByteArray macro for
>>> some types.
>>>
>>> Jorn
>>>
>>> Maurizio Cimadamore wrote on 2019-04-19 19:55:
>>>> After a closer look, it seems like the first parameter to the macro is
>>>> not used. The macro GenerateVarHandleXYZ does, as its first thing:
>>>>
>>>> $1_Type := $2
>>>>
>>>> And then keeps using $1_Type after that.
>>>>
>>>> Maybe the $1 parameter is some remains of older code.
>>>>
>>>> Maurizio
>>>> On 19/04/2019 17:42, Maurizio Cimadamore wrote:
>>>>
>>>>> On 19/04/2019 16:46, Jorn Vernee wrote:
>>>>>
>>>>>> - In GensrcVarHandles.gmk you seem to have a mistake in the
>>>>>> generation command:
>>>>>>
>>>>>> 274 # List the types to generate source for, with capitalized
>>>>>> first letter
>>>>>> 275 VARHANDLES_MEMORY_ADDRESS_TYPES := Byte Short Char Int Long
>>>>>> Float Double
>>>>>> 276 $(foreach t, $(VARHANDLES_MEMORY_ADDRESS_TYPES), \
>>>>>> 277   $(eval $(call
>>>>>> GenerateVarHandleMemoryAddress,VAR_HANDLE_BYTE_ARRAY_$t,$t)))
>>>>>>
>>>>>> You still have to change the name VAR_HANDLE_BYTE_ARRAY_ to
>>>>>> something else (probably forgotten when copy pasting?)
>>>>> Good point, I wonder why it still works? I'll take another look.
>>>>> The right classes are definitively generated.

From maurizio.cimadamore at oracle.com Tue May 7 11:50:37 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Tue, 07 May 2019 11:50:37 +0000
Subject: hg: panama/dev: Fix merge issues
Message-ID: <201905071150.x47Boc4M010299@aojmv0008.oracle.com>

Changeset: 1820595af53a
Author: mcimadamore
Date: 2019-05-07 12:50 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/1820595af53a

Fix merge issues

! src/hotspot/cpu/x86/directUpcallHandler_x86.cpp
! src/java.base/share/classes/jdk/internal/foreign/abi/LinkToNativeInvoker.java

From maurizio.cimadamore at oracle.com Tue May 7 12:06:54 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Tue, 07 May 2019 12:06:54 +0000
Subject: hg: panama/dev: manual merge with default
Message-ID: <201905071206.x47C6tj6019919@aojmv0008.oracle.com>

Changeset: 2baba58867ec
Author: mcimadamore
Date: 2019-05-07 13:06 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/2baba58867ec

manual merge with default

! make/CompileJavaModules.gmk
! src/hotspot/share/classfile/javaClasses.cpp
! src/hotspot/share/classfile/javaClasses.hpp
!
src/hotspot/share/classfile/systemDictionary.cpp ! src/hotspot/share/classfile/systemDictionary.hpp ! src/hotspot/share/classfile/vmSymbols.hpp ! src/hotspot/share/code/codeCache.hpp ! src/hotspot/share/oops/method.cpp ! src/hotspot/share/opto/callnode.cpp ! src/hotspot/share/prims/methodHandles.cpp ! src/hotspot/share/prims/unsafe.cpp ! src/java.base/share/classes/jdk/internal/misc/Unsafe.java - src/jdk.accessibility/windows/native/common/AccessBridgeStatusWindow.RC - src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/Pool.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.common/src/org/graalvm/compiler/core/common/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.common/src/org/graalvm/compiler/core/common/util/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.graph/src/org/graalvm/compiler/graph/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/NodeCostDumpUtil.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.replacements/src/org/graalvm/compiler/replacements/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.word/src/org/graalvm/compiler/word/UnsafeAccess.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/AbstractModuleIndexWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/AbstractPackageIndexWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/AllClassesFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/FrameOutputWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/ModuleFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/ModuleIndexFrameWriter.java - 
src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/ModulePackageIndexFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/PackageFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/PackageIndexFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/external/jquery/jquery.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_55_fbf9ee_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_65_dadada_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_75_dadada_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_75_e6e6e6_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_95_fef1ec_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_highlight-soft_75_cccccc_1x100.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_222222_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_2e83ff_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_454545_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_888888_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_cd0a0a_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-3.3.1.js - 
src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-migrate-3.0.1.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.min.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.min.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.structure.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.structure.min.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils-ie.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils-ie.min.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils.min.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip/dist/jszip.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip/dist/jszip.min.js - test/hotspot/jtreg/applications/ctw/modules/jdk_incubator_httpclient.java - test/hotspot/jtreg/applications/ctw/modules/jdk_packager.java - test/hotspot/jtreg/applications/ctw/modules/jdk_packager_services.java - test/hotspot/jtreg/runtime/ErrorHandling/ExplicitArithmeticCheck.java - test/hotspot/jtreg/runtime/Thread/MonitorCacheMaybeExpand_DeadLock.java - test/hotspot/jtreg/runtime/containers/cgroup/PlainRead.java - test/hotspot/jtreg/runtime/containers/docker/AttemptOOM.java - 
test/hotspot/jtreg/runtime/containers/docker/CheckContainerized.java - test/hotspot/jtreg/runtime/containers/docker/DockerBasicTest.java - test/hotspot/jtreg/runtime/containers/docker/HelloDocker.java - test/hotspot/jtreg/runtime/containers/docker/JfrReporter.java - test/hotspot/jtreg/runtime/containers/docker/PrintContainerInfo.java - test/hotspot/jtreg/runtime/containers/docker/TEST.properties - test/hotspot/jtreg/runtime/containers/docker/TestCPUAwareness.java - test/hotspot/jtreg/runtime/containers/docker/TestCPUSets.java - test/hotspot/jtreg/runtime/containers/docker/TestJFREvents.java - test/hotspot/jtreg/runtime/containers/docker/TestMemoryAwareness.java - test/hotspot/jtreg/runtime/containers/docker/TestMisc.java - test/hotspot/jtreg/runtime/interpreter/WideStrictInline.java - test/hotspot/jtreg/runtime/noClassDefFoundMsg/NoClassDefFoundMsg.java - test/hotspot/jtreg/runtime/noClassDefFoundMsg/libNoClassDefFoundMsg.c - test/hotspot/jtreg/serviceability/dcmd/framework/TestJavaProcess.java - test/jdk/sun/security/ssl/rsa/BrokenRSAPrivateCrtKey.java - test/jdk/sun/security/tools/jarsigner/AlgOptions.sh - test/jdk/sun/security/tools/jarsigner/PercentSign.sh - test/jdk/sun/security/tools/jarsigner/certpolicy.sh - test/jdk/sun/security/tools/jarsigner/checkusage.sh - test/jdk/sun/security/tools/jarsigner/collator.sh - test/jdk/sun/security/tools/jarsigner/concise_jarsigner.sh - test/jdk/sun/security/tools/jarsigner/crl.sh - test/jdk/sun/security/tools/jarsigner/default_options.sh - test/jdk/sun/security/tools/jarsigner/diffend.sh - test/jdk/sun/security/tools/jarsigner/ec.sh - test/jdk/sun/security/tools/jarsigner/emptymanifest.sh - test/jdk/sun/security/tools/jarsigner/jvindex.sh - test/jdk/sun/security/tools/jarsigner/nameclash.sh - test/jdk/sun/security/tools/jarsigner/newsize7.sh - test/jdk/sun/security/tools/jarsigner/oldsig.sh - test/jdk/sun/security/tools/jarsigner/onlymanifest.sh - test/jdk/sun/security/tools/jarsigner/passtype.sh - 
test/jdk/sun/security/tools/jarsigner/samename.sh - test/jdk/sun/security/tools/jarsigner/weaksize.sh - test/jdk/sun/security/tools/keytool/CloneKeyAskPassword.sh - test/jdk/sun/security/tools/keytool/NoExtNPE.sh - test/jdk/sun/security/tools/keytool/SecretKeyKS.sh - test/jdk/sun/security/tools/keytool/StandardAlgName.sh - test/jdk/sun/security/tools/keytool/StorePasswordsByShell.sh - test/jdk/sun/security/tools/keytool/default_options.sh - test/jdk/sun/security/tools/keytool/emptysubject.sh - test/jdk/sun/security/tools/keytool/file-in-help.sh - test/jdk/sun/security/tools/keytool/i18n.sh - test/jdk/sun/security/tools/keytool/importreadall.sh - test/jdk/sun/security/tools/keytool/keyalg.sh - test/jdk/sun/security/tools/keytool/newhelp.sh - test/jdk/sun/security/tools/keytool/resource.sh - test/jdk/sun/security/tools/keytool/selfissued.sh - test/jdk/sun/security/tools/keytool/trystore.sh - test/langtools/jdk/javadoc/doclet/AccessFrameTitle/AccessFrameTitle.java - test/langtools/jdk/javadoc/doclet/AccessFrameTitle/p1/C1.java - test/langtools/jdk/javadoc/doclet/AccessFrameTitle/p2/C2.java - test/langtools/jdk/javadoc/doclet/PackagesHeader/PackagesHeader.java - test/langtools/jdk/javadoc/doclet/PackagesHeader/p1/C1.java - test/langtools/jdk/javadoc/doclet/PackagesHeader/p2/C2.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/TestClassDocCatalog.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyAnnotation.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyClass.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyEnum.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyError.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyException.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyInterface.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyAnnotation.java - 
test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyClass.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyEnum.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyError.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyException.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyInterface.java - test/langtools/jdk/javadoc/doclet/testFramesNoFrames/TestFramesNoFrames.java - test/langtools/jdk/javadoc/doclet/testWindowTitle/TestWindowTitle.java - test/langtools/jdk/javadoc/doclet/testWindowTitle/p1/C1.java - test/langtools/jdk/javadoc/doclet/testWindowTitle/p2/C2.java From maurizio.cimadamore at oracle.com Tue May 7 12:09:51 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Tue, 07 May 2019 12:09:51 +0000 Subject: hg: panama/dev: Automatic merge with linkToNative Message-ID: <201905071209.x47C9paI023700@aojmv0008.oracle.com> Changeset: 3b58d486721a Author: mcimadamore Date: 2019-05-07 14:09 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/3b58d486721a Automatic merge with linkToNative ! make/CompileDemos.gmk ! make/CompileJavaModules.gmk ! make/RunTests.gmk ! make/autoconf/basics.m4 ! make/autoconf/spec.gmk.in ! make/conf/jib-profiles.js ! make/test/JtregNativeJdk.gmk ! src/hotspot/cpu/x86/frame_x86.cpp ! src/hotspot/share/opto/lcm.cpp ! src/hotspot/share/prims/nativeLookup.cpp ! src/hotspot/share/runtime/init.cpp ! src/hotspot/share/runtime/sharedRuntime.cpp ! 
src/hotspot/share/runtime/thread.hpp - src/jdk.accessibility/windows/native/common/AccessBridgeStatusWindow.RC - src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/Pool.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.common/src/org/graalvm/compiler/core/common/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.common/src/org/graalvm/compiler/core/common/util/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.graph/src/org/graalvm/compiler/graph/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/NodeCostDumpUtil.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.replacements/src/org/graalvm/compiler/replacements/UnsafeAccess.java - src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.word/src/org/graalvm/compiler/word/UnsafeAccess.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/AbstractModuleIndexWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/AbstractPackageIndexWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/AllClassesFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/FrameOutputWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/ModuleFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/ModuleIndexFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/ModulePackageIndexFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/PackageFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/PackageIndexFrameWriter.java - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/external/jquery/jquery.js - 
src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_55_fbf9ee_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_65_dadada_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_75_dadada_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_75_e6e6e6_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_glass_95_fef1ec_1x400.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-bg_highlight-soft_75_cccccc_1x100.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_222222_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_2e83ff_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_454545_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_888888_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/images/ui-icons_cd0a0a_256x240.png - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-3.3.1.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-migrate-3.0.1.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.min.css - 
src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.min.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.structure.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jquery-ui.structure.min.css - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils-ie.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils-ie.min.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip-utils/dist/jszip-utils.min.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip/dist/jszip.js - src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/jquery/jszip/dist/jszip.min.js - test/hotspot/jtreg/applications/ctw/modules/jdk_incubator_httpclient.java - test/hotspot/jtreg/applications/ctw/modules/jdk_packager.java - test/hotspot/jtreg/applications/ctw/modules/jdk_packager_services.java - test/hotspot/jtreg/runtime/ErrorHandling/ExplicitArithmeticCheck.java - test/hotspot/jtreg/runtime/Thread/MonitorCacheMaybeExpand_DeadLock.java - test/hotspot/jtreg/runtime/containers/cgroup/PlainRead.java - test/hotspot/jtreg/runtime/containers/docker/AttemptOOM.java - test/hotspot/jtreg/runtime/containers/docker/CheckContainerized.java - test/hotspot/jtreg/runtime/containers/docker/DockerBasicTest.java - test/hotspot/jtreg/runtime/containers/docker/HelloDocker.java - test/hotspot/jtreg/runtime/containers/docker/JfrReporter.java - test/hotspot/jtreg/runtime/containers/docker/PrintContainerInfo.java - test/hotspot/jtreg/runtime/containers/docker/TEST.properties - 
test/hotspot/jtreg/runtime/containers/docker/TestCPUAwareness.java - test/hotspot/jtreg/runtime/containers/docker/TestCPUSets.java - test/hotspot/jtreg/runtime/containers/docker/TestJFREvents.java - test/hotspot/jtreg/runtime/containers/docker/TestMemoryAwareness.java - test/hotspot/jtreg/runtime/containers/docker/TestMisc.java - test/hotspot/jtreg/runtime/interpreter/WideStrictInline.java - test/hotspot/jtreg/runtime/noClassDefFoundMsg/NoClassDefFoundMsg.java - test/hotspot/jtreg/runtime/noClassDefFoundMsg/libNoClassDefFoundMsg.c - test/hotspot/jtreg/serviceability/dcmd/framework/TestJavaProcess.java - test/jdk/sun/security/ssl/rsa/BrokenRSAPrivateCrtKey.java - test/jdk/sun/security/tools/jarsigner/AlgOptions.sh - test/jdk/sun/security/tools/jarsigner/PercentSign.sh - test/jdk/sun/security/tools/jarsigner/certpolicy.sh - test/jdk/sun/security/tools/jarsigner/checkusage.sh - test/jdk/sun/security/tools/jarsigner/collator.sh - test/jdk/sun/security/tools/jarsigner/concise_jarsigner.sh - test/jdk/sun/security/tools/jarsigner/crl.sh - test/jdk/sun/security/tools/jarsigner/default_options.sh - test/jdk/sun/security/tools/jarsigner/diffend.sh - test/jdk/sun/security/tools/jarsigner/ec.sh - test/jdk/sun/security/tools/jarsigner/emptymanifest.sh - test/jdk/sun/security/tools/jarsigner/jvindex.sh - test/jdk/sun/security/tools/jarsigner/nameclash.sh - test/jdk/sun/security/tools/jarsigner/newsize7.sh - test/jdk/sun/security/tools/jarsigner/oldsig.sh - test/jdk/sun/security/tools/jarsigner/onlymanifest.sh - test/jdk/sun/security/tools/jarsigner/passtype.sh - test/jdk/sun/security/tools/jarsigner/samename.sh - test/jdk/sun/security/tools/jarsigner/weaksize.sh - test/jdk/sun/security/tools/keytool/CloneKeyAskPassword.sh - test/jdk/sun/security/tools/keytool/NoExtNPE.sh - test/jdk/sun/security/tools/keytool/SecretKeyKS.sh - test/jdk/sun/security/tools/keytool/StandardAlgName.sh - test/jdk/sun/security/tools/keytool/StorePasswordsByShell.sh - 
test/jdk/sun/security/tools/keytool/default_options.sh - test/jdk/sun/security/tools/keytool/emptysubject.sh - test/jdk/sun/security/tools/keytool/file-in-help.sh - test/jdk/sun/security/tools/keytool/i18n.sh - test/jdk/sun/security/tools/keytool/importreadall.sh - test/jdk/sun/security/tools/keytool/keyalg.sh - test/jdk/sun/security/tools/keytool/newhelp.sh - test/jdk/sun/security/tools/keytool/resource.sh - test/jdk/sun/security/tools/keytool/selfissued.sh - test/jdk/sun/security/tools/keytool/trystore.sh - test/langtools/jdk/javadoc/doclet/AccessFrameTitle/AccessFrameTitle.java - test/langtools/jdk/javadoc/doclet/AccessFrameTitle/p1/C1.java - test/langtools/jdk/javadoc/doclet/AccessFrameTitle/p2/C2.java - test/langtools/jdk/javadoc/doclet/PackagesHeader/PackagesHeader.java - test/langtools/jdk/javadoc/doclet/PackagesHeader/p1/C1.java - test/langtools/jdk/javadoc/doclet/PackagesHeader/p2/C2.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/TestClassDocCatalog.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyAnnotation.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyClass.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyEnum.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyError.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyException.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg1/EmptyInterface.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyAnnotation.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyClass.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyEnum.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyError.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyException.java - test/langtools/jdk/javadoc/doclet/testClassDocCatalog/pkg2/EmptyInterface.java - 
test/langtools/jdk/javadoc/doclet/testFramesNoFrames/TestFramesNoFrames.java - test/langtools/jdk/javadoc/doclet/testWindowTitle/TestWindowTitle.java - test/langtools/jdk/javadoc/doclet/testWindowTitle/p1/C1.java - test/langtools/jdk/javadoc/doclet/testWindowTitle/p2/C2.java From maurizio.cimadamore at oracle.com Tue May 7 13:24:56 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Tue, 07 May 2019 13:24:56 +0000 Subject: hg: panama/dev: manual merge with foreign Message-ID: <201905071324.x47DOvbi011166@aojmv0008.oracle.com> Changeset: 58bf7ac38956 Author: mcimadamore Date: 2019-05-07 14:24 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/58bf7ac38956 manual merge with foreign ! make/CompileJavaModules.gmk ! src/hotspot/share/prims/nativeLookup.cpp ! src/java.base/share/classes/jdk/internal/foreign/abi/DirectSignatureShuffler.java ! src/java.base/share/classes/jdk/internal/foreign/abi/LinkToNativeSignatureShuffler.java - src/java.base/share/classes/jdk/internal/foreign/abi/ShuffleRecipeFieldHelper.java - src/java.base/share/classes/jdk/internal/foreign/abi/ShuffleRecipeOperationCollector.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalNativeInvoker.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalUpcallHandler.java ! src/java.base/share/classes/jdk/internal/foreign/abi/VarargsInvoker.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/CallingSequenceBuilder.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/SharedConstants.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/Constants.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/StandardCall.java ! 
src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/SysVx64ABI.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/UniversalNativeInvokerImpl.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/UniversalUpcallHandlerImpl.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/VarargsInvokerImpl.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/Constants.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/UniversalNativeInvokerImpl.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/UniversalUpcallHandlerImpl.java - src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/VarargsInvokerImpl.java ! src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/Windowsx64ABI.java - src/jdk.jextract/share/classes/com/sun/tools/jextract/AsmCodeFactory.java - src/jdk.jextract/share/classes/com/sun/tools/jextract/AsmCodeFactoryExt.java + test/jdk/java/foreign/abi/x64/sysv/CallingSequenceTest.java - test/jdk/java/foreign/abi/x64/sysv/StandardCallTest.java From maurizio.cimadamore at oracle.com Tue May 7 13:43:41 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Tue, 07 May 2019 13:43:41 +0000 Subject: hg: panama/dev: Fix wrong alignment used in LinkToNativeInvoker for returnInMemory calls Message-ID: <201905071343.x47Dhgda025336@aojmv0008.oracle.com> Changeset: d73cf75f728f Author: mcimadamore Date: 2019-05-07 14:43 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/d73cf75f728f Fix wrong alignment used in LinkToNativeInvoker for returnInMemory calls ! 
src/java.base/share/classes/jdk/internal/foreign/abi/LinkToNativeInvoker.java From jbvernee at xs4all.nl Tue May 7 17:55:36 2019 From: jbvernee at xs4all.nl (jbvernee at xs4all.nl) Date: Tue, 07 May 2019 17:55:36 +0000 Subject: hg: panama/dev: Summary: Add libclang version check and option to select version Message-ID: <201905071755.x47HtbQd026672@aojmv0008.oracle.com> Changeset: 819585c6b8e3 Author: jvernee Date: 2019-05-07 18:51 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/819585c6b8e3 Summary: Add libclang version check and option to select version Reviewed-by: mcimadamore ! make/autoconf/lib-clang.m4 From maurizio.cimadamore at oracle.com Tue May 7 17:59:26 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Tue, 07 May 2019 17:59:26 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905071759.x47HxReI029866@aojmv0008.oracle.com> Changeset: 6b82651e9873 Author: mcimadamore Date: 2019-05-07 19:59 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/6b82651e9873 Automatic merge with foreign From sundararajan.athijegannathan at oracle.com Wed May 8 09:58:36 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Wed, 08 May 2019 15:28:36 +0530 Subject: [foreign] RFR 8223413: Improve missing symbols handling in jextract In-Reply-To: References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com> <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com> <9f55011d-3de6-3c40-e786-bc2770bdbdef@oracle.com> Message-ID: <5CD2A84C.1@oracle.com> Looks good. -Sundar On 07/05/19, 2:45 AM, Henry Jen wrote: > Bug[1] created, and official webrev[2] review request. 
> > [1] https://bugs.openjdk.java.net/browse/JDK-8223413 > [2] http://cr.openjdk.java.net/~henryjen/panama/8223413/0/webrev/ > > Cheers, > Henry > >> On May 3, 2019, at 1:34 PM, Maurizio Cimadamore wrote: >> >> If I understand what you are saying, you just want to make explicit the fact that if -l has been set, _some_ library will always be set, either explicitly (via -L) or implicitly (inferred from java.library.path). >> >> If so, this seems like a good change. >> >> Maurizio >> >> On 03/05/2019 18:12, Henry Jen wrote: >>> The current implementation behaves correctly as expected, but reading LibraryLookupFilter feels wrong. It depends on the fact that linkCheckPaths is initialized to java.library.path when -l is specified but not -L in Main.java. >>> >>> The least we can do is to add the following change, >>> >>> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java >>> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java >>> @@ -86,8 +86,9 @@ >>> } >>> >>> private void initSymChecker(List<String> linkCheckPaths) { >>> - if (!libraryNames.isEmpty() && !linkCheckPaths.isEmpty()) { >>> + if (!libraryNames.isEmpty()) { >>> try { >>> + assert !linkCheckPaths.isEmpty(); >>> Library[] libs = loadLibraries(MethodHandles.lookup(), >>> linkCheckPaths.toArray(new String[0]), >>> libraryNames.toArray(new String[0])); >>> >>> Cheers, >>> Henry >>> >>>> On May 3, 2019, at 10:02 AM, Henry Jen wrote: >>>> >>>> Correct. >>>> >>>> Cheers, >>>> Henry >>>> >>>> >>>>> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote: >>>>> >>>>> Seems like a useful follow-up. Just to make sure I understand, if -l is specified, we get the same behavior as before. If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right? 
>>>>> >>>>> Maurizio >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On 03/05/2019 17:43, Henry Jen wrote: >>>>>> Hi, >>>>>> >>>>>> Please review a webrev[1] that adds the missing --missing-symbols warn support, and turns on symbol checking against the default library by default. >>>>>> >>>>>> This is kind of a follow-up to JDK-8223247, as that simply assumes we are using the default libraries within the JVM if no -l option is provided. With this webrev: >>>>>> >>>>>> - Same behavior as before if both -l and -L are provided. >>>>>> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library. >>>>>> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but shows warnings to inform the user about potentially missing libraries. >>>>>> >>>>>> To be 100% compatible with before, use '--missing-symbols ignore' >>>>>> >>>>>> Thoughts? >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/ From sundararajan.athijegannathan at oracle.com Wed May 8 10:49:51 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Wed, 08 May 2019 16:19:51 +0530 Subject: [foreign] RFR 8223568: jextract should generate clang_support under user specified target package Message-ID: <5CD2B44F.608@oracle.com> Please review. Bug: https://bugs.openjdk.java.net/browse/JDK-8223568 Webrev: https://cr.openjdk.java.net/~sundar/8223568/webrev.00/ Thanks, -Sundar From maurizio.cimadamore at oracle.com Wed May 8 12:05:00 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 8 May 2019 13:05:00 +0100 Subject: [foreign] RFR 8223568: jextract should generate clang_support under user specified target package In-Reply-To: <5CD2B44F.608@oracle.com> References: <5CD2B44F.608@oracle.com> Message-ID: Looks good. 
A while ago we discussed _requiring_ a target package, explicitly to avoid these kinds of issues; maybe the time has come to do it? Maurizio On 08/05/2019 11:49, Sundararajan Athijegannathan wrote: > Please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223568 > Webrev: https://cr.openjdk.java.net/~sundar/8223568/webrev.00/ > > Thanks, > -Sundar From sundararajan.athijegannathan at oracle.com Wed May 8 13:06:05 2019 From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com) Date: Wed, 08 May 2019 13:06:05 +0000 Subject: hg: panama/dev: 8223568: jextract should generate clang_support under user specified target package Message-ID: <201905081306.x48D666r014647@aojmv0008.oracle.com> Changeset: 34449b3f3b3d Author: sundar Date: 2019-05-08 18:40 +0530 URL: http://hg.openjdk.java.net/panama/dev/rev/34449b3f3b3d 8223568: jextract should generate clang_support under user specified target package Reviewed-by: mcimadamore ! src/jdk.jextract/share/classes/com/sun/tools/jextract/HeaderResolver.java ! 
test/jdk/com/sun/tools/jextract/test8221154/SrcGenTest.java From maurizio.cimadamore at oracle.com Wed May 8 13:09:40 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 08 May 2019 13:09:40 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905081309.x48D9fkR018474@aojmv0008.oracle.com> Changeset: 148ddec1f3e6 Author: mcimadamore Date: 2019-05-08 15:09 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/148ddec1f3e6 Automatic merge with foreign From henry.jen at oracle.com Wed May 8 15:19:06 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Wed, 08 May 2019 15:19:06 +0000 Subject: hg: panama/dev: 8223413: Improve missing symbols handling in jextract Message-ID: <201905081519.x48FJ6It005650@aojmv0008.oracle.com> Changeset: 0de69b434c97 Author: henryjen Date: 2019-05-08 08:16 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/0de69b434c97 8223413: Improve missing symbols handling in jextract Reviewed-by: mcimadamore, sundar ! src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Main.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/MissingSymbolAction.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/resources/Messages.properties ! test/jdk/com/sun/tools/jextract/Runner.java ! test/jdk/com/sun/tools/jextract/TestForwardRef.java ! test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java ! 
test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java From henry.jen at oracle.com Wed May 8 15:21:08 2019 From: henry.jen at oracle.com (Henry Jen) Date: Wed, 8 May 2019 08:21:08 -0700 Subject: [foreign] RFR 8223413: Improve missing symbols handling in jextract In-Reply-To: <5CD2A84C.1@oracle.com> References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com> <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com> <9f55011d-3de6-3c40-e786-bc2770bdbdef@oracle.com> <5CD2A84C.1@oracle.com> Message-ID: Thanks, pushed. I also sneaked in a fix for an intermittent test failure for 8223105 on Windows. Cheers, Henry diff -r 34449b3f3b3d test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java --- a/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java Wed May 08 18:40:07 2019 +0530 +++ b/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java Wed May 08 08:18:36 2019 -0700 @@ -33,7 +33,7 @@ * @requires os.family == "windows" * @library .. * @run driver JtregJextract -C -DADD -t test.jextract.asmsymbol -- libAsmSymbol.h - * @run testng Test8223105WinA + * @run testng/othervm Test8223105WinA */ public class Test8223105WinA { static final libAsmSymbol_h libAsmSymbol; diff -r 34449b3f3b3d test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java --- a/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java Wed May 08 18:40:07 2019 +0530 +++ b/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java Wed May 08 08:18:36 2019 -0700 @@ -33,7 +33,7 @@ * @requires os.family == "windows" * @library .. * @run driver JtregJextract -t test.jextract.asmsymbol -- libAsmSymbol.h - * @run testng Test8223105WinB + * @run testng/othervm Test8223105WinB */ public class Test8223105WinB { static final libAsmSymbol_h libAsmSymbol; > On May 8, 2019, at 2:58 AM, Sundararajan Athijegannathan wrote: > > Looks good. 
> > -Sundar > > On 07/05/19, 2:45 AM, Henry Jen wrote: >> Bug[1] created, and official webrev[2] review request. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8223413 >> [2] http://cr.openjdk.java.net/~henryjen/panama/8223413/0/webrev/ >> >> Cheers, >> Henry >> >>> On May 3, 2019, at 1:34 PM, Maurizio Cimadamore wrote: >>> >>> If I understand what you are saying, you just want to make explicit the fact that if -l has been set, _some_ library will always be set, either explicitly (via -L) or implicitly (inferred from java.library.path). >>> >>> If so, this seems like a good change. >>> >>> Maurizio >>> >>> On 03/05/2019 18:12, Henry Jen wrote: >>>> The current implementation behaves correctly as expected, but reading LibraryLookupFilter feels wrong. It depends on the fact that linkCheckPaths is initialized to java.library.path when -l is specified but not -L in Main.java. >>>> >>>> The least we can do is to add the following change, >>>> >>>> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java >>>> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java >>>> @@ -86,8 +86,9 @@ >>>> } >>>> >>>> private void initSymChecker(List<String> linkCheckPaths) { >>>> - if (!libraryNames.isEmpty() && !linkCheckPaths.isEmpty()) { >>>> + if (!libraryNames.isEmpty()) { >>>> try { >>>> + assert !linkCheckPaths.isEmpty(); >>>> Library[] libs = loadLibraries(MethodHandles.lookup(), >>>> linkCheckPaths.toArray(new String[0]), >>>> libraryNames.toArray(new String[0])); >>>> >>>> Cheers, >>>> Henry >>>> >>>>> On May 3, 2019, at 10:02 AM, Henry Jen wrote: >>>>> >>>>> Correct. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> >>>>>> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote: >>>>>> >>>>>> Seems like a useful follow-up. Just to make sure I understand, if -l is specified, we get the same behavior as before. 
If -l is NOT specified, then the behavior would differ (as we now do the check), and therefore we need the 'ignore' option explicitly, if we want to suppress logging. Right? >>>>>> >>>>>> Maurizio >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 03/05/2019 17:43, Henry Jen wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please review a webrev[1] that adds the missing --missing-symbols warn support, and turns on symbol checking against the default library by default. >>>>>>> >>>>>>> This is kind of a follow-up to JDK-8223247, as that simply assumes we are using the default libraries within the JVM if no -l option is provided. With this webrev: >>>>>>> >>>>>>> - Same behavior as before if both -l and -L are provided. >>>>>>> - Symbol check is turned on always. If there is no -l provided, jextract will check symbols against the default library. >>>>>>> - Default is to issue warnings without -l, exclude with explicit -l. This is mostly backward compatible as it doesn't change generated code/classes, but shows warnings to inform the user about potentially missing libraries. >>>>>>> >>>>>>> To be 100% compatible with before, use '--missing-symbols ignore' >>>>>>> >>>>>>> Thoughts? 
>>>>>>> >>>>>>> Cheers, >>>>>>> Henry >>>>>>> >>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/ From maurizio.cimadamore at oracle.com Wed May 8 15:24:30 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 08 May 2019 15:24:30 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905081524.x48FOUUf011036@aojmv0008.oracle.com> Changeset: 84d7cfa7eb34 Author: mcimadamore Date: 2019-05-08 17:24 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/84d7cfa7eb34 Automatic merge with foreign From jbvernee at xs4all.nl Wed May 8 16:18:55 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 08 May 2019 18:18:55 +0200 Subject: [foreign] RFR 8223413: Improve missing symbols handling in jextract In-Reply-To: References: <6D347C29-4F55-4904-A44A-82DED73E4333@oracle.com> <9aaf53e8-5ab9-b527-832e-2aed27fabf48@oracle.com> <10E9FCC8-3923-4934-8A51-92F21CAF81CF@oracle.com> <9f55011d-3de6-3c40-e786-bc2770bdbdef@oracle.com> <5CD2A84C.1@oracle.com> Message-ID: <60820cad8cbfc878b6de627723501441@xs4all.nl> > I also sneaked in a fix for an intermittent test failure Thanks! Was running into this as well. Cheers, Jorn Henry Jen wrote on 2019-05-08 17:21: > Thanks, pushed. I also sneaked in a fix for an intermittent test failure > for 8223105 on Windows. > > Cheers, > Henry > > > diff -r 34449b3f3b3d > test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java > --- a/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java > Wed May 08 18:40:07 2019 +0530 > +++ b/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinA.java > Wed May 08 08:18:36 2019 -0700 > @@ -33,7 +33,7 @@ > * @requires os.family == "windows" > * @library .. 
> * @run driver JtregJextract -C -DADD -t test.jextract.asmsymbol -- > libAsmSymbol.h > - * @run testng Test8223105WinA > + * @run testng/othervm Test8223105WinA > */ > public class Test8223105WinA { > static final libAsmSymbol_h libAsmSymbol; > diff -r 34449b3f3b3d > test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java > --- a/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java > Wed May 08 18:40:07 2019 +0530 > +++ b/test/jdk/com/sun/tools/jextract/test8223105/Test8223105WinB.java > Wed May 08 08:18:36 2019 -0700 > @@ -33,7 +33,7 @@ > * @requires os.family == "windows" > * @library .. > * @run driver JtregJextract -t test.jextract.asmsymbol -- > libAsmSymbol.h > - * @run testng Test8223105WinB > + * @run testng/othervm Test8223105WinB > */ > public class Test8223105WinB { > static final libAsmSymbol_h libAsmSymbol; > > >> On May 8, 2019, at 2:58 AM, Sundararajan Athijegannathan >> wrote: >> >> Looks good. >> >> -Sundar >> >> On 07/05/19, 2:45 AM, Henry Jen wrote: >>> Bug[1] created, and official webrev[2] review request. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8223413 >>> [2] http://cr.openjdk.java.net/~henryjen/panama/8223413/0/webrev/ >>> >>> Cheers, >>> Henry >>> >>>> On May 3, 2019, at 1:34 PM, Maurizio >>>> Cimadamore wrote: >>>> >>>> If I understand what you are saying, you just want to make explicit >>>> the fact that if -l has been set, _some_ library will always be set, >>>> either explicitly (via -L) or implicitly (inferred from >>>> java.library.path). >>>> >>>> If so, this seems like a good change. >>>> >>>> Maurizio >>>> >>>> On 03/05/2019 18:12, Henry Jen wrote: >>>>> The current implementation behaves correctly as expected, but >>>>> reading LibraryLookupFilter feels wrong. It depends on the fact >>>>> that linkCheckPaths is initialized to java.library.path when -l is >>>>> specified but not -L in Main.java. 
>>>>>
>>>>> The least we can do is to add the following change,
>>>>>
>>>>> --- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
>>>>> +++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/LibraryLookupFilter.java
>>>>> @@ -86,8 +86,9 @@
>>>>>      }
>>>>>
>>>>>      private void initSymChecker(List linkCheckPaths) {
>>>>> -        if (!libraryNames.isEmpty() && !linkCheckPaths.isEmpty()) {
>>>>> +        if (!libraryNames.isEmpty()) {
>>>>>              try {
>>>>> +                assert !linkCheckPaths.isEmpty();
>>>>>                  Library[] libs = loadLibraries(MethodHandles.lookup(),
>>>>>                      linkCheckPaths.toArray(new String[0]),
>>>>>                      libraryNames.toArray(new String[0]));
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>>> On May 3, 2019, at 10:02 AM, Henry Jen wrote:
>>>>>>
>>>>>> Correct.
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>>
>>>>>>> On May 3, 2019, at 10:00 AM, Maurizio Cimadamore wrote:
>>>>>>>
>>>>>>> Seems like a useful follow up. Just to make sure I understand,
>>>>>>> if -l is specified, we get same behavior as before. If -l is NOT
>>>>>>> specified, then the behavior would differ (as we now do the
>>>>>>> check), and therefore we need the 'ignore' option explicitly, if
>>>>>>> we want to suppress logging. Right?
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>>
>>>>>>> On 03/05/2019 17:43, Henry Jen wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review a webrev[1] that adds the missing --missing-symbols
>>>>>>>> warn support, and turns on symbol checking against the default
>>>>>>>> library by default.
>>>>>>>>
>>>>>>>> This is kind of a follow up to JDK-8223247, as that simply
>>>>>>>> assumes we are using the default libraries within JVM if no -l
>>>>>>>> option is provided. This webrev now will
>>>>>>>>
>>>>>>>> - Same behavior as before if both -l and -L are provided.
>>>>>>>> - Symbol check is turned on always. If there is no -l provided,
>>>>>>>> jextract will check symbols against the default library.
>>>>>>>> - Default is to issue warnings without -l, exclude with explicit
>>>>>>>> -l. This is mostly backward compatible as it doesn't change
>>>>>>>> generated code/classes, but shows warnings to inform the user about
>>>>>>>> potential missing libraries.
>>>>>>>>
>>>>>>>> To be 100% compatible with before, use '--missing-symbols ignore'
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Henry
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> http://cr.openjdk.java.net/~henryjen/panama/missingSymbols/webrev/

From vivek.r.deshpande at intel.com Wed May 8 19:03:43 2019
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Wed, 8 May 2019 19:03:43 +0000
Subject: [vector] RFR 82221429: Add tests for XXXVector.xxxAll(Mask<>)
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F48846A@ORSMSX106.amr.corp.intel.com>

Hi all,

I have worked on adding missing tests in the jtreg test suite for vector APIs. In this patch I have added tests for masked reductions and masked min, max. I also uncovered a bug in the masked minLanes and maxLanes for Float/Double, so I added a fix for that.
I have created a webrev for these changes. Requesting a review.

Webrev - http://cr.openjdk.java.net/~vdeshpande/8221429/webrev.00/
Bug - https://bugs.openjdk.java.net/browse/JDK-8221429

Thanks,
Vivek

From henry.jen at oracle.com Thu May 9 06:11:53 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Wed, 8 May 2019 23:11:53 -0700
Subject: hg: panama/dev: Summary: Add libclang version check and option to select version
In-Reply-To: <201905071755.x47HtbQd026672@aojmv0008.oracle.com>
References: <201905071755.x47HtbQd026672@aojmv0008.oracle.com>
Message-ID:

This change forces me to use --with-libclang-version unless I am using 7; is this desired?

Normally, --with-libclang should be enough and there would be only one version as the bundle is one single version.

Anything else should be optional to accommodate special cases.

Cheers,
Henry

> On May 7, 2019, at 10:55 AM, jbvernee at xs4all.nl wrote:
>
> Changeset: 819585c6b8e3
> Author: jvernee
> Date: 2019-05-07 18:51 +0200
> URL: http://hg.openjdk.java.net/panama/dev/rev/819585c6b8e3
>
> Summary: Add libclang version check and option to select version
> Reviewed-by: mcimadamore
>
> ! make/autoconf/lib-clang.m4
>

From maurizio.cimadamore at oracle.com Thu May 9 09:51:39 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 9 May 2019 10:51:39 +0100
Subject: hg: panama/dev: Summary: Add libclang version check and option to select version
In-Reply-To:
References: <201905071755.x47HtbQd026672@aojmv0008.oracle.com>
Message-ID: <20a8e71c-758d-e8c4-16fc-74edcb677e8f@oracle.com>

I think that was the goal yes, to prevent you from using versions that are not working fully with jextract.

In reality I had no trouble with LLVM 4/5/6 - but the logic Jorn submitted is simpler, and it's clear what you have to do in order to get started. If you really want to override, you have the option to do so (at your own risk), which is also good.

Maurizio

On 09/05/2019 07:11, Henry Jen wrote:
> This change forces me to use --with-libclang-version unless I am using 7; is this desired?
>
> Normally, --with-libclang should be enough and there would be only one version as the bundle is one single version.
>
> Anything else should be optional to accommodate special cases.
>
> Cheers,
> Henry
>
>
>> On May 7, 2019, at 10:55 AM, jbvernee at xs4all.nl wrote:
>>
>> Changeset: 819585c6b8e3
>> Author: jvernee
>> Date: 2019-05-07 18:51 +0200
>> URL: http://hg.openjdk.java.net/panama/dev/rev/819585c6b8e3
>>
>> Summary: Add libclang version check and option to select version
>> Reviewed-by: mcimadamore
>>
>> ! make/autoconf/lib-clang.m4
>>

From jbvernee at xs4all.nl Thu May 9 09:58:02 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Thu, 09 May 2019 11:58:02 +0200
Subject: hg: panama/dev: Summary: Add libclang version check and option to select version
In-Reply-To:
References: <201905071755.x47HtbQd026672@aojmv0008.oracle.com>
Message-ID: <0bfffba59f3081b6f7d6e6bcdf2bc71e@xs4all.nl>

Yes this is desired. We don't support libclang 8 and 9 currently, but we know 7 works, so we set that as a default. It's a way of saying "please use 7, because we know it works". But there could be reasons why people want to use other versions, for instance to experiment, so there's still an option to change the version using --with-libclang-version.

> Anything else should be optional to accommodate special cases.

So yeah, anything but version 7 would be considered a special case at this point.

If you're using another version that also works, we could add that as a default supported version as well, e.g. for 6:

```
diff -r 819585c6b8e3 make/autoconf/lib-clang.m4
--- a/make/autoconf/lib-clang.m4 Tue May 07 18:51:44 2019 +0200
+++ b/make/autoconf/lib-clang.m4 Thu May 09 11:48:51 2019 +0200
@@ -61,7 +61,7 @@
     LIBCLANG_VERSION="$with_libclang_version"
     AC_MSG_RESULT([$LIBCLANG_VERSION (manually specified)])
   else
-    LIBCLANG_VERSION="7"
+    LIBCLANG_VERSION="[67]"
     AC_MSG_RESULT([$LIBCLANG_VERSION (default)])
   fi

```

Since we use grep to do the filtering.

Jorn

Henry Jen wrote on 2019-05-09 08:11:
> This change forces me to use --with-libclang-version unless I am using
> 7; is this desired?
>
> Normally, --with-libclang should be enough and there would be only one
> version as the bundle is one single version.
>
> Anything else should be optional to accommodate special cases.
> > Cheers, > Henry > > >> On May 7, 2019, at 10:55 AM, jbvernee at xs4all.nl wrote: >> >> Changeset: 819585c6b8e3 >> Author: jvernee >> Date: 2019-05-07 18:51 +0200 >> URL: http://hg.openjdk.java.net/panama/dev/rev/819585c6b8e3 >> >> Summary: Add libclang version check and option to select version >> Reviewed-by: mcimadamore >> >> ! make/autoconf/lib-clang.m4 >> From maurizio.cimadamore at oracle.com Thu May 9 12:39:34 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 9 May 2019 13:39:34 +0100 Subject: [foreign-memaccess] RFR: Flatten package hierarchy Message-ID: Given the small number of classes/interfaces involved in this API, I think it would preferrable to use a single package (java.foreign). This choice will also be resilient for when we do the later ABI layer, which only introduces one public SystemABI interface. For higher-level binder/jextract stuff, I think it makes sense to use a separate package (java.foreign.binder?) http://cr.openjdk.java.net/~mcimadamore/panama/8223614/ Maurizio From maurizio.cimadamore at oracle.com Thu May 9 14:36:06 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 9 May 2019 15:36:06 +0100 Subject: [foreign-memaccess] RFR 8223629: Remove Descriptor and Function Message-ID: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> Hi, this small patch is to remove Descriptor and Function which are unused in the memory access layer. Some simplifications to the Address layout class were needed to adjust for that. 
Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8223629/ Maurizio From jbvernee at xs4all.nl Thu May 9 14:59:49 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 09 May 2019 16:59:49 +0200 Subject: [foreign-memaccess] RFR 8223629: Remove Descriptor and Function In-Reply-To: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> References: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> Message-ID: <818ea44a729da9c91e084d953c84e0c3@xs4all.nl> This does not apply cleanly for me... I think the flattening of the packages was not picked up in the webrev? (I'd expect to see a bunch of re-names) Otherwise it looks good, but some of the javadoc you copied from Descriptor to Layout still mentions "descriptor", where I think it should say "layout" instead. e.g.: + * Does this descriptor contain unresolved layouts? + * @return the descriptor name (if any). + * Add annotation to descriptor. + * Attach name annotation to given descriptor. + * @return a new descriptor with desired name annotation. + * Strip all annotations from this (possibly annotated) descriptor. + * @return the unannotated descriptor. Jorn Maurizio Cimadamore schreef op 2019-05-09 16:36: > Hi, > this small patch is to remove Descriptor and Function which are unused > in the memory access layer. Some simplifications to the Address layout > class were needed to adjust for that. > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/panama/8223629/ > > Maurizio From jbvernee at xs4all.nl Thu May 9 15:02:35 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 09 May 2019 17:02:35 +0200 Subject: [foreign-memaccess] RFR 8223629: Remove Descriptor and Function In-Reply-To: <818ea44a729da9c91e084d953c84e0c3@xs4all.nl> References: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> <818ea44a729da9c91e084d953c84e0c3@xs4all.nl> Message-ID: > This does not apply cleanly for me... I think the flattening of the > packages was not picked up in the webrev? 
(I'd expect to see a bunch > of re-names) No, sorry, that's the other patch I was missing :) Jorn Jorn Vernee schreef op 2019-05-09 16:59: > This does not apply cleanly for me... I think the flattening of the > packages was not picked up in the webrev? (I'd expect to see a bunch > of re-names) > > Otherwise it looks good, but some of the javadoc you copied from > Descriptor to Layout still mentions "descriptor", where I think it > should say "layout" instead. e.g.: > > + * Does this descriptor contain unresolved layouts? > + * @return the descriptor name (if any). > + * Add annotation to descriptor. > + * Attach name annotation to given descriptor. > + * @return a new descriptor with desired name annotation. > + * Strip all annotations from this (possibly annotated) > descriptor. > + * @return the unannotated descriptor. > > Jorn > > Maurizio Cimadamore schreef op 2019-05-09 16:36: >> Hi, >> this small patch is to remove Descriptor and Function which are unused >> in the memory access layer. Some simplifications to the Address layout >> class were needed to adjust for that. >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/panama/8223629/ >> >> Maurizio From jbvernee at xs4all.nl Thu May 9 15:06:39 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 09 May 2019 17:06:39 +0200 Subject: [foreign-memaccess] RFR: Flatten package hierarchy In-Reply-To: References: Message-ID: <008e144cf9697094a5607cf1f0631e41@xs4all.nl> Looks good to me. Jorn Maurizio Cimadamore schreef op 2019-05-09 14:39: > Given the small number of classes/interfaces involved in this API, I > think it would preferrable to use a single package (java.foreign). > > This choice will also be resilient for when we do the later ABI layer, > which only introduces one public SystemABI interface. > > For higher-level binder/jextract stuff, I think it makes sense to use > a separate package (java.foreign.binder?) 
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8223614/
>
> Maurizio

From maurizio.cimadamore at oracle.com Thu May 9 15:31:41 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 9 May 2019 16:31:41 +0100
Subject: [foreign-memaccess] RFR 8223629: Remove Descriptor and Function
In-Reply-To:
References: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> <818ea44a729da9c91e084d953c84e0c3@xs4all.nl>
Message-ID:

Yep this is meant to go on top of the other patch. Sorry I should have said that loud.

I'll rectify the javadocs.

Thanks
Maurizio

On 09/05/2019 16:02, Jorn Vernee wrote:
>> This does not apply cleanly for me... I think the flattening of the
>> packages was not picked up in the webrev? (I'd expect to see a bunch
>> of re-names)
>
> No, sorry, that's the other patch I was missing :)
>
> Jorn
>
> Jorn Vernee wrote on 2019-05-09 16:59:
>> This does not apply cleanly for me... I think the flattening of the
>> packages was not picked up in the webrev? (I'd expect to see a bunch
>> of re-names)
>>
>> Otherwise it looks good, but some of the javadoc you copied from
>> Descriptor to Layout still mentions "descriptor", where I think it
>> should say "layout" instead. e.g.:
>>
>> +     * Does this descriptor contain unresolved layouts?
>> +     * @return the descriptor name (if any).
>> +     * Add annotation to descriptor.
>> +     * Attach name annotation to given descriptor.
>> +     * @return a new descriptor with desired name annotation.
>> +     * Strip all annotations from this (possibly annotated) descriptor.
>> +     * @return the unannotated descriptor.
>>
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-05-09 16:36:
>>> Hi,
>>> this small patch is to remove Descriptor and Function which are unused
>>> in the memory access layer. Some simplifications to the Address layout
>>> class were needed to adjust for that.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~mcimadamore/panama/8223629/
>>>
>>> Maurizio

From henry.jen at oracle.com Thu May 9 18:41:16 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Thu, 9 May 2019 11:41:16 -0700
Subject: [foreign] RFR: 8223641: Run jextract with --log FINE cause UnsupportedOperationException
Message-ID: <88F63906-A38B-4E3D-A74B-900C1B428C0A@oracle.com>

Hi,

Please review the trivial webrev[1] that fixes the usage of the unsupported JType.Function.getSourceSignature, which causes the UOE[2].

Because JType.Function doesn't have information about the function name, a proper implementation is not viable.

Cheers,
Henry

[1] http://cr.openjdk.java.net/~henryjen/panama/8223641/webrev/
[2] https://bugs.openjdk.java.net/browse/JDK-8223641

From henry.jen at oracle.com Thu May 9 18:50:22 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Thu, 9 May 2019 11:50:22 -0700
Subject: hg: panama/dev: Summary: Add libclang version check and option to select version
In-Reply-To: <0bfffba59f3081b6f7d6e6bcdf2bc71e@xs4all.nl>
References: <201905071755.x47HtbQd026672@aojmv0008.oracle.com> <0bfffba59f3081b6f7d6e6bcdf2bc71e@xs4all.nl>
Message-ID:

Thanks for the explanation. If the goal is to make sure we are using specific versions, then this is good.

Cheers,
Henry

> On May 9, 2019, at 2:58 AM, Jorn Vernee wrote:
>
> Yes this is desired. We don't support libclang 8 and 9 currently, but we know 7 works, so we set that as a default. It's a way of saying "please use 7, because we know it works". But there could be reasons why people want to use other versions, for instance to experiment, so there's still an option to change the version using --with-libclang-version.
>
>> Anything else should be optional to accommodate special cases.
>
> So yeah, anything but version 7 would be considered a special case at this point.
>
> If you're using another version that also works, we could add that as a default supported version as well, e.g. for 6:
>
> ```
> diff -r 819585c6b8e3 make/autoconf/lib-clang.m4
> --- a/make/autoconf/lib-clang.m4 Tue May 07 18:51:44 2019 +0200
> +++ b/make/autoconf/lib-clang.m4 Thu May 09 11:48:51 2019 +0200
> @@ -61,7 +61,7 @@
>      LIBCLANG_VERSION="$with_libclang_version"
>      AC_MSG_RESULT([$LIBCLANG_VERSION (manually specified)])
>    else
> -    LIBCLANG_VERSION="7"
> +    LIBCLANG_VERSION="[67]"
>      AC_MSG_RESULT([$LIBCLANG_VERSION (default)])
>    fi
>
> ```
>
> Since we use grep to do the filtering.
>
> Jorn
>
> Henry Jen wrote on 2019-05-09 08:11:
>> This change forces me to use --with-libclang-version unless I am using
>> 7; is this desired?
>> Normally, --with-libclang should be enough and there would be only one
>> version as the bundle is one single version.
>> Anything else should be optional to accommodate special cases.
>> Cheers,
>> Henry
>>> On May 7, 2019, at 10:55 AM, jbvernee at xs4all.nl wrote:
>>> Changeset: 819585c6b8e3
>>> Author: jvernee
>>> Date: 2019-05-07 18:51 +0200
>>> URL: http://hg.openjdk.java.net/panama/dev/rev/819585c6b8e3
>>> Summary: Add libclang version check and option to select version
>>> Reviewed-by: mcimadamore
>>> ! make/autoconf/lib-clang.m4

From sundararajan.athijegannathan at oracle.com Fri May 10 02:09:34 2019
From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan)
Date: Fri, 10 May 2019 07:39:34 +0530
Subject: [foreign] RFR: 8223641: Run jextract with --log FINE cause UnsupportedOperationException
In-Reply-To: <88F63906-A38B-4E3D-A74B-900C1B428C0A@oracle.com>
References: <88F63906-A38B-4E3D-A74B-900C1B428C0A@oracle.com>
Message-ID: <5CD4DD5E.8010000@oracle.com>

Looks good.

-Sundar

On 10/05/19, 12:11 AM, Henry Jen wrote:
> Hi,
>
> Please review the trivial webrev[1] that fixes the usage of the unsupported JType.Function.getSourceSignature, which causes the UOE[2].
>
> Because JType.Function doesn't have information about the function name, a proper implementation is not viable.
> > Cheers, > Henry > > [1] http://cr.openjdk.java.net/~henryjen/panama/8223641/webrev/ > [2] https://bugs.openjdk.java.net/browse/JDK-8223641 From henry.jen at oracle.com Fri May 10 04:43:54 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Fri, 10 May 2019 04:43:54 +0000 Subject: hg: panama/dev: 8223641: Run jextract with --log FINE cause UnsupportedOperationException Message-ID: <201905100443.x4A4hst9026898@aojmv0008.oracle.com> Changeset: d9ded289d4dd Author: henryjen Date: 2019-05-09 21:42 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/d9ded289d4dd 8223641: Run jextract with --log FINE cause UnsupportedOperationException Reviewed-by:sundar ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactory.java From maurizio.cimadamore at oracle.com Fri May 10 04:49:30 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 10 May 2019 04:49:30 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905100449.x4A4nUsd028509@aojmv0008.oracle.com> Changeset: 07fbc275bd3c Author: mcimadamore Date: 2019-05-10 06:49 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/07fbc275bd3c Automatic merge with foreign From Joshua.Zhu at arm.com Fri May 10 06:41:21 2019 From: Joshua.Zhu at arm.com (Joshua Zhu (Arm Technology China)) Date: Fri, 10 May 2019 06:41:21 +0000 Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes In-Reply-To: References: Message-ID: Hi Vladimir, > It looks promising to introduce a variant of > VectorIntrinsics.checkIndex() which is used to guard a fast path and is > annotated with a JIT-compiler hint (akin to > java.lang.invoke.MethodHandleImpl.profileBoolean() [1], but without > profiling logic) to override bytecode profiling info, so JIT always puts an > uncommon trap on the false branch. Thanks for your comments. 
As you suggested, I introduced VectorIntrinsics.expectTrue() in change [2].
It's used as below:

    if (expectTrue(condition)) {
        // fast path
    } else {
        // slow path: uncommon trap
    }

I also wrote a jmh case [3] to check the performance.
See the table below for the jmh test results.
(In Throughput Mode, Unit: ops/ms)

                                Base               without expectTrue   UncommonTrap
                                                   (patch [1])          (patch [2])
1000 fastPath                   318.228 ± 22.588   457.967 ± 12.622     457.328 ± 11.932
10000 fastPath                   21.991 ±  2.496    23.360 ±  0.070      24.744 ±  0.213
100000 fastPath                   1.613 ±  0.007     1.581 ±  0.031       1.631 ±  0.003
1000 fastPath + 1 slowPath      N/A                 57.298 ± 11.033      55.845 ±  0.716
10000 fastPath + 1 slowPath     N/A                  4.537 ±  0.536      15.164 ±  0.098
100000 fastPath + 1 slowPath    N/A                  0.577 ±  0.048       1.564 ±  0.005

[1] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/webrev.01/
[2] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/uncommontrap.webrev.00/
[3] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/IntVectorJmhTest.java

Please help review and feel free to share your comments.
Thanks.

Best Regards,
Joshua

From sundararajan.athijegannathan at oracle.com Fri May 10 08:09:25 2019
From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan)
Date: Fri, 10 May 2019 13:39:25 +0530
Subject: [foreign-memaccess] RFR: Flatten package hierarchy
In-Reply-To:
References:
Message-ID: <5CD531B5.2060607@oracle.com>

Looks good

-Sundar

On 09/05/19, 6:09 PM, Maurizio Cimadamore wrote:
> Given the small number of classes/interfaces involved in this API, I
> think it would be preferable to use a single package (java.foreign).
>
> This choice will also be resilient for when we do the later ABI layer,
> which only introduces one public SystemABI interface.
>
> For higher-level binder/jextract stuff, I think it makes sense to use
> a separate package (java.foreign.binder?)
> > http://cr.openjdk.java.net/~mcimadamore/panama/8223614/ > > Maurizio > > From sundararajan.athijegannathan at oracle.com Fri May 10 08:12:47 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Fri, 10 May 2019 13:42:47 +0530 Subject: [foreign-memaccess] RFR 8223629: Remove Descriptor and Function In-Reply-To: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> References: <418c8309-687e-cd0e-9f1c-4714c9a5c6fa@oracle.com> Message-ID: <5CD5327F.9060000@oracle.com> Looks good -Sundar On 09/05/19, 8:06 PM, Maurizio Cimadamore wrote: > Hi, > this small patch is to remove Descriptor and Function which are unused > in the memory access layer. Some simplifications to the Address layout > class were needed to adjust for that. > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/panama/8223629/ > > Maurizio > > > From maurizio.cimadamore at oracle.com Fri May 10 14:11:50 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 10 May 2019 14:11:50 +0000 Subject: hg: panama/dev: 8223614: Flatten package hierarchy Message-ID: <201905101411.x4AEBoSS019421@aojmv0008.oracle.com> Changeset: fcaa893dfec4 Author: mcimadamore Date: 2019-05-10 15:09 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/fcaa893dfec4 8223614: Flatten package hierarchy Reviewed-by: sundar ! src/java.base/share/classes/java/foreign/AbstractDescriptor.java < src/java.base/share/classes/java/foreign/layout/AbstractDescriptor.java ! src/java.base/share/classes/java/foreign/Address.java < src/java.base/share/classes/java/foreign/layout/Address.java ! src/java.base/share/classes/java/foreign/Descriptor.java < src/java.base/share/classes/java/foreign/layout/Descriptor.java ! src/java.base/share/classes/java/foreign/Function.java < src/java.base/share/classes/java/foreign/layout/Function.java ! src/java.base/share/classes/java/foreign/Group.java < src/java.base/share/classes/java/foreign/layout/Group.java ! 
src/java.base/share/classes/java/foreign/Layout.java < src/java.base/share/classes/java/foreign/layout/Layout.java ! src/java.base/share/classes/java/foreign/LayoutPath.java < src/java.base/share/classes/java/foreign/layout/LayoutPath.java ! src/java.base/share/classes/java/foreign/MemoryAddress.java < src/java.base/share/classes/java/foreign/memory/MemoryAddress.java ! src/java.base/share/classes/java/foreign/MemoryScope.java < src/java.base/share/classes/java/foreign/memory/MemoryScope.java ! src/java.base/share/classes/java/foreign/Padding.java < src/java.base/share/classes/java/foreign/layout/Padding.java ! src/java.base/share/classes/java/foreign/Sequence.java < src/java.base/share/classes/java/foreign/layout/Sequence.java ! src/java.base/share/classes/java/foreign/Unresolved.java < src/java.base/share/classes/java/foreign/layout/Unresolved.java ! src/java.base/share/classes/java/foreign/Value.java < src/java.base/share/classes/java/foreign/layout/Value.java ! src/java.base/share/classes/java/lang/invoke/AddressVarHandleGenerator.java ! src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java ! src/java.base/share/classes/java/lang/invoke/VarHandles.java ! src/java.base/share/classes/java/lang/invoke/X-VarHandleMemoryAddressView.java.template ! src/java.base/share/classes/jdk/internal/access/JavaLangInvokeAccess.java ! src/java.base/share/classes/jdk/internal/foreign/LayoutPathsImpl.java ! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java ! src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java ! src/java.base/share/classes/module-info.java ! 
test/jdk/java/foreign/TestMemoryAccess.java From maurizio.cimadamore at oracle.com Fri May 10 14:25:09 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 10 May 2019 14:25:09 +0000 Subject: hg: panama/dev: 8223629: Remove Descriptor and Function Message-ID: <201905101425.x4AEPAZu027042@aojmv0008.oracle.com> Changeset: 56b3b64161d4 Author: mcimadamore Date: 2019-05-10 15:20 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/56b3b64161d4 8223629: Remove Descriptor and Function Reviewed-by: sundar ! src/java.base/share/classes/java/foreign/Address.java ! src/java.base/share/classes/java/foreign/Group.java ! src/java.base/share/classes/java/foreign/Layout.java ! src/java.base/share/classes/java/foreign/Padding.java ! src/java.base/share/classes/java/foreign/Unresolved.java ! src/java.base/share/classes/java/foreign/Value.java From maurizio.cimadamore at oracle.com Fri May 10 14:26:12 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 10 May 2019 14:26:12 +0000 Subject: hg: panama/dev: 8223629: Remove Descriptor and Function Message-ID: <201905101426.x4AEQCmX028759@aojmv0008.oracle.com> Changeset: 78792f07f470 Author: mcimadamore Date: 2019-05-10 15:25 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/78792f07f470 8223629: Remove Descriptor and Function Reviewed-by: sundar Forgot to hg add/remove files - src/java.base/share/classes/java/foreign/AbstractDescriptor.java + src/java.base/share/classes/java/foreign/AbstractLayout.java - src/java.base/share/classes/java/foreign/Descriptor.java - src/java.base/share/classes/java/foreign/Function.java From maurizio.cimadamore at oracle.com Fri May 10 15:24:45 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 10 May 2019 16:24:45 +0100 Subject: [foreign-memaccess] RFR 8223712: Investigate more fluent LayoutPath API Message-ID: <629317d5-d940-684a-43ec-8259ee525c22@oracle.com> Hi, after some discussion with Brian, 
here's an attempt to make the LayoutPath lookup API simpler, by removing the Stream-returning lookup method.

Now, you just ask a path out of a layout (the path you obtain is rooted there), and then you proceed from there calling various builder-like methods - example:

Sequence seq = Sequence.of(20,
        Sequence.of(10,
                Group.struct(
                        Padding.of(32),
                        Value.ofUnsignedInt(32).withName("elem")
                )
        ));

seq.toPath()
        .sequenceElement()
        .sequenceElement()
        .groupElement("elem")

I've also added support for 'by-index' lookups, so the above can also be written like this:

seq.toPath()
        .sequenceElement()
        .sequenceElement()
        .groupElement(1)

Webrev:

http://cr.openjdk.java.net/~mcimadamore/panama/8223712/

As a side-bar, I'm growing more and more skeptical about the relationship between Sequence and Group; there are sequences that cannot be represented as groups:

* if sequence has more than 2^32 elements (in which case we cannot create a big enough group for it)
* if sequence has an unknown arity (which we know in the case of native interop can happen)

The latter case is important - when you have a pointer, languages are typically very loose as to whether you point to ONE element of that kind, or MANY. It would be nice to be able to model these different cases with different layouts.

Maurizio

From jbvernee at xs4all.nl Fri May 10 17:27:42 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 10 May 2019 19:27:42 +0200
Subject: [foreign-memaccess] RFR 8223712: Investigate more fluent LayoutPath API
In-Reply-To: <629317d5-d940-684a-43ec-8259ee525c22@oracle.com>
References: <629317d5-d940-684a-43ec-8259ee525c22@oracle.com>
Message-ID: <6600f5fc3a29efcb4583289c960f5c05@xs4all.nl>

Hi,

I think this is a nice move!

One minor nit: In LayoutPathImpl::sequenceElement there is an unspecified throw of IllegalStateException. Should probably be a UOE instead?

There's also still a bit of a split between Sequence and Group. For instance, it is not possible to create a LayoutPath to an exact element of a sequence, and there is no way to make a LayoutPath to a group, where you would pass in the name/index of the element you want to access to the VarHandle later. Maybe it could be nice to add support for these missing cases?

Like you, I'm also a bit skeptical about the inheritance relationship between Sequence and Group. e.g. if you look at LayoutPathImpl::lookupGroup, any Group is a valid argument, except if it's a Sequence. This seems to violate LSP. I think it might be better to have Sequence inherit from AbstractLayout directly instead?

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-10 17:24:
> Hi,
> after some discussion with Brian, here's an attempt to make the
> LayoutPath lookup API simpler, by removing the Stream-returning lookup
> method.
>
> Now, you just ask a path out of a layout (the path you obtain is
> rooted there), and then you proceed from there calling various
> builder-like methods - example:
>
> Sequence seq = Sequence.of(20,
>         Sequence.of(10,
>                 Group.struct(
>                         Padding.of(32),
>                         Value.ofUnsignedInt(32).withName("elem")
>                 )
>         ));
>
>
> seq.toPath()
>         .sequenceElement()
>         .sequenceElement()
>         .groupElement("elem")
>
>
> I've also added support for 'by-index' lookups, so the above can also
> be written like this:
>
> seq.toPath()
>         .sequenceElement()
>         .sequenceElement()
>         .groupElement(1)
>
>
> Webrev:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8223712/
>
>
> As a side-bar, I'm growing more and more skeptical about the
> relationship between Sequence and Group; there are sequences that
> cannot be represented as groups:
>
> * if sequence has more than 2^32 elements (in which case we cannot
> create a big enough group for it)
> * if sequence has an unknown arity (which we know in the case of
> native interop can happen)
>
> The latter case is important - when you have a pointer, languages are
> typically very loose as to whether you point to ONE element of that
> kind, or MANY. It would be nice to be able to model these different
> cases with different layouts.
>
> Maurizio

From maurizio.cimadamore at oracle.com Fri May 10 18:29:48 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 10 May 2019 19:29:48 +0100
Subject: [foreign-memaccess] RFR 8223712: Investigate more fluent LayoutPath API
In-Reply-To: <6600f5fc3a29efcb4583289c960f5c05@xs4all.nl>
References: <629317d5-d940-684a-43ec-8259ee525c22@oracle.com> <6600f5fc3a29efcb4583289c960f5c05@xs4all.nl>
Message-ID: <8e70f54c-db85-2a2e-e7eb-36a3310d6dc5@oracle.com>

On 10/05/2019 18:27, Jorn Vernee wrote:
> Hi,
>
> I think this is a nice move!
>
> One minor nit: In LayoutPathImpl::sequenceElement there is an
> unspecified throw of IllegalStateException. Should probably be a UOE
> instead?

Yep - sorry, leftover.

>
> There's also still a bit of a split between Sequence and Group.
> For instance, it is not possible to create a LayoutPath to an exact
> element of a sequence, and there is no way to make a LayoutPath to a
> group, where you would pass in the name/index of the element you want
> to access to the VarHandle later. Maybe it could be nice to add
> support for these missing cases?

I think the former idea is probably interesting to explore (access array at position i, fixed). Other stuff, not so much - in other words, VarHandle access needs to be fast - so the coordinates have to be something which can be translated into an offset in O(1) [w/o allocating]. Looking up things in a map dynamically doesn't fall in this category.

> Like you, I'm also a bit skeptical about the inheritance relationship
> between Sequence and Group. e.g. if you look at
> LayoutPathImpl::lookupGroup, any Group is a valid argument, except if
> it's a Sequence. This seems to violate LSP. I think it might be better
> to have Sequence inherit from AbstractLayout directly instead?

Right, that's what got me started on this topic... (although I've been aware of the issues esp. w.r.t. unspecified sequences for a long time).

Maurizio

From maurizio.cimadamore at oracle.com Mon May 13 12:14:47 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 13:14:47 +0100
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs. Group layouts
Message-ID:

Hi,
as discussed previously, the relationship between sequences and groups is a bit shaky, as there are sequences that cannot be expressed as groups (sequences whose size is unbound).

This patch models groups and sequences independently:

* they all have a (minimal) common supertype called Compound (which supports only a Stream elements() method)

* Group implements Iterable, always has a size, and you can access elements by index

* Sequence doesn't implement Iterable, has an optional size, and has an accessor to retrieve the element layout (there's only one of them, regardless of the size)

I think this is much saner.

Webrev:

http://cr.openjdk.java.net/~mcimadamore/panama/8223768/

P.S.
of course Compound should be renamed to CompoundLayout in a later webrev (as for all other layout names).

P.P.S.
in a separate patch I'll explore generalizing this method:

LayoutPath::groupElement(long index)

with this:

LayoutPath::compoundElement(long index)

Since now we should be able to implement indexed access on both groups and sequences.

Maurizio

From maurizio.cimadamore at oracle.com Mon May 13 12:16:38 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 13:16:38 +0100
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs. Group layouts
In-Reply-To:
References:
Message-ID:

Btw, this patch is based on the one for 8223712 (which I cannot push as hg server is down ATM).

Maurizio
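
For readers following the thread, the Compound/Group/Sequence split described in the RFR can be sketched roughly as below. This is an illustrative model only, not the code in the webrev: every type here (Layout, Value, Compound, Group, Sequence) is a hypothetical stand-in, and returning an endless stream from elements() for an unsized sequence is an assumption made purely for the sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.OptionalLong;
import java.util.stream.Stream;

public class CompoundSketch {
    /** Hypothetical stand-in for the Layout top type. */
    interface Layout {}

    /** A leaf layout with a fixed size in bits. */
    static final class Value implements Layout {
        final long bits;
        Value(long bits) { this.bits = bits; }
    }

    /** Minimal common supertype: it only exposes a stream of element layouts. */
    interface Compound extends Layout {
        Stream<Layout> elements();
    }

    /** Heterogeneous members, always sized, iterable, indexable. */
    static final class Group implements Compound, Iterable<Layout> {
        private final List<Layout> members;
        Group(Layout... members) {
            this.members = Collections.unmodifiableList(Arrays.asList(members.clone()));
        }
        int size() { return members.size(); }
        Layout get(int index) { return members.get(index); }
        @Override public Iterator<Layout> iterator() { return members.iterator(); }
        @Override public Stream<Layout> elements() { return members.stream(); }
    }

    /** One element layout repeated; the size is optional and may exceed int range. */
    static final class Sequence implements Compound {
        private final OptionalLong size;
        private final Layout element;
        private Sequence(OptionalLong size, Layout element) {
            this.size = size;
            this.element = element;
        }
        static Sequence of(long size, Layout element) {
            return new Sequence(OptionalLong.of(size), element);
        }
        static Sequence ofUnbounded(Layout element) {
            return new Sequence(OptionalLong.empty(), element);
        }
        OptionalLong size() { return size; }
        Layout element() { return element; }
        @Override public Stream<Layout> elements() {
            // Assumption for this sketch: an unsized sequence yields an endless stream.
            Stream<Layout> repeated = Stream.generate(() -> element);
            return size.isPresent() ? repeated.limit(size.getAsLong()) : repeated;
        }
    }

    public static void main(String[] args) {
        Group struct = new Group(new Value(32), new Value(32));
        Sequence huge = Sequence.of(1L << 33, new Value(8));      // more than 2^32 elements
        Sequence unknown = Sequence.ofUnbounded(new Value(8));    // unknown arity
        System.out.println(struct.size());
        System.out.println(huge.size().getAsLong());
        System.out.println(unknown.size().isPresent());
    }
}
```

Neither the huge nor the unsized sequence could be modeled as a Group here, since Group keeps one entry per member; that is the mismatch the RFR is trying to remove.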

From jbvernee at xs4all.nl Mon May 13 13:40:50 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Mon, 13 May 2019 15:40:50 +0200
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs. Group layouts
In-Reply-To:
References:
Message-ID:

Hi,

One comment:

- LayoutPathImpl::sequenceElement

  Currently, looking up sequence elements as contents of a Value does not work. e.g.

  Value v = Value.ofUnsignedInt(64).withContents(Sequence.of(8, Value.ofUnsignedInt(8)));
  VarHandle vh = v.toPath().sequenceElement().dereferenceHandle(byte.class); // Boom! IllegalStateException in LayoutPathImpl::sequenceElement

  If this case is not meant to be supported, I think the type of Value::contents should just be Optional instead of Optional.

Cheers,
Jorn

From maurizio.cimadamore at oracle.com Mon May 13 13:52:49 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 14:52:49 +0100
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs. Group layouts
In-Reply-To:
References:
Message-ID: <63f43867-a786-0704-5d79-f7e31c93b581@oracle.com>

On 13/05/2019 14:40, Jorn Vernee wrote:
> Hi,
>
> One comment:
>
> - LayoutPathImpl::sequenceElement
>
>   Currently, looking up sequence elements as contents of a Value does
> not work. e.g.
>
>   Value v = Value.ofUnsignedInt(64).withContents(Sequence.of(8,
> Value.ofUnsignedInt(8)));
>   VarHandle vh =
> v.toPath().sequenceElement().dereferenceHandle(byte.class); // Boom!
> IllegalStateException in LayoutPathImpl::sequenceElement
>
>   If this case is not meant to be supported, I think the type of
> Value::contents should just be Optional instead of
> Optional.

There are two parts to this: I think constructing paths to sub-value contents should work. But I also think that dereferencing sub-value content should not be supported and is best left to clients (as that almost always involves some bitmasking and a different set of carriers that is hard to express in a single API).

So, the idea is that dereferencing a path that points inside the guts of some value should throw some sane exception (I have some code which makes this more explicit).
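
The two-part rule above (building a path into a value's contents is fine, dereferencing it is not) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: the Path class, its valueContents/sequenceElement/dereferenceOffset methods, and the choice of UnsupportedOperationException are all hypothetical and are not taken from the webrev. The layout mirrors the example in the thread, a 64-bit value whose contents are a sequence of 8 bytes.

```java
public class SubValuePathSketch {
    /** Hypothetical bound path: a statically known bit offset, plus whether it points inside a Value. */
    static final class Path {
        final long bitOffset;
        final boolean insideValue;

        Path(long bitOffset, boolean insideValue) {
            this.bitOffset = bitOffset;
            this.insideValue = insideValue;
        }

        /** Entering the contents of a Value is allowed; the path just remembers that it did so. */
        Path valueContents() {
            return new Path(bitOffset, true);
        }

        /** Selects element 'index' of a sequence whose elements are 'elementBits' wide. */
        Path sequenceElement(long index, long elementBits) {
            return new Path(bitOffset + index * elementBits, insideValue);
        }

        /** Dereference is refused, with a clear exception, when the path ends inside a Value. */
        long dereferenceOffset() {
            if (insideValue) {
                throw new UnsupportedOperationException(
                        "path points inside a value; dereference not supported");
            }
            return bitOffset;
        }
    }

    /** Returns true if dereferencing the given path is rejected. */
    static boolean dereferenceRejected(Path p) {
        try {
            p.dereferenceOffset();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Path root = new Path(0, false);
        // Path construction into the value's contents works and yields a static offset.
        Path third = root.valueContents().sequenceElement(2, 8);
        System.out.println(third.bitOffset);            // 16
        System.out.println(dereferenceRejected(third)); // true
        System.out.println(dereferenceRejected(root));  // false
    }
}
```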

Of course, when we get to a more general version of compoundElement(long) which works on both sequences and groups, we should make sure that, given your layout, this works:

v.toPath().compoundElement(2);

A separate question is: given we don't intend to support dereference of value innards, should we keep the Value::contents API?

Regardless of the decision, I think the top type 'Compound' is a useful abstraction.

Maurizio

From jbvernee at xs4all.nl Mon May 13 14:16:54 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Mon, 13 May 2019 16:16:54 +0200
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs.
Group layouts
In-Reply-To: <63f43867-a786-0704-5d79-f7e31c93b581@oracle.com>
References: <63f43867-a786-0704-5d79-f7e31c93b581@oracle.com>
Message-ID:

> But I also think that dereferencing sub-value
> content should not be supported and best left to clients (as that
> almost always involves some bitmasking and a different set of carriers
> that is hard to express in a single API).

I think supporting this is good, as long as the sub-value matches a Java carrier. e.g. this would make it possible to split larger than 64 bit values into separate parts: u128=[2:u64] and still make them accessible that way, while keeping the semantic meaning that it's actually a 128 bit value. For those cases no bitmasking is needed. Clients could create their own high-level carrier which wraps a `long` carrier VarHandle, and does multiple gets/sets on the underlying VarHandle per high-level get/set.

> Regardless of the decision, I think the top type 'Compound' is a
> useful abstraction.

+1

Jorn

From maurizio.cimadamore at oracle.com Mon May 13 14:30:26 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 15:30:26 +0100
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs. Group layouts
In-Reply-To:
References: <63f43867-a786-0704-5d79-f7e31c93b581@oracle.com>
Message-ID: <34eda079-8295-2382-e1d4-a4410fd3cdef@oracle.com>

On 13/05/2019 15:16, Jorn Vernee wrote:
>> But I also think that dereferencing sub-value
>> content should not be supported and best left to clients (as that
>> almost always involves some bitmasking and a different set of carriers
>> that is hard to express in a single API).
>
> I think supporting this is good, as long as the sub-value matches a
> Java carrier. e.g. this would make it possible to split larger than 64
> bit values into separate parts: u128=[2:u64] and still make them
> accessible that way, while keeping the semantic meaning that it's
> actually a 128 bit value. For those cases no bitmasking is needed.
> Clients could create their own high-level carrier which wraps a `long`
> carrier VarHandle, and does multiple gets/sets on the underlying
> VarHandle per high-level get/set.

Not sure... it seems to me that if you have u128 = [2:u64] and you make a VarHandle::compareAndSet - I think the desired semantics is that atomic access occurs when reading the 128 bits from memory - but, in reality, this will result in a weaker access mode - e.g. there will be atomic access only on the accessed 64 bits, which means another client would be able to update the other 64 bits at the same time as access is performed.

In other words, the dereference operation should map into an equivalent Unsafe::xyz operation which does the low level memory access; there's no operation to do a read of 128 bits - so I don't think we want to fake it.

When we have more carriers (e.g.
Valhalla values) we will be able to add more basic types to the dereference operation. Until then I believe it's better to punt.

Your use case of u128=[2:u64] would really be better served by just having [2:u64], which is actually the only semantics we can implement with the VM plumbing we have available so far.

Maurizio
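
The compareAndSet concern raised in this reply can be made concrete with a plain long[] VarHandle from java.lang.invoke (Java 9+), where two array slots stand in for the two u64 halves of u128=[2:u64]. The sketch is deliberately single-threaded and simulates the interleaving by hand, so the outcome is deterministic; the class and method names are made up for illustration, and the long[] is only an assumed stand-in for native memory.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class SplitCasSketch {
    // Each access through this handle covers exactly one 64-bit array element.
    static final VarHandle HALF = MethodHandles.arrayElementVarHandle(long[].class);

    /**
     * Writer A tries to replace the logical 128-bit value {1, 2} with {10, 20}
     * using one 64-bit CAS per half, while writer B updates the second half
     * in between. Returns {low, high, lowSwapped, highSwapped}.
     */
    static long[] tornUpdate() {
        long[] u128 = {1L, 2L};                                    // low half first
        boolean loSwapped = HALF.compareAndSet(u128, 0, 1L, 10L);  // writer A, step 1
        HALF.setVolatile(u128, 1, 99L);                            // writer B sneaks in
        boolean hiSwapped = HALF.compareAndSet(u128, 1, 2L, 20L);  // writer A, step 2
        return new long[] { u128[0], u128[1], loSwapped ? 1 : 0, hiSwapped ? 1 : 0 };
    }

    public static void main(String[] args) {
        long[] r = tornUpdate();
        // The stored value is a mix of A's low half and B's high half,
        // which neither writer intended as a whole 128-bit value.
        System.out.println(r[0] + " " + r[1] + " lo=" + r[2] + " hi=" + r[3]);
    }
}
```

Each individual CAS is atomic, but the pair is not, which is why exposing such a layout as a single 128-bit carrier would promise more than the underlying 64-bit operations deliver.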

From maurizio.cimadamore at oracle.com Mon May 13 14:35:46 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Mon, 13 May 2019 14:35:46 +0000
Subject: hg: panama/dev: 8223712: Investigate more fluent LayoutPath API
Message-ID: <201905131435.x4DEZkUM006939@aojmv0008.oracle.com>

Changeset: 99f4725fefac
Author: mcimadamore
Date: 2019-05-13 12:16 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/99f4725fefac

8223712: Investigate more fluent LayoutPath API

! src/java.base/share/classes/java/foreign/AbstractLayout.java
! src/java.base/share/classes/java/foreign/Layout.java
! src/java.base/share/classes/java/foreign/LayoutPath.java
! src/java.base/share/classes/jdk/internal/foreign/LayoutPathsImpl.java
! test/jdk/java/foreign/TestMemoryAccess.java

From maurizio.cimadamore at oracle.com Mon May 13 14:43:50 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 15:43:50 +0100
Subject: [foreign-memaccess] RFR 8223778: Path lookup API refinements
Message-ID:

Hi,
this patch, built on top of 8223768, addresses the following issues:

* the API allows a VarHandle with a mismatching carrier size to be created

* the API allows creation of a boolean.class VarHandle, which is not supported internally

* the API allows dereference of a path pointing inside a Value sub-structure (which should not be supported)

* LayoutPath::groupElement(long) doesn't work for indexed access inside a Sequence

Thanks to Jorn for pointing out some of these issues.

I've also done a cleanup of the javadoc in LayoutPath, and added the notion of bound vs. unbound layout path. A bound layout path is a path we know everything about _statically_. As such we can ask questions such as 'offset'. Unbound paths have one or more access dimensions which are supplied at runtime.

Webrev

http://cr.openjdk.java.net/~mcimadamore/panama/8223778/

Maurizio

From jbvernee at xs4all.nl Mon May 13 15:33:50 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Mon, 13 May 2019 17:33:50 +0200
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs. Group layouts
In-Reply-To: <34eda079-8295-2382-e1d4-a4410fd3cdef@oracle.com>
References: <63f43867-a786-0704-5d79-f7e31c93b581@oracle.com> <34eda079-8295-2382-e1d4-a4410fd3cdef@oracle.com>
Message-ID:

In that case I think removing Value::contents sounds good as well. If the focus is memory access, a part of the Layout API that can not be used for that seems unneeded.
Besides, this seems like something that can be added back fairly easily later. Eventually we might even want to choose a different implementation for this (e.g. sub-type of Group, or something else).

Jorn

From maurizio.cimadamore at oracle.com Mon May 13 15:44:03 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 16:44:03 +0100
Subject: [foreign-memaccess] RFR 8223768: Rethink relationship between Sequence vs.
Group layouts In-Reply-To: References: <63f43867-a786-0704-5d79-f7e31c93b581@oracle.com> <34eda079-8295-2382-e1d4-a4410fd3cdef@oracle.com> Message-ID: On 13/05/2019 16:33, Jorn Vernee wrote: > In that case I think removing Value::content sounds good as well. If > the focus is memory access, a part of the Layout API that can not be > used for that seems unneeded. Besides, this seems like something that > can be added back fairly easily later. Eventually we might even want > to choose a different implementation for this (e.g. sub-type of Group, > or something else). Agree on both fronts - e.g. on dropping 'contents' for now, and on providing some 'facilities' to do bitmasking (e.g. some ways to build MH on top of the 'container' VH?) at a later point. P.S. For the case you were suggesting, I think what would be great would be to surface an int128 as an IntVector with 4 lanes, which then gives you all the primitives you need to access/express computation on the lane elements. But this needs to wait for Vector API - luckily we can add more carriers as we're more ready for it :-) Maurizio > > Jorn > > Maurizio Cimadamore schreef op 2019-05-13 16:30: >> On 13/05/2019 15:16, Jorn Vernee wrote: >>>> But I also think that dereferencing sub-value >>>> content should not be supported and best left to clients (as that >>>> almost always involve some bitmasking and a different set of carriers >>>> that is hard to be expressed in a single the API). >>> >>> I think supporting this is good, as long as the sub-value matches a >>> Java carrier. e.g. this would make it possible to split larger than >>> 64 bit values into separate parts: u128=[2:u64] and still make them >>> accessible that way, while keeping the semantic meaning that it's >>> actually a 128 bit value. For those cases no bitmasking is needed. 
>>> Clients could create their own high-level carrier which wraps a >>> `long` carrier VarHandle, and does multiple gets/sets on the >>> underlying VarHandle per high-level get/set. >> >> Not sure... it seems to me that if you have u128 = [2:u64] and you >> make a VarHandle::compareAndSet - I think the desired semantics is >> that atomic access occurs when reading the 128 bits from memory - but, >> in reality, this will result in weaker access mode - e.g. there will >> be atomic access only on the accessed 64 bits, which means another >> client would be able to update the other 64 bits at the same time as >> access is performed. >> >> In other words, the dereference operation should map into an >> equivalent Unsafe::xyz operation which does the low level memory >> access; there's no operation to do a read of 128 bits - so I don't >> think we want to fake it. >> >> When we will have more carriers (e.g. Valhalla values) we will be able >> to add more basic types to the dereference operation. Until then I >> believe its better to punt. >> >> Your use case of u128=[2:u64] would really be better served by just >> having [2:u64], which is actually the only semantics we can implement >> with the VM plumbing we have available so far. >> >> Maurizio >> >>> >>>> Regardless of the decision, I think the top type 'Compound' is an >>>> useful abstraction. >>> >>> +1 >>> >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-13 15:52: >>>> On 13/05/2019 14:40, Jorn Vernee wrote: >>>>> Hi, >>>>> >>>>> One comment: >>>>> >>>>> - LayoutPathImpl::sequenceElement >>>>> >>>>> ? Currently, looking up sequence elements as contents of a Value >>>>> does not work. e.g. >>>>> >>>>> ? Value v = Value.ofUnsignedInt(64).withContents(Sequence.of(8, >>>>> Value.ofUnsignedInt(8))); >>>>> ? VarHandle vh = >>>>> v.toPath().sequenceElement().dereferenceHandle(byte.class); // >>>>> Boom! IllegalStateException in LayoutPathImpl::sequenceElement >>>>> >>>>> ? 
If this case is not meant to be supported, I think the type of
>>>>> Value::contents should just be Optional instead of
>>>>> Optional.
>>>>
>>>> There are two parts to this: I think constructing paths to sub-value
>>>> contents should work. But I also think that dereferencing sub-value
>>>> content should not be supported and best left to clients (as that
>>>> almost always involves some bitmasking and a different set of carriers
>>>> that is hard to express in a single API).
>>>>
>>>> So, the idea is that dereferencing a path that points inside the guts of
>>>> some value should throw some sane exception (I have some code which
>>>> makes this more explicit).
>>>>
>>>> Of course, when we get to a more general version of
>>>> compoundElement(long) which works on both sequences and groups, we
>>>> should make sure that, given your layout, this works:
>>>>
>>>> v.toPath().compoundElement(2);
>>>>
>>>> A separate question is: given we don't intend to support dereference
>>>> of value innards, should we keep the Value::contents API?
>>>>
>>>> Regardless of the decision, I think the top type 'Compound' is a
>>>> useful abstraction.
>>>>
>>>> Maurizio
>>>>
>>>>>
>>>>> Cheers,
>>>>> Jorn
>>>>>
>>>>> Maurizio Cimadamore wrote on 2019-05-13 14:14:
>>>>>> Hi,
>>>>>> as discussed previously, the relationship between sequences and
>>>>>> groups
>>>>>> is a bit shaky, as there are sequences that cannot be expressed as
>>>>>> groups (sequences whose size is unbound).
>>>>>>
>>>>>> This patch models groups and sequences independently:
>>>>>>
>>>>>> * they all have a (minimal) common supertype called Compound (which
>>>>>> supports only a Stream elements() method)
>>>>>>
>>>>>> * Group implements Iterable, always has a size, and you can access
>>>>>> elements by index
>>>>>>
>>>>>> * Sequence doesn't implement Iterable, has an optional size, and has
>>>>>> an accessor to retrieve the element layout (there's only one of
>>>>>> them, regardless of the size)
>>>>>>
>>>>>> I think this is much saner.
>>>>>>
>>>>>> Webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8223768/
>>>>>>
>>>>>> P.S.
>>>>>> of course Compound should be renamed to CompoundLayout in a later
>>>>>> webrev (as for all other layout names).
>>>>>>
>>>>>> P.P.S.
>>>>>> in a separate patch I'll explore generalizing this method:
>>>>>>
>>>>>> LayoutPath::groupElement(long index)
>>>>>>
>>>>>> with this:
>>>>>>
>>>>>> LayoutPath::compoundElement(long index)
>>>>>>
>>>>>> Since now we should be able to implement indexed access on both
>>>>>> groups and sequences.
>>>>>>
>>>>>> Maurizio

From lev at serebryakov.spb.ru Mon May 13 15:48:30 2019
From: lev at serebryakov.spb.ru (Lev Serebryakov)
Date: Mon, 13 May 2019 18:48:30 +0300
Subject: [vector api] Massive benchmark result and some thoughts (and questions!) about Vector API
Message-ID:

First of all, sorry for the long message, but I hope it will be useful. Second, I'm very pleased with the Vector API itself and its current performance. Great job, thank you!

Now to the details.

I've finished a massive benchmarking run of my code written with the Vector API. It took more than 5 days of pure CPU time. You can find the code and a description of the operations (what all these names mean) here:
https://github.com/blacklion/panama-benchmarks/tree/master/vector

And the results here:
https://docs.google.com/spreadsheets/d/13obJR6I-1K8IEwrFpIrzvF2XrfEHa2NbOXIGE3VRzas/edit?usp=sharing

This document contains 5 sheets:
two with raw data and three with re-formatted data, which make analysis simpler.

All benchmarks were performed on Windows 10, with a JDK built from the "foreign+intrinsics" branch, commit 6a27ea0ccb81. The hardware is an i7-6700K CPU locked at 4.0 GHz (all energy saving features and Turbo Boost were turned off, multiplier fixed at '40' for all cores), running one thread (to be sure to avoid overheating). The baseline is simple plain old Java code with tight loops (without manual unrolling).

All benchmarks process vectors of size [almost] 65536, where one element of a vector is one real or complex number, depending on the operation. One real number is a `float` and one complex number is a pair of `float`s, so 65536 elements is either 65536 floats or 131072 floats, depending on the operation.

There are two batches of benchmarks:

The first run investigates the dependency between speed and batch size. Each operation processes 65536 elements not in one call to the low-level tight loop, but in portions of different sizes. These portions are called `callSize` in the benchmark, and I benchmarked sizes 3, 4, 7, 8, 15, 128, 1024, 65536. When 65536 is not divisible by `callSize` (3, 7, 15), slightly less data is processed, of course.

The second run investigates the dependency between speed and offset from the start of the Java array. Offsets 0..7 are used, but these results are not very interesting.

The best place to start with these results is the sheet "Variable callSize - Aggregated": it shows the difference between the baseline and vector implementations for each callSize. The "difference" columns show "(vector - baseline)/baseline", so "+100%" is x2. It is color-coded.

Here are some of my thoughts and questions:

(1) Don't try to outperform `System.arraycopy()` ;-)

(2) Looks like simple operations are vectorized by C2: `rv_dot_cv` or `cv_sum`, for example, are not much faster in the vector variant. `rv_dot_rv` is another example, it was hard to make it faster.

(3) `atan2` and `hypot` are DEAD SLOW in the `Math` package.
Look at `cv_abs` or `cv_args`, which are oh-my-god-incredibly fast in the Vector variant.

(4) I'm VERY surprised by the unexpected (to me) behavior of horizontal operations like `addLanes()` and `maxLanes()`. You can see different variants of horizontal summation here:
https://github.com/blacklion/panama-benchmarks/blob/master/vector/src/jmh/java/vector/specific/RVsum.java
I was sure that a vector accumulator with only one horizontal summation at the end of the loop would be fastest, but NO! The fastest variant uses `addLanes()` in the tight loop! Why is this?

(5) Looks like the Vector API should communicate with the C2 loop unroller better. You can see at
https://github.com/blacklion/panama-benchmarks/blob/master/vector/src/jmh/java/vector/specific/RVdotRV.java
that the best variant is the heavily manually unrolled one (saccum_unroll_4_2_fma_add_lanes), but it looks very, very ugly. BTW, this benchmark shows that a vector accumulator with a single `addLanes()` is again the slowest way to do the job. This is very surprising to me.

--
// Black Lion AKA Lev Serebryakov

From maurizio.cimadamore at oracle.com Mon May 13 16:34:36 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Mon, 13 May 2019 16:34:36 +0000
Subject: hg: panama/dev: 8223768: Rethink relationship between Sequence vs. Group layouts
Message-ID: <201905131634.x4DGYat3022070@aojmv0008.oracle.com>

Changeset: a132c54db720
Author: mcimadamore
Date: 2019-05-13 17:33 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/a132c54db720

8223768: Rethink relationship between Sequence vs. Group layouts
Split Group and Sequence; add Compound supertype.

! src/java.base/share/classes/java/foreign/Address.java
+ src/java.base/share/classes/java/foreign/Compound.java
! src/java.base/share/classes/java/foreign/Group.java
! src/java.base/share/classes/java/foreign/Sequence.java
! src/java.base/share/classes/java/foreign/Value.java
! src/java.base/share/classes/java/lang/invoke/VarHandles.java
!
src/java.base/share/classes/jdk/internal/foreign/LayoutPathsImpl.java
! test/jdk/java/foreign/TestMemoryAccess.java

From maurizio.cimadamore at oracle.com Mon May 13 18:05:18 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 13 May 2019 19:05:18 +0100
Subject: [foreign-memaccess] RFR 8223786: Rename layout annotation to layout attribute
Message-ID:

The term 'annotation' is too polluted when it comes to Java. Replacing it with something more neutral like 'attribute' could work better:

http://cr.openjdk.java.net/~mcimadamore/panama/8223786/

This is a good first step, but longer term I'm also considering dropping annotations and having layout names, size and alignment modeled more directly in the API. After all, the only reason why we need layout attributes at all in this API is to express alignment constraints; in the next few days I'll put together a prototype which implements alignment checks - and see whether it's feasible to expose alignment as a first class property of layouts (as sizes are).

Maurizio

From henry.jen at oracle.com Tue May 14 01:58:53 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Mon, 13 May 2019 18:58:53 -0700
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
Message-ID: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com>

Hi,

Please review a webrev[1] that detects an atomic type to get the correct layout for the type.

A proper solution would be to have libclang expose the atomic types, so we don't need to add the ugly hacks in Java to find the type. A patch has been submitted[2] to the clang project and hopefully this can be fixed in a future release.

Before that happens, our Java clang binding tries to do that work as follows:
- Put together a temporary header file with C11 types to parse, so we can get these builtin types.
- During the cursor traversal, we will add in extra types declared in the header files
- For an atomic type, we use the type string to get the underlying type.
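The last step in the list above could look roughly like the sketch below. This is not the code in the webrev: it assumes the type spelling reported by libclang has the form `_Atomic(T)`, and the class and method names are hypothetical. Spellings with leading qualifiers (e.g. `const _Atomic(int)`) are deliberately not handled.

```java
import java.util.Optional;

// Hypothetical helper, for illustration only: recovers the underlying type
// name from a type spelling of the (assumed) form "_Atomic(T)".
final class AtomicSpelling {
    private static final String PREFIX = "_Atomic(";

    // Returns the text between "_Atomic(" and the final ")", or empty if the
    // spelling does not have that shape.
    static Optional<String> underlyingType(String spelling) {
        String s = spelling.trim();
        if (!s.startsWith(PREFIX) || !s.endsWith(")")) {
            return Optional.empty();
        }
        String inner = s.substring(PREFIX.length(), s.length() - 1).trim();
        return inner.isEmpty() ? Optional.empty() : Optional.of(inner);
    }

    public static void main(String[] args) {
        System.out.println(underlyingType("_Atomic(int)"));
        System.out.println(underlyingType("_Atomic(struct SomeTypes *)"));
        System.out.println(underlyingType("unsigned long"));
    }
}
```

The recovered name would still have to be resolved to an actual type, which is where the global search described next comes in.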
Note that an atomic type is always a builtin type without a valid declaration cursor, so we cannot get the translation unit the type is actually defined in, so we are doing a global search. Since jextract will only create one translation unit, the hack should work just fine without much pollution.

Cheers,
Henry

[1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/
[2] https://reviews.llvm.org/D61716

From nick.gasson at arm.com Tue May 14 06:29:26 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 14 May 2019 14:29:26 +0800
Subject: [foreign] RFR: 8223808: initial port for AArch64
Message-ID:

Hi,

A few months ago I asked about a port of the foreign branch for AArch64, and in the last couple of weeks I had some time to do some work on it. I'm sending what I have for review now as I think Dmitry was also planning to start looking at this.

http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.0/

With this patch 49/53 tests under test/jdk/java/foreign/ pass on AArch64. The failures are:

java/foreign/StructByValueTest.java
java/foreign/UnalignedStructTest.java
java/foreign/Upcall/StructUpcall.java
java/foreign/LongDoubleTest.java

Currently passing a >16B struct by value doesn't work, because in this case we need to pass a pointer to a copy of the struct in one of the integer registers, but I couldn't immediately figure out how to modify CallingSequenceBuilderImpl to do this. Although I believe this is the same as the Windows x64 ABI (?) so I'll check the implementation there.

Returning a struct by value doesn't work yet as we need to pass a pointer to the temporary storage in the "indirect result location register" (r8).

Passing / returning long double doesn't work, I haven't investigated why.

Jextract also builds and runs, but some of the tests fail for the reasons above.

Because I based this patch on the existing code for x86 there's a lot of duplication now, particularly in universalNativeInvoker_aarch64.cpp.
I think all of the Shuffle* classes could be moved to some shared code, with very minimal platform specific #ifdefs. Similarly for some of the code in SharedUtils.java. If you're ok with this I can work on it in a separate patch?

Thanks,
Nick

From maurizio.cimadamore at oracle.com Tue May 14 08:32:43 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 14 May 2019 09:32:43 +0100
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To:
References:
Message-ID:

This is a solid piece of work - thanks!!

Some comments inline:

On 14/05/2019 07:29, Nick Gasson wrote:
> Hi,
>
> A few months ago I asked about a port of the foreign branch for
> AArch64, and in the last couple of weeks I had some time to do some
> work on it. I'm sending what I have for review now as I think Dmitry
> was also planning to start looking at this.
>
> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.0/
>
> With this patch 49/53 tests under test/jdk/java/foreign/ pass on
> AArch64. The failures are:
>
> java/foreign/StructByValueTest.java
> java/foreign/UnalignedStructTest.java
> java/foreign/Upcall/StructUpcall.java
> java/foreign/LongDoubleTest.java
>
> Currently passing a >16B struct by value doesn't work, because in this
> case we need to pass a pointer to a copy of the struct in one of the
> integer registers, but I couldn't immediately figure out how to modify
> CallingSequenceBuilderImpl to do this. Although I believe this is the
> same as the Windows x64 ABI (?) so I'll check the implementation there.

I see - yes, the Windows code does that. I think the magic happens here:

http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallingSequenceBuilderImpl.java#l150

and here:

http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/Windowsx64ABI.java#l125

That is, if the struct has certain characteristics (e.g.
smaller than 64 bits), Windows passes it in a register, otherwise it always passes it by reference (see the 'else' in the ABI::unbox code which creates a pointer).

(actually, now that I look at it, it's not clear to me why Windows's CallingSequenceBuilder is using ArgumentClass.INTEGER instead of ArgumentClass.POINTER for these cases)

>
> Returning a struct by value doesn't work yet as we need to pass a
> pointer to the temporary storage in the "indirect result location
> register" (r8).

This is probably similar to what happens in the SystemV ABI - where bigger structs are returned with an indirection pointer stored in the first integer register. For this we have CallingSequence::returnsInMemory and CallingSequence::returnInMemoryBindings. It is possible that this logic might require some generalization to work with custom registers - the current code assumes that the bindings for the return-in-memory should be created with position "-1" (return) by the builder, and given the class "INTEGER_ARGUMENT_REGISTER", see

http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequence.java#l70

Which might or might not be enough for you.

>
> Passing / returning long double doesn't work, I haven't investigated why.

long double is currently only working on SystemV - and the underlying implementation heavily relies on Intel x87 registers and instructions; note that in SystemV a 'long double' maps onto the extended precision IEEE format (80 bits) while in AArch64 it maps onto the quad precision IEEE format (128 bits).

Which means classification is probably not working correctly on AArch64 and the LongDouble LayoutType/Reference implementation in the binder needs some work too, since it assumes the 80-bit extended format.

>
> Jextract also builds and runs, but some of the tests fail for the
> reasons above.
>
> Because I based this patch on the existing code for x86 there's a lot
> of duplication now, particularly in
> universalNativeInvoker_aarch64.cpp. I think all of the Shuffle*
> classes could be moved to some shared code, with very minimal platform
> specific #ifdefs. Similarly for some of the code in SharedUtils.java.
> If you're ok with this I can work on it in a separate patch?

Sure, it makes sense to address that separately. The code used to be a lot messier than it is now - I'm positively surprised that, following the recent refactoring of the ABI classes, the Java duplication has been slashed quite a bit. The idea of SharedUtils is that it could be shared between all the x64 ABI implementations, and that worked, but now that we're introducing a new architecture we're seeing some growing pains, which is part of the game to get the implementation into a better state. As we've done for Windows, it is perfectly reasonable to duplicate now and to clean up later.

Process note/question: our internal systems test Windows x64, Linux x64 and MacOS (x64) - once the patch is ready to be integrated, how do we plan to make sure that we're not accidentally introducing AArch64 regressions (given that our nightlies will be oblivious to that?)

Maurizio

>
>
> Thanks,
> Nick

From nick.gasson at arm.com Tue May 14 10:15:47 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 14 May 2019 18:15:47 +0800
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To:
References:
Message-ID: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>

Hi Maurizio,

On 14/05/2019 16:32, Maurizio Cimadamore wrote:
>
> I see - yes, the Windows code does that. I think the magic happens here:
>
> [...]
>
> That is, if the struct has certain characteristics (e.g. smaller than 64
> bits), Windows passes it in a register, otherwise it always passes it by
> reference (see the 'else' in the ABI::unbox code which creates a pointer).
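The size rule discussed here can be sketched as follows. This is only an illustration, not the code in Windowsx64ABI or in the AArch64 webrev: the class, enum and method names are hypothetical. It assumes the Windows x64 rule that only structs of exactly 1, 2, 4 or 8 bytes are passed by value in a register, and the AArch64 rule that composites larger than 16 bytes are replaced by a pointer to a copy; homogeneous floating-point aggregates on AArch64 follow separate rules and are ignored.

```java
// Hypothetical classifier, for illustration only.
final class StructPassing {
    enum Kind { BY_VALUE, BY_REFERENCE }

    // Windows x64 (assumed rule): a struct is passed by value only if its
    // size is exactly 1, 2, 4 or 8 bytes; anything else goes by reference.
    static Kind windowsX64(long sizeBytes) {
        boolean fitsRegister = sizeBytes == 1 || sizeBytes == 2
                || sizeBytes == 4 || sizeBytes == 8;
        return fitsRegister ? Kind.BY_VALUE : Kind.BY_REFERENCE;
    }

    // AArch64 (assumed rule, ignoring HFAs): composites larger than 16 bytes
    // are copied by the caller and replaced by a pointer to the copy.
    static Kind aarch64(long sizeBytes) {
        return sizeBytes > 16 ? Kind.BY_REFERENCE : Kind.BY_VALUE;
    }

    public static void main(String[] args) {
        for (long size : new long[] {4, 8, 12, 16, 24}) {
            System.out.println(size + " bytes: windows=" + windowsX64(size)
                    + " aarch64=" + aarch64(size));
        }
    }
}
```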
>
> (actually, now that I look at it, it's not clear to me why Windows's
> CallingSequenceBuilder is using ArgumentClass.INTEGER instead of
> ArgumentClass.POINTER for these cases)

Thanks! I'd found the part where it assigns ArgumentClass.INTEGER for these arguments in Windows CallingSequenceBuilderImpl but missed the corresponding code in box/unboxValue. I think we can use the same approach on AArch64.

>
> This is probably similar to what happens in the SystemV ABI - where bigger
> structs are returned with an indirection pointer stored in the first
> integer register. For this we have CallingSequence::returnsInMemory and
> CallingSequence::returnInMemoryBindings. It is possible that this logic
> might require some generalization to work with custom registers - the
> current code assumes that the bindings for the return-in-memory should
> be created with position "-1" (return) by the builder, and given the
> class "INTEGER_ARGUMENT_REGISTER", see
>
> http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequence.java#l70
>
>
> Which might or might not be enough for you.

This is really useful, thanks. Maybe it's enough to add this as a special case in StorageCalculator::addBindings where forArguments==false, argumentIndex()==-1 and the argument class is INTEGER: then we can always allocate r8 for this argument. I'll experiment a bit.

>
> long double is currently only working on SystemV - and the underlying
> implementation heavily relies on Intel x87 registers and instructions;
> note that in SystemV a 'long double' maps onto the extended precision IEEE
> format (80 bits) while in AArch64 it maps onto the quad precision IEEE
> format (128 bits).
>
> Which means classification is probably not working correctly on AArch64
> and the LongDouble LayoutType/Reference implementation in the binder
> needs some work too, since it assumes the 80-bit extended format.
>

OK, thanks again for the hints!
> > Process note/question, our internal systems test Windows x64, Linux x64 > and MacOS (x64) - once the patch is ready to be integrated, how do we > plan to make sure that we're not accidentally introducing AArch64 > regressions (given that our nighties will be oblivious to that?) > We have an Arm-internal CI that we can configure to run nightly tests on the foreign branch (and the vectorIntrinsics branch too, as we're working on the AArch64 port of that). There's still a manual step of us posting to this list or sending a patch when something breaks though. The unit test for AArch64 CallingSequenceBuilder should run on x86 too. Nick From jbvernee at xs4all.nl Tue May 14 11:03:29 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Tue, 14 May 2019 13:03:29 +0200 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> Message-ID: >> This is probably similar to what happens in SystemV ABI - where biger >> structs are returned with an indirection pointer stored in the first >> integer register. For this we have CallingSequence::returnsInMemory >> and CalingSequence::returnInMemoryBindings. It is possible that this >> logic might require some generalization to work with custom registers >> - the current code assumes that the bindings for the return-in-memory >> should be created with position "-1" (return) by the builder, and >> given the class "INTEGER_ARGUMENT_REGISTER", see >> >> http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequence.java#l70 >> Which might or might not be enough for you. > > This is really useful, thanks. Maybe it's enough to add this as > special case in StorageCalculator::addBindings where > forArguments==false, argumentIndex()==-1 and the argument class is > INTEGER: then we can always allocate r8 for this argument. I'll > experiment a bit. 
I think ShuffleRecipe::make is currently dependent on the argument bindings for each class occurring in the order of their storage indices to generate skips. And CallingSequenceBuilder always adds the return binding first:
http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequenceBuilder.java#l54

If r8 is the first integer argument register this would work, but this doesn't seem to be the case? I think a quick-fix for this is adding a sorting pass of the bindings by storage index to CallingSequenceBuilder::build.

Jorn

Nick Gasson wrote on 2019-05-14 12:15:
> Hi Maurizio,
>
> On 14/05/2019 16:32, Maurizio Cimadamore wrote:
>>
>> I see - yes, the Windows code does that. I think the magic happens
>> here:
>>
>> [...]
>>
>> That is, if the struct has certain characteristics (e.g. smaller than
>> 64 bits), Windows passes it in a register, otherwise it always passes
>> it by reference (see the 'else' in the ABI::unbox code which creates a
>> pointer).
>>
>> (actually, now that I look at it, it's not clear to me why Windows's
>> CallingSequenceBuilder is using ArgumentClass.INTEGER instead of
>> ArgumentClass.POINTER for these cases)
>
> Thanks! I'd found the part where it assigns ArgumentClass.INTEGER for
> these arguments in Windows CallingSequenceBuilderImpl but missed the
> corresponding code in box/unboxValue. I think we can use the same
> approach on AArch64.
>
>>
>> This is probably similar to what happens in the SystemV ABI - where bigger
>> structs are returned with an indirection pointer stored in the first
>> integer register. For this we have CallingSequence::returnsInMemory
>> and CallingSequence::returnInMemoryBindings.
It is possible that this >> logic might require some generalization to work with custom registers >> - the current code assumes that the bindings for the return-in-memory >> should be created with position "-1" (return) by the builder, and >> given the class "INTEGER_ARGUMENT_REGISTER", see >> >> http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequence.java#l70 >> Which might or might not be enough for you. > > This is really useful, thanks. Maybe it's enough to add this as > special case in StorageCalculator::addBindings where > forArguments==false, argumentIndex()==-1 and the argument class is > INTEGER: then we can always allocate r8 for this argument. I'll > experiment a bit. > >> >> long double is currently only working on SystemV - and the underlying >> implementation heavily relies on Intel x87 registers and instructions; >> note that in SystemV a 'long double' maps onto extended precision IEEE >> format (80 bits) while in AArch64 it maps onto quad precision IEEE >> format (128 bits). >> >> Which means classification is probably not working correctly on >> AArch64 and the LongDouble LayoutType/Reference implementation in the >> binder need some work too, since they assume 80-bit extended format. >> > > OK, thanks again for the hints! > >> >> Process note/question, our internal systems test Windows x64, Linux >> x64 and MacOS (x64) - once the patch is ready to be integrated, how do >> we plan to make sure that we're not accidentally introducing AArch64 >> regressions (given that our nighties will be oblivious to that?) >> > > We have an Arm-internal CI that we can configure to run nightly tests > on the foreign branch (and the vectorIntrinsics branch too, as we're > working on the AArch64 port of that). There's still a manual step of > us posting to this list or sending a patch when something breaks > though. The unit test for AArch64 CallingSequenceBuilder should run on > x86 too. 
>
>
> Nick

From maurizio.cimadamore at oracle.com Tue May 14 11:03:37 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 14 May 2019 12:03:37 +0100
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
In-Reply-To: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com>
References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com>
Message-ID: <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com>

I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros.

That is, when you see a cursor with:

_Atomic("....")

extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support.

Maurizio

On 14/05/2019 02:58, Henry Jen wrote:
> Hi,
>
> Please review a webrev[1] that detects an atomic type to get the correct layout for the type.
>
> A proper solution would be to have libclang expose the atomic types, so we don't need to add the ugly hacks in Java to find the type. A patch has been submitted[2] to the clang project and hopefully this can be fixed in a future release.
>
> Before that happens, our Java clang binding tries to do that work as follows:
> - Put together a temporary header file with C11 types to parse, so we can get these builtin types.
> - During the cursor traversal, we will add in extra types declared in the header files
> - For an atomic type, we use the type string to get the underlying type.
>
> Note that an atomic type is always a builtin type without a valid declaration cursor, so we cannot get the translation unit the type is actually defined in, so we are doing a global search.
Since jextract will only create one translation unit, the hack should work just fine without much pollution.
>
> Cheers,
> Henry
>
> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/
> [2] https://reviews.llvm.org/D61716

From maurizio.cimadamore at oracle.com Tue May 14 11:13:43 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 14 May 2019 12:13:43 +0100
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To:
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
Message-ID:

On 14/05/2019 12:03, Jorn Vernee wrote:
> I think ShuffleRecipe::make is currently dependent on the argument
> bindings for each class occurring in the order of their storage
> indices to generate skips. And CallingSequenceBuilder always adds the
> return binding first:
> http://hg.openjdk.java.net/panama/dev/file/d9ded289d4dd/src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequenceBuilder.java#l54
>
>
> If r8 is the first integer argument register this would work, but this
> doesn't seem to be the case? I think a quick-fix for this is adding a
> sorting pass of the bindings by storage index to
> CallingSequenceBuilder::build.

Right - it seems like we should have separate treatment for return in memory bindings in CallingSequenceBuilder too - e.g. instead of assuming that we can just add another argument binding for it, we should add a 'return in memory' binding (which then on Windows/SysV will just delegate to addArgumentBindings).

Maurizio

From jean-philippe.halimi at intel.com Tue May 14 20:15:33 2019
From: jean-philippe.halimi at intel.com (Halimi, Jean-Philippe)
Date: Tue, 14 May 2019 20:15:33 +0000
Subject: [vector] RFR 82221429: Add tests for XXXVector.xxxAll(Mask<>)
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F48846A@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F48846A@ORSMSX106.amr.corp.intel.com>
Message-ID:

Hi Vivek,

Patch looks good to me.
Thanks -Jp -----Original Message----- From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On Behalf Of Deshpande, Vivek R Sent: Wednesday, May 8, 2019 12:04 PM To: panama-dev at openjdk.java.net Subject: [vector] RFR 82221429: Add tests for XXXVector.xxxAll(Mask<>) Hi all, I have worked on adding missing tests in the jtreg test suite for vector APIs. In this patch I have added tests for masked reductions and masked min, max. Also uncovered a bug in the masked minLanes maxLanes for Float/Double so added a fix for that. I have created a webrev for these changes. Requesting a review. Webrev - http://cr.openjdk.java.net/~vdeshpande/8221429/webrev.00/ Bug - https://bugs.openjdk.java.net/browse/JDK-8221429 Thanks, Vivek From vivek.r.deshpande at intel.com Tue May 14 20:18:56 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 14 May 2019 20:18:56 +0000 Subject: [vector] RFR 82221429: Add tests for XXXVector.xxxAll(Mask<>) In-Reply-To: References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F48846A@ORSMSX106.amr.corp.intel.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4E3D8E@ORSMSX106.amr.corp.intel.com> Thanks JP. Would push it soon. Regards, Vivek -----Original Message----- From: Halimi, Jean-Philippe Sent: Tuesday, May 14, 2019 1:16 PM To: Deshpande, Vivek R ; panama-dev at openjdk.java.net Subject: RE: [vector] RFR 82221429: Add tests for XXXVector.xxxAll(Mask<>) Hi Vivek, Patch looks good to me. Thanks -Jp -----Original Message----- From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On Behalf Of Deshpande, Vivek R Sent: Wednesday, May 8, 2019 12:04 PM To: panama-dev at openjdk.java.net Subject: [vector] RFR 82221429: Add tests for XXXVector.xxxAll(Mask<>) Hi all, I have worked on adding missing tests in the jtreg test suite for vector APIs. In this patch I have added tests for masked reductions and masked min, max. Also uncovered a bug in the masked minLanes maxLanes for Float/Double so added a fix for that. 
I have created a webrev for these changes. Requesting a review.

Webrev - http://cr.openjdk.java.net/~vdeshpande/8221429/webrev.00/
Bug - https://bugs.openjdk.java.net/browse/JDK-8221429

Thanks,
Vivek

From henry.jen at oracle.com Wed May 15 04:10:25 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Tue, 14 May 2019 21:10:25 -0700
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
In-Reply-To: <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com>
References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com>
Message-ID: <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com>

Using the reparse mechanism as in ClangParser seems a good idea, but there are complications when using a precompiled header.

In the test case I have, the struct SomeTypes is declared in the header file, but the way we create a new TU using the PCH doesn't work:

java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var;
/var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed

I like the idea of reparse, but the current way is not working (or I haven't figured it out), and I am not sure whether using or cloning (save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain, with the downside that we don't have all builtin types, but at this point, that's all we support anyway.

Since this is to work around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough.

Cheers,
Henry

> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote:
>
> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros.
> > That is, when you see a cursor with: > > _Atomic("....") > > extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. > > Maurizio > > On 14/05/2019 02:58, Henry Jen wrote: >> Hi, >> >> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >> >> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >> >> Before that happens, we have our java clang binding trying do that work by: >> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >> - During the cursor-traversing, we will add in extra types declared in the header files >> - For an atomic type, we use the type string to get the underlying type. >> >> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>> >> Cheers, >> Henry >> >> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >> [2] https://reviews.llvm.org/D61716 From henry.jen at oracle.com Wed May 15 04:36:19 2019 From: henry.jen at oracle.com (Henry Jen) Date: Tue, 14 May 2019 21:36:19 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> Message-ID: BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, java.lang.IllegalArgumentException: Error with snippet: uchar_t var; /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' Cheers, Henry > On May 14, 2019, at 9:10 PM, Henry Jen wrote: > > Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. > > In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: > > java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; > /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed > > I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. > > Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. 
> > Cheers, > Henry > >> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >> >> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >> >> That is, when you see a cursor with: >> >> _Atomic("....") >> >> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >> >> Maurizio >> >> On 14/05/2019 02:58, Henry Jen wrote: >>> Hi, >>> >>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>> >>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>> >>> Before that happens, we have our java clang binding trying do that work by: >>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>> - During the cursor-traversing, we will add in extra types declared in the header files >>> - For an atomic type, we use the type string to get the underlying type. >>> >>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>>>
>>> Cheers,
>>> Henry
>>>
>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/
>>> [2] https://reviews.llvm.org/D61716
>

From nick.gasson at arm.com Wed May 15 06:23:08 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 15 May 2019 14:23:08 +0800
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To:
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
Message-ID:

Hi Jorn and Maurizio,

Thanks for the feedback!

>>
>> If r8 is the first integer argument register this would work, but this
>> doesn't seem to be the case? I think a quick-fix for this is adding a
>> sorting pass of the bindings by storage index to
>> CallingSequenceBuilder::build.

This works but now it seems a bit of a kludge as r8 isn't normally an integer argument register.

>
> Right - it seems like we should have separate treatment for return in
> memory bindings in CallingSequenceBuilder too - e.g. instead of assume
> that we can just add another argument binding for it, we should add a
> 'return in memory' binding (which then on Windows/SysV will just
> delegate to addArgumentBindings).
>

Do you mean add a separate BindingsComputer for return-in-memory bindings?

I tried a different approach today of adding a new StorageClass INDIRECT_RESULT_REGISTER that is used in StorageCalculator::addBindings if forArguments && arg.argumentIndex() == -1. And then in universalNativeInvoker_aarch64.cpp we can load this directly into r8.

This seems quite neat to me, and adding a new storage class matches the ABI document more closely, although we need to change the x86 code to ignore this class.

With the changes to box/unboxValue from Windowsx64ABI, the StructByValueTest case passes now. If you're happy with this approach I can tidy it up and update the webrev?
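To make the idea a bit more concrete, here is a rough, self-contained Java sketch of that classification step. Only INDIRECT_RESULT_REGISTER, forArguments and the argumentIndex() == -1 check come from the description above; the other enum constants, the Argument class and the classify method are hypothetical stand-ins, not the actual StorageCalculator code, and stack slots are left out entirely.

```java
// Illustrative sketch only; this is not the actual Panama source.
final class IndirectResultSketch {

    // Hypothetical storage classes. Only INDIRECT_RESULT_REGISTER is named
    // in the discussion; the other two are placeholders.
    enum StorageClass {
        INTEGER_ARGUMENT_REGISTER,
        VECTOR_ARGUMENT_REGISTER,
        INDIRECT_RESULT_REGISTER
    }

    // Hypothetical argument descriptor. An index of -1 marks the hidden
    // pointer used when a struct is returned in memory.
    static final class Argument {
        private final int argumentIndex;
        private final boolean floatingPoint;

        Argument(int argumentIndex, boolean floatingPoint) {
            this.argumentIndex = argumentIndex;
            this.floatingPoint = floatingPoint;
        }

        int argumentIndex() { return argumentIndex; }
        boolean isFloatingPoint() { return floatingPoint; }
    }

    // Picks a storage class for one argument. The first branch mirrors the
    // condition "forArguments && arg.argumentIndex() == -1" from the mail.
    static StorageClass classify(Argument arg, boolean forArguments) {
        if (forArguments && arg.argumentIndex() == -1) {
            return StorageClass.INDIRECT_RESULT_REGISTER;
        }
        return arg.isFloatingPoint()
                ? StorageClass.VECTOR_ARGUMENT_REGISTER
                : StorageClass.INTEGER_ARGUMENT_REGISTER;
    }

    public static void main(String[] args) {
        System.out.println(classify(new Argument(-1, false), true));
        System.out.println(classify(new Argument(0, false), true));
        System.out.println(classify(new Argument(1, true), true));
    }
}
```

With a dedicated class like this, the AArch64 invoker can recognise the hidden return pointer by its storage class rather than by its position among the integer argument registers.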
Thanks,
Nick

From maurizio.cimadamore at oracle.com Wed May 15 09:51:05 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 15 May 2019 10:51:05 +0100
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To:
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
Message-ID:

On 15/05/2019 07:23, Nick Gasson wrote:
> Hi Jorn and Maurizio,
>
> Thanks for the feedback!
>
>>>
>>> If r8 is the first integer argument register this would work, but
>>> this doesn't seem to be the case? I think a quick-fix for this is
>>> adding a sorting pass of the bindings by storage index to
>>> CallingSequenceBuilder::build.
>
> This works but now it seems a bit of a kludge as r8 isn't normally an
> integer argument register.
>
>>
>> Right - it seems like we should have separate treatment for return in
>> memory bindings in CallingSequenceBuilder too - e.g. instead of
>> assume that we can just add another argument binding for it, we
>> should add a 'return in memory' binding (which then on Windows/SysV
>> will just delegate to addArgumentBindings).
>>
>
> Do you mean add a separate BindingsComputer for returns in memory
> bindings?

Yes

>
>
> I tried a different approach today of adding a new StorageClass
> INDIRECT_RESULT_REGISTER that is used in
> StorageCalculator::addBindings if forArguments && arg.argumentIndex()
> == -1. And then in universalNativeInvoker_aarch64.cpp we can load this
> directly into r8.
>
> This seems quite neat to me, and adding a new storage class matches
> the ABI document more closely. Although we need to change the x86 to
> ignore this class.

This seems a good direction to explore. I believe that, in general, our assumption that platforms have similar register classes is only partially valid. This is visible with the X87 register class, which doesn't really make sense on a non-Intel machine.
But adding a class as the code is now can be problematic, as there's a 1-1 correspondence between these and ShuffleRecipeClass, which is also shared for all platforms. How did you go about changing that class? I think it could be better, for now, to stick with some hacky solution like the one you proposed above, which keeps the current (broken) assumption, and then revisit later a way to make the code better/more direct?

Maurizio

>
> With the changes to box/unboxValue from Windowsx64ABI, the
> StructByValueTest case passes now. If you're happy with this approach
> I can tidy it up and update the webrev?
>
>
> Thanks,
> Nick

From maurizio.cimadamore at oracle.com Wed May 15 10:11:53 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 15 May 2019 11:11:53 +0100
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
In-Reply-To:
References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com>
Message-ID: <5996b06f-ba3c-4106-73d3-3c20b0d6fb30@oracle.com>

The error you are getting is odd.

I'd like to look at the patch you are trying, if possible.

I tried something like this:

$ cat test.h

typedef unsigned char uchar_t;
typedef _Atomic(int) atomic_int_t;

struct SomeTypes {
    _Atomic uchar_t auc;
    volatile unsigned int vui;
    _Atomic atomic_int_t aai;
};

typedef _Atomic struct SomeTypes atomic_some_types_t;


$ cat client.c

struct SomeTypes foo;


int main(void) {
    int x = foo.vui;
    return 0;
}

$ clang test.h

$ clang -include-pch test.h.gch client.c

This works ok, and 'SomeTypes' is resolved to the right type name...
Maurizio On 15/05/2019 05:36, Henry Jen wrote: > BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, > > java.lang.IllegalArgumentException: Error with snippet: uchar_t var; > /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' > > Cheers, > Henry > > >> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >> >> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. >> >> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >> >> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >> >> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. >> >> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >> >> Cheers, >> Henry >> >>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>> >>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >>> >>> That is, when you see a cursor with: >>> >>> _Atomic("....") >>> >>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. 
I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>> >>> Maurizio >>> >>> On 14/05/2019 02:58, Henry Jen wrote: >>>> Hi, >>>> >>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>> >>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>> >>>> Before that happens, we have our java clang binding trying do that work by: >>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>> - For an atomic type, we use the type string to get the underlying type. >>>> >>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>>>> >>>> Cheers, >>>> Henry >>>> >>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>> [2] https://reviews.llvm.org/D61716 From maurizio.cimadamore at oracle.com Wed May 15 10:53:13 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 15 May 2019 11:53:13 +0100 Subject: 822394: jextract should normalize paths in layout names Message-ID: Hi, this is a simple patch to add a call to 'toJavaIdentifier' when computing anon layout names Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8223947/ Maurizio From maurizio.cimadamore at oracle.com Wed May 15 14:21:38 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 15 May 2019 15:21:38 +0100 Subject: 822394: jextract should normalize paths in layout names In-Reply-To: References: Message-ID: <7d15239e-32cd-4bb9-c811-1b049304708c@oracle.com> Something went wrong in the generation of the previous webrev - please use this instead: http://cr.openjdk.java.net/~mcimadamore/panama/8223947_v2/ Maurizio On 15/05/2019 11:53, Maurizio Cimadamore wrote: > Hi, > this is a simple patch to add a call to 'toJavaIdentifier' when > computing anon layout names > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/panama/8223947/ > > Maurizio > From maurizio.cimadamore at oracle.com Wed May 15 14:26:29 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 15 May 2019 14:26:29 +0000 Subject: hg: panama/dev: 8223778: Path lookup API refinements Message-ID: <201905151426.x4FEQUun012587@aojmv0008.oracle.com> Changeset: 2cb601d1d54f Author: mcimadamore Date: 2019-05-15 15:25 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/2cb601d1d54f 8223778: Path lookup API refinements ! src/java.base/share/classes/java/foreign/AbstractLayout.java ! src/java.base/share/classes/java/foreign/LayoutPath.java ! src/java.base/share/classes/java/lang/invoke/VarHandles.java ! 
src/java.base/share/classes/jdk/internal/foreign/LayoutPathImpl.java
< src/java.base/share/classes/jdk/internal/foreign/LayoutPathsImpl.java
! test/jdk/java/foreign/TestMemoryAccess.java

From maurizio.cimadamore at oracle.com Wed May 15 14:27:34 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Wed, 15 May 2019 14:27:34 +0000
Subject: hg: panama/dev: 8223786: Rename layout annotation to layout attribute
Message-ID: <201905151427.x4FERY3D013125@aojmv0008.oracle.com>

Changeset: 6c71afe53ad1
Author: mcimadamore
Date: 2019-05-15 15:27 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/6c71afe53ad1

8223786: Rename layout annotation to layout attribute

! src/java.base/share/classes/java/foreign/AbstractLayout.java
! src/java.base/share/classes/java/foreign/Address.java
! src/java.base/share/classes/java/foreign/Group.java
! src/java.base/share/classes/java/foreign/Layout.java
! src/java.base/share/classes/java/foreign/Padding.java
! src/java.base/share/classes/java/foreign/Sequence.java
! src/java.base/share/classes/java/foreign/Unresolved.java
! src/java.base/share/classes/java/foreign/Value.java

From henry.jen at oracle.com Wed May 15 15:50:24 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Wed, 15 May 2019 08:50:24 -0700
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
In-Reply-To:
References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com>
Message-ID: <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com>

I figured this out: it failed with a different TU, so we need to catch the exception in a loop.
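A minimal sketch of such a loop, assuming the reparse step signals failure with IllegalArgumentException as in the errors quoted earlier in this thread. The helper name firstSuccessful and the use of plain strings in place of translation units are made up for illustration; this is not the code in the webrev.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Illustrative sketch only; names and types here are hypothetical.
final class ReparseLoopSketch {

    // Tries the reparse step against each candidate translation unit in
    // order and returns the first result that does not fail. A failure is
    // assumed to surface as IllegalArgumentException.
    static <T, R> Optional<R> firstSuccessful(List<T> units, Function<T, R> reparse) {
        for (T tu : units) {
            try {
                return Optional.ofNullable(reparse.apply(tu));
            } catch (IllegalArgumentException e) {
                // The snippet did not resolve against this TU; try the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Strings stand in for translation units in this sketch.
        List<String> units = List.of("first-tu", "second-tu");
        Optional<String> result = firstSuccessful(units, tu -> {
            if (tu.equals("first-tu")) {
                throw new IllegalArgumentException("Error with snippet");
            }
            return "resolved in " + tu;
        });
        System.out.println(result.orElse("unresolved"));
    }
}
```

Swallowing the exception is only reasonable here because a failure in one TU is expected; if every candidate fails, the caller still has to decide how to report the empty result.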
Cheers, Henry > On May 14, 2019, at 9:36 PM, Henry Jen wrote: > > BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, > > java.lang.IllegalArgumentException: Error with snippet: uchar_t var; > /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' > > Cheers, > Henry > > >> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >> >> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. >> >> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >> >> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >> >> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. >> >> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >> >> Cheers, >> Henry >> >>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>> >>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >>> >>> That is, when you see a cursor with: >>> >>> _Atomic("....") >>> >>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. 
I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>> >>> Maurizio >>> >>> On 14/05/2019 02:58, Henry Jen wrote: >>>> Hi, >>>> >>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>> >>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>> >>>> Before that happens, we have our java clang binding trying do that work by: >>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>> - For an atomic type, we use the type string to get the underlying type. >>>> >>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>>>> >>>> Cheers, >>>> Henry >>>> >>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>> [2] https://reviews.llvm.org/D61716 >> > From maurizio.cimadamore at oracle.com Wed May 15 16:00:52 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 15 May 2019 17:00:52 +0100 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> Message-ID: <41abf5e0-5a75-1889-9883-a0409e75e962@oracle.com> Note also that macro support is using -Xnostdinc to avoid dependencies on system headers inferred by clang (such as MSVC) which can affect our test infra. Not sure if that plays a role too. Maurizio On 15/05/2019 16:50, Henry Jen wrote: > I figured this out, it failed with a different TU, need to catch the exception in a loop. > > Cheers, > Henry > >> On May 14, 2019, at 9:36 PM, Henry Jen wrote: >> >> BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, >> >> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >> >> Cheers, >> Henry >> >> >>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>> >>> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. 
>>> >>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >>> >>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>> >>> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. >>> >>> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >>> >>> Cheers, >>> Henry >>> >>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>> >>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >>>> >>>> That is, when you see a cursor with: >>>> >>>> _Atomic("....") >>>> >>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>> >>>> Maurizio >>>> >>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>> Hi, >>>>> >>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. 
>>>>> >>>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>> >>>>> Before that happens, we have our java clang binding trying do that work by: >>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>> - For an atomic type, we use the type string to get the underlying type. >>>>> >>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>> [2] https://reviews.llvm.org/D61716 From henry.jen at oracle.com Wed May 15 16:31:48 2019 From: henry.jen at oracle.com (Henry Jen) Date: Wed, 15 May 2019 09:31:48 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <5996b06f-ba3c-4106-73d3-3c20b0d6fb30@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <5996b06f-ba3c-4106-73d3-3c20b0d6fb30@oracle.com> Message-ID: Yes, I did some experiments and knew that works. Make the client.c into client.h and remove main to be more close to our situation. However, jextract still seems to have some issue with that, jextract -C -include-pch -C test.h.gch client.h, I see the following message WARNING: nothing to generate Cheers, Henry > On May 15, 2019, at 3:11 AM, Maurizio Cimadamore wrote: > > The error you are getting is odd. 
> > I'd like to look at the patch you are trying, if possible. > > I tried something like this: > > $ cat test.h > > typedef unsigned char uchar_t; > typedef _Atomic(int) atomic_int_t; > > struct SomeTypes { > _Atomic uchar_t auc; > volatile unsigned int vui; > _Atomic atomic_int_t aai; > }; > > typedef _Atomic struct SomeTypes atomic_some_types_t; > > > $ cat client.c > > struct SomeTypes foo; > > > int main(void) { > int x = foo.vui; > return 0; > } > > $ clang test.h > > $ clang -include-pch test.h.gch client.c > > This works ok, and 'SomeTypes' is resolved to the right type name... > > Maurizio > > On 15/05/2019 05:36, Henry Jen wrote: >> BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, >> >> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >> >> Cheers, >> Henry >> >> >>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>> >>> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. >>> >>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >>> >>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>> >>> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. 
>>> >>> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >>> >>> Cheers, >>> Henry >>> >>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>> >>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >>>> >>>> That is, when you see a cursor with: >>>> >>>> _Atomic("....") >>>> >>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>> >>>> Maurizio >>>> >>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>> Hi, >>>>> >>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>>> >>>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>> >>>>> Before that happens, we have our java clang binding trying do that work by: >>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>> - For an atomic type, we use the type string to get the underlying type. >>>>> >>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. 
Since jextract will only created one translation unit, the hack should work just fine without much pollution. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>> [2] https://reviews.llvm.org/D61716 From henry.jen at oracle.com Wed May 15 16:42:45 2019 From: henry.jen at oracle.com (Henry Jen) Date: Wed, 15 May 2019 09:42:45 -0700 Subject: 822394: jextract should normalize paths in layout names In-Reply-To: <7d15239e-32cd-4bb9-c811-1b049304708c@oracle.com> References: <7d15239e-32cd-4bb9-c811-1b049304708c@oracle.com> Message-ID: +1. Cheers, Henry > On May 15, 2019, at 7:21 AM, Maurizio Cimadamore wrote: > > Something went wrong in the generation of the previous webrev - please use this instead: > > http://cr.openjdk.java.net/~mcimadamore/panama/8223947_v2/ > > Maurizio > > On 15/05/2019 11:53, Maurizio Cimadamore wrote: >> Hi, >> this is a simple patch to add a call to 'toJavaIdentifier' when computing anon layout names >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/panama/8223947/ >> >> Maurizio >> From maurizio.cimadamore at oracle.com Wed May 15 17:57:34 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 15 May 2019 18:57:34 +0100 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts Message-ID: Hi, this patch adds support to the API to express alignment constraints on layouts. Following the model John put forward in [1], I did the following: * there's a way to compute the 'natural alignment' of a layout * if no alignment is specified, alignment is the natural alignment * you can align a layout using Layout::alignTo(long) - this sets the additional constraint (provided is power of two, non-negative, and >=8) * When we obtain a VarHandle out of a Layout, at that point we are in the process of computing offsets; here we can catch e.g. 
if the offset of a given path doesn't match the specified alignment - in this case we fail fast, even before memory is accessed; this occurs e.g. if the layout path offset is not a multiple of its alignment, or if the alignment of the nested element is stricter than the one of the enclosing element. (in principle we could enforce these checks on layout creation, but given we have unresolved layouts on our radar, I think it's best to do minimal checks on layout creation and let the check fully kick in on path creation). * If the VarHandle can be constructed, a dynamic check verifies that the alignment of the address passed to the VH matches the layout requirements * MemoryScope::allocate also honors the alignment requirements - this works by over-allocating memory (to make sure a pointer with the desired alignment exists in the allocated area) and then adjusting it after the fact. For alignments < 16 bytes, nothing is done given that malloc, at least on x64, is guaranteed to respect that. I think this is powerful and yet relatively simple to understand; overall I quite like it. Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ Maurizio From maurizio.cimadamore at oracle.com Wed May 15 17:59:36 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 15 May 2019 17:59:36 +0000 Subject: hg: panama/dev: 8223947: jextract should normalize paths in layout names Message-ID: <201905151759.x4FHxaO1002447@aojmv0008.oracle.com> Changeset: fb878e05cd60 Author: mcimadamore Date: 2019-05-15 18:59 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/fb878e05cd60 8223947: jextract should normalize paths in layout names Reviewed-by: hjen ! 
src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/LayoutUtils.java + test/jdk/com/sun/tools/jextract/illegalCharsInHeaders/IllegalCharsInHeadersTest.java + test/jdk/com/sun/tools/jextract/illegalCharsInHeaders/test-minus.h From maurizio.cimadamore at oracle.com Wed May 15 18:04:53 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 15 May 2019 18:04:53 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905151804.x4FI4swm004972@aojmv0008.oracle.com> Changeset: 393d63b3e6b4 Author: mcimadamore Date: 2019-05-15 20:04 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/393d63b3e6b4 Automatic merge with foreign From jbvernee at xs4all.nl Wed May 15 20:07:21 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 15 May 2019 22:07:21 +0200 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: References: Message-ID: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> > * you can align a layout using Layout::alignTo(long) - this sets the > additional constraint (provided is power of two, non-negative, and > >=8) Why the power of two constraint? Shouldn't it just be a multiple of 8? This seems to preclude some packed structs, e.g. with elements with 3 byte alignment: #pragma pack(1) struct Foo { char x; short y; // 1 byte alignment int z; // 3 byte alignment }; // size = 7 bytes In the layout API this is: Value vChar = Value.ofSignedInt(8); Value vShort = Value.ofSignedInt(16); Value vInt = Value.ofSignedInt(32); Group g = Group.struct(vChar, vShort.alignTo(8), vInt.alignTo(24)); But this throws an IAE because alignment of 24 is not allowed. Jorn Maurizio Cimadamore schreef op 2019-05-15 19:57: > Hi, > this patch adds support to the API to express alignment constraints on > layouts. 
Following the model John put forward in [1], I did the > following: > > * there's a way to compute the 'natural alignment' of a layout > > * if no alignment is specified, alignment is the natural alignment > > * you can align a layout using Layout::alignTo(long) - this sets the > additional constraint (provided is power of two, non-negative, and > >=8) > > * When we obtain a VarHandle out of a Layout, at that point we are in > the process of computing offsets; here we can catch e.g. if the offset > of a given path doesn't match the specified alignment - in this case > we fail fast, even before memory is accessed; this occurs e.g. if the > layout path offset is not a multiple of its alignment, or if the > alignment of the nested element is stricter than the one of the > enclosing element. > > (in principle we could enforce these checks on layout creation, but > given we have unresolved layouts in our radar, I think it's best to do > minimal checks on layout creation and let the check fully kick in on > path creation). > > * If VarHandle can be constructed, a dynamic check verifies that > alignment of address passed to VH matches the layout requirements > > * MemoryScope::allocate also honors the alignment requirements - this > works by up-allocating memory (to make sure a pointer with desired > alignment exists in the allocated area) and then adjusting it after > the fact. For alignments < 16bytes, nothing is done given that malloc, > at least on x64 is guaranteed to respect that. > > > I think this is powerful and yet relatively simple to understand, > overall I quite like it. 
> > Webrev: > http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ > > Maurizio From brian.goetz at oracle.com Wed May 15 22:06:11 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 15 May 2019 18:06:11 -0400 Subject: Vector hash combination Message-ID: <6F01AC19-B5DC-4C32-8C92-3D1397DEEEB1@oracle.com> It would be useful to include a method somewhere in the Vector API to do hash combination: taking an IntVector or LongVector whose elements describe hash values, and combining them into a single hash. Currently we do a sequential loop of multiply-and-add over the individual hash elements; exposing an efficient vectorized combination would be useful. From henry.jen at oracle.com Wed May 15 23:57:33 2019 From: henry.jen at oracle.com (Henry Jen) Date: Wed, 15 May 2019 16:57:33 -0700 Subject: [foreign] RFR: 8224013: jextract failed to generate source file under some scenarios Message-ID: <9E73FF01-2747-4B7F-ABD7-7A6D9A3D1E8E@oracle.com> Hi, Please review a trivial fix[1] for 8224013[2]. jextract throws exceptions when --src-dump-dir is used in the following scenarios: 1. Not specifying the target package name with -t. This causes jextract to try to write the static forwarder source into the root folder. 2. If the --src-dump-dir specified is a symbolic link to an existing folder, jextract will fail with java.nio.file.FileAlreadyExistsException Cheers, Henry [1] http://cr.openjdk.java.net/~henryjen/panama/8224013/webrev/ [2] https://bugs.openjdk.java.net/browse/JDK-8224013 From jatin.bhateja at intel.com Thu May 16 02:41:31 2019 From: jatin.bhateja at intel.com (Bhateja, Jatin) Date: Thu, 16 May 2019 02:41:31 +0000 Subject: [vectorIntrinsics] [PATCH] Elemental shifts and rotates speedup Message-ID: Hi All, Please find a patch with the following changes: A) Intrinsification of two vector APIs: 1) VectorShuffle.shuffleIota(VectorSpecies, int) 2) VectorShuffle.toVector() B) Re-implementation of the following vector APIs using the above intrinsified APIs.
1) Vector.shiftLanesLeft(int) 2) Vector.shiftLanesRight(int) 3) Vector.rotateLanesLeft(int) 4) Vector.rotateLanesRight(int) With this we see roughly 2X gains in elemental shift and rotate operations. Webrev: http://cr.openjdk.java.net/~kkharbas/Jatin/rotate_and_shift_lanes/webrev.00/ Kindly review the patch. Best Regards, Jatin From maurizio.cimadamore at oracle.com Thu May 16 10:25:36 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 11:25:36 +0100 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> References: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> Message-ID: On 15/05/2019 21:07, Jorn Vernee wrote: >> * you can align a layout using Layout::alignTo(long) - this sets the >> additional constraint (provided is power of two, non-negative, and >> >=8) > > Why the power of two constraint? Shouldn't it just be a multiple of 8? > > This seems to preclude some packed structs, e.g. with elements with 3 > byte alignment: > > #pragma pack(1) > > struct Foo { > char x; > short y; // 1 byte alignment > int z; // 3 byte alignment > }; // size = 7 bytes > > In the layout API this is: > > Value vChar = Value.ofSignedInt(8); > Value vShort = Value.ofSignedInt(16); > Value vInt = Value.ofSignedInt(32); > Group g = Group.struct(vChar, vShort.alignTo(8), vInt.alignTo(24)); > > But this throws an IAE because alignment of 24 is not allowed. I think you can still model that use case - essentially what you want is to just align everything to 8. (in fact that's what pragma pack(1) does). I don't think there's such a thing as 3-byte aligned memory access: if you want to read 4 bytes on a 3-byte aligned address, that's just an unaligned read.
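To make the constraint concrete, here is a minimal sketch in plain Java. It is not the Panama API: `AlignmentCheck` and its two methods are hypothetical names, and it simply assumes the rule discussed in this thread (an alignment, expressed in bits, must be a power of two and at least 8).

```java
// Hypothetical helper, not part of the Panama layout API.
final class AlignmentCheck {
    private AlignmentCheck() {}

    /** Assumed rule from this thread: alignment in bits is a power of two and >= 8. */
    public static boolean isValidAlignmentBits(long bits) {
        // For a positive value, exactly one set bit means "power of two".
        return bits >= 8 && Long.bitCount(bits) == 1;
    }

    /** True if address is a multiple of alignmentBytes (alignmentBytes must be a power of two). */
    public static boolean isAligned(long address, long alignmentBytes) {
        return (address & (alignmentBytes - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidAlignmentBits(8));   // byte alignment, as with pack(1)
        System.out.println(isValidAlignmentBits(24));  // "3 byte alignment" is not a power of two
        System.out.println(isAligned(3, 4));           // a 4-byte read at address 3
    }
}
```

Under that assumed rule, 24 is rejected while 8 is accepted, which matches aligning every member of the packed struct to 8 bits.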
In other words, nothing new here - we're kind of following what the pragma pack allows you to do: the argument to pragma pack must also be a power of 2 (1, 2, 4, 8 ...), see here [1, 2] - the only difference in the Panama API is that everything has to be multiplied by 8, because alignment (as sizes) are expressed in bits, to have more room to add sub-byte alignment (bit-fields, etc.) later on. So, back to your example: Value vChar = Value.ofSignedInt(8); Value vShort = Value.ofSignedInt(16); Value vInt = Value.ofSignedInt(32); Group g = Group.struct(vChar.alignTo(8), vShort.alignTo(8), vInt.alignTo(8)); Maurizio [1] - https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=vs-2019 [2] - https://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html > > Jorn > > Maurizio Cimadamore schreef op 2019-05-15 19:57: >> Hi, >> this patch adds support to the API to express alignment constraints on >> layouts. Following the model John put forward in [1], I did the >> following: >> >> * there's a way to compute the 'natural alignment' of a layout >> >> * if no alignment is specified, alignment is the natural alignment >> >> * you can align a layout using Layout::alignTo(long) - this sets the >> additional constraint (provided is power of two, non-negative, and >> >=8) >> >> * When we obtain a VarHandle out of a Layout, at that point we are in >> the process of computing offsets; here we can catch e.g. if the offset >> of a given path doesn't match the specified alignment - in this case >> we fail fast, even before memory is accessed; this occurs e.g. if the >> layout path offset is not a multiple of its alignment, or if the >> alignment of the nested element is stricter than the one of the >> enclosing element. >> >> (in principle we could enforce these checks on layout creation, but >> given we have unresolved layouts in our radar, I think it's best to do >> minimal checks on layout creation and let the check fully kick in on >> path creation). 
>> >> * If VarHandle can be constructed, a dynamic check verifies that >> alignment of address passed to VH matches the layout requirements >> >> * MemoryScope::allocate also honors the alignment requirements - this >> works by up-allocating memory (to make sure a pointer with desired >> alignment exists in the allocated area) and then adjusting it after >> the fact. For alignments < 16bytes, nothing is done given that malloc, >> at least on x64 is guaranteed to respect that. >> >> >> I think this is powerful and yet relatively simple to understand, >> overall I quite like it. >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ >> >> Maurizio From maurizio.cimadamore at oracle.com Thu May 16 10:27:31 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 11:27:31 +0100 Subject: [foreign] RFR: 8224013: jextract failed to generate source file under some scenarios In-Reply-To: <9E73FF01-2747-4B7F-ABD7-7A6D9A3D1E8E@oracle.com> References: <9E73FF01-2747-4B7F-ABD7-7A6D9A3D1E8E@oracle.com> Message-ID: <63499858-f211-5eab-c923-35a8a44d7044@oracle.com> Looks good Maurizio On 16/05/2019 00:57, Henry Jen wrote: > Hi, > > Please review a trivial fix[1] for 8224013[2], > > jextract throws exceptions when use --src-dump-dir under following scenarios, > > 1. Not specifying target package name with -t. This will cause jextract trying to write static forwarder source into root folder. > 2. 
If the --src-dump-dir specified is a symbolic link to an existing folder, jextract will fail with java.nio.file.FileAlreadyExistsException > > Cheers, > Henry > > [1] http://cr.openjdk.java.net/~henryjen/panama/8224013/webrev/ > [2] https://bugs.openjdk.java.net/browse/JDK-8224013 From maurizio.cimadamore at oracle.com Thu May 16 10:38:46 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 11:38:46 +0100 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: References: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> Message-ID: <7a9c5e34-2062-d6fa-b757-2cc82193d3ac@oracle.com> Also note - in C, the pragma will basically force the compiler to align fields in the specified way (adding padding if needed). The Layout API is lower level than that - alignments are essentially constraints, which are local to the element on which the constraint appears. So, you could use alignment and shoot yourself in the foot: Value vChar = Value.ofSignedInt(8); Value vShort = Value.ofSignedInt(16); Group g = Group.of(vChar.alignTo(16), vShort.alignTo(16)); Now, this is bogus, as the vShort will never be correctly aligned (given it's at an 8-bit offset from the start of the struct, which is also 16-bit aligned). This is why we check things when we construct layout paths, to detect pathological situations like these. In C, the pragma would probably have ended up generating something like this: Group g = Group.of(vChar.alignTo(16), Padding.of(8), vShort.alignTo(16)); With the Layout API you have to do things manually. Now, we don't exclude that, in the future, we will also have a higher-level API to do deep transformations on layouts, so that e.g. you can go from natural alignment to e.g. pack(2) alignment (with padding added/changed where needed); this is a similar problem to the one of converting a layout from big endian to little endian, which also requires deep transformations.
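The bogus layout above, and its padded fix, can be checked with plain offset arithmetic. The following is only a sketch under stated assumptions: `LayoutOffsets` is a hypothetical helper (not the Layout API), sizes and alignments are in bits, members are laid out back to back, and the start of the struct is assumed to satisfy every member's alignment.

```java
// Hypothetical helper, not part of the Panama layout API.
final class LayoutOffsets {
    private LayoutOffsets() {}

    /**
     * Walks the members in order and returns the index of the first member
     * whose bit offset is not a multiple of its alignment, or -1 if none.
     */
    public static int firstMisaligned(long[] sizeBits, long[] alignBits) {
        long offset = 0;
        for (int i = 0; i < sizeBits.length; i++) {
            if (offset % alignBits[i] != 0) {
                return i;
            }
            offset += sizeBits[i];
        }
        return -1;
    }

    /** Padding (in bits) to insert so that a member at offsetBits meets alignBits. */
    public static long paddingBefore(long offsetBits, long alignBits) {
        long rem = offsetBits % alignBits;
        return rem == 0 ? 0 : alignBits - rem;
    }

    public static void main(String[] args) {
        // vChar.alignTo(16), vShort.alignTo(16): the short lands at bit offset 8.
        System.out.println(firstMisaligned(new long[] {8, 16}, new long[] {16, 16}));
        // Padding needed in front of the short.
        System.out.println(paddingBefore(8, 16));
        // vChar.alignTo(16), 8 bits of padding, vShort.alignTo(16).
        System.out.println(firstMisaligned(new long[] {8, 8, 16}, new long[] {16, 8, 16}));
    }
}
```

The first call flags member 1 (the short), and the padded variant reports no misaligned member, which is the manual step the pragma would otherwise perform.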
The way I see it, the instance methods in the Layout interface act 'locally', whereas helper functions which can radically change the sub-structure of layouts are better defined elsewhere. Maurizio On 16/05/2019 11:25, Maurizio Cimadamore wrote: > > On 15/05/2019 21:07, Jorn Vernee wrote: >>> * you can align a layout using Layout::alignTo(long) - this sets the >>> additional constraint (provided is power of two, non-negative, and >>> >=8) >> >> Why the power of two constraint? Shouldn't it just be a multiple of 8? >> >> This seems to preclude some packed structs, e.g. with elements with 3 >> byte alignment: >> >> #pragma pack(1) >> >> struct Foo { >> char x; >> short y; // 1 byte alignment >> int z; // 3 byte alignment >> }; // size = 7 bytes >> >> In the layout API this is: >> >> Value vChar = Value.ofSignedInt(8); >> Value vShort = Value.ofSignedInt(16); >> Value vInt = Value.ofSignedInt(32); >> Group g = Group.struct(vChar, vShort.alignTo(8), vInt.alignTo(24)); >> >> But this throws an IAE because alignment of 24 is not allowed. > > I think you can still model that use case - essentially what you want > is to just align everything to 8. (in fact that's what pragma pack(1) > does). > > I don't think there's such a thing as 3-byte aligned memory access: if > you want to read 4 bytes on a 3-byte aligned address, that's just > unaligned read. > > In other words, nothing new here - we're kind of following what the > pragma pack allows you to do: the argument to pragma pack must also be > a power of 2 (1, 2, 4, 8 ...), see here [1, 2] - the only difference > in the Panama API is that everything has to be multiplied by 8, > because alignment (as sizes) are expressed in bits, to have more room > to add sub-byte alignment (bit-fields, etc.) later on.
> > So, back to your example: > > Value vChar = Value.ofSignedInt(8); > Value vShort = Value.ofSignedInt(16); > Value vInt = Value.ofSignedInt(32); > Group g = Group.struct(vChar.alignTo(8), vShort.alignTo(8), > vInt.alignTo(8)); > > Maurizio > > [1] - https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=vs-2019 > [2] - > https://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html > > >> >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-15 19:57: >>> Hi, >>> this patch adds support to the API to express alignment constraints on >>> layouts. Following the model John put forward in [1], I did the >>> following: >>> >>> * there's a way to compute the 'natural alignment' of a layout >>> >>> * if no alignment is specified, alignment is the natural alignment >>> >>> * you can align a layout using Layout::alignTo(long) - this sets the >>> additional constraint (provided is power of two, non-negative, and >>> >=8) >>> >>> * When we obtain a VarHandle out of a Layout, at that point we are in >>> the process of computing offsets; here we can catch e.g. if the offset >>> of a given path doesn't match the specified alignment - in this case >>> we fail fast, even before memory is accessed; this occurs e.g. if the >>> layout path offset is not a multiple of its alignment, or if the >>> alignment of the nested element is stricter than the one of the >>> enclosing element. >>> >>> (in principle we could enforce these checks on layout creation, but >>> given we have unresolved layouts in our radar, I think it's best to do >>> minimal checks on layout creation and let the check fully kick in on >>> path creation). 
>>> >>> * If VarHandle can be constructed, a dynamic check verifies that >>> alignment of address passed to VH matches the layout requirements >>> >>> * MemoryScope::allocate also honors the alignment requirements - this >>> works by up-allocating memory (to make sure a pointer with desired >>> alignment exists in the allocated area) and then adjusting it after >>> the fact. For alignments < 16bytes, nothing is done given that malloc, >>> at least on x64 is guaranteed to respect that. >>> >>> >>> I think this is powerful and yet relatively simple to understand, >>> overall I quite like it. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ >>> >>> Maurizio From jbvernee at xs4all.nl Thu May 16 10:57:27 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 16 May 2019 12:57:27 +0200 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: References: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> Message-ID: > I think you can still model that use case - essentially what you want > is to just align everything to 8. (in fact that's what pragma pack(1) > does). Ah, yes of course! My mistake. --- Btw, while testing this I also noticed that aligning e.g. a Group with alignTo() will return a Layout. I think we should use covariant return types here (and in the other Layout sub-types). Also, Address is missing an override for alignTo, so this will just return a Value. Thanks, Jorn [1] : https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019 Maurizio Cimadamore wrote on 2019-05-16 12:25: > On 15/05/2019 21:07, Jorn Vernee wrote: >>> * you can align a layout using Layout::alignTo(long) - this sets the >>> additional constraint (provided is power of two, non-negative, and >>> >=8) >> >> Why the power of two constraint? Shouldn't it just be a multiple of 8? >> >> This seems to preclude some packed structs, e.g. with elements with 3 >> byte alignment: >> >> #pragma pack(1) >> >> struct Foo { >> 
char x; >> ??????? short y; // 1 byte alignment >> ??????? int z; // 3 byte alignment >> ??? }; // size = 7 bytes >> >> In the layout API this is: >> >> ??? Value vChar = Value.ofSignedInt(8); >> ??? Value vShort = Value.ofSignedInt(16); >> ??? Value vInt = Value.ofSignedInt(32); >> ??? Group g = Group.struct(vChar, vShort.alignTo(8), >> vInt.alignTo(24)); >> >> But this throws an IAE because alignment of 24 is not allowed. > > I think you can still model that use case - essentially what you want > is to just align everything to 8. (in fact that's what pragma pack(1) > does). > > I don't think there's such a thing as 3-byte aligned memory access: if > you want to read 4 bytes on a 3-byte aligned address, that's just > unaligned read. > > In other words, nothing new here - we're kind of following what the > pragma pack allows you to do: the argument to pragma pack must also be > a power of 2 (1, 2, 4, 8 ...), see here [1, 2] - the only difference > in the Panama API is that everything has to be multiplied by 8, > because alignment (as sizes) are expressed in bits, to have more room > to add sub-byte alignment (bit-fields, etc.) later on. > > So, back to your example: > > Value vChar = Value.ofSignedInt(8); > Value vShort = Value.ofSignedInt(16); > Value vInt = Value.ofSignedInt(32); > Group g = Group.struct(vChar.alignTo(8), vShort.alignTo(8), > vInt.alignTo(8)); > > Maurizio > > [1] - > https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=vs-2019 > [2] - > https://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html > > >> >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-15 19:57: >>> Hi, >>> this patch adds support to the API to express alignment constraints >>> on >>> layouts. 
Following the model John put forward in [1], I did the >>> following: >>> >>> * there's a way to compute the 'natural alignment' of a layout >>> >>> * if no alignment is specified, alignment is the natural alignment >>> >>> * you can align a layout using Layout::alignTo(long) - this sets the >>> additional constraint (provided is power of two, non-negative, and >>> >=8) >>> >>> * When we obtain a VarHandle out of a Layout, at that point we are in >>> the process of computing offsets; here we can catch e.g. if the >>> offset >>> of a given path doesn't match the specified alignment - in this case >>> we fail fast, even before memory is accessed; this occurs e.g. if the >>> layout path offset is not a multiple of its alignment, or if the >>> alignment of the nested element is stricter than the one of the >>> enclosing element. >>> >>> (in principle we could enforce these checks on layout creation, but >>> given we have unresolved layouts in our radar, I think it's best to >>> do >>> minimal checks on layout creation and let the check fully kick in on >>> path creation). >>> >>> * If VarHandle can be constructed, a dynamic check verifies that >>> alignment of address passed to VH matches the layout requirements >>> >>> * MemoryScope::allocate also honors the alignment requirements - this >>> works by up-allocating memory (to make sure a pointer with desired >>> alignment exists in the allocated area) and then adjusting it after >>> the fact. For alignments < 16bytes, nothing is done given that >>> malloc, >>> at least on x64 is guaranteed to respect that. >>> >>> >>> I think this is powerful and yet relatively simple to understand, >>> overall I quite like it. 
>>> >>> Webrev: >>> http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ >>> >>> Maurizio From maurizio.cimadamore at oracle.com Thu May 16 11:00:30 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 12:00:30 +0100 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: References: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> Message-ID: On 16/05/2019 11:57, Jorn Vernee wrote: >> I think you can still model that use case - essentially what you want >> is to just align everything to 8. (in fact that's what pragma pack(1) >> does). > > Ah, yes of course! My mistake. > > --- > > Btw, while testing this I also noticed that aligning e.g. a Group with > alignTo() will return a Layout. I think we should use covariant return > types here (and in the other Layout sub-types). Also, Address is > missing on override for alignTo, so this will just return a Value. Whoops - good point. I also added a test with your 'pragma pack(1)' example. I'll spin a new webrev shortly. Thanks Maurizio > > Thanks, > Jorn > > [1] : https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019 > > Maurizio Cimadamore schreef op 2019-05-16 12:25: >> On 15/05/2019 21:07, Jorn Vernee wrote: >>>> * you can align a layout using Layout::alignTo(long) - this sets the >>>> additional constraint (provided is power of two, non-negative, and >>>> >=8) >>> >>> Why the power of two constraint? Shouldn't it just be a multiple of 8? >>> >>> This seems to preclude some packed structs, e.g. with elements with >>> 3 byte alignment: >>> >>> ??? #pragma pack(1) >>> >>> ??? struct Foo { >>> ??????? char x; >>> ??????? short y; // 1 byte alignment >>> ??????? int z; // 3 byte alignment >>> ??? }; // size = 7 bytes >>> >>> In the layout API this is: >>> >>> ??? Value vChar = Value.ofSignedInt(8); >>> ??? Value vShort = Value.ofSignedInt(16); >>> ??? Value vInt = Value.ofSignedInt(32); >>> ??? 
Group g = Group.struct(vChar, vShort.alignTo(8), vInt.alignTo(24)); >>> >>> But this throws an IAE because alignment of 24 is not allowed. >> >> I think you can still model that use case - essentially what you want >> is to just align everything to 8. (in fact that's what pragma pack(1) >> does). >> >> I don't think there's such a thing as 3-byte aligned memory access: if >> you want to read 4 bytes on a 3-byte aligned address, that's just >> unaligned read. >> >> In other words, nothing new here - we're kind of following what the >> pragma pack allows you to do: the argument to pragma pack must also be >> a power of 2 (1, 2, 4, 8 ...), see here [1, 2] - the only difference >> in the Panama API is that everything has to be multiplied by 8, >> because alignment (as sizes) are expressed in bits, to have more room >> to add sub-byte alignment (bit-fields, etc.) later on. >> >> So, back to your example: >> >> Value vChar = Value.ofSignedInt(8); >> Value vShort = Value.ofSignedInt(16); >> Value vInt = Value.ofSignedInt(32); >> Group g = Group.struct(vChar.alignTo(8), vShort.alignTo(8), >> vInt.alignTo(8)); >> >> Maurizio >> >> [1] - >> https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=vs-2019 >> [2] - >> https://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html >> >> >> >>> >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-15 19:57: >>>> Hi, >>>> this patch adds support to the API to express alignment constraints on >>>> layouts. 
Following the model John put forward in [1], I did the >>>> following: >>>> >>>> * there's a way to compute the 'natural alignment' of a layout >>>> >>>> * if no alignment is specified, alignment is the natural alignment >>>> >>>> * you can align a layout using Layout::alignTo(long) - this sets the >>>> additional constraint (provided is power of two, non-negative, and >>>> >=8) >>>> >>>> * When we obtain a VarHandle out of a Layout, at that point we are in >>>> the process of computing offsets; here we can catch e.g. if the offset >>>> of a given path doesn't match the specified alignment - in this case >>>> we fail fast, even before memory is accessed; this occurs e.g. if the >>>> layout path offset is not a multiple of its alignment, or if the >>>> alignment of the nested element is stricter than the one of the >>>> enclosing element. >>>> >>>> (in principle we could enforce these checks on layout creation, but >>>> given we have unresolved layouts in our radar, I think it's best to do >>>> minimal checks on layout creation and let the check fully kick in on >>>> path creation). >>>> >>>> * If VarHandle can be constructed, a dynamic check verifies that >>>> alignment of address passed to VH matches the layout requirements >>>> >>>> * MemoryScope::allocate also honors the alignment requirements - this >>>> works by up-allocating memory (to make sure a pointer with desired >>>> alignment exists in the allocated area) and then adjusting it after >>>> the fact. For alignments < 16bytes, nothing is done given that malloc, >>>> at least on x64 is guaranteed to respect that. >>>> >>>> >>>> I think this is powerful and yet relatively simple to understand, >>>> overall I quite like it. 
>>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ >>>> >>>> Maurizio From maurizio.cimadamore at oracle.com Thu May 16 11:37:44 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 12:37:44 +0100 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: References: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> Message-ID: <69ae9ef6-eed7-e36b-170b-f4de66dcf332@oracle.com> Here's the new webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8223978_v2/ I've changed the implementation approach and added an OptionalLong to the AbstractLayout class. This way we can do almost everything in the superclass, and we can just rely on the already existing 'dup' feature (which we have to expand a bit) in order to produce a new layout with the desired alignment. This is more regular, and automatically handles things such as covariant overrides. Maurizio On 16/05/2019 12:00, Maurizio Cimadamore wrote: > > On 16/05/2019 11:57, Jorn Vernee wrote: >>> I think you can still model that use case - essentially what you want >>> is to just align everything to 8. (in fact that's what pragma pack(1) >>> does). >> >> Ah, yes of course! My mistake. >> >> --- >> >> Btw, while testing this I also noticed that aligning e.g. a Group >> with alignTo() will return a Layout. I think we should use covariant >> return types here (and in the other Layout sub-types). Also, Address >> is missing on override for alignTo, so this will just return a Value. > > Whoops - good point. > > I also added a test with your 'pragma pack(1)' example. I'll spin a > new webrev shortly.
> > Thanks > Maurizio > >> >> Thanks, >> Jorn >> >> [1] : https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019 >> >> Maurizio Cimadamore schreef op 2019-05-16 12:25: >>> On 15/05/2019 21:07, Jorn Vernee wrote: >>>>> * you can align a layout using Layout::alignTo(long) - this sets the >>>>> additional constraint (provided is power of two, non-negative, and >>>>> >=8) >>>> >>>> Why the power of two constraint? Shouldn't it just be a multiple of 8? >>>> >>>> This seems to preclude some packed structs, e.g. with elements with >>>> 3 byte alignment: >>>> >>>> ??? #pragma pack(1) >>>> >>>> ??? struct Foo { >>>> ??????? char x; >>>> ??????? short y; // 1 byte alignment >>>> ??????? int z; // 3 byte alignment >>>> ??? }; // size = 7 bytes >>>> >>>> In the layout API this is: >>>> >>>> ??? Value vChar = Value.ofSignedInt(8); >>>> ??? Value vShort = Value.ofSignedInt(16); >>>> ??? Value vInt = Value.ofSignedInt(32); >>>> ??? Group g = Group.struct(vChar, vShort.alignTo(8), >>>> vInt.alignTo(24)); >>>> >>>> But this throws an IAE because alignment of 24 is not allowed. >>> >>> I think you can still model that use case - essentially what you want >>> is to just align everything to 8. (in fact that's what pragma pack(1) >>> does). >>> >>> I don't think there's such a thing as 3-byte aligned memory access: if >>> you want to read 4 bytes on a 3-byte aligned address, that's just >>> unaligned read. >>> >>> In other words, nothing new here - we're kind of following what the >>> pragma pack allows you to do: the argument to pragma pack must also be >>> a power of 2 (1, 2, 4, 8 ...), see here [1, 2] - the only difference >>> in the Panama API is that everything has to be multiplied by 8, >>> because alignment (as sizes) are expressed in bits, to have more room >>> to add sub-byte alignment (bit-fields, etc.) later on. 
>>> >>> So, back to your example: >>> >>> Value vChar = Value.ofSignedInt(8); >>> Value vShort = Value.ofSignedInt(16); >>> Value vInt = Value.ofSignedInt(32); >>> Group g = Group.struct(vChar.alignTo(8), vShort.alignTo(8), >>> vInt.alignTo(8)); >>> >>> Maurizio >>> >>> [1] - >>> https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=vs-2019 >>> [2] - >>> https://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html >>> >>> >>> >>>> >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-15 19:57: >>>>> Hi, >>>>> this patch adds support to the API to express alignment >>>>> constraints on >>>>> layouts. Following the model John put forward in [1], I did the >>>>> following: >>>>> >>>>> * there's a way to compute the 'natural alignment' of a layout >>>>> >>>>> * if no alignment is specified, alignment is the natural alignment >>>>> >>>>> * you can align a layout using Layout::alignTo(long) - this sets the >>>>> additional constraint (provided is power of two, non-negative, and >>>>> >=8) >>>>> >>>>> * When we obtain a VarHandle out of a Layout, at that point we are in >>>>> the process of computing offsets; here we can catch e.g. if the >>>>> offset >>>>> of a given path doesn't match the specified alignment - in this case >>>>> we fail fast, even before memory is accessed; this occurs e.g. if the >>>>> layout path offset is not a multiple of its alignment, or if the >>>>> alignment of the nested element is stricter than the one of the >>>>> enclosing element. >>>>> >>>>> (in principle we could enforce these checks on layout creation, but >>>>> given we have unresolved layouts in our radar, I think it's best >>>>> to do >>>>> minimal checks on layout creation and let the check fully kick in on >>>>> path creation). 
>>>>> >>>>> * If VarHandle can be constructed, a dynamic check verifies that >>>>> alignment of address passed to VH matches the layout requirements >>>>> >>>>> * MemoryScope::allocate also honors the alignment requirements - this >>>>> works by up-allocating memory (to make sure a pointer with desired >>>>> alignment exists in the allocated area) and then adjusting it after >>>>> the fact. For alignments < 16 bytes, nothing is done given that >>>>> malloc, >>>>> at least on x64 is guaranteed to respect that. >>>>> >>>>> >>>>> I think this is powerful and yet relatively simple to understand, >>>>> overall I quite like it. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ >>>>> >>>>> Maurizio From jbvernee at xs4all.nl Thu May 16 12:02:08 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 16 May 2019 14:02:08 +0200 Subject: [foreign-memaccess] RFR 8223978: Add alignment support to layouts In-Reply-To: <69ae9ef6-eed7-e36b-170b-f4de66dcf332@oracle.com> References: <51646702ba5298d53f3e140a6d3f8f4c@xs4all.nl> <69ae9ef6-eed7-e36b-170b-f4de66dcf332@oracle.com> Message-ID: <3a4fecfa10677d54b46c4897f663108b@xs4all.nl> Looks good! Jorn Maurizio Cimadamore schreef op 2019-05-16 13:37: > Here's the new webrev: > > http://cr.openjdk.java.net/~mcimadamore/panama/8223978_v2/ > > I've changed implementation approach, and now added an OptionalLong to > the AbstractLayout class. This way we can do almost everything in the > superclass, and we can just rely on the already existing 'dup' feature > (which we have to expand a bit) in order to produce a new layout with > desired alignment. > > This is more regular, and automatically handles things such as covariant > overrides. > > Maurizio > > On 16/05/2019 12:00, Maurizio Cimadamore wrote: >> >> On 16/05/2019 11:57, Jorn Vernee wrote: >>>> I think you can still model that use case - essentially what you >>>> want >>>> is to just align everything to 8.
(in fact that's what pragma >>>> pack(1) >>>> does). >>> >>> Ah, yes of course! My mistake. >>> >>> --- >>> >>> Btw, while testing this I also noticed that aligning e.g. a Group >>> with alignTo() will return a Layout. I think we should use covariant >>> return types here (and in the other Layout sub-types). Also, Address >>> is missing an override for alignTo, so this will just return a Value. >> >> Whoops - good point. >> >> I also added a test with your 'pragma pack(1)' example. I'll spin a >> new webrev shortly. >> >> Thanks >> Maurizio >> >>> >>> Thanks, >>> Jorn >>> >>> [1] : https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019 >>> >>> Maurizio Cimadamore schreef op 2019-05-16 12:25: >>>> On 15/05/2019 21:07, Jorn Vernee wrote: >>>>>> * you can align a layout using Layout::alignTo(long) - this sets >>>>>> the >>>>>> additional constraint (provided is power of two, non-negative, and >>>>>> >=8) >>>>> >>>>> Why the power of two constraint? Shouldn't it just be a multiple of >>>>> 8? >>>>> >>>>> This seems to preclude some packed structs, e.g. with elements with >>>>> 3 byte alignment: >>>>> >>>>> #pragma pack(1) >>>>> >>>>> struct Foo { >>>>> char x; >>>>> short y; // 1 byte alignment >>>>> int z; // 3 byte alignment >>>>> }; // size = 7 bytes >>>>> >>>>> In the layout API this is: >>>>> >>>>> Value vChar = Value.ofSignedInt(8); >>>>> Value vShort = Value.ofSignedInt(16); >>>>> Value vInt = Value.ofSignedInt(32); >>>>> Group g = Group.struct(vChar, vShort.alignTo(8), >>>>> vInt.alignTo(24)); >>>>> >>>>> But this throws an IAE because alignment of 24 is not allowed. >>>> >>>> I think you can still model that use case - essentially what you >>>> want >>>> is to just align everything to 8. (in fact that's what pragma >>>> pack(1) >>>> does).
>>>> >>>> I don't think there's such a thing as 3-byte aligned memory access: >>>> if >>>> you want to read 4 bytes on a 3-byte aligned address, that's just >>>> unaligned read. >>>> >>>> In other words, nothing new here - we're kind of following what the >>>> pragma pack allows you to do: the argument to pragma pack must also >>>> be >>>> a power of 2 (1, 2, 4, 8 ...), see here [1, 2] - the only difference >>>> in the Panama API is that everything has to be multiplied by 8, >>>> because alignment (as sizes) are expressed in bits, to have more >>>> room >>>> to add sub-byte alignment (bit-fields, etc.) later on. >>>> >>>> So, back to your example: >>>> >>>> Value vChar = Value.ofSignedInt(8); >>>> Value vShort = Value.ofSignedInt(16); >>>> Value vInt = Value.ofSignedInt(32); >>>> Group g = Group.struct(vChar.alignTo(8), vShort.alignTo(8), >>>> vInt.alignTo(8)); >>>> >>>> Maurizio >>>> >>>> [1] - >>>> https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=vs-2019 >>>> [2] - >>>> https://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html >>>>> >>>>> Jorn >>>>> >>>>> Maurizio Cimadamore schreef op 2019-05-15 19:57: >>>>>> Hi, >>>>>> this patch adds support to the API to express alignment >>>>>> constraints on >>>>>> layouts. Following the model John put forward in [1], I did the >>>>>> following: >>>>>> >>>>>> * there's a way to compute the 'natural alignment' of a layout >>>>>> >>>>>> * if no alignment is specified, alignment is the natural alignment >>>>>> >>>>>> * you can align a layout using Layout::alignTo(long) - this sets >>>>>> the >>>>>> additional constraint (provided is power of two, non-negative, and >>>>>> >=8) >>>>>> >>>>>> * When we obtain a VarHandle out of a Layout, at that point we are >>>>>> in >>>>>> the process of computing offsets; here we can catch e.g. 
if the >>>>>> offset >>>>>> of a given path doesn't match the specified alignment - in this >>>>>> case >>>>>> we fail fast, even before memory is accessed; this occurs e.g. if >>>>>> the >>>>>> layout path offset is not a multiple of its alignment, or if the >>>>>> alignment of the nested element is stricter than the one of the >>>>>> enclosing element. >>>>>> >>>>>> (in principle we could enforce these checks on layout creation, >>>>>> but >>>>>> given we have unresolved layouts in our radar, I think it's best >>>>>> to do >>>>>> minimal checks on layout creation and let the check fully kick in >>>>>> on >>>>>> path creation). >>>>>> >>>>>> * If VarHandle can be constructed, a dynamic check verifies that >>>>>> alignment of address passed to VH matches the layout requirements >>>>>> >>>>>> * MemoryScope::allocate also honors the alignment requirements - >>>>>> this >>>>>> works by up-allocating memory (to make sure a pointer with desired >>>>>> alignment exists in the allocated area) and then adjusting it >>>>>> after >>>>>> the fact. For alignments < 16bytes, nothing is done given that >>>>>> malloc, >>>>>> at least on x64 is guaranteed to respect that. >>>>>> >>>>>> >>>>>> I think this is powerful and yet relatively simple to understand, >>>>>> overall I quite like it. >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8223978/ >>>>>> >>>>>> Maurizio From maurizio.cimadamore at oracle.com Thu May 16 12:28:17 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 12:28:17 +0000 Subject: hg: panama/dev: 8223978: Add alignment support to layouts Message-ID: <201905161228.x4GCSIdb026081@aojmv0008.oracle.com> Changeset: ff50d436f560 Author: mcimadamore Date: 2019-05-16 13:27 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/ff50d436f560 8223978: Add alignment support to layouts ! src/java.base/share/classes/java/foreign/AbstractLayout.java ! 
src/java.base/share/classes/java/foreign/Address.java ! src/java.base/share/classes/java/foreign/Group.java ! src/java.base/share/classes/java/foreign/Layout.java ! src/java.base/share/classes/java/foreign/Padding.java ! src/java.base/share/classes/java/foreign/Sequence.java ! src/java.base/share/classes/java/foreign/Unresolved.java ! src/java.base/share/classes/java/foreign/Value.java ! src/java.base/share/classes/java/lang/invoke/AddressVarHandleGenerator.java ! src/java.base/share/classes/java/lang/invoke/VarHandleMemoryAddressBase.java ! src/java.base/share/classes/java/lang/invoke/VarHandles.java ! src/java.base/share/classes/java/lang/invoke/X-VarHandleMemoryAddressView.java.template ! src/java.base/share/classes/jdk/internal/foreign/LayoutPathImpl.java ! src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java + test/jdk/java/foreign/TestMemoryAlignment.java From maurizio.cimadamore at oracle.com Thu May 16 12:33:24 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 13:33:24 +0100 Subject: [foreign-memaccess] RFR 8224037: Remove layout attributes Message-ID: Now that we have alignments, we can finally go ahead and drop layout attributes, which is, at present, an overly general mechanism. It is likely that the need for custom attributes will resurface at some point when dealing with foreign function. But this API doesn't need it. Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224037/ Maurizio From jbvernee at xs4all.nl Thu May 16 12:55:07 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 16 May 2019 14:55:07 +0200 Subject: [foreign-memaccess] RFR 8224037: Remove layout attributes In-Reply-To: References: Message-ID: <70f07439b58c11aad0ab155bef018c28@xs4all.nl> Looks good as well! Cheers, Jorn Maurizio Cimadamore schreef op 2019-05-16 14:33: > Now that we have alignments, we can finally go ahead and drop layout > attributes, which is, at present, an overly general mechanism. 
> > It is likely that the need for custom attributes will resurface at > some point when dealing with foreign function. But this API doesn't > need it. > > Webrev: > > http://cr.openjdk.java.net/~mcimadamore/panama/8224037/ > > Maurizio From maurizio.cimadamore at oracle.com Thu May 16 12:57:50 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 12:57:50 +0000 Subject: hg: panama/dev: 8224037: Remove layout attributes Message-ID: <201905161257.x4GCvp43016007@aojmv0008.oracle.com> Changeset: 9c68a5573299 Author: mcimadamore Date: 2019-05-16 13:56 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/9c68a5573299 8224037: Remove layout attributes ! src/java.base/share/classes/java/foreign/AbstractLayout.java ! src/java.base/share/classes/java/foreign/Address.java ! src/java.base/share/classes/java/foreign/Group.java ! src/java.base/share/classes/java/foreign/Layout.java ! src/java.base/share/classes/java/foreign/Padding.java ! src/java.base/share/classes/java/foreign/Sequence.java ! src/java.base/share/classes/java/foreign/Unresolved.java ! src/java.base/share/classes/java/foreign/Value.java From maurizio.cimadamore at oracle.com Thu May 16 13:04:43 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 14:04:43 +0100 Subject: [foreign-memaccess] RFR 8224039: Remove unnecessary layout classes Message-ID: Hi, this patch removes layout subclasses which were deemed unnecessary for this memory access API: * Address (this API is about contiguous memory access) * Unresolved (by name references is especially useful in combination with pointers) This patch also removes the Value::contents method, on the basis that for now there's nothing sensible the API can do with it (this will change once we can rely upon the Vector API). Note: the removed layouts are not gone forever, and will likely be re-added by higher level layers. 
Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224039/ Cheers Maurizio From jbvernee at xs4all.nl Thu May 16 13:18:30 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 16 May 2019 15:18:30 +0200 Subject: [foreign-memaccess] RFR 8224039: Remove unnecessary layout classes In-Reply-To: References: Message-ID: Very nice! You can also remove this check in VarHandles.java: LayoutPath path2 = path; while (path2 != null) { if (path2.enclosing() != null && path2.enclosing().layout() instanceof Value) { throw new IllegalArgumentException("Cannot dereference path into Value container"); } path2 = path2.enclosing(); } Otherwise looks good. --- Also, this is unrelated, but noticed it now: the Layout class javadoc mentions: "A layout is always associated with a size (in bits)." But, since we have unbounded sequences this is no longer true, and Layout::bitSize() can throw an UnsupportedOperationException if it's a Sequence. Cheers, Jorn Maurizio Cimadamore schreef op 2019-05-16 15:04: > Hi, > this patch removes layout subclasses which were deemed unnecessary for > this memory access API: > > * Address (this API is about contiguous memory access) > * Unresolved (by name references is especially useful in combination > with pointers) > > This patch also removes the Value::contents method, on the basis that > for now there's nothing sensible the API can do with it (this will > change once we can rely upon the Vector API). > > Note: the removed layouts are not gone forever, and will likely be > re-added by higher level layers. > > Webrev: > > http://cr.openjdk.java.net/~mcimadamore/panama/8224039/ > > Cheers > Maurizio From maurizio.cimadamore at oracle.com Thu May 16 13:20:15 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 14:20:15 +0100 Subject: [foreign-memaccess] RFR 8224039: Remove unnecessary layout classes In-Reply-To: References: Message-ID: On 16/05/2019 14:18, Jorn Vernee wrote: > Very nice! 
> > You can also remove this check in VarHandles.java: > > LayoutPath path2 = path; > while (path2 != null) { > if (path2.enclosing() != null && > path2.enclosing().layout() instanceof Value) { > throw new IllegalArgumentException("Cannot dereference path into Value container"); > } > path2 = path2.enclosing(); > } > > Otherwise looks good. Thanks - good point - I'll fix and push > > --- > > Also, this is unrelated, but noticed it now: the Layout class javadoc > mentions: "A layout is always associated with a size (in bits)." > > But, since we have unbounded sequences this is no longer true, and > Layout::bitSize() can throw an UnsupportedOperationException if it's a > Sequence. True - good catch, I will address in followup misc improvements to Layout API. Maurizio > > Cheers, > Jorn > > Maurizio Cimadamore schreef op 2019-05-16 15:04: >> Hi, >> this patch removes layout subclasses which were deemed unnecessary for >> this memory access API: >> >> * Address (this API is about contiguous memory access) >> * Unresolved (by name references is especially useful in combination >> with pointers) >> >> This patch also removes the Value::contents method, on the basis that >> for now there's nothing sensible the API can do with it (this will >> change once we can rely upon the Vector API). >> >> Note: the removed layouts are not gone forever, and will likely be >> re-added by higher level layers.
>> >> Webrev: >> >> http://cr.openjdk.java.net/~mcimadamore/panama/8224039/ >> >> Cheers >> Maurizio From maurizio.cimadamore at oracle.com Thu May 16 13:31:15 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 13:31:15 +0000 Subject: hg: panama/dev: 8224039: Remove unnecessary layout classes Message-ID: <201905161331.x4GDVFcK007075@aojmv0008.oracle.com> Changeset: b9a8decd9ec8 Author: mcimadamore Date: 2019-05-16 14:30 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/b9a8decd9ec8 8224039: Remove unnecessary layout classes ! src/java.base/share/classes/java/foreign/AbstractLayout.java - src/java.base/share/classes/java/foreign/Address.java ! src/java.base/share/classes/java/foreign/Group.java ! src/java.base/share/classes/java/foreign/Layout.java ! src/java.base/share/classes/java/foreign/LayoutPath.java ! src/java.base/share/classes/java/foreign/Padding.java ! src/java.base/share/classes/java/foreign/Sequence.java - src/java.base/share/classes/java/foreign/Unresolved.java ! src/java.base/share/classes/java/foreign/Value.java ! src/java.base/share/classes/java/lang/invoke/VarHandles.java ! src/java.base/share/classes/jdk/internal/foreign/LayoutPathImpl.java From maurizio.cimadamore at oracle.com Thu May 16 13:43:08 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 14:43:08 +0100 Subject: [foreign-memaccess] RFR 8224040: Misc improvements to layout API Message-ID: <17e8fb60-79e4-d31c-13a7-232f89ee199c@oracle.com> Hi, this cleans up some of the edges of the API: * better spec for Layout::bitsSize (which can throw on unbound sequences, thx Jorn for pointing that out!) 
* replace LayoutPath::isBound with LayoutPath::dimensions, and fix javadoc * remove Endianness enum, and just use good old nio ByteOrder Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224040/ Cheers Maurizio From jbvernee at xs4all.nl Thu May 16 14:38:40 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 16 May 2019 16:38:40 +0200 Subject: [foreign-memaccess] RFR 8224040: Misc improvements to layout API In-Reply-To: <17e8fb60-79e4-d31c-13a7-232f89ee199c@oracle.com> References: <17e8fb60-79e4-d31c-13a7-232f89ee199c@oracle.com> Message-ID: <1a5fd88047b1860bfd3b16714049dd1e@xs4all.nl> - Spelling nit : LayoutPath::elementPath(String) & LayoutPath::elementPath(long) in the javadoc "a sub-elements" -> "a sub-element". - Some javadoc, particularly in Layout, LayoutPath, Compound, and Sequence, still mentions 'bound'/'unbound' when talking about Sequences with/without a fixed size. I believe this is incorrect and should be 'bounded'/'unbounded' instead, which as an adjective means 'having bounds or limits' (or the inverse) [1]. - While you're doing misc cleanups, you could also do a defensive copy of Group::struct and Group::union `elements` array. Otherwise the resulting object will not be immutable if an existing array is passed in, which is later modified. Jorn [1] : https://www.dictionary.com/browse/bounded Maurizio Cimadamore schreef op 2019-05-16 15:43: > Hi, > this cleans up some of the edges of the API: > > * better spec for Layout::bitsSize (which can throw on unbound > sequences, thx Jorn for pointing that out!)
> * replace LayoutPath::isBound with LayoutPath::dimensions, and fix > javadoc > * remove Endianness enum, and just use good old nio ByteOrder > > Webrev: > > http://cr.openjdk.java.net/~mcimadamore/panama/8224040/ > > Cheers > Maurizio From maurizio.cimadamore at oracle.com Thu May 16 14:57:01 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 14:57:01 +0000 Subject: hg: panama/dev: 8224040: Misc improvements to layout API Message-ID: <201905161457.x4GEv2Se004628@aojmv0008.oracle.com> Changeset: a00cd377e891 Author: mcimadamore Date: 2019-05-16 15:54 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/a00cd377e891 8224040: Misc improvements to layout API ! src/java.base/share/classes/java/foreign/Compound.java ! src/java.base/share/classes/java/foreign/Group.java ! src/java.base/share/classes/java/foreign/Layout.java ! src/java.base/share/classes/java/foreign/LayoutPath.java ! src/java.base/share/classes/java/foreign/Sequence.java ! src/java.base/share/classes/java/foreign/Value.java ! src/java.base/share/classes/java/lang/invoke/VarHandles.java ! src/java.base/share/classes/java/lang/invoke/X-VarHandleMemoryAddressView.java.template ! src/java.base/share/classes/jdk/internal/foreign/LayoutPathImpl.java ! test/jdk/java/foreign/TestMemoryAccess.java From maurizio.cimadamore at oracle.com Thu May 16 15:07:48 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 16:07:48 +0100 Subject: [foreign-memaccess] RFR 8224041: Name of layout classes should end with "Layout" suffix Message-ID: Hi, this is the last patch in the queue of Layout API cleanups, and, perhaps, the simplest one. This patch simply adds a "Layout" suffix to all the nodes in the Layout hierarchy, to avoid over-general names such as 'Compound', 'Group', 'Value'.
Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224041/ Maurizio From maurizio.cimadamore at oracle.com Thu May 16 15:09:54 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 16:09:54 +0100 Subject: [foreign-memaccess] RFR 8224040: Misc improvements to layout API In-Reply-To: <1a5fd88047b1860bfd3b16714049dd1e@xs4all.nl> References: <17e8fb60-79e4-d31c-13a7-232f89ee199c@oracle.com> <1a5fd88047b1860bfd3b16714049dd1e@xs4all.nl> Message-ID: <76e053fd-223e-50ee-d14a-71bd1a067f3b@oracle.com> Fixed and pushed, thanks Maurizio On 16/05/2019 15:38, Jorn Vernee wrote: > - Spelling nit : LayoutPath::elementPath(String) & > LayoutPath::elementPath(long) in the javadoc "a sub-elements" -> "a > sub-element". > > - Some javadoc, particularly in Layout, LayoutPath, Compound, and > Sequence, still mentions 'bound'/'unbound' when talking about > Sequences with/without a fixed size. I believe this is incorrect and > should be 'bounded'/'unbounded' instead, which as an adjective means > 'having bounds or limits' (or the inverse) [1]. > > - While you're doing misc cleanups, you could also do a defensive copy > of Group::struct and Group::union `elements` array. Otherwise the > resulting object will not be immutable if an existing array is passed > in, which is later modified. > > Jorn > > [1] : https://www.dictionary.com/browse/bounded > > Maurizio Cimadamore schreef op 2019-05-16 15:43: >> Hi, >> this cleans up some of the edges of the API: >> >> * better spec for Layout::bitsSize (which can throw on unbound >> sequences, thx Jorn for pointing that out!) 
>> * replace LayoutPath::isBound with LayoutPath::dimensions, and fix >> javadoc >> * remove Endianness enum, and just use good old nio ByteOrder >> >> Webrev: >> >> http://cr.openjdk.java.net/~mcimadamore/panama/8224040/ >> >> Cheers >> Maurizio From jbvernee at xs4all.nl Thu May 16 16:05:00 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 16 May 2019 18:05:00 +0200 Subject: [foreign-memaccess] RFR 8224041: Name of layout classes should end with "Layout" suffix In-Reply-To: References: Message-ID: <8397d6f5ddf7ac0466ef9d644a65b11b@xs4all.nl> Looks good, I like it! Cheers, Jorn Maurizio Cimadamore schreef op 2019-05-16 17:07: > Hi, > this is the last patch in the queue of Layout API cleanups, and, > perhaps, the simplest one. > > This patch simply adds a "Layout" suffix to all the nodes in the > Layout hierarchy, to avoid over-general names such as 'Compound', > 'Group', 'Value'. > > Webrev: > > http://cr.openjdk.java.net/~mcimadamore/panama/8224041/ > > Maurizio From maurizio.cimadamore at oracle.com Thu May 16 16:20:25 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 16:20:25 +0000 Subject: hg: panama/dev: 8224041: Name of layout classes should end with "Layout" suffix Message-ID: <201905161620.x4GGKQCM028092@aojmv0008.oracle.com> Changeset: b9bfd8de1892 Author: mcimadamore Date: 2019-05-16 16:31 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/b9bfd8de1892 8224041: Name of layout classes should end with "Layout" suffix ! src/java.base/share/classes/java/foreign/AbstractLayout.java ! src/java.base/share/classes/java/foreign/CompoundLayout.java < src/java.base/share/classes/java/foreign/Compound.java ! src/java.base/share/classes/java/foreign/GroupLayout.java < src/java.base/share/classes/java/foreign/Group.java ! src/java.base/share/classes/java/foreign/Layout.java !
src/java.base/share/classes/java/foreign/LayoutPath.java - src/java.base/share/classes/java/foreign/Padding.java + src/java.base/share/classes/java/foreign/PaddingLayout.java ! src/java.base/share/classes/java/foreign/SequenceLayout.java < src/java.base/share/classes/java/foreign/Sequence.java ! src/java.base/share/classes/java/foreign/ValueLayout.java < src/java.base/share/classes/java/foreign/Value.java ! src/java.base/share/classes/java/lang/invoke/VarHandles.java ! src/java.base/share/classes/jdk/internal/foreign/LayoutPathImpl.java ! test/jdk/java/foreign/TestMemoryAccess.java ! test/jdk/java/foreign/TestMemoryAlignment.java From henry.jen at oracle.com Thu May 16 16:49:18 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Thu, 16 May 2019 16:49:18 +0000 Subject: hg: panama/dev: 8224013: jextract failed to generate source file under some scenarios Message-ID: <201905161649.x4GGnIPq017973@aojmv0008.oracle.com> Changeset: cb9e1143513e Author: henryjen Date: 2019-05-16 09:48 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/cb9e1143513e 8224013: jextract failed to generate source file under some scenarios Reviewed-by: mcimadamore ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JavaSourceFactoryExt.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Writer.java ! 
test/jdk/com/sun/tools/jextract/JextractToolProviderTest.java + test/jdk/com/sun/tools/jextract/TestSrcDump.java From maurizio.cimadamore at oracle.com Thu May 16 16:54:52 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 16:54:52 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905161654.x4GGsqKH022892@aojmv0008.oracle.com> Changeset: e2731c0ccfb5 Author: mcimadamore Date: 2019-05-16 18:54 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/e2731c0ccfb5 Automatic merge with foreign From maurizio.cimadamore at oracle.com Thu May 16 17:28:58 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 16 May 2019 18:28:58 +0100 Subject: [foreign-memaccess] RFC: split MemoryAddress and MemoryRegion Message-ID: Hi, this is not an official RFR so much as a request for comments; in the past I've mentioned the possibility of separating the region-like aspects of MemoryAddress into a separate abstraction (MemoryRegion), as users can be confused by the dual nature of MemoryAddress. An example of this split is below: http://cr.openjdk.java.net/~mcimadamore/panama/split-region/ Splitting seems to lead to a generally clearer API: * methods 'feel' like they are where they belong (e.g. resize() is on region) * Address implementation becomes even smaller (offset + region) * Having an explicit MemoryRegion thingie will give us a natural carrier for 'compound' layouts, which can be useful later on (e.g. foreign functions) Disadvantages are: * one more API point * MemoryScope::allocate returns a region, which has then to be turned into a MemoryAddress (of course we could always project into address, but it seems more general for allocation to return the region) Comments? P.S. I'm not married to the MR/MA names, so if the main objection against this split has to do with naming, I'm open to discussing alternatives.
On the other hand, I've tried to think about other names for MemoryAddress which could better handle its dual nature - and I'm not convinced we can resolve the ambiguity with a simple name change. Cheers Maurizio From vivek.r.deshpande at intel.com Thu May 16 22:29:06 2019 From: vivek.r.deshpande at intel.com (vivek.r.deshpande at intel.com) Date: Thu, 16 May 2019 22:29:06 +0000 Subject: hg: panama/dev: add tests for masked reductions and masked min max and fix for masked minLanes and maxLanes for FP Message-ID: <201905162229.x4GMT68f020718@aojmv0008.oracle.com> Changeset: a5794fee2485 Author: vdeshpande Date: 2019-05-16 15:28 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/a5794fee2485 add tests for masked reductions and masked min max and fix for masked minLanes and maxLanes for FP ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double128Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double256Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double512Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double64Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/DoubleMaxVector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float128Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float256Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float512Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float64Vector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/FloatMaxVector.java ! src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X-VectorBits.java.template ! test/jdk/jdk/incubator/vector/Byte128VectorTests.java ! test/jdk/jdk/incubator/vector/Byte256VectorTests.java ! test/jdk/jdk/incubator/vector/Byte512VectorTests.java ! test/jdk/jdk/incubator/vector/Byte64VectorTests.java ! test/jdk/jdk/incubator/vector/ByteMaxVectorTests.java !
test/jdk/jdk/incubator/vector/Double128VectorTests.java ! test/jdk/jdk/incubator/vector/Double256VectorTests.java ! test/jdk/jdk/incubator/vector/Double512VectorTests.java ! test/jdk/jdk/incubator/vector/Double64VectorTests.java ! test/jdk/jdk/incubator/vector/DoubleMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Float128VectorTests.java ! test/jdk/jdk/incubator/vector/Float256VectorTests.java ! test/jdk/jdk/incubator/vector/Float512VectorTests.java ! test/jdk/jdk/incubator/vector/Float64VectorTests.java ! test/jdk/jdk/incubator/vector/FloatMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Int128VectorTests.java ! test/jdk/jdk/incubator/vector/Int256VectorTests.java ! test/jdk/jdk/incubator/vector/Int512VectorTests.java ! test/jdk/jdk/incubator/vector/Int64VectorTests.java ! test/jdk/jdk/incubator/vector/IntMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Long128VectorTests.java ! test/jdk/jdk/incubator/vector/Long256VectorTests.java ! test/jdk/jdk/incubator/vector/Long512VectorTests.java ! test/jdk/jdk/incubator/vector/Long64VectorTests.java ! test/jdk/jdk/incubator/vector/LongMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Short128VectorTests.java ! test/jdk/jdk/incubator/vector/Short256VectorTests.java ! test/jdk/jdk/incubator/vector/Short512VectorTests.java ! test/jdk/jdk/incubator/vector/Short64VectorTests.java ! test/jdk/jdk/incubator/vector/ShortMaxVectorTests.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java ! 
test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/DoubleMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/DoubleScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/FloatMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/FloatScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/IntMaxVector.java ! 
test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/IntScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/LongMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/LongScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ShortMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ShortScalar.java ! test/jdk/jdk/incubator/vector/clean.sh ! test/jdk/jdk/incubator/vector/gen-template.sh ! 
test/jdk/jdk/incubator/vector/gen-tests.sh + test/jdk/jdk/incubator/vector/templates/Kernel-Reduction-Masked-Max-op.template + test/jdk/jdk/incubator/vector/templates/Kernel-Reduction-Masked-Min-op.template + test/jdk/jdk/incubator/vector/templates/Kernel-Reduction-Masked-op.template + test/jdk/jdk/incubator/vector/templates/Perf-Reduction-Masked-Max-op.template + test/jdk/jdk/incubator/vector/templates/Perf-Reduction-Masked-Min-op.template + test/jdk/jdk/incubator/vector/templates/Perf-Reduction-Masked-op.template + test/jdk/jdk/incubator/vector/templates/Perf-Scalar-Reduction-Masked-Max-op.template + test/jdk/jdk/incubator/vector/templates/Perf-Scalar-Reduction-Masked-Min-op.template + test/jdk/jdk/incubator/vector/templates/Perf-Scalar-Reduction-Masked-op.template ! test/jdk/jdk/incubator/vector/templates/Unit-Binary-Masked-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Reduction-Masked-Max-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Reduction-Masked-Min-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Reduction-Masked-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Reduction-Scalar-Masked-Max-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Reduction-Scalar-Masked-Min-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Reduction-Scalar-Masked-op.template ! 
test/jdk/jdk/incubator/vector/templates/Unit-header.template From maurizio.cimadamore at oracle.com Thu May 16 22:35:06 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 16 May 2019 22:35:06 +0000 Subject: hg: panama/dev: Automatic merge with vectorIntrinsics Message-ID: <201905162235.x4GMZ6YG022846@aojmv0008.oracle.com> Changeset: 4a59fe14763b Author: mcimadamore Date: 2019-05-17 00:34 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/4a59fe14763b Automatic merge with vectorIntrinsics From vivek.r.deshpande at intel.com Fri May 17 01:48:45 2019 From: vivek.r.deshpande at intel.com (vivek.r.deshpande at intel.com) Date: Fri, 17 May 2019 01:48:45 +0000 Subject: hg: panama/dev: small update to JDK-8221429 Message-ID: <201905170148.x4H1mk6K014164@aojmv0008.oracle.com> Changeset: 0a84f67fc919 Author: vdeshpande Date: 2019-05-16 18:47 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/0a84f67fc919 small update to JDK-8221429 ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/DoubleScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/FloatScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/IntScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/LongScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ShortScalar.java ! test/jdk/jdk/incubator/vector/templates/Perf-Scalar-Reduction-Masked-Max-op.template ! 
test/jdk/jdk/incubator/vector/templates/Perf-Scalar-Reduction-Masked-Min-op.template From maurizio.cimadamore at oracle.com Fri May 17 01:54:38 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 17 May 2019 01:54:38 +0000 Subject: hg: panama/dev: Automatic merge with vectorIntrinsics Message-ID: <201905170154.x4H1sdDu018714@aojmv0008.oracle.com> Changeset: f39f988e45f0 Author: mcimadamore Date: 2019-05-17 03:54 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/f39f988e45f0 Automatic merge with vectorIntrinsics From samuel.audet at gmail.com Fri May 17 05:54:36 2019 From: samuel.audet at gmail.com (Samuel Audet) Date: Fri, 17 May 2019 14:54:36 +0900 Subject: [foreign] Poor performance? Message-ID: Hi, This person seems to be unable to obtain an acceptable level of performance with Panama: https://github.com/zakgof/java-native-benchmark I thought it would be good to know why so this can be fixed! Samuel From henry.jen at oracle.com Fri May 17 06:36:47 2019 From: henry.jen at oracle.com (Henry Jen) Date: Thu, 16 May 2019 23:36:47 -0700 Subject: [foreign] RFR: 8224013: jextract failed to generate source file under some scenarios In-Reply-To: <63499858-f211-5eab-c923-35a8a44d7044@oracle.com> References: <9E73FF01-2747-4B7F-ABD7-7A6D9A3D1E8E@oracle.com> <63499858-f211-5eab-c923-35a8a44d7044@oracle.com> Message-ID: I somehow messed up the test case. I got it right at first, but then added -t for the symbolic link case, wanting to show that it's irrelevant; that made the case fail, as it is setting up the folder without a package.
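As context for the 8224013 patch in this message, the directory handling it touches can be sketched on its own. This is a minimal, self-contained illustration and not jextract's actual Writer class: `SrcDumpSketch` and `writeSource` are hypothetical names made up for this example. The idea is to call `Files.createDirectories` only when the parent does not exist yet, and to accept an existing parent as long as it resolves to a directory (`Files.isDirectory` follows symbolic links by default, so a link such as /tmp on MacOS passes the check).

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical stand-in for the source-dump logic; not the real jextract Writer.
class SrcDumpSketch {

    // Writes 'content' to 'file', creating missing parent directories first.
    public static void writeSource(Path file, String content) throws IOException {
        // Resolve to an absolute path so that a bare file name still has a parent.
        Path dir = file.toAbsolutePath().getParent();
        if (dir != null) {
            if (Files.exists(dir)) {
                // An existing parent is fine if it is a directory or a link to one.
                if (!Files.isDirectory(dir)) {
                    throw new FileAlreadyExistsException(dir.toString());
                }
            } else {
                // Only create directories when nothing exists at that path yet.
                Files.createDirectories(dir);
            }
        }
        Files.write(file, List.of(content));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("srcdump");
        Path out = tmp.resolve("com").resolve("acme").resolve("simple_h.java");
        writeSource(out, "// generated source would go here");
        System.out.println("written: " + Files.isRegularFile(out));
    }
}
```

Whether the dump directory is reached through a symbolic link or not, the existing-directory branch never calls `createDirectories`, which is the call the bug report associates with the FileAlreadyExistsException.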
I'll push this patch with same bug ID,

diff -r b9524dfaa8f0 src/jdk.jextract/share/classes/com/sun/tools/jextract/Writer.java
--- a/src/jdk.jextract/share/classes/com/sun/tools/jextract/Writer.java Thu May 16 10:06:42 2019 -0700
+++ b/src/jdk.jextract/share/classes/com/sun/tools/jextract/Writer.java Thu May 16 23:27:38 2019 -0700
@@ -95,10 +95,13 @@
             Path dir = fullPath.getParent();
             // In case the folder exist and is a link to a folder, this should be OK
             // Case in point, /tmp on MacOS link to /private/tmp
-            if (Files.exists(dir) && !Files.isDirectory(dir)) {
-                throw new FileAlreadyExistsException(dir.toAbsolutePath().toString());
+            if (Files.exists(dir)) {
+                if (!Files.isDirectory(dir)) {
+                    throw new FileAlreadyExistsException(dir.toAbsolutePath().toString());
+                }
+            } else {
+                Files.createDirectories(fullPath.getParent());
             }
-            Files.createDirectories(fullPath.getParent());
             Files.write(fullPath, List.of(entry.getValue()));
         }
     }
diff -r b9524dfaa8f0 test/jdk/com/sun/tools/jextract/TestSrcDump.java
--- a/test/jdk/com/sun/tools/jextract/TestSrcDump.java Thu May 16 10:06:42 2019 -0700
+++ b/test/jdk/com/sun/tools/jextract/TestSrcDump.java Thu May 16 23:27:38 2019 -0700
@@ -61,10 +61,10 @@
         Path realTarget = getOutputFilePath("realGenSrc");
         Files.createDirectory(realTarget);
         Files.createSymbolicLink(src, realTarget);
-        run("--src-dump-dir", src.toString(), "-t", "com.acme",
+        run("--src-dump-dir", src.toString(),
                 getInputFilePath("simple.h").toString()).checkSuccess();
         try {
-            assertTrue(Files.isRegularFile(src.resolve("com").resolve("acme").resolve(staticForwarderName("simple.h") + ".java")));
+            assertTrue(Files.isRegularFile(src.resolve(staticForwarderName("simple.h") + ".java")));
         } finally {
             deleteFile(src);
             deleteDir(realTarget);

Cheers,
Henry

> On May 16, 2019, at 3:27 AM, Maurizio Cimadamore wrote:
>
> Looks good
>
> Maurizio
>
> On 16/05/2019 00:57, Henry Jen wrote:
>> Hi,
>>
>> Please review a trivial fix[1] for 8224013[2],
>>
>> jextract throws exceptions when
use --src-dump-dir under following scenarios, >> >> 1. Not specifying target package name with -t. This will cause jextract trying to write static forwarder source into root folder. >> 2. If the --src-dump-dir specified is a symbolic link to an existing folder, jextract will fail with java.nio.file.FileAlreadyExistsException >> >> Cheers, >> Henry >> >> [1] http://cr.openjdk.java.net/~henryjen/panama/8224013/webrev/ >> [2] https://bugs.openjdk.java.net/browse/JDK-8224013 From maurizio.cimadamore at oracle.com Fri May 17 10:26:46 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 11:26:46 +0100 Subject: [foreign] Poor performance? In-Reply-To: References: Message-ID: On 17/05/2019 06:54, Samuel Audet wrote: > Hi, > > This person seems to be unable to obtain an acceptable level of performance > with Panama: > https://github.com/zakgof/java-native-benchmark > > I thought it would be good to know why so this can be fixed! Hi Samuel, thank you for bringing this up. I saw this benchmark a few days ago and I took a look at it. That benchmark is unfortunately hitting a couple of (transitory!) pain points: (1) it is running on Windows, which lacks the optimizations available for MacOS and Linux (directInvoker). When the linkToNative effort is completed, this discrepancy between platforms will go away. The second problem (2) is that the call is passing a big struct (e.g. bigger than 64 bits). Even on Linux and Mac, such a call would be unable to take advantage of the optimized invoker and would fall back to the so-called 'universal invoker', which is slow. The plan is to start using linkToNative as our official cross-platform fast-path, and then progressively enhance linkToNative so that it can handle all the cases that are currently left to universalInvoker - of these there are two that are important: big structs passed by pointer, and return value in memory.
Once this is done (and this is mostly a matter of figuring out a 'private protocol', or set of carriers that are known to both the Panama binder and the VM), then I would expect that performance cliffs like the one you show will go away. In the present state, on Linux and Mac, I would expect a call involving small structs and/or primitives to be on par with JNI and be significantly faster than JNI/JFFI. Luckily, when writing real-world code, these things usually cancel out - e.g. it would be unusual for a real-world application to only have 'slow' calls with big structs, but the situation is of course possible in a synthetic benchmark and we should make sure to polish these edges. Thanks Maurizio > > Samuel From maurizio.cimadamore at oracle.com Fri May 17 14:51:46 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 15:51:46 +0100 Subject: [foreign] Poor performance? In-Reply-To: References: Message-ID: On 17/05/2019 11:26, Maurizio Cimadamore wrote: > thanks you for bringing this up, I saw this benchmark few days ago and > I took a look at it. That benchmark is unfortunately hitting on a > couple of (transitory!) pain points: (1) it is running on Windows, > which lacks the optimizations available for MacOS and Linux > (directInvoker). When the linkToNative effort will be completed, this > discrepancy between platforms will go away. The second problem (2) is > that the call is passing a big struct (e.g. bigger than 64 bits). Even > on Linux and Mac, such a call would be unable to take advantage of the > optimized invoker and would fall back to the so called 'universal > invoker' which is slow. Actually, my bad, the bench is passing pointers to structs, not structs by value - which I think should mean the 'foreign+linkToNative' experimental branch should be able to handle this. It would be nice to get some confirmation that this is indeed the case.
Maurizio From jbvernee at xs4all.nl Fri May 17 15:14:19 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Fri, 17 May 2019 17:14:19 +0200 Subject: [foreign] Poor performance? In-Reply-To: References: Message-ID: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> FWIW, I ran the benchmarks with the linkToNative back-end (using -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still 2x slower than JNI:

Benchmark                                 Mode  Cnt    Score    Error  Units
JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  298.046 ± 15.744  ns/op
JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  596.567 ± 20.570  ns/op

Of course, like Aleksey says: "The numbers [above] are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are.". Unfortunately, I'm having some trouble getting the project to work with the Windows profiler :/ Was currently looking into that. Cheers, Jorn Maurizio Cimadamore schreef op 2019-05-17 16:51: > On 17/05/2019 11:26, Maurizio Cimadamore wrote: >> thanks you for bringing this up, I saw this benchmark few days ago and >> I took a look at it. That benchmark is unfortunately hitting on a >> couple of (transitory!) pain points: (1) it is running on Windows, >> which lacks the optimizations available for MacOS and Linux >> (directInvoker). When the linkToNative effort will be completed, this >> discrepancy between platforms will go away. The second problem (2) is >> that the call is passing a big struct (e.g. bigger than 64 bits). Even >> on Linux and Mac, such a call would be unable to take advantage of the >> optimized invoker and would fall back to the so called 'universal >> invoker' which is slow. > > Actually, my bad, the bench is passing pointer to structs, not structs > by value - which I think should mean the 'foreign+linkToNative' > experimental branch should be able to handle this. Would be nice to > get some confirmation that this is indeed the case.
> > Maurizio From jbvernee at xs4all.nl Fri May 17 15:27:55 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Fri, 17 May 2019 17:27:55 +0200 Subject: [foreign] Poor performance? In-Reply-To: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> Message-ID: <63d1a88ccfc7b1e0dcd62b72c6af9cd8@xs4all.nl> Sorry, forgot to include the CallOnly results (seems to have been omitted for some reason), which look much better:

Benchmark                                 Mode  Cnt    Score    Error  Units
JmhCallOnly.jni_javacpp                   avgt   50   64.958 ±  3.608  ns/op
JmhCallOnly.panama                        avgt   50   39.231 ±  1.951  ns/op
JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  295.754 ± 13.541  ns/op
JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  610.027 ± 30.592  ns/op

Obviously, this deserves some more investigation either way :) Jorn Jorn Vernee schreef op 2019-05-17 17:14: > FWIW, I ran the benchmarks with the linkToNative back-end (using > -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still > 2x slower than JNI:
>
> Benchmark                                 Mode  Cnt    Score    Error  Units
> JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  298.046 ± 15.744  ns/op
> JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  596.567 ± 20.570  ns/op
>
> Of course, like Aleksey says: "The numbers [above] are just data. To > gain reusable insights, you need to follow up on why the numbers are > the way they are.". Unfortunately, I'm having some trouble getting the > project to work with the Windows profiler :/ Was currently looking > into that. > > Cheers, > Jorn > > Maurizio Cimadamore schreef op 2019-05-17 16:51: >> On 17/05/2019 11:26, Maurizio Cimadamore wrote: >>> thanks you for bringing this up, I saw this benchmark few days ago >>> and I took a look at it. That benchmark is unfortunately hitting on a >>> couple of (transitory!) pain points: (1) it is running on Windows, >>> which lacks the optimizations available for MacOS and Linux >>> (directInvoker).
When the linkToNative effort will be completed, this >>> discrepancy between platforms will go away. The second problem (2) is >>> that the call is passing a big struct (e.g. bigger than 64 bits). >>> Even on Linux and Mac, such a call would be unable to take advantage >>> of the optimized invoker and would fall back to the so called >>> 'universal invoker' which is slow. >> >> Actually, my bad, the bench is passing pointer to structs, not structs >> by value - which I think should mean the 'foreign+linkToNative' >> experimental branch should be able to handle this. Would be nice to >> get some confirmation that this is indeed the case. >> >> Maurizio From maurizio.cimadamore at oracle.com Fri May 17 15:33:39 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 16:33:39 +0100 Subject: [foreign] Poor performance? In-Reply-To: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> Message-ID: <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> Thanks Jorn, I'd be more interested in knowing the raw native call numbers: does it get any better with linkToNative? Here I'd be expecting performance identical to JNI (since the binder should lower the Pointer to a long, which linkToNative would then pass by register). As for the fuller benchmark, note that you are also measuring the performance of Scope::allocate, which is internally using some maps. JNR/JNI does not do the same liveness checks that we do, so the full benchmark is not totally fair. But the raw performance of the downcall should be an apples-to-apples comparison, and it shouldn't be 8x slower as it is now (at least not with linkToNative). Maurizio On 17/05/2019 16:14, Jorn Vernee wrote: > FWIW, I ran the benchmarks with the linkToNative back-end (using > -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still > 2x slower than JNI:
>
> Benchmark                                 Mode  Cnt    Score    Error  Units
> JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  298.046 ± 15.744  ns/op
> JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  596.567 ± 20.570  ns/op
>
> Of course, like Aleksey says: "The numbers [above] are just data. To > gain reusable insights, you need to follow up on why the numbers are > the way they are.". Unfortunately, I'm having some trouble getting the > project to work with the Windows profiler :/ Was currently looking > into that. > > Cheers, > Jorn > > Maurizio Cimadamore schreef op 2019-05-17 16:51: >> On 17/05/2019 11:26, Maurizio Cimadamore wrote: >>> thanks you for bringing this up, I saw this benchmark few days ago >>> and I took a look at it. That benchmark is unfortunately hitting on >>> a couple of (transitory!) pain points: (1) it is running on Windows, >>> which lacks the optimizations available for MacOS and Linux >>> (directInvoker). When the linkToNative effort will be completed, >>> this discrepancy between platforms will go away. The second problem >>> (2) is that the call is passing a big struct (e.g. bigger than 64 >>> bits). Even on Linux and Mac, such a call would be unable to take >>> advantage of the optimized invoker and would fall back to the so >>> called 'universal invoker' which is slow. >> >> Actually, my bad, the bench is passing pointer to structs, not structs >> by value - which I think should mean the 'foreign+linkToNative' >> experimental branch should be able to handle this. Would be nice to >> get some confirmation that this is indeed the case.
>> >> Maurizio From maurizio.cimadamore at oracle.com Fri May 17 15:39:42 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 16:39:42 +0100 Subject: RFR 8224134: Fix javadoc issues Message-ID: There's an issue I uncovered in javadoc which is preventing javadoc for layout classes from being generated correctly: https://bugs.openjdk.java.net/browse/JDK-8224052 While we're working on a fix, I think it's better to make the intermediate AbstractLayout type non-generic (which avoids the problem), and then sprinkle some covariant overrides. I've also added some javadocs for equals/hashcode in various classes. Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224134/ Cheers Maurizio From maurizio.cimadamore at oracle.com Fri May 17 15:54:40 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 16:54:40 +0100 Subject: [foreign] Poor performance? In-Reply-To: <63d1a88ccfc7b1e0dcd62b72c6af9cd8@xs4all.nl> References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <63d1a88ccfc7b1e0dcd62b72c6af9cd8@xs4all.nl> Message-ID: Ok - that is what I was expecting, thanks! I think now, the problem is limited to the fact that Scope::allocate is slow - but note that similar problems have also been found elsewhere: https://github.com/bytedeco/javacpp/issues/299 My point here is that, while we should remove accidental performance degradation (the fact that we keep reparsing the layout anno seems to be an offender here, which the benchmark writer had to work around), on the other hand it is not, I think, 100% fair to compare a higher-level allocation solution to a raw malloc. For instance, the benchmark is reusing the same scope over and over, which means you keep allocating in the same scope and create bigger and bigger lists of allocation units (which at some point will have to be resized); as a result, that will use more heap memory than the JNI counterpart.
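The scope-reuse effect described here can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyScope` is a hypothetical stand-in written for this example, not the Panama Scope class, and it merely records allocation sizes in a list instead of touching native memory. It shows that one long-lived scope keeps every allocation unit it has handed out until it is closed, whereas a scope per call never tracks more than one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a scope that tracks its allocation units.
class ToyScope implements AutoCloseable {
    private final List<Long> allocations = new ArrayList<>();

    // Records an allocation of the given size and returns its index in the scope.
    long allocate(long bytes) {
        allocations.add(bytes);
        return allocations.size() - 1;
    }

    int liveAllocations() {
        return allocations.size();
    }

    // Closing the scope releases everything it tracked.
    @Override
    public void close() {
        allocations.clear();
    }

    // Benchmark-like pattern: every call allocates in the same long-lived scope.
    static int reuseOneScope(int calls) {
        try (ToyScope scope = new ToyScope()) {
            for (int i = 0; i < calls; i++) {
                scope.allocate(16);
            }
            return scope.liveAllocations();
        }
    }

    // Alternative pattern: a fresh scope per call; returns the peak number tracked.
    static int scopePerCall(int calls) {
        int peak = 0;
        for (int i = 0; i < calls; i++) {
            try (ToyScope scope = new ToyScope()) {
                scope.allocate(16);
                peak = Math.max(peak, scope.liveAllocations());
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("one scope:      " + reuseOneScope(1000));
        System.out.println("scope per call: " + scopePerCall(1000));
    }
}
```

In the toy model the shared scope ends up tracking one entry per call, which mirrors the growing (and occasionally resized) list mentioned above; it says nothing about the actual cost of either pattern in the real implementation.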
As for native memory usage I don't know - in the JNI bench I don't see a 'free' but maybe JavaCPP is cleaning that up automagically (with a Cleaner?). Those are important behavioral differences which should be taken into account when looking at the numbers. Maurizio On 17/05/2019 16:27, Jorn Vernee wrote: > Sorry, forgot to include the CallOnly results (seems to have been > omitted for some reason), which look much better:
>
> Benchmark                                 Mode  Cnt    Score    Error  Units
> JmhCallOnly.jni_javacpp                   avgt   50   64.958 ±  3.608  ns/op
> JmhCallOnly.panama                        avgt   50   39.231 ±  1.951  ns/op
> JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  295.754 ± 13.541  ns/op
> JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  610.027 ± 30.592  ns/op
>
> Obviously, this deserves some more investigation either way :) > > Jorn > > Jorn Vernee schreef op 2019-05-17 17:14: >> FWIW, I ran the benchmarks with the linkToNative back-end (using >> -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still >> 2x slower than JNI:
>>
>> Benchmark                                 Mode  Cnt    Score    Error  Units
>> JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  298.046 ± 15.744  ns/op
>> JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  596.567 ± 20.570  ns/op
>>
>> Of course, like Aleksey says: "The numbers [above] are just data. To >> gain reusable insights, you need to follow up on why the numbers are >> the way they are.". Unfortunately, I'm having some trouble getting the >> project to work with the Windows profiler :/ Was currently looking >> into that. >> >> Cheers, >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-17 16:51: >>> On 17/05/2019 11:26, Maurizio Cimadamore wrote: >>>> thanks you for bringing this up, I saw this benchmark few days ago >>>> and I took a look at it.
That benchmark is unfortunately hitting on >>>> a couple of (transitory!) pain points: (1) it is running on >>>> Windows, which lacks the optimizations available for MacOS and >>>> Linux (directInvoker). When the linkToNative effort will be >>>> completed, this discrepancy between platforms will go away. The >>>> second problem (2) is that the call is passing a big struct (e.g. >>>> bigger than 64 bits). Even on Linux and Mac, such a call would be >>>> unable to take advantage of the optimized invoker and would fall >>>> back to the so called 'universal invoker' which is slow. >>> >>> Actually, my bad, the bench is passing pointer to structs, not structs >>> by value - which I think should mean the 'foreign+linkToNative' >>> experimental branch should be able to handle this. Would be nice to >>> get some confirmation that this is indeed the case. >>> >>> Maurizio From jean-philippe.halimi at intel.com Fri May 17 15:58:26 2019 From: jean-philippe.halimi at intel.com (Halimi, Jean-Philippe) Date: Fri, 17 May 2019 15:58:26 +0000 Subject: VectorAPI: Testing for gather and scatter Masked, and Single Message-ID: Dear all, Here are two patches implementing the testing for gather and scatter VectorAPI calls (masked), as well as single. http://cr.openjdk.java.net/~vdeshpande/VectorAPI/webrev_gatherScatter_allTypes_gatherScatterMasked2/ http://cr.openjdk.java.net/~vdeshpande/VectorAPI/webrev_single2/ Please let me know your thoughts, and I will edit or merge accordingly. 
:) Thanks -Jp From henry.jen at oracle.com Fri May 17 17:00:43 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Fri, 17 May 2019 17:00:43 +0000 Subject: hg: panama/dev: 8224013: jextract failed to generate source file under some scenarios Message-ID: <201905171700.x4HH0hga026628@aojmv0008.oracle.com> Changeset: 9265a3633960 Author: henryjen Date: 2019-05-17 10:00 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/9265a3633960 8224013: jextract failed to generate source file under some scenarios ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Writer.java ! test/jdk/com/sun/tools/jextract/TestSrcDump.java From maurizio.cimadamore at oracle.com Fri May 17 17:04:28 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 17 May 2019 17:04:28 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905171704.x4HH4TiU028627@aojmv0008.oracle.com> Changeset: cef8136ee7ee Author: mcimadamore Date: 2019-05-17 19:04 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/cef8136ee7ee Automatic merge with foreign From jbvernee at xs4all.nl Fri May 17 17:19:09 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Fri, 17 May 2019 19:19:09 +0200 Subject: [foreign-memaccess] RFC: split MemoryAddress and MemoryRegion In-Reply-To: References: Message-ID: I like this idea. (tbh I wasn't too enthusiastic about merging the 2 in the foreign impl either, but wanted to see how it went). About the disadvantages: > * one more API point Like you say: "methods 'feels' where they belong". I think this is more important than having a smaller API. > * MemoryScope::allocate returns a region, which has then to be turned > into a MemoryAddress (of course we could always project into address, > but it seems more general for allocation to return the region) What about having 2 allocation methods, one which takes a Layout and returns a MemoryAddress, and one which takes a byte size and returns a MemoryRegion?

    MemoryAddress allocate(Layout layout);
    MemoryRegion allocateRegion(long size);
    // + overloads with alignment

---

About the implementation: I think for now we want to just get rid of MemoryBoundInfo.ofEverything, as well as MemoryAddressImpl.ofNative, since they're not really needed right now. If we want to expose the size of a memory region, we can't use MemoryBoundInfo.ofEverything anyway, since the length it uses is bogus. I think down the road for native pointers we'd want to use a null MemoryRegion, i.e. have no scope and range checks for those. If attaching a scope to a native pointer is still desirable, we might want to put the scope() method in MemoryAddress after all, since some pointers would have a scope, but not a memory region (i.e. no range checks). Jorn Maurizio Cimadamore schreef op 2019-05-16 19:28: > Hi, > this is not an official RFR, as much a request for comments; in the > past I've mentioned the possibility of separating the region-like > aspects of MemoryAddress into a separate abstraction (MemoryRegion), > as users can be confused by the dual nature of MemoryAddress. An > example of this split is below: > > http://cr.openjdk.java.net/~mcimadamore/panama/split-region/ > > Splitting seems to lead to a generally clearer API: > > * methods 'feels' where they belong (e.g. resize() is on region) > * Address implementation becomes even smaller (offset + region) > * Having an explicit MemoryRegion thingie will give us a natural > carrier for 'compound' layouts, which can be useful later on (e.g. > foreign functions) > > Disadvantages are: > > * one more API point > * MemoryScope::allocate returns a region, which has then to be turned > into a MemoryAddress (of course we could always project into address, > but it seems more general for allocation to return the region) > > Comments? > > P.S. > > I'm not married to the MR/MA names, so if the main objection against > this split has to do with naming, I'm open to discuss alternatives.
On > the other hand, I've tried to think about other names for MemoryAddress > which could handle better its dual nature - and I'm not convinced we > can resolve the ambiguity with a simple name change. > > Cheers > Maurizio From maurizio.cimadamore at oracle.com Fri May 17 17:32:50 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 18:32:50 +0100 Subject: [foreign-memaccess] RFC: split MemoryAddress and MemoryRegion In-Reply-To: References: Message-ID: <8a8340b7-fb1f-c6b9-b212-0bc409f8b73e@oracle.com> On 17/05/2019 18:19, Jorn Vernee wrote: > I like this idea. (tbh I wasn't too enthusiastic about merging the 2 > in the foreign impl either, but wanted to see how it went). About the > disadvantages: > >> * one more API point > > Like you say; "methods 'feels' where they belong", I think this is > more important than having a smaller API. yeah > >> * MemoryScope::allocate returns a region, which has then to be turned >> into a MemoryAddress (of course we could always project into address, >> but it seems more general for allocation to return the region) > > What about having 2 allocation methods, one which takes a Layout and > which returns a MemoryAdress, and one which takes a byte size and > returns a MemoryRegion?
>
>     MemoryAddress allocate(Layout layout);
>     MemoryRegion allocateRegion(long size);
>     // + overloads with alignment

This might be a good idea - after all, low-level users might not even want to bother with layouts... seems a sensible suggestion. > > --- > > About the implementation; I think for now we want to just get rid of > MemoryBoundInfo.ofEverything, as well as MemoryAddressImpl.ofNative, > since they're not really needed right now. If we want to expose the > size of a memory region, we can't use MemoryBoundInfo.ofEverything any > ways since the length it uses is bogus.
Yeah there's loads of stuff in the impl in need of cleanup, most of what you see here was set up in preparation for later stages, but now that we found a simpler center for the API, it's time to revisit the impl too. > > I think down the road for native pointers we'd want to use a null > MemoryRegion i.e. have no scope and range checks for those. If > attaching a scope to a native pointer is still desirable, we might > want to put the scope() method in MemoryAddress after all, since some > pointers would have a scope, but not a memory region (i.e. no range > checks). Yep - I'll keep that in mind. Maurizio > > Jorn > > Maurizio Cimadamore schreef op 2019-05-16 19:28: >> Hi, >> this is not an official RFR, as much a request for comments; in the >> past I've mentioned the possibility of separating the region-like >> aspects of MemoryAddress into a separate abstraction (MemoryRegion), >> as users can be confused by the dual nature of MemoryAddress. An >> example of this split is below: >> >> http://cr.openjdk.java.net/~mcimadamore/panama/split-region/ >> >> Splitting seems to lead to a generally clearer API: >> >> * methods 'feels' where they belong (e.g. resize() is on region) >> * Address implementation becomes even smaller (offset + region) >> * Having an explicit MemoryRegion thingie will give us a natural >> carrier for 'compound' layouts, which can be useful later on (e.g. >> foreign functions) >> >> Disadvantages are: >> >> * one more API point >> * MemoryScope::allocate returns a region, which has then to be turned >> into a MemoryAddress (of course we could always project into address, >> but it seems more general for allocation to return the region) >> >> Comments? >> >> P.S. >> >> I'm not married to the MR/MA names, so if the main objection against >> this split has to do with naming, I'm open to discuss alternatives. 
On >> the other hand, I've tried to think about other names for MemoryAddress >> which could better handle its dual nature - and I'm not convinced we >> can resolve the ambiguity with a simple name change. >> >> Cheers >> Maurizio From henry.jen at oracle.com Fri May 17 20:26:40 2019 From: henry.jen at oracle.com (Henry Jen) Date: Fri, 17 May 2019 13:26:40 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> Message-ID: Oops, the link to the webrev http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/ Cheers, Henry > On May 17, 2019, at 1:26 PM, Henry Jen wrote: > > Hi, > > Please review another webrev[1] that uses a precompiled header to do the evaluation, as suggested by Maurizio. This patch makes one instance of Reparser that can be used by both Macro and Atomic types, and potentially other uses. > > We take the arguments from the original TU in the parser, mainly to make sure the precompiled header uses the same settings; this helps the C++ mode in StructTest.java, which otherwise fails because of the mismatch with the original TU (with clang option -x c++). > > This has been an issue all along, but MacroParser is currently implemented in a way that just shuts off in case of any failure, so we simply did not notice it. > > In this patch, I also refactor the clang API a bit to move the TranslationUnit functions to where they belong; those were in Index because we didn't expose TranslationUnit in the early days. Now that we have it, we should organize the functions accordingly. > > Another change worth noting is that Index.parse now throws ParsingFailedException if clang parsing fails in some way.
This is why jextract was showing > "WARNING: nothing to generate" instead of erroring out when using a command line like 'jextract -C -include-pch -C test.h.gch client.h' > > Cheers, > Henry > >> On May 15, 2019, at 8:50 AM, Henry Jen wrote: >> >> I figured this out: it failed with a different TU, so we need to catch the exception in a loop. >> >> Cheers, >> Henry >> >>> On May 14, 2019, at 9:36 PM, Henry Jen wrote: >>> >>> BTW, I can get away with the incomplete type by adding the CXTranslationUnit_Incomplete option, but I'm not sure about typedefs: >>> >>> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >>> >>> Cheers, >>> Henry >>> >>> >>>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>>> >>>> Using the reparse mechanism as ClangParser seems a good idea, but there are complications with using a precompiled header. >>>> >>>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we create a new TU using the pch doesn't work: >>>> >>>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>>> >>>> I like the idea of reparsing, but the current way is not working (or I haven't figured it out), and I am not sure whether using or cloning (save then create) the original TU would have any side effects. At this point, I think my current implementation is more certain, with the downside that we don't have all builtin types; but at this point, that's all we support anyway. >>>> >>>> Since this is to work around the current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough.
>>>> >>>> Cheers, >>>> Henry >>>> >>>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>>> >>>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >>>>> >>>>> That is, when you see a cursor with: >>>>> >>>>> _Atomic("....") >>>>> >>>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>>> >>>>> Maurizio >>>>> >>>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>>> Hi, >>>>>> >>>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>>>> >>>>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>>> >>>>>> Before that happens, we have our java clang binding trying do that work by: >>>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>>> - For an atomic type, we use the type string to get the underlying type. >>>>>> >>>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>> [2] https://reviews.llvm.org/D61716 >>>> >>> >> > From henry.jen at oracle.com Fri May 17 20:26:00 2019 From: henry.jen at oracle.com (Henry Jen) Date: Fri, 17 May 2019 13:26:00 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> Message-ID: <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> Hi, Please review another webrev[1] that uses a precompiled header to do the evaluation, as suggested by Maurizio. This patch makes one instance of Reparser that can be used by both Macro and Atomic types, and potentially other uses. We take the arguments from the original TU in the parser, mainly to make sure the precompiled header uses the same settings; this helps the C++ mode in StructTest.java, which otherwise fails because of the mismatch with the original TU (with clang option -x c++). This has been an issue all along, but MacroParser is currently implemented in a way that just shuts off in case of any failure, so we simply did not notice it. In this patch, I also refactor the clang API a bit to move the TranslationUnit functions to where they belong; those were in Index because we didn't expose TranslationUnit in the early days. Now that we have it, we should organize the functions accordingly. Another change worth noting is that Index.parse now throws ParsingFailedException if clang parsing fails in some way. This is why jextract was showing "WARNING: nothing to generate"
instead of erroring out when using a command line like 'jextract -C -include-pch -C test.h.gch client.h' Cheers, Henry > On May 15, 2019, at 8:50 AM, Henry Jen wrote: > > I figured this out: it failed with a different TU, so we need to catch the exception in a loop. > > Cheers, > Henry > >> On May 14, 2019, at 9:36 PM, Henry Jen wrote: >> >> BTW, I can get away with the incomplete type by adding the CXTranslationUnit_Incomplete option, but I'm not sure about typedefs: >> >> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >> >> Cheers, >> Henry >> >> >>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>> >>> Using the reparse mechanism as ClangParser seems a good idea, but there are complications with using a precompiled header. >>> >>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we create a new TU using the pch doesn't work: >>> >>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>> >>> I like the idea of reparsing, but the current way is not working (or I haven't figured it out), and I am not sure whether using or cloning (save then create) the original TU would have any side effects. At this point, I think my current implementation is more certain, with the downside that we don't have all builtin types; but at this point, that's all we support anyway. >>> >>> Since this is to work around the current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >>> >>> Cheers, >>> Henry >>> >>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>> >>>> I see what you are doing here.
I believe there's a cleaner way to get there, which is what we've done to process macros. >>>> >>>> That is, when you see a cursor with: >>>> >>>> _Atomic("....") >>>> >>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>> >>>> Maurizio >>>> >>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>> Hi, >>>>> >>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>>> >>>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>> >>>>> Before that happens, we have our java clang binding trying do that work by: >>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>> - For an atomic type, we use the type string to get the underlying type. >>>>> >>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>> [2] https://reviews.llvm.org/D61716 >>> >> > From maurizio.cimadamore at oracle.com Fri May 17 20:49:39 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 17 May 2019 21:49:39 +0100 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> Message-ID: <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> Hi Henry, thanks for taking another look at this. It's good that you managed to get it working, and I appreciate the effort to clean up the JNI code. There are things in the Java code that leave me a bit perplexed - for instance, the fact that code from MacroParser (eval method) was copied outside it, to Parser. And also the fact that we are using a register mechanism and stashing things into statics, which is bad practice. My understanding is that there should be only one TranslationUnit cursor generated by the jextract parser, right? So, in reality when you register a reparser, you will always call register() with the same TU, right? Which means the only things changing are the arguments... Now, in your current impl, note that you are doing nothing to make sure that, e.g., if there's already some reparser set with the given arguments, it is reused - so again, I'm confused as to what the role of the static cache is. I think what we want is to have a way to pass the reparser to the TypeDictionary, which is something that can be done by simply passing the TU to the TypeDictionary, after which the dictionary can create its own reparser to do the atomic reparsing thing.
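The shape suggested here, a type dictionary that is handed its reparser instead of looking one up in a static registry, could look roughly like the following. This is a minimal sketch under stated assumptions: Reparser, TypeDictionary.valueType and atomicInnards are hypothetical names for illustration rather than the jextract classes, types are represented by their spelling as plain strings, and the "T var;" snippet form mirrors the snippets in the error messages quoted in this thread (it would not cover array or function-pointer types).

```java
import java.util.HashMap;
import java.util.Map;

public class AtomicReparseSketch {

    /**
     * Hypothetical stand-in for the speculative clang evaluation: parses a
     * one-line snippet against the original TU's settings and returns the
     * spelling of the declared variable's type.
     */
    public interface Reparser {
        String typeOf(String snippet);
    }

    /** Hypothetical dictionary that owns its reparser, so no static state is needed. */
    public static final class TypeDictionary {
        private static final String ATOMIC_PREFIX = "_Atomic(";

        private final Reparser reparser;
        private final Map<String, String> resolved = new HashMap<>();

        public TypeDictionary(Reparser reparser) {
            this.reparser = reparser;
        }

        /** Returns the text inside _Atomic(...), or null if the spelling is not atomic. */
        static String atomicInnards(String spelling) {
            String s = spelling.trim();
            if (s.startsWith(ATOMIC_PREFIX) && s.endsWith(")")) {
                return s.substring(ATOMIC_PREFIX.length(), s.length() - 1).trim();
            }
            return null;
        }

        /** Sees through _Atomic(T) by reparsing "T var;"; other spellings are returned unchanged. */
        public String valueType(String spelling) {
            String inner = atomicInnards(spelling);
            if (inner == null) {
                return spelling;
            }
            // Each distinct inner type is reparsed at most once.
            return resolved.computeIfAbsent(inner, t -> reparser.typeOf(t + " var;"));
        }
    }

    public static void main(String[] args) {
        // Fake reparser for the demo: it only knows a single typedef.
        Reparser fake = snippet -> snippet.equals("uchar_t var;") ? "unsigned char" : "<unknown>";
        TypeDictionary dictionary = new TypeDictionary(fake);
        System.out.println(dictionary.valueType("_Atomic(uchar_t)"));
        System.out.println(dictionary.valueType("int"));
    }
}
```

Because the reparser arrives through the constructor, whoever owns the TU decides how it is built, and a test can substitute a fake as main does.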
In other words, I don't see the need for having a getValueType() method on the cursor itself - I think the capability of reparsing things and seeing through 'atomic' is a capability of the type dictionary and we should focus on how to get that capability there. Maurizio On 17/05/2019 21:26, Henry Jen wrote: > Oops, the link to the webrev > > http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/ > > Cheers, > Henry > > >> On May 17, 2019, at 1:26 PM, Henry Jen wrote: >> >> Hi, >> >> Please review another webrev[1] that uses a precompiled header to do the evaluation, as suggested by Maurizio. This patch makes one instance of Reparser that can be used by both Macro and Atomic types, and potentially other uses. >> >> We take the arguments from the original TU in the parser, mainly to make sure the precompiled header uses the same settings; this helps the C++ mode in StructTest.java, which otherwise fails because of the mismatch with the original TU (with clang option -x c++). >> >> This has been an issue all along, but MacroParser is currently implemented in a way that just shuts off in case of any failure, so we simply did not notice it. >> >> In this patch, I also refactor the clang API a bit to move the TranslationUnit functions to where they belong; those were in Index because we didn't expose TranslationUnit in the early days. Now that we have it, we should organize the functions accordingly. >> >> Another change worth noting is that Index.parse now throws ParsingFailedException if clang parsing fails in some way. This is why jextract was showing >> "WARNING: nothing to generate" instead of erroring out when using a command line like 'jextract -C -include-pch -C test.h.gch client.h' >> >> Cheers, >> Henry >> >>> On May 15, 2019, at 8:50 AM, Henry Jen wrote: >>> >>> I figured this out: it failed with a different TU, so we need to catch the exception in a loop.
>>> >>> Cheers, >>> Henry >>> >>>> On May 14, 2019, at 9:36 PM, Henry Jen wrote: >>>> >>>> BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, >>>> >>>> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >>>> >>>> Cheers, >>>> Henry >>>> >>>> >>>>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>>>> >>>>> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. >>>>> >>>>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >>>>> >>>>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>>>> >>>>> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. >>>>> >>>>> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>>>> >>>>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. 
>>>>>> >>>>>> That is, when you see a cursor with: >>>>>> >>>>>> _Atomic("....") >>>>>> >>>>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>>>> >>>>>> Maurizio >>>>>> >>>>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>>>>> >>>>>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>>>> >>>>>>> Before that happens, we have our java clang binding trying do that work by: >>>>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>>>> - For an atomic type, we use the type string to get the underlying type. >>>>>>> >>>>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. 
>>>>>>> >>>>>>> Cheers, >>>>>>> Henry >>>>>>> >>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>> [2] https://reviews.llvm.org/D61716 From henry.jen at oracle.com Fri May 17 21:44:38 2019 From: henry.jen at oracle.com (Henry Jen) Date: Fri, 17 May 2019 14:44:38 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> Message-ID: I explained this in the first email: LayoutUtil runs in a static context and we don't have a way to get the only instance of jextract. Also, I did this in libclang so that the code in jextract can stay the same once libclang is fixed; all we need to do then is remove ClangUtils, and we can put the Reparser mechanism back into jextract. We can choose to keep things in jextract, but that means we need to be able to give LayoutUtil access to THE jextract instance. Let me know if you have better ideas. Cheers, Henry > On May 17, 2019, at 1:49 PM, Maurizio Cimadamore wrote: > > Hi Henry, > thanks for taking another look at this. > > It's good that you managed to get it working, and I appreciate the effort to clean up the JNI code. > > There are things in the Java code that leave me a bit perplexed - for instance, the fact that code from MacroParser (eval method) was copied outside it, to Parser. > > And also the fact that we are using a register mechanism and stashing things into statics, which is bad practice. > > My understanding is that there should be only one TranslationUnit cursor generated by the jextract parser, right? > > So, in reality when you register a reparser, you will always call register() with the same TU, right?
> > Which means the only thing changing are the arguments... > > Now, in your current impl, note that you are doing nothing in order e.g. to make sure that if there's already some reparser set with given arguments, that is reused - so again, I'm confused as to what is the role of the static cache. > > I think what we want is to have a way to pass the reparser to the TypeDictionary, which is something that can be done by simply passing the TU to the TypeDIctionary, after which the dictionary can create its own reparser to do the atomic reparsing thing. > > In other words, I don't see the need for having a getValueType() method on the cursor itself - I think the capability of reparsing things and seeing through 'atomic' is a capability of the type dictionary and we should focus on how to get that capability there. > > Maurizio > > > On 17/05/2019 21:26, Henry Jen wrote: >> Oops, the link to web rev >> >> http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/ >> >> Cheers, >> Henry >> >> >>> On May 17, 2019, at 1:26 PM, Henry Jen wrote: >>> >>> Hi, >>> >>> Please review another webrev[1] that use precompile header to do evaluation as suggested by Maurizio, this patch make one instance of Reparser that can be used by both Macro and Atomic type and potentially other use. >>> >>> We took in arguments from the origin TU in parser, this is mainly to make sure the precompile header will be using same setting, this helps the c++ mode in StructTest.java which otherwise fail because the mismatch of original TU(with clang option -x c++). >>> >>> This has being an issue, but MacroParser currently implemented in a way that just shutoff in case of any failure and it we simply not notice the issue. >>> >>> In this patch, I also refactor clang API a bit to move TranslationUnit functions to where it belong, those were in Index because we didn?t expose TranslationUnit in early days, now we have it, we should organize the function accordingly. 
>>> >>> Another change worth noting, is that Index.parse now throw ParsingFailedException if clang parsing failed in some way. This is why jextract was showing >>> "WARNING: nothing to generate? instead of error out when use command line like 'jextract -C -include-pch -C test.h.gch client.h' >>> >>> Cheers, >>> Henry >>> >>>> On May 15, 2019, at 8:50 AM, Henry Jen wrote: >>>> >>>> I figured this out, it failed with a different TU, need to catch the exception in a loop. >>>> >>>> Cheers, >>>> Henry >>>> >>>>> On May 14, 2019, at 9:36 PM, Henry Jen wrote: >>>>> >>>>> BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, >>>>> >>>>> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> >>>>>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>>>>> >>>>>> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. >>>>>> >>>>>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >>>>>> >>>>>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>>>>> >>>>>> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. 
>>>>>> >>>>>> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>>>>> >>>>>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. >>>>>>> >>>>>>> That is, when you see a cursor with: >>>>>>> >>>>>>> _Atomic("....") >>>>>>> >>>>>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>>>>> >>>>>>> Maurizio >>>>>>> >>>>>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>>>>>> >>>>>>>> A proper solution would be have libclang expose the atomic types, so we don?t need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>>>>> >>>>>>>> Before that happens, we have our java clang binding trying do that work by: >>>>>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>>>>> - For an atomic type, we use the type string to get the underlying type. 
>>>>>>>> >>>>>>>> Note that an atomic type is always a builtin type without a valid declaration cursor, so we cannot get the translation unit the type is actually defined in, so we are doing a global search. Since jextract will only create one translation unit, the hack should work just fine without much pollution. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Henry >>>>>>>> >>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>>> [2] https://reviews.llvm.org/D61716 From maurizio.cimadamore at oracle.com Sat May 18 00:05:54 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Sat, 18 May 2019 01:05:54 +0100 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> Message-ID: <94389cdf-eb48-82c5-9091-34b66589855f@oracle.com> Ah - I missed the dependency from LayoutUtils. That said, the way the code has been moved from MacroParser to Parser still looks suspicious. I'll try to play a bit with this to see if I can find a way to organize the various bits. Thanks Maurizio On 17/05/2019 22:44, Henry Jen wrote: > I explained this in the first email: LayoutUtil runs in a static context and we don't have a way to get the only instance of jextract. > > Also, I did this in libclang so that the code in jextract can stay the same once libclang is fixed; all we need to do then is remove ClangUtils, and we can put the Reparser mechanism back into jextract. > > We can choose to keep things in jextract, but that means we need to be able to give LayoutUtil access to THE jextract instance. Let me know if you have better ideas.
> > Cheers, > Henry > > >> On May 17, 2019, at 1:49 PM, Maurizio Cimadamore wrote: >> >> Hi Henry, >> thanks for taking another look at this. >> >> It's good that you managed to get it working, and I appreciate the effort to clean up the JNI code. >> >> There are things in the Java code that leave me a bit perplexed - for instance, the fact that code from MacroParser (eval method) was copied outside it, to Parser. >> >> And also the fact that we are using a register mechanism and stashing things into statics, which is bad practice. >> >> My understanding is that, there should be only one TranslationUnit cursor generated by jextract parser, right? >> >> So, in reality when you register a reparser, you will always call register() with the same TU, right? >> >> Which means the only thing changing are the arguments... >> >> Now, in your current impl, note that you are doing nothing in order e.g. to make sure that if there's already some reparser set with given arguments, that is reused - so again, I'm confused as to what is the role of the static cache. >> >> I think what we want is to have a way to pass the reparser to the TypeDictionary, which is something that can be done by simply passing the TU to the TypeDIctionary, after which the dictionary can create its own reparser to do the atomic reparsing thing. >> >> In other words, I don't see the need for having a getValueType() method on the cursor itself - I think the capability of reparsing things and seeing through 'atomic' is a capability of the type dictionary and we should focus on how to get that capability there. 
>> >> Maurizio >> >> >> On 17/05/2019 21:26, Henry Jen wrote: >>> Oops, the link to web rev >>> >>> http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/ >>> >>> Cheers, >>> Henry >>> >>> >>>> On May 17, 2019, at 1:26 PM, Henry Jen wrote: >>>> >>>> Hi, >>>> >>>> Please review another webrev[1] that use precompile header to do evaluation as suggested by Maurizio, this patch make one instance of Reparser that can be used by both Macro and Atomic type and potentially other use. >>>> >>>> We took in arguments from the origin TU in parser, this is mainly to make sure the precompile header will be using same setting, this helps the c++ mode in StructTest.java which otherwise fail because the mismatch of original TU(with clang option -x c++). >>>> >>>> This has being an issue, but MacroParser currently implemented in a way that just shutoff in case of any failure and it we simply not notice the issue. >>>> >>>> In this patch, I also refactor clang API a bit to move TranslationUnit functions to where it belong, those were in Index because we didn?t expose TranslationUnit in early days, now we have it, we should organize the function accordingly. >>>> >>>> Another change worth noting, is that Index.parse now throw ParsingFailedException if clang parsing failed in some way. This is why jextract was showing >>>> "WARNING: nothing to generate? instead of error out when use command line like 'jextract -C -include-pch -C test.h.gch client.h' >>>> >>>> Cheers, >>>> Henry >>>> >>>>> On May 15, 2019, at 8:50 AM, Henry Jen wrote: >>>>> >>>>> I figured this out, it failed with a different TU, need to catch the exception in a loop. 
>>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>>> On May 14, 2019, at 9:36 PM, Henry Jen wrote: >>>>>> >>>>>> BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef, >>>>>> >>>>>> java.lang.IllegalArgumentException: Error with snippet: uchar_t var; >>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t' >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> >>>>>>> On May 14, 2019, at 9:10 PM, Henry Jen wrote: >>>>>>> >>>>>>> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header. >>>>>>> >>>>>>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn?t work: >>>>>>> >>>>>>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var; >>>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed >>>>>>> >>>>>>> I like the idea of reparse, but current way is not working(or I haven?t figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don?t have all builtin types, but at this point, that?s all we supported anyway. >>>>>>> >>>>>>> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough. >>>>>>> >>>>>>> Cheers, >>>>>>> Henry >>>>>>> >>>>>>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote: >>>>>>>> >>>>>>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros. 
>>>>>>>> >>>>>>>> That is, when you see a cursor with: >>>>>>>> >>>>>>>> _Atomic("....") >>>>>>>> >>>>>>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support. >>>>>>>> >>>>>>>> Maurizio >>>>>>>> >>>>>>>> On 14/05/2019 02:58, Henry Jen wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type. >>>>>>>>> >>>>>>>>> A proper solution would be have libclang expose the atomic types, so we don't need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release. >>>>>>>>> >>>>>>>>> Before that happens, we have our java clang binding trying do that work by: >>>>>>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types. >>>>>>>>> - During the cursor-traversing, we will add in extra types declared in the header files >>>>>>>>> - For an atomic type, we use the type string to get the underlying type. >>>>>>>>> >>>>>>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution.
>>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Henry >>>>>>>>> >>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>>>> [2] https://reviews.llvm.org/D61716 From henry.jen at oracle.com Sat May 18 00:06:20 2019 From: henry.jen at oracle.com (Henry Jen) Date: Fri, 17 May 2019 17:06:20 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> Message-ID: <1BD46646-9007-4640-AEE4-96908B70BBB1@oracle.com> Theoretically, we get a Type from a Cursor from a HeaderFile, and I checked the usages of LayoutUtil; it seems the Cursor is available from the callers, from which we can track back to the TU. That would allow us to get rid of the static registry and have a more targeted search in TypeDictionary. In other words, we can go the extra mile and refactor the uses of LayoutUtil to take a Cursor rather than a Type (which loses information), and if we don't need that workaround to stay in the clang binding, we can eliminate this distasteful part. Cheers, Henry > On May 17, 2019, at 2:44 PM, Henry Jen wrote: > > I explained this in the first email, LayoutUtil in static context and we don't have a way to get the only instance of jextract. > > Also I did this in libclang so that changed in jextract can be same once the libclang is fixed, all we need to do is remove ClangUtils and then we can put back the Reparser mechanism back into jextract. > > We can choose to keep things in jextract, but that means we need to be able to get LayoutUtil access to THE jextract instance. Let me know if you have better ideas. > > Cheers, > Henry > > >> On May 17, 2019, at 1:49 PM, Maurizio Cimadamore wrote: >> >> Hi Henry, >> thanks for taking another look at this.
>> >> It's good that you managed to get it working, and I appreciate the effort to clean up the JNI code. >> >> There are things in the Java code that leave me a bit perplexed - for instance, the fact that code from MacroParser (eval method) was copied outside it, to Parser. >> >> And also the fact that we are using a register mechanism and stashing things into statics, which is bad practice. >> >> My understanding is that, there should be only one TranslationUnit cursor generated by jextract parser, right? >> >> So, in reality when you register a reparser, you will always call register() with the same TU, right? >> >> Which means the only thing changing are the arguments... >> >> Now, in your current impl, note that you are doing nothing in order e.g. to make sure that if there's already some reparser set with given arguments, that is reused - so again, I'm confused as to what is the role of the static cache. >> >> I think what we want is to have a way to pass the reparser to the TypeDictionary, which is something that can be done by simply passing the TU to the TypeDIctionary, after which the dictionary can create its own reparser to do the atomic reparsing thing. >> >> In other words, I don't see the need for having a getValueType() method on the cursor itself - I think the capability of reparsing things and seeing through 'atomic' is a capability of the type dictionary and we should focus on how to get that capability there. >> >> Maurizio >> >> >> On 17/05/2019 21:26, Henry Jen wrote: >>> Oops, the link to web rev >>> >>> http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/ >>> >>> Cheers, >>> Henry >>> >>> >>>> On May 17, 2019, at 1:26 PM, Henry Jen wrote: >>>> >>>> Hi, >>>> >>>> Please review another webrev[1] that use precompile header to do evaluation as suggested by Maurizio, this patch make one instance of Reparser that can be used by both Macro and Atomic type and potentially other use. 
>>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Henry >>>>>>>>> >>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>>>> [2] https://reviews.llvm.org/D61716 > From henry.jen at oracle.com Sat May 18 00:08:36 2019 From: henry.jen at oracle.com (Henry Jen) Date: Fri, 17 May 2019 17:08:36 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: <94389cdf-eb48-82c5-9091-34b66589855f@oracle.com> References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> <94389cdf-eb48-82c5-9091-34b66589855f@oracle.com> Message-ID: > On May 17, 2019, at 5:05 PM, Maurizio Cimadamore wrote: > > Ah - I missed the dependency from LayoutUtils. That said, the way the code has been moved from MacroParser to Parser still looks suspicious. > That moves the Reparser implementation from MacroParser to Parser, as the Reparser implementation now lives in ClangUtils. > I'll try to play a bit with this to see how if I can find a path to organize the various bits. > > Thanks > Maurizio > > On 17/05/2019 22:44, Henry Jen wrote: >> I explained this in the first email, LayoutUtil in static context and we don't have a way to get the only instance of jextract. >> >> Also I did this in libclang so that changed in jextract can be same once the libclang is fixed, all we need to do is remove ClangUtils and then we can put back the Reparser mechanism back into jextract. >> >> We can choose to keep things in jextract, but that means we need to be able to get LayoutUtil access to THE jextract instance. Let me know if you have better ideas. >> >> Cheers, >> Henry >> >> >>> On May 17, 2019, at 1:49 PM, Maurizio Cimadamore wrote: >>> >>> Hi Henry, >>> thanks for taking another look at this.
>>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Henry >>>>>>>>>> >>>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>>>>> [2] https://reviews.llvm.org/D61716 From maurizio.cimadamore at oracle.com Sat May 18 00:12:24 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Sat, 18 May 2019 01:12:24 +0100 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> <94389cdf-eb48-82c5-9091-34b66589855f@oracle.com> Message-ID: On 18/05/2019 01:08, Henry Jen wrote: > >> On May 17, 2019, at 5:05 PM, Maurizio Cimadamore wrote: >> >> Ah - I missed the dependency from LayoutUtils. That said, the way the code has been moved from MacroParser to Parser still looks suspicious. >> > That move Reparser implementation from MacroParser to Parser, as the Reparser implementation is now lives in ClangUtils. But surely it seems like all the code that is now in the Parser could be in the constructor of MacroParser - since that's where it belongs? Maurizio > >> I'll try to play a bit with this to see how if I can find a path to organize the various bits. >> >> Thanks >> Maurizio >> >> On 17/05/2019 22:44, Henry Jen wrote: >>> I explained this in the first email, LayoutUtil in static context and we don't have a way to get the only instance of jextract. >>> >>> Also I did this in libclang so that changed in jextract can be same once the libclang is fixed, all we need to do is remove ClangUtils and then we can put back the Reparser mechanism back into jextract. >>> >>> We can choose to keep things in jextract, but that means we need to be able to get LayoutUtil access to THE jextract instance.
>>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> Henry >>>>>>>>>>> >>>>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>>>>>> [2] https://reviews.llvm.org/D61716 From henry.jen at oracle.com Sat May 18 00:59:30 2019 From: henry.jen at oracle.com (Henry Jen) Date: Fri, 17 May 2019 17:59:30 -0700 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> <94389cdf-eb48-82c5-9091-34b66589855f@oracle.com> Message-ID: It could, but then I would need two constructors (one for the exception case, where we don't have a ClangReparser), so I think it may be simpler to just inject the Reparser from here. Cheers, Henry > On May 17, 2019, at 5:12 PM, Maurizio Cimadamore wrote: > > > On 18/05/2019 01:08, Henry Jen wrote: >> >>> On May 17, 2019, at 5:05 PM, Maurizio Cimadamore wrote: >>> >>> Ah - I missed the dependency from LayoutUtils. That said, the way the code has been moved from MacroParser to Parser still looks suspicious. >>> >> That move Reparser implementation from MacroParser to Parser, as the Reparser implementation is now lives in ClangUtils. > > Bur surely it seems like all the code that now is in the Parser could be in the constructor of MacroParser - since that's where it belongs to? > > Maurizio > >> >>> I'll try to play a bit with this to see how if I can find a path to organize the various bits. >>> >>> Thanks >>> Maurizio >>> >>> On 17/05/2019 22:44, Henry Jen wrote: >>>> I explained this in the first email, LayoutUtil in static context and we don't have a way to get the only instance of jextract.
>>>>>>>>>>>> >>>>>>>>>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution. >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> Henry >>>>>>>>>>>> >>>>>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ >>>>>>>>>>>> [2] https://reviews.llvm.org/D61716 From samuel.audet at gmail.com Sat May 18 10:04:36 2019 From: samuel.audet at gmail.com (Samuel Audet) Date: Sat, 18 May 2019 19:04:36 +0900 Subject: [foreign] Poor performance? In-Reply-To: <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> Message-ID: <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> If I understand correctly memory allocation is the culprit? Is there a way to call something like malloc() with Panama and still be able to map it to layouts and/or cast it to whatever we want? Calling malloc() with JavaCPP won't do anything w.r.t to deallocation, scopes, cleaners, etc, but it's available as an option, because sometimes users need it! If Panama doesn't allow them to use raw pointers with layouts without going through hoops, that's a usability problem, and if those issues are not ironed out eventually, people will be forced to keep using JNI. Samuel On 5/18/19 12:33 AM, Maurizio Cimadamore wrote: > Thanks Jorn, > I'd be more interested in knowing the raw native call numbers, does it > get any better with linkToNative? Here I'd be expecting performances > identical to JNI (since the binder should lower the Pointer to a long, > which LinkToNative would then pass by register). > > As for the fuller benchmark, note that you are also measuring the > performances of Scope::allocate, which is internally using some maps. 
> JNR/JNI does not do the same liveliness checks that we do, so the full > benchmark is not totally fair. But the arw performance of the downcall > should be an apple-to-apple comparison, and it shouldn't be 8x slower as > it is now (at least not with linkToNative). > > Maurizio > > > On 17/05/2019 16:14, Jorn Vernee wrote: > >> FWIW, I ran the benchmarks with the linkToNative back-end (using >> -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still >> 2x slower than JNI: >> >> Benchmark?????????????????????????????????? Mode? Cnt Score???? Error >> Units >> JmhGetSystemTimeSeconds.jni_javacpp???????? avgt?? 50?? 298.046 ? 15.744? ns/op >> JmhGetSystemTimeSeconds.panama_prelayout??? avgt?? 50?? 596.567 ? 20.570? ns/op >> >> Of course, like Aleksey says: "The numbers [above] are just data. To >> gain reusable insights, you need to follow up on why the numbers are >> the way they are.". Unfortunately, I'm having some trouble getting the >> project to work with the Windows profiler :/ Was currently looking >> into that. >> >> Cheers, >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-17 16:51: >>> On 17/05/2019 11:26, Maurizio Cimadamore wrote: >>>> thanks you for bringing this up, I saw this benchmark few days ago >>>> and I took a look at it. That benchmark is unfortunately hitting on >>>> a couple of (transitory!) pain points: (1) it is running on Windows, >>>> which lacks the optimizations available for MacOS and Linux >>>> (directInvoker). When the linkToNative effort will be completed, >>>> this discrepancy between platforms will go away. The second problem >>>> (2) is that the call is passing a big struct (e.g. bigger than 64 >>>> bits). Even on Linux and Mac, such a call would be unable to take >>>> advantage of the optimized invoker and would fall back to the so >>>> called 'universal invoker' which is slow. 
>>>
>>> Actually, my bad, the bench is passing pointer to structs, not structs
>>> by value - which I think should mean the 'foreign+linkToNative'
>>> experimental branch should be able to handle this. Would be nice to
>>> get some confirmation that this is indeed the case.
>>>
>>> Maurizio

From jbvernee at xs4all.nl Sat May 18 16:42:53 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Sat, 18 May 2019 18:42:53 +0200
Subject: [foreign] Poor performance?
In-Reply-To: <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com>
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com>
Message-ID: <59977a02ae9333a55e737ca07659b5a8@xs4all.nl>

Users could bind `malloc` and `free` and use those instead. But, allocation really isn't the problem here... The allocation using Scope in the current link2native implementation is actually really fast, since it allocates a slab of 64KB when creating a Scope (which is reused between benchmark calls), and the actual allocations per benchmark call are just pointer bumps [1] (until a new slab needs to be allocated).

I re-ran the benchmark with malloc as well, but switching to malloc degrades performance (on Windows). What does significantly improve performance is removing the call to get the field:

Benchmark                                               Mode  Cnt     Score     Error  Units
JmhCallOnly.jni_javacpp                                 avgt   50    64.804 ±   2.588  ns/op
JmhCallOnly.jni_javacpp_getonly                         avgt   50    45.543 ±   1.876  ns/op
JmhCallOnly.panama                                      avgt   50    38.244 ±   1.496  ns/op
JmhCallOnly.panama_getonly                              avgt   50   530.956 ±  29.321  ns/op
JmhGetSystemTimeSeconds.jni_javacpp                     avgt   50   309.768 ±  14.556  ns/op
JmhGetSystemTimeSeconds.jni_javacpp_noget               avgt   50   243.865 ±  10.380  ns/op
JmhGetSystemTimeSeconds.panama                          avgt   50  4769.212 ± 273.042  ns/op
JmhGetSystemTimeSeconds.panama_prelayout                avgt   50   608.144 ±  26.004  ns/op
JmhGetSystemTimeSeconds.panama_prelayout_malloc         avgt   50   711.237 ±  33.311  ns/op
JmhGetSystemTimeSeconds.panama_prelayout_malloc_noget   avgt   50   104.144 ±   4.195  ns/op
JmhGetSystemTimeSeconds.panama_prelayout_noget          avgt   50    64.545 ±   3.848  ns/op

Note in particular the `JmhCallOnly.panama_getonly` results compared to `JmhGetSystemTimeSeconds.panama_prelayout_noget`. The relevant code for `JmhCallOnly.panama_getonly` is just:

    private static final Scope scope = kernel32_h.scope().fork();
    private static final LayoutType<_SYSTEMTIME> systemtimeLayout = LayoutType.ofStruct(_SYSTEMTIME.class);
    private Pointer<_SYSTEMTIME> preallocatedSystemTime;

    public PanamaBenchmark() {
        preallocatedSystemTime = scope.allocate(systemtimeLayout);
    }

    public short getOnly() { // JmhCallOnly.panama_getonly
        return preallocatedSystemTime.get().wSecond$get();
    }

So the real bottleneck seems to be the field get. But, let's investigate further, since we're doing both a get of the struct object, and then a get of the field. Let's split this into a get of the struct, and a get of the field on a pre-computed _SYSTEMTIME object:

    private static final Scope scope = kernel32_h.scope().fork();
    private static final LayoutType<_SYSTEMTIME> systemtimeLayout = LayoutType.ofStruct(_SYSTEMTIME.class);
    private Pointer<_SYSTEMTIME> preallocatedSystemTime;
    private _SYSTEMTIME struct;

    public PanamaBenchmark() {
        preallocatedSystemTime = scope.allocate(systemtimeLayout);
        struct = preallocatedSystemTime.get();
    }

    public short getOnly() { // JmhCallOnly.panama_getonly
        return preallocatedSystemTime.get().wSecond$get();
    }

    public short getOnlyFieldDirect() { // JmhCallOnly.panama_getfield_only
        return struct.wSecond$get();
    }

    public Object getStructOnly() { // JmhCallOnly.panama_getstruct_only
        return preallocatedSystemTime.get();
    }

Benchmark                          Mode  Cnt    Score    Error  Units
JmhCallOnly.jni_javacpp_getonly    avgt   50   48.642 ±  1.917  ns/op
JmhCallOnly.panama_getfield_only   avgt   50   93.360 ± 13.364  ns/op
JmhCallOnly.panama_getonly         avgt   50  533.978 ± 24.249  ns/op
JmhCallOnly.panama_getstruct_only  avgt   50  377.114 ± 19.884  ns/op

So part of the performance loss goes to getting the field, which creates a bunch of intermediate Pointer objects (see RuntimeSupport::CasterImpl) [2]. I think the new memaccess API could really help there, since we can pre-compute a VarHandle for the field, and shouldn't need any of these intermediate pointer objects.

But, by far the largest part of the time seems to go to creating the _SYSTEMTIME object when calling get() on the `Pointer<_SYSTEMTIME>`, which corresponds to References.OfStruct::get [3]:

    static Struct get(Pointer pointer) {
        ((BoundedPointer)pointer).checkAlive();
        Class carrier = ((LayoutTypeImpl)pointer.type()).carrier();
        Class structClass = LibrariesHelper.getStructImplClass(carrier);
        try {
            return (Struct)structClass.getConstructor(Pointer.class).newInstance(pointer);
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
    }

I once had the idea to try and see what specialization of this code on a per-Struct-class basis could do for performance. Maybe now is a good time to try it out ;)

> If Panama doesn't allow them to use raw pointers with
> layouts without going through hoops, that's a usability problem, and
> if those issues are not ironed out eventually, people will be forced
> to keep using JNI.

I think we are far from being set-in-stone, at least as far as the high-level API goes. I agree that there should be DIY options for doing things. I think the current solution being investigated is to have multiple levels of public API (with memaccess, FFI, and then Panama), instead of just one high-level API that everyone uses. So users would have the option to use the low-level APIs to build their own solution from scratch.
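
To make the per-struct-class specialization idea above a bit more concrete, here is a minimal sketch of one possible shape for it: resolve the constructor once per class, cache it as a MethodHandle in a ClassValue, and reuse it on every get. This is only an illustration under stated assumptions, not the actual Panama code or any patch from this thread. `StructFactoryCache`, `FakePointer` and `SystemTimeImpl` are hypothetical stand-ins for the binder, `Pointer` and a generated struct implementation class. It also sidesteps the access-check question, since the stand-in classes are public.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class StructFactoryCache {

    /** Hypothetical stand-in for Panama's Pointer; it only carries an address. */
    public static final class FakePointer {
        final long address;
        public FakePointer(long address) { this.address = address; }
    }

    /** Hypothetical stand-in for a generated struct implementation class. */
    public static final class SystemTimeImpl {
        final FakePointer ptr;
        public SystemTimeImpl(FakePointer ptr) { this.ptr = ptr; }
        public long address() { return ptr.address; }
    }

    // One constructor handle per struct class: looked up on first use, then reused.
    private static final ClassValue<MethodHandle> CONSTRUCTORS = new ClassValue<MethodHandle>() {
        @Override
        protected MethodHandle computeValue(Class<?> structClass) {
            try {
                MethodHandle ctor = MethodHandles.lookup().findConstructor(
                        structClass, MethodType.methodType(void.class, FakePointer.class));
                // Adapt to (FakePointer)Object so all cached handles share one call shape.
                return ctor.asType(MethodType.methodType(Object.class, FakePointer.class));
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException(ex);
            }
        }
    };

    /** Creates a struct view without repeating the reflective constructor lookup. */
    public static Object newStruct(Class<?> structClass, FakePointer pointer) {
        try {
            return (Object) CONSTRUCTORS.get(structClass).invokeExact(pointer);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        SystemTimeImpl s = (SystemTimeImpl) newStruct(SystemTimeImpl.class, new FakePointer(42L));
        System.out.println(s.address()); // prints 42
    }
}
```

The design choice here is simply to move the lookup cost out of the hot path. Whether that is enough to match the numbers above would have to be measured.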
Jorn [1] : http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/ScopeImpl.java#l216 [2] : http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/RuntimeSupport.java#l60 [3] : http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/memory/References.java#l524 Samuel Audet schreef op 2019-05-18 12:04: > If I understand correctly memory allocation is the culprit? Is there a > way to call something like malloc() with Panama and still be able to > map it to layouts and/or cast it to whatever we want? Calling malloc() > with JavaCPP won't do anything w.r.t to deallocation, scopes, > cleaners, etc, but it's available as an option, because sometimes > users need it! If Panama doesn't allow them to use raw pointers with > layouts without going through hoops, that's a usability problem, and > if those issues are not ironed out eventually, people will be forced > to keep using JNI. > > Samuel > > On 5/18/19 12:33 AM, Maurizio Cimadamore wrote: >> Thanks Jorn, >> I'd be more interested in knowing the raw native call numbers, does it >> get any better with linkToNative? Here I'd be expecting performances >> identical to JNI (since the binder should lower the Pointer to a long, >> which LinkToNative would then pass by register). >> >> As for the fuller benchmark, note that you are also measuring the >> performances of Scope::allocate, which is internally using some maps. >> JNR/JNI does not do the same liveliness checks that we do, so the full >> benchmark is not totally fair. But the arw performance of the downcall >> should be an apple-to-apple comparison, and it shouldn't be 8x slower >> as it is now (at least not with linkToNative). 
>> >> Maurizio >> >> >> On 17/05/2019 16:14, Jorn Vernee wrote: >> >>> FWIW, I ran the benchmarks with the linkToNative back-end (using >>> -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still >>> 2x slower than JNI: >>> >>> Benchmark?????????????????????????????????? Mode? Cnt Score???? Error >>> Units >>> JmhGetSystemTimeSeconds.jni_javacpp???????? avgt?? 50?? 298.046 ? >>> 15.744? ns/op >>> JmhGetSystemTimeSeconds.panama_prelayout??? avgt?? 50?? 596.567 ? >>> 20.570? ns/op >>> >>> Of course, like Aleksey says: "The numbers [above] are just data. To >>> gain reusable insights, you need to follow up on why the numbers are >>> the way they are.". Unfortunately, I'm having some trouble getting >>> the project to work with the Windows profiler :/ Was currently >>> looking into that. >>> >>> Cheers, >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-17 16:51: >>>> On 17/05/2019 11:26, Maurizio Cimadamore wrote: >>>>> thanks you for bringing this up, I saw this benchmark few days ago >>>>> and I took a look at it. That benchmark is unfortunately hitting on >>>>> a couple of (transitory!) pain points: (1) it is running on >>>>> Windows, which lacks the optimizations available for MacOS and >>>>> Linux (directInvoker). When the linkToNative effort will be >>>>> completed, this discrepancy between platforms will go away. The >>>>> second problem (2) is that the call is passing a big struct (e.g. >>>>> bigger than 64 bits). Even on Linux and Mac, such a call would be >>>>> unable to take advantage of the optimized invoker and would fall >>>>> back to the so called 'universal invoker' which is slow. >>>> >>>> Actually, my bad, the bench is passing pointer to structs, not >>>> structs >>>> by value - which I think should mean the 'foreign+linkToNative' >>>> experimental branch should be able to handle this. Would be nice to >>>> get some confirmation that this is indeed the case. 
>>>>
>>>> Maurizio

From jbvernee at xs4all.nl Sun May 19 15:05:29 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Sun, 19 May 2019 17:05:29 +0200
Subject: [foreign] Poor performance?
In-Reply-To: <59977a02ae9333a55e737ca07659b5a8@xs4all.nl>
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl>
Message-ID: <4ee155fc6928785309b079f93218995c@xs4all.nl>

Some followup on this.

I've tested a patch that specializes the getter MethodHandle per struct class (rough [1]), and the good news is that this pretty much completely removes the overhead from the struct getter:

Benchmark                          Mode  Cnt    Score    Error  Units
JmhCallOnly.jni_javacpp_getonly    avgt   50   45.704 ±  1.448  ns/op
JmhCallOnly.panama_getfield_only   avgt   50   87.393 ±  7.810  ns/op
JmhCallOnly.panama_getonly         avgt   50  101.654 ± 13.549  ns/op
JmhCallOnly.panama_getstruct_only  avgt   50   13.036 ±  0.648  ns/op

Upon inspection, most of the time was spent on the reflective constructor lookup and call. Now the field access is the largest part of the time spent (which should go down with memaccess as well I think).

The bad news... since we (apparently) can't get a MethodHandle to the constructor of our Struct impl class without triggering an access violation (seemingly because it's a VMAC?), I had to disable the access checking to get these numbers.

But, intuitively, we should be able to get that MethodHandle without disabling the access checks, since it's fine to call newInstance reflectively in the current implementation as well, why wouldn't we be able to do the same through a MethodHandle? I'm not sure...

Jorn

[1] : http://cr.openjdk.java.net/~jvernee/panama/webrevs/getstruct/webrev.00/

Jorn Vernee wrote on 2019-05-18 18:42:
> Users could bind `malloc` and `free` and use those instead. But,
> allocation really isn't the problem here...
The allocation using Scope > in the current link2native implementation is actually really fast, > since it allocates a slab of 64KB when creating a Scope (which is > reused between benchmark calls), and the actual allocations per > benchmark call are just pointer bumps [1] (until a new slab needs to > be allocated). > > I re-ran the benchmark with malloc as well, but switching to malloc > degrades performance (on Windows). What does significantly improve > performance is removing the call to get the field: > > Benchmark Mode Cnt > Score Error Units > JmhCallOnly.jni_javacpp avgt 50 > 64.804 ? 2.588 ns/op > JmhCallOnly.jni_javacpp_getonly avgt 50 > 45.543 ? 1.876 ns/op > JmhCallOnly.panama avgt 50 > 38.244 ? 1.496 ns/op > JmhCallOnly.panama_getonly avgt 50 > 530.956 ? 29.321 ns/op > JmhGetSystemTimeSeconds.jni_javacpp avgt 50 > 309.768 ? 14.556 ns/op > JmhGetSystemTimeSeconds.jni_javacpp_noget avgt 50 > 243.865 ? 10.380 ns/op > JmhGetSystemTimeSeconds.panama avgt 50 > 4769.212 ? 273.042 ns/op > JmhGetSystemTimeSeconds.panama_prelayout avgt 50 > 608.144 ? 26.004 ns/op > JmhGetSystemTimeSeconds.panama_prelayout_malloc avgt 50 > 711.237 ? 33.311 ns/op > JmhGetSystemTimeSeconds.panama_prelayout_malloc_noget avgt 50 > 104.144 ? 4.195 ns/op > JmhGetSystemTimeSeconds.panama_prelayout_noget avgt 50 > 64.545 ? 3.848 ns/op > > Note in particular the `JmhCallOnly.panama_getonly` results compared > to `JmhGetSystemTimeSeconds.panama_prelayout_noget`. 
The relevant code > for `JmhCallOnly.panama_getonly` is just: > > private static final Scope scope = kernel32_h.scope().fork(); > private static final LayoutType<_SYSTEMTIME> systemtimeLayout = > LayoutType.ofStruct(_SYSTEMTIME.class); > private Pointer<_SYSTEMTIME> preallocatedSystemTime; > > public PanamaBenchmark() { > preallocatedSystemTime = scope.allocate(systemtimeLayout); > } > > public short getOnly() { // JmhCallOnly.panama_getonly > return preallocatedSystemTime.get().wSecond$get(); > } > > So the real bottleneck seems to be the field get. But, let's > investigate further, since we're doing both a get of the struct > object, and then a get of the field. Let's split this into a get of > the struct, and a get of a pre-computed _SYSTEMTIME object: > > private static final Scope scope = kernel32_h.scope().fork(); > private static final LayoutType<_SYSTEMTIME> systemtimeLayout = > LayoutType.ofStruct(_SYSTEMTIME.class); > private Pointer<_SYSTEMTIME> preallocatedSystemTime; > private _SYSTEMTIME struct; > > public PanamaBenchmark() { > preallocatedSystemTime = scope.allocate(systemtimeLayout); > struct = preallocatedSystemTime.get(); > } > > public short getOnly() { // JmhCallOnly.panama_getonly > return preallocatedSystemTime.get().wSecond$get(); > } > > public short getOnlyFieldDirect() { // > JmhCallOnly.panama_getfield_only > return struct.wSecond$get(); > } > > public Object getStructOnly() { // > JmhCallOnly.panama_getstruct_only > return preallocatedSystemTime.get(); > } > > Benchmark Mode Cnt Score Error Units > JmhCallOnly.jni_javacpp_getonly avgt 50 48.642 ? 1.917 ns/op > JmhCallOnly.panama_getfield_only avgt 50 93.360 ? 13.364 ns/op > JmhCallOnly.panama_getonly avgt 50 533.978 ? 24.249 ns/op > JmhCallOnly.panama_getstruct_only avgt 50 377.114 ? 19.884 ns/op > > So part of the performance loss goes to getting the field, which > creates a bunch of intermediate Pointer objects (see > RuntimeSupport::CasterImpl) [2]. 
I think the new memaccess API could > really help there, since we can pre-compute a VarHandle for the field, > and shouldn't need any of these intermediate pointer objects. > > But, by far the largest part of the time seems to go to creating the > _SYSTEMTIME object when calling get() on the `Pointer<_SYSTEMTIME>`, > which corresponds to References.OfStruct::get [3]: > > static Struct get(Pointer pointer) { > ((BoundedPointer)pointer).checkAlive(); > Class carrier = > ((LayoutTypeImpl)pointer.type()).carrier(); > Class structClass = > LibrariesHelper.getStructImplClass(carrier); > try { > return > (Struct)structClass.getConstructor(Pointer.class).newInstance(pointer); > } catch (ReflectiveOperationException ex) { > throw new IllegalStateException(ex); > } > } > > I once had the idea to try and see what specialization of this code on > a per-Struct-class basis cloud do for performance. Maybe now is a good > time to try it out ;) > >> If Panama doesn't allow them to use raw pointers with >> layouts without going through hoops, that's a usability problem, and >> if those issues are not ironed out eventually, people will be forced >> to keep using JNI. > > I think we are far from being set-in-stone, at least as far as the > high-level API goes. I agree that there should be DYI options for > doing things. I think the current solution being investigated is to > have multiple levels of public API (with memaccess, FFI, and then > Panama), instead of just one high-level API that everyone uses. So > users would have the option to use the low-level APIs to build their > own solution from scratch. 
> > Jorn > > [1] : > http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/ScopeImpl.java#l216 > [2] : > http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/RuntimeSupport.java#l60 > [3] : > http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/memory/References.java#l524 > > Samuel Audet schreef op 2019-05-18 12:04: >> If I understand correctly memory allocation is the culprit? Is there a >> way to call something like malloc() with Panama and still be able to >> map it to layouts and/or cast it to whatever we want? Calling malloc() >> with JavaCPP won't do anything w.r.t to deallocation, scopes, >> cleaners, etc, but it's available as an option, because sometimes >> users need it! If Panama doesn't allow them to use raw pointers with >> layouts without going through hoops, that's a usability problem, and >> if those issues are not ironed out eventually, people will be forced >> to keep using JNI. >> >> Samuel >> >> On 5/18/19 12:33 AM, Maurizio Cimadamore wrote: >>> Thanks Jorn, >>> I'd be more interested in knowing the raw native call numbers, does >>> it get any better with linkToNative? Here I'd be expecting >>> performances identical to JNI (since the binder should lower the >>> Pointer to a long, which LinkToNative would then pass by register). >>> >>> As for the fuller benchmark, note that you are also measuring the >>> performances of Scope::allocate, which is internally using some maps. >>> JNR/JNI does not do the same liveliness checks that we do, so the >>> full benchmark is not totally fair. But the arw performance of the >>> downcall should be an apple-to-apple comparison, and it shouldn't be >>> 8x slower as it is now (at least not with linkToNative). 
>>> >>> Maurizio >>> >>> >>> On 17/05/2019 16:14, Jorn Vernee wrote: >>> >>>> FWIW, I ran the benchmarks with the linkToNative back-end (using >>>> -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's >>>> still 2x slower than JNI: >>>> >>>> Benchmark?????????????????????????????????? Mode? Cnt Score???? >>>> Error Units >>>> JmhGetSystemTimeSeconds.jni_javacpp???????? avgt?? 50?? 298.046 ? >>>> 15.744? ns/op >>>> JmhGetSystemTimeSeconds.panama_prelayout??? avgt?? 50?? 596.567 ? >>>> 20.570? ns/op >>>> >>>> Of course, like Aleksey says: "The numbers [above] are just data. To >>>> gain reusable insights, you need to follow up on why the numbers are >>>> the way they are.". Unfortunately, I'm having some trouble getting >>>> the project to work with the Windows profiler :/ Was currently >>>> looking into that. >>>> >>>> Cheers, >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-17 16:51: >>>>> On 17/05/2019 11:26, Maurizio Cimadamore wrote: >>>>>> thanks you for bringing this up, I saw this benchmark few days ago >>>>>> and I took a look at it. That benchmark is unfortunately hitting >>>>>> on a couple of (transitory!) pain points: (1) it is running on >>>>>> Windows, which lacks the optimizations available for MacOS and >>>>>> Linux (directInvoker). When the linkToNative effort will be >>>>>> completed, this discrepancy between platforms will go away. The >>>>>> second problem (2) is that the call is passing a big struct (e.g. >>>>>> bigger than 64 bits). Even on Linux and Mac, such a call would be >>>>>> unable to take advantage of the optimized invoker and would fall >>>>>> back to the so called 'universal invoker' which is slow. >>>>> >>>>> Actually, my bad, the bench is passing pointer to structs, not >>>>> structs >>>>> by value - which I think should mean the 'foreign+linkToNative' >>>>> experimental branch should be able to handle this. Would be nice to >>>>> get some confirmation that this is indeed the case. 
>>>>>
>>>>> Maurizio

From ardikars at gmail.com Sun May 19 23:33:13 2019
From: ardikars at gmail.com (Ardika Rommy Sanjaya)
Date: Mon, 20 May 2019 06:33:13 +0700
Subject: Implement no copy memory
Message-ID:

Hi,

There is a native callback type in the libpcap library:

    typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *);

When I run jextract, it generates code like the following:

    @NativeHeader(...)
    public interface pcap_h {

        ...

        @NativeLocation(...)
        @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32")
        int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr);

        @FunctionalInterface
        @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v")
        interface pcap_handler {

            void fn(Pointer buf, Pointer pkthdr, Pointer usr) throws IllegalAccessException;

        }

        ...

    }

It looks like Panama copies the value from 'u_char *' into 'Pointer buf'; is that correct? I ask because buf.addr() returns 0 when I call it. What does zero mean?
How can I avoid the copy? Could the callback method just receive the address of the 'u_char *', like below?

    void fn(long memoryAddressOf_u_char_ptr, Pointer pkthdr, Pointer usr) throws IllegalAccessException;

Thanks and Regards,
Ardika Rommy Sanjaya

From ardikars at gmail.com Mon May 20 01:44:46 2019
From: ardikars at gmail.com (Ardika Rommy Sanjaya)
Date: Mon, 20 May 2019 08:44:46 +0700
Subject: Implement no copy memory
In-Reply-To:
References:
Message-ID:

Hi,

> It looks like Panama copies the value from 'u_char *' into 'Pointer buf'; is that correct? I ask because buf.addr() returns 0 when I call it. What does zero mean?
> How can I avoid the copy? Could the callback method just receive the address of the 'u_char *', like below?

The problem of buf.addr() returning 0 is fixed, but I now have another issue: buf.addr() always returns the same address.
Regards, Ardika Rommy Sanjaya --- > On 20 May 2019, at 06.33, Ardika Rommy Sanjaya wrote: > > Hi, > > There is native function in libpcap library: typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *); > When I use jextract it generate code like below: > > @NativeHeader(...) > public interface pcap_h { > > ... > > @NativeLocation(...) > @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32") > int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr); > > @FunctionalInterface > @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v") > interface pcap_handler { > > void fn(Pointer buf, Pointer pkthdr, Pointer usr) throws IllegalAccessException; > > } > > ... > > } > > It looks like pamana copy value from 'u_char *' into 'Pointer buf', correct or not? Because when I call buf.addr() it returns 0, what zero mean? > How I can do with no copy? Just returns the address of 'u_char *' like below in callback method? > > void fn(long memoryAddressOf_u_char_ptr, Pointer pkthdr, Pointer usr) throws IllegalAccessException; > > Thanks and Regards, > Ardika Rommy Sanjaya From brian.goetz at oracle.com Mon May 20 03:24:35 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 19 May 2019 23:24:35 -0400 Subject: [foreign] Poor performance? In-Reply-To: <4ee155fc6928785309b079f93218995c@xs4all.nl> References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl> Message-ID: <0a79fe37-d136-f3d7-4b42-968dfd69a836@oracle.com> > The bad news... since we (apparently) can't get a MethodHandle to the > constructor of our Struct impl class without triggering an access > violation (seemingly because it's a VMAC?), I had to disable the > access checking to get these numbers. 
There's work underway to replace VMACs with something more predictable, which might well make this problem go away.

From samuel.audet at gmail.com Mon May 20 06:13:30 2019
From: samuel.audet at gmail.com (Samuel Audet)
Date: Mon, 20 May 2019 15:13:30 +0900
Subject: [foreign] Poor performance?
In-Reply-To: <4ee155fc6928785309b079f93218995c@xs4all.nl>
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl>
Message-ID: <4c75fb8f-b502-76e1-62ff-af04ccf75db4@gmail.com>

Hi, Jorn,

Thanks for the followup!

I see, there's still optimization to be done for layouts. Though, as I previously pointed out, I'm concerned that relying on APIs like MethodHandle and VarHandle is going to prevent these optimizations from working with AOT compilers. Substrate VM already has something for "memaccess" and "FFI", so it would make sense to have a "high-level API" that works with those, and I'm guessing that Panama and Graal are going to work together to define that high-level API eventually, but it would be great to have clear direction about what the plan is.

Still, neither Panama nor SVM have started looking at mapping inline functions or C++ templates to something like custom JVM intrinsics. That's the kind of level I would really like for a team like Panama to focus on. Then it doesn't matter what the speed is for getters, setters, and whatever, we would just need to create inline functions and map everything that way. :)

Samuel

On 5/20/19 12:05 AM, Jorn Vernee wrote:
> Some followup on this.
>
> I've tested a patch that specializes the getter MethodHandle per struct
> class (rough [1]), and the good news is that this pretty much completely
> removes the overhead from the struct getter:
>
> Benchmark                          Mode  Cnt    Score    Error  Units
> JmhCallOnly.jni_javacpp_getonly    avgt   50   45.704 ±  1.448  ns/op
> JmhCallOnly.panama_getfield_only   avgt   50   87.393 ±  7.810  ns/op
> JmhCallOnly.panama_getonly         avgt   50  101.654 ± 13.549  ns/op
> JmhCallOnly.panama_getstruct_only  avgt   50   13.036 ±  0.648  ns/op
>
> Upon inspection, most of the time was spent on the reflective
> constructor lookup and call. Now the field access is the largest part of
> the time spent (which should go down with memaccess as well I think).
>
> The bad news... since we (apparently) can't get a MethodHandle to the
> constructor of our Struct impl class without triggering an access
> violation (seemingly because it's a VMAC?), I had to disable the access
> checking to get these numbers.
>
> But, intuitively, we should be able to get that MethodHandle without
> disabling the access checks, since it's fine to call newInstance
> reflectively in the current implementation as well, why wouldn't we be
> able to do the same through a MethodHandle? I'm not sure...
>
> Jorn
>
> [1] :
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/getstruct/webrev.00/
>
> Jorn Vernee wrote on 2019-05-18 18:42:
>> Users could bind `malloc` and `free` and use those instead. But,
>> allocation really isn't the problem here... The allocation using Scope
>> in the current link2native implementation is actually really fast,
>> since it allocates a slab of 64KB when creating a Scope (which is
>> reused between benchmark calls), and the actual allocations per
>> benchmark call are just pointer bumps [1] (until a new slab needs to
>> be allocated).
>>
>> I re-ran the benchmark with malloc as well, but switching to malloc
>> degrades performance (on Windows). What does significantly improve
>> performance is removing the call to get the field:
>>
>> Benchmark                                              Mode  Cnt     Score     Error  Units
>> JmhCallOnly.jni_javacpp                                avgt   50    64.804 ±   2.588  ns/op
>> JmhCallOnly.jni_javacpp_getonly                        avgt   50    45.543 ±   1.876  ns/op
>> JmhCallOnly.panama                                     avgt   50    38.244 ±   1.496  ns/op
>> JmhCallOnly.panama_getonly                             avgt   50   530.956 ±  29.321  ns/op
>> JmhGetSystemTimeSeconds.jni_javacpp                    avgt   50   309.768 ±  14.556  ns/op
>> JmhGetSystemTimeSeconds.jni_javacpp_noget              avgt   50   243.865 ±  10.380  ns/op
>> JmhGetSystemTimeSeconds.panama                         avgt   50  4769.212 ± 273.042  ns/op
>> JmhGetSystemTimeSeconds.panama_prelayout               avgt   50   608.144 ±  26.004  ns/op
>> JmhGetSystemTimeSeconds.panama_prelayout_malloc        avgt   50   711.237 ±  33.311  ns/op
>> JmhGetSystemTimeSeconds.panama_prelayout_malloc_noget  avgt   50   104.144 ±   4.195  ns/op
>> JmhGetSystemTimeSeconds.panama_prelayout_noget         avgt   50    64.545 ±   3.848  ns/op
>>
>> Note in particular the `JmhCallOnly.panama_getonly` results compared
>> to `JmhGetSystemTimeSeconds.panama_prelayout_noget`. The relevant code
>> for `JmhCallOnly.panama_getonly` is just:
>>
>>     private static final Scope scope = kernel32_h.scope().fork();
>>     private static final LayoutType<_SYSTEMTIME> systemtimeLayout = LayoutType.ofStruct(_SYSTEMTIME.class);
>>     private Pointer<_SYSTEMTIME> preallocatedSystemTime;
>>
>>     public PanamaBenchmark() {
>>         preallocatedSystemTime = scope.allocate(systemtimeLayout);
>>     }
>>
>>     public short getOnly() { // JmhCallOnly.panama_getonly
>>         return preallocatedSystemTime.get().wSecond$get();
>>     }
>>
>> So the real bottleneck seems to be the field get. But, let's
>> investigate further, since we're doing both a get of the struct
>> object, and then a get of the field. Let's split this into a get of
>> the struct, and a get of a pre-computed _SYSTEMTIME object:
>>
>>     private static final Scope scope = kernel32_h.scope().fork();
>>     private static final LayoutType<_SYSTEMTIME> systemtimeLayout = LayoutType.ofStruct(_SYSTEMTIME.class);
>>     private Pointer<_SYSTEMTIME> preallocatedSystemTime;
>>     private _SYSTEMTIME struct;
>>
>>     public PanamaBenchmark() {
>>         preallocatedSystemTime = scope.allocate(systemtimeLayout);
>>         struct = preallocatedSystemTime.get();
>>     }
>>
>>     public short getOnly() { // JmhCallOnly.panama_getonly
>>         return preallocatedSystemTime.get().wSecond$get();
>>     }
>>
>>     public short getOnlyFieldDirect() { // JmhCallOnly.panama_getfield_only
>>         return struct.wSecond$get();
>>     }
>>
>>     public Object getStructOnly() { // JmhCallOnly.panama_getstruct_only
>>         return preallocatedSystemTime.get();
>>     }
>>
>> Benchmark                          Mode  Cnt    Score    Error  Units
>> JmhCallOnly.jni_javacpp_getonly    avgt   50   48.642 ±  1.917  ns/op
>> JmhCallOnly.panama_getfield_only   avgt   50   93.360 ± 13.364  ns/op
>> JmhCallOnly.panama_getonly         avgt   50  533.978 ± 24.249  ns/op
>> JmhCallOnly.panama_getstruct_only  avgt   50  377.114 ± 19.884  ns/op
>>
>> So part of the performance loss goes to getting the field, which
>> creates a bunch of intermediate Pointer objects (see
>> RuntimeSupport::CasterImpl) [2]. I think the new memaccess API could
>> really help there, since we can pre-compute a VarHandle for the field,
>> and shouldn't need any of these intermediate pointer objects.
>>
>> But, by far the largest part of the time seems to go to creating the
>> _SYSTEMTIME object when calling get() on the `Pointer<_SYSTEMTIME>`,
>> which corresponds to References.OfStruct::get [3]:
>>
>>     static Struct get(Pointer pointer) {
>>         ((BoundedPointer)pointer).checkAlive();
>>         Class carrier = ((LayoutTypeImpl)pointer.type()).carrier();
>>         Class structClass = LibrariesHelper.getStructImplClass(carrier);
>>         try {
>>             return (Struct)structClass.getConstructor(Pointer.class).newInstance(pointer);
>>
>>         } catch (ReflectiveOperationException ex) {
>>             throw new IllegalStateException(ex);
>>         }
>>     }
>>
>> I once had the idea to try and see what specialization of this code on
>> a per-Struct-class basis could do for performance. Maybe now is a good
>> time to try it out ;)
>>
>>> If Panama doesn't allow them to use raw pointers with
>>> layouts without going through hoops, that's a usability problem, and
>>> if those issues are not ironed out eventually, people will be forced
>>> to keep using JNI.
>>
>> I think we are far from being set-in-stone, at least as far as the
>> high-level API goes. I agree that there should be DIY options for
>> doing things. I think the current solution being investigated is to
>> have multiple levels of public API (with memaccess, FFI, and then
>> Panama), instead of just one high-level API that everyone uses. So
>> users would have the option to use the low-level APIs to build their
>> own solution from scratch.
>>
>> Jorn
>>
>> [1] : http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/ScopeImpl.java#l216
>> [2] : http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/RuntimeSupport.java#l60
>> [3] : http://hg.openjdk.java.net/panama/dev/file/cef8136ee7ee/src/java.base/share/classes/jdk/internal/foreign/memory/References.java#l524
>>
>>
>> Samuel Audet wrote on 2019-05-18 12:04:
>>> If I understand correctly memory allocation is the culprit? Is there a
>>> way to call something like malloc() with Panama and still be able to
>>> map it to layouts and/or cast it to whatever we want? Calling malloc()
>>> with JavaCPP won't do anything w.r.t. deallocation, scopes,
>>> cleaners, etc, but it's available as an option, because sometimes
>>> users need it!
>>> If Panama doesn't allow them to use raw pointers with
>>> layouts without going through hoops, that's a usability problem, and
>>> if those issues are not ironed out eventually, people will be forced
>>> to keep using JNI.
>>>
>>> Samuel
>>>
>>> On 5/18/19 12:33 AM, Maurizio Cimadamore wrote:
>>>> Thanks Jorn,
>>>> I'd be more interested in knowing the raw native call numbers, does
>>>> it get any better with linkToNative? Here I'd be expecting
>>>> performance identical to JNI (since the binder should lower the
>>>> Pointer to a long, which LinkToNative would then pass by register).
>>>>
>>>> As for the fuller benchmark, note that you are also measuring the
>>>> performance of Scope::allocate, which is internally using some
>>>> maps. JNR/JNI does not do the same liveness checks that we do, so
>>>> the full benchmark is not totally fair. But the raw performance of
>>>> the downcall should be an apples-to-apples comparison, and it
>>>> shouldn't be 8x slower as it is now (at least not with linkToNative).
>>>>
>>>> Maurizio
>>>>
>>>>
>>>> On 17/05/2019 16:14, Jorn Vernee wrote:
>>>>
>>>>> FWIW, I ran the benchmarks with the linkToNative back-end (using
>>>>> -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's
>>>>> still 2x slower than JNI:
>>>>>
>>>>> Benchmark                                 Mode  Cnt    Score    Error  Units
>>>>> JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  298.046 ± 15.744  ns/op
>>>>> JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  596.567 ± 20.570  ns/op
>>>>>
>>>>> Of course, like Aleksey says: "The numbers [above] are just data.
>>>>> To gain reusable insights, you need to follow up on why the numbers
>>>>> are the way they are.". Unfortunately, I'm having some trouble
>>>>> getting the project to work with the Windows profiler :/ Was
>>>>> currently looking into that.
>>>>>
>>>>> Cheers,
>>>>> Jorn
>>>>>
>>>>> Maurizio Cimadamore wrote on 2019-05-17 16:51:
>>>>>> On 17/05/2019 11:26, Maurizio Cimadamore wrote:
>>>>>>> thank you for bringing this up, I saw this benchmark a few days
>>>>>>> ago and I took a look at it. That benchmark is unfortunately
>>>>>>> hitting on a couple of (transitory!) pain points: (1) it is
>>>>>>> running on Windows, which lacks the optimizations available for
>>>>>>> MacOS and Linux (directInvoker). When the linkToNative effort
>>>>>>> is completed, this discrepancy between platforms will go
>>>>>>> away. The second problem (2) is that the call is passing a big
>>>>>>> struct (e.g. bigger than 64 bits). Even on Linux and Mac, such a
>>>>>>> call would be unable to take advantage of the optimized invoker
>>>>>>> and would fall back to the so called 'universal invoker' which is
>>>>>>> slow.
>>>>>>
>>>>>> Actually, my bad, the bench is passing pointers to structs, not structs
>>>>>> by value - which I think should mean the 'foreign+linkToNative'
>>>>>> experimental branch should be able to handle this. Would be nice to
>>>>>> get some confirmation that this is indeed the case.
>>>>>>
>>>>>> Maurizio

From nick.gasson at arm.com Mon May 20 06:47:27 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Mon, 20 May 2019 14:47:27 +0800
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To:
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
Message-ID: <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com>

Hi Maurizio,

On 15/05/2019 17:51, Maurizio Cimadamore wrote:
>
> This seems a good direction to explore.
>
> I believe that, in general, our assumption that platforms have similar
> register classes is only partially valid. This is visible with X87
> register, which doesn't really make sense on a non-Intel machine.
>
> But adding a class as the code is now can be problematic, as there's a
> 1-1 correspondence between these and ShuffleRecipeClass, which is also
> shared for all platforms.
>
> How did you go about changing that class?
>
> I think it could be better, for now, to stick with some hacky solution
> like the one you proposed above, which keeps current (broken) assumption
> and then revisit later a way to make the code better/more direct?
>

I've made two separate patches to implement this, please have a look to see which is preferable. The first patch uses the integer argument class. This puts the indirect argument at offset 8, which otherwise isn't used for an integer argument. We also now need to sort the bindings when generating the shuffle recipe, as Jorn pointed out.

[1] http://cr.openjdk.java.net/~ngasson/foreign/8223808/struct-integer-arg/

The second patch adds a new storage class to represent the indirect result register. But the downside is we need to change the x86 code to ignore this.

[2] http://cr.openjdk.java.net/~ngasson/foreign/8223808/struct-new-indirect-class/

I actually think [1] may be better as it doesn't touch the x86 code and is a smaller diff overall?

I've also done a minor cleanup of the original patch. This disables the tests that use long double on AArch64 for now.

http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.1/

The StructUpcall test will pass with either patch [1] or [2] above but only when using the universal invoker. I'm wondering whether we should disable the direct invoker fast path on AArch64 to match Windows, as this will be replaced later anyway? Or is it better to try to fix it?

Thanks,
Nick

From maurizio.cimadamore at oracle.com Mon May 20 08:14:17 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 09:14:17 +0100
Subject: [foreign] Poor performance?
In-Reply-To: <4ee155fc6928785309b079f93218995c@xs4all.nl>
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl>
Message-ID: <4ff2d27e-ed12-dad0-5a04-708f7ef1417f@oracle.com>

On 19/05/2019 16:05, Jorn Vernee wrote:
> The bad news... since we (apparently) can't get a MethodHandle to the
> constructor of our Struct impl class without triggering an access
> violation (seemingly because it's a VMAC?), I had to disable the
> access checking to get these numbers.

Thanks for the tests Jorn - and yes, this is the reason why Struct handles have not been specialized. As Brian says, there is going to be a more reliable way to define anonymous classes soon. In alternative we could always go back to a normal Unsafe.defineClass - and then replace constant pool patching with condy, which is gonna help anyway if we're going at some point to start generating bindings statically.

Maurizio

From maurizio.cimadamore at oracle.com Mon May 20 08:17:16 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 09:17:16 +0100
Subject: [foreign] Poor performance?
In-Reply-To: <4c75fb8f-b502-76e1-62ff-af04ccf75db4@gmail.com>
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl> <4c75fb8f-b502-76e1-62ff-af04ccf75db4@gmail.com>
Message-ID: <24070370-c7ef-433a-28ef-f025e1b69500@oracle.com>

On 20/05/2019 07:13, Samuel Audet wrote:
> Hi, Jorn,
>
> Thanks for the followup!
>
> I see, there's still optimization to be done for layouts.
> Though, as I
> previously pointed out, I'm concerned that relying on APIs like
> MethodHandle and VarHandle is going to prevent these optimizations
> from working with AOT compilers. Substrate VM already has something
> for "memaccess" and "FFI", so it would make sense to have a
> "high-level API" that works with those, and I'm guessing that Panama
> and Graal are going to work together to define that high-level API
> eventually, but it would be great to have clear direction about what
> the plan is.
>
> Still, neither Panama nor SVM have started looking at mapping inline
> functions or C++ templates to something like custom JVM intrinsics.
> That's the kind of level I would really like for a team like Panama to
> focus on. Then it doesn't matter what the speed is for getters,
> setters, and whatever, we would just need to create inline functions
> and map everything that way. :)

Points noted - but let's keep this thread focused please.

Maurizio

>
> Samuel

From jbvernee at xs4all.nl Mon May 20 08:30:43 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Mon, 20 May 2019 10:30:43 +0200
Subject: [foreign] Poor performance?
In-Reply-To: <4ff2d27e-ed12-dad0-5a04-708f7ef1417f@oracle.com>
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl> <4ff2d27e-ed12-dad0-5a04-708f7ef1417f@oracle.com>
Message-ID:

FWIW, I did find that setAccessible(true) + unreflectConstructor works as well, and has the same perf.

Updated: http://cr.openjdk.java.net/~jvernee/panama/webrevs/getstruct/webrev.03/

> In alternative
> we could always go back to a normal Unsafe.defineClass - and then
> replace constant pool patching with condy, which is gonna help anyway
> if we're going at some point to start generating bindings statically.

Actually, this is my first time hearing about this.
Is there a plan to generate bindings statically? (I guess it would save time on spinning?).

Jorn

Maurizio Cimadamore wrote on 2019-05-20 10:14:
> On 19/05/2019 16:05, Jorn Vernee wrote:
>> The bad news... since we (apparently) can't get a MethodHandle to the
>> constructor of our Struct impl class without triggering an access
>> violation (seemingly because it's a VMAC?), I had to disable the
>> access checking to get these numbers.
>
> Thanks for the tests Jorn - and yes, this is the reason why Struct
> handles have not been specialized. As Brian says, there is going to be
> a more reliable way to define anonymous classes soon. In alternative
> we could always go back to a normal Unsafe.defineClass - and then
> replace constant pool patching with condy, which is gonna help anyway
> if we're going at some point to start generating bindings statically.
>
> Maurizio

From maurizio.cimadamore at oracle.com Mon May 20 08:34:06 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 09:34:06 +0100
Subject: [foreign] Poor performance?
In-Reply-To:
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl> <4ff2d27e-ed12-dad0-5a04-708f7ef1417f@oracle.com>
Message-ID:

On 20/05/2019 09:30, Jorn Vernee wrote:
> FWIW, I did find that setAccessible(true) + unreflectConstructor works
> as well, and has the same perf.
>
> Updated:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/getstruct/webrev.03/
>
>> In alternative
>> we could always go back to a normal Unsafe.defineClass - and then
>> replace constant pool patching with condy, which is gonna help anyway
>> if we're going at some point to start generating bindings statically.
>
> Actually, this is my first time hearing about this. Is there a plan to
> generate bindings statically? (I guess it would save time on spinning?).

Generating bindings statically is low-hanging fruit, in a way. Most of the code that is spun at runtime by the binder could be spun statically using the right combo of indy/condy. This could be useful in contexts where performance is dominated by that of the binder - or in cases where it's more handy to do things statically (some earlier experiments with Panama in the JDK seem to suggest this could be a good direction to explore).

Maurizio

>
> Jorn
>
> Maurizio Cimadamore wrote on 2019-05-20 10:14:
>> On 19/05/2019 16:05, Jorn Vernee wrote:
>>> The bad news... since we (apparently) can't get a MethodHandle to
>>> the constructor of our Struct impl class without triggering an
>>> access violation (seemingly because it's a VMAC?), I had to disable
>>> the access checking to get these numbers.
>>
>> Thanks for the tests Jorn - and yes, this is the reason why Struct
>> handles have not been specialized. As Brian says, there is going to be
>> a more reliable way to define anonymous classes soon. In alternative
>> we could always go back to a normal Unsafe.defineClass - and then
>> replace constant pool patching with condy, which is gonna help anyway
>> if we're going at some point to start generating bindings statically.
>>
>> Maurizio

From maurizio.cimadamore at oracle.com Mon May 20 11:04:32 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 12:04:32 +0100
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To: <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com>
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com>
Message-ID: <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com>

The patch in [1] looks more minimal, but I'm curious - is it also correct?
It seems to me that some additional SKIP steps will be added to the shuffle recipe, given that you now have an argument whose storage index is 8:

    int indexInClass = 0;
    for (ArgumentBinding binding : bindings) {
        while (indexInClass < binding.storage().getStorageIndex()) {
            builder.addSkip();
            indexInClass++;
        }

I guess that means that, e.g. if your native function takes nothing and returns a big struct, we will need to make a long array that is big enough to contain all register values (which are unused in this case) and fill it with SKIP steps, just in order to use the last step?

I think that could work - but at the same time, patch [2] seems more or less in the spirit of what was done with X87 registers (which are only supported on Intel platforms), so I can also live with that one, which has the advantage of being cleaner.

I wouldn't bother with DirectInvoker for now; you can enable it in a follow-up patch if you find out what the issue is and want to fix it - but as you say, it's probably better to focus on linkToNative, which is gonna be the future here.

Cheers
Maurizio

On 20/05/2019 07:47, Nick Gasson wrote:
> Hi Maurizio,
>
> On 15/05/2019 17:51, Maurizio Cimadamore wrote:
>>
>> This seems a good direction to explore.
>>
>> I believe that, in general, our assumption that platforms have similar registers classes is only partially valid. This is visible with X87 register, which doesn't really make sense on a non-intel machine.
>>
>> But adding a class as the code is now can be problematic, as there's a 1-1 correspondence between these and ShuffleRecipeClass, which is also shared for all platforms.
>>
>> How did you go about changing that class?
>>
>> I think it could be better, for now, to stick with some hacky solution like the one you proposed above, which keeps current (broken) assumption and then revisit later a way to make the code better/more direct?
>>
>
> I've made two separate patches to implement this, please have a look to see which is preferable. The first patch uses the integer argument class. This puts the indirect argument at offset 8, which otherwise isn't used for an integer argument. We also now need to sort the bindings when generating the shuffle recipe, as Jorn pointed out.
>
> [1] http://cr.openjdk.java.net/~ngasson/foreign/8223808/struct-integer-arg/
>
> The second patch adds a new storage class to represent the indirect result register. But the downside is we need to change the x86 code to ignore this.
>
> [2] http://cr.openjdk.java.net/~ngasson/foreign/8223808/struct-new-indirect-class/
>
> I actually think [1] may be better as it doesn't touch the x86 code and is a smaller diff overall?
>
> I've also done a minor cleanup of the original patch. This disables the tests that use long double on AArch64 for now.
>
> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.1/
>
> The StructUpcall test will pass with either patch [1] or [2] above but only when using the universal invoker. I'm wondering whether we should disable the direct invoker fast path on AArch64 to match Windows, as this will be replaced later anyway? Or is it better to try to fix it?
>
> Thanks,
> Nick

From jbvernee at xs4all.nl Mon May 20 13:30:22 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Mon, 20 May 2019 15:30:22 +0200
Subject: Implement no copy memory
In-Reply-To:
References:
Message-ID: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl>

Hi,

> It looks like Panama copy value from 'u_char *' into 'Pointer buf', correct or not?

Pointer is just a wrapper for the native pointer value. Only the value of the pointer itself is copied, not the contents of the memory it points to.

> Because when I call buf.addr() it returns 0, what zero mean?

Zero means NULL.

> How I can do with no copy? Just returns the address of 'u_char *' like below in callback method?
Pointer::addr returns the value of the pointer. Getting the address of the pointer value itself, à la:

    uchar_t **bufRef = &buf;

is currently not possible.

Can you explain a bit more about your use case? We might be able to offer some suggestions.

Cheers,
Jorn

Ardika Rommy Sanjaya wrote on 2019-05-20 01:33:
> Hi,
>
> There is native function in libpcap library: typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *);
> When I use jextract it generate code like below:
>
> @NativeHeader(...)
> public interface pcap_h {
>
>     ...
>
>     @NativeLocation(...)
>     @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32")
>     int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr);
>
>     @FunctionalInterface
>     @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v")
>     interface pcap_handler {
>
>         void fn(Pointer buf, Pointer pkthdr, Pointer usr) throws IllegalAccessException;
>
>     }
>
>     ...
>
> }
>
> It looks like Panama copy value from 'u_char *' into 'Pointer buf', correct or not? Because when I call buf.addr() it returns 0, what zero mean?
> How I can do with no copy? Just returns the address of 'u_char *' like below in callback method?
>
> void fn(long memoryAddressOf_u_char_ptr, Pointer pkthdr, Pointer usr) throws IllegalAccessException;
>
> Thanks and Regards,
> Ardika Rommy Sanjaya

From ardikars at gmail.com Mon May 20 13:56:06 2019
From: ardikars at gmail.com (Ardika Rommy Sanjaya)
Date: Mon, 20 May 2019 20:56:06 +0700
Subject: Implement no copy memory
In-Reply-To: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl>
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl>
Message-ID:

Thanks for the response, I really appreciate it.

I want buf::addr() to return the address of the u_char * every time the pcap_handler::fn callback is called; after that I can use sun.misc.Unsafe to read the contents of the u_char * from that address.
Please refer to the code below.

This is the libpcap API:

    int pcap_loop(pcap_t *, int, pcap_handler, u_char *);
    typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *);

Below is the Java part, generated by jextract with a few changes:

    @NativeHeader(
        libraries = {"pcap"}
    )
    public interface pcap_mapping {

        @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32")
        int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr);

        @FunctionalInterface
        @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v")
        public interface pcap_handler {
            void fn(Pointer buf, Pointer pkthdr, Pointer arg) throws IllegalAccessException;
        }

    }

Regards

> On 20 May 2019, at 20.30, Jorn Vernee wrote:
>
> Hi,
>
>> It looks like Panama copy value from 'u_char *' into 'Pointer buf', correct or not?
>
> Pointer is just a wrapper for the native pointer value. Only the value of the pointer itself is copied, not the contents of the memory it points to.
>
>> Because when I call buf.addr() it returns 0, what zero mean?
>
> Zero means NULL.
>
>> How I can do with no copy? Just returns the address of 'u_char *' like below in callback method?
>
> Pointer::addr returns the value of the pointer, to get the address of the pointer value itself, à la:
>
> uchar_t **bufRef = &buf;
>
> This is currently not possible.
>
> Can you explain a bit more about your use case? We might be able to offer some suggestions.
>
> Cheers,
> Jorn
>
> Ardika Rommy Sanjaya wrote on 2019-05-20 01:33:
>> Hi,
>> There is native function in libpcap library: typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *);
>> When I use jextract it generate code like below:
>> @NativeHeader(...)
>> public interface pcap_h {
>> ...
>> @NativeLocation(...)
>> @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32")
>> int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr);
>> @FunctionalInterface
>> @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v")
>> interface pcap_handler {
>> void fn(Pointer buf, Pointer pkthdr, Pointer usr) throws IllegalAccessException;
>> }
>> ...
>> }
>> It looks like Panama copy value from 'u_char *' into 'Pointer buf', correct or not? Because when I call buf.addr() it returns 0, what zero mean?
>> How I can do with no copy? Just returns the address of 'u_char *' like below in callback method?
>> void fn(long memoryAddressOf_u_char_ptr, Pointer pkthdr, Pointer usr) throws IllegalAccessException;
>> Thanks and Regards,
>> Ardika Rommy Sanjaya

From jbvernee at xs4all.nl Mon May 20 14:11:24 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Mon, 20 May 2019 16:11:24 +0200
Subject: Implement no copy memory
In-Reply-To:
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl>
Message-ID:

Why do you want to use sun.misc.Unsafe to get the data? You can get the data using the Panama API as well, e.g. by overlaying some other type:

    LayoutType MyType_t = LayoutType.ofStruct(MyType.class);
    Pointer ptr = buf.cast(NativeTypes.VOID).cast(MyType_t);
    MyType mt = ptr.get();
    // use 'mt'

Or by overlaying a ByteBuffer:

    int size = ...;
    ByteBuffer bb = buf.asDirectByteBuffer(size);
    // use 'bb'

FWIW, the Pointer.get() API uses Unsafe internally as well.

Jorn

Ardika Rommy Sanjaya wrote on 2019-05-20 15:56:
> Thanks for response, I really appreciate.
>
> I want when every pcap_hander::fn callback function is called, buf::addr() returns the address of u_char *, after that I can use sun.misc.Unsafe to get the content data of u_char * based on that address.
From maurizio.cimadamore at oracle.com Mon May 20 14:10:05 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 15:10:05 +0100
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
In-Reply-To: <1BD46646-9007-4640-AEE4-96908B70BBB1@oracle.com>
References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> <1BD46646-9007-4640-AEE4-96908B70BBB1@oracle.com>
Message-ID: <519405d2-08ed-d914-578d-681339382c61@oracle.com>

I did some more experiments with this, and I couldn't find a great way out of the issues pointed out in this thread.

It is possible to convert both LayoutUtils and TypeDictionary so that their conversion methods take an extra cursor/TU - but that still leaves you with the problem that, in order to create a Reparser, you need to know which command line arguments have been passed to clang by jextract - and, in general, this is not possible.
In other words, whatever we do, we end up creating a dependency from the clang module to the jextract module - in fact ClangUtils, in order to function, needs some jextract class to set up some contextual info w/o which it cannot run. This is the most significant thing I don't like about the approach - the fact that this very subtle dependency is hidden away in a HashMap, rather than being explicit in the code.

At this point, I think a more hacky solution like the one in [1] might indeed be better. There I don't like the fact that we have to 'observe' all cursors, effectively creating big maps full of mappings from spelling to types. But at least, doing so is 100% self-contained in the clang module and doesn't require a subtle synergy between the clang and the jextract module.

If we had a way to recover clang command line options from the TranslationUnit (I looked around the clang API but couldn't find any such API point), then it would be possible to do this in a better way.

[1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/

On 18/05/2019 01:06, Henry Jen wrote:
> Theoretically, we got Type from Cursor from a HeaderFile, and I checked usage of LayoutUtil, seems the Cursor is available from callers, from which we can track back to the TU.
>
> That would allow us to get rid of static registry and have more targeted search in TypeDictionary.
>
> In other word, we can go extra miles to refactoring use of LayoutUtil using Cursor rather than Type(lose of information), and if we don't need that work-around stays in clang binding, we can eliminate this distaste.
>
> Cheers,
> Henry
>
>> On May 17, 2019, at 2:44 PM, Henry Jen wrote:
>>
>> I explained this in the first email, LayoutUtil in static context and we don't have a way to get the only instance of jextract.
>>
>> Also I did this in libclang so that changed in jextract can be same once the libclang is fixed, all we need to do is remove ClangUtils and then we can put back the Reparser mechanism back into jextract.
>>
>> We can choose to keep things in jextract, but that means we need to be able to get LayoutUtil access to THE jextract instance. Let me know if you have better ideas.
>>
>> Cheers,
>> Henry
>>
>>
>>> On May 17, 2019, at 1:49 PM, Maurizio Cimadamore wrote:
>>>
>>> Hi Henry,
>>> thanks for taking another look at this.
>>>
>>> It's good that you managed to get it working, and I appreciate the effort to clean up the JNI code.
>>>
>>> There are things in the Java code that leave me a bit perplexed - for instance, the fact that code from MacroParser (eval method) was copied outside it, to Parser.
>>>
>>> And also the fact that we are using a register mechanism and stashing things into statics, which is bad practice.
>>>
>>> My understanding is that, there should be only one TranslationUnit cursor generated by jextract parser, right?
>>>
>>> So, in reality when you register a reparser, you will always call register() with the same TU, right?
>>>
>>> Which means the only thing changing are the arguments...
>>>
>>> Now, in your current impl, note that you are doing nothing in order e.g. to make sure that if there's already some reparser set with given arguments, that is reused - so again, I'm confused as to what is the role of the static cache.
>>>
>>> I think what we want is to have a way to pass the reparser to the TypeDictionary, which is something that can be done by simply passing the TU to the TypeDictionary, after which the dictionary can create its own reparser to do the atomic reparsing thing.
>>>
>>> In other words, I don't see the need for having a getValueType() method on the cursor itself - I think the capability of reparsing things and seeing through 'atomic' is a capability of the type dictionary and we should focus on how to get that capability there.
>>>
>>> Maurizio
>>>
>>>
>>> On 17/05/2019 21:26, Henry Jen wrote:
>>>> Oops, the link to web rev
>>>>
>>>> http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/
>>>>
>>>> Cheers,
>>>> Henry
>>>>
>>>>
>>>>> On May 17, 2019, at 1:26 PM, Henry Jen wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Please review another webrev[1] that use precompile header to do evaluation as suggested by Maurizio, this patch make one instance of Reparser that can be used by both Macro and Atomic type and potentially other use.
>>>>>
>>>>> We took in arguments from the origin TU in parser, this is mainly to make sure the precompile header will be using same setting, this helps the c++ mode in StructTest.java which otherwise fail because the mismatch of original TU(with clang option -x c++).
>>>>>
>>>>> This has being an issue, but MacroParser currently implemented in a way that just shutoff in case of any failure and it we simply not notice the issue.
>>>>>
>>>>> In this patch, I also refactor clang API a bit to move TranslationUnit functions to where it belong, those were in Index because we didn't expose TranslationUnit in early days, now we have it, we should organize the function accordingly.
>>>>>
>>>>> Another change worth noting, is that Index.parse now throw ParsingFailedException if clang parsing failed in some way. This is why jextract was showing
>>>>> "WARNING: nothing to generate" instead of error out when use command line like 'jextract -C -include-pch -C test.h.gch client.h'
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>>> On May 15, 2019, at 8:50 AM, Henry Jen wrote:
>>>>>>
>>>>>> I figured this out, it failed with a different TU, need to catch the exception in a loop.
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>>> On May 14, 2019, at 9:36 PM, Henry Jen wrote:
>>>>>>>
>>>>>>> BTW, I can get away with incomplete type by adding CXTranslationUnit_Incomplete option, but not sure about typedef,
>>>>>>>
>>>>>>> java.lang.IllegalArgumentException: Error with snippet: uchar_t var;
>>>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t'
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Henry
>>>>>>>
>>>>>>>
>>>>>>>> On May 14, 2019, at 9:10 PM, Henry Jen wrote:
>>>>>>>>
>>>>>>>> Use the reparse mechanism as ClangParser seems a good idea, but there are complications using precompile header.
>>>>>>>>
>>>>>>>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we creating new TU use the pch doesn't work:
>>>>>>>>
>>>>>>>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var;
>>>>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed
>>>>>>>>
>>>>>>>> I like the idea of reparse, but current way is not working(or I haven't figure it out), and I am not sure if use or clone(save then create) the original TU would have any side effect. At this point, I think my current implementation is more certain with the downside that we don't have all builtin types, but at this point, that's all we supported anyway.
>>>>>>>>
>>>>>>>> Since this is to work-around current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Henry
>>>>>>>>
>>>>>>>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote:
>>>>>>>>>
>>>>>>>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros.
>>>>>>>>>
>>>>>>>>> That is, when you see a cursor with:
>>>>>>>>>
>>>>>>>>> _Atomic("....")
>>>>>>>>>
>>>>>>>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionalities used for the 'speculative' clang evaluation can be shared across macro/atomic support.
>>>>>>>>>
>>>>>>>>> Maurizio
>>>>>>>>>
>>>>>>>>> On 14/05/2019 02:58, Henry Jen wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type.
>>>>>>>>>>
>>>>>>>>>> A proper solution would be have libclang expose the atomic types, so we don't need to add the ugly hacks in Java to find the type. A patch is submitted[2] to clang project and hopefully this can be fixed in future release.
>>>>>>>>>>
>>>>>>>>>> Before that happens, we have our java clang binding trying do that work by:
>>>>>>>>>> - Put together a temporary header file with C11 types to parse, so we can get this builtin-types.
>>>>>>>>>> - During the cursor-traversing, we will add in extra types declared in the header files
>>>>>>>>>> - For an atomic type, we use the type string to get the underlying type.
>>>>>>>>>>
>>>>>>>>>> Note that an atomic type is always a builtin type without valid declaration cursor, so we cannot get the translation unit the type actually defined in, so we are doing a global search. Since jextract will only created one translation unit, the hack should work just fine without much pollution.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Henry
>>>>>>>>>>
>>>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/
>>>>>>>>>> [2] https://reviews.llvm.org/D61716

From maurizio.cimadamore at oracle.com Mon May 20 14:21:19 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 15:21:19 +0100
Subject: Implement no copy memory
In-Reply-To:
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl>
Message-ID:

Hi,
how are you calling the function? From the pcap doc, I see this:

    void got_packet(u_char *args, const struct pcap_pkthdr *header, const u_char *packet);

    Let's examine this in more detail. First, you'll notice that the function has a void return type. This is logical, because pcap_loop() wouldn't know how to handle a return value anyway. The first argument corresponds to the last argument of pcap_loop().

So, if you see that buf.addr() == 0, that probably means you passed NULL as the last parameter of the pcap_loop call?

Maurizio

On 20/05/2019 14:56, Ardika Rommy Sanjaya wrote:
> Thanks for response, I really appreciate.
>
> I want when every pcap_hander::fn callback function is called, buf::addr() returns the address of u_char *, after that I can use sun.misc.Unsafe to get the content data of u_char * based on that address.
From maurizio.cimadamore at oracle.com Mon May 20 16:58:37 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Mon, 20 May 2019 16:58:37 +0000
Subject: hg: panama/dev: 8224134:Fix javadoc issues
Message-ID: <201905201658.x4KGwcCp027298@aojmv0008.oracle.com>

Changeset: 027506b06fc6
Author: mcimadamore
Date: 2019-05-20 17:58 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/027506b06fc6

8224134:Fix javadoc issues

! src/java.base/share/classes/java/foreign/AbstractLayout.java
! src/java.base/share/classes/java/foreign/GroupLayout.java
! src/java.base/share/classes/java/foreign/Layout.java
! src/java.base/share/classes/java/foreign/MemoryAddress.java
! src/java.base/share/classes/java/foreign/PaddingLayout.java
! src/java.base/share/classes/java/foreign/SequenceLayout.java
!
src/java.base/share/classes/java/foreign/ValueLayout.java

From henry.jen at oracle.com Mon May 20 18:24:15 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Mon, 20 May 2019 11:24:15 -0700
Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError
In-Reply-To: <519405d2-08ed-d914-578d-681339382c61@oracle.com>
References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> <14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> <1BD46646-9007-4640-AEE4-96908B70BBB1@oracle.com> <519405d2-08ed-d914-578d-681339382c61@oracle.com>
Message-ID:

OK, I'll push version 0 and gather clang binding improvements into a separate patch for later review.

Cheers,
Henry

> On May 20, 2019, at 7:10 AM, Maurizio Cimadamore wrote:
>
> I did some more experiments with this, and I couldn't find a great way out of the issues pointed out in this thread.
>
> It is possible to convert both LayoutUtils and TypeDictionary so that their conversion methods take an extra cursor/TU - but that still leaves you with the problem that, in order to create a Reparser, you need to know which command line arguments have been passed to clang by jextract - and, in general this is not possible.
>
> In other words, whatever we do, we end up creating a dependency from clang module to jextract module - in fact ClangUtils, in order to function, needs some jextract class to setup some contextual info w/o which it cannot run. This is the most significant thing I don't like about the approach - the fact that this very subtle dependency is hidden away in an HashMap, rather than being explicit in the code.
>
> At this point, I think a more hacky solution like the one in [1] might indeed be better.
There I don't like the fact that we have to 'observe' all cursors, effectively creating big maps full of mappings from spelling to types. But at least, doing so is 100% self-contained in the clang module and doesn't require a subtle synergy between the clang and the jextract module. > > > If we had a way to recover clang command line options from the TranslationUnit (I looked around clang API but couldn't find any such API point), then it would be possible to do this in a better way. > > > [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/ > > > On 18/05/2019 01:06, Henry Jen wrote: >> Theoretically, we got Type from Cursor from a HeaderFile, and I checked usage of LayoutUtil, seems the Cursor is available from callers, from which we can track back to the TU. >> >> That would allow us to get rid of static registry and have more targeted search in TypeDictionary. >> >> In other word, we can go extra miles to refactoring use of LayoutUtil using Cursor rather than Type(lose of information), and if we don?t need that work-around stays in clang binding, we can eliminate this distaste. >> >> Cheers, >> Henry >> >>> On May 17, 2019, at 2:44 PM, Henry Jen wrote: >>> >>> I explained this in the first email, LayoutUtil in static context and we don?t have a way to get the only instance of jextract. >>> >>> Also I did this in libclang so that changed in jextract can be same once the libclang is fixed, all we need to do is remove ClangUtils and then we can put back the Reparser mechanism back into jextract. >>> >>> We can choose to keep things in jextract, but that means we need to be able to get LayoutUtil access to THE jextract instance. Let me know if you have better ideas. >>> >>> Cheers, >>> Henry >>> >>> >>>> On May 17, 2019, at 1:49 PM, Maurizio Cimadamore wrote: >>>> >>>> Hi Henry, >>>> thanks for taking another look at this. >>>> >>>> It's good that you managed to get it working, and I appreciate the effort to clean up the JNI code. 
>>>> >>>> There are things in the Java code that leave me a bit perplexed - for instance, the fact that code from MacroParser (eval method) was copied outside it, to Parser. >>>> >>>> And also the fact that we are using a register mechanism and stashing things into statics, which is bad practice. >>>> >>>> My understanding is that, there should be only one TranslationUnit cursor generated by jextract parser, right? >>>> >>>> So, in reality when you register a reparser, you will always call register() with the same TU, right? >>>> >>>> Which means the only thing changing are the arguments... >>>> >>>> Now, in your current impl, note that you are doing nothing in order e.g. to make sure that if there's already some reparser set with given arguments, that is reused - so again, I'm confused as to what is the role of the static cache. >>>> >>>> I think what we want is to have a way to pass the reparser to the TypeDictionary, which is something that can be done by simply passing the TU to the TypeDIctionary, after which the dictionary can create its own reparser to do the atomic reparsing thing. >>>> >>>> In other words, I don't see the need for having a getValueType() method on the cursor itself - I think the capability of reparsing things and seeing through 'atomic' is a capability of the type dictionary and we should focus on how to get that capability there. >>>> >>>> Maurizio >>>> >>>> >>>> On 17/05/2019 21:26, Henry Jen wrote: >>>>> Oops, the link to web rev >>>>> >>>>> http://cr.openjdk.java.net/~henryjen/panama/8223489/1/webrev/ >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> >>>>>> On May 17, 2019, at 1:26 PM, Henry Jen wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> Please review another webrev[1] that use precompile header to do evaluation as suggested by Maurizio, this patch make one instance of Reparser that can be used by both Macro and Atomic type and potentially other use. 
>>>>>>
>>>>>> We take the arguments from the original TU in the parser, mainly to make sure the precompiled header uses the same settings. This helps the C++ mode in StructTest.java, which otherwise fails because of the mismatch with the original TU (parsed with the clang option -x c++).
>>>>>>
>>>>>> This has been an issue all along, but MacroParser is currently implemented in a way that just shuts off in case of any failure, so we simply did not notice it.
>>>>>>
>>>>>> In this patch, I also refactor the clang API a bit to move the TranslationUnit functions to where they belong. Those were in Index because we didn't expose TranslationUnit in the early days; now that we have it, we should organize the functions accordingly.
>>>>>>
>>>>>> Another change worth noting is that Index.parse now throws ParsingFailedException if clang parsing fails in some way. The lack of such an exception is why jextract was showing
>>>>>> "WARNING: nothing to generate" instead of erroring out when used with a command line like 'jextract -C -include-pch -C test.h.gch client.h'
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>>> On May 15, 2019, at 8:50 AM, Henry Jen wrote:
>>>>>>>
>>>>>>> I figured this out: it failed with a different TU, so we need to catch the exception in a loop.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Henry
>>>>>>>
>>>>>>>> On May 14, 2019, at 9:36 PM, Henry Jen wrote:
>>>>>>>>
>>>>>>>> BTW, I can get away with an incomplete type by adding the CXTranslationUnit_Incomplete option, but I'm not sure about typedefs:
>>>>>>>>
>>>>>>>> java.lang.IllegalArgumentException: Error with snippet: uchar_t var;
>>>>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$4453798577315022046.h:1:1: error: unknown type name 'uchar_t'
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Henry
>>>>>>>>
>>>>>>>>
>>>>>>>>> On May 14, 2019, at 9:10 PM, Henry Jen wrote:
>>>>>>>>>
>>>>>>>>> Using the reparse mechanism as in ClangParser seems a good idea, but there are complications with using a precompiled header.
>>>>>>>>>
>>>>>>>>> In the test case I have, the struct SomeTypes is declared in the header file, but the way we create a new TU using the PCH doesn't work:
>>>>>>>>>
>>>>>>>>> java.lang.IllegalArgumentException: Error with snippet: struct SomeTypes var;
>>>>>>>>> /var/folders/nk/wl6yv8l565g17ykv8658hcq80000gn/T/jextract$17213029476260309473.h:1:18: error: tentative definition has type 'struct SomeTypes' that is never completed
>>>>>>>>>
>>>>>>>>> I like the idea of reparsing, but the current way is not working (or I haven't figured it out), and I am not sure whether using or cloning (save, then create) the original TU would have any side effects. At this point, I think my current implementation is more certain, with the downside that we don't have all builtin types; but those are all we support anyway.
>>>>>>>>>
>>>>>>>>> Since this is a workaround for the current libclang, I expect the better solution is to bundle a patched libclang. But until then, this temporary fix should be good enough.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Henry
>>>>>>>>>
>>>>>>>>>> On May 14, 2019, at 4:03 AM, Maurizio Cimadamore wrote:
>>>>>>>>>>
>>>>>>>>>> I see what you are doing here. I believe there's a cleaner way to get there, which is what we've done to process macros.
>>>>>>>>>>
>>>>>>>>>> That is, when you see a cursor with:
>>>>>>>>>>
>>>>>>>>>> _Atomic("....")
>>>>>>>>>>
>>>>>>>>>> extract the innards of that cursor, and create a new temporary compilation unit for that one, as we do for macros. I think most of the logic we use there should be applicable in this case (only the snippet to be generated for the speculative compilation would be different). And, probably, if we go down that path we could refactor the code a bit so that the functionality used for the 'speculative' clang evaluation can be shared across macro/atomic support.
>>>>>>>>>>
>>>>>>>>>> Maurizio
>>>>>>>>>>
>>>>>>>>>> On 14/05/2019 02:58, Henry Jen wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Please review a webrev[1] that detects an atomic type to get the correct layout for the type.
>>>>>>>>>>>
>>>>>>>>>>> A proper solution would be to have libclang expose the atomic types, so we don't need to add ugly hacks in Java to find the type. A patch has been submitted[2] to the clang project and hopefully this can be fixed in a future release.
>>>>>>>>>>>
>>>>>>>>>>> Until that happens, our Java clang binding does that work by:
>>>>>>>>>>> - Putting together a temporary header file with C11 types to parse, so we can get the builtin types.
>>>>>>>>>>> - During cursor traversal, adding in extra types declared in the header files.
>>>>>>>>>>> - For an atomic type, using the type string to get the underlying type.
>>>>>>>>>>>
>>>>>>>>>>> Note that an atomic type is always a builtin type without a valid declaration cursor, so we cannot get the translation unit the type is actually defined in, and we do a global search instead. Since jextract will only create one translation unit, the hack should work just fine without much pollution.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Henry
>>>>>>>>>>>
>>>>>>>>>>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/
>>>>>>>>>>> [2] https://reviews.llvm.org/D61716

From henry.jen at oracle.com Mon May 20 18:55:52 2019
From: henry.jen at oracle.com (henry.jen at oracle.com)
Date: Mon, 20 May 2019 18:55:52 +0000
Subject: hg: panama/dev: 8223489: _Atomic types can cause StackOverflowError
Message-ID: <201905201855.x4KItrUI012115@aojmv0008.oracle.com>

Changeset: c65ecdbf4155
Author: henryjen
Date: 2019-05-20 11:55 -0700
URL: http://hg.openjdk.java.net/panama/dev/rev/c65ecdbf4155

8223489: _Atomic types can cause StackOverflowError
Reviewed-by: mcimadamore

+ src/jdk.internal.clang/share/classes/jdk/internal/clang/ClangUtils.java
!
src/jdk.internal.clang/share/classes/jdk/internal/clang/Cursor.java ! src/jdk.internal.clang/share/classes/jdk/internal/clang/Type.java ! src/jdk.internal.clang/share/classes/jdk/internal/clang/TypeKind.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/TypeDictionary.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/Utils.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/LayoutUtils.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/tree/Printer.java ! test/jdk/com/sun/tools/jextract/Runner.java + test/jdk/com/sun/tools/jextract/atomicTypes.h + test/jdk/com/sun/tools/jextract/compare/atomicTypes_h.java + test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/ClangUtils.java ! test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/Cursor.java ! test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/Type.java ! test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/TypeKind.java From maurizio.cimadamore at oracle.com Mon May 20 18:59:50 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Mon, 20 May 2019 18:59:50 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905201859.x4KIxoai013943@aojmv0008.oracle.com> Changeset: 4d7a7a6fa54d Author: mcimadamore Date: 2019-05-20 20:59 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/4d7a7a6fa54d Automatic merge with foreign From maurizio.cimadamore at oracle.com Mon May 20 19:57:07 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 20 May 2019 20:57:07 +0100 Subject: [foreign] RFR: 8223489: _Atomic types can cause StackOverflowError In-Reply-To: References: <6D30ECBC-20FA-450F-9204-3C6492D4D56F@oracle.com> <362edd5f-c20a-c6c4-aafb-385f81939713@oracle.com> <351CABF9-0318-4D9C-ABE4-19A95BE86CC6@oracle.com> <93414FB3-8A4F-4A2F-9ECC-F54C112FE81D@oracle.com> <3630DF46-5A14-4656-BE6E-B7022AF10483@oracle.com> 
<14d135c6-ce27-b3cc-cb61-fe7b18586dfa@oracle.com> <1BD46646-9007-4640-AEE4-96908B70BBB1@oracle.com> <519405d2-08ed-d914-578d-681339382c61@oracle.com>
Message-ID: <7005df43-b575-f827-7d91-cfb259c254cd@oracle.com>

Cool

Maurizio

On 20/05/2019 19:24, Henry Jen wrote:
> OK, I'll push version 0 and gather clang binding improvements into a separate patch for later review.
>
> Cheers,
> Henry
>
>> On May 20, 2019, at 7:10 AM, Maurizio Cimadamore wrote:
>>
>> I did some more experiments with this, and I couldn't find a great way out of the issues pointed out in this thread.
>>
>> It is possible to convert both LayoutUtils and TypeDictionary so that their conversion methods take an extra cursor/TU - but that still leaves you with the problem that, in order to create a Reparser, you need to know which command line arguments have been passed to clang by jextract - and, in general, this is not possible.
>>
>> In other words, whatever we do, we end up creating a dependency from the clang module to the jextract module - in fact ClangUtils, in order to function, needs some jextract class to set up some contextual info without which it cannot run. This is the most significant thing I don't like about the approach - the fact that this very subtle dependency is hidden away in a HashMap, rather than being explicit in the code.
>>
>> At this point, I think a more hacky solution like the one in [1] might indeed be better. There I don't like the fact that we have to 'observe' all cursors, effectively creating big maps full of mappings from spelling to types. But at least, doing so is 100% self-contained in the clang module and doesn't require a subtle synergy between the clang and the jextract module.
>>
>>
>> If we had a way to recover clang command line options from the TranslationUnit (I looked around the clang API but couldn't find any such API point), then it would be possible to do this in a better way.
>> [1] http://cr.openjdk.java.net/~henryjen/panama/8223489/0/webrev/

From john.r.rose at oracle.com Mon May 20 20:18:20 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 20 May 2019 13:18:20 -0700
Subject: [foreign] Poor performance?
In-Reply-To:
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl> <4ff2d27e-ed12-dad0-5a04-708f7ef1417f@oracle.com>
Message-ID:

On May 20, 2019, at 1:34 AM, Maurizio Cimadamore wrote:
>
> Most of the code that is spun at runtime by the binder could be spun statically using the right combo of indy/condy.

OTOH, using indy/condy allows spinning to be delayed until the actual first use of each API point, not when the header as a whole is bound. We have also been toying with the idea of a "mindy" - method-level indy, where the method as a whole has a BSM that lazily generates the body. With this, a whole method body would turn into (basically) a condy item, to be expanded only when the method is run.

If the signatures of the methods are statically fixable, then a very early binding of the APIs and signatures might usefully combine with a very late (maximally lazy) binding of the implementations. Best of both worlds, and all that?

(Is that what you meant, Maurizio?)

- John

From maurizio.cimadamore at oracle.com Mon May 20 20:41:43 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 21:41:43 +0100
Subject: [foreign] Poor performance?
In-Reply-To:
References: <8ab78052903c01ca70c1fc2e22b39bef@xs4all.nl> <23341c51-6086-c367-0465-2d58fb2b959d@oracle.com> <80879f2a-421d-3412-fc35-1fe9dd9f88e7@gmail.com> <59977a02ae9333a55e737ca07659b5a8@xs4all.nl> <4ee155fc6928785309b079f93218995c@xs4all.nl> <4ff2d27e-ed12-dad0-5a04-708f7ef1417f@oracle.com>
Message-ID: <9aa66cf3-8269-90e2-0529-5f3ab0164716@oracle.com>

On 20/05/2019 21:18, John Rose wrote:
> If the signatures of the methods are statically fixable, then a very early binding of the APIs and signatures might usefully combine with a very late (maximally lazy) binding of the implementations.
> Best of both worlds, and all that?
>
> (Is that what you meant, Maurizio?)

Yes, I mean - statically generate the bound implementation class - which internally is still free to use dynamically bound items (such as invoker method handles).

Maurizio

From henry.jen at oracle.com Mon May 20 20:53:36 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Mon, 20 May 2019 13:53:36 -0700
Subject: [foreign] RFR: 8224244: Cleanup libclang Java API
Message-ID: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com>

Hi,

Please review the webrev[1] for the clang Java API cleanup that came up during earlier work on atomic type support. The main changes are to throw a checked exception on parsing errors, and to move the translation unit APIs into the TranslationUnit class.

Cheers,
Henry

[1] http://cr.openjdk.java.net/~henryjen/panama/8224244/0/webrev/
[2] https://bugs.openjdk.java.net/browse/JDK-8224244

From maurizio.cimadamore at oracle.com Mon May 20 22:45:13 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 20 May 2019 23:45:13 +0100
Subject: [foreign-memaccess] scopes and thread confinement
Message-ID: <39c276ed-1cbc-2de0-a44a-4867d0c61f75@oracle.com>

Hi,
the MemoryScope abstraction is used to carry out 'liveness' checks - that is, to make sure that dereference operations on a MemoryAddress occur while the address still points to a valid region.

To be able to get to the same level of performance as Unsafe::get/put, we need to be able to hoist the liveness check away in the JIT; but there's a problem here: even if the JIT can see through a bunch of code using addresses allocated by a given scope, it has to conservatively assume that another thread might chime in, and close the scope behind our back. This makes it impossible for the JIT to completely optimize away the check.

For this reason, in my document [1] I posited the existence of a 'confined' scope, whose existence is bound to a given thread (and maybe, in the future, a fiber).
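To make the problem above concrete, here is a minimal sketch of what a fully confined scope could look like. The names (ConfinedScope, checkAlive, closeFromOtherThreadIsRejected) are hypothetical and are not part of the Panama API; the sketch only assumes that every operation checks the calling thread against the thread that created the scope.

```java
// Hypothetical sketch: ConfinedScope is NOT the Panama MemoryScope API.
final class ConfinedScope {
    // The thread that created the scope is assumed to be its owner.
    private final Thread owner = Thread.currentThread();
    // Plain (non-volatile) field: only the owner thread ever reads or writes it.
    private boolean alive = true;

    private void checkOwner() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("scope is confined to thread " + owner.getName());
        }
    }

    /** Liveness check that would run before each dereference. */
    void checkAlive() {
        checkOwner();
        if (!alive) {
            throw new IllegalStateException("scope is closed");
        }
    }

    /** Terminal operation: owner only, so it cannot race with the owner's own accesses. */
    void close() {
        checkOwner();
        alive = false;
    }

    /** Tries to close the scope from a second thread; returns true if that attempt was rejected. */
    static boolean closeFromOtherThreadIsRejected(ConfinedScope scope) {
        boolean[] rejected = new boolean[1];
        Thread intruder = new Thread(() -> {
            try {
                scope.close();
            } catch (IllegalStateException expected) {
                rejected[0] = true;
            }
        });
        intruder.start();
        try {
            intruder.join(); // join() makes the write to rejected[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return rejected[0];
    }
}
```

Since no other thread can flip the liveness flag, a check done once by the owner stays valid until the owner itself closes the scope, which is the property a JIT would need in order to hoist the check.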
I've been thinking a lot lately about how we'd like to expose this confinement model - I have arrived at some conclusions, but I still have some question marks, which I'll try to explain here.

First, I think it's important to realize that there are two aspects to confinement:

1) access to scope-critical operations, such as fork, allocate and terminal operations such as close/merge
2) read/write access to the underlying memory allocated using a scope

While both aspects are important, they are not _equally so_. So much so that I'd like here to entertain the possibility that we assign each scope a thread (or fiber) owner, which is then used to confine access to critical scope operations, as in (1). That is - let's treat the critical scope operations as confined - to make sure that a scope can never be closed behind our back - but let's leave the memory region open so that many threads can access it concurrently. After all, the VarHandle API offers very good atomic read/write CAS-like operations, which can be used to implement any kind of synchronization on top.

(Of course we could also expose an opt-in scope characteristic which, additionally, confines reads and writes to the given owner thread, as in (2), but that aspect seems less important in the scope of this discussion).

If we pull the string on this model, we soon encounter a roadblock: what is a global scope? While it seems reasonable for an explicitly forked scope to have an owner thread, how can a global scope have an owner, since it's created with the VM and dies with it? And, if the global scope had a fictional owner (the thread which created it), how could other threads do anything with the global scope?
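On the point above that unconfined memory can still be synchronized through VarHandle, here is a small sketch. It uses a direct ByteBuffer as a stand-in for a memory region allocated by a scope (an assumption made for illustration; the memaccess API has its own address and handle types), and SharedRegionDemo is a hypothetical name.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; a direct ByteBuffer stands in for a scope-allocated region.
final class SharedRegionDemo {
    // Views the bytes of a ByteBuffer as ints; coordinates are (ByteBuffer, byte index).
    private static final VarHandle INT_VIEW =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    /** Each of 'threads' threads increments the int at offset 0 'perThread' times via CAS. */
    static int incrementConcurrently(int threads, int perThread) {
        ByteBuffer region = ByteBuffer.allocateDirect(Integer.BYTES); // zero-initialized
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    int old;
                    do {
                        // Read the current value, then retry until the CAS succeeds.
                        old = (int) INT_VIEW.getVolatile(region, 0);
                    } while (!INT_VIEW.compareAndSet(region, 0, old, old + 1));
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return (int) INT_VIEW.getVolatile(region, 0);
    }
}
```

No thread owns the region here; correctness comes entirely from the atomic access mode, which is the kind of synchronization the owner-for-critical-operations model leaves to the user.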
Not surprisingly, this duality between confined and global scopes is also present in the FiberScope API (in project Loom); there, we have two kinds of FiberScope implementations. One is called FiberScopeImpl [2] and is the default, effectively confined one (so, you will get an exception when performing operations from another strand). The other is called DetachedFiberScope and is shared and can be used by all threads - it can be thought of as the root in the fiber scope ownership model.

This is all quite similar to what we're aiming to get at - but with one complication; in the case of FiberScope, having a shared scope implementation is not really that problematic, as a detached scope implementation doesn't have any mutable state and doesn't require any kind of synchronization. But in the case of Panama scopes, there are many things that can go wrong:

* a scope keeps a list of all the allocated memory blocks, for reuse - if accessed concurrently the scope can get corrupted
* a scope keeps a list of all the descendants - again, if a scope is forked concurrently bad things will happen here

The good thing is that a global memory scope cannot be closed (like a detached scope in Loom), so we don't have to worry about threads concurrently calling close().

Now, can we imagine a global scope implementation that requires no synchronization? I think that could be doable:

* instead of going for a sophisticated allocation scheme which minimizes the calls to Unsafe, we could just call Unsafe once for every call to MemoryScope::allocate, so that no shared state is used
* we can easily remove the descendant list, and have all children check liveness of their parent, recursively (rather than having the parent closing sub-scopes recursively)

Then the question is - what happens to resources allocated inside a global scope (or resources merged _into_ it)?
I see two options here:

1) we could install an automagic memory collector - which frees memory as soon as the allocated region goes out of scope
2) we do nothing - if you allocate on the global scope no deallocation occurs - the region stays alive until the VM exits

I was initially leaning towards (1); I liked the idea of having _some_ deallocation strategy for globally allocated resources; but then I started to realize that this semantic difference between global scopes and regular scopes was not without its own issues:

* what happens when a memory region is 'resized'? In that case we create a new region - but if the old one is then collected, the new region will just contain garbage!
* what happens when we merge into a global scope from a child scope? We go from a deterministic deallocation behavior to a reachability-based one; this could be very confusing for users!
* when working with scopes we can assume the resources of a parent will outlive those of the children - so that it is safe to copy pointers from the parent to the children (e.g. such pointers will be valid for as long as the child scopes are alive). But if the parent is a global scope featuring the reachability-based deallocation described in (1), this assumption is no longer valid - if the region allocated in the parent is deemed unreachable, it can be collected, which means the parent region can become 'not alive' *before* the child scope is closed!!

So I'm starting to see the appeal of (2): a global scope is a scope that is always alive; if it's always alive that must mean that memory allocated inside it is never freed, and will outlive the memory regions allocated by any other scopes. This means that developers should use global scopes with care - knowing that stuff there will never really be deallocated (so it only really makes sense for 'global' memory regions). The same applies for merging a child scope into the global scope - the associated resources will stay alive forever.
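The shape of option (2), together with the idea of children checking the liveness of their parent recursively, can be sketched as below. All names (Scope, GlobalScope, ForkedScope) are hypothetical and only illustrate the ownership rules discussed here; allocation and merging are left out.

```java
// Hypothetical sketch of the ownership model; these are not Panama API types.
abstract class Scope {
    abstract boolean isAlive();

    abstract void close();

    /** Forking needs no shared state here, since no descendant list is kept. */
    final ForkedScope fork() {
        return new ForkedScope(this);
    }
}

final class GlobalScope extends Scope {
    static final GlobalScope INSTANCE = new GlobalScope();

    private GlobalScope() { }

    @Override
    boolean isAlive() {
        return true; // always alive: nothing allocated here is ever freed
    }

    @Override
    void close() {
        throw new UnsupportedOperationException("the global scope cannot be closed");
    }
}

final class ForkedScope extends Scope {
    private final Scope parent;
    // The forking thread is assumed to be the owner of the new scope.
    private final Thread owner = Thread.currentThread();
    // Volatile because memory access (and thus liveness checks) is not confined to the owner.
    private volatile boolean closed;

    ForkedScope(Scope parent) {
        this.parent = parent;
    }

    @Override
    boolean isAlive() {
        // A child is alive only while it and all of its ancestors are alive.
        return !closed && parent.isAlive();
    }

    @Override
    void close() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("close() is confined to the owner thread");
        }
        closed = true;
    }
}
```

With this shape the global scope carries no mutable state at all, so any thread can fork from it without synchronization, while closing a forked scope implicitly invalidates all of its descendants.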
All this seems to point in the following direction:

1) All forked scopes have an owner thread - fork/close/merge/allocate can only be called within that thread
2) Global scopes have no owner - all threads can call fork/allocate - close/merge are forbidden here
3) Underlying memory access is not restricted to the owner thread - multiple threads can synchronize (e.g. with CAS)
3b) If we want to, we can implement fully confined memory access - e.g. allow memory access only within the boundaries of the owner thread

Comments?

Maurizio

[1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html
[2] - https://hg.openjdk.java.net/loom/loom/file/cc783ba01af5/src/java.base/share/classes/java/lang/FiberScope.java#l636
[3] - https://hg.openjdk.java.net/loom/loom/file/cc783ba01af5/src/java.base/share/classes/java/lang/FiberScope.java#l617

From ardikars at gmail.com Tue May 21 04:26:47 2019
From: ardikars at gmail.com (Ardika Rommy Sanjaya)
Date: Tue, 21 May 2019 11:26:47 +0700
Subject: Implement no copy memory
In-Reply-To:
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl>
Message-ID: <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com>

> how are you calling the function? From the pcap doc, I see this:
>
> void got_packet(u_char *args, const struct pcap_pkthdr *header,
>     const u_char *packet);

Sorry, my mistake, fixed.

I'm using sun.misc.Unsafe to get the content data, but it's deprecated on Java 9+, and jdk.internal.misc.Unsafe is not exported.
If I use Panama, should I cast the Pointer before getting the content data?

For example, if I want to get 2 bytes and then 4 bytes with Unsafe, I can do it like below:

Unsafe UNSAFE = ...

UNSAFE.getShort(..)
UNSAFE.getInt(...)

Thanks and Regards

> On 20 May 2019, at 21.21, Maurizio Cimadamore wrote:
>
> Hi,
> how are you calling the function? From the pcap doc, I see this:
>
> void got_packet(u_char *args, const struct pcap_pkthdr *header,
>     const u_char *packet);
> Let's examine this in more detail.
> First, you'll notice that the function has a void return type. This is logical, because pcap_loop() wouldn't know how to handle a return value anyway. The first argument corresponds to the last argument of pcap_loop().
>
> So, if you see that buf.addr() == 0, that probably means you passed NULL as the last parameter of the pcap_loop call?
>
> Maurizio
>
> On 20/05/2019 14:56, Ardika Rommy Sanjaya wrote:
>> Thanks for the response, I really appreciate it.
>>
>> What I want is that, every time the pcap_handler::fn callback function is called, buf::addr() returns the address of the u_char *; after that I can use sun.misc.Unsafe to get the content data of the u_char * based on that address.
>>
>> Please refer to the code below.
>>
>> This is the libpcap API:
>>
>> int pcap_loop(pcap_t *, int, pcap_handler, u_char *);
>> typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *,
>>     const u_char *);
>>
>>
>> Below is the Java part, generated by jextract with a few changes:
>>
>> @NativeHeader(
>>     libraries = {"pcap"}
>> )
>> public interface pcap_mapping {
>>
>>     @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32")
>>     int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr);
>>
>>     @FunctionalInterface
>>     @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v")
>>     public interface pcap_handler {
>>         void fn(Pointer buf, Pointer pkthdr, Pointer arg) throws IllegalAccessException;
>>     }
>>
>> }
>>
>> Regards
>>
>>
>>> On 20 May 2019, at 20.30, Jorn Vernee wrote:
>>>
>>> Hi,
>>>
>>>> It looks like panama copies the value from 'u_char *' into 'Pointer
>>>> buf', correct or not?
>>> Pointer is just a wrapper for the native pointer value. Only the value of the pointer itself is copied, not the contents of the memory it points to.
>>>
>>>> Because when I call buf.addr() it returns 0,
>>>> what does zero mean?
>>> Zero means NULL.
>>>
>>>> How can I do this with no copy? Just return the address of 'u_char *' like
>>>> below in the callback method?
>>> Pointer::addr returns the value of the pointer. Getting the address of the pointer value itself, à la:
>>>
>>> uchar_t **bufRef = &buf;
>>>
>>> is currently not possible.
>>>
>>> Can you explain a bit more about your use case? We might be able to offer some suggestions.
>>>
>>> Cheers,
>>> Jorn
>>>
>>> Ardika Rommy Sanjaya wrote on 2019-05-20 01:33:
>>>> Hi,
>>>> There is a native function type in the libpcap library: typedef void
>>>> (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *);
>>>> When I use jextract it generates code like below:
>>>> @NativeHeader(...)
>>>> public interface pcap_h {
>>>> ...
>>>> @NativeLocation(...)
>>>> @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32")
>>>> int pcap_loop(Pointer p, int cnt, Callback
>>>> callback, Pointer usr);
>>>> @FunctionalInterface
>>>> @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v")
>>>> interface pcap_handler {
>>>> void fn(Pointer buf, Pointer pkthdr,
>>>> Pointer usr) throws IllegalAccessException;
>>>> }
>>>> ...
>>>> }
>>>> It looks like panama copies the value from 'u_char *' into 'Pointer
>>>> buf', correct or not? Because when I call buf.addr() it returns 0,
>>>> what does zero mean?
>>>> How can I do this with no copy? Just return the address of 'u_char *' like
>>>> below in the callback method?
>>>> void fn(long memoryAddressOf_u_char_ptr, Pointer pkthdr,
>>>> Pointer usr) throws IllegalAccessException;
>>>> Thanks and Regards,
>>>> Ardika Rommy Sanjaya

From ardikars at gmail.com Tue May 21 04:55:59 2019
From: ardikars at gmail.com (Ardika Rommy Sanjaya)
Date: Tue, 21 May 2019 11:55:59 +0700
Subject: Implement no copy memory
In-Reply-To: <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com>
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com>
Message-ID:

Update

> On 21 May 2019, at 11.26, Ardika Rommy Sanjaya wrote:
>
>> how are you calling the function?
From the pcap doc, I see this: >> >> void got_packet(u_char *args, const struct pcap_pkthdr *header, >> const u_char *packet); > Sorry, my mistake, fixed. > > I'm using sun.misc.Unsafe to get the content data and it's deprecated on java 9+ also jdk.intenal.misc.Unsafe not exported. > If using panama should I cast the Pointer before get the content data? Aside using direct ByteBuffer. > > For example if I want to get 2 byte and then 4 byte with Unsafe I can do like below: > > Unsafe UNSAFE = ... > > UNSAFE.getShort(..) > UNSAFE.getInt(...) > > Thanks and Regards > > > > >> On 20 May 2019, at 21.21, Maurizio Cimadamore > wrote: >> >> Hi, >> how are you calling the function? From the pcap doc, I see this: >> >> void got_packet(u_char *args, const struct pcap_pkthdr *header, >> const u_char *packet); >> Let's examine this in more detail. First, you'll notice that the function has a void return type. This is logical, because pcap_loop() wouldn't know how to handle a return value anyway. The first argument corresponds to the last argument of pcap_loop(). >> >> So, if you see that buf.addr() == 0, that probably means you passed NULL as last parameter of the pcap_loop call? >> >> Maurizio >> >> On 20/05/2019 14:56, Ardika Rommy Sanjaya wrote: >>> Thanks for response, I really appreciate. >>> >>> I want when every pcap_hander::fn callback function is called, buf::addr() returns the address of u_char *, after that I can use sun.misc.Unsafe to get the content data of u_char * based on that address. 
>>> >>> Please refer below code: >>> >>> This is libpcap api: >>> >>> int pcap_loop(pcap_t *, int, pcap_handler, u_char *); >>> typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, >>> const u_char *); >>> >>> >>> Below is java with part, generated by jextract with little changes: >>> >>> @NativeHeader( >>> libraries = {"pcap"} >>> ) >>> public interface pcap_mapping { >>> >>> @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32") >>> int pcap_loop(Pointer p, int cnt, Callback callback, Pointer usr); >>> >>> @FunctionalInterface >>> @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v") >>> public interface pcap_handler { >>> void fn(Pointer buf, Pointer pkthdr, Pointer arg) throws IllegalAccessException; >>> } >>> >>> } >>> >>> Regards >>> >>> >>>> On 20 May 2019, at 20.30, Jorn Vernee wrote: >>>> >>>> Hi, >>>> >>>>> It looks like pamana copy value from 'u_char *' into 'Pointer >>>>> buf', correct or not? >>>> Pointer is just a wrapper for the native pointer value. Only the value of the pointer itself is copied, not the contents of the memory it points to. >>>> >>>>> Because when I call buf.addr() it returns 0, >>>>> what zero mean? >>>> Zero means NULL. >>>> >>>>> How I can do with no copy? Just returns the address of 'u_char *' like >>>>> below in callback method? >>>> Pointer::addr returns the value of the pointer, to get the address of the pointer value itself, ? la: >>>> >>>> uchar_t **bufRef = &buf; >>>> >>>> This is currently not possible. >>>> >>>> Can you explain a bit more about your use case? We might be able to offer some suggestions. >>>> >>>> Cheers, >>>> Jorn >>>> >>>> Ardika Rommy Sanjaya schreef op 2019-05-20 01:33: >>>>> Hi, >>>>> There is native function in libpcap library: typedef void >>>>> (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *); >>>>> When I use jextract it generate code like below: >>>>> @NativeHeader(...) >>>>> public interface pcap_h { >>>>> ... 
>>>>> @NativeLocation(...) >>>>> @NativeFunction("(u64:${pcap}i32u64:(u64:u8u64:${pcap_pkthdr}u64:u8)vu64:u8)i32") >>>>> int pcap_loop(Pointer p, int cnt, Callback >>>>> callback, Pointer usr); >>>>> @FunctionalInterface >>>>> @NativeCallback("(u64:u8u64:${pcap_pkthdr}u64:u8)v") >>>>> interface pcap_handler { >>>>> void fn(Pointer buf, Pointer pkthdr, >>>>> Pointer usr) throws IllegalAccessException; >>>>> } >>>>> ... >>>>> } >>>>> It looks like pamana copy value from 'u_char *' into 'Pointer >>>>> buf', correct or not? Because when I call buf.addr() it returns 0, >>>>> what zero mean? >>>>> How I can do with no copy? Just returns the address of 'u_char *' like >>>>> below in callback method? >>>>> void fn(long memoryAddressOf_u_char_ptr, Pointer pkthdr, >>>>> Pointer usr) throws IllegalAccessException; >>>>> Thanks and Regards, >>>>> Ardika Rommy Sanjaya > From nick.gasson at arm.com Tue May 21 07:29:18 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 21 May 2019 15:29:18 +0800 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> Message-ID: Hi Maurizio On 20/05/2019 19:04, Maurizio Cimadamore wrote: > > I guess that means that, e.g. if your native function takes nothing and > returns a big struct, we will need to make a long array that is big > enough to contain all register values (which are unused in this case) > and fill it with SKIP steps, just in order to use the last step? Yes, that's exactly what happens. So the shuffle recipe looks like: ShuffleRecipe: { Operations: { [STOP, STOP, STOP, SKIP, SKIP, SKIP, SKIP, SKIP, SKIP, SKIP, SKIP, PULL, STOP, STOP, STOP, STOP, STOP, PULL, STOP, STOP] } } It works but it feels quite hacky. 
> > I think that could work - but at the same time, patch (2) seem more or > less in the spirit of what was done with X87 registers (which are only > supported on Intel platforms), so I can also live with that one, which > has the advantage of being cleaner. OK, I think patch [2] is neater as well. I'll squash that onto the main patch. > > I wouldn't bother with DirectInvoker for now; you can enable it in a > follow up patch if you find out what the issue is, and want to fix it - > but as you say, it's probably better to focus on linkToNative which is > gonna be the future here. > So the error I get is: java.lang.IllegalArgumentException: Invalid size: 36 at java.base/jdk.internal.foreign.memory.BoundedPointer.unsafeGetBits(BoundedPointer.java:331) at java.base/jdk.internal.foreign.memory.BoundedPointer.getBits(BoundedPointer.java:160) at java.base/jdk.internal.foreign.abi.DirectSignatureShuffler.structToLong(DirectSignatureShuffler.java:343) This happens because for >16 byte structs we now have an integer argument register that holds a pointer to the struct rather than the struct's data packed into that register. I can fix it with the simple change below, if that's acceptable? 
@@ -410,12 +410,17 @@ public class DirectSignatureShuffler {
             switch (binding.storage().getStorageClass()) {
                 case X87_RETURN_REGISTER:
                 case STACK_ARGUMENT_SLOT:
+                case INDIRECT_RESULT_REGISTER:
                     //arguments passed in memory not supported
                     return false;
                 case VECTOR_ARGUMENT_REGISTER:
                 case VECTOR_RETURN_REGISTER:
                     //avoid passing around floats as doubles as that leads to trouble
                     return (binding.argument().layout().bitsSize() / 8) == binding.storage().getSize();
+                case INTEGER_ARGUMENT_REGISTER:
+                    // On some platforms large by-value structures are passed by
+                    // pointer in integer argument registers
+                    return (binding.argument().layout().bitsSize() / 8) <= binding.storage().getSize();
                 default:
                     return true;
             }

Thanks,
Nick

From maurizio.cimadamore at oracle.com  Tue May 21 08:14:49 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 21 May 2019 09:14:49 +0100
Subject: Implement no copy memory
In-Reply-To: <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com>
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com>
Message-ID: <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com>

On 21/05/2019 05:26, Ardika Rommy Sanjaya wrote:
> If using panama should I cast the Pointer before get the content
> data?

In Panama you would do this:

Pointer buf = ...

Pointer ps = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT16);
short s = ps.get();

or...

Pointer pi = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT32);
int i = pi.get();

Note that you have to cast through VOID first, otherwise the Panama runtime will complain about incompatible cast. I believe we can relax these rules for cast compatibility a bit - we have some ideas in mind there [1].

[1] - https://mail.openjdk.java.net/pipermail/panama-dev/2019-February/004665.html

>
> For example if I want to get 2 byte and then 4 byte with Unsafe I can
> do like below:
>
> Unsafe UNSAFE = ...
>
> UNSAFE.getShort(..)
> UNSAFE.getInt(...)
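
For comparison, the "read 2 bytes, then 4 bytes" pattern from the question can also be sketched with nothing but java.nio.ByteBuffer, without Unsafe and without the Panama Pointer API. This is only an illustration under assumptions: the class name PacketRead, the helper readShortThenInt, the sample bytes and the choice of big-endian (network) byte order are all made up here. In a real pcap callback the buffer would have to wrap the native packet memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PacketRead {
    // Hypothetical helper: reads an unsigned 16-bit field followed by an
    // unsigned 32-bit field, starting at byte 'offset'. Absolute gets are
    // used, so the position of the caller's buffer is never changed.
    static long[] readShortThenInt(ByteBuffer buf, int offset) {
        ByteBuffer view = buf.duplicate().order(ByteOrder.BIG_ENDIAN);
        long s = view.getShort(offset) & 0xFFFFL;          // 2 bytes
        long i = view.getInt(offset + 2) & 0xFFFFFFFFL;    // next 4 bytes
        return new long[] { s, i };
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for packet bytes.
        ByteBuffer buf = ByteBuffer.allocateDirect(6);
        buf.put(new byte[] { 0x08, 0x00, 0x00, 0x00, 0x01, 0x00 });
        long[] fields = readShortThenInt(buf, 0);
        System.out.println(fields[0] + " " + fields[1]);
    }
}
```

Unlike the Unsafe calls, every access here is bounds-checked against the buffer limit, which is usually what you want when parsing untrusted packet data.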
> From maurizio.cimadamore at oracle.com Tue May 21 08:17:15 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 09:17:15 +0100 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> Message-ID: <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> On 21/05/2019 08:29, Nick Gasson wrote: > Hi Maurizio > > On 20/05/2019 19:04, Maurizio Cimadamore wrote: >> I guess that means that, e.g. if your native function takes nothing and >> returns a big struct, we will need to make a long array that is big >> enough to contain all register values (which are unused in this case) >> and fill it with SKIP steps, just in order to use the last step? > Yes, that's exactly what happens. So the shuffle recipe looks like: > > ShuffleRecipe: { > Operations: { > [STOP, STOP, STOP, SKIP, SKIP, SKIP, SKIP, SKIP, SKIP, SKIP, SKIP, PULL, STOP, STOP, STOP, STOP, STOP, PULL, STOP, STOP] } > } > > It works but it feels quite hacky. > >> I think that could work - but at the same time, patch (2) seem more or >> less in the spirit of what was done with X87 registers (which are only >> supported on Intel platforms), so I can also live with that one, which >> has the advantage of being cleaner. > OK, I think patch [2] is neater as well. I'll squash that onto the main > patch. Cool! >> I wouldn't bother with DirectInvoker for now; you can enable it in a >> follow up patch if you find out what the issue is, and want to fix it - >> but as you say, it's probably better to focus on linkToNative which is >> gonna be the future here. 
>> > So the error I get is: > > java.lang.IllegalArgumentException: Invalid size: 36 > at java.base/jdk.internal.foreign.memory.BoundedPointer.unsafeGetBits(BoundedPointer.java:331) > at java.base/jdk.internal.foreign.memory.BoundedPointer.getBits(BoundedPointer.java:160) > at java.base/jdk.internal.foreign.abi.DirectSignatureShuffler.structToLong(DirectSignatureShuffler.java:343) > > This happens because for >16 byte structs we now have an integer > argument register that holds a pointer to the struct rather than the > struct's data packed into that register. I can fix it with the simple > change below, if that's acceptable? > > @@ -410,12 +410,17 @@ public class DirectSignatureShuffler { > switch (binding.storage().getStorageClass()) { > case X87_RETURN_REGISTER: > case STACK_ARGUMENT_SLOT: > + case INDIRECT_RESULT_REGISTER: > //arguments passed in memory not supported > return false; > case VECTOR_ARGUMENT_REGISTER: > case VECTOR_RETURN_REGISTER: > //avoid passing around floats as doubles as that leads to trouble > return (binding.argument().layout().bitsSize() / 8) == binding.storage().getSize(); > + case INTEGER_ARGUMENT_REGISTER: > + // On some platforms large by-value structures are passed by > + // pointer in integer argument registers > + return (binding.argument().layout().bitsSize() / 8) <= binding.storage().getSize(); > default: > return true; > } Yep - that will disable the 'direct' approach when you have big structs in the indirect register, which is probably for the best (for now). 
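
The effect of the patched switch can be looked at in isolation. The sketch below is not the real DirectSignatureShuffler code: the enum, the class name DirectEligibility and the method canUseDirect are simplified stand-ins invented for illustration, and the 8-byte register size is an assumption. It only mirrors the size comparisons from the diff.

```java
public class DirectEligibility {
    // Simplified stand-in for the storage classes named in the diff.
    enum StorageClass {
        INTEGER_ARGUMENT_REGISTER, INTEGER_RETURN_REGISTER,
        VECTOR_ARGUMENT_REGISTER, VECTOR_RETURN_REGISTER,
        STACK_ARGUMENT_SLOT, X87_RETURN_REGISTER, INDIRECT_RESULT_REGISTER
    }

    // Returns true if a binding of 'layoutBits' bits, placed in storage of
    // 'storageBytes' bytes, would be accepted by the patched switch.
    static boolean canUseDirect(StorageClass sc, long layoutBits, long storageBytes) {
        switch (sc) {
            case X87_RETURN_REGISTER:
            case STACK_ARGUMENT_SLOT:
            case INDIRECT_RESULT_REGISTER:
                // arguments passed in memory are not supported
                return false;
            case VECTOR_ARGUMENT_REGISTER:
            case VECTOR_RETURN_REGISTER:
                // the value must fill the storage exactly (no float-as-double)
                return layoutBits / 8 == storageBytes;
            case INTEGER_ARGUMENT_REGISTER:
                // a struct larger than the register is passed by pointer
                return layoutBits / 8 <= storageBytes;
            default:
                return true;
        }
    }

    public static void main(String[] args) {
        // The 36-byte struct from the stack trace, assuming an 8-byte register.
        System.out.println(canUseDirect(StorageClass.INTEGER_ARGUMENT_REGISTER, 36 * 8, 8));
        // A value that fits in the register is still eligible.
        System.out.println(canUseDirect(StorageClass.INTEGER_ARGUMENT_REGISTER, 64, 8));
    }
}
```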
>
> Thanks,
> Nick

From maurizio.cimadamore at oracle.com  Tue May 21 08:25:31 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 21 May 2019 09:25:31 +0100
Subject: [foreign] RFR: 8224244: Cleanup libclang Java API
In-Reply-To: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com>
References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com>
Message-ID:

Looks very good - couple of comments:

* I think it would be handy to collect all the translation unit options into an enum - I'm referring to these:

private final static int CXTranslationUnit_DetailedPreprocessingRecord = 0x01;

That would have been really helpful when writing the reparsing code, when often I had to browse the clang headers to look for these.

* it seems to me that if the parser catches the index exception and then wraps it as a runtime exception (IllegalStateException) then you can revert most of the changes to the other classes? Also, another thing to consider: maybe we wanna make the parsing exception an unchecked exception? After all, there's not much recovery that can be done on those, forcing all clients to catch it seems excessive?

Maurizio

On 20/05/2019 21:53, Henry Jen wrote:
> Hi,
>
> Please review the webrev[1] for clang Java API cleanup that came up with earlier work on atomic type support.
> Mainly to throw a checked exception on parsing error, and move translation unit APIs into TranslationUnit class.
> > Cheers, > Henry > > [1] http://cr.openjdk.java.net/~henryjen/panama/8224244/0/webrev/ > [2] https://bugs.openjdk.java.net/browse/JDK-8224244 From fweimer at redhat.com Tue May 21 08:49:16 2019 From: fweimer at redhat.com (Florian Weimer) Date: Tue, 21 May 2019 10:49:16 +0200 Subject: Implement no copy memory In-Reply-To: <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com> (Maurizio Cimadamore's message of "Tue, 21 May 2019 09:14:49 +0100") References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com> <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com> Message-ID: <87lfz0dwfn.fsf@oldenburg2.str.redhat.com> * Maurizio Cimadamore: > On 21/05/2019 05:26, Ardika Rommy Sanjaya wrote: >> If using panama should I cast the Pointer before get the >> content data? > > In Panama you would do this: > > Pointer buf = ... > > Pointer ps = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT16); > short s = ps.get(); > > or... > > Pointer pi = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT32); > int i = pi.get(); > > Note that you have to cast through VOID first, otherwise the Panama > runtime will complain about incompatible cast. Wouldn't it make sense for packet parsing to construct a ByteBuffer first, and do the parsing using that? Thanks, Florian From maurizio.cimadamore at oracle.com Tue May 21 09:40:54 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 10:40:54 +0100 Subject: Implement no copy memory In-Reply-To: <87lfz0dwfn.fsf@oldenburg2.str.redhat.com> References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com> <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com> <87lfz0dwfn.fsf@oldenburg2.str.redhat.com> Message-ID: On 21/05/2019 09:49, Florian Weimer wrote: > * Maurizio Cimadamore: > >> On 21/05/2019 05:26, Ardika Rommy Sanjaya wrote: >>> If using panama should I cast the Pointer before get the >>> content data? 
>> In Panama you would do this:
>>
>> Pointer buf = ...
>>
>> Pointer ps = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT16);
>> short s = ps.get();
>>
>> or...
>>
>> Pointer pi = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT32);
>> int i = pi.get();
>>
>> Note that you have to cast through VOID first, otherwise the Panama
>> runtime will complain about incompatible cast.
> Wouldn't it make sense for packet parsing to construct a ByteBuffer
> first, and do the parsing using that?

It really depends on how low-level you want to go. If packet parsing is what you want to do, it might be possible to describe the layout of the packet you want to parse using a 'Layout'; you could then overlay a custom struct layout onto the pointer, and access memory via ordinary getter calls.

If you want to go the byte-buffer way, there's a way to do that too, using the Pointer::asDirectByteBuffer method.

http://hg.openjdk.java.net/panama/dev/file/c65ecdbf4155/src/java.base/share/classes/java/foreign/memory/Pointer.java#l217

The new memory access-centric API we're working on will also be a solid option, moving forward, to read (or write) from native memory - through the VarHandle API [1]. With that API you can build a VarHandle to dereference an address at a given layout element - so if you know the layout of the region you are accessing, you can get a bunch of VarHandles to do the reads/writes and the JIT will be able to get quite good peak performance (we're currently some 15% behind raw usage of Unsafe, with room to improve).

[1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html

>
> Thanks,
> Florian

From jbvernee at xs4all.nl  Tue May 21 09:59:45 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 11:59:45 +0200
Subject: [foreign] RFR 8224478: TestSrcDump fails on Windows
Message-ID: <829b11309607858a9f03ca0e619631e8@xs4all.nl>

Hi,

After a recent patch that adds a test that creates a symlink [1], I'm seeing a few test failures.
Creating symlinks is a privileged action on Windows, so this test fails with a FileSystemException, and this in turn seems to cause some other tests to fail as well.

Bug: https://bugs.openjdk.java.net/browse/JDK-8224478
Webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224478/webrev.00/

This moves the symlink test into its own jtreg test file, which uses `@requires os.family != "windows"`.

Thanks,
Jorn

[1] : https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005438.html

From maurizio.cimadamore at oracle.com  Tue May 21 10:01:31 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 21 May 2019 11:01:31 +0100
Subject: [foreign] RFR 8224478: TestSrcDump fails on Windows
In-Reply-To: <829b11309607858a9f03ca0e619631e8@xs4all.nl>
References: <829b11309607858a9f03ca0e619631e8@xs4all.nl>
Message-ID: <131d694a-7e7d-4c76-f1ef-74c325ea6a92@oracle.com>

Looks good!

Maurizio

On 21/05/2019 10:59, Jorn Vernee wrote:
> Hi,
>
> After a recent patch that adds a test that creates a symlink [1], I'm
> seeing a few test failures.
>
> Creating symlinks is a privileged action on Windows, so this test
> fails with a FileSystemException, and this in turn seems to cause some
> other tests to fail as well.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8224478
> Webrev:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224478/webrev.00/
>
> This moves the symlink test into its own jtreg test file, which uses
> `@requires os.family != "windows"`.
>
> Thanks,
> Jorn
>
> [1] :
> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005438.html

From jbvernee at xs4all.nl  Tue May 21 10:08:42 2019
From: jbvernee at xs4all.nl (jbvernee at xs4all.nl)
Date: Tue, 21 May 2019 10:08:42 +0000
Subject: hg: panama/dev: 8224478: TestSrcDump fails on Windows
Message-ID: <201905211008.x4LA8hII023717@aojmv0008.oracle.com>

Changeset: 5ea3089be5ac
Author: jvernee
Date: 2019-05-21 12:08 +0200
URL: http://hg.openjdk.java.net/panama/dev/rev/5ea3089be5ac

8224478: TestSrcDump fails on Windows
Reviewed-by: mcimadamore

! test/jdk/com/sun/tools/jextract/TestSrcDump.java
+ test/jdk/com/sun/tools/jextract/TestSymlink.java

From maurizio.cimadamore at oracle.com  Tue May 21 10:14:51 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Tue, 21 May 2019 10:14:51 +0000
Subject: hg: panama/dev: Automatic merge with foreign
Message-ID: <201905211014.x4LAEpK8029695@aojmv0008.oracle.com>

Changeset: 03bdb326a3bc
Author: mcimadamore
Date: 2019-05-21 12:14 +0200
URL: http://hg.openjdk.java.net/panama/dev/rev/03bdb326a3bc

Automatic merge with foreign

From maurizio.cimadamore at oracle.com  Tue May 21 11:52:38 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 21 May 2019 12:52:38 +0100
Subject: [foreign-memaccess] RFR 8224483: Split MemoryAddress into separate address/region abstractions
Message-ID: <82361d8b-b55b-e6c1-34cc-ed7d7e580a8b@oracle.com>

Following discussion at [1], I decided to go ahead and split MemoryAddress into two abstractions:

* MemoryAddress will now embody an offset into a...
* MemorySegment, which represents a contiguous region of memory

The name MemorySegment came from an internal discussion with Brian, where he pointed out that MemoryScope and MemoryRegion were a bit too overlapping, in the English sense of the word. MemorySegment suggests something that has lower/upper boundaries.
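
The address/segment split described above can be pictured with a toy model. This is not the API from the webrev: the classes Segment and Address, their fields, and the checkAccess method are hypothetical names made up for this sketch. It only illustrates the idea that an address is an offset into a segment, and that the segment's lower and upper boundaries are what bound every access.

```java
public class SegmentSketch {
    // Hypothetical model of a contiguous region: base plus length in bytes.
    static final class Segment {
        final long base;
        final long length;

        Segment(long base, long length) {
            if (length < 0) throw new IllegalArgumentException("negative length");
            this.base = base;
            this.length = length;
        }

        Address baseAddress() { return new Address(this, 0); }
    }

    // Hypothetical model of an address: an offset into one segment.
    static final class Address {
        final Segment segment;
        final long offset;

        Address(Segment segment, long offset) {
            this.segment = segment;
            this.offset = offset;
        }

        // Moving an address never leaves its segment's frame of reference.
        Address offset(long delta) { return new Address(segment, offset + delta); }

        // Throws if [offset, offset + size) is not inside the segment.
        void checkAccess(long size) {
            if (offset < 0 || size < 0 || offset > segment.length - size) {
                throw new IndexOutOfBoundsException(
                        "offset " + offset + ", size " + size + ", segment length " + segment.length);
            }
        }
    }

    public static void main(String[] args) {
        Segment seg = new Segment(0x1000, 16);
        Address a = seg.baseAddress().offset(12);
        a.checkAccess(4);   // bytes 12..15: inside the segment
        try {
            a.checkAccess(8);   // bytes 12..19: past the upper boundary
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```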
Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224483/ There are many changes since the last patch discussed - mostly because, as I was adding the lower-level allocate methods in MemoryScope I realized that we were missing a lot of checks (w.r.t. alignment and sizes). I've added those and added tests for these as well (2 new tests have been added). Also, the Scope::allocate API should reflect the fact that Unsafe might fail to allocate - again added a test for this. Maurizio From maurizio.cimadamore at oracle.com Tue May 21 12:00:20 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 13:00:20 +0100 Subject: [foreign] RFR 8224478: TestSrcDump fails on Windows In-Reply-To: <131d694a-7e7d-4c76-f1ef-74c325ea6a92@oracle.com> References: <829b11309607858a9f03ca0e619631e8@xs4all.nl> <131d694a-7e7d-4c76-f1ef-74c325ea6a92@oracle.com> Message-ID: <742e5119-5268-2719-19db-ce9615d3fef4@oracle.com> Question: why was this test passing on our build/test infra? ----------System.out:(11/332)---------- [TestNG] Running: com/sun/tools/jextract/TestSrcDump.java test TestSrcDump.testNoPkg(): success test TestSrcDump.testTargetDir(): success =============================================== com/sun/tools/jextract/TestSrcDump.java Total tests run: 2, Failures: 0, Skips: 0 =============================================== This is on Win X64, obtained yesterday. Maurizio On 21/05/2019 11:01, Maurizio Cimadamore wrote: > Looks good! > > Maurizio > > On 21/05/2019 10:59, Jorn Vernee wrote: >> Hi, >> >> After recent patch that adds a test that creates a symlink [1]. I'm >> seeing a few test failures. >> >> Creating symlinks is a priviledges action on Windows, so this test >> fails with a FileSystemException, and this in turn seems to cause >> some other tests to fail as well. 
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224478
>> Webrev:
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224478/webrev.00/
>>
>> This moves the symlink test into its own jtreg test file, which uses
>> `@requires os.family != "windows"`.
>>
>> Thanks,
>> Jorn
>>
>> [1] :
>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005438.html

From ardikars at gmail.com  Tue May 21 12:09:17 2019
From: ardikars at gmail.com (Ardika Rommy Sanjaya)
Date: Tue, 21 May 2019 19:09:17 +0700
Subject: Implement no copy memory
In-Reply-To:
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com> <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com> <87lfz0dwfn.fsf@oldenburg2.str.redhat.com>
Message-ID:

Hi,

I want to do the packet parsing manually (not using Layout).

>> Wouldn't it make sense for packet parsing to construct a ByteBuffer
>> first, and do the parsing using that?

Yes, I saw that DirectByteBuffer uses jdk.internal.misc.Unsafe. But why is jdk.internal.misc.Unsafe not exported in module-info.java? Is it for security reasons?
Hopefully one day it will be exported, because sun.misc.Unsafe is deprecated.

Thanks in advance

> On 21 May 2019, at 16.40, Maurizio Cimadamore wrote:
>
>
> On 21/05/2019 09:49, Florian Weimer wrote:
>> * Maurizio Cimadamore:
>>
>>> On 21/05/2019 05:26, Ardika Rommy Sanjaya wrote:
>>>> If using panama should I cast the Pointer before get the
>>>> content data?
>>> In Panama you would do this:
>>>
>>> Pointer buf = ...
>>>
>>> Pointer ps = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT16);
>>> short s = ps.get();
>>>
>>> or...
>>>
>>> Pointer pi = buf.cast(NativeTypes.VOID).cast(NativeTypes.UINT32);
>>> int i = pi.get();
>>>
>>> Note that you have to cast through VOID first, otherwise the Panama
>>> runtime will complain about incompatible cast.
>> Wouldn't it make sense for packet parsing to construct a ByteBuffer
>> first, and do the parsing using that?
> > It really depends on how low-level you want to go. If it's packet parsing what you want to do, it might be possible to describe the layout of the packet you want to parse using a 'Layout' and then you could overlay a custom strcut layout onto the pointer, and then access memory via ordinary getter calls. > > If you want to go the byte-buffer way, there's a way to do that too, using the Pointer::asDirectByteBuffer method. > > http://hg.openjdk.java.net/panama/dev/file/c65ecdbf4155/src/java.base/share/classes/java/foreign/memory/Pointer.java#l217 > > The new memory access-centric API we're working on will also be solid option, moving forward, to read (or write) from native memory - through the VarHandle API [1]. With that API you can build a VarHandle to dereference an address at a given layout element - so if you know the layout of the region you are accessing, you can get a bunch of VarHandle to do the read/write and the JIT will be able to get quite good peak performances (we're currently some 15% behind raw usage of Unsafe, with room to improve). > > [1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html > >> >> Thanks, >> Florian From jbvernee at xs4all.nl Tue May 21 12:11:27 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Tue, 21 May 2019 14:11:27 +0200 Subject: [foreign] RFR 8224478: TestSrcDump fails on Windows In-Reply-To: <742e5119-5268-2719-19db-ce9615d3fef4@oracle.com> References: <829b11309607858a9f03ca0e619631e8@xs4all.nl> <131d694a-7e7d-4c76-f1ef-74c325ea6a92@oracle.com> <742e5119-5268-2719-19db-ce9615d3fef4@oracle.com> Message-ID: <6eeeb73cb4b4e71dcb85156da06052eb@xs4all.nl> Maybe you're running the test in an elevated shell on the infra? I was getting something like "a required privilege is not held by the client". I'm on an admin account, but have things set to deny access without explicit permission, which AFAIK is the default for most Windows distros. 
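
Whether a given account or shell is allowed to create symlinks can be probed directly. The snippet below is a sketch only, using standard java.nio.file calls; the class name SymlinkProbe, the method canCreateSymlinks and the probe file names are invented for illustration and are not part of the jextract tests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkProbe {
    // Tries to create a symlink inside 'dir' and reports whether it worked.
    // On Windows without the required privilege, createSymbolicLink throws a
    // FileSystemException (a subclass of IOException), which is mapped to false.
    static boolean canCreateSymlinks(Path dir) {
        Path target = dir.resolve("symlink-probe-target");
        Path link = dir.resolve("symlink-probe-link");
        try {
            Files.createFile(target);
            Files.createSymbolicLink(link, target);
            return true;
        } catch (IOException | UnsupportedOperationException | SecurityException e) {
            return false;
        } finally {
            // Best-effort cleanup of the probe files.
            try {
                Files.deleteIfExists(link);
                Files.deleteIfExists(target);
            } catch (IOException ignored) {
                // nothing sensible to do in a probe
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-probe");
        System.out.println("symlinks allowed: " + canCreateSymlinks(dir));
        Files.deleteIfExists(dir);
    }
}
```

Running this once from a normal shell and once from an elevated one would show whether the difference between the local run and the infra run comes down to privileges.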
Jorn Maurizio Cimadamore schreef op 2019-05-21 14:00: > Question: why was this test passing on our build/test infra? > > ----------System.out:(11/332)---------- > [TestNG] Running: > com/sun/tools/jextract/TestSrcDump.java > > test TestSrcDump.testNoPkg(): success > test TestSrcDump.testTargetDir(): success > > =============================================== > com/sun/tools/jextract/TestSrcDump.java > Total tests run: 2, Failures: 0, Skips: 0 > =============================================== > > This is on Win X64, obtained yesterday. > > Maurizio > > On 21/05/2019 11:01, Maurizio Cimadamore wrote: > >> Looks good! >> >> Maurizio >> >> On 21/05/2019 10:59, Jorn Vernee wrote: >> >>> Hi, >>> >>> After recent patch that adds a test that creates a symlink [1]. >>> I'm seeing a few test failures. >>> >>> Creating symlinks is a priviledges action on Windows, so this test >>> fails with a FileSystemException, and this in turn seems to cause >>> some other tests to fail as well. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224478 >>> Webrev: >>> >> > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224478/webrev.00/ >>> >>> >>> This moves the symlink test into it's own jtreg test file, which >>> uses `@requires os.family != "windows"`. >>> >>> Thanks, >>> Jorn >>> >>> [1] : >>> >> > https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005438.html From fweimer at redhat.com Tue May 21 12:11:21 2019 From: fweimer at redhat.com (Florian Weimer) Date: Tue, 21 May 2019 14:11:21 +0200 Subject: Implement no copy memory In-Reply-To: (Ardika Rommy Sanjaya's message of "Tue, 21 May 2019 19:09:17 +0700") References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com> <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com> <87lfz0dwfn.fsf@oldenburg2.str.redhat.com> Message-ID: <87pnocatxy.fsf@oldenburg2.str.redhat.com> * Ardika Rommy Sanjaya: > Hi, > > I want to do packet parsing manualy (not using Layout). 
>
>> Wouldn't it make sense for packet parsing to construct a ByteBuffer
>> first, and do the parsing using that?
>
> Yes, I saw that DirectByteBuffer uses jdk.internal.misc.Unsafe. But
> why is jdk.internal.misc.Unsafe not exported in module-info.java? Is
> it for security reasons? Hopefully one day it will be exported,
> because sun.misc.Unsafe is deprecated.

Maurizio has already mentioned the supported way to get access to a byte buffer:

> If you want to go the byte-buffer way, there's a way to do that too,
> using the Pointer::asDirectByteBuffer method.
>
> http://hg.openjdk.java.net/panama/dev/file/c65ecdbf4155/src/java.base/share/classes/java/foreign/memory/Pointer.java#l217

Thanks,
Florian

From maurizio.cimadamore at oracle.com  Tue May 21 12:26:32 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 21 May 2019 13:26:32 +0100
Subject: Implement no copy memory
In-Reply-To:
References: <263f14f344e64d1d62e91ac29c0b900d@xs4all.nl> <3B9E2162-8699-4339-9F9B-19CDF9D6EB4F@gmail.com> <012fd1ff-42e8-cbe5-e5fe-d73879706227@oracle.com> <87lfz0dwfn.fsf@oldenburg2.str.redhat.com>
Message-ID: <1484f0bb-a3db-90a0-366a-5552b7918e72@oracle.com>

On 21/05/2019 13:09, Ardika Rommy Sanjaya wrote:
> Yes, I saw that DirectByteBuffer uses jdk.internal.misc.Unsafe. But
> why is jdk.internal.misc.Unsafe not exported in module-info.java? Is
> it for security reasons?
> Hopefully one day it will be exported, because sun.misc.Unsafe is deprecated.
>

jdk.internal.misc.Unsafe is not exported because it is an internal API, which got encapsulated in the transition from Java 8 to Java 9. You can read all about it in this JEP:

https://openjdk.java.net/jeps/260

The path forward is not to export jdk.internal.misc.Unsafe, but to create viable alternatives for the functionality available in that API. Efforts such as the low level memory access API [1], or attempts at defining a safer version of Unsafe.defineAnonymousClass [2] go in this direction.
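
One such alternative already exists in the standard library: since Java 9, java.lang.invoke.MethodHandles can produce VarHandles that view a ByteBuffer as short or int elements at arbitrary byte indices. The sketch below uses that existing facility, not the new memory access API from [1]; the class name VarHandleRead, the helper names and the sample bytes are assumptions made for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class VarHandleRead {
    // VarHandles that read a short or an int at a given byte index of a
    // ByteBuffer, in big-endian (network) order.
    static final VarHandle SHORT_AT =
            MethodHandles.byteBufferViewVarHandle(short[].class, ByteOrder.BIG_ENDIAN);
    static final VarHandle INT_AT =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);

    static short readShort(ByteBuffer buf, int byteIndex) {
        return (short) SHORT_AT.get(buf, byteIndex);
    }

    static int readInt(ByteBuffer buf, int byteIndex) {
        // Plain get is allowed at misaligned indices such as 2.
        return (int) INT_AT.get(buf, byteIndex);
    }

    public static void main(String[] args) {
        // Made-up sample bytes: a 2-byte field followed by a 4-byte field.
        ByteBuffer buf = ByteBuffer.allocateDirect(6);
        buf.put(new byte[] { 0x12, 0x34, 0x00, 0x00, 0x00, 0x2A });
        System.out.println(readShort(buf, 0) + " " + readInt(buf, 2));
    }
}
```

This covers the same getShort/getInt use case as the Unsafe snippet quoted earlier in the thread, with bounds checks and without any internal API.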
[1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html [2] - https://openjdk.java.net/jeps/181 Maurizio From maurizio.cimadamore at oracle.com Tue May 21 13:14:36 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 14:14:36 +0100 Subject: [foreign] RFR 8224478: TestSrcDump fails on Windows In-Reply-To: <6eeeb73cb4b4e71dcb85156da06052eb@xs4all.nl> References: <829b11309607858a9f03ca0e619631e8@xs4all.nl> <131d694a-7e7d-4c76-f1ef-74c325ea6a92@oracle.com> <742e5119-5268-2719-19db-ce9615d3fef4@oracle.com> <6eeeb73cb4b4e71dcb85156da06052eb@xs4all.nl> Message-ID: <25b10e3b-7c73-a32c-8ae5-a8761bb366b7@oracle.com> Yeah I'm asking around about that, in case that should happen again Maurizio On 21/05/2019 13:11, Jorn Vernee wrote: > Maybe you're running the test in an elevated shell on the infra? > > I was getting something like "a required privilege is not held by the > client". > > I'm on an admin account, but have things set to deny access without > explicit permission, which AFAIK is the default for most Windows distros. > > Jorn > > Maurizio Cimadamore schreef op 2019-05-21 14:00: >> Question: why was this test passing on our build/test infra? >> >> ----------System.out:(11/332)---------- >> [TestNG] Running: >> ? com/sun/tools/jextract/TestSrcDump.java >> >> test TestSrcDump.testNoPkg(): success >> test TestSrcDump.testTargetDir(): success >> >> =============================================== >> com/sun/tools/jextract/TestSrcDump.java >> Total tests run: 2, Failures: 0, Skips: 0 >> =============================================== >> >> This is on Win X64, obtained yesterday. >> >> Maurizio >> >> On 21/05/2019 11:01, Maurizio Cimadamore wrote: >> >>> Looks good! >>> >>> Maurizio >>> >>> On 21/05/2019 10:59, Jorn Vernee wrote: >>> >>>> Hi, >>>> >>>> After recent patch that adds a test that creates a symlink [1]. >>>> I'm seeing a few test failures. 
>>>>
>>>> Creating symlinks is a privileged action on Windows, so this test
>>>> fails with a FileSystemException, and this in turn seems to cause
>>>> some other tests to fail as well.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224478
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224478/webrev.00/
>>>>
>>>> This moves the symlink test into its own jtreg test file, which
>>>> uses `@requires os.family != "windows"`.
>>>>
>>>> Thanks,
>>>> Jorn
>>>>
>>>> [1] :
>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005438.html

From jbvernee at xs4all.nl Tue May 21 13:41:05 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 15:41:05 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
Message-ID: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl>

Hi,

After the recent string of benchmarking [1], I've arrived at 2 optimizations to improve the speed of the measured code path.

1.) Specialization of Struct getter MethodHandles per struct class.
2.) Implementation of RuntimeSupport::casterImpl that does a fused cast and offset operation, to avoid creating multiple Pointer objects.

The benchmark:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
The optimizations:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/

I've split these into 2 so that it's easier to run the benchmarks with and without the optimizations. (benchmark uses the OpenJDK's builtin framework [2]).

Since we're now more eagerly instantiating the struct impl class I had to work around partial struct types, since spinning the impl requires a non-partial type and now we're spinning the impl when creating the LayoutType for the struct, as opposed to on the first dereference. To do this I'm detecting whether the struct is partial in LayoutType.ofStruct, and using a Reference.OfGrumpy in the case where it cannot be resolved.
Tbh, I think this makes things a little more clear as well as far as where/how the exception for deref of a partial type is thrown.

Results on my machine before the optimization are:

Benchmark                        Mode  Cnt    Score    Error  Units
GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op

And after:

Benchmark                        Mode  Cnt   Score   Error  Units
GetStruct.jni_baseline           avgt   50  13.941 ± 0.485  ns/op
GetStruct.panama_get_both        avgt   50  41.199 ± 1.632  ns/op
GetStruct.panama_get_fieldonly   avgt   50  33.432 ± 1.889  ns/op
GetStruct.panama_get_structonly  avgt   50  13.469 ± 0.781  ns/op

Where panama_get_structonly corresponds to 1., and panama_get_fieldonly corresponds to 2. For a total of about 12x speedup.

Thanks,
Jorn

[1] : https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
[2] : https://openjdk.java.net/jeps/230

From jbvernee at xs4all.nl Tue May 21 14:17:52 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 16:17:52 +0200
Subject: [foreign-memaccess] scopes and thread confinement
In-Reply-To: <39c276ed-1cbc-2de0-a44a-4867d0c61f75@oracle.com>
References: <39c276ed-1cbc-2de0-a44a-4867d0c61f75@oracle.com>
Message-ID: <9ebdc2818816b025471e14786b83c6c9@xs4all.nl>

> Then the question is - what happens to resources allocated inside a
> global scopes (or resources merged _into_ it) ? I see two options
> here:
>
> 1) we could install an automagic memory collector - which frees memory
> as soon as the allocated region goes out of scope
> 2) we do nothing - if you allocate on the global scope no deallocation
> occurs - the region stays alive until the VM exits

I like 2. as well. I think memory allocated through the global scope always staying alive is more intuitive as well.
(plus dangling pointer concerns when shared pointers with native later on) > Now, can we imagine a global scope implementation that require no > synchronization? I think that could be doable: > > * instead of going for a sophisticated allocation scheme which > minimizes the calls to Unsafe, we could just call Unsafe once for > every call to MemoryScope::allocate, so that no shared state is used > * we can easily remove the descendant list, and have all children > check liveness of their parent, recursively (rather than having the > parent closing sub-scopes recursively) I'm not sure about the second point. The descendant list could be removed, but forked scopes could keep it right (so no need for recursive checks)? Jorn Maurizio Cimadamore schreef op 2019-05-21 00:45: > Hi, > the MemoryScope abstraction is used to carry out 'liveness' checks - > that is, to make sure that dereference operation on MemoryAddress > occur while the address still points to a valid region. > > To be able to get to the same level of performances as > Unsafe::get/put, we need to be able to hoist the liveness check away > in the JIT; but there's a problem here: even if the JIT can see > through a bunch of code using addresses allocated by a given scope, it > has to conservatively assume that another thread might chime in, and > close the scope behind our back. > > This makes it impossible for the JIT to completely optimize away the > check. For this reason, in my document [1] I posited the existence of > a 'confined' scope, whose existence is bound to a given thread (and > maybe, in the future, a fiber). I've been thinking a lot lately on how > we'd like to expose this confinement model - I have arrived to some > conclusions, but I still have some question marks, which I'll try to > explain here. 
> > First, I think it's important to realize that there are two aspects to > confinement: > > 1) access to scope critical operation, such as fork, allocate and > terminal operations such as close/merge > 2) read/write access to the underlying memory allocated using a scope > > While both aspects are important, they are not _equally so_. So much > so that I'd like here to entertain the possibility that we assign each > scope a thread (or fiber) owner, which is then used to confine access > to critical scope operations, as in (1). That is - let's treat the > critical scope operation as confined - to make sure that a scope can > never be closed behind our back - but let's leave the memory region > open so that many threads can access it concurrently. After all, the > VarHandle API offers very good atomic read/write CAS-like operations, > which can be used to implement any kind of synchronization atop. (Of > course we could also expose an opt-in scope charateristics which, > additionally, confines reads and writes to the given owner thread, as > in (2), but that aspect seems less important in the scope of this > discussion). > > If we pull the string on this model, we soon encounter a road-block: > what is a global scope? While it seem reasonable for an explicitly > forked scope to have an owner thread, how can a global scope have an > owner, since it's created with the VM and dies with it? And, if the > global scope had a fictional owner (the thread which created it) how > could other threads do anything with the global scope? > > Not surprisingly, this duality between confined and global scopes is > also present in the FiberScope API (in project Loom); there, we have > two kind of FiberScope implementations, one is called FiberScopeImpl > [2] and is the default, effectively confined one (so, you will get an > exception when performing operations from another strand). 
The other > is called DetachedFiberScope and is shared and can be used by all > threads - it can be thought as the root in the fiber scope ownership > model. > > This is all quite similar to what we're aiming to get at - but with > one complication; in the case of FiberScope, having a shared scope > implementation is not really that problematic, as a detached scope > implementation doesn't have any mutable state and doesn't require any > kind of synchronization. But in the case of Panama scopes, there are > many things that can go wrong: > > * a scope keeps a list of all the allocated memory blocks, for reused > - if accessed concurrently the scope can get corrupted > * a scope keeps a list of all the descendants - again, if a scope is > forked concurrently bad things will happen here > > The good thing is that a global memory scope cannot be closed (like a > detached scope in Loom), so we don't have to worry about threads > concurrently calling close(). > > Now, can we imagine a global scope implementation that require no > synchronization? I think that could be doable: > > * instead of going for a sophisticated allocation scheme which > minimizes the calls to Unsafe, we could just call Unsafe once for > every call to MemoryScope::allocate, so that no shared state is used > * we can easily remove the descendant list, and have all children > check liveness of their parent, recursively (rather than having the > parent closing sub-scopes recursively) > > Then the question is - what happens to resources allocated inside a > global scopes (or resources merged _into_ it) ? 
I see two options > here: > > 1) we could install an automagic memory collector - which frees memory > as soon as the allocated region goes out of scope > 2) we do nothing - if you allocate on the global scope no deallocation > occurs - the region stays alive until the VM exits > > I was initially leaning towards (1); I liked the idea of having _some_ > deallocation strategy for globally allocated resources; but then I > started to realize that this semantics difference between global > scopes and regular scope was not without its own issues: > > * what happens when a memory region is 'resized'? In that case we > create a new region - but if the old one is then collected, the new > region will just contain garbage! > > * what happens when we merge into a global scope from a child scope? > We go from a deterministic deallocation behavior to a > reachability-based one; this could be very confusing for users! > > * when working with scopes we can assume the resources of a parent > will outlive those of the children - so that it is safe to copy > pointers from the parent to the children (e.g. such pointers will be > valid for as long as the children scope are alive). But if the parent > is a global scope featuring the reachability-based deallocation > described in (1), this assumption is no longer valid - if the region > allocated in the parent is deemed unreachable, it can be collected, > which means the parent region can become 'not alive' *before* the > children scope is closed!! > > So I'm starting to see the appeal of (2): a global scope is a scope > that is always alive; if it's always alive that must mean that memory > allocated inside it is never freed, and will outlive the memory > regions allocated by any other scopes. This means that developers > should use global scopes with care - knowing that stuff there will > never really be deallocated (so it only really makes sense for > 'global' memory regions). 
Same applies for merging a child scope into
> the global scope - the associated resources will stay alive forever.
>
> All this seems to point at the following directions:
>
> 1) All forked scopes have an owner thread - fork/close/merge/allocate
> can only be called within that thread
> 2) Global scopes have no owner - all threads can call fork/allocate -
> close/merge are forbidden here
> 3) Underlying memory access is not restricted to owner thread -
> multiple threads can synchronize (e.g. with CAS)
> 3b) If we want to we can implement full confined memory access - e.g.
> allow memory access only within the boundaries of the owner thread
>
> Comments?
>
> Maurizio
>
>
> [1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html
> [2] -
> https://hg.openjdk.java.net/loom/loom/file/cc783ba01af5/src/java.base/share/classes/java/lang/FiberScope.java#l636
> [3] -
> https://hg.openjdk.java.net/loom/loom/file/cc783ba01af5/src/java.base/share/classes/java/lang/FiberScope.java#l617

From henry.jen at oracle.com Tue May 21 15:59:13 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Tue, 21 May 2019 08:59:13 -0700
Subject: [foreign] RFR: 8224244: Cleanup libclang Java API
In-Reply-To:
References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com>
Message-ID: <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com>

> On May 21, 2019, at 1:25 AM, Maurizio Cimadamore wrote:
>
> Looks very good - couple of comments:
>
> * I think it would be handy to collect all the translation unit options into an enum - I'm referring to these:
>
> private final static int CXTranslationUnit_DetailedPreprocessingRecord = 0x01;
>
> That would have been really helpful when writing the reparsing code, when often I had to browse the clang headers to look for these.
>

Note that we didn't expose an API to take options, so these constants are private, therefore I chose to remove those we don't use.
I thought about adding an API to take options and add all those constants, but didn't go for it as we don't need it for now, and this JNI binding is a simple wrapper just for jextract. Once runtime support is available, we really want to just use the extracted API, which is why we have the FFI jextract case. :)

Regarding enum, I won't recommend it in this case, as they are flags to be combined with the OR operator.

> * it seems to me that if the parser catches the index exception and then wraps it as a runtime exception (IllegalStateException) then you can revert most of the changes to the other classes? Also, another thing to consider: maybe we wanna make the parsing exception an unchecked exception? After all, there's not much recovery that can be done on those, forcing all clients to catch it seems excessive?
>

I think you are right, it's better if ParsingFailingException extends RuntimeException. I was thinking that we should force consideration of the case, but I agree that there is not much to do other than gracefully exit.

Cheers,
Henry

> Maurizio
>
> On 20/05/2019 21:53, Henry Jen wrote:
>> Hi,
>>
>> Please review the webrev[1] for clang Java API cleanup that came up with earlier work on atomic type support.
>> Mainly to throw a checked exception on parsing error, and move translation unit APIs into TranslationUnit class.
>>
>> Cheers,
>> Henry
>>
>> [1]
>> http://cr.openjdk.java.net/~henryjen/panama/8224244/0/webrev/
>>
>> [2]
>> https://bugs.openjdk.java.net/browse/JDK-8224244

From jbvernee at xs4all.nl Tue May 21 16:51:24 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 18:51:24 +0200
Subject: [foreign-memaccess] RFR 8224483: Split MemoryAddress into separate address/region abstractions
In-Reply-To: <82361d8b-b55b-e6c1-34cc-ed7d7e580a8b@oracle.com>
References: <82361d8b-b55b-e6c1-34cc-ed7d7e580a8b@oracle.com>
Message-ID:

Hi,

Some comments:

MemoryAddress.java
- I'm wondering if it makes sense to move the ofByteBuffer method to MemorySegment?
(This would also mirror the internal impl)

MemoryAddressImpl.java
- `((MemoryScopeImpl) segment().scope()).checkAlive()` is used in a couple of places. Maybe move this to MemorySegmentImpl, and then call segment.checkAlive()?
- I think asDirectByteBuffer needs to do a liveness check as well (should probably just call checkAccess?).

MemorySegmentImpl.java
- OfEverything is unused and could be removed at this point.
- `resize` needs to check for negative offset as well, since we're coming directly from user code. I think this check could be replaced with a call to checkRange (adding a < 0 check for the length there).

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-21 13:52:
> Following discussion at [1], I decided to go ahead and split
> MemoryAddress into two abstractions:
>
> * MemoryAddress will now embody an offset into a...
> * MemorySegment, which represents a contiguous region of memory
>
> The name MemorySegment came from an internal discussion with Brian,
> where he pointed out that MemoryScope and MemoryRegion were a bit too
> overlapping, in the English sense of the word. MemorySegment suggests
> something that has lower/upper boundaries.
>
> Webrev:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8224483/
>
> There are many changes since the last patch discussed - mostly
> because, as I was adding the lower-level allocate methods in
> MemoryScope I realized that we were missing a lot of checks (w.r.t.
> alignment and sizes). I've added those and added tests for these as
> well (2 new tests have been added). Also, the Scope::allocate API
> should reflect the fact that Unsafe might fail to allocate - again
> added a test for this.
> > Maurizio From maurizio.cimadamore at oracle.com Tue May 21 17:48:39 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 18:48:39 +0100 Subject: [foreign-memaccess] scopes and thread confinement In-Reply-To: <9ebdc2818816b025471e14786b83c6c9@xs4all.nl> References: <39c276ed-1cbc-2de0-a44a-4867d0c61f75@oracle.com> <9ebdc2818816b025471e14786b83c6c9@xs4all.nl> Message-ID: On 21/05/2019 15:17, Jorn Vernee wrote: > I'm not sure about the second point. The descendant list could be > removed, but forked scopes could keep it right (so no need for > recursive checks)? If we don't have links from parents to children, when you close the parent it won't be possible to trigger close of the children. Instead, children will have to look back at the parent and see if that has been closed. Do we agree? Maurizio From maurizio.cimadamore at oracle.com Tue May 21 17:50:52 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 18:50:52 +0100 Subject: [foreign-memaccess] RFR 8224483: Split MemoryAddress into separate address/region abstractions In-Reply-To: References: <82361d8b-b55b-e6c1-34cc-ed7d7e580a8b@oracle.com> Message-ID: <834ac1ac-4b0e-c7bd-6b67-f174a8eba8c4@oracle.com> On 21/05/2019 17:51, Jorn Vernee wrote: > Hi, > > Some comments: > > MemoryAddress.java > - I'm wondering if it makes sense to move the ofByteBuffer method to > MemorySegment? (This would also mirror the internal impl) Was thinking that too - probably it makes sense, yes. > > MemoryAddressImpl.java > - `((MemoryScopeImpl) segment().scope()).checkAlive()` is used in a > couple of places. Maybe move this to MemorySegmentImpl, and then call > segment.checkAlive()? ok > - I think asDirectByteBuffer needs to do a liveness check as well > (should probably just call checkAccess?). yep - this part is untested as of yet (as we need to do more validation) but I agree. 
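To make the two points above concrete, here is a minimal sketch of a segment that owns its liveness check and applies it before handing out a buffer view. All names here (LivenessDemo, DemoScope, DemoSegment) are hypothetical stand-ins and not the classes in the webrev, and a heap ByteBuffer stands in for the native memory.

```java
import java.nio.ByteBuffer;

public class LivenessDemo {
    // Hypothetical stand-in for a memory scope: only tracks whether it is still open.
    static final class DemoScope implements AutoCloseable {
        private boolean alive = true;

        boolean isAlive() { return alive; }

        @Override
        public void close() { alive = false; }
    }

    // Hypothetical stand-in for a segment. The liveness check lives here,
    // so callers never need to cast the scope themselves.
    static final class DemoSegment {
        private final DemoScope scope;
        private final ByteBuffer backing;

        DemoSegment(DemoScope scope, int size) {
            this.scope = scope;
            this.backing = ByteBuffer.allocate(size);
        }

        void checkAlive() {
            if (!scope.isAlive()) {
                throw new IllegalStateException("scope is closed");
            }
        }

        // The buffer view goes through the same liveness check as any other access.
        ByteBuffer asByteBuffer() {
            checkAlive();
            return backing.duplicate();
        }
    }

    public static void main(String[] args) {
        DemoScope scope = new DemoScope();
        DemoSegment segment = new DemoSegment(scope, 16);
        System.out.println(segment.asByteBuffer().capacity());
        scope.close();
        try {
            segment.asByteBuffer();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that a view obtained before close() stays usable in this sketch; it only covers the check at the point where the view is created.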
> > MemorySegmentImpl.java > - OfEverything is unused and could be removed at this point. I thought I did... probably forgot to do the last bit :-) > - `resize` needs to check for negative offset as well, since we're > coming directly from user code. I think this check could be replaced > with a call to checkRange (adding a < 0 check for the length there). Good point. Thanks Maurizio > > Cheers, > Jorn > > Maurizio Cimadamore schreef op 2019-05-21 13:52: >> Following discussion at [1], I decided to go ahead and split >> MemoryAddres into two abstractions: >> >> * MemroyAddress will now embody an offest into a... >> * MemorySegment, which represent a contiguous region of memory >> >> The name MemorySegment came from an internal discussion with Brian, >> where he pointed out that MemoryScope and MemoryRegion where a bit too >> overlapping, in the english sense of the word. MemorySegment suggests >> something that has lower/upper boundaries. >> >> Webrev: >> >> http://cr.openjdk.java.net/~mcimadamore/panama/8224483/ >> >> There are many changes since the last patch discussed - mostly >> because, as I was adding the lower-level allocate methods in >> MemoryScope I realized that we were missing a lot of checks (w.r.t. >> alignment and sizes). I've added those and added tests for these as >> well (2 new tests have been added). Also, the Scope::allocate API >> should reflect the fact that Unsafe might fail to allocate - again >> added a test for this. >> >> Maurizio From maurizio.cimadamore at oracle.com Tue May 21 18:09:20 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 19:09:20 +0100 Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths. In-Reply-To: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> Message-ID: <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> Looks good, although I'm a bit worried about the change in semantics w.r.t. 
eager instantiation. The binder will create a lot of LayoutTypes when generating the implementation - I wonder if there were cases before where we created a partial layout type, which then got resolved correctly by the time it was dereferenced (since we do another resolve lazily in StructImplGenerator [1]).

[1] - http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52

On 21/05/2019 14:41, Jorn Vernee wrote:
> Hi,
>
> After the recent string of benchmarking [1], I've arrived at 2
> optimizations to improve the speed of the measured code path.
>
> 1.) Specialization of Struct getter MethodHandles per struct class.
> 2.) Implementation of RuntimeSupport::casterImpl that does a fused
> cast and offset operation, to avoid creating multiple Pointer objects.
>
> The benchmark:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
> The optimizations:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
>
> I've split these into 2 so that it's easier to run the benchmarks with
> and without the optimizations. (benchmark uses the OpenJDK's builtin
> framework [2]).
>
> Since we're now more eagerly instantiating the struct impl class I had
> to work around partial struct types, since spinning the impl requires
> a non-partial type and now we're spinning the impl when creating the
> LayoutType for the struct, as opposed to on the first dereference. To
> do this I'm detecting whether the struct is partial in
> LayoutType.ofStruct, and using a Reference.OfGrumpy in the case where
> it can not be resolved. Tbh, I think this makes things a little more
> clear as well as far as where/how the exception for deref of a partial
> type is thrown.
>
> Results on my machine before the optimization are:
>
> Benchmark                        Mode  Cnt    Score    Error  Units
> GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
> GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
> GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
> GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op
>
> And after:
>
> Benchmark                        Mode  Cnt   Score   Error  Units
> GetStruct.jni_baseline           avgt   50  13.941 ± 0.485  ns/op
> GetStruct.panama_get_both        avgt   50  41.199 ± 1.632  ns/op
> GetStruct.panama_get_fieldonly   avgt   50  33.432 ± 1.889  ns/op
> GetStruct.panama_get_structonly  avgt   50  13.469 ± 0.781  ns/op
>
> Where panama_get_structonly corresponds to 1., and
> panama_get_fieldonly corresponds to 2. For a total of about 12x speedup.
>
> Thanks,
> Jorn
>
> [1] :
> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
> [2] : https://openjdk.java.net/jeps/230

From jbvernee at xs4all.nl Tue May 21 18:44:15 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 20:44:15 +0200
Subject: [foreign-memaccess] scopes and thread confinement
In-Reply-To:
References: <39c276ed-1cbc-2de0-a44a-4867d0c61f75@oracle.com> <9ebdc2818816b025471e14786b83c6c9@xs4all.nl>
Message-ID:

Sure, so the global scope implementation will not have this list, but it cannot be closed anyway. Then the forked scope implementations will have this list, they can be closed, and use the list to close the children as well.

So there would still be a link from parents to children _except_ for the global scope. Right?

Jorn

Maurizio Cimadamore wrote on 2019-05-21 19:48:
> On 21/05/2019 15:17, Jorn Vernee wrote:
>> I'm not sure about the second point. The descendant list could be
>> removed, but forked scopes could keep it right (so no need for
>> recursive checks)?
>
> If we don't have links from parents to children, when you close the
> parent it won't be possible to trigger close of the children. Instead,
> children will have to look back at the parent and see if that has been
> closed. Do we agree?
>
> Maurizio

From jbvernee at xs4all.nl Tue May 21 19:06:54 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 21:06:54 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To: <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com>
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com>
Message-ID:

Since we have the resolution context for NativeHeader, AFAIK there is no more difference between the resolution call done by StructImplGenerator, and the one done by LayoutTypeImpl.ofStruct. So I don't think there are any more cases where we would have succeeded in resolving the Struct layout by delaying spinning the impl. At least the tests haven't caught such a case.

The other thing is that the partial layout for the getter is caught in StructImplGenerator, but for the setter it's caught when calling bitSize on Unresolved. Saying layouts should be able to be resolved when calling LayoutType.ofStruct means we can use References.OfGrumpy, which makes the two more uniform.

I have some ideas for keeping the lazy init semantics, but it's a bit more complex (using a MutableCallSite to mimic indy), and I'm not sure it will work as well.

And, well, there was some talk about eagerly spinning the implementations anyway :)

Jorn

Maurizio Cimadamore wrote on 2019-05-21 20:09:
> Looks good, although I'm a bit worried about the change in semantics
> w.r.t. eager instantiation. The binder will create a lot of
> LayoutTypes when generating the implementation - I wonder there were
> cases before where we created a partial layout type, which then got
> resolved correctly by the time it was dereferenced (since we do
> another resolve lazily in StructImplGenerator [1]).
>
> [1] -
> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52
>
>
> On 21/05/2019 14:41, Jorn Vernee wrote:
>> Hi,
>>
>> After the recent string of benchmarking [1], I've arrived at 2
>> optimizations to improve the speed of the measured code path.
>>
>> 1.) Specialization of Struct getter MethodHandles per struct class.
>> 2.) Implementation of RuntimeSupport::casterImpl that does a fused
>> cast and offset operation, to avoid creating multiple Pointer objects.
>>
>> The benchmark:
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
>> The optimizations:
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
>>
>> I've split these into 2 so that it's easier to run the benchmarks with
>> and without the optimizations. (benchmark uses the OpenJDK's builtin
>> framework [2]).
>>
>> Since we're now more eagerly instantiating the struct impl class I had
>> to work around partial struct types, since spinning the impl requires
>> a non-partial type and now we're spinning the impl when creating the
>> LayoutType for the struct, as opposed to on the first dereference. To
>> do this I'm detecting whether the struct is partial in
>> LayoutType.ofStruct, and using a Reference.OfGrumpy in the case where
>> it can not be resolved. Tbh, I think this makes things a little more
>> clear as well as far as where/how the exception for deref of a partial
>> type is thrown.
>>
>> Results on my machine before the optimization are:
>>
>> Benchmark                        Mode  Cnt    Score    Error  Units
>> GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
>> GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
>> GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
>> GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op
>>
>> And after:
>>
>> Benchmark                        Mode  Cnt   Score   Error  Units
>> GetStruct.jni_baseline           avgt   50  13.941 ± 0.485  ns/op
>> GetStruct.panama_get_both        avgt   50  41.199 ± 1.632  ns/op
>> GetStruct.panama_get_fieldonly   avgt   50  33.432 ± 1.889  ns/op
>> GetStruct.panama_get_structonly  avgt   50  13.469 ± 0.781  ns/op
>>
>> Where panama_get_structonly corresponds to 1., and
>> panama_get_fieldonly corresponds to 2. For a total of about 12x
>> speedup.
>>
>> Thanks,
>> Jorn
>>
>> [1] :
>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
>> [2] : https://openjdk.java.net/jeps/230

From jbvernee at xs4all.nl Tue May 21 19:16:25 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 21 May 2019 21:16:25 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To:
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com>
Message-ID:

Although, now that you bring it up, I tried re-running some of the samples (hadn't done that yet), and I'm seeing some infinite recursion. This is seemingly caused by a circular type reference (e.g. linked list), i.e. to spin the impl of an accessor we need the LayoutType of the struct itself, which then tries to spin the impl again, and so on.

I guess this isn't a test case in our suite yet... I'll look into this.

Thanks,
Jorn

Jorn Vernee wrote on 2019-05-21 21:06:
> Since we have the resolution context for NativeHeader, AFAIK there is
> no more difference between the resolution call done by
> StructImplGenerator, and the one done by LayoutTypeImpl.ofStruct. So
> I don't think there are any more cases where we would have succeeded
> in resolving the Struct layout by delaying spinning the impl. At least
> the tests haven't caught such a case.
>
> The other thing is that the partial layout for the getter is caught in
> StructImplGenerator, but for the setter it's caught when calling
> bitSize on Unresolved.
Saying layouts should be able to be resolved > when calling LayoutType.ofStruct means we can use References.OfGrumpy, > which makes the two more uniform. > > I have some ideas for keeping the lazy init semantics, but it's a bit > more complex (using a MutableCallSite to mimic indy), and I'm not sure > it will work as well. > > And, well, there was some talk about eagerly spinning the > implementations any ways :) > > Jorn > > Maurizio Cimadamore schreef op 2019-05-21 20:09: >> Looks good, although I'm a bit worried about the change in semantics >> w.r.t. eager instantiation. The binder will create a lot of >> LayoutTypes when generating the implementation - I wonder there were >> cases before where we created a partial layout type, which then got >> resolved correctly by the time it was dereferenced (since we do >> another resolve lazily in StructImplGenerator [1]). >> >> [1] - >> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52 >> >> >> On 21/05/2019 14:41, Jorn Vernee wrote: >>> Hi, >>> >>> After the recent string of benchmarking [1], I've arrived at 2 >>> optimizations to improve the speed of the measured code path. >>> >>> 1.) Specialization of Struct getter MethodHandles per struct class. >>> 2.) Implementation of RuntimeSupport::casterImpl that does a fused >>> cast and offset operation, to avoid creating multiple Pointer >>> objects. >>> >>> The benchmark: >>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/ >>> The optimizations: >>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/ >>> >>> I've split these into 2 so that it's easier to run the benchmarks >>> with and without the optimizations. (benchmark uses the OpenJDK's >>> builtin framework [2]). 
>>> >>> Since we're now more eagerly instantiating the struct impl class I >>> had to work around partial struct types, since spinning the impl >>> requires a non-partial type and now we're spinning the impl when >>> creating the LayouType for the struct, as opposed to on the first >>> dereference. To do this I'm detecting whether the struct is partial >>> in LayoutType.ofStruct, and using a Reference.OfGrumpy in the case >>> where it can not be resolved. Tbh, I think this makes things a little >>> more clear as well as far as where/how the exception for deref of a >>> partial type is thrown. >>> >>> Results on my machine before the optimization are: >>> >>> Benchmark??????????????????????? Mode? Cnt??? Score??? Error Units >>> GetStruct.jni_baseline?????????? avgt?? 50?? 14.204 ?? 0.566 ns/op >>> GetStruct.panama_get_both??????? avgt?? 50? 507.638 ? 19.462 ns/op >>> GetStruct.panama_get_fieldonly?? avgt?? 50?? 90.236 ? 11.027 ns/op >>> GetStruct.panama_get_structonly? avgt?? 50? 370.783 ? 13.744 ns/op >>> >>> And after: >>> >>> Benchmark??????????????????????? Mode? Cnt?? Score?? Error? Units >>> GetStruct.jni_baseline?????????? avgt?? 50? 13.941 ? 0.485? ns/op >>> GetStruct.panama_get_both??????? avgt?? 50? 41.199 ? 1.632? ns/op >>> GetStruct.panama_get_fieldonly?? avgt?? 50? 33.432 ? 1.889? ns/op >>> GetStruct.panama_get_structonly? avgt?? 50? 13.469 ? 0.781? ns/op >>> >>> Where panama_get_structonly corresponds to 1., and >>> panama_get_fieldonly corresponds to 2. For a total of about 12x >>> speedup. 
>>> >>> Thanks, >>> Jorn >>> >>> [1] : >>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html >>> [2] : https://openjdk.java.net/jeps/230 From maurizio.cimadamore at oracle.com Tue May 21 19:22:49 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 21 May 2019 20:22:49 +0100 Subject: [foreign-memaccess] scopes and thread confinement In-Reply-To: References: <39c276ed-1cbc-2de0-a44a-4867d0c61f75@oracle.com> <9ebdc2818816b025471e14786b83c6c9@xs4all.nl> Message-ID: <9b46d516-c6e6-f13c-6abc-048c7f1b63a2@oracle.com> On 21/05/2019 19:44, Jorn Vernee wrote: > Sure, so the global scope implementation will not have this list, but > it can not be closed any ways. Then the forked scope implementations > will have this list, they can be closed, and use the list to close the > children as well. > > So there would still be a link from parents to children _except_ for > the global scope Right? Ok, I see what you mean. Yes, this is another possibility. In reality I was looking into removing the descendant list also for other reasons (e.g. very expensive to set up if you just do a single allocation). But that could go either way yes. > > Jorn > > Maurizio Cimadamore schreef op 2019-05-21 19:48: >> On 21/05/2019 15:17, Jorn Vernee wrote: >>> I'm not sure about the second point. The descendant list could be >>> removed, but forked scopes could keep it right (so no need for >>> recursive checks)? >> >> If we don't have links from parents to children, when you close the >> parent it won't be possible to trigger close of the children. Instead, >> children will have to look back at the parent and see if that has been >> closed. Do we agree? 
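
The two options discussed above (a parent that keeps a descendant list versus children that look back at their parent) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (ScopeLinkSketch, TrackingScope, LookbackScope, fork, isAlive) are hypothetical and are not the Panama scope API, and thread confinement is ignored entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these are not the Panama scope classes. */
final class ScopeLinkSketch {

    /** Variant 1: the parent keeps a descendant list and closes children eagerly. */
    static final class TrackingScope {
        private final List<TrackingScope> children = new ArrayList<>();
        private boolean closed;

        TrackingScope fork() {
            TrackingScope child = new TrackingScope();
            children.add(child);          // bookkeeping cost paid on every fork
            return child;
        }

        void close() {
            closed = true;
            for (TrackingScope child : children) {
                child.close();            // the parent -> child link triggers the close
            }
        }

        boolean isAlive() {
            return !closed;               // liveness is a single field read
        }
    }

    /** Variant 2: no descendant list; a child looks back at its ancestors. */
    static final class LookbackScope {
        private final LookbackScope parent;   // null for a root scope
        private boolean closed;

        LookbackScope(LookbackScope parent) {
            this.parent = parent;
        }

        LookbackScope fork() {
            return new LookbackScope(this);
        }

        void close() {
            closed = true;                // children are not notified
        }

        boolean isAlive() {
            // Walk up the chain: the scope is dead if it or any ancestor is closed.
            for (LookbackScope s = this; s != null; s = s.parent) {
                if (s.closed) {
                    return false;
                }
            }
            return true;
        }
    }
}
```

In this sketch the first variant pays for the list on every fork but answers liveness with one field read, while the second makes forking cheap and moves the cost into a walk up the parent chain on each liveness check.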
>>
>> Maurizio

From vivek.r.deshpande at intel.com Tue May 21 22:19:29 2019
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Tue, 21 May 2019 22:19:29 +0000
Subject: VectorAPI: Testing for gather and scatter Masked, and Single
In-Reply-To:
References:
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EB999@ORSMSX106.amr.corp.intel.com>

Looks good to me.

Regards,
Vivek

-----Original Message-----
From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On Behalf Of Halimi, Jean-Philippe
Sent: Friday, May 17, 2019 8:58 AM
To: panama-dev at openjdk.java.net
Subject: VectorAPI: Testing for gather and scatter Masked, and Single

Dear all,

Here are two patches implementing the testing for gather and scatter VectorAPI calls (masked), as well as single.

http://cr.openjdk.java.net/~vdeshpande/VectorAPI/webrev_gatherScatter_allTypes_gatherScatterMasked2/
http://cr.openjdk.java.net/~vdeshpande/VectorAPI/webrev_single2/

Please let me know your thoughts, and I will edit or merge accordingly. :)

Thanks
-Jp

From maurizio.cimadamore at oracle.com Tue May 21 23:15:02 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 22 May 2019 00:15:02 +0100
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To:
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com>
Message-ID:

On 21/05/2019 20:16, Jorn Vernee wrote:
> Although, now that you bring it up, I tried re-running some of the
> samples (hadn't done that yet), and I'm seeing some infinite
> recursion. This is seemingly caused by a circular type reference (e.g.
> linked list). i.e. to spin the impl of an accessor we need the
> LayoutType of the struct itself, which then tries to spin the impl
> again, and so on. I guess this isn't a test case in our suite yet...
>
> I'll look into this.

Good detective work! I guess it would make sense to try and reduce it down to a simpler test, and push the test first.

Where I was going with this is - your patch effectively made the lazy resolution inside StructImplGenerator useless. If we really want to explore that option, then we should, I think, remove all lazy resolution sites and see what happens. It is possible that we don't rely on laziness as much as we did in the past (we did some fixes a few months ago which stabilized resolution quite a bit) - in which case we can remove the resolution requests, although - I have to admit - I'm a bit skeptical. After all, all you need is something like this (as you say):

struct foo {
    struct foo *next;
};

Which is kind of the killer app for unresolved layouts in the first place.

This is translated into a struct interface which has a getter of Pointer. To generate the getter you need to compute its LayoutType, which is a pointer LayoutType, so you have to compute the pointee LayoutType, which brings you back where you started (the whole 'foo' LayoutType). In other words, since the creation of LayoutType now requires the generation of the struct impl for 'foo', and since that depends (indirectly, through the pointer getter) on being able to produce a LayoutType, you get a circularity.

One thing we could try is - instead of eagerly creating the struct impl, why don't we let the Reference.OfStruct have some mutable state in it? That is, we could start off with a Reference getter which does the expensive reflective lookup - but then, once it has discovered the constructor MH, it can stash it in some field (which is private to that reference object) and use it later if the getter is used again. Then, you probably still need a ClassValue to stash a mapping between a Class and its Reference.OfStruct; but it seems like this could fit in more naturally?
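
A minimal sketch of the idea above, for illustration only: a reference object that is cheap to create, performs the reflective constructor lookup on first use, and stashes the resulting MethodHandle in a private field, with a ClassValue mapping each class to its reference. All names here (LazyStructRefSketch, StructRef, FooImpl, REFS) are hypothetical stand-ins rather than the actual Panama classes. The sketch keys the ClassValue by the impl class for simplicity, and it does not attempt to address class unloading.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical sketch; the names below are not the Panama classes. */
final class LazyStructRefSketch {

    /** Stand-in for a generated struct impl with a (long address) constructor. */
    static final class FooImpl {
        final long address;

        FooImpl(long address) {
            this.address = address;
        }
    }

    /** Plays the role of Reference.OfStruct: cheap to create, resolves lazily. */
    static final class StructRef {
        private final Class<?> implClass;
        private volatile MethodHandle constructor;  // private mutable state
        private int slowLookups;                    // demo counter, not thread-safe

        StructRef(Class<?> implClass) {
            this.implClass = implClass;
        }

        Object get(long address) {
            try {
                MethodHandle mh = constructor;
                if (mh == null) {
                    // Expensive reflective lookup, done on the first dereference only.
                    mh = MethodHandles.lookup().findConstructor(
                            implClass, MethodType.methodType(void.class, long.class));
                    constructor = mh;
                    slowLookups++;
                }
                return mh.invoke(address);
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }

        int slowLookups() {
            return slowLookups;
        }
    }

    /** One StructRef per class; creating it does not look up the constructor. */
    static final ClassValue<StructRef> REFS = new ClassValue<StructRef>() {
        @Override
        protected StructRef computeValue(Class<?> type) {
            return new StructRef(type);
        }
    };
}
```

Because creating a StructRef never needs the constructor of the impl class, a self-referential struct such as 'foo' would not trigger the recursion in this sketch: the lookup only happens when a getter is actually invoked.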

Maurizio

From henry.jen at oracle.com Wed May 22 04:21:12 2019
From: henry.jen at oracle.com (Henry Jen)
Date: Tue, 21 May 2019 21:21:12 -0700
Subject: [foreign] RFR: 8224244: Cleanup libclang Java API
In-Reply-To: <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com>
References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com> <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com>
Message-ID:

Updated webrev[1] changes ParsingFailingException to extend RuntimeException and reverts the Handlers changes for declaring the exception.

I also have an add-on webrev[2] that, instead of hard-coding all built-in types, utilizes reparse to find built-in types on demand. I don't think it's necessarily better, just throwing it out as an option.

Let me know what you think.

[1] http://cr.openjdk.java.net/~henryjen/panama/8224244/1/webrev/
[2] http://cr.openjdk.java.net/~henryjen/panama/8224244/1.1/webrev/

Cheers,
Henry

From sundararajan.athijegannathan at oracle.com Wed May 22 04:40:13 2019
From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan)
Date: Wed, 22 May 2019 10:10:13 +0530
Subject: [foreign] RFR: 8224244: Cleanup libclang Java API
In-Reply-To:
References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com> <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com>
Message-ID: <5CE4D2AD.1020203@oracle.com>

"sample" Path variable may be renamed as "jextractH"?

-Sundar

From maurizio.cimadamore at oracle.com Wed May 22 09:27:47 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 22 May 2019 10:27:47 +0100
Subject: [foreign] RFR: 8224244: Cleanup libclang Java API
In-Reply-To:
References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com> <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com>
Message-ID: <8dcaa155-38ba-7a7d-b4c4-17d149523bea@oracle.com>

I like both parts - nice job.

In JextractTool, what exactly is the point of catching the parse exception and rethrowing it? Wouldn't it be propagated out anyway?

Maurizio

From jbvernee at xs4all.nl Wed May 22 09:56:06 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Wed, 22 May 2019 11:56:06 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To:
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com>
Message-ID: <436608f145333799ab4dc3c435425e6a@xs4all.nl>

Good suggestion! This solves the problem, is nice and simple, and keeps the same times in the benchmark.

Updated webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/
(only changes to References.java)

I've added a test for the failure. I think that can be included as well? I re-ran the samples I have as well, and this time it's all green.

Thanks,
Jorn

From maurizio.cimadamore at oracle.com Wed May 22 10:37:19 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 22 May 2019 11:37:19 +0100
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To: <436608f145333799ab4dc3c435425e6a@xs4all.nl>
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl>
Message-ID: <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com>

Looks good - yesterday I was looking at this discussion:

http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html

I hope we don't run into the condition described there - e.g.
that there's no strong reachability from the MH we're caching back to the static ClassValue instance - because, if that were the case, I think that would prevent class unloading. The problem is that the MethodHandle we cache refers to the struct impl class, and I believe that class refers to some LayoutTypes on its own, which have a Reference inside, so it would be:

ClassValue -> MH -> StructImpl -> LayoutType -> Reference -> ClassValue

Sundar, can you double check?

Maurizio

On 22/05/2019 10:56, Jorn Vernee wrote:
> Good suggestion! This solves the problem, is nice and simple, and
> keeps the same times in the benchmark.
>
> Updated webrev:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/
>
> (only changes to References.java)
>
> I've added a test for the failure. I think that can be included as
> well? I re-ran the samples I have as well, and this time it's all green.
>
> Thanks,
> Jorn
>
> Maurizio Cimadamore wrote on 2019-05-22 01:15:
>> On 21/05/2019 20:16, Jorn Vernee wrote:
>>> Although, now that you bring it up, I tried re-running some of the
>>> samples (hadn't done that yet), and I'm seeing some infinite
>>> recursion. This is seemingly caused by a circular type reference
>>> (e.g. linked list). i.e. to spin the impl of an accessor we need the
>>> LayoutType of the struct itself, which then tries to spin the impl
>>> again, and so on. I guess this isn't a test case in our suite yet...
>>>
>>> I'll look into this.
>>
>> Good detective work! I guess it would make sense to try and reduce it
>> down to a simpler test, and push the test first.
>>
>> Where I was going with this is - your patch effectively made the lazy
>> resolution inside StructImplGenerator useless. If we really want to
>> explore that option, then we should, I think, remove all lazy
>> resolution sites and see what happens.
>> It is possible that we don't
>> rely on laziness as much as we did in the past (we did some fixes a few
>> months ago which stabilized resolution quite a bit) - in which case we
>> can remove the resolution requests, although - I have to admit - I'm a
>> bit skeptical. After all, all you need is something like this (as you
>> say):
>>
>> struct foo {
>>     struct foo *next;
>> };
>>
>> Which is kind of the killer app for unresolved layouts in the first
>> place.
>>
>> This is translated into a struct interface which has a getter of
>> Pointer. To generate the getter you need to compute its
>> LayoutType, which is a pointer LayoutType, so you have to compute the
>> pointee LayoutType, which brings you back where you started (the whole
>> 'foo' LayoutType). In other words, since the creation of
>> LayoutType now requires the generation of the struct impl for 'foo',
>> and since that depends (indirectly, through the pointer getter) on
>> being able to produce a LayoutType, you get a circularity.
>>
>> One thing we could try is - instead of eagerly creating the struct
>> impl, why don't we let the Reference.OfStruct have some mutable
>> state in it? That is, we could start off with a Reference getter which
>> does the expensive reflective lookup - but then, once it has
>> discovered the constructor MH, it can stash it in some field (which is
>> private to that reference object) and use it later if the getter is
>> used again. Then, you probably still need a ClassValue to stash a
>> mapping between a Class and its Reference.OfStruct; but it seems like
>> this could fit in more naturally?
>>
>> Maurizio
>>
>>>
>>> Thanks,
>>> Jorn
>>>
>>> Jorn Vernee wrote on 2019-05-21 21:06:
>>>> Since we have the resolution context for NativeHeader, AFAIK there is
>>>> no more difference between the resolution call done by
>>>> StructImplGenerator, and the one done by LayoutTypeImpl.ofStruct.
So
>>>> I don't think there are any more cases where we would have succeeded
>>>> in resolving the Struct layout by delaying spinning the impl. At least
>>>> the tests haven't caught such a case.
>>>>
>>>> The other thing is that the partial layout for the getter is caught in
>>>> StructImplGenerator, but for the setter it's caught when calling
>>>> bitSize on Unresolved. Saying layouts should be able to be resolved
>>>> when calling LayoutType.ofStruct means we can use References.OfGrumpy,
>>>> which makes the two more uniform.
>>>>
>>>> I have some ideas for keeping the lazy init semantics, but it's a bit
>>>> more complex (using a MutableCallSite to mimic indy), and I'm not sure
>>>> it will work as well.
>>>>
>>>> And, well, there was some talk about eagerly spinning the
>>>> implementations anyway :)
>>>>
>>>> Jorn
>>>>
>>>> Maurizio Cimadamore wrote on 2019-05-21 20:09:
>>>>> Looks good, although I'm a bit worried about the change in semantics
>>>>> w.r.t. eager instantiation. The binder will create a lot of
>>>>> LayoutTypes when generating the implementation - I wonder if there were
>>>>> cases before where we created a partial layout type, which then got
>>>>> resolved correctly by the time it was dereferenced (since we do
>>>>> another resolve lazily in StructImplGenerator [1]).
>>>>>
>>>>> [1] -
>>>>> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52
>>>>> On 21/05/2019 14:41, Jorn Vernee wrote:
>>>>>> Hi,
>>>>>>
>>>>>> After the recent string of benchmarking [1], I've arrived at 2
>>>>>> optimizations to improve the speed of the measured code path.
>>>>>>
>>>>>> 1.) Specialization of Struct getter MethodHandles per struct class.
>>>>>> 2.) Implementation of RuntimeSupport::casterImpl that does a
>>>>>> fused cast and offset operation, to avoid creating multiple
>>>>>> Pointer objects.
>>>>>>
>>>>>> The benchmark:
>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
>>>>>> The optimizations:
>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
>>>>>>
>>>>>> I've split these into 2 so that it's easier to run the benchmarks
>>>>>> with and without the optimizations. (benchmark uses the OpenJDK's
>>>>>> builtin framework [2]).
>>>>>>
>>>>>> Since we're now more eagerly instantiating the struct impl class
>>>>>> I had to work around partial struct types, since spinning the
>>>>>> impl requires a non-partial type and now we're spinning the impl
>>>>>> when creating the LayoutType for the struct, as opposed to on the
>>>>>> first dereference. To do this I'm detecting whether the struct is
>>>>>> partial in LayoutType.ofStruct, and using a Reference.OfGrumpy in
>>>>>> the case where it cannot be resolved. Tbh, I think this makes
>>>>>> things a little more clear as well as far as where/how the
>>>>>> exception for deref of a partial type is thrown.
>>>>>>
>>>>>> Results on my machine before the optimization are:
>>>>>>
>>>>>> Benchmark                        Mode  Cnt    Score    Error  Units
>>>>>> GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
>>>>>> GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
>>>>>> GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
>>>>>> GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op
>>>>>>
>>>>>> And after:
>>>>>>
>>>>>> Benchmark                        Mode  Cnt   Score   Error  Units
>>>>>> GetStruct.jni_baseline           avgt   50  13.941 ± 0.485  ns/op
>>>>>> GetStruct.panama_get_both        avgt   50  41.199 ± 1.632  ns/op
>>>>>> GetStruct.panama_get_fieldonly   avgt   50  33.432 ± 1.889  ns/op
>>>>>> GetStruct.panama_get_structonly  avgt   50  13.469 ± 0.781  ns/op
>>>>>>
>>>>>> Where panama_get_structonly corresponds to 1., and
>>>>>> panama_get_fieldonly corresponds to 2.
For a total of about 12x
>>>>>> speedup.
>>>>>>
>>>>>> Thanks,
>>>>>> Jorn
>>>>>>
>>>>>> [1] :
>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
>>>>>> [2] : https://openjdk.java.net/jeps/230

From jbvernee at xs4all.nl Wed May 22 10:51:15 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Wed, 22 May 2019 12:51:15 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To: <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com>
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com>
Message-ID:

Ah, good point.

> ClassValue -> MH -> StructImpl -> LayoutType -> Reference ->
> ClassValue

I don't think that last link is quite right though. The LayoutType references the anonymous Reference class, not References.OfStruct (which contains the ClassValue).

I think it would be:

User Code -> LayoutType -> anonymous Reference -> getter MH -> StructImpl -> LayoutType

There could still be a cycle there, but the whole cycle can be GC'd once the reference from user code goes away.

Jorn

Maurizio Cimadamore wrote on 2019-05-22 12:37:
> Looks good - yesterday I was looking at this discussion:
>
> http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html
>
> I hope we don't run into the condition described there - e.g. that
> there's no strong reachability from the MH we're caching back to the
> static ClassValue instance - because, if that were the case, I
> think that would prevent class unloading.
>
> The problem is that the MethodHandle we cache refers to the struct impl
> class, and I believe that class refers to some LayoutTypes of its own,
> which have a Reference inside, so it would be:
>
> ClassValue -> MH -> StructImpl -> LayoutType -> Reference ->
> ClassValue
>
> Sundar can you double check?
From maurizio.cimadamore at oracle.com Wed May 22 13:55:31 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 22 May 2019 13:55:31 +0000 Subject: hg: panama/dev: 8224483: Split MemoryAddress into separate address/region abstractions Message-ID: <201905221355.x4MDtWIF026423@aojmv0008.oracle.com> Changeset: b8ab55332880 Author: mcimadamore Date: 2019-05-22 14:55 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/b8ab55332880 8224483: Split MemoryAddress into separate address/region abstractions ! src/java.base/share/classes/java/foreign/AbstractLayout.java ! src/java.base/share/classes/java/foreign/MemoryAddress.java ! src/java.base/share/classes/java/foreign/MemoryScope.java + src/java.base/share/classes/java/foreign/MemorySegment.java ! src/java.base/share/classes/java/foreign/PaddingLayout.java ! src/java.base/share/classes/java/foreign/SequenceLayout.java ! src/java.base/share/classes/java/foreign/ValueLayout.java ! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java - src/java.base/share/classes/jdk/internal/foreign/MemoryBoundInfo.java !
src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java + src/java.base/share/classes/jdk/internal/foreign/MemorySegmentImpl.java + test/jdk/java/foreign/TestLayouts.java + test/jdk/java/foreign/TestScopes.java From maurizio.cimadamore at oracle.com Wed May 22 13:55:53 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 22 May 2019 14:55:53 +0100 Subject: [foreign-memaccess] RFR 8224483: Split MemoryAddress into separate address/region abstractions In-Reply-To: <834ac1ac-4b0e-c7bd-6b67-f174a8eba8c4@oracle.com> References: <82361d8b-b55b-e6c1-34cc-ed7d7e580a8b@oracle.com> <834ac1ac-4b0e-c7bd-6b67-f174a8eba8c4@oracle.com> Message-ID: <25125585-1b01-4551-7845-24fc45f6dbaf@oracle.com> Addressed all comments and pushed Thanks Maurizio On 21/05/2019 18:50, Maurizio Cimadamore wrote: > > On 21/05/2019 17:51, Jorn Vernee wrote: >> Hi, >> >> Some comments: >> >> MemoryAddress.java >> - I'm wondering if it makes sense to move the ofByteBuffer method to >> MemorySegment? (This would also mirror the internal impl) > Was thinking that too - probably it makes sense, yes. >> >> MemoryAddressImpl.java >> - `((MemoryScopeImpl) segment().scope()).checkAlive()` is used in a >> couple of places. Maybe move this to MemorySegmentImpl, and then call >> segment.checkAlive()? > ok >> - I think asDirectByteBuffer needs to do a liveness check as well >> (should probably just call checkAccess?). > yep - this part is untested as of yet (as we need to do more > validation) but I agree. >> >> MemorySegmentImpl.java >> - OfEverything is unused and could be removed at this point. > I thought I did... probably forgot to do the last bit :-) >> - `resize` needs to check for negative offset as well, since we're >> coming directly from user code. I think this check could be replaced >> with a call to checkRange (adding a < 0 check for the length there). > > Good point. 
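The range-check suggestion for `resize` in the review comments above can be sketched roughly as follows. This is only an illustration, assuming a segment described by a plain `long` size; the class `SegmentBounds` and its `checkRange` and `resize` methods are hypothetical stand-ins and are not taken from the actual MemorySegmentImpl code.

```java
// Hypothetical sketch; not the actual MemorySegmentImpl implementation.
final class SegmentBounds {
    private final long size; // segment size in bytes, always >= 0

    SegmentBounds(long size) {
        if (size < 0) {
            throw new IllegalArgumentException("negative size: " + size);
        }
        this.size = size;
    }

    long size() {
        return size;
    }

    // Rejects negative offsets and lengths, and ranges that run past the end.
    // Comparing offset against (size - length) avoids computing offset + length,
    // which could overflow for large user-supplied values.
    void checkRange(long offset, long length) {
        if (offset < 0 || length < 0 || offset > size - length) {
            throw new IndexOutOfBoundsException(
                    "offset=" + offset + ", length=" + length + ", size=" + size);
        }
    }

    // A resize request coming directly from user code goes through the same check.
    SegmentBounds resize(long offset, long newLength) {
        checkRange(offset, newLength);
        return new SegmentBounds(newLength);
    }
}
```

Routing `resize` through a single `checkRange` keeps the negative-offset and negative-length checks in one place, which is the point made in the review.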
>
>
> Thanks
> Maurizio
>
>>
>> Cheers,
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-05-21 13:52:
>>> Following discussion at [1], I decided to go ahead and split
>>> MemoryAddress into two abstractions:
>>>
>>> * MemoryAddress will now embody an offset into a...
>>> * MemorySegment, which represents a contiguous region of memory
>>>
>>> The name MemorySegment came from an internal discussion with Brian,
>>> where he pointed out that MemoryScope and MemoryRegion were a bit too
>>> overlapping, in the English sense of the word. MemorySegment suggests
>>> something that has lower/upper boundaries.
>>>
>>> Webrev:
>>>
>>> http://cr.openjdk.java.net/~mcimadamore/panama/8224483/
>>>
>>> There are many changes since the last patch discussed - mostly
>>> because, as I was adding the lower-level allocate methods in
>>> MemoryScope I realized that we were missing a lot of checks (w.r.t.
>>> alignment and sizes). I've added those and added tests for these as
>>> well (2 new tests have been added). Also, the Scope::allocate API
>>> should reflect the fact that Unsafe might fail to allocate - again
>>> added a test for this.
>>>
>>> Maurizio

From org.openjdk at io7m.com Wed May 22 14:04:31 2019
From: org.openjdk at io7m.com (Mark Raynsford)
Date: Wed, 22 May 2019 15:04:31 +0100
Subject: hg: panama/dev: 8224483: Split MemoryAddress into separate address/region abstractions
In-Reply-To: <201905221355.x4MDtWIF026423@aojmv0008.oracle.com>
References: <201905221355.x4MDtWIF026423@aojmv0008.oracle.com>
Message-ID: <20190522150431.0aa97258@almond.int.arc7.info>

On 2019-05-22T13:55:31 +0000 maurizio.cimadamore at oracle.com wrote:
> Changeset: b8ab55332880
> Author: mcimadamore
> Date: 2019-05-22 14:55 +0100
> URL: http://hg.openjdk.java.net/panama/dev/rev/b8ab55332880
>
> 8224483: Split MemoryAddress into separate address/region abstractions
>
> ! src/java.base/share/classes/java/foreign/AbstractLayout.java
> ! src/java.base/share/classes/java/foreign/MemoryAddress.java
> !
src/java.base/share/classes/java/foreign/MemoryScope.java > + src/java.base/share/classes/java/foreign/MemorySegment.java > ! src/java.base/share/classes/java/foreign/PaddingLayout.java > ! src/java.base/share/classes/java/foreign/SequenceLayout.java > ! src/java.base/share/classes/java/foreign/ValueLayout.java > ! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java > - src/java.base/share/classes/jdk/internal/foreign/MemoryBoundInfo.java > ! src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java > + src/java.base/share/classes/jdk/internal/foreign/MemorySegmentImpl.java > + test/jdk/java/foreign/TestLayouts.java > + test/jdk/java/foreign/TestScopes.java > http://hg.openjdk.java.net/panama/dev/rev/b8ab55332880#l2.14 That looks like it might be a typo: "The set of ." -- Mark Raynsford | http://www.io7m.com From maurizio.cimadamore at oracle.com Wed May 22 14:19:19 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 22 May 2019 15:19:19 +0100 Subject: hg: panama/dev: 8224483: Split MemoryAddress into separate address/region abstractions In-Reply-To: <20190522150431.0aa97258@almond.int.arc7.info> References: <201905221355.x4MDtWIF026423@aojmv0008.oracle.com> <20190522150431.0aa97258@almond.int.arc7.info> Message-ID: Whoops - thanks for catching! Maurizio On 22/05/2019 15:04, Mark Raynsford wrote: > On 2019-05-22T13:55:31 +0000 > maurizio.cimadamore at oracle.com wrote: > >> Changeset: b8ab55332880 >> Author: mcimadamore >> Date: 2019-05-22 14:55 +0100 >> URL: http://hg.openjdk.java.net/panama/dev/rev/b8ab55332880 >> >> 8224483: Split MemoryAddress into separate address/region abstractions >> >> ! src/java.base/share/classes/java/foreign/AbstractLayout.java >> ! src/java.base/share/classes/java/foreign/MemoryAddress.java >> ! src/java.base/share/classes/java/foreign/MemoryScope.java >> + src/java.base/share/classes/java/foreign/MemorySegment.java >> ! 
src/java.base/share/classes/java/foreign/PaddingLayout.java >> ! src/java.base/share/classes/java/foreign/SequenceLayout.java >> ! src/java.base/share/classes/java/foreign/ValueLayout.java >> ! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java >> - src/java.base/share/classes/jdk/internal/foreign/MemoryBoundInfo.java >> ! src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java >> + src/java.base/share/classes/jdk/internal/foreign/MemorySegmentImpl.java >> + test/jdk/java/foreign/TestLayouts.java >> + test/jdk/java/foreign/TestScopes.java >> > http://hg.openjdk.java.net/panama/dev/rev/b8ab55332880#l2.14 > > That looks like it might be a typo: "The set of ." > From maurizio.cimadamore at oracle.com Wed May 22 14:21:22 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 22 May 2019 14:21:22 +0000 Subject: hg: panama/dev: 8224483: Split MemoryAddress into separate address/region abstractions Message-ID: <201905221421.x4MELNu7011712@aojmv0008.oracle.com> Changeset: 05a84bbc4f6e Author: mcimadamore Date: 2019-05-22 15:20 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/05a84bbc4f6e 8224483: Split MemoryAddress into separate address/region abstractions Fix typo in MemoryAddress javadoc ! src/java.base/share/classes/java/foreign/MemoryAddress.java From jbvernee at xs4all.nl Wed May 22 15:09:29 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 22 May 2019 17:09:29 +0200 Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths. 
In-Reply-To:
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com>
Message-ID:

Coming back to this once more,

I finally got my profiler working (after setting up a separate project) and saw a lot of time spent getting the field offset:

 37.00%       c2, level 4  jdk.internal.foreign.LayoutPaths$$Lambda$66.0x0000000800c05040::getAsLong, version 691
 30.19%       c2, level 4  jdk.internal.foreign.RuntimeSupport::casterImpl, version 724
 22.12%       c2, level 4  org.sample.generated.GetStruct_panama_get_fieldonly_jmhTest::panama_get_fieldonly_avgt_jmhStub, version 746
 ...

i.e. the call to LayoutPath.offset() in RuntimeSupport::casterImpl cannot be inlined, and we're re-computing the field offset over and over again.

The fix for this is pretty simple; instead of passing the LayoutPath to the caster, we pre-compute the offset and then pass that. (This should be constant, right?).

This yields some more speedup:

Benchmark                        Mode  Cnt   Score   Error  Units
GetStruct.jni_baseline           avgt   50  13.337 ± 0.251  ns/op
GetStruct.panama_get_both        avgt   50  17.026 ± 0.458  ns/op
GetStruct.panama_get_fieldonly   avgt   50   7.796 ± 0.166  ns/op
GetStruct.panama_get_structonly  avgt   50  11.863 ± 0.358  ns/op

Putting us pretty much even with jni_baseline.

Updated Webrev:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.03/

(Only changes are to RuntimeSupport)

Cheers,
Jorn

Jorn Vernee wrote on 2019-05-22 12:51:
> Ah, good point.
>
>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference ->
>> ClassValue
>
> I don't think that last link is quite right though. The LayoutType
> references the anonymous Reference class, not References.OfStruct
> (which contains the ClassValue).
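The offset fix described in the 17:09 message (pre-compute the offset once and pass the constant to the caster, rather than passing an object that recomputes it on every call) can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the webrev: the class `PrecomputedOffset`, the two-argument `casterImpl` shown here, and the helpers `fieldCaster` and `fieldAddress` are all hypothetical, and a plain `long` base address stands in for a Pointer.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.LongSupplier;

// Hypothetical sketch; casterImpl here only mimics the shape of the real method.
final class PrecomputedOffset {
    // Stand-in for the fused cast-and-offset step: it receives a constant
    // offset instead of an object that has to be asked for the offset.
    static long casterImpl(long offset, long base) {
        return base + offset;
    }

    // Evaluates the (possibly expensive) offset computation exactly once
    // and binds the result as the first argument of the method handle.
    static MethodHandle fieldCaster(LongSupplier offsetComputation) {
        long offset = offsetComputation.getAsLong(); // computed once, up front
        try {
            MethodHandle caster = MethodHandles.lookup().findStatic(
                    PrecomputedOffset.class, "casterImpl",
                    MethodType.methodType(long.class, long.class, long.class));
            return MethodHandles.insertArguments(caster, 0, offset);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the bound handle; the offset is no longer recomputed here.
    static long fieldAddress(MethodHandle caster, long base) {
        try {
            return (long) caster.invokeExact(base);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        long[] calls = {0}; // counts how often the offset is computed
        MethodHandle caster = fieldCaster(() -> { calls[0]++; return 8L; });
        System.out.println(fieldAddress(caster, 100L));
        System.out.println(fieldAddress(caster, 200L));
        System.out.println("offset computations: " + calls[0]);
    }
}
```

Binding the offset with `MethodHandles.insertArguments` makes it a constant of the handle, so the per-call path no longer contains a call through the offset-computing lambda.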
From maurizio.cimadamore at oracle.com Wed May 22 15:30:37 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 22 May 2019 16:30:37 +0100 Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths. In-Reply-To: References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com> Message-ID: <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> Looks good - modulo pending questions on use of ClassValue.
I think we should come up with some kind of test case that shows the ClassValue issue and then test with different approaches Maurizio On 22/05/2019 16:09, Jorn Vernee wrote: > Coming back to this once more, > > I finally got my profiler working (after setting up a separate > project) and saw a lot of time spent getting the field offset: > > ?37.00%?????? c2, level 4 > jdk.internal.foreign.LayoutPaths$$Lambda$66.0x0000000800c05040::getAsLong, > version 691 > ?30.19%?????? c2, level 4 > jdk.internal.foreign.RuntimeSupport::casterImpl, version 724 > ?22.12%?????? c2, level 4 > org.sample.generated.GetStruct_panama_get_fieldonly_jmhTest::panama_get_fieldonly_avgt_jmhStub, > version 746 > ?... > > i.e. the call to LayoutPath.offset() in RuntimeSupport::casterImpl can > not be inlined, and we're re-computing the field offset over and over > again. > > The fix for this is pretty simple; instead of passing the LayoutPath > to the caster, we pre-compute the offset and then pass that. (This > should be constant, right?). > > This yields some more speedup: > > Benchmark??????????????????????? Mode? Cnt?? Score?? Error? Units > GetStruct.jni_baseline?????????? avgt?? 50? 13.337 ? 0.251? ns/op > GetStruct.panama_get_both??????? avgt?? 50? 17.026 ? 0.458? ns/op > GetStruct.panama_get_fieldonly?? avgt?? 50?? 7.796 ? 0.166? ns/op > GetStruct.panama_get_structonly? avgt?? 50? 11.863 ? 0.358? ns/op > > Putting us pretty much even with jni_baseline. > > Updated Webrev: > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.03/ > > (Only changes are to RuntimeSupport) > > Cheers, > Jorn > > Jorn Vernee schreef op 2019-05-22 12:51: >> Ah, good point. >> >>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference -> >>> ClassValue >> >> I don't think that last link is quite right though. The LayoutType >> references the anonymous Reference class, not References.OfStruct >> (which contains the ClassValue). 
>> >> I think it would be: >> >> User Code -> LayoutType -> anonymous Reference -> getter MH -> >> StructImpl -> LayoutType >> >> There could still be a cycle there, but the whole cycle can be GC'd >> once the reference from user code goes away. >> >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-22 12:37: >>> Looks good - yesterday I was looking at this discussion: >>> >>> http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html >>> >>> >>> I hope we don't run in the condition described there - e.g. that >>> there's no strong reachability from the MH we're caching back to the >>> static ClassValue instance - because, if that would be the case I >>> think that would prevent class unloading. >>> >>> The problem is that the MethodHandle we cache refers to the stuct impl >>> class, and I believe that class refers to some LayoutTypes on its own, >>> which have a Reference inside, so it would be: >>> >>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference -> >>> ClassValue >>> >>> Sundar can you double check? >>> >>> Maurizio >>> >>> On 22/05/2019 10:56, Jorn Vernee wrote: >>>> Good suggestion! This solves the problem, is nice and simple, and >>>> keeps the same times in the benchmark. >>>> >>>> Updated webrev: >>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/ >>>> >>>> (only changes to References.java) >>>> >>>> I've added a test for the failure. I think that can be included as >>>> well? I re-ran the samples I have as well, and this time it's all >>>> green. >>>> >>>> Thanks, >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-22 01:15: >>>>> On 21/05/2019 20:16, Jorn Vernee wrote: >>>>>> Although, now that you bring it up, I tried re-running some of >>>>>> the samples (hadn't done that yet), and I'm seeing some infinite >>>>>> recursion. This is seemingly caused by a circular type reference >>>>>> (e.g. linked list). i.e. 
to spin the impl of an accessor we need >>>>>> the LayoutType of the struct itself, which then tries to spin the >>>>>> impl again, and so on. I guess this isn't a test case in our >>>>>> suite yet... >>>>>> >>>>>> I'll look into this. >>>>> >>>>> Good detective work! I guess it would make sense to try and reduce it >>>>> down to a simpler test, and push the test first. >>>>> >>>>> Where I was going with this is - your patch effectively made the lazy >>>>> resolution inside StructImplGenerator useless. If we really want to >>>>> explore that option, then we should, I think, remove all lazy >>>>> resolution sites and see what happens. It is possible that we don't >>>>> rely so much on laziness as we did in the past (we did some fixes few >>>>> months ago which stabilized resolution quite a bit) - in which >>>>> case we >>>>> can remove the resolution requests, although - I have to admit - >>>>> I'm a >>>>> bit skeptical. After all all you need it something like this (as you >>>>> say): >>>>> >>>>> struct foo { >>>>> ??? struct foo *next; >>>>> } >>>>> >>>>> Which is kind of the killer app for unresolved layouts in the >>>>> first place. >>>>> >>>>> This is translated into a struct interface which has a getter of >>>>> Pointer. To generate the getter you need to compute its >>>>> LayoutType which is a pointer LayoutType, so you have to compute the >>>>> pointee LayoutType which brings you back where you started (the whole >>>>> 'foo' LayoutType). In other words, since now the creation of >>>>> LayoutType requires the generation of the struct impl for 'foo' >>>>> and since that depends (indirectly, through the pointer getter) on >>>>> being able to produce a LayoutType, you get a circularity. >>>>> >>>>> One thing we could try is - instead of eagerly creating the struct >>>>> impl, why don't we let the Reference.OfStruct having some mutable >>>>> state in it? 
That is, we could start off with Reference getter which >>>>> does the expensive refelective lookup - but then, once it has >>>>> discovered the constructor MH, it can stash it in some field >>>>> (which is >>>>> private to that reference object) and use it later if the getter is >>>>> used again. Then, you probably still need a ClassValue to stash a >>>>> mapping between a Class and its Reference.OfStruct; but it seems like >>>>> this could fit in more naturally? >>>>> >>>>> Maurizio >>>>> >>>>>> >>>>>> Thanks, >>>>>> Jorn >>>>>> >>>>>> Jorn Vernee schreef op 2019-05-21 21:06: >>>>>>> Since we have the resolution context for NativeHeader, AFAIK >>>>>>> there is >>>>>>> no more difference between the resolution call done by >>>>>>> StructImpleGenerator, and the one done by >>>>>>> LayoutTypeImpl.ofStruct. So >>>>>>> I don't think there are any more cases where we would have >>>>>>> succeeded >>>>>>> to resolve the Struct layout be delaying spinning the impl. At >>>>>>> least >>>>>>> the tests haven't caught such a case. >>>>>>> >>>>>>> The other thing is that the partial layout for the getter is >>>>>>> caught in >>>>>>> StructImplGenerator, but for the setter it's caught when calling >>>>>>> bitSize on Unresolved. Saying layouts should be able to be resolved >>>>>>> when calling LayoutType.ofStruct means we can use >>>>>>> References.OfGrumpy, >>>>>>> which makes the two more uniform. >>>>>>> >>>>>>> I have some ideas for keeping the lazy init semantics, but it's >>>>>>> a bit >>>>>>> more complex (using a MutableCallSite to mimic indy), and I'm >>>>>>> not sure >>>>>>> it will work as well. >>>>>>> >>>>>>> And, well, there was some talk about eagerly spinning the >>>>>>> implementations any ways :) >>>>>>> >>>>>>> Jorn >>>>>>> >>>>>>> Maurizio Cimadamore schreef op 2019-05-21 20:09: >>>>>>>> Looks good, although I'm a bit worried about the change in >>>>>>>> semantics >>>>>>>> w.r.t. eager instantiation. 
The binder will create a lot of >>>>>>>> LayoutTypes when generating the implementation - I wonder there >>>>>>>> were >>>>>>>> cases before where we created a partial layout type, which then >>>>>>>> got >>>>>>>> resolved correctly by the time it was dereferenced (since we do >>>>>>>> another resolve lazily in StructImplGenerator [1]). >>>>>>>> >>>>>>>> [1] - >>>>>>>> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52 >>>>>>>> On 21/05/2019 14:41, Jorn Vernee wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> After the recent string of benchmarking [1], I've arrived at 2 >>>>>>>>> optimizations to improve the speed of the measured code path. >>>>>>>>> >>>>>>>>> 1.) Specialization of Struct getter MethodHandles per struct >>>>>>>>> class. >>>>>>>>> 2.) Implementation of RuntimeSupport::casterImpl that does a >>>>>>>>> fused cast and offset operation, to avoid creating multiple >>>>>>>>> Pointer objects. >>>>>>>>> >>>>>>>>> The benchmark: >>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/ >>>>>>>>> The optimizations: >>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/ >>>>>>>>> >>>>>>>>> I've split these into 2 so that it's easier to run the >>>>>>>>> benchmarks with and without the optimizations. (benchmark uses >>>>>>>>> the OpenJDK's builtin framework [2]). >>>>>>>>> >>>>>>>>> Since we're now more eagerly instantiating the struct impl >>>>>>>>> class I had to work around partial struct types, since >>>>>>>>> spinning the impl requires a non-partial type and now we're >>>>>>>>> spinning the impl when creating the LayouType for the struct, >>>>>>>>> as opposed to on the first dereference. To do this I'm >>>>>>>>> detecting whether the struct is partial in >>>>>>>>> LayoutType.ofStruct, and using a Reference.OfGrumpy in the >>>>>>>>> case where it can not be resolved. 
Tbh, I think this makes >>>>>>>>> things a little more clear as well as far as where/how the >>>>>>>>> exception for deref of a partial type is thrown. >>>>>>>>> >>>>>>>>> Results on my machine before the optimization are: >>>>>>>>> >>>>>>>>> Benchmark??????????????????????? Mode? Cnt Score Error Units >>>>>>>>> GetStruct.jni_baseline?????????? avgt?? 50 14.204 ? 0.566 ns/op >>>>>>>>> GetStruct.panama_get_both??????? avgt?? 50 507.638 ? 19.462 ns/op >>>>>>>>> GetStruct.panama_get_fieldonly?? avgt?? 50 90.236 ? 11.027 ns/op >>>>>>>>> GetStruct.panama_get_structonly? avgt?? 50 370.783 ? 13.744 ns/op >>>>>>>>> >>>>>>>>> And after: >>>>>>>>> >>>>>>>>> Benchmark??????????????????????? Mode? Cnt?? Score Error Units >>>>>>>>> GetStruct.jni_baseline?????????? avgt?? 50? 13.941 ? 0.485 ns/op >>>>>>>>> GetStruct.panama_get_both??????? avgt?? 50? 41.199 ? 1.632 ns/op >>>>>>>>> GetStruct.panama_get_fieldonly?? avgt?? 50? 33.432 ? 1.889 ns/op >>>>>>>>> GetStruct.panama_get_structonly? avgt?? 50? 13.469 ? 0.781 ns/op >>>>>>>>> >>>>>>>>> Where panama_get_structonly corresponds to 1., and >>>>>>>>> panama_get_fieldonly corresponds to 2. For a total of about >>>>>>>>> 12x speedup. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jorn >>>>>>>>> >>>>>>>>> [1] : >>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html >>>>>>>>> [2] : https://openjdk.java.net/jeps/230 From henry.jen at oracle.com Wed May 22 16:54:54 2019 From: henry.jen at oracle.com (Henry Jen) Date: Wed, 22 May 2019 09:54:54 -0700 Subject: [foreign] RFR: 8224244: Cleanup libclang Java API In-Reply-To: <8dcaa155-38ba-7a7d-b4c4-17d149523bea@oracle.com> References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com> <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com> <8dcaa155-38ba-7a7d-b4c4-17d149523bea@oracle.com> Message-ID: <0FA0A6F2-E35E-48B2-8B6E-0971D0ACAE84@oracle.com> Updated[1] with feedback from Sundar and Maurizio. Integrated change for ClangUtils as well. 
[1] http://cr.openjdk.java.net/~henryjen/panama/8224244/2/webrev/ Cheers, Henry > On May 22, 2019, at 2:27 AM, Maurizio Cimadamore wrote: > > I like both parts - nice job. > > In JextractTool, what exactly is the point of catching the parse exception and rethrow? Wouldn't that be propagated out anyway? > > Maurizio > > On 22/05/2019 05:21, Henry Jen wrote: >> Updated webrev[1] change ParsingFailingException to extend RuntimeException and revert Handlers changes for declaring exception. >> >> I also have an add-on webrev[2] that instead of hard-code all built-in type, utilize reparse to find built-in types on demand. I don't think it's necessary better, just throw it as an option. >> >> Let me know what you think. >> >> [1] http://cr.openjdk.java.net/~henryjen/panama/8224244/1/webrev/ >> [2] http://cr.openjdk.java.net/~henryjen/panama/8224244/1.1/webrev/ >> >> Cheers, >> Henry >> >>> On May 21, 2019, at 8:59 AM, Henry Jen wrote: >>> >>> >>>> On May 21, 2019, at 1:25 AM, Maurizio Cimadamore wrote: >>>> >>>> Looks very good - couple of comments: >>>> >>>> * I think it would be handy to collect all the translation unit options into an enum - I'm referring to these: >>>> >>>> 46 private final static int CXTranslationUnit_DetailedPreprocessingRecord = 0x01; >>>> >>>> That would have been really helpful when writing the reparsing code, when often I had to browse the clang headers to look for these. >>>> >>> Note that we didn't expose an API to take options, so this constants are private, therefore I chose to remove those we don't use. I thought about adding an API to take options and add all those constants, but didn't go for it as we don't need it for now, and this JNI binding is simple wrapper simply for jextract. Once runtime suppose is available, we really want to just use extracted API, which is why we have FFI jextract case. :) >>> >>> Regarding enum, I won't recommend in this case, as they are flags to be combined with OR operator.
>>> >>>> * it seem to me that if the parser catches the index exception and then wraps it as a runtime exception (IllegalStateException) then you can revert most of the changes to the other classes? Also, another thing to consider: maybe we wanna make the parsing exception an unchecked exception? After all, there's not much recovery that can be done on those, forcing all clients to catch it seems excessive? >>>> >>> I think you are right, it?s better ParsingFailingException extends RuntimeException. I was thinking that we should force consideration of the case, but I agree that there is not much to do other than gracefully exit. >>> >>> Cheers, >>> Henry >>> >>>> Maurizio >>>> >>>> On 20/05/2019 21:53, Henry Jen wrote: >>>>> Hi, >>>>> >>>>> Please review the webrev[1] for clang Java API cleanup that came up with earlier work on atomic type support. >>>>> Mainly to throw a checked exception on parsing error, and move translation unit APIs into TranslationUnit class. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> [1] >>>>> http://cr.openjdk.java.net/~henryjen/panama/8224244/0/webrev/ >>>>> >>>>> [2] >>>>> https://bugs.openjdk.java.net/browse/JDK-8224244 From maurizio.cimadamore at oracle.com Wed May 22 17:40:57 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 22 May 2019 18:40:57 +0100 Subject: [foreign-memaccess] RFR 8224614: Cleanup MemoryScope and its implementation Message-ID: Hi, this patch implements the approach described in [1]. I've refactored MemoryScopeImpl into an abstract class (AbstractMemoryScopeImpl) and two concrete subclasses: * GlobalMemoryScopeImpl - for global scopes (roots of scope hierarchies) * ConfinedMemoryScopeImpl - for 'local' scopes (created using fork) The former is shared across multiple threads, but there's no need for synchronization as there's no mutable state (as global scopes can't be closed). 
The latter is thread-confined - which means only the owner thread (which is established at scope-creation time) can do scope operations such as fork/allocate/merge/close. I've also cleaned up the various characteristic flags in MemoryScope; some of those no longer made sense, since we decided against having VarHandles for reading/writing addresses directly. I kept the following: * PINNED - used to mark scopes that cannot be closed; it's a property of global scopes and cannot be set w/o super-user powers * IMMUTABLE - means that the underlying memory cannot be written to * UNALIGNED - means that we allow memory writes when addresses do not conform to the alignment requirements of the layout from which the VarHandle was created * CONFINED - means that memory associated with the scope can be accessed only within the owning thread All characteristics are disabled by default - and it's up to the client to set them. There's no 'inheritance' of characteristics either from parents to children. Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224614/ Cheers Maurizio [1] - https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005494.html From maurizio.cimadamore at oracle.com Wed May 22 20:06:50 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 22 May 2019 21:06:50 +0100 Subject: [foreign] RFR: 8224244: Cleanup libclang Java API In-Reply-To: <0FA0A6F2-E35E-48B2-8B6E-0971D0ACAE84@oracle.com> References: <056BED96-581F-48F9-AE20-6376CC333A98@oracle.com> <6A20BA23-9FEA-4129-8917-5DBC464286C6@oracle.com> <8dcaa155-38ba-7a7d-b4c4-17d149523bea@oracle.com> <0FA0A6F2-E35E-48B2-8B6E-0971D0ACAE84@oracle.com> Message-ID: <4b792fb7-dec8-d26b-ee00-85cf15439ab1@oracle.com> Looks good! Maurizio On 22/05/2019 17:54, Henry Jen wrote: > Updated[1] with feedback from Sundar and Maurizio. Integrated change for ClangUtils as well.
> > [1] http://cr.openjdk.java.net/~henryjen/panama/8224244/2/webrev/ > > Cheers, > Henry > > >> On May 22, 2019, at 2:27 AM, Maurizio Cimadamore wrote: >> >> I like both parts - nice job. >> >> In JextractTool, what exactly is the point of catching the parse exception and rethrow? Wouldn't that be propagated out anyway? >> >> Maurizio >> >> On 22/05/2019 05:21, Henry Jen wrote: >>> Updated webrev[1] change ParsingFailingException to extend RuntimeException and revert Handlers changes for declaring exception. >>> >>> I also have an add-on webrev[2] that instead of hard-code all built-in type, utilize reparse to find built-in types on demand. I don?t think it?s necessary better, just throw it as an option. >>> >>> Let me know what you think. >>> >>> [1] http://cr.openjdk.java.net/~henryjen/panama/8224244/1/webrev/ >>> [2] http://cr.openjdk.java.net/~henryjen/panama/8224244/1.1/webrev/ >>> >>> Cheers, >>> Henry >>> >>>> On May 21, 2019, at 8:59 AM, Henry Jen wrote: >>>> >>>> >>>>> On May 21, 2019, at 1:25 AM, Maurizio Cimadamore wrote: >>>>> >>>>> Looks very good - couple of comments: >>>>> >>>>> * I think it would be handy to collect all the translation unit options into an enum - I'm referring to these: >>>>> >>>>> 46 private final static int CXTranslationUnit_DetailedPreprocessingRecord = 0x01; >>>>> >>>>> That would have been really helpful when writing the reparsing code, when often I had to browse the clang headers to look for these. >>>>> >>>> Note that we didn?t expose an API to take options, so this constants are private, therefore I chose to remove those we don?t use. I thought about adding an API to take options and add all those constants, but didn?t go for it as we don?t need it for now, and this JNI binding is simple wrapper simply for jextract. Once runtime suppose is available, we really want to just use extracted API, which is why we have FFI jextract case. 
:) >>>> >>>> Regarding enum, I won?t recommend in this case, as they are flags to be combined with OR operator. >>>> >>>>> * it seem to me that if the parser catches the index exception and then wraps it as a runtime exception (IllegalStateException) then you can revert most of the changes to the other classes? Also, another thing to consider: maybe we wanna make the parsing exception an unchecked exception? After all, there's not much recovery that can be done on those, forcing all clients to catch it seems excessive? >>>>> >>>> I think you are right, it?s better ParsingFailingException extends RuntimeException. I was thinking that we should force consideration of the case, but I agree that there is not much to do other than gracefully exit. >>>> >>>> Cheers, >>>> Henry >>>> >>>>> Maurizio >>>>> >>>>> On 20/05/2019 21:53, Henry Jen wrote: >>>>>> Hi, >>>>>> >>>>>> Please review the webrev[1] for clang Java API cleanup that came up with earlier work on atomic type support. >>>>>> Mainly to throw a checked exception on parsing error, and move translation unit APIs into TranslationUnit class. >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> [1] >>>>>> http://cr.openjdk.java.net/~henryjen/panama/8224244/0/webrev/ >>>>>> >>>>>> [2] >>>>>> https://bugs.openjdk.java.net/browse/JDK-8224244 From henry.jen at oracle.com Wed May 22 20:28:57 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Wed, 22 May 2019 20:28:57 +0000 Subject: hg: panama/dev: 8224244: Cleanup libclang Java API Message-ID: <201905222028.x4MKSwDl005116@aojmv0008.oracle.com> Changeset: 864485476d3c Author: henryjen Date: 2019-05-22 13:28 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/864485476d3c 8224244: Cleanup libclang Java API Reviewed-by: mcimadamore, sundar ! src/jdk.internal.clang/share/classes/jdk/internal/clang/ClangUtils.java ! src/jdk.internal.clang/share/classes/jdk/internal/clang/Comment.java ! src/jdk.internal.clang/share/classes/jdk/internal/clang/Index.java ! 
src/jdk.internal.clang/share/classes/jdk/internal/clang/TranslationUnit.java ! src/jdk.internal.clang/share/native/libjclang/jdk_internal_clang.cpp ! src/jdk.jextract/share/classes/com/sun/tools/jextract/JextractTool.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/parser/MacroParser.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/parser/Parser.java ! test/jdk/com/sun/tools/jextract/Runner.java ! test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/ClangUtils.java ! test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/Index.java ! test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/TranslationUnit.java From maurizio.cimadamore at oracle.com Wed May 22 20:34:53 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 22 May 2019 20:34:53 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905222034.x4MKYsT0007413@aojmv0008.oracle.com> Changeset: b4ef59d5c629 Author: mcimadamore Date: 2019-05-22 22:34 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/b4ef59d5c629 Automatic merge with foreign From henry.jen at oracle.com Wed May 22 20:50:23 2019 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Wed, 22 May 2019 20:50:23 +0000 Subject: hg: panama/dev: Remove stacktrace used for debugging Message-ID: <201905222050.x4MKoNLj016834@aojmv0008.oracle.com> Changeset: 8fbae43980f3 Author: henryjen Date: 2019-05-22 13:50 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/8fbae43980f3 Remove stacktrace used for debugging ! src/jdk.internal.clang/share/classes/jdk/internal/clang/ClangUtils.java ! 
test/jdk/com/sun/tools/jextract/jclang-ffi/src/jdk/internal/clang/ClangUtils.java From maurizio.cimadamore at oracle.com Wed May 22 20:54:32 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 22 May 2019 20:54:32 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905222054.x4MKsWQY020891@aojmv0008.oracle.com> Changeset: 8550307d1118 Author: mcimadamore Date: 2019-05-22 22:54 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/8550307d1118 Automatic merge with foreign From maurizio.cimadamore at oracle.com Wed May 22 22:28:43 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 22 May 2019 23:28:43 +0100 Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths. In-Reply-To: <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com> <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> Message-ID: <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com> I did some more analysis on the ClassValue issue and I'm now convinced that what we are doing is _not_ problematic. What we really care about here is that, if we create a Reference.OfStruct for class Foo, we don't want the ClassValue we're using to cache that reference to prevent unloading of Foo. That is a slightly different problem from the one described in [1]. There, the issue is that the storage associated with ClassValue (which lives inside Class objects) keeps growing indefinitely, in the case where the computed values keep strong references to the ClassValue itself. This is due to the way in which ClassValue behaves. A ClassValue is not an ordinary map - rather, when you call 'get' on a ClassValue with a given class C, you really ask the Class object for C for its ClassValue storage (a so-called ClassValueMap abstraction).
This map is essentially a WeakHashMap<Identity, Entry>, where Identity is a field of the ClassValue uniquely identifying it, whereas Entry contains the computed value associated with that class and ClassValue instance (the Entry class has a lot of extra complexity to deal with versioning, which is irrelevant here). So, if the ClassValue instance goes away, the fact that we're using a WeakHashMap here allows the map to shrink in size. Of course, for this to happen, you don't want to have a strong reference from Entry (that is, from the computed value) back to the ClassValue instance - as in doing so you will prevent collection of the WeakHashMap entries. The bug in [1] shows that, when that happens, it is essentially possible to grow the WeakHashMap attached to a class object at will, until an OOME is produced. But in our case we're not concerned with the fact that we keep adding multiple ClassValues to the _same_ class object; it's actually the opposite: we have a single ClassValue (in References.OfStruct), and many classes. In such a case, when the class goes away (because its classloader goes), it will just go away; there will be nothing preventing the collection of that class. Attached is a test (with two files, Test.java and Dummy.java) - Test creates a new class loader, loads Dummy in it, and then stashes a value for the Dummy class into a shared ClassValue. To make things as nasty as possible, the value we're storing has strong references to both the Dummy class and the ClassValue itself. But, as soon as the loader is closed, the finalizer is run as expected and memory usage remains under control. So, popping back to our enhancements, I think what the patch does is legit. In terms of the code, I don't like how the code made OfStruct _not_ a Reference, and is instead using OfStruct as a holder for some helper functionality, plus the cache, whereas the real reference is an anonymous class generated inside the computeValue() method.
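For readers following the ClassValue discussion, here is a minimal sketch of the general shape being debated: one shared ClassValue, many classes, and a named per-class object created in computeValue() that stashes a constructor MethodHandle after the first reflective lookup. This is only an illustration under stated assumptions: the names StructRef and REFS are hypothetical and are not the Panama References.OfStruct code, and StringBuilder merely stands in for a generated struct impl class.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ClassValueCacheSketch {

    /** Hypothetical per-class reference object; not the Panama References.OfStruct. */
    public static final class StructRef {
        private final Class<?> carrier;
        private MethodHandle ctor; // looked up on first use, then reused

        StructRef(Class<?> carrier) {
            this.carrier = carrier;
        }

        /** Creates an instance through the cached no-arg constructor handle. */
        public Object newInstance() {
            try {
                if (ctor == null) {
                    // The reflective lookup runs only on the first call.
                    // Unsynchronized: a race could repeat the lookup, which is harmless here.
                    ctor = MethodHandles.publicLookup()
                            .findConstructor(carrier, MethodType.methodType(void.class));
                }
                return ctor.invoke();
            } catch (Throwable t) {
                throw new IllegalStateException("cannot instantiate " + carrier, t);
            }
        }
    }

    /** Single shared ClassValue: one StructRef per class, computed on demand. */
    public static final ClassValue<StructRef> REFS = new ClassValue<StructRef>() {
        @Override
        protected StructRef computeValue(Class<?> type) {
            return new StructRef(type);
        }
    };

    public static void main(String[] args) {
        StructRef first = REFS.get(StringBuilder.class);
        StructRef second = REFS.get(StringBuilder.class);
        // Both lookups return the same cached StructRef instance.
        System.out.println(first == second);
        // The instance is created through the stashed constructor handle.
        System.out.println(first.newInstance().getClass());
    }
}
```

The sketch says nothing about class unloading by itself; it only shows that the cached value is keyed by the Class object, so its lifetime follows that of the class rather than that of a separate global map.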
It seems to me that we could have Reference.OfStruct keep being a Reference, have a constructor that takes a Class object, and then have a static ClassValue field in References which, upon computeValue creates a new instance of Reference.OfStruct for that class. I think the implementation would be a lot more linear that way (unless I'm missing something). Cheers Maurizio [1] - https://bugs.openjdk.java.net/browse/JDK-8136353 On 22/05/2019 16:30, Maurizio Cimadamore wrote: > Looks good - module pending questions on use of ClassValue. > > I think we should come up with some kind of test case that shows the > ClassValue issue and then test with different approaches > > Maurizio > > On 22/05/2019 16:09, Jorn Vernee wrote: >> Coming back to this once more, >> >> I finally got my profiler working (after setting up a separate >> project) and saw a lot of time spent getting the field offset: >> >> ?37.00%?????? c2, level 4 >> jdk.internal.foreign.LayoutPaths$$Lambda$66.0x0000000800c05040::getAsLong, >> version 691 >> ?30.19%?????? c2, level 4 >> jdk.internal.foreign.RuntimeSupport::casterImpl, version 724 >> ?22.12%?????? c2, level 4 >> org.sample.generated.GetStruct_panama_get_fieldonly_jmhTest::panama_get_fieldonly_avgt_jmhStub, >> version 746 >> ?... >> >> i.e. the call to LayoutPath.offset() in RuntimeSupport::casterImpl >> can not be inlined, and we're re-computing the field offset over and >> over again. >> >> The fix for this is pretty simple; instead of passing the LayoutPath >> to the caster, we pre-compute the offset and then pass that. (This >> should be constant, right?). >> >> This yields some more speedup: >> >> Benchmark??????????????????????? Mode? Cnt?? Score?? Error Units >> GetStruct.jni_baseline?????????? avgt?? 50? 13.337 ? 0.251 ns/op >> GetStruct.panama_get_both??????? avgt?? 50? 17.026 ? 0.458 ns/op >> GetStruct.panama_get_fieldonly?? avgt?? 50?? 7.796 ? 0.166 ns/op >> GetStruct.panama_get_structonly? avgt?? 50? 11.863 ? 
0.358 ns/op >> >> Putting us pretty much even with jni_baseline. >> >> Updated Webrev: >> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.03/ >> >> (Only changes are to RuntimeSupport) >> >> Cheers, >> Jorn >> >> Jorn Vernee schreef op 2019-05-22 12:51: >>> Ah, good point. >>> >>>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference -> >>>> ClassValue >>> >>> I don't think that last link is quite right though. The LayoutType >>> references the anonymous Reference class, not References.OfStruct >>> (which contains the ClassValue). >>> >>> I think it would be: >>> >>> User Code -> LayoutType -> anonymous Reference -> getter MH -> >>> StructImpl -> LayoutType >>> >>> There could still be a cycle there, but the whole cycle can be GC'd >>> once the reference from user code goes away. >>> >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-22 12:37: >>>> Looks good - yesterday I was looking at this discussion: >>>> >>>> http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html >>>> >>>> >>>> I hope we don't run in the condition described there - e.g. that >>>> there's no strong reachability from the MH we're caching back to the >>>> static ClassValue instance - because, if that would be the case I >>>> think that would prevent class unloading. >>>> >>>> The problem is that the MethodHandle we cache refers to the stuct impl >>>> class, and I believe that class refers to some LayoutTypes on its own, >>>> which have a Reference inside, so it would be: >>>> >>>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference -> >>>> ClassValue >>>> >>>> Sundar can you double check? >>>> >>>> Maurizio >>>> >>>> On 22/05/2019 10:56, Jorn Vernee wrote: >>>>> Good suggestion! This solves the problem, is nice and simple, and >>>>> keeps the same times in the benchmark. 
>>>>> >>>>> Updated webrev: >>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/ >>>>> >>>>> (only changes to References.java) >>>>> >>>>> I've added a test for the failure. I think that can be included as >>>>> well? I re-ran the samples I have as well, and this time it's all >>>>> green. >>>>> >>>>> Thanks, >>>>> Jorn >>>>> >>>>> Maurizio Cimadamore schreef op 2019-05-22 01:15: >>>>>> On 21/05/2019 20:16, Jorn Vernee wrote: >>>>>>> Although, now that you bring it up, I tried re-running some of >>>>>>> the samples (hadn't done that yet), and I'm seeing some infinite >>>>>>> recursion. This is seemingly caused by a circular type reference >>>>>>> (e.g. linked list). i.e. to spin the impl of an accessor we need >>>>>>> the LayoutType of the struct itself, which then tries to spin >>>>>>> the impl again, and so on. I guess this isn't a test case in our >>>>>>> suite yet... >>>>>>> >>>>>>> I'll look into this. >>>>>> >>>>>> Good detective work! I guess it would make sense to try and >>>>>> reduce it >>>>>> down to a simpler test, and push the test first. >>>>>> >>>>>> Where I was going with this is - your patch effectively made the >>>>>> lazy >>>>>> resolution inside StructImplGenerator useless. If we really want to >>>>>> explore that option, then we should, I think, remove all lazy >>>>>> resolution sites and see what happens. It is possible that we don't >>>>>> rely so much on laziness as we did in the past (we did some fixes >>>>>> few >>>>>> months ago which stabilized resolution quite a bit) - in which >>>>>> case we >>>>>> can remove the resolution requests, although - I have to admit - >>>>>> I'm a >>>>>> bit skeptical. After all all you need it something like this (as you >>>>>> say): >>>>>> >>>>>> struct foo { >>>>>> ??? struct foo *next; >>>>>> } >>>>>> >>>>>> Which is kind of the killer app for unresolved layouts in the >>>>>> first place. 
>>>>>> >>>>>> This is translated into a struct interface which has a getter of >>>>>> Pointer. To generate the getter you need to compute its >>>>>> LayoutType which is a pointer LayoutType, so you have to compute the >>>>>> pointee LayoutType which brings you back where you started (the >>>>>> whole >>>>>> 'foo' LayoutType). In other words, since now the creation of >>>>>> LayoutType requires the generation of the struct impl for 'foo' >>>>>> and since that depends (indirectly, through the pointer getter) on >>>>>> being able to produce a LayoutType, you get a circularity. >>>>>> >>>>>> One thing we could try is - instead of eagerly creating the struct >>>>>> impl, why don't we let the Reference.OfStruct having some mutable >>>>>> state in it? That is, we could start off with Reference getter which >>>>>> does the expensive refelective lookup - but then, once it has >>>>>> discovered the constructor MH, it can stash it in some field >>>>>> (which is >>>>>> private to that reference object) and use it later if the getter is >>>>>> used again. Then, you probably still need a ClassValue to stash a >>>>>> mapping between a Class and its Reference.OfStruct; but it seems >>>>>> like >>>>>> this could fit in more naturally? >>>>>> >>>>>> Maurizio >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Jorn >>>>>>> >>>>>>> Jorn Vernee schreef op 2019-05-21 21:06: >>>>>>>> Since we have the resolution context for NativeHeader, AFAIK >>>>>>>> there is >>>>>>>> no more difference between the resolution call done by >>>>>>>> StructImpleGenerator, and the one done by >>>>>>>> LayoutTypeImpl.ofStruct. So >>>>>>>> I don't think there are any more cases where we would have >>>>>>>> succeeded >>>>>>>> to resolve the Struct layout be delaying spinning the impl. At >>>>>>>> least >>>>>>>> the tests haven't caught such a case. 
>>>>>>>> >>>>>>>> The other thing is that the partial layout for the getter is >>>>>>>> caught in >>>>>>>> StructImplGenerator, but for the setter it's caught when calling >>>>>>>> bitSize on Unresolved. Saying layouts should be able to be >>>>>>>> resolved >>>>>>>> when calling LayoutType.ofStruct means we can use >>>>>>>> References.OfGrumpy, >>>>>>>> which makes the two more uniform. >>>>>>>> >>>>>>>> I have some ideas for keeping the lazy init semantics, but it's >>>>>>>> a bit >>>>>>>> more complex (using a MutableCallSite to mimic indy), and I'm >>>>>>>> not sure >>>>>>>> it will work as well. >>>>>>>> >>>>>>>> And, well, there was some talk about eagerly spinning the >>>>>>>> implementations any ways :) >>>>>>>> >>>>>>>> Jorn >>>>>>>> >>>>>>>> Maurizio Cimadamore schreef op 2019-05-21 20:09: >>>>>>>>> Looks good, although I'm a bit worried about the change in >>>>>>>>> semantics >>>>>>>>> w.r.t. eager instantiation. The binder will create a lot of >>>>>>>>> LayoutTypes when generating the implementation - I wonder >>>>>>>>> there were >>>>>>>>> cases before where we created a partial layout type, which >>>>>>>>> then got >>>>>>>>> resolved correctly by the time it was dereferenced (since we do >>>>>>>>> another resolve lazily in StructImplGenerator [1]). >>>>>>>>> >>>>>>>>> [1] - >>>>>>>>> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52 >>>>>>>>> On 21/05/2019 14:41, Jorn Vernee wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> After the recent string of benchmarking [1], I've arrived at >>>>>>>>>> 2 optimizations to improve the speed of the measured code path. >>>>>>>>>> >>>>>>>>>> 1.) Specialization of Struct getter MethodHandles per struct >>>>>>>>>> class. >>>>>>>>>> 2.) Implementation of RuntimeSupport::casterImpl that does a >>>>>>>>>> fused cast and offset operation, to avoid creating multiple >>>>>>>>>> Pointer objects. 
>>>>>>>>>>
>>>>>>>>>> The benchmark:
>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
>>>>>>>>>> The optimizations:
>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
>>>>>>>>>>
>>>>>>>>>> I've split these into 2 so that it's easier to run the
>>>>>>>>>> benchmarks with and without the optimizations. (The benchmark
>>>>>>>>>> uses OpenJDK's built-in framework [2].)
>>>>>>>>>>
>>>>>>>>>> Since we're now more eagerly instantiating the struct impl
>>>>>>>>>> class I had to work around partial struct types, since
>>>>>>>>>> spinning the impl requires a non-partial type and now we're
>>>>>>>>>> spinning the impl when creating the LayoutType for the struct,
>>>>>>>>>> as opposed to on the first dereference. To do this I'm
>>>>>>>>>> detecting whether the struct is partial in
>>>>>>>>>> LayoutType.ofStruct, and using a Reference.OfGrumpy in the
>>>>>>>>>> case where it cannot be resolved. Tbh, I think this also makes
>>>>>>>>>> it a little clearer where/how the exception for deref of a
>>>>>>>>>> partial type is thrown.
>>>>>>>>>>
>>>>>>>>>> Results on my machine before the optimization are:
>>>>>>>>>>
>>>>>>>>>> Benchmark                        Mode  Cnt    Score    Error  Units
>>>>>>>>>> GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
>>>>>>>>>> GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
>>>>>>>>>> GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
>>>>>>>>>> GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op
>>>>>>>>>>
>>>>>>>>>> And after:
>>>>>>>>>>
>>>>>>>>>> Benchmark                        Mode  Cnt    Score    Error  Units
>>>>>>>>>> GetStruct.jni_baseline           avgt   50   13.941 ±  0.485  ns/op
>>>>>>>>>> GetStruct.panama_get_both        avgt   50   41.199 ±  1.632  ns/op
>>>>>>>>>> GetStruct.panama_get_fieldonly   avgt   50   33.432 ±  1.889  ns/op
>>>>>>>>>> GetStruct.panama_get_structonly  avgt   50   13.469 ±  0.781  ns/op
>>>>>>>>>>
>>>>>>>>>> Where panama_get_structonly corresponds to 1., and
>>>>>>>>>> panama_get_fieldonly corresponds to 2. For a total of about
>>>>>>>>>> 12x speedup.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jorn
>>>>>>>>>>
>>>>>>>>>> [1] : https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
>>>>>>>>>> [2] : https://openjdk.java.net/jeps/230

From maurizio.cimadamore at oracle.com Wed May 22 22:43:46 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 22 May 2019 23:43:46 +0100
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To: <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com>
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com> <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com>
Message-ID: <664ea231-97f9-a576-6290-84f2a61801ec@oracle.com>

Also, I believe the instance field for the lazily computed getter MH
should be marked as @Stable, which could possibly squeeze some more
out of the JIT.

Maurizio

On 22/05/2019 23:28, Maurizio Cimadamore wrote:
> So, popping back to our enhancements, I think what the patch does is
> legit. In terms of the code, I don't like how the code made OfStruct
> _not_ a Reference, and is instead using OfStruct as a holder for some
> helper functionalities, plus the cache, whereas the real reference is
> an anonymous class generated inside the computeValue() method.
>
> It seems to me that we could have Reference.OfStruct keep being a
> Reference, have a constructor that takes a Class object, and then have
> a static ClassValue field in References which, upon computeValue
> creates a new instance of Reference.OfStruct for that class.
I think
> the implementation would be a lot more linear that way (unless I'm
> missing something)

From jbvernee at xs4all.nl Thu May 23 12:38:29 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Thu, 23 May 2019 14:38:29 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To: <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com>
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com> <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com>
Message-ID: <199f576776a184b9b2f2b4862cd5a8c7@xs4all.nl>

Response inline...

Maurizio Cimadamore wrote on 2019-05-23 00:28:
> I did some more analysis on the ClassValue issue and I'm now convinced
> that what we are doing is _not_ problematic.
>
> What we really care about here is that, if we create a
> Reference.OfStruct for class Foo, we don't want the ClassValue we're
> using to cache that reference to prevent unloading of Foo. That is a
> slightly different problem than the one described in [1]. There, the
> issue is that the storage associated with ClassValue (which lives
> inside Class objects) keeps growing indefinitely, in cases where the
> computed values keep strong references to the ClassValue itself. This
> is due to the way in which ClassValue behaves.
>
> A ClassValue is not an ordinary map - rather, when you call 'get' on a
> ClassValue with a given class C, you really ask the Class object for C
> for its ClassValue storage (a so-called ClassValueMap abstraction).
> This map is essentially a WeakHashMap<Identity, Entry>, where Identity
> is a field of the ClassValue uniquely identifying it, whereas Entry
> contains the computed value associated with that class and ClassValue
> instance (the Entry class has a lot of extra complexity to deal with
> versioning, which is irrelevant here).
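As a rough illustration of the lookup behaviour described above (one
shared ClassValue, with the computed value stored on the Class side and
computed once per class), here is a minimal sketch. The names
ClassValueSketch, DESCRIPTIONS and COMPUTATIONS are made up for
illustration and do not come from the Panama sources; only
java.lang.ClassValue itself is the real API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the Panama code base.
final class ClassValueSketch {
    // Counts how often computeValue runs, to make the caching visible.
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    // A single shared ClassValue; each computed value is stored per Class.
    static final ClassValue<String> DESCRIPTIONS = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            COMPUTATIONS.incrementAndGet();
            return "cached:" + type.getSimpleName();
        }
    };

    public static void main(String[] args) {
        System.out.println(DESCRIPTIONS.get(String.class));
        // Second lookup for the same class is served from the per-class storage.
        System.out.println(DESCRIPTIONS.get(String.class));
        System.out.println(DESCRIPTIONS.get(Integer.class));
        System.out.println("computeValue calls: " + COMPUTATIONS.get());
    }
}
```

In this single-threaded sketch computeValue runs once per distinct
class, which is the "single ClassValue, many classes" shape discussed
in this thread.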
>
> So, if the ClassValue instance goes away, the fact that we're using a
> WeakHashMap here allows the map to shrink in size. Of course, for
> this to happen, you don't want to have a strong reference from Entry
> (that is, from the computed value) back to the ClassValue instance -
> as doing so would prevent collection of the WeakHashMap entries.
>
> The bug in [1] shows that, when that happens, it is essentially
> possible to grow the WeakHashMap attached to a class object at will,
> until an OOME is produced.
>
> But in our case we're not concerned with the fact that we keep adding
> multiple ClassValue to the _same_ class object; it's actually the
> opposite: we have a single ClassValue (in References.OfStruct), and
> many classes. In such a case, when the class goes away (because its
> classloader goes), it will just go away; there will be nothing
> preventing the collection of that class.
>
> Attached is a test (with two files, Test.java and Dummy.java) - Test
> creates a new class loader, loads Dummy in it, and then stashes a value
> for the Dummy class into a shared ClassValue. To make things as nasty
> as possible, the value we're storing has strong references to both the
> Dummy class and the ClassValue itself. But, as soon as the loader is
> closed, the finalizer is run as expected and memory usage remains
> under control.

Thanks for the extensive research, and for explaining it! It's good to
hear that using ClassValue won't be an issue for us.

I tried out the test, and I'm also seeing the finalizer being run.

> So, popping back to our enhancements, I think what the patch does is
> legit. In terms of the code, I don't like how the code made OfStruct
> _not_ a Reference, and is instead using OfStruct as a holder for some
> helper functionalities, plus the cache, whereas the real reference is
> an anonymous class generated inside the computeValue() method.
>
> It seems to me that we could have Reference.OfStruct keep being a
> Reference, have a constructor that takes a Class object, and then have
> a static ClassValue field in References which, upon computeValue
> creates a new instance of Reference.OfStruct for that class. I think
> the implementation would be a lot more linear that way (unless I'm
> missing something).

Yeah, I think doing that would make more sense. It would also help
show what fields a struct Reference actually has.

I've also added @Stable to the MethodHandle field (as suggested in
your other email) and re-ran the benchmark, but did not see an obvious
performance increase. I looked at the profile for
`panama_get_structonly`, but nothing really stands out to me:

 30.37%       c2, level 4  org.sample.generated.GetStruct_panama_get_structonly_jmhTest::panama_get_structonly_avgt_jmhStub, version 746
 25.90%  Unknown, level 0  java.lang.invoke.MethodHandle::invokeBasic, version 102
 16.95%       c2, level 4  java.lang.invoke.LambdaForm$MH.0x0000000800c0a840::invoke, version 713
 13.50%       c2, level 4  org.sample.GetStruct::panama_get_structonly, version 711

It looks like most time is spent on JMH overhead.

Updated webrev with your suggestions:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.04/

(Only changes to References)

Jorn

> Cheers
> Maurizio
>
> [1] - https://bugs.openjdk.java.net/browse/JDK-8136353
>
> On 22/05/2019 16:30, Maurizio Cimadamore wrote:
>> Looks good - modulo pending questions on use of ClassValue.
>>
>> I think we should come up with some kind of test case that shows the
>> ClassValue issue and then test with different approaches.
>>
>> Maurizio
>>
>> On 22/05/2019 16:09, Jorn Vernee wrote:
>>> Coming back to this once more,
>>>
>>> I finally got my profiler working (after setting up a separate
>>> project) and saw a lot of time spent getting the field offset:
>>>
>>>  37.00%  c2, level 4  jdk.internal.foreign.LayoutPaths$$Lambda$66.0x0000000800c05040::getAsLong, version 691
>>>  30.19%  c2, level 4  jdk.internal.foreign.RuntimeSupport::casterImpl, version 724
>>>  22.12%  c2, level 4  org.sample.generated.GetStruct_panama_get_fieldonly_jmhTest::panama_get_fieldonly_avgt_jmhStub, version 746
>>>  ...
>>>
>>> i.e. the call to LayoutPath.offset() in RuntimeSupport::casterImpl
>>> cannot be inlined, and we're re-computing the field offset over and
>>> over again.
>>>
>>> The fix for this is pretty simple; instead of passing the LayoutPath
>>> to the caster, we pre-compute the offset and then pass that. (This
>>> should be constant, right?).
>>>
>>> This yields some more speedup:
>>>
>>> Benchmark                        Mode  Cnt   Score   Error  Units
>>> GetStruct.jni_baseline           avgt   50  13.337 ± 0.251  ns/op
>>> GetStruct.panama_get_both        avgt   50  17.026 ± 0.458  ns/op
>>> GetStruct.panama_get_fieldonly   avgt   50   7.796 ± 0.166  ns/op
>>> GetStruct.panama_get_structonly  avgt   50  11.863 ± 0.358  ns/op
>>>
>>> Putting us pretty much even with jni_baseline.
>>>
>>> Updated Webrev:
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.03/
>>>
>>> (Only changes are to RuntimeSupport)
>>>
>>> Cheers,
>>> Jorn
>>>
>>> Jorn Vernee wrote on 2019-05-22 12:51:
>>>> Ah, good point.
>>>>
>>>>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference ->
>>>>> ClassValue
>>>>
>>>> I don't think that last link is quite right though. The LayoutType
>>>> references the anonymous Reference class, not References.OfStruct
>>>> (which contains the ClassValue).
>>>>
>>>> I think it would be:
>>>>
>>>> User Code -> LayoutType -> anonymous Reference -> getter MH ->
>>>> StructImpl -> LayoutType
>>>>
>>>> There could still be a cycle there, but the whole cycle can be GC'd
>>>> once the reference from user code goes away.
>>>> >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-22 12:37: >>>>> Looks good - yesterday I was looking at this discussion: >>>>> >>>>> http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html >>>>> I hope we don't run in the condition described there - e.g. that >>>>> there's no strong reachability from the MH we're caching back to >>>>> the >>>>> static ClassValue instance - because, if that would be the case I >>>>> think that would prevent class unloading. >>>>> >>>>> The problem is that the MethodHandle we cache refers to the stuct >>>>> impl >>>>> class, and I believe that class refers to some LayoutTypes on its >>>>> own, >>>>> which have a Reference inside, so it would be: >>>>> >>>>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference -> >>>>> ClassValue >>>>> >>>>> Sundar can you double check? >>>>> >>>>> Maurizio >>>>> >>>>> On 22/05/2019 10:56, Jorn Vernee wrote: >>>>>> Good suggestion! This solves the problem, is nice and simple, and >>>>>> keeps the same times in the benchmark. >>>>>> >>>>>> Updated webrev: >>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/ >>>>>> >>>>>> (only changes to References.java) >>>>>> >>>>>> I've added a test for the failure. I think that can be included as >>>>>> well? I re-ran the samples I have as well, and this time it's all >>>>>> green. >>>>>> >>>>>> Thanks, >>>>>> Jorn >>>>>> >>>>>> Maurizio Cimadamore schreef op 2019-05-22 01:15: >>>>>>> On 21/05/2019 20:16, Jorn Vernee wrote: >>>>>>>> Although, now that you bring it up, I tried re-running some of >>>>>>>> the samples (hadn't done that yet), and I'm seeing some infinite >>>>>>>> recursion. This is seemingly caused by a circular type reference >>>>>>>> (e.g. linked list). i.e. to spin the impl of an accessor we need >>>>>>>> the LayoutType of the struct itself, which then tries to spin >>>>>>>> the impl again, and so on. I guess this isn't a test case in our >>>>>>>> suite yet... 
>>>>>>>> >>>>>>>> I'll look into this. >>>>>>> >>>>>>> Good detective work! I guess it would make sense to try and >>>>>>> reduce it >>>>>>> down to a simpler test, and push the test first. >>>>>>> >>>>>>> Where I was going with this is - your patch effectively made the >>>>>>> lazy >>>>>>> resolution inside StructImplGenerator useless. If we really want >>>>>>> to >>>>>>> explore that option, then we should, I think, remove all lazy >>>>>>> resolution sites and see what happens. It is possible that we >>>>>>> don't >>>>>>> rely so much on laziness as we did in the past (we did some fixes >>>>>>> few >>>>>>> months ago which stabilized resolution quite a bit) - in which >>>>>>> case we >>>>>>> can remove the resolution requests, although - I have to admit - >>>>>>> I'm a >>>>>>> bit skeptical. After all all you need it something like this (as >>>>>>> you >>>>>>> say): >>>>>>> >>>>>>> struct foo { >>>>>>> ??? struct foo *next; >>>>>>> } >>>>>>> >>>>>>> Which is kind of the killer app for unresolved layouts in the >>>>>>> first place. >>>>>>> >>>>>>> This is translated into a struct interface which has a getter of >>>>>>> Pointer. To generate the getter you need to compute its >>>>>>> LayoutType which is a pointer LayoutType, so you have to compute >>>>>>> the >>>>>>> pointee LayoutType which brings you back where you started (the >>>>>>> whole >>>>>>> 'foo' LayoutType). In other words, since now the creation of >>>>>>> LayoutType requires the generation of the struct impl for >>>>>>> 'foo' >>>>>>> and since that depends (indirectly, through the pointer getter) >>>>>>> on >>>>>>> being able to produce a LayoutType, you get a circularity. >>>>>>> >>>>>>> One thing we could try is - instead of eagerly creating the >>>>>>> struct >>>>>>> impl, why don't we let the Reference.OfStruct having some mutable >>>>>>> state in it? 
That is, we could start off with Reference getter >>>>>>> which >>>>>>> does the expensive refelective lookup - but then, once it has >>>>>>> discovered the constructor MH, it can stash it in some field >>>>>>> (which is >>>>>>> private to that reference object) and use it later if the getter >>>>>>> is >>>>>>> used again. Then, you probably still need a ClassValue to stash a >>>>>>> mapping between a Class and its Reference.OfStruct; but it seems >>>>>>> like >>>>>>> this could fit in more naturally? >>>>>>> >>>>>>> Maurizio >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jorn >>>>>>>> >>>>>>>> Jorn Vernee schreef op 2019-05-21 21:06: >>>>>>>>> Since we have the resolution context for NativeHeader, AFAIK >>>>>>>>> there is >>>>>>>>> no more difference between the resolution call done by >>>>>>>>> StructImpleGenerator, and the one done by >>>>>>>>> LayoutTypeImpl.ofStruct. So >>>>>>>>> I don't think there are any more cases where we would have >>>>>>>>> succeeded >>>>>>>>> to resolve the Struct layout be delaying spinning the impl. At >>>>>>>>> least >>>>>>>>> the tests haven't caught such a case. >>>>>>>>> >>>>>>>>> The other thing is that the partial layout for the getter is >>>>>>>>> caught in >>>>>>>>> StructImplGenerator, but for the setter it's caught when >>>>>>>>> calling >>>>>>>>> bitSize on Unresolved. Saying layouts should be able to be >>>>>>>>> resolved >>>>>>>>> when calling LayoutType.ofStruct means we can use >>>>>>>>> References.OfGrumpy, >>>>>>>>> which makes the two more uniform. >>>>>>>>> >>>>>>>>> I have some ideas for keeping the lazy init semantics, but it's >>>>>>>>> a bit >>>>>>>>> more complex (using a MutableCallSite to mimic indy), and I'm >>>>>>>>> not sure >>>>>>>>> it will work as well. 
>>>>>>>>> >>>>>>>>> And, well, there was some talk about eagerly spinning the >>>>>>>>> implementations any ways :) >>>>>>>>> >>>>>>>>> Jorn >>>>>>>>> >>>>>>>>> Maurizio Cimadamore schreef op 2019-05-21 20:09: >>>>>>>>>> Looks good, although I'm a bit worried about the change in >>>>>>>>>> semantics >>>>>>>>>> w.r.t. eager instantiation. The binder will create a lot of >>>>>>>>>> LayoutTypes when generating the implementation - I wonder >>>>>>>>>> there were >>>>>>>>>> cases before where we created a partial layout type, which >>>>>>>>>> then got >>>>>>>>>> resolved correctly by the time it was dereferenced (since we >>>>>>>>>> do >>>>>>>>>> another resolve lazily in StructImplGenerator [1]). >>>>>>>>>> >>>>>>>>>> [1] - >>>>>>>>>> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52 >>>>>>>>>> On 21/05/2019 14:41, Jorn Vernee wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> After the recent string of benchmarking [1], I've arrived at >>>>>>>>>>> 2 optimizations to improve the speed of the measured code >>>>>>>>>>> path. >>>>>>>>>>> >>>>>>>>>>> 1.) Specialization of Struct getter MethodHandles per struct >>>>>>>>>>> class. >>>>>>>>>>> 2.) Implementation of RuntimeSupport::casterImpl that does a >>>>>>>>>>> fused cast and offset operation, to avoid creating multiple >>>>>>>>>>> Pointer objects. >>>>>>>>>>> >>>>>>>>>>> The benchmark: >>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/ >>>>>>>>>>> The optimizations: >>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> I've split these into 2 so that it's easier to run the >>>>>>>>>>> benchmarks with and without the optimizations. (benchmark >>>>>>>>>>> uses the OpenJDK's builtin framework [2]). 
>>>>>>>>>>> >>>>>>>>>>> Since we're now more eagerly instantiating the struct impl >>>>>>>>>>> class I had to work around partial struct types, since >>>>>>>>>>> spinning the impl requires a non-partial type and now we're >>>>>>>>>>> spinning the impl when creating the LayouType for the struct, >>>>>>>>>>> as opposed to on the first dereference. To do this I'm >>>>>>>>>>> detecting whether the struct is partial in >>>>>>>>>>> LayoutType.ofStruct, and using a Reference.OfGrumpy in the >>>>>>>>>>> case where it can not be resolved. Tbh, I think this makes >>>>>>>>>>> things a little more clear as well as far as where/how the >>>>>>>>>>> exception for deref of a partial type is thrown. >>>>>>>>>>> >>>>>>>>>>> Results on my machine before the optimization are: >>>>>>>>>>> >>>>>>>>>>> Benchmark??????????????????????? Mode? Cnt Score Error Units >>>>>>>>>>> GetStruct.jni_baseline?????????? avgt?? 50 14.204 ? 0.566 >>>>>>>>>>> ns/op >>>>>>>>>>> GetStruct.panama_get_both??????? avgt?? 50 507.638 ? 19.462 >>>>>>>>>>> ns/op >>>>>>>>>>> GetStruct.panama_get_fieldonly?? avgt?? 50 90.236 ? 11.027 >>>>>>>>>>> ns/op >>>>>>>>>>> GetStruct.panama_get_structonly? avgt?? 50 370.783 ? 13.744 >>>>>>>>>>> ns/op >>>>>>>>>>> >>>>>>>>>>> And after: >>>>>>>>>>> >>>>>>>>>>> Benchmark??????????????????????? Mode? Cnt Score Error Units >>>>>>>>>>> GetStruct.jni_baseline?????????? avgt?? 50 13.941 ? 0.485 >>>>>>>>>>> ns/op >>>>>>>>>>> GetStruct.panama_get_both??????? avgt?? 50 41.199 ? 1.632 >>>>>>>>>>> ns/op >>>>>>>>>>> GetStruct.panama_get_fieldonly?? avgt?? 50 33.432 ? 1.889 >>>>>>>>>>> ns/op >>>>>>>>>>> GetStruct.panama_get_structonly? avgt?? 50 13.469 ? 0.781 >>>>>>>>>>> ns/op >>>>>>>>>>> >>>>>>>>>>> Where panama_get_structonly corresponds to 1., and >>>>>>>>>>> panama_get_fieldonly corresponds to 2. For a total of about >>>>>>>>>>> 12x speedup. 
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jorn
>>>>>>>>>>>
>>>>>>>>>>> [1] : https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
>>>>>>>>>>> [2] : https://openjdk.java.net/jeps/230

From maurizio.cimadamore at oracle.com Thu May 23 13:20:08 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 23 May 2019 14:20:08 +0100
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To: <199f576776a184b9b2f2b4862cd5a8c7@xs4all.nl>
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com> <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com> <199f576776a184b9b2f2b4862cd5a8c7@xs4all.nl>
Message-ID:

I like the new webrev. Only a minor quibble on naming: the 'getter'
field and the 'makeGetter' method would probably be better named
'specializedGetter' and 'specializeGetter'/'makeSpecializedGetter'
respectively, to carry more meaning.

No need for another review if you decide to go for the name change.

Cheers
Maurizio

On 23/05/2019 13:38, Jorn Vernee wrote:
> Response inline...
>
> Maurizio Cimadamore wrote on 2019-05-23 00:28:
>> I did some more analysis on the ClassValue issue and I'm now convinced
>> that what we are doing is _not_ problematic.
>>
>> What we really care about here is that, if we create a
>> Reference.OfStruct for class Foo, we don't want the ClassValue we're
>> using to cache that reference to prevent unloading of Foo. That is a
>> slightly different problem than the one described in [1]. There, the
>> issue is that the storage associated with ClassValue (which lives
>> inside Class objects) keeps growing indefinitely, in cases where the
>> computed values keep strong references to the ClassValue itself. This
>> is due to the way in which ClassValue behaves.
>> >> A ClassValue is not an ordinary map - rather, when you call 'get' on a >> ClassValue with a given class C, you really ask the Class object for C >> for its ClassValue storage (a so called ClassValueMap abstraction). >> This map is essentially a WeakHashMap, where Identity >> is a field of the ClassValue uniquely identifying it, whereas Entry >> contains the computed value associated with that class and ClassValue >> instance (the Entry class has a lot of extra complexity to deal with >> versioning, which is irrelevant here). >> >> So, if the ClassValue instance goes away, the fact that we're using a >> WeakHashMap here, allows the map to shrink in size. Of course, for >> this to happen, you don't want to have a strong reference from Entry >> (that is, from the computed value) back to the ClassValue instance - >> as in doing so you will prevent collection of the WeakHashMap entries. >> >> The bug in [1] shows that, when that happens, it is essentially >> possible to grow the WeakHashMap attached to a class object at will, >> until an OOME is produced. >> >> But in our case we're not concerned with the fact that we keep adding >> multiple ClassValue to the _same_ class object; it's actually the >> opposite: we have a single ClassValue (in References.OfStruct), and >> many classes. In such a case, when the class goes away (because its >> classloader goes), it will just go away; there will be nothing >> preventing the collection of that class. >> >> Attached is a test (with two files, Test.java and Dummy.java) - Test >> creates a new class loader, loads Dummy in it, and then stash a value >> for the Dummy class into a shared ClassValue. To make things as nasty >> as possible, the value we're storing has strong references to both the >> Dummy class and the ClassValue itself. But, as soon as the loader is >> closed, the finalizer is run as expected and memory usage remains >> under control. > > Thanks for the extensive research, and for explaining it! 
It's good to > hear that using ClassValue won't be an issue for us. > > I tried out the test, and I'm also seeing the finalizer being run. > >> So, popping back to our enhancements, I think what the patch does is >> legit. In terms of the code, I don't like how the code made OfStruct >> _not_ a Reference, and is instead using OfStruct as a holder for some >> helper functionalities, plus the cache, whereas the real reference is >> an anonymous class generated inside the computeValue() method. >> >> It seems to me that we could have Reference.OfStruct keep being a >> Reference, have a constructor that takes a Class object, and then have >> a static ClassValue field in References which, upon computeValue >> creates a new instance of Reference.OfStruct for that class. I think >> the implementation would be a lot more linear that way (unless I'm >> missing something). > > Yeah, I think doing that would make more sense. It would also help > show what fields a struct Reference actually has. > > I've also added @Stable to the MethodHandle field (as suggested in > your other email) and re-ran the benchmark, but did not see an obvious > performance increase. I looked at the profile for > `panama_get_structonly`, but nothing really stands out to me: > > ?30.37%?????? c2, level 4 > org.sample.generated.GetStruct_panama_get_structonly_jmhTest::panama_get_structonly_avgt_jmhStub, > version 746 > ?25.90%? Unknown, level 0 java.lang.invoke.MethodHandle::invokeBasic, > version 102 > ?16.95%?????? c2, level 4 > java.lang.invoke.LambdaForm$MH.0x0000000800c0a840::invoke, version 713 > ?13.50%?????? c2, level 4 org.sample.GetStruct::panama_get_structonly, > version 711 > > It looks like most time is spent on JMH overhead. 
> > Updated webrev with your suggestions: > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.04/ > > (Only changes to References) > > Jorn > >> Cheers >> Maurizio >> >> [1] - https://bugs.openjdk.java.net/browse/JDK-8136353 >> >> On 22/05/2019 16:30, Maurizio Cimadamore wrote: >>> Looks good - module pending questions on use of ClassValue. >>> >>> I think we should come up with some kind of test case that shows the >>> ClassValue issue and then test with different approaches >>> >>> Maurizio >>> >>> On 22/05/2019 16:09, Jorn Vernee wrote: >>>> Coming back to this once more, >>>> >>>> I finally got my profiler working (after setting up a separate >>>> project) and saw a lot of time spent getting the field offset: >>>> >>>> ?37.00%?????? c2, level 4 >>>> jdk.internal.foreign.LayoutPaths$$Lambda$66.0x0000000800c05040::getAsLong, >>>> version 691 >>>> ?30.19%?????? c2, level 4 >>>> jdk.internal.foreign.RuntimeSupport::casterImpl, version 724 >>>> ?22.12%?????? c2, level 4 >>>> org.sample.generated.GetStruct_panama_get_fieldonly_jmhTest::panama_get_fieldonly_avgt_jmhStub, >>>> version 746 >>>> ?... >>>> >>>> i.e. the call to LayoutPath.offset() in RuntimeSupport::casterImpl >>>> can not be inlined, and we're re-computing the field offset over >>>> and over again. >>>> >>>> The fix for this is pretty simple; instead of passing the >>>> LayoutPath to the caster, we pre-compute the offset and then pass >>>> that. (This should be constant, right?). >>>> >>>> This yields some more speedup: >>>> >>>> Benchmark??????????????????????? Mode? Cnt?? Score?? Error Units >>>> GetStruct.jni_baseline?????????? avgt?? 50? 13.337 ? 0.251 ns/op >>>> GetStruct.panama_get_both??????? avgt?? 50? 17.026 ? 0.458 ns/op >>>> GetStruct.panama_get_fieldonly?? avgt?? 50?? 7.796 ? 0.166 ns/op >>>> GetStruct.panama_get_structonly? avgt?? 50? 11.863 ? 0.358 ns/op >>>> >>>> Putting us pretty much even with jni_baseline. 
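The offset fix described above can be sketched roughly as follows:
resolve the offset once, then bind it into the method handle as a
constant, instead of handing the handle something that recomputes it on
every call. This is only an illustration under assumptions:
OffsetBindingSketch, addOffset and fieldAddresser are hypothetical
names, and the LongSupplier merely stands in for the LayoutPath offset
computation; the actual RuntimeSupport code may look quite different.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.LongSupplier;

// Hypothetical demo class; not the real RuntimeSupport implementation.
final class OffsetBindingSketch {
    // Stand-in for the accessor body: adds a field offset to a base address.
    static long addOffset(long offset, long base) {
        return base + offset;
    }

    // Evaluates the offset once, then bakes it into the handle as a bound constant.
    static MethodHandle fieldAddresser(LongSupplier offsetComputation)
            throws ReflectiveOperationException {
        MethodHandle raw = MethodHandles.lookup().findStatic(
                OffsetBindingSketch.class, "addOffset",
                MethodType.methodType(long.class, long.class, long.class));
        long offset = offsetComputation.getAsLong(); // computed exactly once, up front
        // The resulting handle has type (long)long and no longer calls the supplier.
        return MethodHandles.insertArguments(raw, 0, offset);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle addresser = fieldAddresser(() -> 8L);
        System.out.println((long) addresser.invokeExact(1000L)); // prints 1008
    }
}
```

Because the bound value is a plain constant argument of the handle, the
hot path no longer goes through a supplier call that the JIT might fail
to inline.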
>>>>
>>>> Updated Webrev:
>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.03/
>>>>
>>>> (Only changes are to RuntimeSupport)
>>>>
>>>> Cheers,
>>>> Jorn
>>>>
>>>> Jorn Vernee wrote on 2019-05-22 12:51:
>>>>> Ah, good point.
>>>>>
>>>>>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference ->
>>>>>> ClassValue
>>>>>
>>>>> I don't think that last link is quite right though. The LayoutType
>>>>> references the anonymous Reference class, not References.OfStruct
>>>>> (which contains the ClassValue).
>>>>>
>>>>> I think it would be:
>>>>>
>>>>> User Code -> LayoutType -> anonymous Reference -> getter MH ->
>>>>> StructImpl -> LayoutType
>>>>>
>>>>> There could still be a cycle there, but the whole cycle can be GC'd
>>>>> once the reference from user code goes away.
>>>>>
>>>>> Jorn
>>>>>
>>>>> Maurizio Cimadamore wrote on 2019-05-22 12:37:
>>>>>> Looks good - yesterday I was looking at this discussion:
>>>>>>
>>>>>> http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html
>>>>>> I hope we don't run into the condition described there - e.g. that
>>>>>> there's no strong reachability from the MH we're caching back to the
>>>>>> static ClassValue instance - because, if that would be the case I
>>>>>> think that would prevent class unloading.
>>>>>>
>>>>>> The problem is that the MethodHandle we cache refers to the struct
>>>>>> impl class, and I believe that class refers to some LayoutTypes on
>>>>>> its own, which have a Reference inside, so it would be:
>>>>>>
>>>>>> ClassValue -> MH -> StructImpl -> LayoutType -> Reference ->
>>>>>> ClassValue
>>>>>>
>>>>>> Sundar can you double check?
>>>>>>
>>>>>> Maurizio
>>>>>>
>>>>>> On 22/05/2019 10:56, Jorn Vernee wrote:
>>>>>>> Good suggestion! This solves the problem, is nice and simple,
>>>>>>> and keeps the same times in the benchmark.
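
A minimal sketch of why deferring the pointee lookup avoids the infinite recursion for a self-referential struct such as `struct foo { struct foo *next; }`. Everything here is hypothetical (LazyLayouts, StructType, DECLS and TYPES are invented for illustration, and struct declarations are modelled as plain maps); it only shows the general idea of lazy resolution, not the binder's implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class LazyLayouts {
    // Hypothetical model of a struct type: field name -> deferred pointee type.
    static final class StructType {
        final String name;
        final Map<String, Supplier<StructType>> pointees = new HashMap<>();

        StructType(String name) {
            this.name = name;
        }

        // The pointee type is resolved only when the getter is actually used.
        StructType pointee(String field) {
            return pointees.get(field).get();
        }
    }

    // Struct name -> (pointer field name -> name of the struct it points to).
    static final Map<String, Map<String, String>> DECLS = new HashMap<>();
    static final Map<String, StructType> TYPES = new HashMap<>();

    static {
        // struct foo { struct foo *next; }
        DECLS.put("foo", Collections.singletonMap("next", "foo"));
    }

    static StructType ofStruct(String name) {
        StructType type = TYPES.get(name);
        if (type == null) {
            type = new StructType(name);
            for (Map.Entry<String, String> field : DECLS.get(name).entrySet()) {
                String target = field.getValue();
                // Calling ofStruct(target) eagerly here would recurse forever for
                // 'foo', because the type is not registered until the loop is done.
                type.pointees.put(field.getKey(), () -> ofStruct(target));
            }
            TYPES.put(name, type);
        }
        return type;
    }

    public static void main(String[] args) {
        StructType foo = ofStruct("foo");
        System.out.println(foo.pointee("next") == foo);
    }
}
```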
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/
>>>>>>>
>>>>>>> (only changes to References.java)
>>>>>>>
>>>>>>> I've added a test for the failure. I think that can be included
>>>>>>> as well? I re-ran the samples I have as well, and this time it's
>>>>>>> all green.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jorn
>>>>>>>
>>>>>>> Maurizio Cimadamore wrote on 2019-05-22 01:15:
>>>>>>>> On 21/05/2019 20:16, Jorn Vernee wrote:
>>>>>>>>> Although, now that you bring it up, I tried re-running some of
>>>>>>>>> the samples (hadn't done that yet), and I'm seeing some
>>>>>>>>> infinite recursion. This is seemingly caused by a circular
>>>>>>>>> type reference (e.g. linked list). i.e. to spin the impl of an
>>>>>>>>> accessor we need the LayoutType of the struct itself, which
>>>>>>>>> then tries to spin the impl again, and so on. I guess this
>>>>>>>>> isn't a test case in our suite yet...
>>>>>>>>>
>>>>>>>>> I'll look into this.
>>>>>>>>
>>>>>>>> Good detective work! I guess it would make sense to try and
>>>>>>>> reduce it down to a simpler test, and push the test first.
>>>>>>>>
>>>>>>>> Where I was going with this is - your patch effectively made
>>>>>>>> the lazy resolution inside StructImplGenerator useless. If we
>>>>>>>> really want to explore that option, then we should, I think,
>>>>>>>> remove all lazy resolution sites and see what happens. It is
>>>>>>>> possible that we don't rely so much on laziness as we did in
>>>>>>>> the past (we did some fixes a few months ago which stabilized
>>>>>>>> resolution quite a bit) - in which case we can remove the
>>>>>>>> resolution requests, although - I have to admit - I'm a bit
>>>>>>>> skeptical. After all, all you need is something like this (as
>>>>>>>> you say):
>>>>>>>>
>>>>>>>> struct foo {
>>>>>>>>     struct foo *next;
>>>>>>>> }
>>>>>>>>
>>>>>>>> Which is kind of the killer app for unresolved layouts in the
>>>>>>>> first place.
>>>>>>>>
>>>>>>>> This is translated into a struct interface which has a getter of
>>>>>>>> Pointer. To generate the getter you need to compute its
>>>>>>>> LayoutType which is a pointer LayoutType, so you have to
>>>>>>>> compute the pointee LayoutType which brings you back where you
>>>>>>>> started (the whole 'foo' LayoutType). In other words, since now
>>>>>>>> the creation of LayoutType requires the generation of the
>>>>>>>> struct impl for 'foo' and since that depends (indirectly,
>>>>>>>> through the pointer getter) on being able to produce a
>>>>>>>> LayoutType, you get a circularity.
>>>>>>>>
>>>>>>>> One thing we could try is - instead of eagerly creating the
>>>>>>>> struct impl, why don't we let the Reference.OfStruct have some
>>>>>>>> mutable state in it? That is, we could start off with a
>>>>>>>> Reference getter which does the expensive reflective lookup -
>>>>>>>> but then, once it has discovered the constructor MH, it can
>>>>>>>> stash it in some field (which is private to that reference
>>>>>>>> object) and use it later if the getter is used again. Then, you
>>>>>>>> probably still need a ClassValue to stash a mapping between a
>>>>>>>> Class and its Reference.OfStruct; but it seems like this could
>>>>>>>> fit in more naturally?
>>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jorn
>>>>>>>>>
>>>>>>>>> Jorn Vernee wrote on 2019-05-21 21:06:
>>>>>>>>>> Since we have the resolution context for NativeHeader, AFAIK
>>>>>>>>>> there is no more difference between the resolution call done
>>>>>>>>>> by StructImplGenerator, and the one done by
>>>>>>>>>> LayoutTypeImpl.ofStruct. So I don't think there are any more
>>>>>>>>>> cases where we would have succeeded in resolving the Struct
>>>>>>>>>> layout by delaying spinning the impl. At least the tests
>>>>>>>>>> haven't caught such a case.
>>>>>>>>>>
>>>>>>>>>> The other thing is that the partial layout for the getter is
>>>>>>>>>> caught in StructImplGenerator, but for the setter it's caught
>>>>>>>>>> when calling bitSize on Unresolved. Saying layouts should be
>>>>>>>>>> able to be resolved when calling LayoutType.ofStruct means we
>>>>>>>>>> can use References.OfGrumpy, which makes the two more uniform.
>>>>>>>>>>
>>>>>>>>>> I have some ideas for keeping the lazy init semantics, but
>>>>>>>>>> it's a bit more complex (using a MutableCallSite to mimic
>>>>>>>>>> indy), and I'm not sure it will work as well.
>>>>>>>>>>
>>>>>>>>>> And, well, there was some talk about eagerly spinning the
>>>>>>>>>> implementations any ways :)
>>>>>>>>>>
>>>>>>>>>> Jorn
>>>>>>>>>>
>>>>>>>>>> Maurizio Cimadamore wrote on 2019-05-21 20:09:
>>>>>>>>>>> Looks good, although I'm a bit worried about the change in
>>>>>>>>>>> semantics w.r.t. eager instantiation. The binder will create
>>>>>>>>>>> a lot of LayoutTypes when generating the implementation - I
>>>>>>>>>>> wonder whether there were cases before where we created a
>>>>>>>>>>> partial layout type, which then got resolved correctly by
>>>>>>>>>>> the time it was dereferenced (since we do another resolve
>>>>>>>>>>> lazily in StructImplGenerator [1]).
>>>>>>>>>>>
>>>>>>>>>>> [1] -
>>>>>>>>>>> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52
>>>>>>>>>>> On 21/05/2019 14:41, Jorn Vernee wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> After the recent string of benchmarking [1], I've arrived
>>>>>>>>>>>> at 2 optimizations to improve the speed of the measured
>>>>>>>>>>>> code path.
>>>>>>>>>>>>
>>>>>>>>>>>> 1.) Specialization of Struct getter MethodHandles per
>>>>>>>>>>>> struct class.
>>>>>>>>>>>> 2.) Implementation of RuntimeSupport::casterImpl that does
>>>>>>>>>>>> a fused cast and offset operation, to avoid creating
>>>>>>>>>>>> multiple Pointer objects.
>>>>>>>>>>>>
>>>>>>>>>>>> The benchmark:
>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
>>>>>>>>>>>> The optimizations:
>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
>>>>>>>>>>>>
>>>>>>>>>>>> I've split these into 2 so that it's easier to run the
>>>>>>>>>>>> benchmarks with and without the optimizations. (benchmark
>>>>>>>>>>>> uses the OpenJDK's builtin framework [2]).
>>>>>>>>>>>>
>>>>>>>>>>>> Since we're now more eagerly instantiating the struct impl
>>>>>>>>>>>> class I had to work around partial struct types, since
>>>>>>>>>>>> spinning the impl requires a non-partial type and now we're
>>>>>>>>>>>> spinning the impl when creating the LayoutType for the
>>>>>>>>>>>> struct, as opposed to on the first dereference. To do this
>>>>>>>>>>>> I'm detecting whether the struct is partial in
>>>>>>>>>>>> LayoutType.ofStruct, and using a Reference.OfGrumpy in the
>>>>>>>>>>>> case where it can not be resolved. Tbh, I think this makes
>>>>>>>>>>>> things a little more clear as well as far as where/how the
>>>>>>>>>>>> exception for deref of a partial type is thrown.
>>>>>>>>>>>>
>>>>>>>>>>>> Results on my machine before the optimization are:
>>>>>>>>>>>>
>>>>>>>>>>>> Benchmark                        Mode  Cnt    Score    Error  Units
>>>>>>>>>>>> GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
>>>>>>>>>>>> GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
>>>>>>>>>>>> GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
>>>>>>>>>>>> GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op
>>>>>>>>>>>>
>>>>>>>>>>>> And after:
>>>>>>>>>>>>
>>>>>>>>>>>> Benchmark                        Mode  Cnt    Score    Error  Units
>>>>>>>>>>>> GetStruct.jni_baseline           avgt   50   13.941 ±  0.485  ns/op
>>>>>>>>>>>> GetStruct.panama_get_both        avgt   50   41.199 ±  1.632  ns/op
>>>>>>>>>>>> GetStruct.panama_get_fieldonly   avgt   50   33.432 ±  1.889  ns/op
>>>>>>>>>>>> GetStruct.panama_get_structonly  avgt   50   13.469 ±  0.781  ns/op
>>>>>>>>>>>>
>>>>>>>>>>>> Where panama_get_structonly corresponds to 1., and
>>>>>>>>>>>> panama_get_fieldonly corresponds to 2. For a total of about
>>>>>>>>>>>> 12x speedup.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jorn
>>>>>>>>>>>>
>>>>>>>>>>>> [1] :
>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
>>>>>>>>>>>> [2] : https://openjdk.java.net/jeps/230

From jbvernee at xs4all.nl Thu May 23 14:12:40 2019
From: jbvernee at xs4all.nl (jbvernee at xs4all.nl)
Date: Thu, 23 May 2019 14:12:40 +0000
Subject: hg: panama/dev: 8224481: Optimize struct getter and field getter paths.
Message-ID: <201905231412.x4NECf1E007229@aojmv0008.oracle.com>

Changeset: c1b415e10db0
Author: jvernee
Date: 2019-05-23 16:11 +0200
URL: http://hg.openjdk.java.net/panama/dev/rev/c1b415e10db0

8224481: Optimize struct getter and field getter paths.
Reviewed-by: mcimadamore

! src/java.base/share/classes/java/foreign/memory/LayoutType.java
! src/java.base/share/classes/jdk/internal/foreign/RuntimeSupport.java
! src/java.base/share/classes/jdk/internal/foreign/memory/BoundedPointer.java
! src/java.base/share/classes/jdk/internal/foreign/memory/LayoutTypeImpl.java
! src/java.base/share/classes/jdk/internal/foreign/memory/References.java
!
test/jdk/java/foreign/types/partial/PartialStructsTest.java

From jbvernee at xs4all.nl Thu May 23 14:13:23 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Thu, 23 May 2019 16:13:23 +0200
Subject: [foreign] RFR 8224481: Optimize struct getter and field getter paths.
In-Reply-To:
References: <0f80ae0efb1ae056af287cb831b92e51@xs4all.nl> <7778e3c9-0696-5f6a-c40e-4c5458c1af6c@oracle.com> <436608f145333799ab4dc3c435425e6a@xs4all.nl> <639ba9f3-a5e2-c59c-d6de-2bb7a16b924d@oracle.com> <6e830486-7fc0-ce1e-0e15-d863ed13bc99@oracle.com> <823c3d09-c616-e3e4-47c2-e7ac51041baa@oracle.com> <199f576776a184b9b2f2b4862cd5a8c7@xs4all.nl>
Message-ID:

Thanks for the reviews!

I went with 'specializedGetter' and 'makeSpecializedGetter' (the latter
to be more distinct from the former), and pushed.

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-23 15:20:
> I like the new webrev - only minor quibble on naming - that is, the
> 'getter' field and the 'makeGetter' method would probably be better
> named as 'specializedGetter' and
> 'specializeGetter'/'makeSpecializedGetter' respectively, to carry more
> meaning.
>
> No need for another review if you decide to go for the name change.
>
> Cheers
> Maurizio
>
> On 23/05/2019 13:38, Jorn Vernee wrote:
>> Response inline....
>>
>> Maurizio Cimadamore wrote on 2019-05-23 00:28:
>>> I did some more analysis on the ClassValue issue and I'm now
>>> convinced that what we are doing is _not_ problematic.
>>>
>>> What we really care about here is that, if we create a
>>> Reference.OfStruct for class Foo, we don't want the ClassValue we're
>>> using to cache that reference to prevent unloading of Foo. That is a
>>> slightly different problem than the one described in [1]. There, the
>>> issue is that the storage associated with ClassValue (which lives
>>> inside Class objects) keeps growing indefinitely, in cases where the
>>> computed values keep strong references to the ClassValue itself. This
>>> is due to the way in which ClassValue behaves.
>>>
>>> A ClassValue is not an ordinary map - rather, when you call 'get' on
>>> a ClassValue with a given class C, you really ask the Class object
>>> for C for its ClassValue storage (a so-called ClassValueMap
>>> abstraction). This map is essentially a WeakHashMap<Identity, Entry>,
>>> where Identity is a field of the ClassValue uniquely identifying it,
>>> whereas Entry contains the computed value associated with that class
>>> and ClassValue instance (the Entry class has a lot of extra
>>> complexity to deal with versioning, which is irrelevant here).
>>>
>>> So, if the ClassValue instance goes away, the fact that we're using a
>>> WeakHashMap here allows the map to shrink in size. Of course, for
>>> this to happen, you don't want to have a strong reference from Entry
>>> (that is, from the computed value) back to the ClassValue instance -
>>> as in doing so you will prevent collection of the WeakHashMap
>>> entries.
>>>
>>> The bug in [1] shows that, when that happens, it is essentially
>>> possible to grow the WeakHashMap attached to a class object at will,
>>> until an OOME is produced.
>>>
>>> But in our case we're not concerned with the fact that we keep adding
>>> multiple ClassValue to the _same_ class object; it's actually the
>>> opposite: we have a single ClassValue (in References.OfStruct), and
>>> many classes. In such a case, when the class goes away (because its
>>> classloader goes), it will just go away; there will be nothing
>>> preventing the collection of that class.
>>>
>>> Attached is a test (with two files, Test.java and Dummy.java) - Test
>>> creates a new class loader, loads Dummy in it, and then stashes a
>>> value for the Dummy class into a shared ClassValue. To make things as
>>> nasty as possible, the value we're storing has strong references to
>>> both the Dummy class and the ClassValue itself. But, as soon as the
>>> loader is closed, the finalizer is run as expected and memory usage
>>> remains under control.
>>
>> Thanks for the extensive research, and for explaining it! It's good to
>> hear that using ClassValue won't be an issue for us.
>>
>> I tried out the test, and I'm also seeing the finalizer being run.
>>
>>> So, popping back to our enhancements, I think what the patch does is
>>> legit. In terms of the code, I don't like how the code made OfStruct
>>> _not_ a Reference, and is instead using OfStruct as a holder for some
>>> helper functionalities, plus the cache, whereas the real reference is
>>> an anonymous class generated inside the computeValue() method.
>>>
>>> It seems to me that we could have Reference.OfStruct keep being a
>>> Reference, have a constructor that takes a Class object, and then
>>> have a static ClassValue field in References which, upon computeValue
>>> creates a new instance of Reference.OfStruct for that class. I think
>>> the implementation would be a lot more linear that way (unless I'm
>>> missing something).
>>
>> Yeah, I think doing that would make more sense. It would also help
>> show what fields a struct Reference actually has.
>>
>> I've also added @Stable to the MethodHandle field (as suggested in
>> your other email) and re-ran the benchmark, but did not see an obvious
>> performance increase. I looked at the profile for
>> `panama_get_structonly`, but nothing really stands out to me:
>>
>>  30.37%        c2, level 4  org.sample.generated.GetStruct_panama_get_structonly_jmhTest::panama_get_structonly_avgt_jmhStub, version 746
>>  25.90%   Unknown, level 0  java.lang.invoke.MethodHandle::invokeBasic, version 102
>>  16.95%        c2, level 4  java.lang.invoke.LambdaForm$MH.0x0000000800c0a840::invoke, version 713
>>  13.50%        c2, level 4  org.sample.GetStruct::panama_get_structonly, version 711
>>
>> It looks like most time is spent on JMH overhead.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] :
>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
>>>>>>>>>>>>> [2] : https://openjdk.java.net/jeps/230

From maurizio.cimadamore at oracle.com Thu May 23 14:14:33 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Thu, 23 May 2019 14:14:33 +0000
Subject: hg: panama/dev: Automatic merge with foreign
Message-ID: <201905231414.x4NEEXTJ008834@aojmv0008.oracle.com>

Changeset: 13c8fbd3b3ad
Author: mcimadamore
Date: 2019-05-23 16:14 +0200
URL: http://hg.openjdk.java.net/panama/dev/rev/13c8fbd3b3ad

Automatic merge with foreign

From jbvernee at xs4all.nl Thu May 23 14:42:18 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Thu, 23 May 2019 16:42:18 +0200
Subject: [foreign-memaccess] RFR 8224614: Cleanup MemoryScope and its implementation
In-Reply-To:
References:
Message-ID:

Looks good!

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-22 19:40:
> Hi,
> this patch implements the approach described in [1].
>
> I've refactored MemoryScopeImpl into an abstract class
> (AbstractMemoryScopeImpl) and two concrete subclasses:
>
> * GlobalMemoryScopeImpl - for global scopes (roots of scope
>   hierarchies)
> * ConfinedMemoryScopeImpl - for 'local' scopes (created using fork)
>
> The former is shared across multiple threads, but there's no need for
> synchronization as there's no mutable state (as global scopes can't be
> closed).
>
> The latter is thread-confined - which means only the thread owner
> (which is established at scope-creation time) can do scope operations
> such as fork/allocate/merge/close.
>
> I've also cleaned up the various characteristics flags in MemoryScope;
> some of those no longer made sense, since we decided against having
> VarHandle for reading/writing addresses directly. I kept the
> following:
>
> * PINNED - used to mark scopes that cannot be closed; it's a property
>   of global scopes and cannot be set w/o super-user powers
>
> * IMMUTABLE - means that the underlying memory cannot be written to
>
> * UNALIGNED - means that we allow memory writes when addresses do not
>   conform to the alignment requirements of the layout from which the
>   VarHandle was created
>
> * CONFINED - means that memory associated with the scope can be
>   accessed only within the owning thread
>
> All characteristics are disabled by default - and it's up to the
> client to set them. There's no 'inheritance' of characteristics either
> from parents to children.
>
> Webrev:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8224614/
>
> Cheers
> Maurizio
>
> [1] -
> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005494.html

From maurizio.cimadamore at oracle.com Thu May 23 14:45:09 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Thu, 23 May 2019 14:45:09 +0000
Subject: hg: panama/dev: 8224614: Cleanup MemoryScope and its implementation
Message-ID: <201905231445.x4NEj9Uv000435@aojmv0008.oracle.com>

Changeset: 369f97d880ab
Author: mcimadamore
Date: 2019-05-22 17:29 +0100
URL: http://hg.openjdk.java.net/panama/dev/rev/369f97d880ab

8224614: Cleanup MemoryScope and its implementation

! src/java.base/share/classes/java/foreign/MemoryScope.java
! src/java.base/share/classes/java/foreign/MemorySegment.java
! src/java.base/share/classes/java/lang/invoke/X-VarHandleMemoryAddressView.java.template
+ src/java.base/share/classes/jdk/internal/foreign/AbstractMemoryScopeImpl.java
+ src/java.base/share/classes/jdk/internal/foreign/ConfinedMemoryScopeImpl.java
+ src/java.base/share/classes/jdk/internal/foreign/GlobalMemoryScopeImpl.java
! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java
- src/java.base/share/classes/jdk/internal/foreign/MemoryScopeImpl.java
! src/java.base/share/classes/jdk/internal/foreign/MemorySegmentImpl.java
! test/jdk/java/foreign/TestMemoryAccess.java
! test/jdk/java/foreign/TestScopes.java

From sundararajan.athijegannathan at oracle.com Fri May 24 05:35:47 2019
From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan)
Date: Fri, 24 May 2019 11:05:47 +0530
Subject: [foreign] test failure with tip
Message-ID: <5CE782B3.7080404@oracle.com>

I got a test failure with tip [foreign branch].

$panama-dev/test/jdk/java/foreign/types/partial/PartialStructsTest.java:89: error: cannot find symbol
    Recursive r = sc.allocateStruct(Recursive.class);
    ^
  symbol: class Recursive
  location: class PartialStructsTest
$panama-dev/test/jdk/java/foreign/types/partial/PartialStructsTest.java:89: error: cannot find symbol
    Recursive r = sc.allocateStruct(Recursive.class);
    ^
  symbol: class Recursive
  location: class PartialStructsTest
2 errors
result: Failed. Compilation failed: Compilation failed

From https://hg.openjdk.java.net/panama/dev/rev/c1b415e10db0, it seems
like Recursive.java was not "hg add"-ed?

-Sundar

From jbvernee at xs4all.nl Fri May 24 08:54:37 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 24 May 2019 10:54:37 +0200
Subject: [foreign] test failure with tip
In-Reply-To: <5CE782B3.7080404@oracle.com>
References: <5CE782B3.7080404@oracle.com>
Message-ID: <2040d63dc9c9e3a23ff44663286247b2@xs4all.nl>

Sorry about this, I accidentally committed without doing a pull first
yesterday, and when I undid the commit and re-made the patch I didn't
notice that it didn't include the new files that I had added.

Will rectify shortly.

Jorn

Sundararajan Athijegannathan wrote on 2019-05-24 07:35:
> I got a test failure with tip [foreign branch].
> > $panama-dev/test/jdk/java/foreign/types/partial/PartialStructsTest.java:89: > error: cannot find symbol > Recursive r = sc.allocateStruct(Recursive.class); > ^ > symbol: class Recursive > location: class PartialStructsTest > $panama-dev/test/jdk/java/foreign/types/partial/PartialStructsTest.java:89: > error: cannot find symbol > Recursive r = sc.allocateStruct(Recursive.class); > ^ > symbol: class Recursive > location: class PartialStructsTest > 2 errors > result: Failed. Compilation failed: Compilation failed > > From https://hg.openjdk.java.net/panama/dev/rev/c1b415e10db0, it > seems like Recursive.java was not "hg add"-ed? > > -Sundar From jbvernee at xs4all.nl Fri May 24 08:56:14 2019 From: jbvernee at xs4all.nl (jbvernee at xs4all.nl) Date: Fri, 24 May 2019 08:56:14 +0000 Subject: hg: panama/dev: Summry: Add missing files from last commit Message-ID: <201905240856.x4O8uEXY010732@aojmv0008.oracle.com> Changeset: ff52a792cdf7 Author: jvernee Date: 2019-05-24 10:55 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/ff52a792cdf7 Summry: Add missing files from last commit + test/jdk/java/foreign/types/partial/Recursive.java + test/micro/org/openjdk/bench/java/foreign/GetStruct.java + test/micro/org/openjdk/bench/java/foreign/libGetStruct.c From maurizio.cimadamore at oracle.com Fri May 24 09:02:03 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Fri, 24 May 2019 09:02:03 +0000 Subject: hg: panama/dev: Automatic merge with foreign Message-ID: <201905240902.x4O9234E012689@aojmv0008.oracle.com> Changeset: 1a1b8038185b Author: mcimadamore Date: 2019-05-24 11:01 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/1a1b8038185b Automatic merge with foreign From sundararajan.athijegannathan at oracle.com Fri May 24 11:27:54 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Fri, 24 May 2019 16:57:54 +0530 Subject: [foreign] test failure with tip In-Reply-To: 
<2040d63dc9c9e3a23ff44663286247b2@xs4all.nl> References: <5CE782B3.7080404@oracle.com> <2040d63dc9c9e3a23ff44663286247b2@xs4all.nl> Message-ID: <5CE7D53A.4030705@oracle.com> Verified that all tests are fine (on Mac) now. Thanks. -Sundar On 24/05/19, 2:24 PM, Jorn Vernee wrote: > Sorry about this, > > I accidentally committed without doing a pull first yesterday, and > when I undid the commit and re-made the patch I didn't notice that it > didn't include the new files that I had added. > > Will rectify shortly. > > Jorn > > Sundararajan Athijegannathan schreef op 2019-05-24 07:35: >> I got a test failure with tip [foreign branch]. >> >> $panama-dev/test/jdk/java/foreign/types/partial/PartialStructsTest.java:89: >> >> error: cannot find symbol >> Recursive r = sc.allocateStruct(Recursive.class); >> ^ >> symbol: class Recursive >> location: class PartialStructsTest >> $panama-dev/test/jdk/java/foreign/types/partial/PartialStructsTest.java:89: >> >> error: cannot find symbol >> Recursive r = sc.allocateStruct(Recursive.class); >> ^ >> symbol: class Recursive >> location: class PartialStructsTest >> 2 errors >> result: Failed. Compilation failed: Compilation failed >> >> From https://hg.openjdk.java.net/panama/dev/rev/c1b415e10db0, it >> seems like Recursive.java was not "hg add"-ed? >> >> -Sundar From maurizio.cimadamore at oracle.com Mon May 27 17:45:21 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 27 May 2019 18:45:21 +0100 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support Message-ID: Hi, as you know, the foreign memory access API supports bidirectional interop between MemoryAddress and ByteBuffer: 1) MemorySegment::ofByteBuffer(ByteBuffer) 2) MemoryAddress::asDirectByteBuffer(int bytes) That is, (1) can be used to create a memory segment out of an existing byte buffer, whereas (2) can be used to do the opposite, that is, to convert a memory address into a byte buffer.
While (1) works pretty reliably (but I've added some tests for it), the implementation for (2) leaves something to be desired: * The resulting byte buffer is unaware of the fact that the backing memory is attached to a scope that can be closed * There's no way to create a buffer if the address encapsulates some heap-based memory address This patch solves both issues - and also adds a supported way for creating a memory segment out of a heap-allocated byte array (MemorySegment.ofArray(byte[])) which I think is useful. To solve the scope awareness issue I put together (after discussing extensively with Alan, CC'ed) a delegating wrapper of ByteBuffer, which delegates to the underlying buffer implementation after doing a liveness check on the owning scope. That means that operations such as ByteBuffer::getInt will only succeed if the scope the view is associated with is still alive. Note that we need the wrapping only if the associated scope is _not_ pinned - in fact, if the scope is pinned, then it can't be closed, so there's no need to check for liveness. To do this I had to remove the 'final' modifier from some instance methods in ByteBuffer and Buffer - the theory here is that these 'final' modifiers were added in the early days to help the VM out, but are not necessary now (also note that it is not possible for a client to create a custom byte buffer implementation, as all constructors are package-private). Finally, the wrapped byte buffer implementation does not support typeful views such as 'asCharBuffer' - this is tricky to support as CharBuffer is not a subtype of ByteBuffer so, to go down that path we'd need to wrap also CharBuffer and friends, which is doable, but we're leaning towards YAGNI for now. Finally I've added a test which writes into memory (using var handles) then reads it back using a segment-backed byte buffer (there are tests for both heap and off-heap variants of the buffer).
There's also a test which checks interop between MappedByteBuffer and the foreign API (which will likely be relevant in the context of JEP 352 [1]), and, finally, a test which makes sure that all instance methods in the scope-wrapped buffer throw ISE after the scope has been closed. Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224843/ Comments welcome! Cheers Maurizio [1] - https://openjdk.java.net/jeps/352 From john.r.rose at oracle.com Mon May 27 19:12:01 2019 From: john.r.rose at oracle.com (John Rose) Date: Mon, 27 May 2019 12:12:01 -0700 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support In-Reply-To: References: Message-ID: I'm glad to see this. The code looks good. I've been hoping we'd see this day! I agree with your choices about ofArray(byte[]), "final", and CharBuffer. The new scope-aware type of BB will, I hope, be useful for people who want the capabilities of Panama, including better pointer safety, but prefer to work with BB APIs for slinging the bits. FTR, here are a couple of future moves that this one unlocks: * Timely deallocation. If the _buffer field of the new BB class is a DBB then a deallocate operation can be defined which releases the underlying storage immediately, instead of via a Cleaner. This is safe for the same reason scope deallocation is safe. There's a catch: The _buffer needs to be unaliased to any external "wild" BB or separately managed Panama memory block. This typically means (a) the new BB constructor allocates foreign storage directly rather than having it handed to the constructor as an argument and (b) the new BB class has a boolean which encodes the unaliased state. In effect, this means that the underlying block is handled under a discipline of "linear ownership", like a linear type in some formalisms. Which leads to the next point: * Handoff operations on BBs. If/when we do protocols for pushing linear ownership of scopes across threads, the BBs come along for the ride.
This will widen the circumstances where scoped BBs will be useful, by removing the constraint that all work happens in one thread. More FTR: The elements of cross-thread handoff are simple, though tricky in implementation details. A memory block B can (in principle) be re-scoped from an owning T1 to a new owner T2 if T1 synchronously finishes all operations on B (release fence) and then locks itself out from B. The transfer to T2 is finished by reversing this process (acquire fence). The midpoint of the process is, all by itself, a useful state, where B is locked out of both T1 and T2 (and all other threads); this is a neutral state from which any thread T2 can pick it up. This neutral state is neither readable nor writable, but is perfect for placing B on a queue, to be picked up by a future thread whose identity is not yet known. So hand-off factors into T1 putting B down, into neutral state, and T2 picking it up some time later. A mutex of some sort is needed to prevent T2 and T3 both picking up a neutral B due to races. This is future stuff which I think will eventually happen but TBH might not. The present integration between BB's and Panama is really neat and doesn't need the extra future stuff, at present. Thanks Maurizio! This breathes new life into BBs, IMO. -- John > On May 27, 2019, at 10:45 AM, Maurizio Cimadamore wrote: > > Hi, > as you know, the foreign memory access API supports bidirectional interop between MemoryAddress and ByteBuffer: > > 1) MemorySegment::ofByteBuffer(ByteBuffer) > 2) MemoryAddress::asDirectByteBuffer(int bytes) > > That is, (1) can be used to create a memory segment out of an existing byte buffer, whereas (2) can be used to do the opposite, that is, to convert a memory address into a byte buffer.
> > While (1) works pretty reliably (but I've added some tests for it), the implementation for (2) leaves to be desired: > > * The resulting byte buffer is unaware of the fact that the backing memory is attached to a scope that can be closed > * There's no way to create a buffer if the address encapsulates some heap-based memory address > > This patch solves both issues - and also adds a supported way for creating a memory segment out of a heap-allocated byte array (MemorySegment.ofArray(byte[])) which I think is useful. From maurizio.cimadamore at oracle.com Mon May 27 20:18:56 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 27 May 2019 21:18:56 +0100 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support In-Reply-To: References: Message-ID: Thanks for the comments. Btw, I realized that my patch doesn't handle very well the creation of buffers from on-heap addresses. That's because the 'offset' of the ByteBuffer::wrap factory is really an array offset, not an address offset - so some conversion is needed there, e.g. (address - BASE) / SCALE. Also I need to double-check on what happens when you resize regions that come from arrays or buffers - some of these operation throw, but I think we should be able to support a MemorySegment::resize in all cases. And, I also need to check what happens when we do a round trip: byte-buffer -> segment -> base address -> byte-buffer. (maybe with some MemoryRegion::resize thrown in there). As for MemorySegment::ofArray I'm thinking of relaxing it to accept all primitive arrays - since that could be useful if a user needs to do a bulk copy from Java heap to an off heap array using MemoryAddress::copy. Some comments inline: On 27/05/2019 20:12, John Rose wrote: > I'm glad to see this. The code looks good. I've been > hoping we'd see this day! I agree with your choices > about ofArray(byte[]), "final", and CharBuffer. 
> > The new scope-aware type of BB will, I hope, be useful > for people who want the capabilities of Panama, including > better pointer safety, but prefer to work with BB APIs > for slinging the bits. > > FTR, here are a couple of future moves that this one > unlocks: > > * Timely deallocation. If the _buffer field of the new BB > class is a DBB then a deallocate operation can be defined > which releases the underlying storage immediately, instead > of via a Cleaner. This is safe for the same reason scope > deallocation is safe. There's a catch: The _buffer needs > to be unaliased to any external "wild" BB or separately > managed Panama memory block. This typically means > (a) the new BB constructor allocates foreign storage directly > rather than having it handed to the constructor as an > argument and (b) the new BB class has a boolean which > encodes the unaliased state. In effect, this means that > the underlying block is handled under a discipline of > "linear ownership", like a linear type in some formalisms. > Which leads to the next point: Yes, we could provide an AutoCloseable-like variant of a ByteBuffer which internally used a Scope to allocate memory. Then yes, you need to keep track of aliasing to make sure it's safe to clean the scope. > > * Handoff operations on BBs. If/when we do protocols > for pushing linear ownership of scopes across threads, > the BBs come along for the ride. This will widen the > circumstances where scoped BBs will be useful, by > removing the constraint the all work happens in one > thread. Actually, we no longer have the constraint. With the new confinement model I've pushed last week (after discussion - see [1]), only scope operation such as fork/allocate/close/merge are confined to the owner strand. But _all_ threads are free to read/write memory. The assumption is that the VH is rich enough to allow for all sort of synchronization flavor to be built on top. 
If you truly want a scope which allocates memory that can be read/written by a single strand, you can do so via the CONFINED scope characteristic - e.g. Scope.globalScope().fork(CONFINED). I think this will remove a lot of pressure for some of the things you describe. Yes, in principle there could still be a need of changing ownership for doing scope operations such as allocate - but I'd like first to validate how common a use case that would be. Maurizio > > More FTR: > > The elements of cross-thread handoff are simple, > though tricky in implementation details. A memory > block B can (in principle) be re-scoped from an owning > T1 to a new owner T2 if T1 synchronously finishes all > operations on B (release fence) and then locks itself > out from B. The transfer to T2 is finished by reversing > this process (acquire fence). The midpoint of the process > is, all by itself, a useful state, where B is locked out of > both T1 and T2 (and all other threads); this is a neutral > state from which any thread T2 can pick it up. This > neutral state is neither readable nor writable, but is > perfect for placing B on a queue, to be picked up by > a future thread whose identity is not yet known. So > hand-off factors into T1 putting B down, into neutral > state, and T2 picking it up some time later. A mutex > of some sort is needed to prevent T2 and T3 both > picking up a neutral B due to races. > > This is future stuff which I think will eventually happen > but TBH might not. The present integration between > BB's and Panama is really neat and doesn't need the > extra future stuff, at present. > > Thanks Maurizio! This breathes new life into BBs, IMO. > > --
John > >> On May 27, 2019, at 10:45 AM, Maurizio Cimadamore wrote: >> >> Hi, >> as you know, the foreign memory access API supports bidirectional interop between MemoryAddress and ByteBuffer: >> >> 1) MemorySegment::ofByteBuffer(ByteBuffer) >> 2) MemoryAddress::asDirectByteBuffer(int bytes) >> >> That is, (1) can be used to create a memory segment out of an existing byte buffer, whereas (2) can be used to do the opposite, that is, to convert a memory address into a byte buffer. >> >> While (1) works pretty reliably (but I've added some tests for it), the implementation for (2) leaves something to be desired: >> >> * The resulting byte buffer is unaware of the fact that the backing memory is attached to a scope that can be closed >> * There's no way to create a buffer if the address encapsulates some heap-based memory address >> >> This patch solves both issues - and also adds a supported way for creating a memory segment out of a heap-allocated byte array (MemorySegment.ofArray(byte[])) which I think is useful. From maurizio.cimadamore at oracle.com Mon May 27 20:19:59 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 27 May 2019 21:19:59 +0100 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support In-Reply-To: References: Message-ID: On 27/05/2019 21:18, Maurizio Cimadamore wrote: > Actually, we no longer have the constraint. With the new confinement > model I've pushed last week (after discussion - see [1]), only scope > operation such as fork/allocate/close/merge are confined to the owner > strand. [1]
- https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005494.html From nick.gasson at arm.com Tue May 28 07:34:01 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 28 May 2019 15:34:01 +0800 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> Message-ID: <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> Hi all, I've put an updated webrev for the AArch64 port here: http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.2/ This contains all the previous changes plus a few fixes to handle things like "heterogeneous float aggregates" (structs/arrays where all the members are float/double), which are handled specially in the ABI. With this patch all the tests in the jdk_foreign group pass on AArch64, except the two I disabled pending adding long double support. Please let me know if you have any feedback! Thanks, Nick From maurizio.cimadamore at oracle.com Tue May 28 08:48:01 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 28 May 2019 09:48:01 +0100 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> Message-ID: <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> Looks very good. Some questions/comments: * don't you need some extra set of constants under NativeTypes (for aarch platform?) * UnalignedStructTest - why doesn't it work? 
There's an f128 there but we're not actually reading/setting it - it's mostly there to force alignment a certain way * the change to DirectSignatureShuffler is subtle, but I think it's ok and shouldn't impact SysV negatively The rest looks good Maurizio On 28/05/2019 08:34, Nick Gasson wrote: > Hi all, > > I've put an updated webrev for the AArch64 port here: > > http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.2/ > > This contains all the previous changes plus a few fixes to handle things > like "heterogeneous float aggregates" (structs/arrays where all the > members are float/double), which are handled specially in the ABI. > > With this patch all the tests in the jdk_foreign group pass on AArch64, > except the two I disabled pending adding long double support. > > Please let me know if you have any feedback! > > Thanks, > Nick From jbvernee at xs4all.nl Tue May 28 09:19:53 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Tue, 28 May 2019 11:19:53 +0200 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> Message-ID: Hi Nick, I'm testing the patch, and there was 1 test failure; I noticed that you changed the code in UniversalUpcallHandler from depending on SharedUtils.VECTOR_REGISTER_SIZE to storage.getSize(). I understand why the change is needed, but this should use getMaxSize(), since getSize() can be smaller. After making that change all tests pass on my machine. The rest of the code looks good to me. 
Cheers, Jorn Nick Gasson schreef op 2019-05-28 09:34: > Hi all, > > I've put an updated webrev for the AArch64 port here: > > http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.2/ > > This contains all the previous changes plus a few fixes to handle > things > like "heterogeneous float aggregates" (structs/arrays where all the > members are float/double), which are handled specially in the ABI. > > With this patch all the tests in the jdk_foreign group pass on AArch64, > except the two I disabled pending adding long double support. > > Please let me know if you have any feedback! > > Thanks, > Nick From jbvernee at xs4all.nl Tue May 28 09:57:06 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Tue, 28 May 2019 11:57:06 +0200 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> Message-ID: <93069aae31189b5ca6951276e1ff0b94@xs4all.nl> > * UnalignedStructTest - why doesn't it work? There's an f128 there but > we're not actually reading/setting it - it's mostly there to force > alignment a certain way I noticed it's disabled on Windows as well, but as you say, this is not really needed. There will however be some mismatch between the Java layout and the native one, since the size of long double is not consistent. A more canonical way to add manual padding is to use anonymous bitfields: struct unaligned { long long: 64; long long: 64; short i1; short i2; //padding here? }; That, plus adding the EXPORT macro to the functions, and replacing the f128 in the layout with x128 (for clarity) works for me (I can upload a webrev if wanted). 
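To make the "padding here?" question concrete, here is a minimal sketch of the usual C-style layout rule, assuming a struct whose first member is 16 bytes wide and 16-byte aligned (as long double is on x86-64 SysV) followed by the two shorts. The class and method names (StructLayoutSketch, layout, alignUp) are made up for illustration and are not part of the Panama API; whether the anonymous-bitfield variant keeps the same struct alignment is ABI-dependent and is not modelled here.

```java
final class StructLayoutSketch {

    /** Rounds offset up to the next multiple of alignment (alignment must be a power of two). */
    static long alignUp(long offset, long alignment) {
        return (offset + alignment - 1) & -alignment;
    }

    /**
     * Computes C-style field offsets for fields given as {size, alignment} pairs (in bytes).
     * The returned array holds one offset per field, followed by the total struct size,
     * which is rounded up to the largest field alignment (hypothetical helper, for illustration).
     */
    static long[] layout(long[][] fields) {
        long[] result = new long[fields.length + 1];
        long offset = 0;
        long maxAlign = 1;
        for (int i = 0; i < fields.length; i++) {
            long size = fields[i][0];
            long align = fields[i][1];
            offset = alignUp(offset, align);   // padding before the field, if any
            result[i] = offset;
            offset += size;
            maxAlign = Math.max(maxAlign, align);
        }
        result[fields.length] = alignUp(offset, maxAlign); // trailing padding
        return result;
    }

    public static void main(String[] args) {
        // A 16-byte, 16-byte-aligned leading member followed by two shorts.
        long[] r = layout(new long[][] { {16, 16}, {2, 2}, {2, 2} });
        System.out.println(java.util.Arrays.toString(r));
    }
}
```

Under these assumptions the two shorts land at offsets 16 and 18, and the struct is padded at the end up to 32 bytes.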
Cheers, Jorn [1] : http://hg.openjdk.java.net/panama/dev/rev/3a85acb09a77#l34.2 Maurizio Cimadamore schreef op 2019-05-28 10:48: > Looks very good. Some questions/comments: > > * don't you need some extra set of constants under NativeTypes (for > aarch platform?) > > * UnalignedStructTest - why doesn't it work? There's an f128 there but > we're not actually reading/setting it - it's mostly there to force > alignment a certain way > > * the change to DirectSignatureShuffler is subtle, but I think it's ok > and shouldn't impact SysV negatively > > The rest looks good > > Maurizio > > > On 28/05/2019 08:34, Nick Gasson wrote: >> Hi all, >> >> I've put an updated webrev for the AArch64 port here: >> >> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.2/ >> >> This contains all the previous changes plus a few fixes to handle >> things >> like "heterogeneous float aggregates" (structs/arrays where all the >> members are float/double), which are handled specially in the ABI. >> >> With this patch all the tests in the jdk_foreign group pass on >> AArch64, >> except the two I disabled pending adding long double support. >> >> Please let me know if you have any feedback! >> >> Thanks, >> Nick From nick.gasson at arm.com Tue May 28 10:06:55 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 28 May 2019 18:06:55 +0800 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> Message-ID: <95d0a452-ebeb-903d-9d14-b678d1d71389@arm.com> Hi Jorn, On 28/05/2019 17:19, Jorn Vernee wrote: > I'm testing the patch, and there was 1 test failure; I noticed that you > changed the code in UniversalUpcallHandler from depending on > SharedUtils.VECTOR_REGISTER_SIZE to storage.getSize(). 
I understand why > the change is needed, but this should use getMaxSize(), since getSize() > can be smaller. After making that change all tests pass on my machine. > Thanks for finding this! I'll change it to getMaxSize(). Nick From maurizio.cimadamore at oracle.com Tue May 28 10:21:31 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 28 May 2019 11:21:31 +0100 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <93069aae31189b5ca6951276e1ff0b94@xs4all.nl> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <93069aae31189b5ca6951276e1ff0b94@xs4all.nl> Message-ID: On 28/05/2019 10:57, Jorn Vernee wrote: >> * UnalignedStructTest - why doesn't it work? There's an f128 there but >> we're not actually reading/setting it - it's mostly there to force >> alignment a certain way > > I noticed it's disabled on Windows as well, but as you say, this is > not really needed. There will however be some mismatch between the > Java layout and the native one, since the size of long double is not > consistent. A more canonical way to add manual padding is to use > anonymous bitfields: > >   struct unaligned { >     long long: 64; >     long long: 64; >     short i1; >     short i2; >     //padding here? >   }; > > That, plus adding the EXPORT macro to the functions, and replacing the > f128 in the layout with x128 (for clarity) works for me (I can upload > a webrev if wanted). I think the test is not really testing unaligned access - the name is a bit misleading here. The goal here is mostly to test whether the CallingSequenceBuilder/ShuffleRecipe will add some amount of SKIP in the resulting universal recipe, due to the fact that 'i1' comes after a field that is 16-byte aligned.
In other words I wanted to test this path: http://hg.openjdk.java.net/panama/dev/file/1a1b8038185b/src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/CallingSequenceBuilderImpl.java#l346 And how it interacted with this: http://hg.openjdk.java.net/panama/dev/file/1a1b8038185b/src/java.base/share/classes/jdk/internal/foreign/abi/ShuffleRecipe.java#l59 In a way, the f128 in there is irrelevant. In fact you can probably put either x128, f128, i128 or u128 - after all since the struct will be passed on the stack, I believe that UniversalInvoker won't copy the fields separately, but will just blindly copy the struct contents in the appropriate argument binding destinations. Maurizio > > Cheers, > Jorn > > [1] : http://hg.openjdk.java.net/panama/dev/rev/3a85acb09a77#l34.2 > > Maurizio Cimadamore schreef op 2019-05-28 10:48: >> Looks very good. Some questions/comments: >> >> * don't you need some extra set of constants under NativeTypes (for >> aarch platform?) >> >> * UnalignedStructTest - why doesn't it work? There's an f128 there but >> we're not actually reading/setting it - it's mostly there to force >> alignment a certain way >> >> * the change to DirectSignatureShuffler is subtle, but I think it's ok >> and shouldn't impact SysV negatively >> >> The rest looks good >> >> Maurizio >> >> >> On 28/05/2019 08:34, Nick Gasson wrote: >>> Hi all, >>> >>> I've put an updated webrev for the AArch64 port here: >>> >>> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.2/ >>> >>> This contains all the previous changes plus a few fixes to handle >>> things >>> like "heterogeneous float aggregates" (structs/arrays where all the >>> members are float/double), which are handled specially in the ABI. >>> >>> With this patch all the tests in the jdk_foreign group pass on AArch64, >>> except the two I disabled pending adding long double support. >>> >>> Please let me know if you have any feedback! 
>>> >>> Thanks, >>> Nick From jbvernee at xs4all.nl Tue May 28 10:30:55 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Tue, 28 May 2019 12:30:55 +0200 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support In-Reply-To: References: Message-ID: Hi, The patch doesn't compile for me: === Output from failing command(s) repeated here === * For target jdk_modules_java.base__the.java.base_batch: h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:41: error: no suitable constructor found for ByteBuffer(no arguments) super(); ^ constructor ByteBuffer.ByteBuffer(int,int,int,int,byte[],int) is not applicable (actual and formal argument lists differ in length) constructor ByteBuffer.ByteBuffer(int,int,int,int) is not applicable (actual and formal argument lists differ in length) h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:477: error: put(byte[]) in ScopedByteBuffer cannot override put(byte[]) in ByteBuffer public ByteBuffer put(byte[] src) { ^ overridden method is final h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:484: error: hasArray() in ScopedByteBuffer cannot override hasArray() in ByteBuffer ... (rest of output omitted) * All command lines available in /cygdrive/h/cygwin64/home/Jorn/cygwin-projects-new/memaccess/build/windows-x86_64-server-release/make-support/failure-logs. === End of repeated output === I see that in the patch you made the necessary changes to java.nio.Buffer, but shouldn't these changes be made to the template file for ByteBuffer as well? Since ScopedByteBuffer extends ByteBuffer? 
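For context, the delegating-wrapper idea behind ScopedByteBuffer can be sketched in plain Java roughly as follows. This is only a minimal sketch under stated assumptions: the Scope and ScopedView classes are hypothetical stand-ins, not the classes in the webrev, and the view deliberately does not extend ByteBuffer (its constructors are package-private, and overriding its final methods is exactly what fails to compile above).

```java
import java.nio.ByteBuffer;

final class ScopedBufferSketch {

    /** Hypothetical stand-in for a closeable memory scope; not the Panama MemoryScope API. */
    static final class Scope implements AutoCloseable {
        private volatile boolean alive = true;

        boolean isAlive() { return alive; }

        @Override
        public void close() { alive = false; }
    }

    /** Delegating view: every access checks scope liveness before touching the buffer. */
    static final class ScopedView {
        private final ByteBuffer delegate;
        private final Scope scope;

        ScopedView(ByteBuffer delegate, Scope scope) {
            this.delegate = delegate;
            this.scope = scope;
        }

        private void checkAlive() {
            if (!scope.isAlive()) {
                throw new IllegalStateException("scope is closed");
            }
        }

        int getInt(int index) {
            checkAlive();
            return delegate.getInt(index);
        }

        ScopedView putInt(int index, int value) {
            checkAlive();
            delegate.putInt(index, value);
            return this;
        }
    }

    /** Returns true if reading through the view after closing the scope throws ISE. */
    static boolean throwsAfterClose() {
        Scope scope = new Scope();
        ScopedView view = new ScopedView(ByteBuffer.allocate(16), scope);
        view.putInt(0, 42);
        scope.close();
        try {
            view.getInt(0);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ISE after close: " + throwsAfterClose());
    }
}
```

A wrapper that really is a ByteBuffer subclass needs the 'final' modifiers removed from the overridden methods in both Buffer and the ByteBuffer template, which is what the question above is about.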
Jorn Maurizio Cimadamore schreef op 2019-05-27 19:45: > Hi, > as you know, the foreign memory access API supports bidirectional > interop between MemoryAddress and ByteBuffer: > > 1) MemorySegment::ofByteBuffer(ByteBuffer) > 2) MemoryAddress:;asDirectByteBuffer(int bytes) > > That is (1) can be used to create a memory segment out of an existing > byte buffer, whereas (2) can be used to do the opposite, that is, to > convert a memory address into a byte buffer. > > While (1) works pretty reliably (but I've added some tests for it), > the implementation for (2) leaves to be desired: > > * The resulting byte buffer is unaware of the fact that the backing > memory is attached to a scope that can be closed > * There's no way to create a buffer if the address encapsulates some > heap-based memory address > > This patch solves both issues - and also adds a supported way for > creating a memory segment out of a heap-allocated byte array > (MemorySegment.ofArray(byte[])) which I think is useful. > > To solve the scope awareness issue I put together (after discussing > extensively with Alan, CC'ed) a delegating wrapper of ByteBuffer, > which delegates to the underlying buffer implementation after doing a > liveness check on the owning scope. That means that operations such as > ByteBuffer::getInt will only succeed if the scope the view is > associated with is still alive. Note that we need the wrapping only if > the associated scope is _not_ pinned - in fact, if the scope is > pinned, then it can't be closed, so there's no need to check for > liveliness. > > To do this I had to remove the 'final' modifier from some instance > methods in ByteBuffer and Buffer - the theory here is that these > 'final' have been added in early days to help the VM out, but are not > necessary now (also note that is not possible for a client to create a > custom byte buffer implementation, as all constructors are > package-private). 
>
> Finally, the wrapped byte buffer implementation does not support
> typeful views such as 'asCharBuffer' - this is tricky to support as
> CharBuffer is not a subtype of ByteBuffer so, to go down that path
> we'd need to wrap also CharBuffer and friends, which is doable, but
> we're leaning towards YAGNI for now.
>
> Finally I've added a test which writes into memory (using var handles)
> then reads it back using a segment-backed byte buffer (there are tests
> for both heap and off-heap variants of the buffer). There's also a
> test which checks interop between MappedByteBuffer and the foreign API
> (which will likely be relevant in the context of JEP 352 [1]), and,
> finally, a test which makes sure that every instance method in the
> scope-wrapped buffer throws ISE after the scope has been closed.
>
> Webrev:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8224843/
>
> Comments welcome!
>
> Cheers
> Maurizio
>
> [1] - https://openjdk.java.net/jeps/352

From nick.gasson at arm.com Tue May 28 10:36:42 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 28 May 2019 18:36:42 +0800
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To: <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com>
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
 <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com>
 <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com>
 <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com>
 <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com>
 <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com>
Message-ID: <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com>

Hi Maurizio,

On 28/05/2019 16:48, Maurizio Cimadamore wrote:
>
> * don't you need some extra set of constants under NativeTypes (for
> aarch platform?)

Yes, we ought to. I think I reused the LittleEndian.SysVABI ones initially as they almost exactly match except for long double.

For definitions like this:

    public static LayoutType SHORT = isWindows ?
            LittleEndian.WinABI.SHORT : LittleEndian.SysVABI.SHORT;

Do you want me to change it into something like:

    public static LayoutType SHORT = isX86 ?
            (isWindows ?
             LittleEndian.WinABI.SHORT : LittleEndian.SysVABI.SHORT)
            : LittleEndian.AArch64.SHORT;

I can see this getting quite messy as we add more platform combinations.

>
> * UnalignedStructTest - why doesn't it work? There's an f128 there but
> we're not actually reading/setting it - it's mostly there to force
> alignment a certain way
>

I think I disabled it at one point because it was failing, saw that it used long double, and just assumed that was the cause. But I just went back and tested it and it passes, so it must have been something else I fixed later. I'll re-enable it.

Thanks,
Nick

From maurizio.cimadamore at oracle.com Tue May 28 11:20:57 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 28 May 2019 12:20:57 +0100
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To: <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com>
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
 <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com>
 <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com>
 <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com>
 <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com>
 <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com>
 <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com>
Message-ID: <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com>

On 28/05/2019 11:36, Nick Gasson wrote:
> Hi Maurizio,
>
> On 28/05/2019 16:48, Maurizio Cimadamore wrote:
>>
>> * don't you need some extra set of constants under NativeTypes (for
>> aarch platform?)
>
> Yes, we ought to. I think I reused the LittleEndian.SysVABI ones
> initially as they almost exactly match except for long double.
>
> For definitions like this:
>
>     public static LayoutType SHORT = isWindows ?
>             LittleEndian.WinABI.SHORT : LittleEndian.SysVABI.SHORT;
>
> Do you want me to change it into something like:
>
>     public static LayoutType SHORT = isX86 ?
>             (isWindows ?
>              LittleEndian.WinABI.SHORT : LittleEndian.SysVABI.SHORT)
>             : LittleEndian.AArch64.SHORT;
>
> I can see this getting quite messy as we add more platform combinations.

For now we don't have many choices. Once we get switch expressions we can refactor to nicer code.

Maurizio

From maurizio.cimadamore at oracle.com Tue May 28 11:22:21 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 28 May 2019 12:22:21 +0100
Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support
In-Reply-To:
References:
Message-ID:

I'll check - thanks

Maurizio

From maurizio.cimadamore at oracle.com Tue May 28 11:25:06 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 28 May 2019 12:25:06 +0100
Subject: [foreign] RFR: 8223808: initial port for AArch64
In-Reply-To: <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com>
References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com>
 <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com>
 <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com>
 <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com>
 <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com>
 <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com>
 <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com>
 <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com>
Message-ID: <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com>

Actually, a cleaner way to get there is:

    public static LayoutType SHORT = pick(LittleEndian.SysVABI.SHORT,
            LittleEndian.WinABI.SHORT, LittleEndian.AArch64.SHORT);

and then define the logic inside pick() to select the right value depending on platform/os. At least that way the code will appear less cluttered - and if we make pick() a varargs method it can scale to any number of platform/os combinations - as long as we keep the order of the arguments uniform across call sites.

Another way would be to do a reflective lookup of a constant name - but I'm not sure we need all that.
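
A minimal sketch of how such a varargs pick() could look. Everything here other than the name pick() is an assumption made for illustration: the Platform enum, the current() detection via the os.arch and os.name system properties, and the pickFor() helper are hypothetical and are not taken from the Panama sources. The only contract is the one described above: every call site passes its choices in the same order.

```java
import java.util.Locale;

final class PlatformPick {
    // Hypothetical enumeration; its ordinal order defines the argument
    // order that every pick() call site has to follow.
    enum Platform { SYSV_X64, WIN_X64, AARCH64 }

    // Simplified detection, for illustration only: anything that is not
    // AArch64 is treated as x64, and Windows is told apart by os.name.
    static Platform current() {
        String arch = System.getProperty("os.arch", "").toLowerCase(Locale.ROOT);
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        if (arch.equals("aarch64") || arch.equals("arm64")) {
            return Platform.AARCH64;
        }
        return os.startsWith("windows") ? Platform.WIN_X64 : Platform.SYSV_X64;
    }

    // Selects the choice for an explicit platform; kept separate from
    // pick() so that the selection logic can be tested on any machine.
    @SafeVarargs
    static <T> T pickFor(Platform platform, T... choices) {
        if (choices.length != Platform.values().length) {
            throw new IllegalArgumentException(
                    "expected " + Platform.values().length + " choices, got " + choices.length);
        }
        return choices[platform.ordinal()];
    }

    // Selects the choice for the platform the JVM is running on.
    @SafeVarargs
    static <T> T pick(T... choices) {
        return pickFor(current(), choices);
    }

    public static void main(String[] args) {
        // Strings stand in for the LayoutType constants of the real code.
        String layout = pick("SysVABI.SHORT", "WinABI.SHORT", "AArch64.SHORT");
        System.out.println(current() + " -> " + layout);
    }
}
```

Checking the number of arguments against the enum keeps a call site from silently picking the wrong constant when a new platform is added to the enum but not to every pick() call.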

Maurizio

From maurizio.cimadamore at oracle.com Tue May 28 12:32:22 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 28 May 2019 13:32:22 +0100
Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support
In-Reply-To:
References:
Message-ID: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com>

Updated webrev:

http://cr.openjdk.java.net/~mcimadamore/panama/8224843_v2/

Changes:

* fixed byte buffer template

* regularized logic for resize (now bytebuffer segments also get to resize()!)

* moved byte buffer segment into its own class

* added several more tests to check correctness of the resize operation with various segment kinds

* opened up MemorySegment.ofArray to allow all primitive arrays (but MemoryAddress.asByteBuffer will still throw if the base object is not byte[])

Maurizio

From jbvernee at xs4all.nl Tue May 28 13:20:06 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 28 May 2019 15:20:06 +0200
Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support
In-Reply-To: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com>
References: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com>
Message-ID: <648a655c397fd56acad8ca5388bb11cc@xs4all.nl>

I'm getting an error on the new test:

TEST RESULT: Error.
failed to clean up files after test

This is probably because a stream to a file is not closed (which is problematic for Windows), though I can't spot where that's happening. I assume a MappedByteBuffer is un-mapped when the FileChannel is closed? (there doesn't seem to be a close/unmap method on MBB).

Jorn

From jbvernee at xs4all.nl Tue May 28 13:32:07 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Tue, 28 May 2019 15:32:07 +0200
Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support
In-Reply-To: <648a655c397fd56acad8ca5388bb11cc@xs4all.nl>
References: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com>
 <648a655c397fd56acad8ca5388bb11cc@xs4all.nl>
Message-ID: <6227471d683e094c286d2a52bead3eec@xs4all.nl>

FWIW, using /othervm makes this work (probably because the file stream is forcibly closed when the process exits).

Cheers,
Jorn

From maurizio.cimadamore at oracle.com Tue May 28 13:40:03 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 28 May 2019 14:40:03 +0100
Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support
In-Reply-To: <648a655c397fd56acad8ca5388bb11cc@xs4all.nl>
References: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com>
 <648a655c397fd56acad8ca5388bb11cc@xs4all.nl>
Message-ID: <4f0df275-3211-ee13-37f5-8a92d1732499@oracle.com>

Had a chat with Alan, there's no way to unmap a MBB - but what other tests do is something like this:

            var ref = new WeakReference<>(mbb);
            mbb = null;
            System.gc();
            // wait until the buffer has actually been collected
            while (ref.get() != null) {
                Thread.sleep(20);
            }

I'll add this.

Thanks
Maurizio
>> >> * moved byte buffer segment into its own class >> >> * added several more tests to check correctness of resize operation >> with various segment kinds >> >> * opened up MemorySegment.ofArray to allow all primitive arrays (but >> MemoryAddress.asByteBuffer will still throw if the base object is not >> byte[]) >> >> Maurizio >> >> On 28/05/2019 12:22, Maurizio Cimadamore wrote: >>> I'll check - thanks >>> >>> Maurizio >>> >>> On 28/05/2019 11:30, Jorn Vernee wrote: >>>> Hi, >>>> >>>> The patch doesn't compile for me: >>>> >>>> === Output from failing command(s) repeated here === >>>> * For target jdk_modules_java.base__the.java.base_batch: >>>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:41: >>>> error: no suitable constructor found for ByteBuffer(no arguments) >>>> super(); >>>> ^ >>>> constructor ByteBuffer.ByteBuffer(int,int,int,int,byte[],int) >>>> is not applicable >>>> (actual and formal argument lists differ in length) >>>> constructor ByteBuffer.ByteBuffer(int,int,int,int) is not >>>> applicable >>>> (actual and formal argument lists differ in length) >>>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:477: >>>> error: put(byte[]) in ScopedByteBuffer cannot override put(byte[]) >>>> in ByteBuffer >>>> public ByteBuffer put(byte[] src) { >>>> ^ >>>> overridden method is final >>>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:484: >>>> error: hasArray() in ScopedByteBuffer cannot override hasArray() in >>>> ByteBuffer >>>> ... (rest of output omitted) >>>> >>>> * All command lines available in >>>> /cygdrive/h/cygwin64/home/Jorn/cygwin-projects-new/memaccess/build/windows-x86_64-server-release/make-support/failure-logs.
>>>> === End of repeated output === >>>> >>>> I see that in the patch you made the necessary changes to >>>> java.nio.Buffer, but shouldn't these changes be made to the >>>> template file for ByteBuffer as well? Since ScopedByteBuffer >>>> extends ByteBuffer? >>>> >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-27 19:45: >>>>> Hi, >>>>> as you know, the foreign memory access API supports bidirectional >>>>> interop between MemoryAddress and ByteBuffer: >>>>> >>>>> 1) MemorySegment::ofByteBuffer(ByteBuffer) >>>>> 2) MemoryAddress::asDirectByteBuffer(int bytes) >>>>> >>>>> That is, (1) can be used to create a memory segment out of an existing >>>>> byte buffer, whereas (2) can be used to do the opposite, that is, to >>>>> convert a memory address into a byte buffer. >>>>> >>>>> While (1) works pretty reliably (but I've added some tests for it), >>>>> the implementation for (2) leaves something to be desired: >>>>> >>>>> * The resulting byte buffer is unaware of the fact that the backing >>>>> memory is attached to a scope that can be closed >>>>> * There's no way to create a buffer if the address encapsulates some >>>>> heap-based memory address >>>>> >>>>> This patch solves both issues - and also adds a supported way for >>>>> creating a memory segment out of a heap-allocated byte array >>>>> (MemorySegment.ofArray(byte[])) which I think is useful. >>>>> >>>>> To solve the scope awareness issue I put together (after discussing >>>>> extensively with Alan, CC'ed) a delegating wrapper of ByteBuffer, >>>>> which delegates to the underlying buffer implementation after doing a >>>>> liveness check on the owning scope. That means that operations >>>>> such as >>>>> ByteBuffer::getInt will only succeed if the scope the view is >>>>> associated with is still alive. Note that we need the wrapping >>>>> only if >>>>> the associated scope is _not_ pinned - in fact, if the scope is >>>>> pinned, then it can't be closed, so there's no need to check for >>>>> liveliness.
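
[Editorial note] The delegating-wrapper idea described in the quoted paragraph can be illustrated with a small standalone sketch. This is not the code from the webrev: `Scope` and `ScopedView` are hypothetical names invented for the example, and the view is a separate class rather than a ByteBuffer subclass, since ByteBuffer constructors are package-private and cannot be called from outside java.nio. It only shows the "check liveness, then delegate" pattern, assuming a scope with a simple alive flag.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the API's scope; not the actual Panama class.
final class Scope {
    private final boolean pinned;
    private boolean alive = true;

    Scope(boolean pinned) {
        this.pinned = pinned;
    }

    boolean isPinned() {
        return pinned;
    }

    // Throws ISE once the scope has been closed.
    void checkAlive() {
        if (!alive) {
            throw new IllegalStateException("scope is closed");
        }
    }

    void close() {
        if (pinned) {
            throw new IllegalStateException("a pinned scope cannot be closed");
        }
        alive = false;
    }
}

// Hypothetical delegating view: each access checks the scope, then forwards.
final class ScopedView {
    private final ByteBuffer delegate;
    private final Scope scope;

    ScopedView(ByteBuffer delegate, Scope scope) {
        this.delegate = delegate;
        this.scope = scope;
    }

    int getInt(int index) {
        scope.checkAlive();
        return delegate.getInt(index);
    }

    ScopedView putInt(int index, int value) {
        scope.checkAlive();
        delegate.putInt(index, value);
        return this;
    }
}

class ScopedViewDemo {
    public static void main(String[] args) {
        Scope scope = new Scope(false);
        ScopedView view = new ScopedView(ByteBuffer.allocate(16), scope);
        view.putInt(0, 42);
        System.out.println(view.getInt(0)); // succeeds while the scope is alive
        scope.close();
        try {
            view.getInt(0); // the liveness check now fails
        } catch (IllegalStateException e) {
            System.out.println("ISE after close: " + e.getMessage());
        }
    }
}
```

With a pinned scope the check can never fail in this sketch, which matches the remark in the message that no wrapping is needed in that case: the underlying buffer could simply be handed out directly.
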
>>>>> >>>>> To do this I had to remove the 'final' modifier from some instance >>>>> methods in ByteBuffer and Buffer - the theory here is that these >>>>> 'final' have been added in early days to help the VM out, but are not >>>>> necessary now (also note that is not possible for a client to >>>>> create a >>>>> custom byte buffer implementation, as all constructors are >>>>> package-private). >>>>> >>>>> Finally, the wrapped byte buffer implementation does not support >>>>> typeful views such as 'asCharBuffer' - this is tricky to support as >>>>> CharBuffer is not a subtype of ByteBuffer so, to go down that path >>>>> we'd need to wrap also CharBuffer and friends, which is doable, but >>>>> we're leaning towards YAGNI for now. >>>>> >>>>> Finally I've added a test which writes into memory (using var >>>>> handles) >>>>> then reads it back using a segment-backed byte buffer (there are >>>>> tests >>>>> for both heap and off-heap variants of the buffer). There's also a >>>>> test which checks interop between MappedByteBuffer and the foreign >>>>> API >>>>> (which will likely be relevant in the context of JEP 352 [1]), and, >>>>> finally, a test which makes sure that all instance method in the >>>>> scope-wrapped buffer throws ISE after the scope has been closed. >>>>> >>>>> Webrev: >>>>> >>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8224843/ >>>>> >>>>> Comments welcome! 
>>>>> >>>>> Cheers >>>>> Maurizio >>>>> >>>>> [1] - https://openjdk.java.net/jeps/352 From maurizio.cimadamore at oracle.com Tue May 28 23:28:14 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 00:28:14 +0100 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support In-Reply-To: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com> References: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com> Message-ID: And here's another: http://cr.openjdk.java.net/~mcimadamore/panama/8224843_v3 This version moves all the code generation in the templates, as for other buffer - so that we can generate a ScopedBuffer class for each basic type - this allows us to implement buffer views (such as 'asCharBuffer') cleanly. I've also added a lot more robustness in the test, which now checks not only that the ByteBuffer projection works as expected, but also that the derived buffer views also work as expected. Cheers Maurizio On 28/05/2019 13:32, Maurizio Cimadamore wrote: > Updated webrev: > > http://cr.openjdk.java.net/~mcimadamore/panama/8224843_v2/ > > Changes: > > * fixed byte buffer template > > * regularized logic for resize (now bytebuffer segments also get to > resize()!) 
> > * moved byte buffer segment into its own class > > * added several more tests to check correctness of resize operation > with various segment kinds > > * opened up MemorySegment.ofArray to allow all primitive arrays (but > MemoryAddress.asByteBuffer will still throw if the base object is not > byte[]) > > Maurizio > > On 28/05/2019 12:22, Maurizio Cimadamore wrote: >> I'll check - thanks >> >> Maurizio >> >> On 28/05/2019 11:30, Jorn Vernee wrote: >>> Hi, >>> >>> The patch doesn't compile for me: >>> >>> === Output from failing command(s) repeated here === >>> * For target jdk_modules_java.base__the.java.base_batch: >>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:41: >>> error: no suitable constructor found for ByteBuffer(no arguments) >>> super(); >>> ^ >>> constructor ByteBuffer.ByteBuffer(int,int,int,int,byte[],int) is >>> not applicable >>> (actual and formal argument lists differ in length) >>> constructor ByteBuffer.ByteBuffer(int,int,int,int) is not >>> applicable >>> (actual and formal argument lists differ in length) >>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:477: >>> error: put(byte[]) in ScopedByteBuffer cannot override put(byte[]) >>> in ByteBuffer >>> public ByteBuffer put(byte[] src) { >>> ^ >>> overridden method is final >>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:484: >>> error: hasArray() in ScopedByteBuffer cannot override hasArray() in >>> ByteBuffer >>> ... (rest of output omitted) >>> >>> * All command lines available in >>> /cygdrive/h/cygwin64/home/Jorn/cygwin-projects-new/memaccess/build/windows-x86_64-server-release/make-support/failure-logs.
>>> === End of repeated output === >>> >>> I see that in the patch you made the necessary changes to >>> java.nio.Buffer, but shouldn't these changes be made to the template >>> file for ByteBuffer as well? Since ScopedByteBuffer extends ByteBuffer? >>> >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-27 19:45: >>>> Hi, >>>> as you know, the foreign memory access API supports bidirectional >>>> interop between MemoryAddress and ByteBuffer: >>>> >>>> 1) MemorySegment::ofByteBuffer(ByteBuffer) >>>> 2) MemoryAddress::asDirectByteBuffer(int bytes) >>>> >>>> That is, (1) can be used to create a memory segment out of an existing >>>> byte buffer, whereas (2) can be used to do the opposite, that is, to >>>> convert a memory address into a byte buffer. >>>> >>>> While (1) works pretty reliably (but I've added some tests for it), >>>> the implementation for (2) leaves something to be desired: >>>> >>>> * The resulting byte buffer is unaware of the fact that the backing >>>> memory is attached to a scope that can be closed >>>> * There's no way to create a buffer if the address encapsulates some >>>> heap-based memory address >>>> >>>> This patch solves both issues - and also adds a supported way for >>>> creating a memory segment out of a heap-allocated byte array >>>> (MemorySegment.ofArray(byte[])) which I think is useful. >>>> >>>> To solve the scope awareness issue I put together (after discussing >>>> extensively with Alan, CC'ed) a delegating wrapper of ByteBuffer, >>>> which delegates to the underlying buffer implementation after doing a >>>> liveness check on the owning scope. That means that operations such as >>>> ByteBuffer::getInt will only succeed if the scope the view is >>>> associated with is still alive. Note that we need the wrapping only if >>>> the associated scope is _not_ pinned - in fact, if the scope is >>>> pinned, then it can't be closed, so there's no need to check for >>>> liveliness.
>>>> >>>> To do this I had to remove the 'final' modifier from some instance >>>> methods in ByteBuffer and Buffer - the theory here is that these >>>> 'final' have been added in early days to help the VM out, but are not >>>> necessary now (also note that is not possible for a client to create a >>>> custom byte buffer implementation, as all constructors are >>>> package-private). >>>> >>>> Finally, the wrapped byte buffer implementation does not support >>>> typeful views such as 'asCharBuffer' - this is tricky to support as >>>> CharBuffer is not a subtype of ByteBuffer so, to go down that path >>>> we'd need to wrap also CharBuffer and friends, which is doable, but >>>> we're leaning towards YAGNI for now. >>>> >>>> Finally I've added a test which writes into memory (using var handles) >>>> then reads it back using a segment-backed byte buffer (there are tests >>>> for both heap and off-heap variants of the buffer). There's also a >>>> test which checks interop between MappedByteBuffer and the foreign API >>>> (which will likely be relevant in the context of JEP 352 [1]), and, >>>> finally, a test which makes sure that all instance method in the >>>> scope-wrapped buffer throws ISE after the scope has been closed. >>>> >>>> Webrev: >>>> >>>> http://cr.openjdk.java.net/~mcimadamore/panama/8224843/ >>>> >>>> Comments welcome! >>>> >>>> Cheers >>>> Maurizio >>>> >>>> [1] - https://openjdk.java.net/jeps/352 From vivek.r.deshpande at intel.com Wed May 29 00:03:37 2019 From: vivek.r.deshpande at intel.com (vivek.r.deshpande at intel.com) Date: Wed, 29 May 2019 00:03:37 +0000 Subject: hg: panama/dev: add tests for masked scatter, gather and single apis Message-ID: <201905290003.x4T03b4D017006@aojmv0008.oracle.com> Changeset: 6378b3b30426 Author: jphalimi Date: 2019-05-28 17:03 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/6378b3b30426 add tests for masked scatter, gather and single apis ! test/jdk/jdk/incubator/vector/Byte128VectorTests.java ! 
test/jdk/jdk/incubator/vector/Byte256VectorTests.java ! test/jdk/jdk/incubator/vector/Byte512VectorTests.java ! test/jdk/jdk/incubator/vector/Byte64VectorTests.java ! test/jdk/jdk/incubator/vector/ByteMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Double128VectorTests.java ! test/jdk/jdk/incubator/vector/Double256VectorTests.java ! test/jdk/jdk/incubator/vector/Double512VectorTests.java ! test/jdk/jdk/incubator/vector/Double64VectorTests.java ! test/jdk/jdk/incubator/vector/DoubleMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Float128VectorTests.java ! test/jdk/jdk/incubator/vector/Float256VectorTests.java ! test/jdk/jdk/incubator/vector/Float512VectorTests.java ! test/jdk/jdk/incubator/vector/Float64VectorTests.java ! test/jdk/jdk/incubator/vector/FloatMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Int128VectorTests.java ! test/jdk/jdk/incubator/vector/Int256VectorTests.java ! test/jdk/jdk/incubator/vector/Int512VectorTests.java ! test/jdk/jdk/incubator/vector/Int64VectorTests.java ! test/jdk/jdk/incubator/vector/IntMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Long128VectorTests.java ! test/jdk/jdk/incubator/vector/Long256VectorTests.java ! test/jdk/jdk/incubator/vector/Long512VectorTests.java ! test/jdk/jdk/incubator/vector/Long64VectorTests.java ! test/jdk/jdk/incubator/vector/LongMaxVectorTests.java ! test/jdk/jdk/incubator/vector/Short128VectorTests.java ! test/jdk/jdk/incubator/vector/Short256VectorTests.java ! test/jdk/jdk/incubator/vector/Short512VectorTests.java ! test/jdk/jdk/incubator/vector/Short64VectorTests.java ! test/jdk/jdk/incubator/vector/ShortMaxVectorTests.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte512Vector.java ! 
test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ByteScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Double64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/DoubleMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/DoubleScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Float64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/FloatMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/FloatScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int512Vector.java ! 
test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Int64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/IntMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/IntScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Long64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/LongMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/LongScalar.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short128Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short256Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short512Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Short64Vector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ShortMaxVector.java ! test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/ShortScalar.java ! test/jdk/jdk/incubator/vector/gen-template.sh + test/jdk/jdk/incubator/vector/templates/Kernel-Gather-Masked-op.template + test/jdk/jdk/incubator/vector/templates/Kernel-Scatter-Masked-op.template + test/jdk/jdk/incubator/vector/templates/Kernel-Single-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Gather-Masked-op.template ! 
test/jdk/jdk/incubator/vector/templates/Unit-Gather-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Scatter-Masked-op.template ! test/jdk/jdk/incubator/vector/templates/Unit-Scatter-op.template + test/jdk/jdk/incubator/vector/templates/Unit-Single-op.template ! test/jdk/jdk/incubator/vector/templates/Unit-header.template From maurizio.cimadamore at oracle.com Wed May 29 00:11:12 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 29 May 2019 00:11:12 +0000 Subject: hg: panama/dev: Automatic merge with vectorIntrinsics Message-ID: <201905290011.x4T0BCkT023962@aojmv0008.oracle.com> Changeset: ae5ed850ca36 Author: mcimadamore Date: 2019-05-29 02:11 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/ae5ed850ca36 Automatic merge with vectorIntrinsics From nick.gasson at arm.com Wed May 29 07:00:30 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Wed, 29 May 2019 15:00:30 +0800 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> Message-ID: <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> Hi Maurizio, On 28/05/2019 19:25, Maurizio Cimadamore wrote: > Actually, a cleaner way to get there is: > > public static LayoutType SHORT = pick(LittleEndian.SysVABI.SHORT, > LittleEndian.WinABI.SHORT, LittleEndian.AArch64.SHORT); > > and then define the logic inside pick() to select the right value > depending on platform/os. > Yes this is much better. 
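
[Editorial note] A minimal sketch of what a pick() helper of the kind suggested above might look like. This is not the code from webrev.3: the class name `AbiPicker`, the `Abi` enum and the `detect` method are hypothetical, and the platform detection via the `os.name` and `os.arch` system properties is a simplifying assumption. The char signedness values used as the example payload reflect the thread's statement that char is unsigned on AArch64, and the usual signed char on x86-64 SysV and Windows.

```java
import java.util.Locale;

// Hypothetical helper; names and detection logic are illustrative only.
final class AbiPicker {
    enum Abi { SYSV, WIN64, AARCH64 }

    // Simplified detection: architecture is checked first, then the OS.
    // A real implementation would likely need finer distinctions.
    static Abi detect(String osName, String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("aarch64") || arch.equals("arm64")) {
            return Abi.AARCH64;
        }
        if (osName.toLowerCase(Locale.ROOT).startsWith("windows")) {
            return Abi.WIN64;
        }
        return Abi.SYSV;
    }

    static final Abi CURRENT =
            detect(System.getProperty("os.name"), System.getProperty("os.arch"));

    // Selects one of the three per-ABI values for the given ABI.
    static <T> T pick(Abi abi, T sysv, T win, T aarch64) {
        switch (abi) {
            case WIN64:   return win;
            case AARCH64: return aarch64;
            default:      return sysv;
        }
    }

    // Convenience overload using the ABI detected for the running JVM.
    static <T> T pick(T sysv, T win, T aarch64) {
        return pick(CURRENT, sysv, win, aarch64);
    }

    // Example constant: plain char is signed on SysV and Windows x64,
    // unsigned on AArch64.
    static final boolean CHAR_IS_SIGNED = pick(true, true, false);

    public static void main(String[] args) {
        System.out.println("ABI: " + CURRENT + ", char signed: " + CHAR_IS_SIGNED);
    }
}
```

Keeping the selection in one helper means each constant lists its per-ABI variants side by side, so adding a new ABI only requires touching pick() and the argument lists.
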
I've done this in webrev.3 here: http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ I added NativeTypes.AArch64ABI which differs from SysVABI in that long double is 128 bits and char is unsigned. I wonder if we should add a "SCHAR" signed char type to NativeTypes? As currently we have CHAR and UCHAR which are both unsigned on AArch64. Also fixed the x86 regression found by Jorn and re-enabled the UnalignedStructTest test. Thanks, Nick From maurizio.cimadamore at oracle.com Wed May 29 09:38:16 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 10:38:16 +0100 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> Message-ID: <83074b21-7b66-755b-8283-419c3553426b@oracle.com> Looks good - as for SCHAR, yes, we can go for it - and add it across the board (e.g. in all ABIs). Thanks Maurizio On 29/05/2019 08:00, Nick Gasson wrote: > Hi Maurizio, > > On 28/05/2019 19:25, Maurizio Cimadamore wrote: >> Actually, a cleaner way to get there is: >> >> public static LayoutType SHORT = >> pick(LittleEndian.SysVABI.SHORT, LittleEndian.WinABI.SHORT, >> LittleEndian.AArch64.SHORT); >> >> and then define the logic inside pick() to select the right value >> depending on platform/os. >> > > Yes this is much better. 
I've done this in webrev.3 here: > > http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ > > I added NativeTypes.AArch64ABI which differs from SysVABI in that long > double is 128 bits and char is unsigned. I wonder if we should add a > "SCHAR" signed char type to NativeTypes? As currently we have CHAR and > UCHAR which are both unsigned on AArch64. > > Also fixed the x86 regression found by Jorn and re-enabled the > UnalignedStructTest test. > > Thanks, > Nick From jbvernee at xs4all.nl Wed May 29 09:53:06 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 29 May 2019 11:53:06 +0200 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <83074b21-7b66-755b-8283-419c3553426b@oracle.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> <83074b21-7b66-755b-8283-419c3553426b@oracle.com> Message-ID: <4fdb040894dcb65527f24e92d2201a3a@xs4all.nl> Also, if you're touching that file any ways. I think it's good to make all the constants `final`. Jorn Maurizio Cimadamore schreef op 2019-05-29 11:38: > Looks good - as for SCHAR, yes, we can go for it - and add it across > the board (e.g. in all ABIs). 
> > Thanks > Maurizio > > On 29/05/2019 08:00, Nick Gasson wrote: >> Hi Maurizio, >> >> On 28/05/2019 19:25, Maurizio Cimadamore wrote: >>> Actually, a cleaner way to get there is: >>> >>> public static LayoutType SHORT = >>> pick(LittleEndian.SysVABI.SHORT, LittleEndian.WinABI.SHORT, >>> LittleEndian.AArch64.SHORT); >>> >>> and then define the logic inside pick() to select the right value >>> depending on platform/os. >>> >> >> Yes this is much better. I've done this in webrev.3 here: >> >> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ >> >> I added NativeTypes.AArch64ABI which differs from SysVABI in that long >> double is 128 bits and char is unsigned. I wonder if we should add a >> "SCHAR" signed char type to NativeTypes? As currently we have CHAR and >> UCHAR which are both unsigned on AArch64. >> >> Also fixed the x86 regression found by Jorn and re-enabled the >> UnalignedStructTest test. >> >> Thanks, >> Nick From jbvernee at xs4all.nl Wed May 29 09:58:43 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 29 May 2019 11:58:43 +0200 Subject: [foreign-memaccess] RFR 8224843: Refine ByteBuffer interop support In-Reply-To: References: <94bf44c2-089e-45f5-252c-ecfe8f28aa22@oracle.com> Message-ID: Looks very good! Cheers, Jorn [1] : https://javadoc.lwjgl.org/org/lwjgl/opengl/GL15.html#glBufferData(int,java.nio.FloatBuffer,int) Maurizio Cimadamore schreef op 2019-05-29 01:28: > And here's another: > > http://cr.openjdk.java.net/~mcimadamore/panama/8224843_v3 > > This version moves all the code generation in the templates, as for > other buffer - so that we can generate a ScopedBuffer class for each > basic type - this allows us to implement buffer views (such as > 'asCharBuffer') cleanly. > > I've also added a lot more robustness in the test, which now checks > not only that the ByteBuffer projection works as expected, but also > that the derived buffer views also work as expected. 
> > Cheers > Maurizio > > On 28/05/2019 13:32, Maurizio Cimadamore wrote: >> Updated webrev: >> >> http://cr.openjdk.java.net/~mcimadamore/panama/8224843_v2/ >> >> Changes: >> >> * fixed byte buffer template >> >> * regularized logic for resize (now bytebuffer segments also get to >> resize()!) >> >> * moved byte buffer segment into its own class >> >> * added several more tests to check correctness of resize operation >> with various segment kinds >> >> * opened up MemorySegment.ofArray to allow all primitive arrays (but >> MemoryAddress.asByteBuffer will still throw if the base object is not >> byte[]) >> >> Maurizio >> >> On 28/05/2019 12:22, Maurizio Cimadamore wrote: >>> I'll check - thanks >>> >>> Maurizio >>> >>> On 28/05/2019 11:30, Jorn Vernee wrote: >>>> Hi, >>>> >>>> The patch doesn't compile for me: >>>> >>>> === Output from failing command(s) repeated here === >>>> * For target jdk_modules_java.base__the.java.base_batch: >>>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:41: >>>> error: no suitable constructor found for ByteBuffer(no arguments) >>>> super(); >>>> ^ >>>> constructor ByteBuffer.ByteBuffer(int,int,int,int,byte[],int) is >>>> not applicable >>>> (actual and formal argument lists differ in length) >>>> constructor ByteBuffer.ByteBuffer(int,int,int,int) is not >>>> applicable >>>> (actual and formal argument lists differ in length) >>>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:477: >>>> error: put(byte[]) in ScopedByteBuffer cannot override put(byte[]) >>>> in ByteBuffer >>>> public ByteBuffer put(byte[] src) { >>>> ^ >>>>
overridden method is final >>>> h:\cygwin64\home\Jorn\cygwin-projects-new\memaccess\src\java.base\share\classes\java\nio\ScopedByteBuffer.java:484: >>>> error: hasArray() in ScopedByteBuffer cannot override hasArray() in >>>> ByteBuffer >>>> ... (rest of output omitted) >>>> >>>> * All command lines available in >>>> /cygdrive/h/cygwin64/home/Jorn/cygwin-projects-new/memaccess/build/windows-x86_64-server-release/make-support/failure-logs. >>>> === End of repeated output === >>>> >>>> I see that in the patch you made the necessary changes to >>>> java.nio.Buffer, but shouldn't these changes be made to the template >>>> file for ByteBuffer as well? Since ScopedByteBuffer extends >>>> ByteBuffer? >>>> >>>> Jorn >>>> >>>> Maurizio Cimadamore schreef op 2019-05-27 19:45: >>>>> Hi, >>>>> as you know, the foreign memory access API supports bidirectional >>>>> interop between MemoryAddress and ByteBuffer: >>>>> >>>>> 1) MemorySegment::ofByteBuffer(ByteBuffer) >>>>> 2) MemoryAddress::asDirectByteBuffer(int bytes) >>>>> >>>>> That is, (1) can be used to create a memory segment out of an >>>>> existing >>>>> byte buffer, whereas (2) can be used to do the opposite, that is, >>>>> to >>>>> convert a memory address into a byte buffer. >>>>> >>>>> While (1) works pretty reliably (but I've added some tests for it), >>>>> the implementation for (2) leaves something to be desired: >>>>> >>>>> * The resulting byte buffer is unaware of the fact that the backing >>>>> memory is attached to a scope that can be closed >>>>> * There's no way to create a buffer if the address encapsulates >>>>> some >>>>> heap-based memory address >>>>> >>>>> This patch solves both issues - and also adds a supported way for >>>>> creating a memory segment out of a heap-allocated byte array >>>>> (MemorySegment.ofArray(byte[])) which I think is useful.
>>>>> >>>>> To solve the scope awareness issue I put together (after discussing >>>>> extensively with Alan, CC'ed) a delegating wrapper of ByteBuffer, >>>>> which delegates to the underlying buffer implementation after doing >>>>> a >>>>> liveness check on the owning scope. That means that operations such >>>>> as >>>>> ByteBuffer::getInt will only succeed if the scope the view is >>>>> associated with is still alive. Note that we need the wrapping only >>>>> if >>>>> the associated scope is _not_ pinned - in fact, if the scope is >>>>> pinned, then it can't be closed, so there's no need to check for >>>>> liveliness. >>>>> >>>>> To do this I had to remove the 'final' modifier from some instance >>>>> methods in ByteBuffer and Buffer - the theory here is that these >>>>> 'final' have been added in early days to help the VM out, but are >>>>> not >>>>> necessary now (also note that is not possible for a client to >>>>> create a >>>>> custom byte buffer implementation, as all constructors are >>>>> package-private). >>>>> >>>>> Finally, the wrapped byte buffer implementation does not support >>>>> typeful views such as 'asCharBuffer' - this is tricky to support as >>>>> CharBuffer is not a subtype of ByteBuffer so, to go down that path >>>>> we'd need to wrap also CharBuffer and friends, which is doable, but >>>>> we're leaning towards YAGNI for now. >>>>> >>>>> Finally I've added a test which writes into memory (using var >>>>> handles) >>>>> then reads it back using a segment-backed byte buffer (there are >>>>> tests >>>>> for both heap and off-heap variants of the buffer). There's also a >>>>> test which checks interop between MappedByteBuffer and the foreign >>>>> API >>>>> (which will likely be relevant in the context of JEP 352 [1]), and, >>>>> finally, a test which makes sure that all instance method in the >>>>> scope-wrapped buffer throws ISE after the scope has been closed. 
>>>>> >>>>> Webrev: >>>>> >>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8224843/ >>>>> >>>>> Comments welcome! >>>>> >>>>> Cheers >>>>> Maurizio >>>>> >>>>> [1] - https://openjdk.java.net/jeps/352 From jbvernee at xs4all.nl Wed May 29 10:47:29 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 29 May 2019 12:47:29 +0200 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <4fdb040894dcb65527f24e92d2201a3a@xs4all.nl> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> <83074b21-7b66-755b-8283-419c3553426b@oracle.com> <4fdb040894dcb65527f24e92d2201a3a@xs4all.nl> Message-ID: <5da780bed65fcdb8cdd9f14b43fc9ea4@xs4all.nl> But, I realize this is out of scope for this RFR, so we can take care of it separately as well :) FWIW, all tests pass on my end with v3. Jorn Jorn Vernee schreef op 2019-05-29 11:53: > Also, if you're touching that file any ways. I think it's good to make > all the constants `final`. > > Jorn > > Maurizio Cimadamore schreef op 2019-05-29 11:38: >> Looks good - as for SCHAR, yes, we can go for it - and add it across >> the board (e.g. in all ABIs). >> >> Thanks >> Maurizio >> >> On 29/05/2019 08:00, Nick Gasson wrote: >>> Hi Maurizio, >>> >>> On 28/05/2019 19:25, Maurizio Cimadamore wrote: >>>> Actually, a cleaner way to get there is: >>>> >>>> public static LayoutType SHORT = >>>> pick(LittleEndian.SysVABI.SHORT, LittleEndian.WinABI.SHORT, >>>> LittleEndian.AArch64.SHORT); >>>> >>>> and then define the logic inside pick() to select the right value >>>> depending on platform/os. 
>>>> >>> >>> Yes this is much better. I've done this in webrev.3 here: >>> >>> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ >>> >>> I added NativeTypes.AArch64ABI which differs from SysVABI in that >>> long double is 128 bits and char is unsigned. I wonder if we should >>> add a "SCHAR" signed char type to NativeTypes? As currently we have >>> CHAR and UCHAR which are both unsigned on AArch64. >>> >>> Also fixed the x86 regression found by Jorn and re-enabled the >>> UnalignedStructTest test. >>> >>> Thanks, >>> Nick From maurizio.cimadamore at oracle.com Wed May 29 11:42:16 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 12:42:16 +0100 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <5da780bed65fcdb8cdd9f14b43fc9ea4@xs4all.nl> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> <83074b21-7b66-755b-8283-419c3553426b@oracle.com> <4fdb040894dcb65527f24e92d2201a3a@xs4all.nl> <5da780bed65fcdb8cdd9f14b43fc9ea4@xs4all.nl> Message-ID: <73c29c3a-ed20-5529-1f15-2f7fa32da789@oracle.com> I'll give a try with our build/test infra too... Maurizio On 29/05/2019 11:47, Jorn Vernee wrote: > But, I realize this is out of scope for this RFR, so we can take care > of it separately as well :) > > FWIW, all tests pass on my end with v3. > > Jorn > > Jorn Vernee schreef op 2019-05-29 11:53: >> Also, if you're touching that file any ways. I think it's good to make >> all the constants `final`. 
>> >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-29 11:38: >>> Looks good - as for SCHAR, yes, we can go for it - and add it across >>> the board (e.g. in all ABIs). >>> >>> Thanks >>> Maurizio >>> >>> On 29/05/2019 08:00, Nick Gasson wrote: >>>> Hi Maurizio, >>>> >>>> On 28/05/2019 19:25, Maurizio Cimadamore wrote: >>>>> Actually, a cleaner way to get there is: >>>>> >>>>> public static LayoutType SHORT = >>>>> pick(LittleEndian.SysVABI.SHORT, LittleEndian.WinABI.SHORT, >>>>> LittleEndian.AArch64.SHORT); >>>>> >>>>> and then define the logic inside pick() to select the right value >>>>> depending on platform/os. >>>>> >>>> >>>> Yes this is much better. I've done this in webrev.3 here: >>>> >>>> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ >>>> >>>> I added NativeTypes.AArch64ABI which differs from SysVABI in that >>>> long double is 128 bits and char is unsigned. I wonder if we should >>>> add a "SCHAR" signed char type to NativeTypes? As currently we have >>>> CHAR and UCHAR which are both unsigned on AArch64. >>>> >>>> Also fixed the x86 regression found by Jorn and re-enabled the >>>> UnalignedStructTest test. >>>> >>>> Thanks, >>>> Nick From maurizio.cimadamore at oracle.com Wed May 29 12:11:40 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 29 May 2019 12:11:40 +0000 Subject: hg: panama/dev: 8224843: Refine ByteBuffer interop support Message-ID: <201905291211.x4TCBfRB018795@aojmv0008.oracle.com> Changeset: 8b98f924a13d Author: mcimadamore Date: 2019-05-29 13:11 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/8b98f924a13d 8224843: Refine ByteBuffer interop support ! make/gensrc/GensrcBuffer.gmk ! src/java.base/share/classes/java/foreign/MemoryAddress.java ! src/java.base/share/classes/java/foreign/MemorySegment.java ! src/java.base/share/classes/java/nio/Buffer.java ! 
src/java.base/share/classes/java/nio/X-Buffer.java.template + src/java.base/share/classes/java/nio/X-ScopedBuffer-bin.java.template + src/java.base/share/classes/java/nio/X-ScopedBuffer.java.template ! src/java.base/share/classes/jdk/internal/access/JavaNioAccess.java + src/java.base/share/classes/jdk/internal/foreign/ByteBufferMemorySegmentImpl.java ! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java ! src/java.base/share/classes/jdk/internal/foreign/MemorySegmentImpl.java + test/jdk/java/foreign/TestByteBuffer.java From jbvernee at xs4all.nl Wed May 29 12:26:46 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 29 May 2019 14:26:46 +0200 Subject: [foreign] RFR 8224835: Pointer validity checks are applied inconsistently Message-ID: <161e73d05fbd45f9d3f2350be4f29b22@xs4all.nl> Hi, Please review the following: Bug: https://bugs.openjdk.java.net/browse/JDK-8224835 Webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.00/ This replaces the various public checkXXX methods on BoundedPointer with a single public checkAccess(AccessMode) method, which is then used everywhere outside of BoundedPointer. This makes sure that every place does all the needed checks, and not only one or the other. I've also checked all the use sites, and noticed that the IllegalAccessException that Pointer::addr declared as being thrown was never actually thrown. Instead an AccessControlException was being thrown. So I've updated the relevant signatures as well, and this allowed for the removal of a bunch of try/catch blocks. I've also removed the checks in References.OfStruct::get and References.OfArray::get, since we're not actually dereferencing the pointer at that point, only wrapping it. The checks will still happen later on when accessing a field or element (i.e. when actually dereferencing the pointer). 
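As a rough illustration of the consolidation described above, the sketch below funnels liveness, bounds and access-mode checks through one `checkAccess` method, so no caller can run only some of them. `PointerModel`, its fields and the enum are hypothetical and do not reproduce BoundedPointer; the sketch also throws UnsupportedOperationException for a mode violation, whereas this thread reports an AccessControlException in the actual code.

```java
public class CheckAccessSketch {

    /** Hypothetical access modes; the real Pointer.AccessMode may differ. */
    enum AccessMode {
        READ, WRITE, READ_WRITE;

        boolean permits(AccessMode requested) {
            return this == READ_WRITE || this == requested;
        }
    }

    /** Hypothetical pointer model: region size, offset, access size, mode, liveness. */
    static final class PointerModel {
        private final long regionSize;
        private final long offset;
        private final long accessSize;
        private final AccessMode mode;
        private boolean alive = true;

        PointerModel(long regionSize, long offset, long accessSize, AccessMode mode) {
            this.regionSize = regionSize;
            this.offset = offset;
            this.accessSize = accessSize;
            this.mode = mode;
        }

        void close() { alive = false; }

        /** Single entry point: all three checks always run together. */
        void checkAccess(AccessMode requested) {
            if (!alive) {
                throw new IllegalStateException("scope is closed");
            }
            // Written as a subtraction so that offset + accessSize cannot overflow.
            if (offset < 0 || accessSize < 0 || offset > regionSize - accessSize) {
                throw new IndexOutOfBoundsException("access out of bounds");
            }
            if (!mode.permits(requested)) {
                throw new UnsupportedOperationException("no " + requested + " access");
            }
        }
    }

    public static void main(String[] args) {
        PointerModel p = new PointerModel(16, 8, 4, AccessMode.READ);
        p.checkAccess(AccessMode.READ); // passes all three checks
        try {
            p.checkAccess(AccessMode.WRITE);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```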
Thanks,
Jorn

From maurizio.cimadamore at oracle.com Wed May 29 12:54:01 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 29 May 2019 13:54:01 +0100
Subject: [foreign] RFR 8224835: Pointer validity checks are applied inconsistently
In-Reply-To: <161e73d05fbd45f9d3f2350be4f29b22@xs4all.nl>
References: <161e73d05fbd45f9d3f2350be4f29b22@xs4all.nl>
Message-ID:

Looks very good - the changes to Pointer::addr alone (and associated cleanups) are worth it :-)

I understand the choice with respect to liveness checks for structs and arrays - this reflects our big type vs. small type distinction. The sore point to note there is that you'd only get notified of a failure if the element you are accessing is, again, not a struct or an array. What if we added a call to checkAlive in the struct and array constructors?

Maurizio

On 29/05/2019 13:26, Jorn Vernee wrote:
> Hi,
>
> Please review the following:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8224835
> Webrev:
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.00/
>
> This replaces the various public checkXXX methods on BoundedPointer
> with a single public checkAccess(AccessMode) method, which is then
> used everywhere outside of BoundedPointer. This makes sure that every
> place does all the needed checks, and not only one or the other.
>
> I've also checked all the use sites, and noticed that the
> IllegalAccessException that Pointer::addr declared as being thrown was
> never actually thrown. Instead an AccessControlException was being
> thrown. So I've updated the relevant signatures as well, and this
> allowed for the removal of a bunch of try/catch blocks.
>
> I've also removed the checks in References.OfStruct::get and
> References.OfArray::get, since we're not actually dereferencing the
> pointer at that point, only wrapping it. The checks will still happen
> later on when accessing a field or element (i.e. when actually
> dereferencing the pointer).
>
> Thanks,
> Jorn

From jbvernee at xs4all.nl Wed May 29 13:51:00 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Wed, 29 May 2019 15:51:00 +0200
Subject: [foreign] RFR 8224835: Pointer validity checks are applied inconsistently
In-Reply-To:
References: <161e73d05fbd45f9d3f2350be4f29b22@xs4all.nl>
Message-ID: <716055f906016bbec0082acaf537ff03@xs4all.nl>

I feel your sore point. tbh, I think if we go for checks for structs and arrays we should do a bounds check as well. This will make sure that the struct/array is not partially out of bounds. E.g. in the array case, I think a user could do an unsafe cast to make an array pointer larger, and not get an exception when calling get(lastElement) if the element type is a struct or array.

One of the niceties of not doing the check for structs/arrays is that this avoids redundant bounds checking when first calling get on a struct/array pointer, and then again when accessing a field/element. But, I think maybe this could also be solved later on with a JIT intrinsic that merges the two checks (though I've heard that for those kinds of larger -> smaller bounds checks this is pretty tricky to do in general).

I've added the checks back in:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.01/

I've implemented this by calling checkAccess(Pointer.AccessMode.NONE), which effectively does no read/write access check, since we don't know yet at that point whether the pointer will be written to or read from.

Changed files are References (adding the checks), Pointer (adding the NONE access mode) and PointerScopeTest (sharpening the tests again).

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-29 14:54:
> Looks very good - the changes to Pointer::addr alone (and associated
> cleanups) are worth it :-)
>
> I understand the choice with respect to liveness checks for structs
> and arrays - this reflects our big type vs. small type distinction.
> The sore point to note there is that you'd only get notified of a > failure if the element you are accessing is, again, not a struct or an > array. What if we added a call to checkAlive in the struct and array > constructors? > > Maurizio > > On 29/05/2019 13:26, Jorn Vernee wrote: >> Hi, >> >> Please review the following: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8224835 >> Webrev: >> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.00/ >> >> This replaces the various public checkXXX methods on BoundedPointer >> with a single public checkAccess(AccessMode) method, which is then >> used everywhere outside of BoundedPointer. This makes sure that every >> place does all the needed checks, and not only one or the other. >> >> I've also checked all the use sites, and noticed that the >> IllegalAccessException that Pointer::addr declared as being thrown was >> never actually thrown. Instead an AccessControlException was being >> thrown. So I've updated the relevant signatures as well, and this >> allowed for the removal of a bunch of try/catch blocks. >> >> I've also removed the checks in References.OfStruct::get and >> References.OfArray::get, since we're not actually dereferencing the >> pointer at that point, only wrapping it. The checks will still happen >> later on when accessing a field or element (i.e. when actually >> dereferencing the pointer). 
>> >> Thanks, >> Jorn From jbvernee at xs4all.nl Wed May 29 15:01:37 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Wed, 29 May 2019 17:01:37 +0200 Subject: [foreign] RFR 8224835: Pointer validity checks are applied inconsistently In-Reply-To: <716055f906016bbec0082acaf537ff03@xs4all.nl> References: <161e73d05fbd45f9d3f2350be4f29b22@xs4all.nl> <716055f906016bbec0082acaf537ff03@xs4all.nl> Message-ID: <5098298bf75c144e757ab3d20f7a3283@xs4all.nl> Little update: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.03/ Where I've also added 2 test cases to OutOfBoundsTest for the bounds checking for arrays and structs. And some cases to PointerTest which check that addr() is performing the needed checks. Cheers, Jorn Jorn Vernee schreef op 2019-05-29 15:51: > I feel your sore point. tbh, I think if we go for checks for structs > and arrays we should do a bounds check as well. This will make sure > that half of the struct/array is not out of bounds. e.g. in the array > case, I think a user could do an unsafe cast to make an array pointer > larger, and not get an exception when calling get(lastElement) if the > element type is a struct or array. > > One of the niceties of not doing the check for structs/arrays is that > this avoids redundant bounds checking when first calling get on a > struct/array pointer, and then again when accessing a field/element. > But, I think maybe this could also be solved later on with a JIT > intrinsic that merges the two checks (though I've heard that for those > kinds of larger -> smaller bounds checks this is pretty tricky to do > in general). > > I've added the checks back in: > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.01/ > > I've implemented this by calling checkAccess(Pointer.AccessMode.NONE), > which effectively does no read/write access check, since we don't know > at that point if the pointer will be written to, or read from yet. 
> > Changed files are References (adding the checks), Pointer (adding the > NONE access mode) and PointerScopeTest (sharpening the tests again). > > Cheers, > Jorn > > Maurizio Cimadamore schreef op 2019-05-29 14:54: >> Looks very good - the changes to Pointer:addr alone (and associated >> cleanups) are worth :-) >> >> I understand the choice with respect to liveness checks for structs >> and arrays - this reflects our big type vs. small type distinction. >> The sore point to note there is that you'd only get notified of a >> failure if the element you are accessing is, again, not a struct or an >> array. What if we added a call to checkAlive in the struct and array >> constructors? >> >> Maurizio >> >> On 29/05/2019 13:26, Jorn Vernee wrote: >>> Hi, >>> >>> Please review the following: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224835 >>> Webrev: >>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.00/ >>> >>> This replaces the various public checkXXX methods on BoundedPointer >>> with a single public checkAccess(AccessMode) method, which is then >>> used everywhere outside of BoundedPointer. This makes sure that every >>> place does all the needed checks, and not only one or the other. >>> >>> I've also checked all the use sites, and noticed that the >>> IllegalAccessException that Pointer::addr declared as being thrown >>> was never actually thrown. Instead an AccessControlException was >>> being thrown. So I've updated the relevant signatures as well, and >>> this allowed for the removal of a bunch of try/catch blocks. >>> >>> I've also removed the checks in References.OfStruct::get and >>> References.OfArray::get, since we're not actually dereferencing the >>> pointer at that point, only wrapping it. The checks will still happen >>> later on when accessing a field or element (i.e. when actually >>> dereferencing the pointer). 
>>> >>> Thanks, >>> Jorn From maurizio.cimadamore at oracle.com Wed May 29 15:36:26 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 16:36:26 +0100 Subject: [foreign] RFR 8224835: Pointer validity checks are applied inconsistently In-Reply-To: <5098298bf75c144e757ab3d20f7a3283@xs4all.nl> References: <161e73d05fbd45f9d3f2350be4f29b22@xs4all.nl> <716055f906016bbec0082acaf537ff03@xs4all.nl> <5098298bf75c144e757ab3d20f7a3283@xs4all.nl> Message-ID: Looks good Maurizio On 29/05/2019 16:01, Jorn Vernee wrote: > Little update: > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.03/ > > Where I've also added 2 test cases to OutOfBoundsTest for the bounds > checking for arrays and structs. And some cases to PointerTest which > check that addr() is performing the needed checks. > > Cheers, > Jorn > > Jorn Vernee schreef op 2019-05-29 15:51: >> I feel your sore point. tbh, I think if we go for checks for structs >> and arrays we should do a bounds check as well. This will make sure >> that half of the struct/array is not out of bounds. e.g. in the array >> case, I think a user could do an unsafe cast to make an array pointer >> larger, and not get an exception when calling get(lastElement) if the >> element type is a struct or array. >> >> One of the niceties of not doing the check for structs/arrays is that >> this avoids redundant bounds checking when first calling get on a >> struct/array pointer, and then again when accessing a field/element. >> But, I think maybe this could also be solved later on with a JIT >> intrinsic that merges the two checks (though I've heard that for those >> kinds of larger -> smaller bounds checks this is pretty tricky to do >> in general). 
>> >> I've added the checks back in: >> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.01/ >> >> I've implemented this by calling checkAccess(Pointer.AccessMode.NONE), >> which effectively does no read/write access check, since we don't know >> at that point if the pointer will be written to, or read from yet. >> >> Changed files are References (adding the checks), Pointer (adding the >> NONE access mode) and PointerScopeTest (sharpening the tests again). >> >> Cheers, >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-29 14:54: >>> Looks very good - the changes to Pointer:addr alone (and associated >>> cleanups) are worth :-) >>> >>> I understand the choice with respect to liveness checks for structs >>> and arrays - this reflects our big type vs. small type distinction. >>> The sore point to note there is that you'd only get notified of a >>> failure if the element you are accessing is, again, not a struct or an >>> array. What if we added a call to checkAlive in the struct and array >>> constructors? >>> >>> Maurizio >>> >>> On 29/05/2019 13:26, Jorn Vernee wrote: >>>> Hi, >>>> >>>> Please review the following: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224835 >>>> Webrev: >>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224835/webrev.00/ >>>> >>>> This replaces the various public checkXXX methods on BoundedPointer >>>> with a single public checkAccess(AccessMode) method, which is then >>>> used everywhere outside of BoundedPointer. This makes sure that >>>> every place does all the needed checks, and not only one or the other. >>>> >>>> I've also checked all the use sites, and noticed that the >>>> IllegalAccessException that Pointer::addr declared as being thrown >>>> was never actually thrown. Instead an AccessControlException was >>>> being thrown. So I've updated the relevant signatures as well, and >>>> this allowed for the removal of a bunch of try/catch blocks. 
>>>> >>>> I've also removed the checks in References.OfStruct::get and >>>> References.OfArray::get, since we're not actually dereferencing the >>>> pointer at that point, only wrapping it. The checks will still >>>> happen later on when accessing a field or element (i.e. when >>>> actually dereferencing the pointer). >>>> >>>> Thanks, >>>> Jorn From jbvernee at xs4all.nl Wed May 29 15:44:44 2019 From: jbvernee at xs4all.nl (jbvernee at xs4all.nl) Date: Wed, 29 May 2019 15:44:44 +0000 Subject: hg: panama/dev: 8224835: Pointer validity checks are applied inconsistently Message-ID: <201905291544.x4TFii3M003775@aojmv0008.oracle.com> Changeset: bd64d6097368 Author: jvernee Date: 2019-05-29 17:43 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/bd64d6097368 8224835: Pointer validity checks are applied inconsistently Reviewed-by: mcimadamore ! src/java.base/share/classes/java/foreign/memory/Pointer.java ! src/java.base/share/classes/jdk/internal/foreign/HeaderImplGenerator.java ! src/java.base/share/classes/jdk/internal/foreign/abi/DirectSignatureShuffler.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalAdapter.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalNativeInvoker.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UpcallStubs.java ! src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/SysVx64ABI.java ! src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/Windowsx64ABI.java ! src/java.base/share/classes/jdk/internal/foreign/memory/BoundedPointer.java ! src/java.base/share/classes/jdk/internal/foreign/memory/CallbackImpl.java ! src/java.base/share/classes/jdk/internal/foreign/memory/MemoryBoundInfo.java ! src/java.base/share/classes/jdk/internal/foreign/memory/References.java ! test/jdk/com/sun/tools/jextract/ConstantsTest.java ! test/jdk/java/foreign/OutOfBoundsTest.java ! test/jdk/java/foreign/ScopeTest.java ! test/jdk/java/foreign/types/PointerScopeTest.java ! 
test/jdk/java/foreign/types/PointerTest.java ! test/jdk/java/foreign/types/StructTest.java From maurizio.cimadamore at oracle.com Wed May 29 16:02:30 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 17:02:30 +0100 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <73c29c3a-ed20-5529-1f15-2f7fa32da789@oracle.com> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> <83074b21-7b66-755b-8283-419c3553426b@oracle.com> <4fdb040894dcb65527f24e92d2201a3a@xs4all.nl> <5da780bed65fcdb8cdd9f14b43fc9ea4@xs4all.nl> <73c29c3a-ed20-5529-1f15-2f7fa32da789@oracle.com> Message-ID: <09c83e81-3c06-36ad-004f-09a61e3c6b2d@oracle.com> All tests green here - this is good to go Maurizio On 29/05/2019 12:42, Maurizio Cimadamore wrote: > I'll give a try with our build/test infra too... > > Maurizio > > On 29/05/2019 11:47, Jorn Vernee wrote: >> But, I realize this is out of scope for this RFR, so we can take care >> of it separately as well :) >> >> FWIW, all tests pass on my end with v3. >> >> Jorn >> >> Jorn Vernee schreef op 2019-05-29 11:53: >>> Also, if you're touching that file any ways. I think it's good to make >>> all the constants `final`. >>> >>> Jorn >>> >>> Maurizio Cimadamore schreef op 2019-05-29 11:38: >>>> Looks good - as for SCHAR, yes, we can go for it - and add it across >>>> the board (e.g. in all ABIs). 
>>>> >>>> Thanks >>>> Maurizio >>>> >>>> On 29/05/2019 08:00, Nick Gasson wrote: >>>>> Hi Maurizio, >>>>> >>>>> On 28/05/2019 19:25, Maurizio Cimadamore wrote: >>>>>> Actually, a cleaner way to get there is: >>>>>> >>>>>> public static LayoutType SHORT = >>>>>> pick(LittleEndian.SysVABI.SHORT, LittleEndian.WinABI.SHORT, >>>>>> LittleEndian.AArch64.SHORT); >>>>>> >>>>>> and then define the logic inside pick() to select the right value >>>>>> depending on platform/os. >>>>>> >>>>> >>>>> Yes this is much better. I've done this in webrev.3 here: >>>>> >>>>> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ >>>>> >>>>> I added NativeTypes.AArch64ABI which differs from SysVABI in that >>>>> long double is 128 bits and char is unsigned. I wonder if we >>>>> should add a "SCHAR" signed char type to NativeTypes? As currently >>>>> we have CHAR and UCHAR which are both unsigned on AArch64. >>>>> >>>>> Also fixed the x86 regression found by Jorn and re-enabled the >>>>> UnalignedStructTest test. >>>>> >>>>> Thanks, >>>>> Nick From maurizio.cimadamore at oracle.com Wed May 29 16:17:15 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 17:17:15 +0100 Subject: [foreign-memacces] RFR 8224993: Add Unsafe support for MemoryAddress Message-ID: <8773e3c5-4b0e-f368-671a-0c8a898c4ff8@oracle.com> Hi, after adding byte buffer interop support, I think it's worth looking at another avenue of migration - that is, translating a MemoryAddress into a base/offset coordinate pair, used by the Unsafe class. 
http://cr.openjdk.java.net/~mcimadamore/panama/8224993/

In addition to adding the trivial Unsafe capabilities, this patch also fixes some issues that were left unaddressed after the addition of ByteBuffer support - more specifically, the wrapped 'scoped' buffers do not have the 'address', 'hb' and 'capacity' fields set - unfortunately, these fields are accessed in a raw fashion to implement JNI functionalities such as GetDirectBufferAddress and GetDirectBufferCapacity.

I've also removed the bit-swapping from the VarHandleMemoryAddressAsBytes class - after all, NativeOrder deals with _byte_ swapping, so there's nothing meaningful we can do at the byte level (and that's also consistent with what ByteBuffer does).

Cheers
Maurizio

From jbvernee at xs4all.nl Wed May 29 17:31:51 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Wed, 29 May 2019 19:31:51 +0200
Subject: [foreign-memacces] RFR 8224993: Add Unsafe support for MemoryAddress
In-Reply-To: <8773e3c5-4b0e-f368-671a-0c8a898c4ff8@oracle.com>
References: <8773e3c5-4b0e-f368-671a-0c8a898c4ff8@oracle.com>
Message-ID: <9fc7d9ebb317db846ab5b09d32957fc1@xs4all.nl>

Hi,

This looks good!

But there's one conversion from a jlong to a jint in libNativeAccess.c, which causes a warning, which in turn causes a build error (due to -Werror). The conversion happens in Java_TestNative_getCapacity. Note that GetDirectBufferCapacity returns a jlong.

After changing the return type of Java_TestNative_getCapacity to jlong/long as well, I'm seeing all tests green.

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-29 18:17:
> Hi,
> after adding byte buffer interop support, I think it's worth looking
> at another avenue of migration - that is, translating a MemoryAddress
> into a base/offset coordinate pair, used by the Unsafe class.
> > http://cr.openjdk.java.net/~mcimadamore/panama/8224993/ > > In addition to add the trivial Unsafe capabilities, this patch also > fixes some issues that were left unaddressed after the addition of > ByteBuffer support - more specifically, the wrapped 'scoped' buffers > do not have the 'address' , 'hb' and the 'capacity' fields set - > unfortunately, these fields are accessed in a raw fashion to implement > JNI functionalites such as GetDirectBufferAddress and > GetDirectBufferCapacity. > > I've also removed the bit-swapping from the > VarHandleMemoryAddressAsBytes class - after all, NativeOrder deals > with _byte_ swapping, so there's nothing meaningful we can do at the > byte level (and that's also consistent with what ByteBuffer does). > > Cheers > Maurizio From maurizio.cimadamore at oracle.com Wed May 29 17:44:17 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 29 May 2019 17:44:17 +0000 Subject: hg: panama/dev: 8224993: Add Unsafe support for MemoryAddress Message-ID: <201905291744.x4THiHwr018783@aojmv0008.oracle.com> Changeset: 80b7b4ab6611 Author: mcimadamore Date: 2019-05-29 18:41 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/80b7b4ab6611 8224993: Add Unsafe support for MemoryAddress ! src/java.base/share/classes/java/lang/invoke/X-VarHandleMemoryAddressView.java.template ! src/java.base/share/classes/java/nio/Buffer.java ! src/java.base/share/classes/java/nio/X-Buffer.java.template ! src/java.base/share/classes/java/nio/X-ScopedBuffer.java.template ! src/java.base/share/classes/jdk/internal/misc/Unsafe.java ! src/jdk.unsupported/share/classes/sun/misc/Unsafe.java ! 
test/jdk/java/foreign/TestByteBuffer.java + test/jdk/java/foreign/TestNative.java + test/jdk/java/foreign/libNativeAccess.c From maurizio.cimadamore at oracle.com Wed May 29 17:54:41 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 29 May 2019 18:54:41 +0100 Subject: [foreign-memacces] RFR 8224993: Add Unsafe support for MemoryAddress In-Reply-To: <9fc7d9ebb317db846ab5b09d32957fc1@xs4all.nl> References: <8773e3c5-4b0e-f368-671a-0c8a898c4ff8@oracle.com> <9fc7d9ebb317db846ab5b09d32957fc1@xs4all.nl> Message-ID: <9f9d03bb-e895-87c4-df2f-2f040de5dd1d@oracle.com> Fixed and pushed thanks! Maurizio On 29/05/2019 18:31, Jorn Vernee wrote: > Hi, > > This looks good! > > But, there's 1 conversion from a jlong to a jint in libNativeAccess.c, > which causes a warning, which then causes an error when building (due > to -Werror). The conversion happens in Java_TestNative_getCapacity. > Note that GetDirectBufferCapacity returns a jlong. > > After changing the return type of Java_TestNative_getCapacity to > jlong/long as well I'm seeing all tests green. > > Cheers, > Jorn > > Maurizio Cimadamore schreef op 2019-05-29 18:17: >> Hi, >> after adding byte buffer interop support, I think it's worth looking >> at another avenue of migration - that is, translating a MemoryAddress >> into a base/offset coordinate pair, used by the Unsafe class. >> >> http://cr.openjdk.java.net/~mcimadamore/panama/8224993/ >> >> In addition to add the trivial Unsafe capabilities, this patch also >> fixes some issues that were left unaddressed after the addition of >> ByteBuffer support - more specifically, the wrapped 'scoped' buffers >> do not have the 'address' , 'hb' and the 'capacity' fields set - >> unfortunately, these fields are accessed in a raw fashion to implement >> JNI functionalites such as GetDirectBufferAddress and >> GetDirectBufferCapacity. 
>> >> I've also removed the bit-swapping from the >> VarHandleMemoryAddressAsBytes class - after all, NativeOrder deals >> with _byte_ swapping, so there's nothing meaningful we can do at the >> byte level (and that's also consistent with what ByteBuffer does). >> >> Cheers >> Maurizio From kishor.kharbas at intel.com Wed May 29 20:44:40 2019 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Wed, 29 May 2019 20:44:40 +0000 Subject: RFR(S) 8225018: 9 unit tests for Vector API failed on SkyLake with assert "(((dst_enc < 16 && nds_enc < 16 ..." Message-ID: Hi, Requesting review for http://cr.openjdk.java.net/~kkharbas/vector-api/8225018/8225018.webrev.00/ Bug - https://bugs.openjdk.java.net/browse/JDK-8225018 Summary - 1. Changed the instructs to take legacy registers when using vblendvps, vblendvpd, vpblendvb instructions. These instructions are not extended by AVX-512. 2. Changed the order (unordered in some cases) so all instructs of same type are together. Testing: The failure can be reproduced consistently by forcing registers to be allocated from upper bank by limiting the lower bank registers. (by changing register class definitions). After the fix, we do not see any more failures. Thanks, Kishor From vladimir.x.ivanov at oracle.com Wed May 29 20:49:41 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 29 May 2019 23:49:41 +0300 Subject: RFR(S) 8225018: 9 unit tests for Vector API failed on SkyLake with assert "(((dst_enc < 16 && nds_enc < 16 ..." In-Reply-To: References: Message-ID: <9bcdd2db-471a-32bf-875f-b1f003687c6a@oracle.com> Looks good. Best regards, Vladimir Ivanov On 29/05/2019 23:44, Kharbas, Kishor wrote: > Hi, > > Requesting review for http://cr.openjdk.java.net/~kkharbas/vector-api/8225018/8225018.webrev.00/ > Bug - https://bugs.openjdk.java.net/browse/JDK-8225018 > > Summary - > > 1. Changed the instructs to take legacy registers when using vblendvps, vblendvpd, vpblendvb instructions. 
These instructions are not extended by AVX-512. > > 2. Changed the order (unordered in some cases) so all instructs of same type are together. > > Testing: > The failure can be reproduced consistently by forcing registers to be allocated from upper bank by limiting the lower bank registers. (by changing register class definitions). > After the fix, we do not see any more failures. > > Thanks, > Kishor > From kishor.kharbas at intel.com Wed May 29 21:09:17 2019 From: kishor.kharbas at intel.com (kishor.kharbas at intel.com) Date: Wed, 29 May 2019 21:09:17 +0000 Subject: hg: panama/dev: 8225018: [vector] 9 unit tests for Vector API failed on SkyLake with assert "(((dst_enc < 16 && nds_enc < 16 ..." Message-ID: <201905292109.x4TL9Hp7025656@aojmv0008.oracle.com> Changeset: 0bea74e4f0eb Author: kkharbas Date: 2019-05-29 14:08 -0700 URL: http://hg.openjdk.java.net/panama/dev/rev/0bea74e4f0eb 8225018: [vector] 9 unit tests for Vector API failed on SkyLake with assert "(((dst_enc < 16 && nds_enc < 16 ..." Summary: 9 unit tests for Vector API failed on SkyLake with assert "(((dst_enc < 16 && nds_enc < 16 ..." Reviewed-by: vlivanov ! src/hotspot/cpu/x86/x86.ad From maurizio.cimadamore at oracle.com Wed May 29 21:14:46 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Wed, 29 May 2019 21:14:46 +0000 Subject: hg: panama/dev: Automatic merge with vectorIntrinsics Message-ID: <201905292114.x4TLEkOi000258@aojmv0008.oracle.com> Changeset: f94dd38a20f4 Author: mcimadamore Date: 2019-05-29 23:14 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/f94dd38a20f4 Automatic merge with vectorIntrinsics From kishor.kharbas at intel.com Wed May 29 21:18:00 2019 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Wed, 29 May 2019 21:18:00 +0000 Subject: RFR(S) 8225018: 9 unit tests for Vector API failed on SkyLake with assert "(((dst_enc < 16 && nds_enc < 16 ..." 
In-Reply-To: <9bcdd2db-471a-32bf-875f-b1f003687c6a@oracle.com> References: <9bcdd2db-471a-32bf-875f-b1f003687c6a@oracle.com> Message-ID: Thank you! Pushed the patch. -Kishor > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Wednesday, May 29, 2019 1:50 PM > To: Kharbas, Kishor ; panama- > dev at openjdk.java.net > Subject: Re: RFR(S) 8225018: 9 unit tests for Vector API failed on SkyLake with > assert "(((dst_enc < 16 && nds_enc < 16 ..." > > Looks good. > > Best regards, > Vladimir Ivanov > > On 29/05/2019 23:44, Kharbas, Kishor wrote: > > Hi, > > > > Requesting review for > > http://cr.openjdk.java.net/~kkharbas/vector- > api/8225018/8225018.webrev > > .00/ Bug - https://bugs.openjdk.java.net/browse/JDK-8225018 > > > > Summary - > > > > 1. Changed the instructs to take legacy registers when using vblendvps, > vblendvpd, vpblendvb instructions. These instructions are not extended by > AVX-512. > > > > 2. Changed the order (unordered in some cases) so all instructs of same > type are together. > > > > Testing: > > The failure can be reproduced consistently by forcing registers to be > allocated from upper bank by limiting the lower bank registers. (by changing > register class definitions). > > After the fix, we do not see any more failures. 
> > > > Thanks, > > Kishor > > From ningsheng.jian at arm.com Thu May 30 07:54:00 2019 From: ningsheng.jian at arm.com (ningsheng.jian at arm.com) Date: Thu, 30 May 2019 07:54:00 +0000 Subject: hg: panama/dev: 8223808: initial port of Panama foreign for AArch64 Message-ID: <201905300754.x4U7s0nJ026256@aojmv0008.oracle.com> Changeset: e2f26eb2438d Author: ngasson Date: 2019-05-29 14:48 +0800 URL: http://hg.openjdk.java.net/panama/dev/rev/e2f26eb2438d 8223808: initial port of Panama foreign for AArch64 Reviewed-by: mcimadamore, jvernee + src/hotspot/cpu/aarch64/directUpcallHandler_aarch64.cpp + src/hotspot/cpu/aarch64/foreign_globals_aarch64.cpp + src/hotspot/cpu/aarch64/foreign_globals_aarch64.hpp + src/hotspot/cpu/aarch64/universalNativeInvoker_aarch64.cpp + src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp ! src/hotspot/cpu/x86/universalNativeInvoker_x86.cpp ! src/hotspot/cpu/x86/universalUpcallHandler_x86.cpp ! src/java.base/share/classes/java/foreign/NativeTypes.java ! src/java.base/share/classes/jdk/internal/foreign/abi/CallingSequence.java ! src/java.base/share/classes/jdk/internal/foreign/abi/DirectSignatureShuffler.java ! src/java.base/share/classes/jdk/internal/foreign/abi/ShuffleRecipeClass.java ! src/java.base/share/classes/jdk/internal/foreign/abi/StorageClass.java ! src/java.base/share/classes/jdk/internal/foreign/abi/SystemABI.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalNativeInvoker.java ! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalUpcallHandler.java + src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/AArch64ABI.java + src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/ArgumentClass.java + src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/CallingSequenceBuilderImpl.java + src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/SharedUtils.java ! src/java.base/share/classes/jdk/internal/foreign/memory/LayoutTypeImpl.java ! 
test/jdk/com/sun/tools/jextract/test8222025/ValistUseTest.java ! test/jdk/java/foreign/LongDoubleTest.java ! test/jdk/java/foreign/StructByValueTest.java ! test/jdk/java/foreign/System/UnixSystem.java + test/jdk/java/foreign/abi/aarch64/CallingSequenceTest.java ! test/jdk/java/foreign/libstructbyvalue.c From nick.gasson at arm.com Thu May 30 07:59:19 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Thu, 30 May 2019 15:59:19 +0800 Subject: [foreign] RFR: 8223808: initial port for AArch64 In-Reply-To: <5da780bed65fcdb8cdd9f14b43fc9ea4@xs4all.nl> References: <0e86a5f3-3cc0-962b-af8b-46f5927e0976@arm.com> <8f2918aa-09b5-45e7-acf3-83b9a054fead@arm.com> <24e9e9e8-2e4f-6572-023f-d929432633f2@oracle.com> <078e4d9d-35b5-8e3c-c857-a8ec7eb7e308@oracle.com> <55c2d42d-3512-b0f6-6934-b6c2b5ada731@arm.com> <7c62132c-cda9-0fbb-4a11-fda534ebb776@oracle.com> <112767ff-9de3-1d69-2fa1-03cd70279b7f@arm.com> <5c5e980b-0046-2473-0d06-7f19f33d5c06@oracle.com> <092a8eea-ef11-bb4b-3b20-30f72e9b50f8@oracle.com> <0739f1e6-7c47-1808-bb56-15b5d18dd967@arm.com> <83074b21-7b66-755b-8283-419c3553426b@oracle.com> <4fdb040894dcb65527f24e92d2201a3a@xs4all.nl> <5da780bed65fcdb8cdd9f14b43fc9ea4@xs4all.nl> Message-ID: Hi Jorn, I'll do the SCHAR / final change in a separate patch as you and Maurizio already tested webrev.3. My colleague Ningsheng helped me push it. Thanks! Nick On 29/05/2019 18:47, Jorn Vernee wrote: > But, I realize this is out of scope for this RFR, so we can take care of > it separately as well :) > > FWIW, all tests pass on my end with v3. > > Jorn > > Jorn Vernee schreef op 2019-05-29 11:53: >> Also, if you're touching that file any ways. I think it's good to make >> all the constants `final`. >> >> Jorn >> >> Maurizio Cimadamore schreef op 2019-05-29 11:38: >>> Looks good - as for SCHAR, yes, we can go for it - and add it across >>> the board (e.g. in all ABIs). 
>>> >>> Thanks >>> Maurizio >>> >>> On 29/05/2019 08:00, Nick Gasson wrote: >>>> Hi Maurizio, >>>> >>>> On 28/05/2019 19:25, Maurizio Cimadamore wrote: >>>>> Actually, a cleaner way to get there is: >>>>> >>>>> public static LayoutType SHORT = >>>>> pick(LittleEndian.SysVABI.SHORT, LittleEndian.WinABI.SHORT, >>>>> LittleEndian.AArch64.SHORT); >>>>> >>>>> and then define the logic inside pick() to select the right value >>>>> depending on platform/os. >>>>> >>>> >>>> Yes this is much better. I've done this in webrev.3 here: >>>> >>>> http://cr.openjdk.java.net/~ngasson/foreign/8223808/webrev.3/ >>>> >>>> I added NativeTypes.AArch64ABI which differs from SysVABI in that >>>> long double is 128 bits and char is unsigned. I wonder if we should >>>> add a "SCHAR" signed char type to NativeTypes? As currently we have >>>> CHAR and UCHAR which are both unsigned on AArch64. >>>> >>>> Also fixed the x86 regression found by Jorn and re-enabled the >>>> UnalignedStructTest test. >>>> >>>> Thanks, >>>> Nick From vladimir.x.ivanov at oracle.com Thu May 30 09:14:55 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 30 May 2019 12:14:55 +0300 Subject: [vector] Fwd: [PATCH] Elemental shifts and rotates speedup In-Reply-To: References: Message-ID: Forwarding to panama-dev@ for review. Best regards, Vladimir Ivanov -------- Forwarded Message -------- Subject: [PATCH] Elemental shifts and rotates speedup Date: Thu, 16 May 2019 02:11:30 +0000 From: Bhateja, Jatin To: hotspot-compiler-dev at openjdk.java.net Hi All, Please find a patch with the following changes: A) Intrinsification of two vector APIs: 1) VectorShuffle.shuffleIota(VectorSpecies, int) 2) VectorShuffle.toVector() B) Re-implementation of the following vector APIs using the above intrinsified APIs.
1) Vector.shiftLanesLeft(int) 2) Vector.shiftLanesRight(int) 3) Vector.rotateLanesLeft(int) 4) Vector.rotateLanesRight(int) With this we see around 2X gains in elemental shift and rotate operations. Webrev: http://cr.openjdk.java.net/~kkharbas/Jatin/rotate_and_shift_lanes/webrev.00/ Kindly review the patch. Best Regards, Jatin From vladimir.x.ivanov at oracle.com Thu May 30 09:18:09 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 30 May 2019 12:18:09 +0300 Subject: [vector] Fwd: [PATCH] Elemental shifts and rotates speedup In-Reply-To: References: Message-ID: <0b3c3025-792e-e1d9-2436-97feda49a521@oracle.com> Please, ignore. Jatin already resent it to panama-dev at . Best regards, Vladimir Ivanov On 30/05/2019 12:14, Vladimir Ivanov wrote: > Forwarding to panama-dev@ for review. > > Best regards, > Vladimir Ivanov > > -------- Forwarded Message -------- > Subject: [PATCH] Elemental shifts and rotates speedup > Date: Thu, 16 May 2019 02:11:30 +0000 > From: Bhateja, Jatin > To: hotspot-compiler-dev at openjdk.java.net > > > Hi All, > > Please find a patch with the following changes: > > A) Intrinsification of two vector APIs: > > 1) VectorShuffle.shuffleIota(VectorSpecies, int) > > 2) VectorShuffle.toVector() > > B) Re-implementation of the following vector APIs using the above intrinsified APIs. > > 1) Vector.shiftLanesLeft(int) > > 2) Vector.shiftLanesRight(int) > > 3) Vector.rotateLanesLeft(int) > > 4) Vector.rotateLanesRight(int) > > With this we see around 2X gains in elemental shift and rotate > operations. > > Webrev: > http://cr.openjdk.java.net/~kkharbas/Jatin/rotate_and_shift_lanes/webrev.00/ > > > Kindly review the patch.
> > Best Regards, > > Jatin > From maurizio.cimadamore at oracle.com Thu May 30 11:15:20 2019 From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com) Date: Thu, 30 May 2019 11:15:20 +0000 Subject: hg: panama/dev: 8224993: 8224843: Refine ByteBuffer interop support? Message-ID: <201905301115.x4UBFLHS026577@aojmv0008.oracle.com> Changeset: 1e31b8307079 Author: mcimadamore Date: 2019-05-30 12:11 +0100 URL: http://hg.openjdk.java.net/panama/dev/rev/1e31b8307079 8224993: 8224843: Refine ByteBuffer interop support? Add combinatorial test for memory address bulk copy ! test/jdk/java/foreign/TestByteBuffer.java From jbvernee at xs4all.nl Thu May 30 12:28:02 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 30 May 2019 14:28:02 +0200 Subject: [foreign] RFR 8225034: java/foreign/LongDoubleTest.java fails on Mac OS Message-ID: <1499b105aa63f2b4e7e6f8be01b35f85@xs4all.nl> Hi, After strengthening the access checking for Pointer::addr(), the LongDoubleTest is failing. The problem is that UniversalUpcallHandler creates a READ only pointer, which References::OfLongDouble then tries to unbox using addr(), but this requires READ_WRITE access. To solve this I'm letting References::OfLongDouble do manual access checking for read/write only and use an unchecked version of addr(), which is then used to get/set the value. LongDoubleTest is disabled on Windows, so I've also added a LongDoubleBinderTest that just tests the binder part on all platforms. Bug: https://bugs.openjdk.java.net/browse/JDK-8225034 Webrev: http://cr.openjdk.java.net/~jvernee/panama/webrevs/8225034/webrev.00/ This also fixes a minor hickup which prevents compilation in AArch64ABI, which still expected the old signature of addr(). 
Thanks, Jorn From sundararajan.athijegannathan at oracle.com Thu May 30 13:15:36 2019 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Thu, 30 May 2019 18:45:36 +0530 Subject: [foreign] RFR 8225034: java/foreign/LongDoubleTest.java fails on Mac OS In-Reply-To: <1499b105aa63f2b4e7e6f8be01b35f85@xs4all.nl> References: <1499b105aa63f2b4e7e6f8be01b35f85@xs4all.nl> Message-ID: <5CEFD778.8050402@oracle.com> With your patch, all tests run fine on Mac OS. -Sundar On 30/05/19, 5:58 PM, Jorn Vernee wrote: > Hi, > > After strengthening the access checking for Pointer::addr(), the > LongDoubleTest is failing. The problem is that UniversalUpcallHandler > creates a READ only pointer, which References::OfLongDouble then tries > to unbox using addr(), but this requires READ_WRITE access. > > To solve this I'm letting References::OfLongDouble do manual access > checking for read/write only and use an unchecked version of addr(), > which is then used to get/set the value. LongDoubleTest is disabled on > Windows, so I've also added a LongDoubleBinderTest that just tests the > binder part on all platforms. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8225034 > Webrev: > http://cr.openjdk.java.net/~jvernee/panama/webrevs/8225034/webrev.00/ > > This also fixes a minor hickup which prevents compilation in > AArch64ABI, which still expected the old signature of addr(). > > Thanks, > Jorn From maurizio.cimadamore at oracle.com Thu May 30 13:12:49 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 30 May 2019 14:12:49 +0100 Subject: [foreign] RFR 8225034: java/foreign/LongDoubleTest.java fails on Mac OS In-Reply-To: <5CEFD778.8050402@oracle.com> References: <1499b105aa63f2b4e7e6f8be01b35f85@xs4all.nl> <5CEFD778.8050402@oracle.com> Message-ID: Patch looks good code-wise. Good to go! Maurizio On 30/05/2019 14:15, Sundararajan Athijegannathan wrote: > With your patch, all tests run fine on Mac OS. 
> > -Sundar > > On 30/05/19, 5:58 PM, Jorn Vernee wrote: >> Hi, >> >> After strengthening the access checking for Pointer::addr(), the >> LongDoubleTest is failing. The problem is that UniversalUpcallHandler >> creates a READ only pointer, which References::OfLongDouble then >> tries to unbox using addr(), but this requires READ_WRITE access. >> >> To solve this I'm letting References::OfLongDouble do manual access >> checking for read/write only and use an unchecked version of addr(), >> which is then used to get/set the value. LongDoubleTest is disabled >> on Windows, so I've also added a LongDoubleBinderTest that just tests >> the binder part on all platforms. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8225034 >> Webrev: >> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8225034/webrev.00/ >> >> This also fixes a minor hickup which prevents compilation in >> AArch64ABI, which still expected the old signature of addr(). >> >> Thanks, >> Jorn From jbvernee at xs4all.nl Thu May 30 13:38:32 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 30 May 2019 15:38:32 +0200 Subject: [foreign] RFR 8225034: java/foreign/LongDoubleTest.java fails on Mac OS In-Reply-To: References: <1499b105aa63f2b4e7e6f8be01b35f85@xs4all.nl> <5CEFD778.8050402@oracle.com> Message-ID: <98be6168b9cae8f6a40c9a2290bf8264@xs4all.nl> Thanks for the reviews. There's one other thing I realized as well; Util::withOffHeapAddress requires READ_WRITE access for heap-based pointers, since it needs to copy the value back and forth to offheap. I've added a comment to the method that mentions that. I don't think there's any way around requiring READ_WRITE access with the current offheap buffering approach. Jorn Maurizio Cimadamore schreef op 2019-05-30 15:12: > Patch looks good code-wise. Good to go! > > Maurizio > > On 30/05/2019 14:15, Sundararajan Athijegannathan wrote: >> With your patch, all tests run fine on Mac OS. 
>> >> -Sundar >> >> On 30/05/19, 5:58 PM, Jorn Vernee wrote: >>> Hi, >>> >>> After strengthening the access checking for Pointer::addr(), the >>> LongDoubleTest is failing. The problem is that UniversalUpcallHandler >>> creates a READ only pointer, which References::OfLongDouble then >>> tries to unbox using addr(), but this requires READ_WRITE access. >>> >>> To solve this I'm letting References::OfLongDouble do manual access >>> checking for read/write only and use an unchecked version of addr(), >>> which is then used to get/set the value. LongDoubleTest is disabled >>> on Windows, so I've also added a LongDoubleBinderTest that just tests >>> the binder part on all platforms. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8225034 >>> Webrev: >>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8225034/webrev.00/ >>> >>> This also fixes a minor hickup which prevents compilation in >>> AArch64ABI, which still expected the old signature of addr(). >>> >>> Thanks, >>> Jorn From jbvernee at xs4all.nl Thu May 30 13:49:48 2019 From: jbvernee at xs4all.nl (jbvernee at xs4all.nl) Date: Thu, 30 May 2019 13:49:48 +0000 Subject: hg: panama/dev: 8225034: java/foreign/LongDoubleTest.java fails on Mac OS Message-ID: <201905301349.x4UDnnrK000103@aojmv0008.oracle.com> Changeset: 50961615d2f8 Author: jvernee Date: 2019-05-30 15:48 +0200 URL: http://hg.openjdk.java.net/panama/dev/rev/50961615d2f8 8225034: java/foreign/LongDoubleTest.java fails on Mac OS Reviewed-by: sundar, mcimadamore ! src/java.base/share/classes/jdk/internal/foreign/Util.java ! src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/AArch64ABI.java ! src/java.base/share/classes/jdk/internal/foreign/memory/BoundedPointer.java ! 
src/java.base/share/classes/jdk/internal/foreign/memory/References.java + test/jdk/java/foreign/LongDoubleBinderTest.java From maurizio.cimadamore at oracle.com Thu May 30 14:59:03 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 30 May 2019 15:59:03 +0100 Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again) Message-ID: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> Hi, small followup to yesterday's push. As I was benchmarking bulk copy performances, I realized that there's a bug in the methods which I added yesterday on sun.misc.Unsafe - which delegate to the wrong Unsafe (itself), resulting in SO. I also cleaned up uses of Unsafe internally - and added a @ForceInline on MemoryAddressImpl::copy which basically makes it as fast as a raw unsafe call to Unsafe::copyMemory. We're still investigating as to why exactly the annotation is needed to get good inlining. I've also added a shortcircuit in AbstractMemoryScopeImpl::checkAlive, which avoids a call to 'isAlive' if the scope is pinned. This was causing some profile pollution (although performances were still good - probably because of the bi-morphic inline cache). Webrev: http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup/ Maurizio From vladimir.x.ivanov at oracle.com Thu May 30 15:42:52 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 30 May 2019 18:42:52 +0300 Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes In-Reply-To: References: Message-ID: Nice work, Joshua! I like how it shapes out. Good to see some performance numbers as well. Kishor did some experiments with masked memory operations, so I'd like to hear his thoughts on the patch. Overall, I'm happy with the change as it is now. It solves the correctness issue without sacrificing performance in most of the cases. 
The only important case which suffers is "masked loops" - when masked kernel is used to avoid explicit loop tail processing. Such loops will end with a deopt most of the time (unless loop tail is empty). So, as a next step, I'd like to see that case to be addressed, so such loops don't trigger uncommon trap. In a longer term, it's worth looking into specific optimizations range checks for masked accesses. Some comments about the code changes: src/hotspot/share/opto/opaquenode.hpp: You have some changes in ProfileBooleanNode. Why providing 0 as a false_count is not enough to get uncommon trap on corresponding branch? src/hotspot/share/opto/doCall.cpp: bool Compile::should_delay_vector_inlining(ciMethod* call_method, JVMState* jvms) { - return UseVectorApiIntrinsics && call_method->is_vector_method(); + return UseVectorApiIntrinsics && call_method->is_vector_method() && + call_method->intrinsic_id() != vmIntrinsics::_ExpectTrue; Why do you need to delay inlining for it? src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X-VectorBits.java.template: - if (ax >= 0 && ax + LENGTH <= a.length) { + if (expectTrue(ax >= 0 && ax + LENGTH <= a.length)) { Why don't you use VectorIntrinsics.checkIndex() here? Best regards, Vladimir Ivanov On 10/05/2019 09:41, Joshua Zhu (Arm Technology China) wrote: > Hi Vladimir, > >> It looks promising to introduce a variant of >> VectorIntrinsics.checkIndex() which is used to guard a fast path and is >> annotated with a JIT-compiler hint (akin to >> java.lang.invoke.MethodHandleImpl.profileBoolean() [1], but without >> profiling logic) to override bytecode profiling info, so JIT always puts an >> uncommon trap on the false branch. > > Thanks for your comments. > As you suggested, I introduced VectorIntrinsics.expectTrue() in change [2]. > It's used as below: > if (expectTrue(bool condition)) { > // fast path > } else { > // slow path: uncommon trap > } > > I also wrote a jmh case [3] to check the performance. 
> See below table for jmh test results. (In Throughput Mode, Unit: ops/ms)
>
>                               Base               without expectTrue (patch [1])   UncommonTrap (patch [2])
> 1000 fastPath                 318.228 ± 22.588   457.967 ± 12.622                 457.328 ± 11.932
> 10000 fastPath                 21.991 ±  2.496    23.360 ±  0.070                  24.744 ±  0.213
> 100000 fastPath                 1.613 ±  0.007     1.581 ±  0.031                   1.631 ±  0.003
> 1000 fastPath + 1 slowPath    N/A                 57.298 ± 11.033                  55.845 ±  0.716
> 10000 fastPath + 1 slowPath   N/A                  4.537 ±  0.536                  15.164 ±  0.098
> 100000 fastPath + 1 slowPath  N/A                  0.577 ±  0.048                   1.564 ±  0.005
>
> [1] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/webrev.01/ > [2] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/uncommontrap.webrev.00/ > [3] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/IntVectorJmhTest.java > > Please help review and feel free to share your comments. > Thanks. > > Best Regards, > Joshua > From jbvernee at xs4all.nl Thu May 30 16:09:48 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 30 May 2019 18:09:48 +0200 Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again) In-Reply-To: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> Message-ID: <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> Looks good to me. I wanted to ask about the addition of the APIs to sun.misc.Unsafe last time, but got side-tracked by the test build error and then forgot :/. I had assumed sun.misc.Unsafe would not be updated going forward? What's the policy on this? Thanks, Jorn Maurizio Cimadamore schreef op 2019-05-30 16:59: > Hi, > small followup to yesterday's push. > > As I was benchmarking bulk copy performances, I realized that there's > a bug in the methods which I added yesterday on sun.misc.Unsafe - > which delegate to the wrong Unsafe (itself), resulting in SO.
> > I also cleaned up uses of Unsafe internally - and added a @ForceInline > on MemoryAddressImpl::copy which basically makes it as fast as a raw > unsafe call to Unsafe::copyMemory. We're still investigating as to why > exactly the annotation is needed to get good inlining. > > I've also added a shortcircuit in AbstractMemoryScopeImpl::checkAlive, > which avoids a call to 'isAlive' if the scope is pinned. This was > causing some profile pollution (although performances were still good > - probably because of the bi-morphic inline cache). > > Webrev: > http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup/ > > Maurizio From john.r.rose at oracle.com Thu May 30 16:35:50 2019 From: john.r.rose at oracle.com (John Rose) Date: Thu, 30 May 2019 09:35:50 -0700 Subject: [vectorIntrinsics] [PATCH] Elemental shifts and rotates speedup In-Reply-To: References: Message-ID: <72A0B34C-65E2-4F53-B474-FC2F30C75FF2@oracle.com> I looked at the Java changes. They look good. (We need a way to tell webrev to exclude those generated files. They bulk up the webrev but don't add much information.) I am in the process of "shuffling" the shuffle API a bit, but nothing you've done invalidates my assumptions. I'm glad to see more use of shuffles. I think we want byte-level shuffles to implement byte-swapping (lane-wise Integer.reverseBytes), when there is no instruction that's faster. Perhaps you could look into that. I have code sketched out that uses ByteOrder arguments to gate an application of a bytewise shuffle when byte order != native order. Someone else should look at the JIT work. ? John On May 15, 2019, at 7:41 PM, Bhateja, Jatin wrote: > > > Hi All, > > Please find a patch having following changes:- > > A) Intrinsification of two vector APIs: > > 1) VectorShuffle.shuffleIota(VectorSpecies, int) > > 2) VectorShuffle.toVector() > > B) Re-implimentation of following vector APIs using above intrinsified APIs. 
> > 1) Vector.shiftLanesLeft(int) > > 2) Vector.shiftLanesRight(int) > > 3) Vector.rotateLanesLeft(int) > > 4) Vector.rotateLanesRight(int) > > With this we see around ~2X gains in elemental shifts and rotate operations. > Webrev: http://cr.openjdk.java.net/~kkharbas/Jatin/rotate_and_shift_lanes/webrev.00/ > > Kindly review the patch. > > Best Regards, > Jatin > > From john.r.rose at oracle.com Thu May 30 16:41:43 2019 From: john.r.rose at oracle.com (John Rose) Date: Thu, 30 May 2019 09:41:43 -0700 Subject: [vectorIntrinsics] [PATCH] Elemental shifts and rotates speedup In-Reply-To: <72A0B34C-65E2-4F53-B474-FC2F30C75FF2@oracle.com> References: <72A0B34C-65E2-4F53-B474-FC2F30C75FF2@oracle.com> Message-ID: On May 30, 2019, at 9:35 AM, John Rose wrote: > > I looked at the Java changes. They look good. One more comment at the API level. I have been thinking about the Java rules for <>n, versus our vector-level shifting operations, and I think the Java rules are not for the best here. So in Java x< References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> Message-ID: <24e172d6-fdf0-7bc4-7174-4e1a350a0511@oracle.com> New revision: http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup_v2/ I finally got to the bottom of the issue - in part the problem is caused by the fact that MemoryAddessImpl::copy is too big and doesn't get inlined into the caller. But, another problem was that we were doing redundant liveness checks - since the check was called explicitly in MemoryAddressImpl::copy, but also, indirectly, in MemoryAddressImpl::checkAccess. I now eliminated these calls, and moved all liveness check to MemorySegmentImpl::checkRange. I've also added a missing call to checkAlive in MemorySegmentImpl::resize. With these changes the compiled method is still too big, but even w/o the ForceInline annotation, performances are good (and same as with the annotation). 
Maurizio On 30/05/2019 17:09, Jorn Vernee wrote: > Looks good to me. > > I wanted to ask about the addition of the APIs to sun.misc.Unsafe last > time, but got side-tracked by the test build error and then forgot :/. > I had assumed sun.misc.Unsafe would not be updated going forward? > What's the policy on this? > > Thanks, > Jorn > > Maurizio Cimadamore schreef op 2019-05-30 16:59: >> Hi, >> small followup to yesterday's push. >> >> As I was benchmarking bulk copy performances, I realized that there's >> a bug in the methods which I added yesterday on sun.misc.Unsafe - >> which delegate to the wrong Unsafe (itself), resulting in SO. >> >> I also cleaned up uses of Unsafe internally - and added a @ForceInline >> on MemoryAddressImpl::copy which basically makes it as fast as a raw >> unsafe call to Unsafe::copyMemory. We're still investigating as to why >> exactly the annotation is needed to get good inlining. >> >> I've also added a shortcircuit in AbstractMemoryScopeImpl::checkAlive, >> which avoids a call to 'isAlive' if the scope is pinned. This was >> causing some profile pollution (although performances were still good >> - probably because of the bi-morphic inline cache). >> >> Webrev: >> http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup/ >> >> Maurizio From maurizio.cimadamore at oracle.com Thu May 30 17:12:31 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 30 May 2019 18:12:31 +0100 Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again) In-Reply-To: <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> Message-ID: <95270975-79d3-9067-de69-6aa9edfb7f57@oracle.com> My understanding is that sun.misc.Unsafe is available via the jdk.unsupported module - which people can rely upon when writing modular code. 
The jdk.internal Unsafe is just not exported, so you will have to burst through module boundary checks in order to access it from outside the JDK. Maurizio On 30/05/2019 17:09, Jorn Vernee wrote: > I wanted to ask about the addition of the APIs to sun.misc.Unsafe last > time, but got side-tracked by the test build error and then forgot :/. > I had assumed sun.misc.Unsafe would not be updated going forward? > What's the policy on this? From john.r.rose at oracle.com Thu May 30 17:48:06 2019 From: john.r.rose at oracle.com (John Rose) Date: Thu, 30 May 2019 10:48:06 -0700 Subject: [vectorIntrinsics] reinterpret vs. reshape vs. cast Message-ID: <356400A7-8AFB-4236-A5C1-916E5D33E09B@oracle.com> I have been thinking about the API points that convert vectors between the various species. The existing points are a good "stab" at what are the basics, but they need reconsideration, especially now that we have decent implementations and know (more directly) what are the underlying "physics" of the Vector API. So, I want to get rid of reshape, merging it into reinterpret. The two methods are not different enough (AFAICT) to warrant parallel implementations. In a private version of the branch, I have rewritten reshape as an alias of reinterpret, but without the extra variable:

    public abstract Vector reinterpret(VectorSpecies s);

    /** Use reinterpret. */
    @Deprecated
    public final Vector reshape(VectorSpecies s) {
        s.check(elementType()); // verify same E
        return reinterpret(s);
    }

After various constant-folding operations, it still comes out the same, as a call to VI.reinterpret. The "check" call is an extra runtime check which ensures that, in fact, the species has the element type E as claimed by the static type system. Because Java allows unchecked casts (and we use them) I've sprinkled such "check" calls wherever I think the static type system might need a little run-time help. So much for reshape.
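To make the role of such a "check" call concrete, here is a minimal sketch, not the Vector API code itself. It assumes a hypothetical `Species` class that records its lane type at run time; the names `Species`, `check` and `mislabeledByteSpecies` are invented for illustration only.

```java
final class SpeciesCheckSketch {
    /** Hypothetical stand-in for a species that remembers its lane type at run time. */
    static final class Species<E> {
        private final Class<E> elementType;

        Species(Class<E> elementType) {
            this.elementType = elementType;
        }

        /** Returns this species typed as Species<F>, or fails fast on a mismatch. */
        <F> Species<F> check(Class<F> expected) {
            if (elementType != expected) {
                throw new ClassCastException("species of " + elementType.getSimpleName()
                        + " used where " + expected.getSimpleName() + " was claimed");
            }
            @SuppressWarnings("unchecked")
            Species<F> checked = (Species<F>) this; // safe: verified just above
            return checked;
        }
    }

    /** Simulates an unchecked cast made elsewhere: the static type is wrong. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Species<Integer> mislabeledByteSpecies() {
        Species raw = new Species<Byte>(Byte.class);
        return raw; // compiles (unchecked), but the element type is really Byte
    }

    public static void main(String[] args) {
        Species<Integer> honest = new Species<>(Integer.class);
        honest.check(Integer.class); // passes silently

        try {
            mislabeledByteSpecies().check(Integer.class);
        } catch (ClassCastException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the mismatch is reported at the call that relies on the element type, rather than surfacing later as a confusing failure far from the unchecked cast.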
Now let's talk about the hard problem at the center of all of this: The unpredictable resizing of vectors. Most vector workloads choose a key vector size and use it for nearly all operations. Hardware sometimes prefers to work with a single size at a time, and human beings prefer to reason about constant information contents, rather than having to re-derive a size for every vector sub-expression. This means I think the Vector API should have a clearer policy of shape-invariance. If you start with shape S_128_BIT and do a bunch of vector operations, you should end up with the same shape, unless you intentionally select an operation that is documented to change a shape. This is all the more important in portable shape-agnostic code, where you don't know the size of the preferred shape. Having that suddenly change to a non-preferred shape would be a headache. Therefore, I want to make "shape-changing" methods a special category, that is clearly called out in the javadoc, and easy for the user to recognize in the user's code. What would non-reshaping methods be? Well, anything lane-wise that preserves the element type (ETYPE) is obviously shape invariant. Also shuffles, which are not lane-wise but keep VLENGTH and ETYPE constant, are shape invariant. Operations with masks and shuffles are pretty much always shape invariant. In-place reinterpret casts are shape invariant, even as they completely redraw lane types and lane boundaries. Now we come to lane-wise value casts. These are sometimes indispensable but will change the shape of the vector if (as is often the case) the cast changes the bit-size of the ETYPE. The implementation code for this (in the JIT) shows how disruptive this is to implement. And I think it's equally disruptive to users. What happens if you need to convert from byte to int, but your preferred byte species cannot scale upward by 4x to an int species of another supported shape?
The API requires you to find out in advance whether the larger shape exists to hold the int species for the cast result. If you can't find one, you need to radically recast your computation. This is a portability anti-pattern. Here's what I think is a better way, more portable, easier to implement, and easier for users to manage: Introduce a cast operation which is shape invariant. Something like this:

    /** Converts this vector lane-wise to a new vector
     *  of a lane-type F, keeping underlying shape constant.
     *  ...
     */
    Vector convert(Conversion conv, int part);

Rather than rely on Class to denote the conversion implicitly, the user selects a specific conversion operation from a suitable repertoire (TBD). The conversion "knows" its domain and range types, and therefore also "knows" whether it will expand or contract the vector. Byte to int expands, while int to byte contracts, and so on. (Several ISAs refer to this phenomenon as "unpack" and "pack". There's also "zip" and "unzip" in SVE which has a related function, and I think AVX has two-vector shuffles that can do the same. Zip could be useful for zero-filling or sign-filling expansion, while unzip could be useful for extraction. There are also mask driven, variable motion, APL-like compress and expand operations on the table which are at least related and maybe useful as implementation tools. There's lots more to be said about implementation, but we can defer that for later.) When a conversion is expanding, in order to retain shape invariance, it is necessary for the conversion to produce 2 (or 4 or 8) output vectors. We can try to hide this fact in the name of simplicity for the user, but it just makes the code shape-shifty, which (IMO) hurts the user's ability to reason about the rest of the code. (If there are intrinsic or synthetic shapes that can hold all of the bits, that's fine, but it's still a shape-shifting operation, which I propose we avoid in today's design.
When we add synthetic multi-vectors, after Valhalla, we can re-introduce shape-shifting code that is portable. But we can't do that today if the number of shapes is a dynamic property. We need synthetic multi-vector shapes to build out a shape-fluid user experience. Can't do that today.) OK, so we have a byte-to-int cast that expands from one input vector to four output vectors. How does the user keep them all straight? I think a good answer is to add the "part" parameter, noted above. The part parameter is present in all lane-wise shape changing operations. Zero is always a valid argument. For an operation which expands by a factor of N, the valid range of part numbers is [0..N-1]. The meaning is simple: It selects which "part" of the output to return. It is *not* a lane index (and I don't want to generalize in that direction). A user doing byte-to-int conversion will simply know that there are four parts to deal with, and work that into the algorithm. (There are at least three ways to deal with parts: Use part 0 only, which means the input vector only uses 25% of its lanes. You load 25% of the input bytes at a time and then expand part 0 to a 100%-sized vector of the same shape. Or, use a little 4-way loop over the parts, disposing of each part separately. Or, unroll the little loop by hand, using 4 temps for parts 0..3. Depends on the application.) If the part number is out of range, you get an array index exception. The pseudo-code of the reference implementation can pretend that the conversion operation produces a tiny array of 4 output vectors, and then the part number is an index into that array. If the conversion doesn't change shape, then zero is the only part number you can ever use. That could be defaulted, but it's not worth the extra overloaded API point IMO. Now, if the conversion contracts, we don't strictly need a part number; we can use a convention which is that the output bits are placed at the beginning of the output vector.
But we also want a part number here. Again, zero means "just throw away everything except the beginning of the computation." But non-zero part numbers mean "place the output into another part of the output vector." Why would we do this? Because conversions sometimes come in pairs, and we need to be able to track lane values through multiple conversions, sometimes. To do this sanely, I propose that contracting operations have a part number parameter which "steers" the lane values in a way compatible with the inverse conversion, with the same part number. This means that contracting with a non-zero part means that the output is placed in a (zero-filled) vector at lane VLENGTH*part/N, where N is the contraction factor. This means that immediately following with the inverse conversion, using the same part number, will reproduce the original input. That makes these methods much easier to reason about. (Maybe a contracting part number should be either negative or zero, to provide an extra error check. There's no burden on the user in adding a minus sign to the method call, and it will better document what's going on.) I think this "part" idea has legs, and provides a decent way to deal with multi-part results. A beneficial side effect of keeping shape invariance as a principle is that we can concentrate the JIT code on using one register type at a time, and do the fancy footwork for operations like byte to int conversion (with size changes) in Java code where it belongs, rather than in JIT code. I hope to retire the existing cast intrinsic, which is a highly complex 5-phase instruction selection problem, replacing it with a suite of smaller, more flexible primitives, some to do shape changing without conversion and others to do conversion without shape changing.
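A minimal scalar sketch of the "part" convention for the byte-to-int case, with plain arrays standing in for vectors. The names (PartModel, expand, contract) are hypothetical and not part of any proposed API; the factor of 4, the array index exception for a bad part, and the origin lane at VLENGTH*part/N are taken from the description in this message. It only illustrates that contracting with the same part number puts the selected block back where it came from.

```java
import java.util.Arrays;

/** Scalar model (an assumption, not the real API) of shape-invariant conversion with parts. */
final class PartModel {
    static final int N = 4; // byte -> int expands by 4; int -> byte contracts by 4

    private static void checkPart(int part) {
        if (part < 0 || part >= N) {
            throw new ArrayIndexOutOfBoundsException(part);
        }
    }

    /** Expanding byte -> int: VLENGTH byte lanes in, VLENGTH/N int lanes out. */
    static int[] expand(byte[] in, int part) {
        checkPart(part);
        int block = in.length / N;   // lanes actually converted
        int origin = part * block;   // first input lane of the selected part
        int[] out = new int[block];
        for (int i = 0; i < block; i++) {
            out[i] = in[origin + i]; // sign-extending widening conversion
        }
        return out;
    }

    /** Contracting int -> byte: results steered into a zero-filled vector at the part's origin. */
    static byte[] contract(int[] in, int part) {
        checkPart(part);
        int block = in.length;
        int origin = part * block;             // same as VLENGTH*part/N for the output
        byte[] out = new byte[block * N];      // Java arrays start zero-filled
        for (int i = 0; i < block; i++) {
            out[origin + i] = (byte) in[i];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] v = new byte[16];
        for (int i = 0; i < v.length; i++) {
            v[i] = (byte) (i + 1);
        }
        // The "little 4-way loop over the parts" mentioned in the text.
        for (int part = 0; part < N; part++) {
            int[] wide = expand(v, part);
            byte[] back = contract(wide, part);
            System.out.println(part + ": " + Arrays.toString(wide) + " -> " + Arrays.toString(back));
        }
    }
}
```

OR-ing together the four contracted results would rebuild the full input vector, which is the merge step a real implementation would do lane-wise.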
John From jbvernee at xs4all.nl Thu May 30 18:10:34 2019 From: jbvernee at xs4all.nl (Jorn Vernee) Date: Thu, 30 May 2019 20:10:34 +0200 Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again) In-Reply-To: <95270975-79d3-9067-de69-6aa9edfb7f57@oracle.com> References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> <95270975-79d3-9067-de69-6aa9edfb7f57@oracle.com> Message-ID: <734811485fdcc255dd379a3dda30fe72@xs4all.nl> Thanks, I understood it so far. But, I was wondering if new APIs should be added to sun.misc.Unsafe as well. sun.misc.Unsafe didn't for instance get the rename from get/setObjectXXX to get/setReferenceXXX either. From what I understood, sun.misc.Unsafe is only there to allow code that relied on it previously to keep working under Java 9+, since there is currently no public alternative for some of the functionality. But, in general we'd want to discourage use of sun.misc.Unsafe since it's an internal API. Jorn Maurizio Cimadamore schreef op 2019-05-30 19:12: > My understanding is that sun.misc.Unsafe is available via the > jdk.unsupported module - which people can rely upon when writing > modular code. The jdk.internal Unsafe is just not exported, so you > will have to burst through module boundary checks in order to access > it from outside the JDK. > > Maurizio > > On 30/05/2019 17:09, Jorn Vernee wrote: >> I wanted to ask about the addition of the APIs to sun.misc.Unsafe last >> time, but got side-tracked by the test build error and then forgot :/. >> I had assumed sun.misc.Unsafe would not be updated going forward? >> What's the policy on this? 
From john.r.rose at oracle.com Thu May 30 18:36:37 2019 From: john.r.rose at oracle.com (John Rose) Date: Thu, 30 May 2019 11:36:37 -0700 Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes In-Reply-To: References: Message-ID: On May 30, 2019, at 8:42 AM, Vladimir Ivanov wrote: > > Kishor did some experiments with masked memory operations, so I'd like to hear his thoughts on the patch. I have a sketch of masked memory operations also. The non-gather reads are all unmasked and apply masking after the fact using blend. I have an accurate mask-sensitive range-check operation also. The spec simply says "if any lane N corresponds to an OOB array element, you get an exception". Nothing global like "offset >= 0" is needed or (IMO) wanted. That's all I have right now. My versions of the masked stores all go scalar (slow). For masked memory operations I think we should consider using a 2-axis scatter, where one axis selects an array and the other axis selects an element in the array. This is (as I have said in the past) an inherently useful addressing form. And for masked stores, you simply set up a small bit-bucket array, and route all unset lanes through a gather/scatter index that addresses an element of the bit-bucket array.
Maybe something like this:

  public void intoArray($type$[][] bases, IntVector axis1, IntVector axis2);
  // stores this[N] into bases[i][j] where
  // i = axis1[N] and j = axis2[N]

Or maybe with extra indirections:

  public void intoArray($type$[][] bases,
                        IntVector axis1, int[][] baseMap, int baseMapOffset,
                        IntVector axis2, int[] indexMap, int indexMapOffset);

Then a masked scatter can use a bit-bucket:

  $type$[][] basesPlus = Arrays.copyOf(bases, bases.length+1);
  $type$[] bitBucket = new $type$[v.length()];
  basesPlus[bases.length] = bitBucket;
  IntVector axis1Plus = broadcast(bases.length).blend(axis1, m);
  IntVector axis2Plus = broadcast(0).blend(axis2, m);

The masked store (non-scatter) is just this:

  $type$[] bitBucket = new $type$[v.length()];
  $type$[][] bases = { a, bitBucket };
  IntVector axis1 = broadcast(1).blend(broadcast(0), m);
  IntVector axis2 = broadcast(0).blend(indexes, m);

Underneath the safe 2-axis array store would be an unsafe vectorized addressing mode:

  private void intoArray(Object bases, LongVector axis1, LongVector axis2);
  // stores this[N] into unsafe(oop,j) where
  // oop = unsafe(bases,i), i = axis1[N],
  // j = axis2[N], and unsafe(b,n) is an
  // appropriately typed unsafe memory
  // reference at base b and long offset n.

That would be the compiler intrinsic. It must operate atomically with respect to the GC. During the internal states of the intrinsic, the vector unit will manipulate a vector of oops, as the intermediate result of unsafe(bases,i) which is input to unsafe(oop,j).
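To make the bit-bucket trick concrete, here is a scalar sketch in plain Java, with int arrays standing in for vectors and a boolean array for the mask. The class and method names (BitBucketStore, scatter2, maskedStore) are hypothetical; the routing of unset lanes to element 0 of a scratch array follows the blend expressions in this message. It is a model of the idea, not of any intrinsic.

```java
import java.util.Arrays;

/** Scalar model (hypothetical names) of a masked store built on a 2-axis scatter. */
final class BitBucketStore {

    /** 2-axis scatter: values[n] is stored into bases[axis1[n]][axis2[n]]. */
    static void scatter2(int[] values, int[][] bases, int[] axis1, int[] axis2) {
        for (int n = 0; n < values.length; n++) {
            // Ordinary Java bounds checks apply to every lane, set or unset.
            bases[axis1[n]][axis2[n]] = values[n];
        }
    }

    /** Masked store of values into a[] at indexes[]; unset lanes land in a bit bucket. */
    static void maskedStore(int[] values, int[] a, int[] indexes, boolean[] m) {
        int[] bitBucket = new int[values.length];
        int[][] bases = { a, bitBucket };
        int[] axis1 = new int[values.length];
        int[] axis2 = new int[values.length];
        for (int n = 0; n < values.length; n++) {
            axis1[n] = m[n] ? 0 : 1;          // broadcast(1).blend(broadcast(0), m)
            axis2[n] = m[n] ? indexes[n] : 0; // broadcast(0).blend(indexes, m)
        }
        scatter2(values, bases, axis1, axis2);
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        int[] values = {10, 20, 30, 40};
        int[] indexes = {2, 3, 4, 5};               // lanes 2 and 3 would be out of bounds
        boolean[] m = {true, true, false, false};   // but they are masked off
        maskedStore(values, a, indexes, m);
        System.out.println(Arrays.toString(a));
    }
}
```

Because unset lanes never use their original index, an out-of-bounds index in a masked-off lane cannot raise an exception, which is the property the thread is after.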
John From maurizio.cimadamore at oracle.com Thu May 30 20:08:54 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 30 May 2019 21:08:54 +0100 Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again) In-Reply-To: <734811485fdcc255dd379a3dda30fe72@xs4all.nl> References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> <95270975-79d3-9067-de69-6aa9edfb7f57@oracle.com> <734811485fdcc255dd379a3dda30fe72@xs4all.nl> Message-ID: <1ce9a104-00fd-37cf-de58-03d7906f56c0@oracle.com> The way I see this, is that we need some place where to put potentially unsafe API points. jdk.internal is not an acceptable answer, as that's clearly a non-exported module. sun.misc.Unsafe is not great either, as that's something that we want to discourage - but it's the only vehicle we've got right now, so I'm not too worried about having it there, since this is still a prototype. Moving forward I have some ideas on how to do this more cleanly (clearly this is a problem that will show up again, and magnified, once we start talking SystemABI). The basic idea is to put the unsafe parts of the API in a separate module (e.g. jdk.foreign.unsafe) and mark that module as 'non resolvable by default', which means that developers can only 'see' that module if they pass an explicit flag to javac/jvm (--add-modules). This seems like a robust way to add unsafe functionalities, as there will be a clear opt-in from the user. Whether this is a viable alternative, time will tell - we're in the process of discussing these points internally. Maurizio On 30/05/2019 19:10, Jorn Vernee wrote: > Thanks, I understood it so far. But, I was wondering if new APIs > should be added to sun.misc.Unsafe as well. sun.misc.Unsafe didn't for > instance get the rename from get/setObjectXXX to get/setReferenceXXX > either. 
> > From what I understood, sun.misc.Unsafe is only there to allow code > that relied on it previously to keep working under Java 9+, since > there is currently no public alternative for some of the > functionality. But, in general we'd want to discourage use of > sun.misc.Unsafe since it's an internal API. > > Jorn > > Maurizio Cimadamore wrote on 2019-05-30 19:12: >> My understanding is that sun.misc.Unsafe is available via the >> jdk.unsupported module - which people can rely upon when writing >> modular code. The jdk.internal Unsafe is just not exported, so you >> will have to burst through module boundary checks in order to access >> it from outside the JDK. >> >> Maurizio >> >> On 30/05/2019 17:09, Jorn Vernee wrote: >>> I wanted to ask about the addition of the APIs to sun.misc.Unsafe >>> last time, but got side-tracked by the test build error and then >>> forgot :/. I had assumed sun.misc.Unsafe would not be updated going >>> forward? What's the policy on this? From kishor.kharbas at intel.com Thu May 30 20:34:33 2019 From: kishor.kharbas at intel.com (Kharbas, Kishor) Date: Thu, 30 May 2019 20:34:33 +0000 Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes In-Reply-To: References: Message-ID: Hi Joshua, Thanks for the patch. Here is the performance of the jmh benchmark on Intel CLX. Results with slow path are as expected - having an uncommon trap over a regular check does give a performance benefit. I am not sure why there is a performance gain in the fast path (for the 1000 case). The same is observed in ARM's case. Do you know why?

Results on Intel CLX:

                              Base               without expectTrue (patch [1])   UncommonTrap (patch [2])
1000 fastPath                 416.251 ± 3.086    538.024 ± 3.415                  538.871 ± 2.779
10000 fastPath                29.426 ± 3.625     30.389 ± 1.389                   30.673 ± 0.354
100000 fastPath               2.018 ± 0.060      2.013 ± 0.005                    2.045 ± 0.063
1000 fastPath + 1 slowPath    N/A                70.835 ± 10.902                  72.236 ± 0.797
10000 fastPath + 1 slowPath   N/A                6.512 ± 0.466                    19.276 ± 1.769
100000 fastPath + 1 slowPath  N/A                0.756 ± 0.016                    1.955 ± 0.017

Thanks, Kishor > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Thursday, May 30, 2019 8:43 AM > To: Joshua Zhu (Arm Technology China) ; panama- > dev at openjdk.java.net > Cc: nd ; Kharbas, Kishor > Subject: Re: [vector] RFR 8221816: IndexOutOfBoundsException for > fromArray/intoArray with unset mask lanes - was: RE: > IndexOutOfBoundsException with unset mask lanes > > Nice work, Joshua! I like how it shapes out. > > Good to see some performance numbers as well. > > Kishor did some experiments with masked memory operations, so I'd like to > hear his thoughts on the patch. > > Overall, I'm happy with the change as it is now. It solves the correctness issue > without sacrificing performance in most of the cases. > > The only important case which suffers is "masked loops" - when masked > kernel is used to avoid explicit loop tail processing. Such loops will end with a > deopt most of the time (unless loop tail is empty). So, as a next step, I'd like > to see that case to be addressed, so such loops don't trigger uncommon trap. > > In a longer term, it's worth looking into specific optimizations range checks > for masked accesses. > > Some comments about the code changes: > > src/hotspot/share/opto/opaquenode.hpp: > > You have some changes in ProfileBooleanNode. Why providing 0 as a > false_count is not enough to get uncommon trap on corresponding branch? > > > src/hotspot/share/opto/doCall.cpp: > > bool Compile::should_delay_vector_inlining(ciMethod* call_method, > JVMState* jvms) { > - return UseVectorApiIntrinsics && call_method->is_vector_method(); > + return UseVectorApiIntrinsics && call_method->is_vector_method() && > + call_method->intrinsic_id() != vmIntrinsics::_ExpectTrue; > > Why do you need to delay inlining for it?
> > > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X- > VectorBits.java.template: > > - if (ax >= 0 && ax + LENGTH <= a.length) { > + if (expectTrue(ax >= 0 && ax + LENGTH <= a.length)) { > > Why don't you use VectorIntrinsics.checkIndex() here? > > Best regards, > Vladimir Ivanov > > On 10/05/2019 09:41, Joshua Zhu (Arm Technology China) wrote: > > Hi Vladimir, > > > >> It looks promising to introduce a variant of > >> VectorIntrinsics.checkIndex() which is used to guard a fast path and is > >> annotated with a JIT-compiler hint (akin to > >> java.lang.invoke.MethodHandleImpl.profileBoolean() [1], but without > >> profiling logic) to override bytecode profiling info, so JIT always puts an > >> uncommon trap on the false branch. > > > > Thanks for your comments. > > As you suggested, I introduced VectorIntrinsics.expectTrue() in change [2]. > > It's used as below: > > if (expectTrue(bool condition)) { > > // fast path > > } else { > > // slow path: uncommon trap > > } > > > > I also wrote a jmh case [3] to check the performance. > > See below table for jmh test results. (In Throughput Mode, Unit: ops/ms) > > Base without expectTrue (patch [1]) UncommonTrap > (patch [2]) > > 1000 fastPath 318.228 ? 22.588 457.967 ? 12.622 457.328 ? > 11.932 > > 10000 fastPath 21.991 ? 2.496 23.360 ? 0.070 24.744 ? > 0.213 > > 100000 fastPath 1.613 ? 0.007 1.581 ? 0.031 1.631 ? 0.003 > > 1000 fastPath + 1 slowPath N/A 57.298 ? 11.033 55.845 ? > 0.716 > > 10000 fastPath + 1 slowPath N/A 4.537 ? 0.536 15.164 ? > 0.098 > > 100000 fastPath + 1 slowPath N/A 0.577 ? 0.048 1.564 ? > 0.005 > > > > [1] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/webrev.01/ > > [2] > http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/uncommontrap.w > ebrev.00/ > > [3] > http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/IntVectorJmhTest > .java > > > > Please help review and feel free to share your comments. > > Thanks. 
> > > > Best Regards, > > Joshua > > From john.r.rose at oracle.com Fri May 31 07:26:00 2019 From: john.r.rose at oracle.com (John Rose) Date: Fri, 31 May 2019 00:26:00 -0700 Subject: [vectorIntrinsics] conversions and partial results Message-ID: I just wrote the following javadoc for Vector, concerning the difficult problem of expanding and contracting conversion operations. (It may be too much. We should let the CSR reviewers help us to cut it down. For now more is better.) Does this seem like a good direction to go with these sorts of conversions? Is there a better one? John ----------- + * A lane-wise conversion operation takes one input vector, + * distributing a unary scalar conversion operator across the lanes, + * and produces a resulting vector of the converted values, or as + * many of them as can be fit into the required shape. + * + *

Unlike other lane-wise operations, conversions can change lane + * type, from the input "domain" type to the output "range" type. The + * lane size may change along with the type. In order to manage the + * size changes, lane-wise conversion methods can produce partial + * results, under the control of a {@code part} parameter, which + * is {@linkplain Vector.html#conversions explained elsewhere}. + * + *

The following pseudocode expresses the behavior of this operation + * category, including the handling of partial results: + * + *

{@code
+ * ETYPE2 scalar_conversion_op(ETYPE s);
+ * EVector a = ...;
+ * int part = ...;
+ * VectorSpecies dom = a.species();
+ * VectorSpecies ran = dom.withLanes(ETYPE2.class);
+ * assert dom.vectorShape() == ran.vectorShape();
+ * int domlen = dom.vectorLength();
+ * int ranlen = ran.vectorLength();
+ * ETYPE2[] ar = new ETYPE2[ran.vectorLength()];
+ * if (domlen == ranlen) { // in-place
+ *     assert part == 0; 
+ *     assert dom.elementSize() == ran.elementSize();
+ *     for (int i = 0; i < ranlen; i++) {
+ *         ar[i] = scalar_conversion_op(a.lane(i));
+ *     }
+ * } else if (domlen > ranlen) { // expanding
+ *     assert ran.elementSize() > dom.elementSize();
+ *     int origin = decodePartForExpand(dom, ran, part);
+ *     for (int i = 0; i < ranlen; i++) {
+ *         ETYPE s = a.lane(origin + i);
+ *         ar[i] = scalar_conversion_op(s);
+ *     }
+ * } else { // (domlen < ranlen) // contracting
+ *     assert ran.elementSize() < dom.elementSize();
+ *     int origin = decodePartForContract(dom, ran, part);
+ *     for (int i = 0; i < domlen; i++) {
+ *         ETYPE s = a.lane(i);
+ *         ar[origin + i] = scalar_conversion_op(s);
+ *     }
+ * }
+ * E2Vector r = E2Vector.fromArray(ran, ar, 0);
+ * }
+ * ----------- *

Conversions and Partial Results

* This API provides a set of lane-wise conversion operators. * They are described by constants of type * {@link VectorOperators.Conversion}, which are passed as * arguments to the * {@link Vector#convert(VectorOperators.Conversion,int) Vector.convert()} * method. * *

Every conversion operator has a specified * {@linkplain VectorOperators.Conversion#domainType() domain type} and * {@linkplain VectorOperators.Conversion#rangeType() range type}, * which exactly match the lane types of the input and output * vectors. * *

A conversion operator is classified as (respectively) in-place, * expanding, or contracting, depending on whether the bit-size of its * domain type is (respectively) equal, less than, or greater than the * bit-size of its range type. * * An expanding conversion, such as {@code short} to {@code long}, * takes a scalar value and represents it in a larger format (always * with some information redundancy). A contracting conversion, such * as {@code double} to {@code float}, takes a scalar value and * represents it in a smaller format (always with some information * loss). Some in-place conversions may also include information * loss, such as with conversions between {@code long} and {@code * double}, and also {@code int} and {@code float}. Expanding * conversions never "lose bits", but they may sometimes disturb the * sign of a value, if a domain or range is unsigned. * *

This classification is important, because, unless otherwise * documented, conversion operations never change vector * shape, regardless of how they may change lane sizes. * * Therefore an expanding conversion cannot store all of its * results in its output vector, because the output vector has fewer * lanes of larger size, in order to have the same overall bit-size as * its input. * * Likewise, a contracting conversion must store its relatively small * results into a subset of the lanes of the output vector, defaulting * the unused lanes to zero. * * In all cases, the number of lane values actually computed by a * conversion of any kind is the smaller {@code VLENGTH} of the input * and output vectors. We will call this important number {@code B}, * the block size of the conversion. If you need more than {@code B} * values from a vector conversion operation, you must run the * operation more than once. * *

Expanding and contracting conversions are further characterized * by a factor {@code M} which is the (integer) ratio of the domain * and range type sizes. Since all element sizes are currently powers * of two, one size always divides the other. In fact, {@code M} is * {@code 2}, {@code 4}, or {@code 8}. * * As an example, a conversion from {@code byte} to {@code long} * ({@code M=8}) will discard 87.5% of the input values in order to * convert the remaining 12.5% into the roomy {@code long} lanes of * the output vector. The inverse conversion will convert back all of * the large results, but will waste 87.5% of the lanes in the output * vector. * * Only in-place conversions ({@code M=1}) deliver all of * their results in one output vector, without wasting lanes. * *

Given the ratio {@code M}, a second parameter called the block * size {@code B} is derived from the {@code VSIZE} of the * *

To help manage the multiple outputs of expanding conversions, * and merge the multiple inputs of the inverse contracting * conversions, the conversion methods feature an additional parameter * called {@code part}, which selects partial results from expansions, * and also steers the results of contractions in the opposite * direction. The value {@code part} is processed as follows * for each kind of conversion: * *

    *
  • expanding by {@code M}: {@code part} must be in the range * {@code [0..M-1]}, and selects the block of {@code B} input lanes * starting at the origin lane at {@code part*VLENGTH/M}, * where {@code VLENGTH} is the length of the input. * *
  • contracting by {@code M}: {@code part} must be in the range * {@code [0..M-1]}, and steers all {@code B} input lanes into * the output located at the origin lane {@code part*VLENGTH/M}, * where {@code VLENGTH} is the length of the output. * *
  • in-place ({@code M=1}): {@code part} must be zero. * The {@code VLENGTH} of both vectors is {@code B}, and the * origin lane value is always the first lane. * *
* *
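As a worked check of the arithmetic in the list, the following sketch computes the ratio M, the block size B, and the origin lane for a given shape and pair of lane sizes. The class ConversionGeometry is hypothetical, and it assumes the origin lane is part*B (that is, part*VLENGTH/M measured on the longer of the two vectors), which matches the earlier description of placing results at lane VLENGTH*part/N.

```java
/** Hypothetical helper: derives M, B and origin lanes from bit sizes. */
final class ConversionGeometry {
    final int m;               // ratio of the larger lane size to the smaller (1, 2, 4 or 8)
    final int b;               // block size: lanes actually converted per call
    final boolean expanding;   // range lanes are wider than domain lanes
    final boolean contracting; // range lanes are narrower than domain lanes

    ConversionGeometry(int shapeBits, int domainBits, int rangeBits) {
        int domLen = shapeBits / domainBits; // VLENGTH of the input
        int ranLen = shapeBits / rangeBits;  // VLENGTH of the output
        this.expanding = rangeBits > domainBits;
        this.contracting = rangeBits < domainBits;
        this.m = Math.max(domainBits, rangeBits) / Math.min(domainBits, rangeBits);
        this.b = Math.min(domLen, ranLen);
    }

    /** Origin lane for a part: in the input when expanding, in the output when contracting. */
    int origin(int part) {
        if (part < 0 || part >= m) {
            throw new ArrayIndexOutOfBoundsException(part);
        }
        return part * b;
    }

    public static void main(String[] args) {
        // byte -> long on a 128-bit shape: 16 input lanes, 2 output lanes.
        ConversionGeometry g = new ConversionGeometry(128, 8, 64);
        System.out.println("M=" + g.m + " B=" + g.b + " expanding=" + g.expanding);
        for (int part = 0; part < g.m; part++) {
            System.out.println("part " + part + " -> origin lane " + g.origin(part));
        }
    }
}
```

For an in-place conversion M is 1, so zero is the only part number the range check accepts.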

Thus, an expanding conversion can iterate over all possible * output blocks (selected by {@code part} values) to obtain the full * set of converted values, into a sequence of {@code M} output * vectors of length {@code B}. And if the reverse operation is * necessary, a series of contracting conversions can iterate over all * possible input blocks (again selected by {@code part} values) * and merge the results into a vector in which all the lanes * are used to hold a result value. And in all cases, a value of * zero is always valid as a {@code part} parameter, if the user * accepts the resulting pattern of results. From john.r.rose at oracle.com Fri May 31 07:28:13 2019 From: john.r.rose at oracle.com (John Rose) Date: Fri, 31 May 2019 00:28:13 -0700 Subject: [vectorIntrinsics] lane order and byte order Message-ID: <665CFDEE-A4FF-41BF-9074-348EBB8B7FC8@oracle.com> I just wrote the following. It may be too much, but that's the direction I'm erring in at the moment. During CSR we can transfer some of the non-normative observations into @apiNote or external documentation. *

Lane order and byte order

* * The number of lane values stored in a given vector is referred to * as its {@linkplain #length() vector length} or {@code VLENGTH}. * * It is useful to consider vector lanes as ordered * sequentially from first to last, with the first lane * numbered {@code 0}, the next lane numbered {@code 1}, and so on to * the last lane numbered {@code VLENGTH-1}. This is a temporal * order, where lower-numbered lanes are considered earlier than * higher-numbered (later) lanes. This API uses these terms * in preference to spatial terms such as "left", "right", "high", * and "low". * *

Temporal terminology works well for vectors because they * (usually) represent small fixed-sized segments in a long sequence * of workload elements, where the workload is conceptually traversed * in time order from beginning to end. (This is a mental model: it * does not exclude multicore divide-and-conquer techniques.) Thus, * when a scalar loop is transformed into a vector loop, adjacent * scalar items (one earlier, one later) in the workload end up as * adjacent lanes in a single vector (again, one earlier, one later). * At a vector boundary, the last lane item in the earlier vector is * adjacent to (and just before) the first lane item in the * immediately following vector. * *

Vectors are also sometimes thought of in spatial terms, where * the first lane is placed at an edge of some virtual paper, and * subsequent lanes are presented in order next to it. When using * spatial terms, all directions are equally plausible: Some vector * notations present lanes from left to right, and others from right * to left; still others present from top to bottom or vice versa. * Using the language of time (before, after, first, last) instead of * space (left, right, high, low) is often more likely to avoid * misunderstandings. * *

A second reason to prefer temporal to spatial language about * vector lanes is the fact that the terms "left", "right", "high" and * "low" are widely used to describe the relations between bits in * scalar values. The leftmost or highest bit in a given type is * likely to be a sign bit, while the rightmost or lowest bit is * likely to be the arithmetically least significant, and so on. * Applying these terms to vector lanes risks confusion, however, * because it is relatively rare to find algorithms where, given two * adjacent vector lanes, one lane is somehow more arithmetically * significant than its neighbor, and even in those cases, there is no * general way to know which neighbor is the more significant. * *

Putting the terms together, we view the information structure * of a vector as a temporal sequence of lanes ("first", "next", * "earlier", "later", "last", etc.) of bit-strings which are * internally ordered spatially (either "low" to "high" or "right" to * "left"). The primitive values in the lanes are decoded from these * bit-strings, in the usual way. Most vector operations, like most * Java scalar operators, treat primitive values as atomic values, but * some operations reveal the internal bit-string structure. * *

When a vector is loaded from or stored into memory, the order * of vector lanes is always consistent with the inherent * ordering of the memory container. This is true whether or not * individual lane elements are subject to "byte swapping" due to * details of byte order. Thus, while the scalar lane elements of a * vector might be "byte swapped", the lanes themselves are never * reordered, except by an explicit method call that performs * cross-lane reordering. * *

When vector lane values are stored to Java variables of the * same type, byte swapping is performed if and only if the * implementation of the vector hardware requires such swapping. It * is therefore unconditional and invisible. * *

As a useful fiction, this API presents a consistent illusion * that vector lane bytes are composed into larger lane scalars in * little endian order. This means that storing a vector * into a Java byte array will reveal the successive bytes of the * vector lane values in little-endian order on all platforms, * regardless of native memory order, and also regardless of byte * order (if any) within vector unit registers. * *

This hypothetical little-endian ordering also appears when a * {@linkplain #reinterpret(VectorSpecies) reinterpret conversion} is * applied in such a way that lane boundaries are discarded and * redrawn differently, while maintaining vector bits unchanged. In * such an operation, two adjacent lanes will contribute bytes to a * single new lane (or vice versa), and the sequential order of the * two lanes will determine the arithmetic order of the bytes in the * single lane. In this case, the little-endian convention provides * portable results, so that on all platforms earlier lanes tend to * contribute lower (rightward) bits, and later lanes tend to * contribute higher (leftward) bits. The {@linkplain #asByteVector * reinterpreting conversions} between {@link ByteVector}s and the * other non-byte vectors use this convention to clarify their * portable semantics. * *

The little-endian fiction for relating lane order to per-lane * byte order is slightly preferable to an equivalent big-endian * fiction, because some related formulas are much simpler, * specifically those which renumber bytes after lane structure * changes. The earliest byte is invariantly earliest across all lane * structure changes, but only if little-endian conventions are used. * The root cause of this is that bytes in scalars are numbered from * the least significant (rightmost) to the most significant * (leftmost), and almost never vice-versa. If we habitually numbered * sign bits as zero (as on some computers) then this API would reach * for big-endian fictions to create unified addressing of vector * bytes. * From john.r.rose at oracle.com Fri May 31 07:34:53 2019 From: john.r.rose at oracle.com (John Rose) Date: Fri, 31 May 2019 00:34:53 -0700 Subject: [vectorIntrinsics] what about div? Message-ID: The top-level vector class doesn't have a divide operation, for what I think are a pair of reasons: 1. It isn't total. It could run into 1/0 and need to throw an exception. 2. Intel hardware does not support it directly. On the other hand, supporting add/sub/mul and not div seems overly surprising to me. I propose that we move div up into the top level class, and make it (for now) a partial operation which throws UOE. Later we can fill it in. I note that SVE seems to support it (though it may well be slow). And for Intel we can use some of the ideas here: https://stackoverflow.com/questions/16822757/sse-integer-division I think we can treat the 1/0 problem as similar to the range check problem (with toArray masked). Basically, we need to do `v.equal(broadcast(0)).anyTrue()` before a div and uncommon-trap to scalar code if it happens. ARM SVE defines zero as the result of x/0, but I don't think we need to do that. For efficient integer division up to 32 bits we can use 64-bit floats.
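A scalar sketch of that idea, with int arrays standing in for vector lanes: check all divisors for zero up front (the role played by `v.equal(broadcast(0)).anyTrue()`), then divide in double and truncate. The class name DivViaDouble is hypothetical. As far as I can tell the truncated double quotient agrees with Java int division for every pair of 32-bit inputs, since a non-integral quotient a/b is at least 1/|b| away from the nearest integer while the rounding error of the double division is far smaller; the one special case is MIN_VALUE / -1, handled here by narrowing through long.

```java
import java.util.Arrays;

/** Scalar model (hypothetical name) of 32-bit integer division done in 64-bit floats. */
final class DivViaDouble {

    static int[] div(int[] a, int[] b) {
        // Zero check before any division, so a failure is reported for the whole "vector".
        for (int d : b) {
            if (d == 0) {
                throw new ArithmeticException("/ by zero");
            }
        }
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            double q = (double) a[i] / (double) b[i];
            // (long) truncates toward zero; narrowing long -> int wraps, so
            // Integer.MIN_VALUE / -1 yields Integer.MIN_VALUE as int division does.
            r[i] = (int) (long) q;
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {7, -7, Integer.MIN_VALUE, Integer.MAX_VALUE};
        int[] b = {2, 2, -1, 3};
        System.out.println(Arrays.toString(div(a, b)));
    }
}
```

A direct `(int) q` cast would saturate to Integer.MAX_VALUE in the MIN_VALUE / -1 case, which is why the sketch goes through long.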
For 64-bit integer division we can eventually optimize with something tricky; see the link above. John From john.r.rose at oracle.com Fri May 31 07:41:48 2019 From: john.r.rose at oracle.com (John Rose) Date: Fri, 31 May 2019 00:41:48 -0700 Subject: [vectorIntrinsics] what about div? In-Reply-To: References: Message-ID: <2023AFC4-FD0B-44AE-A99B-CE3D7D70EC9D@oracle.com> On May 31, 2019, at 12:34 AM, John Rose wrote: > > I propose that we move div up into the top level class, BTW, this proposal relates to my previous one about partial operations. I *do not* want to build special division rules into the JIT. Instead, I want to implement whatever crazy integer division emulation is necessary in *Java*. To do that probably requires temporary expansion of a Vector to *two* Vector temporaries. The JIT shouldn't deal with this, and with the "part" convention we frame the computation in terms of vector pairs like this:

  Vector part0 = intv.convert(I2D, /*part*/0);
  Vector part1 = intv.convert(I2D, /*part*/1);
  ...do float64 division in both part[01]...
  VectorMask bad0 = findInfOrNan(part0);
  VectorMask bad1 = findInfOrNan(part1);
  if (bad0.or(bad1).anyTrue()) throw div0BadThing();
  Vector result0 = part0.convert(D2I, /*part*/0);
  Vector result1 = part1.convert(D2I, /*part*/1);
  return result0.bitwise(OR, result1);

From shade at redhat.com Fri May 31 08:24:43 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 31 May 2019 10:24:43 +0200 Subject: Cross-compiling Panama: clang_getClangVersion detection Message-ID: <57cb2c44-c7fd-d81e-3c03-fe14fcc0b21d@redhat.com> Hi, I was trying to build aarch64 in "foreign" and ran into trouble. My CI cross-compiles with base system cross-compiler. Which means it can compile the binaries, but cannot execute them, including those configure tries to test. So, configure fails with: checking for clang_getClangVersion in -lclang...
no
configure: error: Cannot locate libclang or headers at the specified locations:
             /home/buildbot/deps/clang-llvm/aarch64/lib
             /home/buildbot/deps/clang-llvm/aarch64/include

Configure and build passes after the hack like this:

diff -r f94dd38a20f4 make/autoconf/lib-clang.m4
--- a/make/autoconf/lib-clang.m4    Wed May 29 23:14:34 2019 +0200
+++ b/make/autoconf/lib-clang.m4    Thu May 30 22:34:38 2019 +0200
@@ -120,7 +120,7 @@

       AC_CHECK_HEADER("clang-c/Index.h", [], [ENABLE_LIBCLANG="false"])
       if test "x$ENABLE_LIBCLANG" = "xtrue"; then
-        if test "x$TOOLCHAIN_TYPE" = "xmicrosoft"; then
+        if test "x$TOOLCHAIN_TYPE" = "xmicrosoft" || test "x$COMPILE_TYPE" = "xcross"; then
           # Just trust the lib is there
           LIBS=$LIBCLANG_LIBS
         else

If that makes sense, can you please push this into Panama somewhere?

--
Thanks,
-Aleksey

From maurizio.cimadamore at oracle.com Fri May 31 08:57:46 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 31 May 2019 09:57:46 +0100
Subject: Cross-compiling Panama: clang_getClangVersion detection
In-Reply-To: <57cb2c44-c7fd-d81e-3c03-fe14fcc0b21d@redhat.com>
References: <57cb2c44-c7fd-d81e-3c03-fe14fcc0b21d@redhat.com>
Message-ID:

Looks good I'll push

Maurizio

On 31/05/2019 09:24, Aleksey Shipilev wrote:
> Hi,
>
> I was trying to build aarch64 in "foreign" and ran into trouble. My CI cross-compiles with base
> system cross-compiler. Which means it can compile the binaries, but cannot execute them, including
> those configure tries to test. So, configure fails with:
>
> checking for clang_getClangVersion in -lclang...
no
> configure: error: Cannot locate libclang or headers at the specified locations:
>              /home/buildbot/deps/clang-llvm/aarch64/lib
>              /home/buildbot/deps/clang-llvm/aarch64/include
>
> Configure and build passes after the hack like this:
>
> diff -r f94dd38a20f4 make/autoconf/lib-clang.m4
> --- a/make/autoconf/lib-clang.m4    Wed May 29 23:14:34 2019 +0200
> +++ b/make/autoconf/lib-clang.m4    Thu May 30 22:34:38 2019 +0200
> @@ -120,7 +120,7 @@
>
>        AC_CHECK_HEADER("clang-c/Index.h", [], [ENABLE_LIBCLANG="false"])
>        if test "x$ENABLE_LIBCLANG" = "xtrue"; then
> -        if test "x$TOOLCHAIN_TYPE" = "xmicrosoft"; then
> +        if test "x$TOOLCHAIN_TYPE" = "xmicrosoft" || test "x$COMPILE_TYPE" = "xcross"; then
>            # Just trust the lib is there
>            LIBS=$LIBCLANG_LIBS
>          else
>
> If that makes sense, can you please push this into Panama somewhere?
>

From jbvernee at xs4all.nl Fri May 31 09:31:01 2019
From: jbvernee at xs4all.nl (Jorn Vernee)
Date: Fri, 31 May 2019 11:31:01 +0200
Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again)
In-Reply-To: <24e172d6-fdf0-7bc4-7174-4e1a350a0511@oracle.com>
References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> <24e172d6-fdf0-7bc4-7174-4e1a350a0511@oracle.com>
Message-ID: <784870c91865c322b916cbda32ac9548@xs4all.nl>

You have a spurious import in MemoryAddressImpl of ForceInline now ;)

Rest looks good!

Cheers,
Jorn

Maurizio Cimadamore wrote on 2019-05-30 19:11:
> New revision:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup_v2/
>
> I finally got to the bottom of the issue - in part the problem is
> caused by the fact that MemoryAddressImpl::copy is too big and doesn't
> get inlined into the caller. But, another problem was that we were
> doing redundant liveness checks - since the check was called
> explicitly in MemoryAddressImpl::copy, but also, indirectly, in
> MemoryAddressImpl::checkAccess.
>
> I now eliminated these calls, and moved all liveness check to
> MemorySegmentImpl::checkRange.
>
> I've also added a missing call to checkAlive in
> MemorySegmentImpl::resize.
>
> With these changes the compiled method is still too big, but even w/o
> the ForceInline annotation, performances are good (and same as with
> the annotation).
>
> Maurizio
>
>
> On 30/05/2019 17:09, Jorn Vernee wrote:
>> Looks good to me.
>>
>> I wanted to ask about the addition of the APIs to sun.misc.Unsafe last
>> time, but got side-tracked by the test build error and then forgot :/.
>> I had assumed sun.misc.Unsafe would not be updated going forward?
>> What's the policy on this?
>>
>> Thanks,
>> Jorn
>>
>> Maurizio Cimadamore wrote on 2019-05-30 16:59:
>>> Hi,
>>> small followup to yesterday's push.
>>>
>>> As I was benchmarking bulk copy performances, I realized that there's
>>> a bug in the methods which I added yesterday on sun.misc.Unsafe -
>>> which delegate to the wrong Unsafe (itself), resulting in SO.
>>>
>>> I also cleaned up uses of Unsafe internally - and added a @ForceInline
>>> on MemoryAddressImpl::copy which basically makes it as fast as a raw
>>> unsafe call to Unsafe::copyMemory. We're still investigating as to why
>>> exactly the annotation is needed to get good inlining.
>>>
>>> I've also added a shortcircuit in AbstractMemoryScopeImpl::checkAlive,
>>> which avoids a call to 'isAlive' if the scope is pinned. This was
>>> causing some profile pollution (although performances were still good
>>> - probably because of the bi-morphic inline cache).
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup/
>>>
>>> Maurizio

From maurizio.cimadamore at oracle.com Fri May 31 09:39:41 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Fri, 31 May 2019 09:39:41 +0000
Subject: hg: panama/dev: 8224993: Add Unsafe support for MemoryAddress
Message-ID: <201905310939.x4V9dgIo027087@aojmv0008.oracle.com>

Changeset: 2b459521bb91
Author:    mcimadamore
Date:      2019-05-31 10:36 +0100
URL:       http://hg.openjdk.java.net/panama/dev/rev/2b459521bb91

8224993: Add Unsafe support for MemoryAddress
* Fix bad call from sun.misc.Unsafe to itself
* remove redundant checks from MemoryAddressImpl::copy

! src/java.base/share/classes/jdk/internal/foreign/AbstractMemoryScopeImpl.java
! src/java.base/share/classes/jdk/internal/foreign/MemoryAddressImpl.java
! src/java.base/share/classes/jdk/internal/foreign/MemorySegmentImpl.java
! src/jdk.unsupported/share/classes/sun/misc/Unsafe.java

From maurizio.cimadamore at oracle.com Fri May 31 09:40:39 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 31 May 2019 10:40:39 +0100
Subject: [foreign-memaccess] RFR 8224993: Add Unsafe support for MemoryAddress (again)
In-Reply-To: <784870c91865c322b916cbda32ac9548@xs4all.nl>
References: <35a2e3b5-7f99-3e64-d690-1cca1b09c51b@oracle.com> <8d4cf8aba1958fd42fed24df396ee496@xs4all.nl> <24e172d6-fdf0-7bc4-7174-4e1a350a0511@oracle.com> <784870c91865c322b916cbda32ac9548@xs4all.nl>
Message-ID: <17fd635c-854f-b592-5019-a2aea158d83c@oracle.com>

Fixed and pushed - thanks

Maurizio

On 31/05/2019 10:31, Jorn Vernee wrote:
> You have a spurious import in MemoryAddressImpl of ForceInline now ;)
>
> Rest looks good!
>
> Cheers,
> Jorn
>
> Maurizio Cimadamore wrote on 2019-05-30 19:11:
>> New revision:
>>
>> http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup_v2/
>>
>> I finally got to the bottom of the issue - in part the problem is
>> caused by the fact that MemoryAddressImpl::copy is too big and doesn't
>> get inlined into the caller. But, another problem was that we were
>> doing redundant liveness checks - since the check was called
>> explicitly in MemoryAddressImpl::copy, but also, indirectly, in
>> MemoryAddressImpl::checkAccess.
>>
>> I now eliminated these calls, and moved all liveness check to
>> MemorySegmentImpl::checkRange.
>>
>> I've also added a missing call to checkAlive in
>> MemorySegmentImpl::resize.
>>
>> With these changes the compiled method is still too big, but even w/o
>> the ForceInline annotation, performances are good (and same as with
>> the annotation).
>>
>> Maurizio
>>
>>
>> On 30/05/2019 17:09, Jorn Vernee wrote:
>>> Looks good to me.
>>>
>>> I wanted to ask about the addition of the APIs to sun.misc.Unsafe
>>> last time, but got side-tracked by the test build error and then
>>> forgot :/. I had assumed sun.misc.Unsafe would not be updated going
>>> forward? What's the policy on this?
>>>
>>> Thanks,
>>> Jorn
>>>
>>> Maurizio Cimadamore wrote on 2019-05-30 16:59:
>>>> Hi,
>>>> small followup to yesterday's push.
>>>>
>>>> As I was benchmarking bulk copy performances, I realized that there's
>>>> a bug in the methods which I added yesterday on sun.misc.Unsafe -
>>>> which delegate to the wrong Unsafe (itself), resulting in SO.
>>>>
>>>> I also cleaned up uses of Unsafe internally - and added a @ForceInline
>>>> on MemoryAddressImpl::copy which basically makes it as fast as a raw
>>>> unsafe call to Unsafe::copyMemory. We're still investigating as to why
>>>> exactly the annotation is needed to get good inlining.
>>>>
>>>> I've also added a shortcircuit in AbstractMemoryScopeImpl::checkAlive,
>>>> which avoids a call to 'isAlive' if the scope is pinned. This was
>>>> causing some profile pollution (although performances were still good
>>>> - probably because of the bi-morphic inline cache).
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8224993_followup/
>>>>
>>>> Maurizio

From maurizio.cimadamore at oracle.com Fri May 31 09:51:51 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Fri, 31 May 2019 09:51:51 +0000
Subject: hg: panama/dev: Fix cross-compile issue with libclang detection
Message-ID: <201905310951.x4V9ppbk002709@aojmv0008.oracle.com>

Changeset: 145c606b4156
Author:    mcimadamore
Date:      2019-05-31 10:51 +0100
URL:       http://hg.openjdk.java.net/panama/dev/rev/145c606b4156

Fix cross-compile issue with libclang detection
Contributed-by: shade at redhat.com

! make/autoconf/lib-clang.m4

From maurizio.cimadamore at oracle.com Fri May 31 09:52:00 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 31 May 2019 10:52:00 +0100
Subject: Cross-compiling Panama: clang_getClangVersion detection
In-Reply-To:
References: <57cb2c44-c7fd-d81e-3c03-fe14fcc0b21d@redhat.com>
Message-ID: <4dc9570c-6da6-b739-8f4c-6be5a3616c21@oracle.com>

Pushed - thanks!

Maurizio

On 31/05/2019 09:57, Maurizio Cimadamore wrote:
> Looks good I'll push
>
> Maurizio
>
> On 31/05/2019 09:24, Aleksey Shipilev wrote:
>> Hi,
>>
>> I was trying to build aarch64 in "foreign" and ran into trouble. My
>> CI cross-compiles with base
>> system cross-compiler. Which means it can compile the binaries, but
>> cannot execute them, including
>> those configure tries to test. So, configure fails with:
>>
>> checking for clang_getClangVersion in -lclang... no
>> configure: error: Cannot locate libclang or headers at the specified
>> locations:
>>              /home/buildbot/deps/clang-llvm/aarch64/lib
>>
/home/buildbot/deps/clang-llvm/aarch64/include
>>
>> Configure and build passes after the hack like this:
>>
>> diff -r f94dd38a20f4 make/autoconf/lib-clang.m4
>> --- a/make/autoconf/lib-clang.m4    Wed May 29 23:14:34 2019 +0200
>> +++ b/make/autoconf/lib-clang.m4    Thu May 30 22:34:38 2019 +0200
>> @@ -120,7 +120,7 @@
>>
>>        AC_CHECK_HEADER("clang-c/Index.h", [], [ENABLE_LIBCLANG="false"])
>>        if test "x$ENABLE_LIBCLANG" = "xtrue"; then
>> -        if test "x$TOOLCHAIN_TYPE" = "xmicrosoft"; then
>> +        if test "x$TOOLCHAIN_TYPE" = "xmicrosoft" || test "x$COMPILE_TYPE" = "xcross"; then
>>            # Just trust the lib is there
>>            LIBS=$LIBCLANG_LIBS
>>          else
>>
>> If that makes sense, can you please push this into Panama somewhere?
>>

From maurizio.cimadamore at oracle.com Fri May 31 09:58:46 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Fri, 31 May 2019 09:58:46 +0000
Subject: hg: panama/dev: manual merge with foreign
Message-ID: <201905310958.x4V9wki9009756@aojmv0008.oracle.com>

Changeset: 397111773cfa
Author:    mcimadamore
Date:      2019-05-31 10:58 +0100
URL:       http://hg.openjdk.java.net/panama/dev/rev/397111773cfa

manual merge with foreign

! src/java.base/share/classes/jdk/internal/foreign/abi/DirectSignatureShuffler.java
! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalNativeInvoker.java
! src/java.base/share/classes/jdk/internal/foreign/abi/UniversalUpcallHandler.java
! src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/SysVx64ABI.java
! src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/Windowsx64ABI.java

From jatin.bhateja at intel.com Fri May 31 10:26:33 2019
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Fri, 31 May 2019 10:26:33 +0000
Subject: [vectorIntrinsics] [PATCH] Elemental shifts and rotates speedup
Message-ID:

Hi John,

Thanks for your comments. Please find my response embedded below.
Regards,
Jatin

-------- Original message --------
From: John Rose
Date: 30/05/2019 22:12 (GMT+05:30)
To: "Bhateja, Jatin"
Cc: panama-dev at openjdk.java.net
Subject: Re: [vectorIntrinsics] [PATCH] Elemental shifts and rotates speedup

On May 30, 2019, at 9:35 AM, John Rose wrote:
>
> I looked at the Java changes. They look good.

JATIN >> Thanks

One more comment at the API level. I have been thinking
about the Java rules for <>n, versus our vector-level shifting
operations, and I think the Java rules are not for the best here.

So in Java x<> I agree: the shift count at lane level is not modulo wrapped;
for counts greater than the lane count and for negative shift counts, special
handling has already been done in the API definition, as you mentioned.

-- John

From Joshua.Zhu at arm.com Fri May 31 12:32:54 2019
From: Joshua.Zhu at arm.com (Joshua Zhu (Arm Technology China))
Date: Fri, 31 May 2019 12:32:54 +0000
Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes
In-Reply-To:
References:
Message-ID:

Hi Vladimir,

Thanks a lot for your review and comments.

> The only important case which suffers is "masked loops" - when masked
> kernel is used to avoid explicit loop tail processing. Such loops will end with a
> deopt most of the time (unless loop tail is empty). So, as a next step, I'd like
> to see that case to be addressed, so such loops don't trigger uncommon trap.

I agree. If a masked kernel is used to avoid explicit loop tail processing, an uncommon trap will not be suitable.
My initial thought is that, in the long term, masked access should behave like this:
The main loop (the fast path in my patch, which succeeds in the range check) will work the same as the current implementation.
For the slow path, the masked operation will be intrinsified into the corresponding instructions on platforms that support masks.
Otherwise it will go scalar.
But according to the review comments on my first patch [1], if the masked operation is not directly supported by hardware,
scalar code on the slow path will cause problems when the compiler can't prove the access is always in bounds and prune the slow path.
I think the problem is that, without an uncommon trap, a VectorBox will be generated on the fast path, for example after the blend for fromArray.
As a result, the Vector will be stored in memory instead of in a SIMD register. Is my understanding correct?
Anyway, let's first solve the API correctness issue.

> src/hotspot/share/opto/opaquenode.hpp:
>
> You have some changes in ProfileBooleanNode. Why providing 0 as a
> false_count is not enough to get uncommon trap on corresponding branch?

When determining whether it is suitable to generate an uncommon trap, HotSpot checks, besides the branch prediction probabilities,
whether there are too many traps at the current method and bci, via seems_stable_comparison(). [2]

> src/hotspot/share/opto/doCall.cpp:
>
> bool Compile::should_delay_vector_inlining(ciMethod* call_method,
> JVMState* jvms) {
> - return UseVectorApiIntrinsics && call_method->is_vector_method();
> + return UseVectorApiIntrinsics && call_method->is_vector_method() &&
> + call_method->intrinsic_id() != vmIntrinsics::_ExpectTrue;
>
> Why do you need to delay inlining for it?

For ProfileBooleanNode, the profile has to be consumed before GVN elimination. [3]
Therefore expectTrue inlining should NOT be delayed.

> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X-
> VectorBits.java.template:
>
> - if (ax >= 0 && ax + LENGTH <= a.length) {
> + if (expectTrue(ax >= 0 && ax + LENGTH <= a.length)) {
>
> Why don't you use VectorIntrinsics.checkIndex() here?

The uncommon trap will not always be put on the false branch unless the loop tail is empty.
[4]

[1] http://mail.openjdk.java.net/pipermail/panama-dev/2019-April/005105.html
[2] https://hg.openjdk.java.net/panama/dev/file/0bea74e4f0eb/src/hotspot/share/opto/parse2.cpp#l1644
[3] https://hg.openjdk.java.net/panama/dev/file/0bea74e4f0eb/src/hotspot/share/opto/opaquenode.cpp#l87
[4] https://hg.openjdk.java.net/panama/dev/file/0bea74e4f0eb/src/hotspot/share/opto/library_call.cpp#l1287

Best Regards,
Joshua

From maurizio.cimadamore at oracle.com Fri May 31 14:06:11 2019
From: maurizio.cimadamore at oracle.com (maurizio.cimadamore at oracle.com)
Date: Fri, 31 May 2019 14:06:11 +0000
Subject: hg: panama/dev: RFR 8224993: Add Unsafe support for MemoryAddress
Message-ID: <201905311406.x4VE6Bwj018301@aojmv0008.oracle.com>

Changeset: 51965bf20246
Author:    mcimadamore
Date:      2019-05-31 15:05 +0100
URL:       http://hg.openjdk.java.net/panama/dev/rev/51965bf20246

RFR 8224993: Add Unsafe support for MemoryAddress
Changes to sun.misc.Unsafe were accidentally removed

! src/jdk.unsupported/share/classes/sun/misc/Unsafe.java

From john.r.rose at oracle.com Fri May 31 18:27:00 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 31 May 2019 11:27:00 -0700
Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes
In-Reply-To:
References:
Message-ID:

On May 31, 2019, at 5:32 AM, Joshua Zhu (Arm Technology China) wrote:
>
> In my initial thought, in long term masked access should behave in the way like:

Thanks for thinking about this problem, Joshua. I think it's
important to keep chewing on. I am sure that (eventually)
we'll find a solution.

> Main loop (fast path in my patch which succeed in range check) will work same with current implementation.

Yes.
If this requirement is to be fully met, and the source
code has "masks all the way" through the loop, it follows
that we need robust strength reduction where the JIT
can see constant-true masks and convert them into
regular unmasked operations.

The constant-true masks are an annoyance to the main loop
and need to be suppressed. They are an artifact of the
original source code having just one kernel, which gets
used in two ways, (in effect) unmasked for the main loop,
and masks for loop boundaries (could include both pre-
and post-loops).

One problem we are puzzling over here is how to design the
source-level API so that it's easy to write the loop without
masks cluttering up the logic, just so that the edges can
be masked. I think this will involve equipping some vector
shapes with associated masks which are automatically
applied. The JIT's logic currently doesn't allow vector-boxes
to be combined in this way (via fields) but I think it might
be reasonable to investigate, even before we get inline
value types from Valhalla.

In any case, whether the masks are "submerged" inside
new shapes or hidden in some other clever way, the JIT
will want to see these mostly-all-true masks and do
the above strength reduction. The branch profiling
stuff is a good tool to apply here. Basically we want
"v.method(w, MATM(m))" where "MATM" is a branch-predicting
intrinsic operator which says "this is a mostly all-true
mask". On the fast path out of the operator, the mask
can be hardwired to a true constant, and then strength
reduction to an unmasked instruction is just a few
more steps.

Given such a "MATM" operator, we can then split loops
into fast and slow versions, based on that operator's
oracular advice. The slow loop version will probably
run once and exit at the end, although it might also be
useful for a pre-loop.
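
As a rough illustration of the fast/slow loop split described above, here is a minimal sketch in plain scalar Java, so that it runs without the incubating vector API. The class `MaskedLoopSketch`, the lane count `LANES`, and the `kernel` and `addAll` helpers are all hypothetical names invented for this example; the `allTrue` flag only stands in for the proposed "MATM" hint and is not a real intrinsic.

```java
final class MaskedLoopSketch {
    // Hypothetical lane count standing in for a vector species length.
    static final int LANES = 4;

    // One "vector" step: out = a + b for every active lane starting at base.
    // When allTrue is set the mask is never consulted, which models the
    // strength-reduced, unmasked form wanted on the fast path.
    static void kernel(int[] a, int[] b, int[] out, int base,
                       boolean[] mask, boolean allTrue) {
        for (int lane = 0; lane < LANES; lane++) {
            if (allTrue || mask[lane]) {
                out[base + lane] = a[base + lane] + b[base + lane];
            }
        }
    }

    // Adds two arrays of equal length using one kernel in two ways.
    static int[] addAll(int[] a, int[] b) {
        int[] out = new int[a.length];
        int upper = a.length - (a.length % LANES);
        int i = 0;
        // Main (fast) loop: every lane is in bounds, so the mask is constant-true.
        for (; i < upper; i += LANES) {
            kernel(a, b, out, i, null, true);
        }
        // Post (slow) loop: at most one partial iteration, with a real mask
        // that switches off the lanes past the end of the arrays.
        if (i < a.length) {
            boolean[] mask = new boolean[LANES];
            for (int lane = 0; lane < LANES; lane++) {
                mask[lane] = i + lane < a.length;
            }
            kernel(a, b, out, i, mask, false);
        }
        return out;
    }
}
```

The point of the split is that only the single tail iteration pays for the mask; whether a JIT could derive this shape automatically from one masked kernel is exactly the open question in this thread.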

I think I'm saying stuff here that all of us are already
thinking and discussing, but I wanted to get it out clearly,
just in case there's a point here that has been overlooked.

> For slow path, masked operation will be intrinsified into corresponding instructions on mask-supported platform.
> Otherwise it will go scalar.

In some cases we might be able to get a two-step fallback,
first to a second set of vector instructions, and then to
scalar code. The Java code wants to say things like
"use an aggressive masked form" but back off to a
default implementation which still produces vectorized
code, but with explicit blends or scatters or whatever
to emulate the masked semantics. After that it can go
scalar, if the type profile is bad or the hardware is not
present.

> But from review comments on my first patch [1], if mask operation is not directly supported by hardware,
> scalar on slow path will cause problems if it can't prove access is always in-bounds and prune the slow path.

I think this is the sort of thing that motivates loop splitting.
If a slow path can't be pruned it can be pushed into the
bad slow loop. If that bad slow loop is the post-loop, it's
only going to go slow on the last partial iteration of the
original loop.

> I think the problem is, without uncommon trap, VectorBox will be generated in fast path, for example, after blend for fromArray.

(Without uncommon trap, or loop splitting that creates
a slow version of the loop to catch bad stuff.)

> This results in Vector will be stored into memory instead of simd register. Is my understanding correct?

Vladimir would know for sure but this sounds right.

-- John

P.S. What do you ARM experts think about having v.div(w)
throw ArithmeticException when any lane in w is zero,
for non-float types only? I know ARM forces a non-exceptional
result there, but that's not Java-like at all.
In general, are
we comfortable with adding exception exits to vector operations,
like divide-by-zero and AIOOB (array index out of bounds)
on scatter/gather?

From maurizio.cimadamore at oracle.com Fri May 31 20:44:42 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 31 May 2019 21:44:42 +0100
Subject: [foreign-memaccess] RFC: to scope or not to scope?
Message-ID: <579cb292-c029-cf57-d69e-1b7929572b86@oracle.com>

Hi,
lately I've been thinking hard about the relationship between scopes and
memory segments in the foreign-memaccess API. I think some of the
decisions we made lately - e.g. making scopes either global (pinned) or
closeable (and confined) - are good ones. But now I wonder, why do we
need scopes in this API?

A scope is useful to manage lifecycle of memory resources; you create a
scope, do some allocation within it, close the scope and the associated
resources are gone. This is a good programming model for high level
layers, but is it still a good model for low-level layers?

Over the last few weeks I noticed a few things:

* creating a scope is quite expensive, as there are several data
structures associated to it

* the ownership/parent mechanism is completely useless in this API,
given that the scope parent-of relationship is really only useful when
storing pointers from scope A into scope B, an operation that is not
available in this API (since we have no pointers here)

* the API forces you through quite a few hops to get to what you
want; code like this is pretty idiomatic:

try (MemoryScope scope = MemoryScope.globalScope().fork()) {
    MemorySegment segment = scope.allocate(32);
    MemoryAddress address = segment.baseAddress();
    ...
}

* Several of the new MemorySegment sources just use the 'pinned'
UNCHECKED scope - de-facto turning scope checks off

So, we have quite a complex API, which is used (in part) only when
dealing with native memory; in the remaining case it's just noise, mostly.

This set me off thinking...
what if we brought some of the aspects of memory
scopes into the memory segment itself (and then got rid of memory scopes)?
That is, let's see if we can add the following things to a segment:

* make it AutoCloseable - this way you can use a segment in a
try-with-resources, as you would do with scopes

* add an 'isAlive' state; a segment starts off in the 'alive' state and
then, when it's closed it goes in the 'closed' state - meaning that all
addresses originating from it (as well as all the segments) will become
invalid

* add confinement, so that only the owner thread is allowed to call
methods in the segment (e.g. to resize or close it)

* add some methods to obtain a read-only segment, a confined one (one
which only allows read/writes from owning thread), or a pinned one (one
that cannot be closed)

In this new world, the above code can be rewritten as follows:

try (MemorySegment segment = MemorySegment.ofNative(32)) {
    MemoryAddress address = segment.baseAddress();
    ...
}

This is more compact and goes straight to the point. I like how compact
the API is - and I also like that now the API is very symmetric with the
heap cases as well - for instance:

try (MemorySegment segment = MemorySegment.ofArray(new byte[32])) {
    MemoryAddress address = segment.baseAddress();
    ...
}

The only thing that changes is the resource declaration in the try with
resources!

It's not all rosy of course; this choice has some consequences:

* memory segments are no longer immutable; that was possible before,
since the mutable state was confined into the scope which was then
attached to the segment. Now it's the segment itself that is mutable (in
the liveness bit). While this could make transition to value types
harder, I don't think it's really a blocker - in reality we could
implement a very similar trick where we push the mutable state somewhere
else, and then the segment becomes immutable again.
Also, in real world
cases I expect clients will do some kind of pooling, allocating big
segments and then returning small pinned sub-regions to clients (thus
avoiding one system call per allocation). In such cases, there is only
one master mutable segment - and a lot of small immutable ones, which is
a happy case.

* Thinking about what happens when you e.g. resize a region is a bit
harder if you can close a region. Should closing a sub-region also close
the one it comes from? Or should we throw an exception? Or should we do
nothing? or reference counting? After reading the very good docs on
Netty [1], I came to the conclusion that closing a derived segment
should also result in the closure of the root one. This choice would of
course not be a very good one for pooled sub-regions, but for this we
can always use the pinning operation - that is: create a resized
sub-region, pin it, and return it to the client. I think the combination
of resizing + pinning gives quite a bit of power and I don't see any
immediate need for doing something with reference counting.

I then realized that the pointer scopes we have in foreign right now can
in fact be implemented *cleanly* on top of this lower level memory
segment mechanism. Here's a snippet of code which demonstrates how one
would go about writing a PointerScope which allocates a slab of memory
and returns portions of it to the clients:

class PointerScopeImpl implements PointerScope {
    long SEGMENT_SIZE = 64 * 1024;
    List<MemorySegment> usedSegments;
    MemorySegment currentSegment;
    long offsetInSegment;

    Pointer allocate(LayoutType type) {
        MemorySegment segment = allocateInternal(type.layout().bytesSize(), type.layout().alignmentBytes());
        return new BoundedPointer(type, segment.baseAddress());
    }

    ...

    MemorySegment allocateInternal(long bytes, long align) {
        // start a new slab when the request does not fit in the current one
        if (offsetInSegment + bytes > SEGMENT_SIZE) {
            usedSegments.add(currentSegment);
            // size the slab so that an oversized request still fits
            currentSegment = MemorySegment.ofNative(Math.max(SEGMENT_SIZE, bytes), align);
            offsetInSegment = 0;
        }
        MemorySegment segment = currentSegment.resize(offsetInSegment, bytes);
        offsetInSegment += bytes;
        return segment;
    }

    void close() {
        currentSegment.close();
        usedSegments.forEach(MemorySegment::close);
    }
}

That is, we can get the same functionality we have in Panama,
essentially using segments as a way to get at the Unsafe allocation
facilities. This seems pretty cool!

I've put together a prototype of this approach (should apply cleanly on
top of foreign-memaccess):

http://cr.openjdk.java.net/~mcimadamore/panama/scope-removal/

I was pleased at how the tests could be simplified with this approach. I
was also pleased to see that the performance numbers took a significant
jump forward, essentially bringing this within reach of raw Unsafe usage
(but with the extra safety sprinkled on top).

Concluding, this seems yet another of those cases where we were trying
to conflate high-level concerns with lower-level concerns, and once we
push everything in the right place of the stack, things seem to slot
into a lower energy state.

What do you think?

Cheers
Maurizio

[1] - https://netty.io/wiki/reference-counted-objects.html

From vladimir.x.ivanov at oracle.com Fri May 31 22:13:15 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Sat, 1 Jun 2019 01:13:15 +0300
Subject: [vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes
In-Reply-To:
References:
Message-ID:

>> I think the problem is, without uncommon trap, VectorBox will be generated in fast path, for example, after blend for fromArray.
>
> (Without uncommon trap, or loop splitting that creates
> a slow version of the loop to catch bad stuff.)
>
>> This results in Vector will be stored into memory instead of simd register. Is my understanding correct?
>
> Vladimir would know for sure but this sounds right.

Yes, as experience shows, scalar code will cause problems: vector values
have to be boxed first to make existing accessors work. In the worst
case, it will completely break box elimination for affected values.

But there are some workarounds which can alleviate the effects. In
particular, aggressive vector reboxing can narrow the scope by splitting
live ranges and localizing the boxed value around the problematic use site.

I'd expect reboxing should already work for XxxVector.forEach() since it
uses XxxNnnVector.getElements() to access backing array:

    @Override
    void forEach(FUnCon f) {
        int[] vec = getElements();
        for (int i = 0; i < length(); i++) {
            f.apply(i, vec[i]);
        }
    }

    private int[] getElements() {
        return VectorIntrinsics.maybeRebox(this).vec;
    }

Also, it seems beneficial to migrate away from getElements() & array
accessors to XxxVector.get() since it is already intrinsified.
(XxxVector.with() looks attractive as well, but it may cause too much
churn if used in default implementation. So should be used with caution.)

Best regards,
Vladimir Ivanov

>
> -- John
>
> P.S. What do you ARM experts think about having v.div(w)
> throw ArithmeticException when any lane in w is zero,
> for non-float types only? I know ARM forces a non-exceptional
> result there, but that's not Java-like at all. In general, are
> we comfortable with adding exception exits to vector operations,
> like divide-by-zero and AIOOB (array index out of bounds)
> on scatter/gather?
>

From vladimir.x.ivanov at oracle.com Fri May 31 22:29:40 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Sat, 1 Jun 2019 01:29:40 +0300
Subject: [vectorIntrinsics] what about div?
In-Reply-To:
References:
Message-ID:

> I propose that we move div up into the top level class,

Yes, IMO it is the right move.

> and make it (for now) a partial operation which throws
> UOE. Later we can fill it in. I note that SVE seems to
> support it (though it may well be slow). And for Intel
> we can use some of the ideas here:
> https://stackoverflow.com/questions/16822757/sse-integer-division
>
> I think we can treat the 1/0 problem as similar to the
> range check problem (with toArray masked). Basically,
> we need to do `v.equal(broadcast(0)).anyTrue()` before
> a div and uncommon-trap to scalar code if it happens.

Yes, the problem looks very similar to range checks for masked accesses
(maybe even simpler? since zero reliably triggers the exception) and the
solution should be easily extended to masked variant
(v.equal(broadcast(0)).and(m).anyTrue()), but it's interesting to see
how much overhead it introduces.

Best regards,
Vladimir Ivanov

> ARM SVE defines zero as the result of x/0, but I don't think
> we need to do that.
>
> For efficient integer division up to 32 bits we can use
> 64-bit floats. For 64-bit integer division we can
> eventually optimize with something tricky; see the
> link above.
>
> -- John
>
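
To make the two ideas in this thread concrete (a masked zero check before the division, and 32-bit integer division carried out in 64-bit floats), here is a minimal sketch in plain scalar Java that models lanes as array elements. It does not use the incubating vector API; the class `LaneDivSketch` and its `div` method are hypothetical, and leaving inactive lanes equal to the dividend is an assumption made only for this example.

```java
final class LaneDivSketch {
    // Lane-wise v / w over the lanes selected by mask m.
    static int[] div(int[] v, int[] w, boolean[] m) {
        // Scalar analogue of v.equal(broadcast(0)).and(m).anyTrue():
        // only a zero divisor in an active lane raises the exception.
        for (int i = 0; i < w.length; i++) {
            if (m[i] && w[i] == 0) {
                throw new ArithmeticException("/ by zero");
            }
        }
        int[] r = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            if (m[i]) {
                // Every int is exactly representable as a double, and the
                // rounded quotient cannot cross an integer boundary, so
                // truncating it matches Java's int division.
                double q = (double) v[i] / (double) w[i];
                // Narrowing through long keeps Integer.MIN_VALUE / -1 wrapping
                // to Integer.MIN_VALUE instead of saturating at MAX_VALUE.
                r[i] = (int) (long) q;
            } else {
                // Assumption: inactive lanes pass the dividend through.
                r[i] = v[i];
            }
        }
        return r;
    }
}
```

A zero divisor in an inactive lane is ignored, which is what the `.and(m)` in the masked check is meant to achieve.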