[foreign] RFR 8219822: Jextract should handle complex macro constants
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Mon Mar 4 18:04:23 UTC 2019
Whoops - that was accidentally introduce by the merge.
I pushed a modified webrev which fixes this, for the records:
http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v7/
Maurizio
On 04/03/2019 17:26, Jorn Vernee wrote:
> * In MacroParser, please use Log instead of creating a separate
> Logger, so we can control the log level with the command line option.
>
> Otherwise looks good. All tests pass on my machine with this patch as
> well.
>
> Jorn
>
> Maurizio Cimadamore schreef op 2019-03-04 17:24:
>> The test still had issues... turns out that, at least in some cases,
>> pulling in <stdint.h> also pulled in other globals, in particular
>> this:
>>
>> extern uintptr_t __security_cookie;
>>
>> which is defined in <vcruntime.h>
>>
>> This variable was not the variable the macro parser was interested in,
>> but the code was setup in finding the _first_ available variable, and
>> evaluate that, which would fail in this case.
>>
>> Now, the reason why this doesn't get pulled in our test machines
>> running in our build/test infra is unknown to me - but both me and
>> Jorn have been able to reproduce on a Panama build (and the build I
>> had was produced using the same devkit used in our farm). I suspect
>> there must be something with the test setup which causes the test to
>> spuriously pass.
>>
>> In any case, with this latest revision, the test passes for both my
>> local Windows build and for Jorn's:
>>
>> http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v7/
>>
>> The main change is the extra filter in MacroPaser::reparse method,
>> which filter cursors by spelling (in addition to kind), and make sure
>> that the cursor contains the (synthetic) string "jextract$". I also
>> added some more FINE debugging statements.
>>
>> Maurizio
>>
>> On 04/03/2019 04:01, Sundararajan Athijegannathan wrote:
>>> Looks good.
>>>
>>> -Sundar
>>>
>>> On 01/03/19, 11:49 PM, Maurizio Cimadamore wrote:
>>>> Another (hopefully last) iteration.
>>>>
>>>> This reverts to the use of the more correct 'uintptr_t'. I figured
>>>> out what the cause of the problems were: when setting up the
>>>> compilation unit used to parse the macro snippet, we were not
>>>> passing again the include folders containing LLVM headers (among
>>>> which stdint.h is defined). As a result, there were issues in
>>>> including that header, but _only_ in the context of macro
>>>> evaluation, which was puzzling.
>>>>
>>>> Webrev:
>>>>
>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v6
>>>>
>>>> Maurizio
>>>>
>>>>
>>>> On 01/03/2019 16:39, Maurizio Cimadamore wrote:
>>>>> Another iteration:
>>>>>
>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v5
>>>>>
>>>>> changes:
>>>>>
>>>>> * <stdint.h> cannot be relied upon; it ultimately depends on the
>>>>> machine having MSVC headers which in our test infra is not guaranteed
>>>>>
>>>>> * switched to 'unsigned long long' which should be big enough :-)
>>>>>
>>>>> * fixed comments in annotations
>>>>>
>>>>> * added some logger support for errors generated by the macro
>>>>> reparsing support, just for diagnosing/debuggability
>>>>>
>>>>> Maurizio
>>>>>
>>>>> On 01/03/2019 13:42, Sundararajan Athijegannathan wrote:
>>>>>> Looks good. Nice & clean!
>>>>>>
>>>>>> Minor: doc comment is between annotations and type. Don't we put
>>>>>> comment before annotations? (NativeNumericConstant and
>>>>>> NativeStringConstant)
>>>>>>
>>>>>> -Sundar
>>>>>>
>>>>>> On 28/02/19, 8:31 PM, Maurizio Cimadamore wrote:
>>>>>>> Updated webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v4/
>>>>>>>
>>>>>>> This sues <stdint.h> and then 'uintptr_t' instead of long, which
>>>>>>> is the right thing to do.
>>>>>>>
>>>>>>> Did not notice any adverse performance issue. I also added your
>>>>>>> extra test.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 28/02/2019 13:43, Maurizio Cimadamore wrote:
>>>>>>>>
>>>>>>>> On 28/02/2019 13:01, Jorn Vernee wrote:
>>>>>>>>> Hi Maurizio,
>>>>>>>>>
>>>>>>>>> Some comments:
>>>>>>>>>
>>>>>>>>> * For pointers, I thought casting to (long) would truncate the
>>>>>>>>> value on Windows, since long is only 32 bits there, but in
>>>>>>>>> practice this does not seem to be a problem... Any ways, we
>>>>>>>>> should probably make sure that that case keeps working. Here's
>>>>>>>>> the test case I used:
>>>>>>>>
>>>>>>>> Right, I knew I was doing something dirty - Ideally we want
>>>>>>>> size_t, but that depends on stddef.h... I wanted to avoid extra
>>>>>>>> dependencies in the reparsed snippet.
>>>>>>>>
>>>>>>>> Interesting that it works - I'd say the clang eval API keeps
>>>>>>>> the value as a 64-bit value regardless of the cast...
>>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> diff -r 8f30f51887d9
>>>>>>>>> test/jdk/com/sun/tools/jextract/ConstantsTest.java
>>>>>>>>> --- a/test/jdk/com/sun/tools/jextract/ConstantsTest.java Thu
>>>>>>>>> Feb 28 12:34:49 2019 +0100
>>>>>>>>> +++ b/test/jdk/com/sun/tools/jextract/ConstantsTest.java Thu
>>>>>>>>> Feb 28 13:29:40 2019 +0100
>>>>>>>>> @@ -101,7 +101,8 @@
>>>>>>>>> { "CHAR_VALUE", int.class, equalsTo(104) },
>>>>>>>>> //integer char constants have type int
>>>>>>>>> { "MULTICHAR_VALUE", int.class,
>>>>>>>>> equalsTo(26728) }, //integer char constants have type int
>>>>>>>>> { "BOOL_VALUE", boolean.class, equalsTo(true) },
>>>>>>>>> - { "ZERO_PTR", Pointer.class,
>>>>>>>>> pointerEqualsTo(0L) }
>>>>>>>>> + { "ZERO_PTR", Pointer.class,
>>>>>>>>> pointerEqualsTo(0L) },
>>>>>>>>> + { "F_PTR", Pointer.class,
>>>>>>>>> pointerEqualsTo(0xFFFF_FFFF_FFFF_FFFFL) }
>>>>>>>>> };
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> diff -r 8f30f51887d9 test/jdk/com/sun/tools/jextract/constants.h
>>>>>>>>> --- a/test/jdk/com/sun/tools/jextract/constants.h Thu Feb 28
>>>>>>>>> 12:34:49 2019 +0100
>>>>>>>>> +++ b/test/jdk/com/sun/tools/jextract/constants.h Thu Feb 28
>>>>>>>>> 13:29:40 2019 +0100
>>>>>>>>> @@ -65,3 +65,4 @@
>>>>>>>>> #define SUB SUP + 2 //dependency
>>>>>>>>>
>>>>>>>>> #define ZERO_PTR (void*)0;
>>>>>>>>> +#define F_PTR (void*) 0xFFFFFFFFFFFFFFFFLL; // all 1s
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> * (I know this was already there); Should the
>>>>>>>>> MacroParser.toNumber fast path use Long.decode instead, to
>>>>>>>>> catch more values?
>>>>>>>>
>>>>>>>> I discussed this with Sundar - we could, but in practice ints
>>>>>>>> cover like 90% of cases; so there was a desire to keep the
>>>>>>>> surface area of the hack small.
>>>>>>>>
>>>>>>>> Of course if you have real use cases which, like OpenGL, gives
>>>>>>>> a 5x boost when using such an hack, we can revisit!
>>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>>>> So we have a choice here:
>>>>>>>>>>>
>>>>>>>>>>> 1) we extract the value of the constant, and we create a
>>>>>>>>>>> plain constant field - e.g.
>>>>>>>>>>>
>>>>>>>>>>> static final int FOO = 1;
>>>>>>>>>>>
>>>>>>>>>>> which will then work as before (as a true compile-time
>>>>>>>>>>> constant)
>>>>>>>>>>>
>>>>>>>>>>> 2) we aim for regularity, and we just translate as
>>>>>>>>>>>
>>>>>>>>>>> static final int FOO = lib.FOO();
>>>>>>>>>
>>>>>>>>> My vote is on 1., and when condy eventually becomes available,
>>>>>>>>> switch the more complex constants to that.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> Also, I tested this out with the winreg.h case I had, e.g.:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> typedef struct HKEY__ {
>>>>>>>>> int unused;
>>>>>>>>> } *HKEY;
>>>>>>>>>
>>>>>>>>> typedef unsigned __int64 ULONG_PTR;
>>>>>>>>> typedef long LONG;
>>>>>>>>>
>>>>>>>>> #define HKEY_LOCAL_MACHINE (( HKEY )
>>>>>>>>> (ULONG_PTR)((LONG)0x80000002) )
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> But unfortunately no constant was generated for
>>>>>>>>> HKEY_LOCAL_MACHINE. (Note that __int64 is an MSVC extension,
>>>>>>>>> maybe that's the problem?)
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> On a bit of a tangent; I'm wondering if we really need 2
>>>>>>>>> annotations for this? Could we instead use one annotation with
>>>>>>>>> a `Class carrier` and `String value` instead? i.e. we encode
>>>>>>>>> all constant values as Strings in the annotation and then
>>>>>>>>> reparse them when spinning the header impl.
>>>>>>>>>
>>>>>>>>> Jorn
>>>>>>>>>
>>>>>>>>> Maurizio Cimadamore schreef op 2019-02-28 00:19:
>>>>>>>>>> Hi,
>>>>>>>>>> here's a new revision where I just slightly improved the
>>>>>>>>>> TestConstants
>>>>>>>>>> test, to be more uniform (instead of passing expected
>>>>>>>>>> constant values,
>>>>>>>>>> the test now passes checker predicates, which is more scalable).
>>>>>>>>>>
>>>>>>>>>> Webrev:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v3/
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>> Maurizio
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 27/02/2019 19:02, Maurizio Cimadamore wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> this patch regularizes support for constant macros (and enum
>>>>>>>>>>> constants too).
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8219822_v2/
>>>>>>>>>>>
>>>>>>>>>>> The goal of this patch is to stop having jextract generating
>>>>>>>>>>> 'default' methods for constants. Instead, two new
>>>>>>>>>>> annotations are introduced:
>>>>>>>>>>>
>>>>>>>>>>> @NativeNumericConstant
>>>>>>>>>>> @NativeStringConstant
>>>>>>>>>>>
>>>>>>>>>>> The first takes a long value, the second a String value.
>>>>>>>>>>>
>>>>>>>>>>> So, given C code like this:
>>>>>>>>>>>
>>>>>>>>>>> #define FOO 1
>>>>>>>>>>>
>>>>>>>>>>> instead of generating a method like this:
>>>>>>>>>>>
>>>>>>>>>>> default int FOO() {
>>>>>>>>>>> return 1;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> We will now generate this:
>>>>>>>>>>>
>>>>>>>>>>> @NativeNumericConstant(1L)
>>>>>>>>>>> int FOO();
>>>>>>>>>>>
>>>>>>>>>>> And then binder will then generate the implementation.
>>>>>>>>>>>
>>>>>>>>>>> Decoupling the generation of the constant methods from
>>>>>>>>>>> jextract has a big advantage: we can now support complex
>>>>>>>>>>> constant types, such as strings (which need to be turned to
>>>>>>>>>>> Pointer<Byte>) and pointer constants (e.g. (void*)0). Note
>>>>>>>>>>> that, whenever we need some 'constant' pointer, we need to
>>>>>>>>>>> allocate, which means we need a scope - which we don't have
>>>>>>>>>>> at extraction time.
>>>>>>>>>>>
>>>>>>>>>>> Efficiency-wise, the binder-generated implementation should
>>>>>>>>>>> be as good as the jextract one - it just ldc a constant
>>>>>>>>>>> (which can even be a pointer constant, via CP patching!) and
>>>>>>>>>>> then returns it.
>>>>>>>>>>>
>>>>>>>>>>> In order to support pointer constants of the kind:
>>>>>>>>>>>
>>>>>>>>>>> #define PTR (void*)0
>>>>>>>>>>>
>>>>>>>>>>> I also tweaked MacroParser, so that now two attempts can be
>>>>>>>>>>> made at evaluating the macro:
>>>>>>>>>>>
>>>>>>>>>>> 1) a first attempt is made using a snippet like this:
>>>>>>>>>>>
>>>>>>>>>>> __auto_type jextract$var = PTR;
>>>>>>>>>>>
>>>>>>>>>>> If this fails (and it does fail in this case, as clang
>>>>>>>>>>> cursor evaluation API doesn't detect this constant), then:
>>>>>>>>>>>
>>>>>>>>>>> 2) we look at the type of the cursor, if the type is a
>>>>>>>>>>> pointer type, then we trigger another snippet parse, this
>>>>>>>>>>> time of:
>>>>>>>>>>>
>>>>>>>>>>> __auto_type jextract$var = (long)PTR;
>>>>>>>>>>>
>>>>>>>>>>> The extra cast will effectively reinterpret the value as a
>>>>>>>>>>> numeric constant. Then we'll create a macro whose type is
>>>>>>>>>>> the pointer type determined at (1) and whose value is the
>>>>>>>>>>> one determined at (2).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Note that, since we only execute step (2) when the type of
>>>>>>>>>>> the cursor in (1) is a pointer type, we don't end up
>>>>>>>>>>> executing the second step very often. I also tried some of
>>>>>>>>>>> the examples, and the extraction time is the same as before
>>>>>>>>>>> this patch.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> There is a slight issue with AsmCodeFactoryExt: ideally,
>>>>>>>>>>> we'd like for constant methods to be turned into static
>>>>>>>>>>> constants here; and we'd like to initialize such constants
>>>>>>>>>>> to the result of the computation of the bound constant
>>>>>>>>>>> method - that is, if the library has a method:
>>>>>>>>>>>
>>>>>>>>>>> @NativeNumericConstant(1L)
>>>>>>>>>>> int FOO();
>>>>>>>>>>>
>>>>>>>>>>> Then, the static wrapper should have the following static
>>>>>>>>>>> field:
>>>>>>>>>>>
>>>>>>>>>>> static final int FOO = lib.FOO();
>>>>>>>>>>>
>>>>>>>>>>> However, if we do so, a test will fail, namely LibEnumTest,
>>>>>>>>>>> as that test uses the constants of the static wrapper as
>>>>>>>>>>> labels in a 'switch' statement; this only works if the
>>>>>>>>>>> constant field has the 'ConstantValue' attribute, and that
>>>>>>>>>>> in turn is only possible if the constant is 'simple' - using
>>>>>>>>>>> a condy-based ConstantValue would be a way out, but that is
>>>>>>>>>>> not something that is supported currently, neither by the VM
>>>>>>>>>>> nor by javac.
>>>>>>>>>>>
>>>>>>>>>>> So we have a choice here:
>>>>>>>>>>>
>>>>>>>>>>> 1) we extract the value of the constant, and we create a
>>>>>>>>>>> plain constant field - e.g.
>>>>>>>>>>>
>>>>>>>>>>> static final int FOO = 1;
>>>>>>>>>>>
>>>>>>>>>>> which will then work as before (as a true compile-time
>>>>>>>>>>> constant)
>>>>>>>>>>>
>>>>>>>>>>> 2) we aim for regularity, and we just translate as
>>>>>>>>>>>
>>>>>>>>>>> static final int FOO = lib.FOO();
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The patch I've submitted does (1) for simple numeric
>>>>>>>>>>> constants and falls back to (2) for complex constants. I'm
>>>>>>>>>>> open to suggestions as to whether it's better to keep this
>>>>>>>>>>> behavior, or simplify it by always aiming for (2).
>>>>>>>>>>>
>>>>>>>>>>> Maurizio
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
More information about the panama-dev
mailing list