Use of long in Nashorn
Hannes Wallnoefer
hannes.wallnoefer at oracle.com
Fri Dec 11 15:08:02 UTC 2015
I uploaded a new webrev that includes most of the changes you suggested.
Conversion of long[] from Java now works without losing precision, using
int, double, or Object arrays. I also added a test for this.
http://cr.openjdk.java.net/~hannesw/8144020/webrev.01/
I didn't implement the int/double overloading of array iterator actions.
Unless I missed something, I would have to implement two forEach methods
in each subclass, which seems ugly and error-prone.
Additionally, I removed the ArrayData.set method that takes a long
value, something I had overlooked in my previous patch.
Hannes
On 2015-12-06 at 11:12, Hannes Wallnoefer wrote:
> Thanks for the quick review, Attila. Answers inline.
>
> On 2015-12-04 at 18:39, Attila Szegedi wrote:
>> * In CodeGenerator SHR implementations (both self-assign and
>> ordinary) you have method.shr() in loadStack instead of consumeStack.
>> I was actually staring at this for a while as it seemed wrong to
>> perform an operation in loadStack, but in the end I decided it’s okay
>> like this. After all, it’s the toUint32 that’s the optimistic part
>> here, so this should be fine indeed. I think we had a JIRA issue
>> saying “make SHR optimistic” but I can’t find it now. If it pops up,
>> we can mark it as a duplicate of this one.
>
> I've looked for that Jira issue but didn't find it either.
>
>> * I see "assert storeType != Type.LONG;” do we even need Type.LONG
>> and LongType class anymore?
>
> That assert is a leftover from the conversion process; it shouldn't be
> needed anymore. We do still use Type.LONG for creating and handling
> the primitive fields and spill slots with dual fields. That's why I
> had to keep it.
>
>> * Symbol.java: you could reclaim the HAS_LONG_VALUE bit by shifting
>> the rest down by one
>
> Will do.
>>
>> * optimization idea: have versions of callback invokers in
>> NativeArray.java for both int and double indices. Since we know the
>> length of the array when we enter forEach etc. we could select the
>> double version when length > maxint and the int version otherwise.
>> Actually, we could even have IteratorAction.forEach be overloaded for
>> int and double, and write the body of IteratorAction.apply() to start
>> out with the int version, and when the index crosses maxint start
>> calling the double version (so even for large arrays we'll keep
>> calling the int specialization of the callback in cases where
>> iteration is short-circuited early).
>
> Nice idea, and should be easy to implement. I'll try it out.
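>
> Something along these lines, perhaps (a rough sketch; the real
> IteratorAction carries more state, and the method names here are
> assumed):
>
>     // Iterate with int indices while possible, switch to the double
>     // overload once the index would cross Integer.MAX_VALUE.
>     abstract class IteratorAction {
>         protected abstract Object  getElement(double index);
>         protected abstract boolean forEach(Object val, int index)    throws Throwable;
>         protected abstract boolean forEach(Object val, double index) throws Throwable;
>
>         boolean apply(final double length) throws Throwable {
>             final int intLength = (int) Math.min(length, Integer.MAX_VALUE);
>             for (int i = 0; i < intLength; i++) {
>                 if (!forEach(getElement(i), i)) return false;    // int-specialized
>             }
>             for (double d = intLength; d < length; d++) {
>                 if (!forEach(getElement(d), d)) return false;    // rare double path
>             }
>             return true;
>         }
>     }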
>>
>> * array length: could we still have Nashorn APIs that return long?
>> Optimistic filters will deal with these appropriately, won’t they? I
>> guess they should since they also need to be able to handle return
>> values from POJO methods that return long (e.g.
>> System.currentTimeMillis()). Hence, you could have NativeArray.length
>> return “long” and let the optimistic machinery decide whether to cast
>> it as int or double. That would allow you to not have to box the
>> return value of NativeArray.length.
>
> Yes, we could have things returning long, but they will deoptimize to
> Object. OptimisticReturnFilters (which do the runtime checks) are not
> used for ScriptObject properties.
>
>> * NativeNumber: unused import?
> Fixed.
>
>> * Uint32ArrayData: getBoxedElementType went from INTEGER to DOUBLE.
>> I’m not sure I understand that. I mean, was INTEGER incorrect before?
>
> That was obviously incorrect before. Actually, that method is only
> used in NativeArray#concat and will never be invoked on typed arrays.
> That NativeArray#concat method itself looks a bit fishy to me, as it
> assumes all NativeArrays use ContinuousArrayData. I have to
> investigate this further.
>
> Back to the issue at hand: int.class/Integer.class is definitely the
> wrong element type for Uint32. When getElementType returns int.class,
> optimistic code that uses the optimistic int getter gets incredibly
> slow when it actually has to deoptimize to double, because we keep
> trying to handle elements as ints. (I had this in my code at one time
> and found pdfjs slowed to a crawl when I changed the optimistic int
> getter to always deoptimize to double.)
>
> Probably getBoxedElementType should just be a final method in
> ContinuousArrayData instead of an abstract one, converting the result
> of getElementType to its boxed counterpart on the fly.
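>
> I.e., something like this (a minimal sketch, assuming int and double
> are the only primitive element types left):
>
>     abstract class ContinuousArrayData {    // simplified for illustration
>         public abstract Class<?> getElementType();
>
>         // final instead of abstract: derive the boxed type on the fly
>         public final Class<?> getBoxedElementType() {
>             final Class<?> type = getElementType();
>             if (type == int.class)    return Integer.class;
>             if (type == double.class) return Double.class;
>             return Object.class;    // remaining element types are references
>         }
>     }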
>
>
>>
>> * You didn’t remove LongArrayData.java?
>
> I think I did:
> http://cr.openjdk.java.net/~hannesw/8144020/webrev.00/src/jdk.scripting.nashorn/share/classes/jdk/nashorn/internal/runtime/arrays/LongArrayData.java.patch
>>
>> * It looks like long[] arrays can now lose precision if passed
>> through Java.from(). E.g., if you have Java methods "long[]
>> getLongArray()" and "void setLongArray(long[] arr)", then
>> pojo.setLongArray(Java.from(pojo.getLongArray())) will lose precision
>> because NativeJava.copyArray(long[]) produces double[]. Of course, we
>> could argue that this is expected behavior if you explicitly use
>> Java.from. Just passing around and manipulating a raw long[] won't
>> have that effect (although it'd be good to confirm that in a test);
>> it requires an explicit Java.from(). Still, I wonder if it'd make
>> sense to have copyArray(long[]) either return Object[] or choose
>> dynamically between double[] and Object[] based on the maximum
>> magnitude of an element (you could start with double[] and reallocate
>> as Object[] when you bump into a large long).
> Good catch. I'll see if I can use existing code in ArrayData to
> convert to the narrowest array type.
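>
> Roughly like this, I imagine (a sketch; the actual patch will probably
> reuse the existing narrowing logic in ArrayData instead, and MAX_EXACT
> is an illustrative name):
>
>     // largest long magnitude a double can represent exactly: 2^53
>     private static final long MAX_EXACT = 1L << 53;
>
>     static Object copyArray(final long[] src) {
>         final double[] doubles = new double[src.length];
>         for (int i = 0; i < src.length; i++) {
>             final long l = src[i];
>             if (l < -MAX_EXACT || l > MAX_EXACT) {
>                 return widenToObjects(src, doubles, i);    // large long found
>             }
>             doubles[i] = l;
>         }
>         return doubles;
>     }
>
>     private static Object[] widenToObjects(final long[] src, final double[] done, final int from) {
>         final Object[] objs = new Object[src.length];
>         for (int i = 0; i < from; i++) {
>             objs[i] = done[i];    // values already safely converted to double
>         }
>         for (int i = from; i < src.length; i++) {
>             final long l = src[i];
>             if (l < -MAX_EXACT || l > MAX_EXACT) {
>                 objs[i] = l;             // too large: keep as a boxed Long
>             } else {
>                 objs[i] = (double) l;    // representable: box as Double
>             }
>         }
>         return objs;
>     }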
>
> Thanks!
>
> Hannes
>
>> Other than that: great work! Nice to see large swaths of code removed.
>>
>> Attila.
>>
>>> On Dec 4, 2015, at 4:27 PM, Hannes Wallnoefer
>>> <hannes.wallnoefer at oracle.com> wrote:
>>>
>>> After receiving another long/precision-related bug last week, I
>>> decided to go ahead with the removal of longs in Nashorn. It's been
>>> quite an effort, but I think it looks good now. Below are the links
>>> to the webrev and Jira issue.
>>>
>>> http://cr.openjdk.java.net/~hannesw/8144020/
>>> https://bugs.openjdk.java.net/browse/JDK-8144020
>>>
>>> This is a rather big patch, but it mostly removes code. There are
>>> over 2000 deletions vs. 400 insertions. I removed all uses of long
>>> in our code where it represented JS numbers, including ScriptObject
>>> property accessors with longs as key or value, and the LongArrayData
>>> class. With this, all JS numbers are represented as int or double in
>>> Nashorn. Longs will no longer occur "naturally" and will only be
>>> present as java.lang.Long instances.
>>>
>>> As expected, the areas that demanded special care were those where
>>> ES prescribes use of uint32. These are mostly unsigned right shift,
>>> Uint32Array elements, and the length property of arrays. For right
>>> shift and Uint32Array elements, I added optimistic implementations
>>> that return int if possible and deoptimize to double. Pdfjs and
>>> mandreel are benchmarks that make use of these features, and I
>>> didn't notice any observable impact on performance. Even when I
>>> simulated fallback to double, performance was still OK (previously
>>> reported performance regressions for this case were in fact caused
>>> by an error of mine).
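>>>
>>> The gist of the optimistic implementation (a simplified sketch; in
>>> the real code the bailout is an UnwarrantedOptimismException carrying
>>> the wider value, which triggers recompilation of the call site):
>>>
>>>     // Optimistic toUint32: return int while the sign bit is clear,
>>>     // otherwise bail out so the call site switches to the double version.
>>>     static int toUint32Optimistic(final int bits) {
>>>         if (bits >= 0) {
>>>             return bits;                    // uint32 value fits in a signed int
>>>         }
>>>         throw new ArithmeticException();    // stand-in for deoptimization
>>>     }
>>>
>>>     static double toUint32Double(final int bits) {
>>>         return bits & 0xFFFF_FFFFL;         // exact: any uint32 fits in a double
>>>     }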
>>>
>>> For the Array length property, I changed the getter in NativeArray
>>> to return int or double depending on actual value. Previously, the
>>> length getter always returned long. Since the property is actually
>>> evaluated by OptimisticTypesCalculator, for-loops with an array
>>> length as limit now use ints instead of longs to iterate through
>>> array indices, which is probably a good thing.
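>>>
>>> In essence, the getter now selects the return value like this (a
>>> minimal sketch):
>>>
>>>     // JS array length is a uint32, i.e. in [0, 2^32 - 1]
>>>     static Number lengthValue(final long length) {
>>>         if (length <= Integer.MAX_VALUE) {
>>>             return Integer.valueOf((int) length);    // common case: fits in int
>>>         }
>>>         return Double.valueOf(length);    // large lengths are exact as double
>>>     }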
>>>
>>> As for longs returned by Java code, their value is always preserved.
>>> Only when they are used for JS math are they converted to double,
>>> as prescribed by ECMA. When running with optimistic types, we check
>>> whether a long fits into an int or double to avoid deoptimization to
>>> Object. Previously we did this only when converting long to
>>> optimistic int, but optimistic double did not use any return filter
>>> for longs, so a long returned for an optimistic double could easily
>>> lose precision.
>>>
>>> You can find the related changes in OptimisticReturnFilters.java. I
>>> also added tests to make sure long values are preserved in various
>>> optimistic and non-optimistic contexts, some of which would
>>> previously have failed.
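>>>
>>> The checks boil down to something like this (simplified sketch; the
>>> real filters throw UnwarrantedOptimismException with the boxed value
>>> rather than returning a boolean):
>>>
>>>     // Can this long be returned from an optimistic int call site?
>>>     static boolean fitsInt(final long l) {
>>>         return (int) l == l;
>>>     }
>>>
>>>     // Can it be returned from an optimistic double call site without
>>>     // losing precision? Doubles represent integers exactly up to 2^53.
>>>     static boolean fitsDouble(final long l) {
>>>         return -(1L << 53) <= l && l <= (1L << 53);
>>>     }
>>>
>>> If neither fits, the call site deoptimizes to Object and the value
>>> stays a boxed Long.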
>>>
>>> A previous version of this patch included functionality to treat
>>> only ints and doubles (or their wrapper objects) as native JS
>>> numbers (e.g. you could invoke Number.prototype methods only on ints
>>> and doubles). However, this is quite a hairy issue, and I reckoned
>>> the patch was large enough without it, so I pulled it out and will
>>> fix it separately as JDK-8143896.
>>>
>>> I've been testing and refining this patch for the last few days and
>>> think it looks pretty good. I thought it was a good idea to discuss
>>> it in this existing thread before posting a review request. Please
>>> let me know what you think.
>>>
>>> Thanks,
>>> Hannes
>>>
>>>
>>> On 2015-11-13 at 15:06, Attila Szegedi wrote:
>>>> Well, we could just box them in that case. If we only used int and
>>>> double as primitive number types internally, then a natural
>>>> representation for a long becomes Long. Java methods returning long
>>>> could have an optimistic filter that first checks if the value fits
>>>> in an int (32-bit signed), then in a double (53-bit signed) without
>>>> loss of precision, and finally deoptimizes to Object and uses Long.
>>>> int->long->double->Object becomes int->double->Object, with longs
>>>> of too large magnitude becoming boxed.
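>>>>
>>>> For concrete values, that chain would give (illustrative):
>>>>
>>>>     //  42L             -> int    42
>>>>     //  10_000_000_000L -> double 1.0E10    (exact; within 53 bits)
>>>>     //  (1L << 60) + 1  -> Object (boxed Long; a double would round it)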
>>>>
>>>> Attila.
>>>>
>>>>> On Nov 13, 2015, at 2:46 PM, Jim Laskey
>>>>> (Oracle) <james.laskey at oracle.com> wrote:
>>>>>
>>>>> The main thing to watch for here is that longs/Longs need to
>>>>> survive unobstructed between Java calls. Remember we ran into
>>>>> trouble in this area before (loss of precision when using longs
>>>>> for encoding).
>>>>>
>>>>>
>>>>>> On Nov 13, 2015, at 8:26 AM, Attila
>>>>>> Szegedi <attila.szegedi at oracle.com> wrote:
>>>>>>
>>>>>>> On Nov 13, 2015, at 10:31 AM, Hannes
>>>>>>> Wallnoefer <hannes.wallnoefer at oracle.com> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm currently questioning our use of longs to represent numbers
>>>>>>> in combination with optimistic typing.
>>>>>> I often feel that longs are more hassle than they're worth. I'd
>>>>>> be all for eliminating them.
>>>>>>
>>>>>>> After all, there's a pretty large range where long arithmetic
>>>>>>> will be more precise than the doubles required by ECMA.
>>>>>> There are currently several different places in Nashorn where we
>>>>>> try to confine the precision of longs to 53 bits (so they aren’t
>>>>>> more precise than doubles). It’s a bit of a whack-a-mole where
>>>>>> you can’t always be sure whether you found all instances.
>>>>>>
>>>>>>> For context see this bug, especially the last two comments (the
>>>>>>> original bug was about number-to-string conversion, which has
>>>>>>> been solved in the meantime):
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8065910
>>>>>>>
>>>>>>> So the question is: can we use longs at all and be confident
>>>>>>> that results won't be too precise (and thus, strictly speaking,
>>>>>>> incorrect)?
>>>>>> Internally, the longs are also used for representing UInt32 even
>>>>>> in non-optimistic setting, which is only really significant for
>>>>>> the >>> operator and array indices and lengths; maybe we should
>>>>>> limit longs to that usage only, or even just use doubles
>>>>>> internally for UInt32 values that can’t be represented as Int32.
>>>>>> FWIW, even for the >>> operator we only need it when shifting by
>>>>>> zero, as in every other case the result will have the topmost bit
>>>>>> set to 0 and thus fit in Int32 too.
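>>>>>>
>>>>>> E.g., in JS terms:
>>>>>>
>>>>>>     //  -1 >>> 0  ===  4294967295    // top bit set: needs uint32
>>>>>>     //  -1 >>> 1  ===  2147483647    // top bit clear: fits in Int32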
>>>>>>
>>>>>> I guess once Valhalla rolls around, we could easily have an
>>>>>> unsigned 32 bit int type.
>>>>>>
>>>>>>> Hannes
>>>>>> Attila.
>>>>>>
>