Use of long in Nashorn
Attila Szegedi
szegedia at gmail.com
Fri Dec 4 17:39:14 UTC 2015
* In CodeGenerator SHR implementations (both self-assign and ordinary) you have method.shr() in loadStack instead of consumeStack. I was actually staring at this for a while as it seemed wrong to perform an operation in loadStack, but in the end I decided it’s okay like this. After all, it’s the toUint32 that’s the optimistic part here, so this should be fine (see the first sketch after this list). I think we had a JIRA issue saying “make SHR optimistic” but I can’t find it now. If it pops up, we can mark it as a duplicate of this one.
* I see "assert storeType != Type.LONG;” do we even need Type.LONG and LongType class anymore?
* Symbol.java: you could reclaim the HAS_LONG_VALUE bit by shifting the rest down by one
* Optimization idea: have versions of the callback invokers in NativeArray.java for both int and double indices. Since we know the length of the array when we enter forEach etc., we could select the double version when length > maxint and the int version otherwise. Actually, we could even have IteratorAction.forEach be overloaded for int and double, and write the body of IteratorAction.apply() to start out with the int version and only switch to the double version once the index crosses maxint (so even for large arrays we’d keep calling the int specialization of functions up to that point, which also covers the cases where iteration is short-circuited early). A rough sketch of this follows after the list.
* Array length: could we still have Nashorn APIs that return long? Optimistic filters will deal with these appropriately, won’t they? I guess they should, since they also need to be able to handle return values from POJO methods that return long (e.g. System.currentTimeMillis()). Hence, you could have NativeArray.length return “long” and let the optimistic machinery decide whether to treat it as int or double; that would let you avoid boxing the return value of NativeArray.length. (See the third sketch after this list for the kind of checks the filter would make.)
* NativeNumber: unused import?
* Uint32ArrayData: getBoxedElementType went from INTEGER to DOUBLE. I’m not sure I understand that. I mean, was INTEGER incorrect before?
* You didn’t remove LongArrayData.java?
* It looks like long[] arrays can now lose precision if passed through Java.from(). E.g. if you have Java methods “long[] getLongArray()” and “void setLongArray(long[] arr)”, then pojo.setLongArray(Java.from(pojo.getLongArray())) will lose precision because NativeJava.copyArray(long[]) produces double[]. Of course, we could argue that this is expected behavior if you explicitly use Java.from. Just passing around and manipulating a raw long[] won’t have that effect (although it’d be good to confirm that in a test); it requires an explicit Java.from(). Still, I wonder if it’d make sense to have copyArray(long[]) either return Object[] or choose dynamically between double[] and Object[] based on the maximum magnitude of an element (you can start with double[] and reallocate as Object[] when you bump into a large long). The last sketch after this list shows one possible shape of that.
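
A few sketches for the points above. These are only illustrations of the ideas, in plain Java with names of my own choosing; they are not the actual Nashorn code.

On the SHR point: the uint32 result of “x >>> s” can only exceed Integer.MAX_VALUE when the effective shift is zero and x is negative, which is why the ToUint32 step is the part that has to be optimistic:

    // JS "x >>> s" expressed with Java primitives. For (s & 31) != 0 the result is
    // always a non-negative int; only a zero shift of a negative x yields a uint32
    // value above Integer.MAX_VALUE, forcing a double.
    static long unsignedShiftRight(final int x, final int s) {
        return (x >>> (s & 31)) & 0xFFFF_FFFFL;
    }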
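
On the int/double callback invokers: a hypothetical shape of the idea, not the actual NativeArray/IteratorAction code:

    // Use an int-specialized callback invoker while the index fits in an int and
    // fall back to a double-specialized invoker only for the remaining indices.
    abstract class IndexedIteration {
        abstract void applyInt(Object callback, int index);       // hypothetical int-specialized invoker
        abstract void applyDouble(Object callback, double index); // hypothetical double-specialized invoker

        void forEach(final Object callback, final long length) {
            final long intEnd = Math.min(length, Integer.MAX_VALUE);
            for (int i = 0; i < intEnd; i++) {
                applyInt(callback, i);         // the common, fast path
            }
            for (double d = intEnd; d < length; d++) {
                applyDouble(callback, d);      // indices that no longer fit in an int
            }
        }
    }

In the apply() variants that can stop early (some, every), an iteration that ends below maxint never reaches the second loop, so the int specialization keeps being used in the short-circuited cases.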
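
On array length and long return values in general: the narrowing checks the optimistic machinery needs for a long coming back from Java are cheap. Conceptually they are just these (the actual filters deoptimize at the failing program point instead of returning a wider value):

    // Conservative checks for narrowing a long return value under optimistic typing.
    static boolean fitsInInt(final long l)    { return (int) l == l; }
    static boolean fitsInDouble(final long l) { return Math.abs(l) <= (1L << 53); } // 53-bit significand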
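
On Java.from and long[]: one possible shape of a copyArray(long[]) that starts out with double[] and only falls back to Object[] when it meets an element a double cannot represent exactly (again a sketch, not the current NativeJava code):

    static Object copyLongArray(final long[] src) {
        final double[] doubles = new double[src.length];
        for (int i = 0; i < src.length; i++) {
            final long l = src[i];
            if (Math.abs(l) > (1L << 53)) {
                // Conservatively treat anything beyond 53 bits of magnitude as not
                // exactly representable by a double: fall back to Object[] with
                // boxed Longs so no precision is lost.
                final Object[] objects = new Object[src.length];
                for (int j = 0; j < src.length; j++) {
                    objects[j] = src[j]; // autoboxed to Long
                }
                return objects;
            }
            doubles[i] = l;
        }
        return doubles;
    }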
Other than that: great work! Nice to see large swaths of code removed.
Attila.
> On Dec 4, 2015, at 4:27 PM, Hannes Wallnoefer <hannes.wallnoefer at oracle.com> wrote:
>
> After receiving another long/precision related bug last week I decided to go ahead with the removal of longs in Nashorn. It's been quite an effort, but I think it looks good now. Below are the links to the webrev and Jira issue.
>
> http://cr.openjdk.java.net/~hannesw/8144020/
> https://bugs.openjdk.java.net/browse/JDK-8144020
>
> This is a rather big patch, but it mostly removes code. There are over 2000 deletions vs. 400 insertions. I removed all uses of long in our code where it represented JS numbers, including ScriptObject property accessors with longs as key or value, and the LongArrayData class. With this, all JS numbers are represented as int or double in Nashorn. Longs will not "naturally" occur anymore and only be present as java.lang.Long instances.
>
> As expected, the areas that demanded special care were those where ES prescribes use of uint32. These are mostly unsigned right shift, Uint32Array elements, and the length property of arrays. For right shift and Uint32Array elements, I added optimistic implementations that return int if possible and deoptimize to double. Pdfjs and mandreel are benchmarks that make use of these features, and I didn't notice any observable impact on performance. Even when I simulated fallback to double performance was still ok (previously reported performance regressions for this case were in fact caused by an error of mine).
>
> For the Array length property, I changed the getter in NativeArray to return int or double depending on actual value. Previously, the length getter always returned long. Since the property is actually evaluated by OptimisticTypesCalculator, for-loops with an array length as limit now use ints instead of longs to iterate through array indices, which is probably a good thing.
>
> As for longs returned by Java code, their value is always preserved. Only when they are used for JS math will they be converted to double, as prescribed by ECMA. When running with optimistic types we check whether a long fits into an int or double to avoid deoptimization to object. Previously we did this only when converting long to optimistic int, but optimistic double did not use any return filter for longs, so a long returned for an optimistic double could easily lose precision.
>
> You can find the related changes in OptimisticReturnFilters.java. I also added tests to make sure long values are preserved in various optimistic and non-optimistic contexts, some of which would previously have failed.
>
> A previous version of this patch included functionality to treat only ints and doubles or their wrapper objects as native JS numbers (e.g. you could invoke Number.prototype methods only on ints and doubles). However, this is quite a hairy issue and I reckoned the patch was large enough without it, so I pulled it out and will fix it separately as JDK-8143896.
>
> I've been testing and refining this patch for the last few days and think it looks pretty good. I thought it was a good idea to discuss this in this existing thread before posting a review request. Please let me know what you think.
>
> Thanks,
> Hannes
>
>
> Am 2015-11-13 um 15:06 schrieb Attila Szegedi:
>> Well, we could just box them in that case. If we only used int and double as primitive number types internally, then a natural representation for a long becomes Long. Java methods returning long could have an optimistic filter that first checks if the value fits in an int (32-bit signed), then in a double (53-bit signed) without loss of precision, and finally deoptimizes to Object and uses Long. int->long->double->Object becomes int->double->Object, with longs of too large magnitude becoming boxed.
>>
>> Attila.
>>
>>> On Nov 13, 2015, at 2:46 PM, Jim Laskey (Oracle)<james.laskey at oracle.com> wrote:
>>>
>>> The main thing to watch for here is that longs/Longs need to survive unobstructed between Java calls. Remember we ran into trouble in this area (loss of precision when using longs for encoding).
>>>
>>>
>>>
>>>
>>>> On Nov 13, 2015, at 8:26 AM, Attila Szegedi<attila.szegedi at oracle.com> wrote:
>>>>
>>>>> On Nov 13, 2015, at 10:31 AM, Hannes Wallnoefer<hannes.wallnoefer at oracle.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm currently questioning our use of longs to represent numbers in combination with optimistic typing.
>>>> I often feel that longs are more hassle than they’re worth. I’d be all for eliminating them.
>>>>
>>>>> After all, there's a pretty large range where long arithmetic will be more precise than the doubles required by ECMA.
>>>> There are currently several places in Nashorn where we try to confine the precision of longs to 53 bits (so they aren’t more precise than doubles). It’s a bit of a whack-a-mole where you can’t always be sure whether you’ve found all instances.
>>>>
>>>>> For context see this bug, especially the last two comments (the original bug was about number to string conversion which has been solved in the meantime):
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8065910
>>>>>
>>>>> So the question is: can we use longs at all and be confident that results won't be too precise (and thus, strictly speaking, incorrect)?
>>>> Internally, the longs are also used for representing UInt32 even in non-optimistic setting, which is only really significant for the >>> operator and array indices and lengths; maybe we should limit longs to that usage only, or even just use doubles internally for UInt32 values that can’t be represented as Int32. FWIW, even for the >>> operator we only need it when shifting by zero, as in every other case the result will have the topmost bit set to 0 and thus fit in Int32 too.
>>>>
>>>> I guess once Valhalla rolls around, we could easily have an unsigned 32 bit int type.
>>>>
>>>>> Hannes
>>>> Attila.
>>>>
>