RFR(XS) 8249719: MethodHandle performance suffers from bad ResolvedMethodTable hash function
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Mon Jul 27 21:08:01 UTC 2020
+1.
On 7/27/20 4:37 PM, Thomas Stüfe wrote:
> I’m fine with that version.
>
> Thanks, Thomas
>
> On Mon 27. Jul 2020 at 22:30, Andrei Pangin <andrei.pangin at gmail.com
> <mailto:andrei.pangin at gmail.com>> wrote:
>
> Thank you for the ideas. Checking the real bucket length looks
> attractive, but as far as I see, it does not always work. First,
> because the table expansion is triggered only after GC. Second,
> because it's still possible to intentionally generate hash
> collisions, but I believe the JVM should not crash even in the
> worst case scenario.
>
yes, I agree. It might not be safe.
Making this manual is fine.
Thanks for fixing this.
Coleen
>
>
> Printing the hash value and parsing the logs looks a bit too
> artificial to me, while my original intention was to test the
> performance (or algorithmic complexity) of the code rather than
> the particular hash code algorithm.
>
> Anyway, if you all don't mind making the test manual, let me just
> stick with this option - not to make the test much more
> complicated than the fix itself :) especially if the table will be
> removed some time soon.
>
> Here's the updated webrev, just in case. It differs only in
> changing "timeout=300" to "manual".
> http://cr.openjdk.java.net/~apangin/8249719/webrev.2/
>
> Thanks again,
> Andrei
>
>
> пн, 27 июл. 2020 г. в 22:12, <coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>>:
>
>
>
> On 7/27/20 2:59 PM, Thomas Stüfe wrote:
>> A manual test may be enough too, I mean how probable is it
>> that this particular error would ever resurface. Better would
>> be, maybe, to just add an assert into the insert path which
>> fires if a bucket length is above a certain threshold. That
>> would catch all kinds of bad hashes, not just this one case.
>>
>> BTW maybe we should add a jcmd option to print this table,
>> like we have for the symbol- and stringtable.
>
> Erik Osterlund removes this table with the New Invoke Bindings
> JEP so it's not worth adding more to this right now.
> thanks,
> Coleen
>>
>> Cheers Thomas
>>
>> On Mon, Jul 27, 2020 at 8:44 PM <coleen.phillimore at oracle.com
>> <mailto:coleen.phillimore at oracle.com>> wrote:
>>
>>
>>
>> On 7/27/20 2:33 PM, Volker Simonis wrote:
>>> I like this solution although it introduces a hard
>>> dependency on the exact log format. But that's probably
>>> not a big issue.
>>>
>>> Another possibility would probably be to add a gtest.
>>>
>>> But I'd also be happy to make the current test a manual
>>> test.
>>>
>>> Let's see what Coleen's opinion is. We can use her as a
>>> tiebreaker :)
>>>
>>> Otherwise I'll leave it up to Andrei to choose his
>>> preferred option :)
>>
>> So what I suggested is risky (I'm curious to see if it
>> works). My second choice is to make it a manual test.
>> Hopefully we won't be changing up the hashcode all the
>> time and risk breaking this again.
>> thanks,
>> Coleen
>>>
>>> On Mon, Jul 27, 2020 at 8:04 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com
>>> <mailto:thomas.stuefe at gmail.com>> wrote:
>>>
>>> Hi Andrei,
>>>
>>> I was afraid this might happen, but only occurred to
>>> me after the review.
>>>
>>> My proposal would be:
>>>
>>> - in resolvedMethodTable.cpp, extend the log message
>>> in log_insert() and print out the hashcode too (e.g.
>>> "ResolvedMethod entry added for ... (4711)").
>>>
>>> - In the test, start your test as a sub process,
>>> give it -Xlog:membername+table=debug, and scan the
>>> output line for "ResolvedMethod entry added for
>>> <your class name> (<hash>)". Extract the hash. Read
>>> two or three lines and compare the hash.
>>>
>>> The advantage would be that you do not need to load
>>> 200k classes to see that your regression test works,
>>> and it is timing independent. For examples of sub
>>> process spawning and output scanning, there are many
>>> jtreg tests already (e.g. runtime/ErrorHandling has
>>> a few).
>>>
>>> Just my 5 cent. Maybe the others have different
>>> proposals.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>> On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin
>>> <andrei.pangin at gmail.com
>>> <mailto:andrei.pangin at gmail.com>> wrote:
>>>
>>> Hi,
>>>
>>> Coleen, Thomas, thank you for your reviews. It
>>> turns out though that my new test times out on
>>> debug builds (thanks Volker for double-checking).
>>>
>>> Test Tier Platform Description
>>> runtime/MemberName/ResolvedMethodTableHash.java
>>> tier1 linux-aarch64-debug Error: Program
>>> `...java' timed out (timeout set to ...ms,
>>> elapsed time including timeout handling was ...ms).
>>> runtime/MemberName/ResolvedMethodTableHash.java
>>> tier1 windows-x64-debug Error: Program
>>> `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>>> timed out (timeout set to ...ms, elapsed time
>>> including timeout handling was ...ms).
>>> runtime/MemberName/ResolvedMethodTableHash.java
>>> tier1 macosx-x64-debug Error: Program
>>> `...java' timed out (timeout set to ...ms,
>>> elapsed time including timeout handling was ...ms).
>>>
>>>
>>> Apparently, we can't rely on timing, since this
>>> test runs two orders of magnitude slower on
>>> debug JVM than on product one. I could simply
>>> mark the test as manual, or is there a better
>>> idea? Maybe, adjust the parameters and run the
>>> test automatically only on product or only on
>>> debug JVM? What do you think?
>>>
>>> Regards,
>>> Andrei
>>>
>>> сб, 25 июл. 2020 г. в 17:34,
>>> <coleen.phillimore at oracle.com
>>> <mailto:coleen.phillimore at oracle.com>>:
>>>
>>>
>>> Hi Andrei,
>>> This looks good. Thank you for finding this
>>> bug. And thanks to Volker for sponsoring it
>>> as well.
>>> Nice to see you on the list, Andrei!
>>> Coleen
>>>
>>> On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>>> Hi, Andrei,
>>>>
>>>> Good find. I played around with a test of
>>>> generating lots of lambdas and yes, all the
>>>> hashes are equal. With your patch
>>>> invocation time went down by half (that was
>>>> for 10000 lambdas).
>>>>
>>>> The test looks fine though the normal way
>>>> to do this seems to be jcod. I personally
>>>> don't care since the test is nice and self
>>>> contained that way, but someone from the
>>>> Oracle runtime group should confirm this is
>>>> fine (ccing Coleen).
>>>>
>>>> JDK11 seems to be affected too.
>>>>
>>>> This probably also affects jruby.
>>>>
>>>> +1 from me.
>>>>
>>>> ..Thomas
>>>>
>>>> On Fri, Jul 24, 2020 at 4:53 PM Andrei
>>>> Pangin <andrei.pangin at gmail.com
>>>> <mailto:andrei.pangin at gmail.com>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Please review a small fix to a
>>>> not-so-small performance issue that we've
>>>> seen when migrating a production
>>>> application from JDK 8 to JDK 14.
>>>>
>>>> On certain workloads, where Nashorn
>>>> produces thousands MethodHandles,
>>>> ResolvedMethodTable operations become
>>>> extremely slow due to degenerate
>>>> hashcode. This patch basically fixes
>>>> hashcode by including the method
>>>> holder's name in the computation. More
>>>> details in the bug report.
>>>>
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8249719
>>>> Webrev:
>>>> https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>>>
>>>> Tested: tier1-2, hotspot*runtime
>>>>
>>>> I'll be glad if someone could sponsor
>>>> the patch.
>>>>
>>>> Thank you,
>>>> Andrei Pangin
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list