RFR(XS) 8249719: MethodHandle performance suffers from bad ResolvedMethodTable hash function

Andrei Pangin andrei.pangin at gmail.com
Mon Jul 27 20:29:46 UTC 2020


Thank you for the ideas. Checking the real bucket length looks attractive,
but as far as I see, it does not always work. First, because the table
expansion is triggered only after GC. Second, because it's still possible
to intentionally generate hash collisions, but I believe the JVM should not
crash even in the worst case scenario.

Printing the hash value and parsing the logs looks a bit too artificial to
me, while my original intention was to test the performance (or algorithmic
complexity) of the code rather than the particular hash code algorithm.

Anyway, if you all don't mind making the test manual, let me just
stick with this option - not to make the test much more complicated than
the fix itself :) especially if the table will be removed some time soon.

Here's the updated webrev, just in case. It differs only in changing
"timeout=300" to "manual".
http://cr.openjdk.java.net/~apangin/8249719/webrev.2/

Thanks again,
Andrei


пн, 27 июл. 2020 г. в 22:12, <coleen.phillimore at oracle.com>:

>
>
> On 7/27/20 2:59 PM, Thomas Stüfe wrote:
>
> A manual test may be enough too, I mean how probable is it that this
> particular error would ever resurface. Better would be, maybe, to just add
> an assert into the insert path which fires if a bucket length is above a
> certain threshold. That would catch all kinds of bad hashes, not just this
> one case.
>
> BTW maybe we should add a jcmd option to print this table, like we have
> for the symbol- and stringtable.
>
>
> Erik Osterlund removes this table with the New Invoke Bindings JEP so it's
> not worth adding more to this right now.
> thanks,
> Coleen
>
>
> Cheers Thomas
>
> On Mon, Jul 27, 2020 at 8:44 PM <coleen.phillimore at oracle.com> wrote:
>
>>
>>
>> On 7/27/20 2:33 PM, Volker Simonis wrote:
>>
>> I like this solution although it introduces a hard dependency on the
>> exact log format. But that's probably not a big issue.
>>
>> Another possibility would probably be to add a gtest.
>>
>> But I'd also be happy to make the current test a manual test.
>>
>> Let's see what Coleen's opinion is. We can use her as a tiebreaker :)
>>
>> Otherwise I'll leave it up to Andrei to choose his preferred option :)
>>
>>
>> So what I suggested is risky (I'm curious to see if it works).  My second
>> choice is to make it a manual test.  Hopefully we won't be changing up the
>> hashcode all the time and risk breaking this again.
>> thanks,
>> Coleen
>>
>>
>>
>> On Mon, Jul 27, 2020 at 8:04 PM Thomas Stüfe <thomas.stuefe at gmail.com>
>> wrote:
>>
>>> Hi Andrei,
>>>
>>> I was afraid this might happen, but only occurred to me after the review.
>>>
>>> My proposal would be:
>>>
>>> - in resolvedMethodTable.cpp, extend the log message in log_insert() and
>>> print out the hashcode too (e.g. "ResolvedMethod entry added for ...
>>> (4711)").
>>>
>>> - In the test, start your test as a sub process, give it
>>> -Xlog:membername+table=debug, and scan the output line for "ResolvedMethod
>>> entry added for <your class name> (<hash>)". Extract the hash. Read two or
>>> three lines and compare the hash.
>>>
>>> The advantage would be that you do not need to load 200k classes to see
>>> that your regression test works, and it is timing independent. For examples
>>> of sub process spawning and output scanning, there are many jtreg tests
>>> already (e.g. runtime/ErrorHandling has a few).
>>>
>>> Just my 5 cent. Maybe the others have different proposals.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>> On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin <andrei.pangin at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Coleen, Thomas, thank you for your reviews. It turns out though that my
>>>> new test times out on debug builds (thanks Volker for double-checking).
>>>>
>>>> Test Tier Platform Description
>>>> runtime/MemberName/ResolvedMethodTableHash.java tier1
>>>> linux-aarch64-debug Error: Program `...java' timed out (timeout set to
>>>> ...ms, elapsed time including timeout handling was ...ms).
>>>> runtime/MemberName/ResolvedMethodTableHash.java tier1 windows-x64-debug Error:
>>>> Program
>>>> `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>>>> timed out (timeout set to ...ms, elapsed time including timeout handling
>>>> was ...ms).
>>>> runtime/MemberName/ResolvedMethodTableHash.java tier1 macosx-x64-debug Error:
>>>> Program `...java' timed out (timeout set to ...ms, elapsed time including
>>>> timeout handling was ...ms).
>>>>
>>>> Apparently, we can't rely on timing, since this test runs two orders of
>>>> magnitude slower on debug JVM than on product one. I could simply mark the
>>>> test as manual, or is there a better idea? Maybe, adjust the parameters and
>>>> run the test automatically only on product or only on debug JVM? What do
>>>> you think?
>>>>
>>>> Regards,
>>>> Andrei
>>>>
>>>> сб, 25 июл. 2020 г. в 17:34, <coleen.phillimore at oracle.com>:
>>>>
>>>>>
>>>>> Hi Andrei,
>>>>> This looks good.  Thank you for finding this bug.  And thanks to
>>>>> Volker for sponsoring it as well.
>>>>> Nice to see you on the list, Andrei!
>>>>> Coleen
>>>>>
>>>>> On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>>>>
>>>>> Hi, Andrei,
>>>>>
>>>>> Good find. I played around with a test of generating lots of lambdas
>>>>> and yes, all the hashes are equal. With your patch invocation time went
>>>>> down by half (that was for 10000 lambdas).
>>>>>
>>>>> The test looks fine though the normal way to do this seems to be jcod.
>>>>> I personally don't care since the test is nice and self contained that way,
>>>>> but someone from the Oracle runtime group should confirm this is fine
>>>>> (ccing Coleen).
>>>>>
>>>>> JDK11 seems to be affected too.
>>>>>
>>>>> This probably also affects jruby.
>>>>>
>>>>> +1 from me.
>>>>>
>>>>> ..Thomas
>>>>>
>>>>> On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Please review a small fix to a not-so-small performance issue that
>>>>>> we've
>>>>>> seen when migrating a production application from JDK 8 to JDK 14.
>>>>>>
>>>>>> On certain workloads, where Nashorn produces thousands MethodHandles,
>>>>>> ResolvedMethodTable operations become extremely slow due to degenerate
>>>>>> hashcode. This patch basically fixes hashcode by including the method
>>>>>> holder's name in the computation. More details in the bug report.
>>>>>>
>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>>>>> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>>>>>
>>>>>> Tested: tier1-2, hotspot*runtime
>>>>>>
>>>>>> I'll be glad if someone could sponsor the patch.
>>>>>>
>>>>>> Thank you,
>>>>>> Andrei Pangin
>>>>>>
>>>>>
>>>>>
>>
>


More information about the hotspot-runtime-dev mailing list