RFR(M): 8136445: Performance issue with Nashorn and C2's global code motion
Doerr, Martin
martin.doerr at sap.com
Fri Dec 4 14:02:36 UTC 2015
Hi Vladimir,
thank you very much for your review.
I have changed the Node_Stack initialization as you suggested.
In addition, I changed the iterator to really support removal of nodes:
uint idx = MIN2(_stack.index(), self->outcnt()); // Support removal of nodes.
I believe it's currently not needed in openjdk, but it may be needed in the future.
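
To illustrate why the clamp matters, here is a minimal sketch outside of HotSpot. ToyNode and visit_remaining_outs are hypothetical names that do not exist in the JDK sources, and the downward traversal is an assumption of this sketch. It only shows that a saved index can exceed the out count once users have been removed, and that taking the minimum keeps the access in bounds:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C2 node: only the list of users (outs) matters.
struct ToyNode {
    std::vector<int> outs;
    std::size_t outcnt() const { return outs.size(); }
};

// Visits the outs below saved_idx, from high index to low, and returns them.
// The clamp mirrors MIN2(_stack.index(), self->outcnt()): if outs were
// removed after saved_idx was recorded, iteration starts at the new end.
std::vector<int> visit_remaining_outs(const ToyNode& self, std::size_t saved_idx) {
    std::size_t idx = std::min(saved_idx, self.outcnt());
    std::vector<int> visited;
    while (idx > 0) {
        --idx;
        assert(idx < self.outcnt());  // always in bounds thanks to the clamp
        visited.push_back(self.outs[idx]);
    }
    return visited;
}
```

Without the clamp, a saved index of 4 on a node that has shrunk to 2 outs would read past the end of the list.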
Hope this is ok.
The new webrev is here:
http://cr.openjdk.java.net/~mdoerr/8136445_c2_gcm/webrev.01/
Can anybody volunteer to sponsor, please?
Best regards,
Martin
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Thursday, December 3, 2015 22:17
To: Doerr, Martin <martin.doerr at sap.com>; Roland Westrelin (roland.westrelin at oracle.com) <roland.westrelin at oracle.com>; hotspot-compiler-dev at openjdk.java.net
Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Subject: Re: RFR(M): 8136445: Performance issue with Nashorn and C2's global code motion
You reversed the 8011858 changes, which had made the stack much smaller (live_nodes was usually 1/10 of unique nodes):
- stack.map((C->live_nodes() >> 1) + 16, NULL);
+ Node_Stack stack(arena, (C->unique() >> 2) + 16); // pre-grow
Please, use live_nodes with your >> 2 change:
Node_Stack stack(arena, (C->live_nodes() >> 2) + 16); // pre-grow
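
As a rough illustration of the difference, the pre-grow formula can be evaluated for both counts. This is a standalone sketch: initial_stack_capacity is a hypothetical helper, not a HotSpot function, and the node counts are made-up values chosen to match the 1/10 ratio mentioned above:

```cpp
#include <cstdint>

// Hypothetical helper: the pre-grow formula from this thread,
// (count >> 2) + 16, applied to an arbitrary node count.
constexpr std::uint32_t initial_stack_capacity(std::uint32_t node_count) {
    return (node_count >> 2) + 16;
}

// Illustrative numbers only: 100000 unique nodes, of which 10000 are live.
constexpr std::uint32_t kUniqueNodes = 100000;
constexpr std::uint32_t kLiveNodes = 10000;

static_assert(initial_stack_capacity(kUniqueNodes) == 25016,
              "sizing by unique nodes");
static_assert(initial_stack_capacity(kLiveNodes) == 2516,
              "sizing by live nodes");
```

With these assumed counts, sizing by live_nodes pre-allocates about a tenth of the entries, and the stack can still grow if a deeper traversal needs it.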
The iterator changes seem fine to me.
Thanks,
Vladimir
On 12/3/15 9:17 AM, Doerr, Martin wrote:
> Hi,
>
> I have implemented a change which makes Node_Backward_Iterator more efficient for large graphs. The purpose is to fix
> the performance problem we observe in Octane benchmarks.
>
> It lowers compile time dramatically when JvmtiExport::_can_access_local_variables is on.
>
> The webrev is here:
>
> http://cr.openjdk.java.net/~mdoerr/8136445_c2_gcm/webrev.00/
>
> Please review.
>
> The previous version uses an initial node stack size of (C->unique() >> 1) + 16 which can become pretty large.
>
> My webrev changes it to (C->unique() >> 2) + 16, which is still large. I didn't observe any resizing caused by the stack being too small.
>
> I guess the stack depth typically stays far below this value, but it may be ok to spend e.g. 0.5 MB in extreme cases.
>
> How was that previous value determined? Should I implement it differently?
>
> Best regards,
>
> Martin
>