MaxBCEAEstimateSize and inlining clarification

Vitaly Davidovich vitalyd at gmail.com
Wed Sep 14 15:46:03 UTC 2016


Hi Vladimir,

Do OSR compilations run EA? I'm looking at some code (roughly) like this:

while (true) {
    for (Entry<...> e : concurrentHashMap.entrySet()) {
        // e does not escape
    }
    Thread.sleep(...);
}

I see the enclosing method OSR compiled, but the iterator and entry aren't
eliminated.  Makes me wonder if OSR doesn't do EA.  Is that the case?
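
A minimal, self-contained sketch of that shape, in case someone wants to reproduce it. The class name, method names, key/value types and iteration counts are all made up for illustration; the real code differs. runRounds keeps the iteration directly inside a long-running loop, so the method is a likely OSR candidate, while sumValues is the same iteration in a method that can reach a normal (non-OSR) compile for comparison.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reproducer; names and sizes are assumptions, not the original code.
final class OsrLoopSketch {

    // Same iteration in a small method that is called many times,
    // so it can get a standard (non-OSR) compilation.
    static long sumValues(ConcurrentHashMap<String, Integer> map) {
        long sum = 0;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            sum += e.getValue(); // e is only read here, never stored or returned
        }
        return sum;
    }

    // Iteration nested in a long-running loop of a method entered once:
    // the typical situation in which the JIT compiles the method via OSR.
    static long runRounds(ConcurrentHashMap<String, Integer> map, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 1_000; i++) {
            map.put("k" + i, i);
        }
        System.out.println(runRounds(map, 100_000));
    }
}
```

Running this with -XX:+PrintCompilation shows which methods get compiled (OSR compilations are marked with '%' in that output), and comparing GC activity against a run with -XX:-DoEscapeAnalysis gives a rough indication of whether the iterator and entries are being eliminated.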

Thanks


On Tuesday, September 13, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> If the allocation is done locally in a loop, it can be scalar replaced
> (SR), though that is not guaranteed:
>
> for () {
>   Foo f = new Foo();
> }
>
> "Currently" we can't SR it if there is a merge:
>
> Foo f = new Foo();
> for () {
>   f = new Foo();
> }
> x = f.x;
>
> Also, we can't SR an array if it is accessed with a variable index,
> because we can't map the loads/stores to a concrete element:
>
> int[] a = new int[3];
> for (i) {
>   x = a[i];
> }
>
> If elements are accessed without index (using array to pass or return
> several values) or a loop is fully unrolled we can SR it:
>
> x0 = a[0];
> x1 = a[1];
> x2 = a[2];
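
The four patterns above, written out as compilable Java so they can be fed to a test harness. This is only a sketch: the class, the method names and the Foo type are invented for illustration, and whether C2 actually scalar replaces any of these in a given run still depends on inlining and profile data.

```java
// Hypothetical illustration of the patterns described above; all names are assumptions.
final class ScalarReplacementPatterns {

    static final class Foo {
        int x;
        Foo(int x) { this.x = x; }
    }

    // Allocation is local to the loop body: a candidate for SR (not guaranteed).
    static int localInLoop(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            Foo f = new Foo(i);
            sum += f.x;
        }
        return sum;
    }

    // Two allocation sites merge at the read of f.x after the loop,
    // which is the case described as not scalar replaceable.
    static int mergedAfterLoop(int n) {
        Foo f = new Foo(-1);
        for (int i = 0; i < n; i++) {
            f = new Foo(i);
        }
        return f.x;
    }

    // Array read with a variable index: loads cannot be mapped to one element.
    static int indexedArray(int n) {
        int[] a = {1, 2, 3};
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i % 3];
        }
        return sum;
    }

    // Array used only with constant indices (e.g. to carry several values).
    static int constantIndices(int p, int q, int r) {
        int[] a = {p, q, r};
        return a[0] + a[1] + a[2];
    }
}
```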
>
> Regards,
> Vladimir
>
> On 9/13/16 12:55 PM, Ruslan Cheremin wrote:
>
>>> There was also another thread a few months back where I was asking why a
>>> small local array allocation wasn't scalarized, and the answer there was
>>> ordering between loop unrolling and EA passes (I can
>>> dig up that thread if you're interested).
>>
>> It would be very nice, please -- I've tried to google it by myself
>> (because you've noted it already in the thread) but wasn't able to guess
>> right keywords :)
>>
>>
>> 2016-09-13 22:44 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>>
>>
>>
>>     On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin
>>     <cheremin at gmail.com> wrote:
>>
>>         >how it can be made stable to the point where you can rely/depend
>> on it for performance.
>>
>>         Well, the same can be said about any JIT optimization -- (maybe
>>         it is time to rename dynamic runtime to stochastic runtime?).
>>         Personally I see SR as having the same order of stability as
>>         inlining. Actually, apart from a few SR-specific issues (like
>>         with merge points), EA/SR mostly follow inlining: if you have
>>         enough scope inlined you'll have, say, an 80% chance of SR. From
>>         my perspective it is inlining which is so surprisingly unstable.
>>
>>     Yeah, I'd agree.  The difference, in my mind, is failing to inline a
>> function may not have as drastic performance implications as failing to
>> eliminate temporaries.
>>
>>
>>         BTW: have you considered sharing your experience with EA/SR
>>         pitfalls? Even if "increase likelihood" is the best option
>>         available -- there is still very little information about it on
>>         the net.
>>
>>     I'm kind of doing that via the few emails on this list :).  I think
>> you pretty much covered the biggest (apparent) flake in the equation -
>> inlining, which can fail for all sorts of different
>>     reasons.  Beyond that, there's the control flow insensitive aspect of
>> the EA, which is tangentially related to inlining (or lack thereof).
>>
>>     There was also another thread a few months back where I was asking
>> why a small local array allocation wasn't scalarized, and the answer there
>> was ordering between loop unrolling and EA passes (I
>>     can dig up that thread if you're interested).  The bizarre thing
>> there was the loop operation was folded into a constant, and the compiled
>> method was returning a constant value, but the array
>>     allocation was left behind (although it wasn't needed).
>>
>>     I agree that there isn't much information about EA in Hotspot
>> (there's a lot of handwaving and inaccuracies online).  In particular, it'd
>> be nice if the performance wiki had a section on making
>>     user code play well with EA (just like it has guidance on some other
>> JIT aspects currently).
>>
>>
>>         ----
>>         Ruslan
>>
>>
>>
>>         2016-09-13 21:33 GMT+03:00 Vitaly Davidovich
>>         <vitalyd at gmail.com>:
>>
>>
>>
>>             On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin
>>             <cheremin at gmail.com> wrote:
>>
>>                 >That's my understanding as well (and matches what I'm
>> seeing in some synthetic test harnesses).
>>
>>                 Ok, I just wanted to clear it up, because it is not the
>>                 first time I've seen BCEA... mentioned in the context of
>>                 scalar replacement, and I was starting to doubt my eyes :)
>>
>>                 >It's pretty brittle, sadly, and more importantly, unstable.
>>
>>                 Making similar experiments I see the same. E.g. a
>>                 HashMap.get(TupleKey) lookup can be successfully
>>                 scalarized in 99% of cases, but scalarization broke once
>>                 the key generation scheme was changed slightly -- because
>>                 the hashcode distribution became worse, HashMap buckets
>>                 started converting themselves to TreeBins, and the TreeBin
>>                 code is a much harder task for EA.
>>
>>                 Another can of worms is the mismatch between different
>>                 inlining heuristics. E.g. the FreqInlineSize and
>>                 InlineSmallCode thresholds may give different decisions
>>                 for the same piece of code, and which inlining decision is
>>                 taken depends on whether the method was already compiled
>>                 or not -- which depends on the finest details of
>>                 initialization order and execution profile. These
>>                 scenarios became rarer in 1.8 with InlineSmallCode
>>                 increased, but I'm not sure they are gone...
>>
>>                 Currently, I'm starting to think code needs to be written
>>                 specifically with EA/SR in mind to be more-or-less stably
>>                 scalarized. I.e. you can't get it for free (or it will be
>>                 unstable).
>>
>>             I'm not sure this is practical, to be honest, at least for a
>> big enough application.  I've long considered EA (and scalar replacement)
>> as a bonus optimization, and never to rely on it if
>>             the allocations would hurt otherwise.  I'm just a bit
>> surprised *just* how unstable it appears to be, in the "simplest" of cases.
>>
>>             I think code can be written to increase likelihood of scalar
>> replacement, but I just can't see how it can be made stable to the point
>> where you can rely/depend on it for performance.
>>
>>
>>                 ----
>>                 Ruslan
>>
>>
>>                 2016-09-13 20:51 GMT+03:00 Vitaly Davidovich
>>                 <vitalyd at gmail.com>:
>>
>>
>>
>>                     On Tuesday, September 13, 2016, Cheremin Ruslan
>>                     <cheremin at gmail.com> wrote:
>>
>>                         > I'm seeing some code that iterates over a
>> ConcurrentHashMap's entrySet that allocates tens of GB of CHM$MapEntry
>> objects even though they don't escape
>>
>>
>>                         I'm a bit confused: I was sure BCEA-style params
>>                         do affect EA, but don't affect scalar replacement.
>>                         With bcEscapeAnalyzer you can get (sort of)
>>                         inter-procedural EA, but this only allows you to
>>                         have more allocations identified as ArgEscape
>>                         instead of GlobalEscape. But you can't get more
>>                         NoEscape without real inlining. ArgEscape (afaik)
>>                         is used only for synchronization removal in
>>                         HotSpot, not for scalar replacement.
>>
>>                         Am I incorrect?
>>
>>                     That's my understanding as well (and matches what I'm
>> seeing in some synthetic test harnesses).
>>
>>                     I'm generally seeing a lot of variability in scalar
>> replacement in particular, all driven by profile data.  HashMap<Integer,
>> ...>::get(int) sometimes works at eliminating the box
>>                     and sometimes doesn't - the difference appears to be
>> whether Integer::equals is inlined or not, which in turn depends on whether
>> the lookup finds something or not and whether the
>>                     number of successful lookups reaches compilation
>> threshold. It's pretty brittle, sadly, and more importantly, unstable.
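
For anyone who wants to poke at the HashMap<Integer, ...>::get(int) case, here is a small sketch of the kind of lookup being described. The class and method names, key range and loop counts are assumptions chosen for illustration. Keys are kept outside -128..127 because Integer.valueOf serves that range from a cache by default, so no box would be allocated for small keys in the first place; half of the lookups miss, so both the hit and the miss path get profiled.

```java
import java.util.HashMap;

// Hypothetical harness; names and sizes are assumptions, not the original test.
final class BoxedLookupSketch {

    // map.get(key) autoboxes the int via Integer.valueOf(key); the box is
    // only needed for hashCode()/equals() inside HashMap.get.
    static int lengthOf(HashMap<Integer, String> map, int key) {
        String v = map.get(key);
        return v == null ? -1 : v.length();
    }

    public static void main(String[] args) {
        HashMap<Integer, String> map = new HashMap<>();
        for (int i = 1_000; i < 2_000; i += 2) {
            map.put(i, "v" + i); // only even keys are present
        }
        long hits = 0;
        for (int round = 0; round < 20_000; round++) {
            for (int k = 1_000; k < 2_000; k++) {
                if (lengthOf(map, k) >= 0) {
                    hits++;
                }
            }
        }
        System.out.println(hits);
    }
}
```

Changing the ratio of hits to misses (for example, populating every key or none) is one way to see how the profile influences whether Integer::equals gets inlined and the box goes away.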
>>
>>
>>
>>                         ----
>>                         Ruslan
>>
>>
>>



More information about the hotspot-compiler-dev mailing list