Ping: RFR: JDK-8205523: Explicit barriers for interpreter

Thu Jun 28 09:06:19 UTC 2018

There are other reasons why I would like to distinguish reads and writes:
- write-barriers in Shenandoah generate *much* more code
- I am not sure we can actually easily generate a write-barrier for the
CRC32 intrinsic. Do we have an interpreter frame?

I thought about this a little more. We could trim the API down, and
still retain the flexibility to differentiate between reads and writes.
Instead of:

oop resolve_read_read(oop obj);
oop resolve_read_write(oop obj);

we could do:

oop resolve(DecoratorSet decorators, oop obj);

and pass something like ACCESS_READ or ACCESS_WRITE (kind of what we
already have in C1... probably simply rename those C1-specific
decorators). If the backend doesn't see any decorators, it would do the
safe thing.
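
As a rough, standalone sketch of that idea (a toy model, not HotSpot
code): the decorator bits, ToyObject, select_barrier() and the two
barrier helpers below are all hypothetical names made up for
illustration. The assumption here is that "the safe thing" means falling
back to the stronger write barrier when neither decorator is present.

```cpp
#include <cstdint>

// Toy stand-ins for illustration only; these are NOT the HotSpot definitions.
typedef std::uint64_t DecoratorSet;
const DecoratorSet DECORATORS_NONE = 0;
const DecoratorSet ACCESS_READ     = DecoratorSet(1) << 0;  // hypothetical bit
const DecoratorSet ACCESS_WRITE    = DecoratorSet(1) << 1;  // hypothetical bit

struct ToyObject {
  ToyObject* forwardee;      // itself, or the to-space copy once evacuated
  bool in_collection_set;    // true if the object sits in a region being evacuated
  int payload;
};
typedef ToyObject* oop;

enum BarrierKind { READ_BARRIER, WRITE_BARRIER };

// One entry point decides which barrier to use from the decorators.
// With no read/write decorator we assume the conservative choice: the write barrier.
BarrierKind select_barrier(DecoratorSet decorators) {
  if ((decorators & ACCESS_WRITE) != 0) return WRITE_BARRIER;
  if ((decorators & ACCESS_READ) != 0)  return READ_BARRIER;
  return WRITE_BARRIER;
}

// Toy read barrier: just follow the forwarding pointer.
oop read_barrier(oop obj) { return obj->forwardee; }

// Toy write barrier: make sure a to-space copy exists before handing it out.
// The copy is never freed; a real collector owns that memory.
oop write_barrier(oop obj) {
  if (obj->forwardee == obj && obj->in_collection_set) {
    ToyObject* copy = new ToyObject(*obj);
    copy->forwardee = copy;
    copy->in_collection_set = false;
    obj->forwardee = copy;
  }
  return obj->forwardee;
}

oop resolve(DecoratorSet decorators, oop obj) {
  if (obj == nullptr) return nullptr;
  return select_barrier(decorators) == WRITE_BARRIER ? write_barrier(obj)
                                                     : read_barrier(obj);
}
```

A caller that only reads would pass ACCESS_READ and get the cheap path;
a caller that passes nothing still gets a correct (if heavier) result.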

What do you think? Maybe even extend the runtime-access-interface to
take the same?

Roman

> Hi Andrew and Roman,
> 
> I am a fan of profile-guided optimization. I would definitely not mind
> introducing these concepts in the compilers where they are with no doubt
> necessary (and we also have the right tools for dealing with this
> better). In fact, they already have read/write decorators that could be
> used for resolve barriers in our compilers, and can use algorithms to
> safely elide barriers where provably correct, so it makes perfect sense
> for me to use such concepts there.
> I'm just not sure that the interpreter needs to be polluted with this
> conceptual overhead, unless there is at least one benchmark that can
> show that we are solving an actual problem with this. Remember,
> premature optimizations are the root of all evil. In your experience,
> have you ever observed a difference in any application or benchmark, due
> to the fewer than a handful of paths in the interpreter using a slightly
> suboptimal barrier? If so, I could change my mind.
> 
> Thanks,
> /Erik
> 
> On 2018-06-27 12:44, Andrew Haley wrote:
>> On 06/27/2018 11:42 AM, Roman Kennke wrote:
>>
>>> It should be noted that normal loads and stores are already covered
>>> by the Access API, and we emit the correct read and write
>>> barriers. This is about a few places that don't easily fit that
>>> model and still require barriers. Those are monitor enter/exit (which
>>> need WBs anyway) and a few intrinsics (in the interpreter, only
>>> CRC32). CRC32 requires an RB on the buffer array. Yeah, it's probably
>>> OK to emit WB there too. If it's really hot, it'd be compiled by C1
>>> or C2.
>> Right, but as you correctly note we'll have exactly the same
>> discussion about C1, with the same points made.
>>
>