volatile and caches question

Vitaly Davidovich vitalyd at gmail.com
Wed Jul 23 15:26:25 UTC 2014


Hotspot emits a StoreLoad fence for volatile writes and a compiler-only
barrier for volatile loads.  Currently, the StoreLoad is implemented as
"lock add [r/esp], 0".  This is semantically a no-op but has a synchronizing
effect.  The resulting cpu/memory behavior is detailed by various online
x86 resources.
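As a minimal sketch of where those barriers land (class and field names
here are illustrative, not from HotSpot itself):

```java
// Sketch: the volatile store below is where HotSpot emits the
// StoreLoad barrier ("lock add [r/esp], 0" on x86); the volatile
// load compiles to a plain MOV plus a compiler-only reordering
// constraint, with no fence instruction on x86.
public class Flag {
    private volatile boolean ready;

    public void publish() {
        ready = true;    // volatile store: MOV + lock add [r/esp], 0
    }

    public boolean poll() {
        return ready;    // volatile load: plain MOV, compiler barrier only
    }
}
```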

Sent from my phone
On Jul 23, 2014 10:13 AM, "Winnie JayClay" <winniejayclay at gmail.com> wrote:

> Hi Aleksey,
>
>
> Thanks, but I have a hotspot and x86 question, not so much about the
> specification. What is the actual hotspot behavior on x86 for these
> two scenarios?
>
> Also, personally, I have no idea why you mentioned RSDN.
>
>
>
> On Wednesday, July 23, 2014, Aleksey Shipilev
> <aleksey.shipilev at oracle.com> wrote:
>
> > Hi,
> >
> > This is a good question for concurrency-interest, not hotspot-dev:
> >   http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
> >
> > Most of your questions seem to be answered by the JSR 133 Cookbook,
> > which describes the conservative approach to conform to Java Memory
> > Model: http://gee.cs.oswego.edu/dl/jmm/cookbook.html
> >
> > On 07/23/2014 05:05 PM, Winnie JayClay wrote:
> > > Say, if I have a class with a volatile variable, but only one thread
> > > operates on that variable (reads and updates it), will memory always
> > > be flushed on x86 when the same thread reads and writes? What is the
> > > overhead, and do you have any optimizations in the JDK?
> >
> > "Memory flush" has no place since 2005.
> >
> > VMs are free to optimize volatile accesses, as long as those
> > optimizations fit the Java Memory Model. For example, multiple
> > volatile accesses in a constructor can be optimized, since we know
> > the variable is not yet exposed to other threads. For variables
> > already on the heap, it is generally unknown whether we can optimize
> > the accesses without breaking the JMM (there are cases where we can
> > do simple optimizations, see the Cookbook).
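A minimal sketch of the constructor case (class and field names are
hypothetical): while `this` has not escaped, other threads cannot observe
the field, so the VM may treat the volatile stores like plain ones.

```java
// Sketch: inside the constructor, 'this' is not yet visible to
// other threads, so the JIT may elide barriers for these volatile
// stores or merge them, without breaking the JMM.
public class Counter {
    private volatile long value;

    public Counter() {
        value = 0;   // volatile store, but 'this' has not escaped:
        value = 1;   // the VM may coalesce these and drop the barriers
    }

    public long get() {
        return value;
    }
}
```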
> >
> > > Also, if I have multiple threads which operate on a single
> > > volatile variable (one writer and many readers, where the writer
> > > doesn't write much), will caches be flushed every time the readers
> > > access the volatile variable, even when the writer didn't write
> > > anything?
> >
> > I'm not following what "caches" need "flushing" in this case. Cache
> > coherency already takes care of propagating the values.
> > Low-level-hardware-speaking, the memory semantics around "volatiles"
> > are about exposing the data to cache coherency in the proper order.
> >
> > On x86, once the writer has committed the volatile write and the CPU
> > store buffers have drained into the memory subsystem, regular
> > coherency ensures that readers read the consistent value. In this
> > parlance, reading a volatile variable is almost no different from
> > reading a plain one (not exactly, since compiler optimizations which
> > would break the provisions of the JMM are also inhibited).
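A minimal sketch of that compiler-side difference (names are
illustrative): with a plain field, the JIT may hoist the read out of the
loop and spin forever; `volatile` forbids that hoisting, and on x86 the
cost is the lost optimization, not a fence on each read.

```java
// Sketch: 'volatile' keeps the JIT from hoisting the read of 'stop'
// out of the loop. On x86 each iteration re-reads the field with a
// plain MOV; no fence instruction is emitted for the load.
public class Spin {
    public volatile boolean stop;   // without 'volatile', the JIT may
                                    // hoist the read and loop forever

    public void run() {
        while (!stop) {
            // busy-wait until another thread sets 'stop'
        }
    }
}
```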
> >
> > > I also thought to use a normal non-final, non-volatile variable,
> > > and to have the writer thread create and invoke a synchronized
> > > block somewhere after it updates the variable to establish
> > > happens-before, i.e. just a synchronized block to flush caches for
> > > reader threads to pick up the latest value. By the way, is my
> > > understanding correct that if the writer invokes a synchronized
> > > block somewhere else, one not available to the readers, the readers
> > > will get the latest value?
> >
> > A lone synchronized{} block used only in the writer is not enough to
> > establish happens-before with a reader which does not use
> > synchronized{} on the same object. Moreover, a synchronized{} block
> > on a non-escaped object may be eliminated altogether, since its
> > memory effects are not required to be visible to any other code.
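A minimal sketch of the pattern that does work (class and field names are
hypothetical): both the writer and the reader must synchronize on the
same monitor, so the writer's unlock happens-before the reader's lock.

```java
// Sketch: the happens-before edge comes from the unlock/lock pair on
// the SAME monitor. A synchronized block the reader never touches
// establishes no edge at all.
public class Shared {
    private final Object lock = new Object();
    private int data;   // plain field, guarded by 'lock'

    public void write(int v) {
        synchronized (lock) {   // unlock of 'lock' on exit...
            data = v;
        }
    }

    public int read() {
        synchronized (lock) {   // ...happens-before this lock acquire
            return data;
        }
    }
}
```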
> >
> > > Thanks for the help; we work on an HFT project in Java, and
> > > performance is super-critical for us.
> >
> > Judging from the questions alone (yes, I read RSDN occasionally), I
> > would personally recommend learning what the JMM guarantees by
> > itself (Java Language Spec), then how the JMM is conservatively
> > implemented (JSR 133 Cookbook, etc.), then the hardware guarantees
> > (Intel/AMD SDMs), and then keeping the difference between all three
> > clearly in mind. If you have questions about these,
> > concurrency-interest@ has a good supply of folks who are ready to
> > discuss them.
> >
> > Thanks,
> > -Aleksey.
> >
>