discuss about release barrier for final fields initialization
Kuai Wei
kuaiwei.kw at alibaba-inc.com
Tue Jan 9 06:23:59 UTC 2024
Hi,
I ran some experiments on object allocation performance and found that on aarch64 N1, if an object has a final field, its allocation rate is about 75% of the normal allocation rate.
The cause is that C2 inserts a release membar in <init>, which is emitted as "dmb ish" on aarch64. For a normal allocation, a storestore membar is inserted and
emitted as "dmb ishst", and that makes the difference. The JMH test is at https://gist.github.com/kuaiwei/f71fba40df29991c93325a8600e34c13 and I ran it with:
java -jar target/benchmarks.jar -f 1 -wi 5 -w 3 -i 3 -r 3 testAlloc
...
Benchmark                       Mode  Cnt     Score    Error  Units
AllocFinal.testAlloc           thrpt    3  1167.903 ± 44.973  ops/s
AllocFinal.testAllocWithFinal  thrpt    3   915.330 ± 52.596  ops/s
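
For readers who do not open the gist, the two object shapes being compared can be sketched roughly as below. This is only an illustration, not the benchmark itself: the names AllocShapes, Plain, WithFinal and sink are hypothetical, and a plain loop like this is not a substitute for the JMH harness.

```java
final class AllocShapes {
    // Ordinary object: per the description above, C2 ends the allocation
    // with a storestore membar ("dmb ishst" on aarch64).
    static final class Plain {
        int x;
        Plain(int x) { this.x = x; }
    }

    // Same layout, but the field is final: per the description above,
    // C2 ends <init> with a release membar ("dmb ish" on aarch64).
    static final class WithFinal {
        final int x;
        WithFinal(int x) { this.x = x; }
    }

    // Hypothetical sink that lets the objects escape, so the allocations
    // are not trivially removed by escape analysis.
    static Object sink;

    static long allocPlain(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Plain p = new Plain(i);
            sink = p;
            sum += p.x;
        }
        return sum;
    }

    static long allocWithFinal(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            WithFinal p = new WithFinal(i);
            sink = p;
            sum += p.x;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(allocPlain(1000));      // prints 499500
        System.out.println(allocWithFinal(1000));  // prints 499500
    }
}
```

The only source-level difference between the two classes is the final modifier on the field; any timing difference has to be measured with a proper harness such as the JMH test linked above.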
I found that only C2 inserts a release membar; C1 inserts just a storestore for both final and normal allocations. According to Doug Lea's cookbook (https://gee.cs.oswego.edu/dl/jmm/cookbook.html),
only storestore is required. Alex has a great post on this topic: https://shipilev.net/blog/2014/all-fields-are-final/ . It refers to a case that explains why loadstore is needed: https://www.hboehm.info/c++mm/no_write_fences.html
I checked this case and, IMO, it looks like only some legacy architectures may break data dependencies and cause the problem. As far as I know, the Alpha architecture is an example. I don't think it
breaks on modern architectures. Is there any other case I have missed?
If storestore is enough in this situation, I will send a PR to relax the barrier.
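
To make the scenario concrete, here is a minimal sketch of the racy publication that the barrier at the end of <init> has to keep safe. The names RacyPublish, Holder, shared and badObservations are hypothetical. The sketch only exercises the guarantee that a reader who sees the reference also sees the initialized final field; it cannot show which barrier is sufficient, and it would not be expected to fail on typical hardware even with a weaker barrier.

```java
final class RacyPublish {
    static final class Holder {
        final int value;
        Holder(int value) { this.value = value; }
    }

    // Deliberately a plain (non-volatile) field: the reference is
    // published through a data race.
    static Holder shared;

    // Returns how many times the reader saw a Holder whose final field
    // did not hold the value written in the constructor.
    static int badObservations(int iterations) {
        shared = null;
        int[] bad = new int[1];

        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                shared = new Holder(42); // racy publication
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                Holder h = shared;       // racy read, may be null
                if (h != null && h.value != 42) {
                    bad[0]++;
                }
            }
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join(); // join makes bad[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return bad[0];
    }

    public static void main(String[] args) {
        // The final-field semantics of the Java Memory Model require 0 here.
        System.out.println(badObservations(1_000_000));
    }
}
```

Both loops are bounded, so the sketch terminates even if the JIT hoists the racy read out of the reader loop.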
Thanks,
Kuai Wei