JDK 9 RFR of 6375303: Review use of caching in BigDecimal
Peter Levart
peter.levart at gmail.com
Sat Mar 22 09:01:10 UTC 2014
On 03/22/2014 02:01 AM, Brian Burkhalter wrote:
> On Mar 20, 2014, at 12:49 AM, Aleksey Shipilev
> <aleksey.shipilev at oracle.com <mailto:aleksey.shipilev at oracle.com>> wrote:
>
>> On 03/20/2014 11:06 AM, Peter Levart wrote:
>>> I was thinking about last night, for question: "Why is this
>>> double-checked non-volatile-then-volatile trick not any faster than pure
>>> volatile variant even on ARM platform where volatile read should have
>>> some penalty compared to normal read?", might be in the fact that
>>> Raspberry Pi is a single-core/single-thread "machine". Would anyone with
>>> JVM JIT compiler expertise care to share some insight? I suspect that on
>>> such platform, the compiler optimizes volatile accesses so that they are
>>> performed without otherwise necessary memory fences...
>>
>> Yes, at least C2 is known to not emit memory fences on uniprocessor
>> machines. You need to have a multicore ARM. If you are still interested,
>> contact me privately and I can arrange the access to my personal
>> quad-core Cortex-A9.
>
> This is all very interesting but I would like to have closure so to
> speak on the patches that I posted.
Hi Brian,
> I think the questions still unresolved are:
>
> 1) Is a change of toString() necessary?
I can't speak of that since I don't have empirical data in what
proportion of code it would be beneficial. I can imagine situations
where returned strings are used as keys in HashMap and String's equals
is optimized for same-instance comparison, but I don't know if such
situations happen in practice...
> 2) If #1 is “yes” then which version of the patch? (I think the most
> recent)
Aleksey Shipilev kindly let me access it's quad-code ARM "Big Iron" and
before the connection to it broke yesterday, I managed to get the
results of a single-threaded run of the microbenchmark. While
differences were not spectacular, they were measurable (sorry I haven't
saved the results). The variant with volatile stringCache field and CAS
was a little slower than the double-checked
nonvolatile-then-volatile-read-and-CAS trick.
It's also important that the fast-path method does not exceed the
maximum bytecode size for inlining (35 bytes by default, I think) so the
variant with two methods toString/toStringSlow is necessary to avoid
performance regression.
> 3) Are the other changes OK which are unrelated to toString()?
Looks good. Just a nit. In the following method:
3726 private static void matchScale(BigDecimal[] val) {
3727 if (val[0].scale == val[1].scale) {
3728 // return
3729 } else if (val[0].scale < val[1].scale) {
3730 val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
3731 } else if (val[1].scale < val[0].scale) {
3732 val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
3733 }
3734 }
One of 3 ifs is superfluous. Either the 1st one:
private static void matchScale(BigDecimal[] val) {
*/* if (val[0].scale == val[1].scale) {
// return
} else */* if (val[0].scale < val[1].scale) {
val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
} else if (val[1].scale < val[0].scale) {
val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
}
}
...or the last one:
private static void matchScale(BigDecimal[] val) {
if (val[0].scale == val[1].scale) {
// return
} else if (val[0].scale < val[1].scale) {
val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
} else*/* if (val[1].scale < val[0].scale) */* {
val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
}
}
Regards, Peter
>
> Thanks,
>
> Brian
More information about the core-libs-dev
mailing list