JEP 254: Compact Strings

Tue Jun 2 07:30:21 UTC 2015

TL;DR: In principle, we'd love to do more early testing of Hotspot / JDK
features, but our benchmarks are, honestly, not all that great.  We end up
having to test against live services, which makes this sort of thing really
hard.

More info than you need:

There are two real problems here:

1) To do apples-to-apples comparisons, we have to make sure that *our*
patches all work with whatever version of Hotspot we're testing.

2) Pulling down a new JDK9 - even an official release - usually means that
there are a lot of instabilities, half-finished work, and inefficiencies,
so we can't really run tests very well against it.  That's not a knock on
Hotspot developers; the only way to know about some of these problems is to
run the JDK in infrastructure like ours.  ( An example of something that
hit us hard that no one else would notice:
http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/5ba37c4c0578 )

It took us months to forward port all of our patches to JDK8, and we've
spent the last six months getting it to the point that we're comfortable
enough to ship to our users (just in time for JDK7 EOL!).  That's required
disabling tiered compilation, heavily tweaking code cache flushing (which
is still causing us CPU regressions), rewriting various parts of the
metaspace to behave more efficiently, and fixing various incompatibilities
with our internal patches.  That's completely apart from the dozens of
backwards incompatibilities introduced in JDK8 that triggered a very, very,
very large code cleanup effort, from the new hash iteration order to the
Unicode update to the fact that if you call flush on a closed
BufferedOutputStream it now throws an exception.

(We actually ended up randomizing our hash iteration order, which helps us
guard against broken code, is slightly more secure, and means that we never
get bitten by that as part of an upgrade again.)
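
As an illustration only: the sketch below is not the JDK patch described
above. It simulates a randomized iteration order by shuffling a map's keys
with a seed, so that code depending on one particular order shows up
immediately, and contrasts it with code that sorts the keys first. The
class and method names (IterationOrderSketch, randomizedKeys, sortedKeys)
are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration; not part of any JDK or internal patch.
final class IterationOrderSketch {

    // Joins keys in exactly the order they are given.
    static String joinInOrder(List<String> keys) {
        return String.join(",", keys);
    }

    // Simulates a randomized iteration order: same keys, seed-dependent order.
    static List<String> randomizedKeys(Map<String, Integer> map, long seed) {
        List<String> keys = new ArrayList<>(map.keySet());
        Collections.shuffle(keys, new Random(seed));
        return keys;
    }

    // Order-independent alternative: sort the keys before using them.
    static List<String> sortedKeys(Map<String, Integer> map) {
        List<String> keys = new ArrayList<>(map.keySet());
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("gamma", 3);
        map.put("alpha", 1);
        map.put("beta", 2);

        // Code that relies on iteration order may see a different result per seed.
        for (long seed = 0; seed < 3; seed++) {
            System.out.println("randomized: " + joinInOrder(randomizedKeys(map, seed)));
        }
        // Code that sorts first sees the same result regardless of hash order.
        System.out.println("sorted:     " + joinInOrder(sortedKeys(map)));
    }
}
```

Code that only ever consumes the sorted form is unaffected by whatever
order the underlying hash table happens to produce.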

In short, upgrading is in no sense cheap for us, and trying out new
features is hard.  We usually restrict ourselves to using new features that
can be more-or-less cleanly patched to the version of the JDK we're using
and hidden behind a flag.

This is an important enough change that we might be able to make some
effort, but we'll have to see how it goes.

Jeremy

On Mon, Jun 1, 2015 at 1:31 AM, Aleksey Shipilev <
aleksey.shipilev at oracle.com> wrote:

> (getting back to this)
>
> Hi Jeremy,
>
> On 05/16/2015 03:34 AM, Jeremy Manson wrote:
> > So, I'm pretty dubious, mostly because of the risks mentioned in the JEP.
> > If you need a flag-check and two code paths for every String method,
> > that's going to make the String class slower and more bloated, and make
> > it very difficult for the JIT compiler to do its job inlining and
> > intrinsifying calls to String methods.
>
> Yes, we know that's a potential problem, e.g. outlined here:
>  http://cr.openjdk.java.net/~shade/density/equals.txt
>
> The hope is that the string coder check would be amortized by the
> substantial performance improvement with the ubiquitous Latin1
> (optimized) operations. Also, getting a few code generation quirks
> kicked out may further offset the perceived performance costs of doing
> this (you can do such a trick every so often, but not all the time).
>
>
> > The proposed change here has the potential of doing the opposite with
> > most String operations - trading off less GC overhead for more mutator
> > cost.  But String operations are a pretty big chunk of CPU time, on the
> > whole.
>
> The thing is, many mutator ops on Strings are also improved, because the
> data become more easily cacheable and/or require fewer steps to complete
> (think vectorization that takes half as many instructions).
>
>
> > Does anyone really have a sense of how to make this kind of decision?
> > The JEP seems mostly to be hoping that other organizations will do the
> > testing for you.
>
> It is not true that the JEP hopes to have other organizations do the
> testing for it. The JEP tries to make clear that this is a
> performance-sensitive change, so early testing and feedback are much
> appreciated. So, if you have String-intensive workloads in your org, can
> you try running the prototype JDK against them? Our early runs on our
> workloads of interest show appealing improvements.
>
> That is, the decision to integrate this has not been made yet, as we
> don't have the complete performance picture and/or a fully-tested
> prototype. In other words, there are quite a few blank spots to fill in.
> Your data may be part of that picture when we decide whether to integrate
> in JDK 9.
>
>
> > (I agree that it is worth doing some experimentation in this area, but I
> > wanted to say this early, because if I could reach back in time and tell
> > you *not* to make the substring change, I would.  We seriously considered
> > simply backing it out locally, but it would have been a lot of effort for
> > us to maintain that kind of patch, and we didn't want our performance
> > tradeoffs to be that much different from the stock JDK's.)
>
> This is your golden ticket: if you come back with concrete data in your
> hands saying that the particular tradeoff the JEP made is not sensible
> for your applications, it would be considered in the decision to
> integrate. But it should be real data and/or a contrived benchmark
> simulating a real-world scenario, not just theoretical appeals -- we
> know how misguided those can get.
>
>
> Thanks,
> -Aleksey
>
>
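
For readers following the thread, the "flag-check and two code paths" under
discussion can be sketched roughly as below. This is a simplified
illustration, not the JDK's String implementation or the JEP prototype: the
class CompactStringSketch, its coder field, and its methods are
hypothetical. It stores one byte per character when every character fits
in Latin-1 and two bytes per character otherwise, and each accessor
branches on the coder.

```java
// Hypothetical illustration of a string with two internal encodings.
final class CompactStringSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value; // 1 byte/char for LATIN1, 2 bytes/char for UTF16
    private final byte coder;   // the "flag" that every method must check

    CompactStringSketch(String s) {
        if (fitsLatin1(s)) {
            byte[] b = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                b[i] = (byte) s.charAt(i);
            }
            value = b;
            coder = LATIN1;
        } else {
            byte[] b = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                b[2 * i] = (byte) (c >> 8); // high byte first (big-endian)
                b[2 * i + 1] = (byte) c;    // low byte
            }
            value = b;
            coder = UTF16;
        }
    }

    // True if every char is in the Latin-1 range 0x00..0xFF.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    int length() {
        // Flag check: the length depends on the encoding.
        return coder == LATIN1 ? value.length : value.length >> 1;
    }

    char charAt(int i) {
        if (i < 0 || i >= length()) {
            throw new StringIndexOutOfBoundsException(i);
        }
        if (coder == LATIN1) {
            // Latin-1 path: one byte, zero-extended to a char.
            return (char) (value[i] & 0xFF);
        }
        // UTF-16 path: reassemble the char from two bytes.
        return (char) (((value[2 * i] & 0xFF) << 8) | (value[2 * i + 1] & 0xFF));
    }

    // Bytes used for character storage, to show the space saving.
    int storageBytes() {
        return value.length;
    }
}
```

Under these assumptions, new CompactStringSketch("hello") stores its
characters in 5 bytes rather than 10, at the cost of a branch on coder in
length() and charAt(); that is the space-versus-mutator-cost tradeoff the
thread is debating.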