JEP 254: Compact Strings
Vitaly Davidovich
vitalyd at gmail.com
Mon Jun 1 13:55:45 UTC 2015
My calculation doesn't assume cacheline granularity; I'm looking strictly
at the strings themselves. What's allocated next to or around them is
completely arbitrary, circumstantial, largely uncontrollable, and often
not repeatable. If you're claiming that second- or even third-order
locality effects will be measurable, I don't know how :). I'm sure there
will be some, since it's theoretically possible, but it'll be hard to
demonstrate on anything other than specially crafted microbenchmarks.

Ok, you're talking about specific String intrinsics rather than general
char[] vectorization - fair enough.
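
To put rough numbers on the footprint side of this, here is a
back-of-the-envelope sketch in Java. Everything in it is an assumption
for illustration rather than a measurement: the class name
StringFootprint, the layout constants (64-bit HotSpot with compressed
oops, 12-byte object headers, 8-byte alignment, 64-byte cache lines), and
above all the idea that the strings sit back to back in memory, which is
exactly the part that is hard to control in a real heap.

```java
// Back-of-the-envelope String footprint estimate.
// ALL layout constants below are assumptions (64-bit HotSpot, compressed
// oops, 8-byte object alignment); they are not measured values.
final class StringFootprint {
    // Assumed String object: 12-byte header + 4-byte array reference
    // + 4-byte hash (+ 1-byte coder with compact strings), padded to 24.
    static final int STRING_OBJECT_BYTES = 24;
    // Assumed array header: 12-byte object header + 4-byte length.
    static final int ARRAY_HEADER_BYTES = 16;
    // Assumed cache line size.
    static final int CACHE_LINE_BYTES = 64;

    // Round up to the next multiple of 8 (assumed object alignment).
    static long align8(long bytes) {
        return (bytes + 7) & ~7L;
    }

    // Estimated bytes for a String backed by a char[] (2 bytes per char).
    static long utf16Bytes(int length) {
        return STRING_OBJECT_BYTES + align8(ARRAY_HEADER_BYTES + 2L * length);
    }

    // Estimated bytes for a String backed by a byte[] (1 byte per char).
    static long latin1Bytes(int length) {
        return STRING_OBJECT_BYTES + align8(ARRAY_HEADER_BYTES + (long) length);
    }

    // Cache lines covered by `count` objects of `bytesEach` bytes, under
    // the idealized assumption that they are packed contiguously.
    static long cacheLines(long count, long bytesEach) {
        return (count * bytesEach + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
    }

    public static void main(String[] args) {
        int[] lengths = {5, 10, 20, 40};
        for (int n : lengths) {
            long wide = utf16Bytes(n);
            long compact = latin1Bytes(n);
            System.out.printf(
                "len=%d  char[]: %d bytes  byte[]: %d bytes  lines per 1000 strings: %d vs %d%n",
                n, wide, compact,
                cacheLines(1000, wide), cacheLines(1000, compact));
        }
    }
}
```

Under these assumptions a 20-character string comes to 80 bytes with a
char[] and 64 bytes with a byte[], so 1000 such strings packed
contiguously would span 1250 versus 1000 cache lines. Whether a real
heap layout gets anywhere near that idealized packing is the open
question.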
On Jun 1, 2015 9:31 AM, "Aleksey Shipilev" <aleksey.shipilev at oracle.com>
wrote:
> On 06/01/2015 03:54 PM, Vitaly Davidovich wrote:
> > While it's true that the denser format will require fewer cachelines, my
> > experience is that most strings are smaller than a single cacheline
> > worth of storage, maybe two lines in some cases; there's just a ton of
> > them in the heap. So the heap footprint should be substantially
> > reduced, but I'm not sure the cache pollution will be significantly
> > reduced.
>
> This calculation assumes objects are allocated at cache-line
> granularity. They are not: if a String takes less space within a cache
> line, *more* object data can be squeezed in there. In other words, with
> compact Strings, the entire dataset can take fewer cache lines, thus
> improving performance.
>
>
> > There's currently no vectorization of char[] scanning (or any
> > vectorization other than memcpy for that matter) - are you referring to
> > the recent Intel contributions here, or is there a plan to further improve
> > vectorization in time for this JEP? Just curious.
>
> String methods are intensely intrinsified (and vectorized in those
> implementations). String::equals, String::compareTo, and some
> encoding/decoding come to mind.
>
> I really, really invite you to read the collateral materials from the
> JEP, where we explored quite a few performance characteristics already.
>
>
> Thanks,
> -Aleksey.