RFR 8203628: Optimize (masked) byte memory comparisons on x86

Tue May 22 18:01:08 UTC 2018

On May 22, 2018, at 8:53 AM, Aleksey Shipilev <shade at redhat.com> wrote:
> 
> int64 does not look doable, because x86_64 does not let you use imm64 in test/cmp.

I think it's doable; you define a constant operand type for the matcher
which is 64 bits but whose value can be expressed with a 32-bit encoding.
Actually, that's what this one appears to do:

   instruct loadUI2L(rRegL dst, memory mem, immL_32bits mask)
     match(Set dst (AndL (ConvI2L (LoadI mem)) mask));

But the ConvI2L keeps it from being general.  I guess there is an x86
encoding issue with doing a true LoadL through a mask; I don't see
any assembler forms for masked tests except for byte (cmpb), which you
are filling in.  For unmasked compares there are compares of memory
to literal of all sizes (cmp[bwlq]).  Do we cover those in the AD file?
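
As a rough illustration of the "64 bits but expressible with a 32-bit
encoding" condition (a sketch only, not code from the AD file; the class
and method names are made up for this example): x86_64 test/cmp
sign-extend their imm32 to 64 bits, so a long constant qualifies exactly
when it survives a round trip through int.

```java
public class Imm32Check {
    // Hypothetical helper: true if the 64-bit constant equals its own
    // sign-extended low 32 bits, i.e. it can be encoded as an imm32
    // that the CPU sign-extends back to the same 64-bit value.
    public static boolean fitsInSignExtendedImm32(long value) {
        return value == (long) (int) value;
    }

    public static void main(String[] args) {
        long[] samples = { 0x7FFFFFFFL, -1L, 0xFFFFFFFFL, 1L << 32 };
        for (long v : samples) {
            System.out.println(Long.toHexString(v) + " fits: "
                    + fitsInSignExtendedImm32(v));
        }
    }
}
```

Note that 0xFFFFFFFFL does not qualify under this check, because its low
32 bits sign-extend to -1 rather than back to 0xFFFFFFFFL.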

For testing a single bit, a test of any word size in memory could be
strength-reduced to a byte test with an appropriate offset (0..7) and
shifted imm8 byte constant (>>> 8*[0..7]).  Doing that would be useful
for some metadata bit tests, such as the upcoming Class.isValueType,
as well as existing stuff like Class.isInterface.  This should be doable
in the AD file as well.  (Any IR-level narrowing of the memory type
would confuse alias analysis; this should be a private decision inside
the encoding of one matcher rule.)
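
The narrowing described above can be sketched in plain Java. This only
models the arithmetic, assuming little-endian memory as on x86; it is not
matcher or assembler code, and all names here (ByteTestNarrowing,
byteOffset, imm8, the sample mask) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteTestNarrowing {
    // Byte offset (0..7) of the single byte containing every set bit of
    // mask, or -1 if mask is zero or touches more than one byte.
    public static int byteOffset(long mask) {
        if (mask == 0) return -1;
        int offset = Long.numberOfTrailingZeros(mask) >>> 3;
        long byteLane = 0xFFL << (8 * offset);
        return (mask & ~byteLane) == 0 ? offset : -1;
    }

    // The mask shifted down into an 8-bit immediate for that byte.
    public static int imm8(long mask, int offset) {
        return (int) (mask >>> (8 * offset)) & 0xFF;
    }

    // The original full-width test: (word & mask) != 0.
    public static boolean wideTest(long word, long mask) {
        return (word & mask) != 0;
    }

    // The narrowed test: read one byte at the computed offset from a
    // little-endian image of the word and test it against the imm8.
    public static boolean narrowTest(long word, long mask) {
        int offset = byteOffset(mask);
        if (offset < 0) {
            throw new IllegalArgumentException("mask does not fit in one byte");
        }
        ByteBuffer mem = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        mem.putLong(0, word);
        return (mem.get(offset) & imm8(mask, offset)) != 0;
    }

    public static void main(String[] args) {
        long mask = 0x0200L; // sample flag bit, chosen only for illustration
        int offset = byteOffset(mask);
        System.out.println("offset=" + offset + " imm8=0x"
                + Integer.toHexString(imm8(mask, offset)));
        // Check that both forms agree for every single-bit mask.
        long word = 0x8000_0000_0000_0201L;
        for (int bit = 0; bit < 64; bit++) {
            long m = 1L << bit;
            if (wideTest(word, m) != narrowTest(word, m)) {
                throw new AssertionError("mismatch at bit " + bit);
            }
        }
        System.out.println("wide and narrow tests agree");
    }
}
```

The same computation covers any mask whose set bits fall within one byte,
not just single-bit masks; wider masks are rejected by byteOffset.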

We could parlay your in-memory bit test into those other bit tests also.
That would triple the complexity of your patch, so it can be done in a
separate change; nevertheless I'd like to see it happen.

> 
>> And I have to ask, "what about" int16?  Do we care?  Probably not.
> 
> I prefer to ignore int16 for a time being. The current patch is almost straight-forward port from
> current Shenandoah tree, and we are pretty sure it works reliably.

16-bit is pretty marginal.  byte/int/long are the important types.  But the
single-bit test hack (extended to an imm8 mask test) that I suggested above
should be applied to all sizes.

Point-fixes for particular code shapes are great.  Those often drive us to
improve optimizations not just for the particular code shapes but for some
reasonable superset of those shapes.  Roland's strip mining work is a good
example.  As we close up some holes in an optimization (load through mask
then test), let's think briefly about removing the remaining known holes in
that same optimization, to see if there is low-hanging fruit we can pick at
the same time.

Bottom line:  While you are in this code, please check for remaining holes
for in-memory tests against constants and against bitmasks. I think they might
be listed completely in this mail.  And then please file a tracking bug with your
findings.

There are lots of bits to test out there!

— John (who always asks for more)
