RFR 8225466 : Optimize matching BMP Slice nodes

Ivan Gerasimov ivan.gerasimov at oracle.com
Tue Oct 29 01:03:01 UTC 2019


Hello!

When building a Pattern object, the regex parser recognizes "slices": 
contiguous char subsequences that must all be matched either 
case-sensitively or case-insensitively.  Matching against such a slice is 
implemented as a simple loop over a portion of the input.

In the current implementation, each iteration of the loop checks 
whether we have hit the end of the input (which is an uncommon case).

This check needs to be done only once, before the loop, which makes the 
loop lighter.
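
To illustrate the idea, here is a minimal sketch.  It is not the code from 
the webrev: the class and method names are made up for this example, and it 
assumes the slice is stored already lower-cased and that the start index 
lies within the input.

```java
final class SliceMatchSketch {

    // Hypothetical "before" variant: the end-of-input test is repeated
    // on every iteration of the loop.
    static boolean matchCheckingEachStep(char[] lowerSlice, CharSequence input, int from) {
        for (int j = 0; j < lowerSlice.length; j++) {
            if (from + j >= input.length()) {
                return false; // ran out of input in the middle of the slice
            }
            char c = input.charAt(from + j);
            if (lowerSlice[j] != c && lowerSlice[j] != Character.toLowerCase(c)) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical "after" variant: one length check up front, so the
    // loop body only compares chars.
    static boolean matchCheckingOnce(char[] lowerSlice, CharSequence input, int from) {
        if (input.length() - from < lowerSlice.length) {
            return false; // not enough input left for the whole slice
        }
        for (int j = 0; j < lowerSlice.length; j++) {
            char c = input.charAt(from + j);
            if (lowerSlice[j] != c && lowerSlice[j] != Character.toLowerCase(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] slice = "abc".toCharArray();
        System.out.println(matchCheckingEachStep(slice, "xxABC", 2));
        System.out.println(matchCheckingOnce(slice, "xxABC", 2));
    }
}
```

Both variants are meant to return the same result for any valid start 
index; the second one just moves the uncommon end-of-input case out of the 
hot loop.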

The benchmark shows an improvement of about 4% for case-insensitive 
matching (average time drops from roughly 190.6 to 183.0 ns/op).

Would you please help review the enhancement?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8225466
WEBREV: http://cr.openjdk.java.net/~igerasim/8225466/00/webrev/


----------- benchmark results ---------------

UNFIXED
Benchmark                Mode  Cnt    Score   Error  Units
PatternBench.sliceIFind  avgt   16  190.612 ± 0.336  ns/op

FIXED
Benchmark                Mode  Cnt    Score   Error  Units
PatternBench.sliceIFind  avgt   16  182.954 ± 0.493  ns/op
-------------------------------------------

-- 
With kind regards,
Ivan Gerasimov


