RFR: JDK-8322964 Optimize performance of CSS selector matching [v5]

Mon Mar 11 17:00:00 UTC 2024

On Sat, 9 Mar 2024 06:08:34 GMT, John Hendrikx <jhendrikx at openjdk.org> wrote:

>> modules/javafx.graphics/src/main/java/com/sun/javafx/css/FixedCapacitySet.java line 96:
>> 
>>> 94:              : maximumCapacity == 2 ? new Duo<>()
>>> 95:              : maximumCapacity < 10 ? new Hashless<>(maximumCapacity)  // will reject negative values
>>> 96:                                     : new OpenAddressed<>(maximumCapacity);
>> 
>> I wonder if standard HashSet should be used instead of OpenAddressed one, or should there be another threshold which results in a HashSet, to support large sets? 
>> 
>> Or we can reasonably guarantee that these sets are always small?
>
> It's possible to use the standard `HashSet` as a final fallback (or even to replace `OpenAddressed`), although it would need to be wrapped so support the `freeze` method (or we need to implement this differently).
> 
> There are a few reasons however not to do so:
> - `HashSet` doesn't support `freeze` so we'd need to copy it after it is created; we can't use `Set.of` as this would also require making a copy of the entries first (the entries are added one by one because empty values are filtered) -- the solution in this PR carefully avoids all unnecessary copying and allocations to achieve the better performance
> - The sets are indeed in almost all cases very small (hence the optimizations for small sets) -- they contain style classes that apply to a single node or a single selector.  Nodes with many style classes applied to them (ie. 10 or more) I think rarely occurs in practice. Same goes for selectors; you'd have to have something like `a > b c d e f g h i > j` as selector -- remember that this doesn't apply to something like `a, b, c, d, e, f, g, h, i, j`, as they are actually 10 different selectors with 1 style each.
> - `HashSet` is less memory efficient (between 21.5 to 26.7 bytes per entry with the load factor varying between 0.75 and 0.75 * 0.5; `16 + 4 / lf` bytes per entry) vs `OpenAddressed` which requires between 5.7 and 13.3 bytes per entry (due to its load factor variation between 0.3 and 0.7; `4 / lf` per entry) -- why isn't the JDK using open addressed then you may wonder? Well, it actually does in its `Set.of` implementation, but not for `HashSet`, primarily because `HashSet` has less worse edge cases when hash codes are poor, and it has to be more generally usable (open addressed tables degrade easier and require more maintenance when under constant modification, which doesn't apply for us here).

makes sense, thanks!

-------------

PR Review Comment: https://git.openjdk.org/jfx/pull/1316#discussion_r1520069263