RFR: 8152641: Plugin to generate BMH$Species classes ahead-of-time
Claes Redestad
claes.redestad at oracle.com
Wed Mar 30 16:17:39 UTC 2016
Hi Peter,
something like this, then:
http://cr.openjdk.java.net/~redestad/8152641/webrev.05/
- Added a method only used by the plugin which validates input; added a
comment about the dependency
- Invalid types are logged but otherwise ignored now (I bet someone
might suggest a better way to handle user errors)
- Some cleanup, introduced constant for class name prefix and removed
duplicate type string shortening etc
/Claes
On 2016-03-30 17:17, Peter Levart wrote:
> Hi Claes,
>
> webrev.04 looks good now.
>
> Maybe just one nit. For production quality, plugin input could be
> verified that after expansion it is composed of just the following
> characters: "LIJFD". Otherwise ClassWriter might generate an unusable
> class without complaining (for example if someone sneaks-in characters
> 'S' 'Z' 'B' or 'C')...
>
> Or, better yet, create another method in BMH that will be the "public"
> API between the plugin and BMH which does the validation and calls
> generateConcreteBMHClassBytes(). Internally in BMH the validation is
> not necessary as the types strings are composed programmatically and
> are guaranteed to be correct.
>
> Regards, Peter
>
> On 03/30/2016 04:15 PM, Claes Redestad wrote:
>>
>>
>> On 2016-03-30 14:21, Peter Levart wrote:
>>> Hi Claes,
>>>
>>> On 03/30/2016 12:53 PM, Claes Redestad wrote:
>>>> Hi,
>>>>
>>>> I think Class.forName(Module, String) seemed like a nice
>>>> efficiency/simplicity compromise, but there is some usage of
>>>> lambdas/method-refs in the Module lookup code, especially for
>>>> exploded modules (which get exercised during JDK build). Depending
>>>> on a lambda from code in java.lang.invoke means we fail to bootstrap.
>>>>
>>>> But hey, we're living in an encapsulated world now, and this is in
>>>> java.base, so why not go directly to the BootLoader:
>>>>
>>>> http://cr.openjdk.java.net/~redestad/8152641/webrev.03/
>>>>
>>>> Since this is what's called from Class.forName(Module, String), the
>>>> overhead should be even smaller than in your classForNameInModule test.
>>>
>>> Good idea.
>>>
>>>>
>>>> If I call this final, will you say "Reviewed"? :-)
>>>
>>> Sorry, I don't posses the powers. :-)
>>>
>>> I could say "rEVIEWED", but...
>>>
>>> In the plugin, the input is shortened speciesTypes strings. What
>>> guarantees that they really are normalized? For example, If one
>>> specifies "LLL" as input, it will get expanded into "LLL", the
>>> generated class will have "_L3" as a name suffix, but you will pack
>>> it in the image with "_LLL.class" filename suffix.
>>>
>>> That's another reason why a method in BoundMethodHandle$Factory with
>>> the following signature:
>>>
>>> Map.Entry<String, byte[]> generateConcreteBMHClassBytes(String types);
>>>
>>> ...would be a good idea. It would return class bytes and the name of
>>> the class which you could use to derive the class filename without
>>> hardcoding the same logic in plugin and in BMH.
>>>
>>> You just move the "LambdaForm.shortenSignature(types)" from
>>> getConcreteBMHClass and className/sourceFile calculation from
>>> generateConcreteBMHClass down to generateConcreteBMHClassBytes
>>> method and change the signatures...
>>
>> Yes, it makes sense to keep control over the class name inside the
>> factory class, and this does allow specifying shortened or expanded
>> forms (L3 vs LLL) interchangeably as input to the plugin, which
>> reduces possibility for user errors.
>>
>> How about this: http://cr.openjdk.java.net/~redestad/8152641/webrev.04/
>>
>> /Claes
>>
>>>
>>> Regards, Peter
>>>
>>>>
>>>> /Claes
>>>>
>>>> PS: The default list of types is generated in a few adhoc tests not
>>>> part of this patch. I'm considering proposing add support for
>>>> generating this list at build time. Maybe a JEP called "Build
>>>> system support for profile-guided optimizations", which could also
>>>> handle https://bugs.openjdk.java.net/browse/JDK-8150044
>>>>
>>>> On 2016-03-30 09:53, Peter Levart wrote:
>>>>> Hi Claes,
>>>>>
>>>>> Sorry, I found a flaw in the benchmark (the regex pattern to split
>>>>> comma-separated string into substrings was wrong). What the
>>>>> benchmark did was compare the overheads of a single lookup of a
>>>>> longer class name containing commas. Here's the corrected result
>>>>> of overheads of 5 unsuccessful lookups:
>>>>>
>>>>> Benchmark (generate) (lookup) Mode Cnt
>>>>> Score Error Units
>>>>> ClassForNameBench.classForName LL,LLL,LLLL,LLLLL,LLLLLL
>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 29627.141 ± 567.427 ns/op
>>>>> ClassForNameBench.classForNameInModule LL,LLL,LLLL,LLLLL,LLLLLL
>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 1073.256 ± 23.794 ns/op
>>>>> ClassForNameBench.hashSetContains LL,LLL,LLLL,LLLLL,LLLLLL
>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 33.022 ± 0.066 ns/op
>>>>> ClassForNameBench.switchStatement LL,LLL,LLLL,LLLLL,LLLLLL
>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 38.498 ± 5.842 ns/op
>>>>>
>>>>> ...overheads are a little bigger (x5 approx.).
>>>>>
>>>>>
>>>>> Here's the corrected benchmark:
>>>>>
>>>>>
>>>>> package jdk.test;
>>>>>
>>>>> import org.openjdk.jmh.annotations.*;
>>>>> import org.openjdk.jmh.infra.Blackhole;
>>>>>
>>>>> import java.io.Serializable;
>>>>> import java.lang.invoke.MethodHandle;
>>>>> import java.util.Iterator;
>>>>> import java.util.Set;
>>>>> import java.util.concurrent.TimeUnit;
>>>>> import java.util.stream.Collectors;
>>>>> import java.util.stream.Stream;
>>>>>
>>>>> @BenchmarkMode(Mode.AverageTime)
>>>>> @Fork(value = 1, warmups = 0)
>>>>> @Warmup(iterations = 10)
>>>>> @Measurement(iterations = 10)
>>>>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>>>>> @State(Scope.Thread)
>>>>> public class ClassForNameBench {
>>>>>
>>>>> static final String BMH = "java/lang/invoke/BoundMethodHandle";
>>>>> static final String SPECIES_PREFIX_NAME = "Species_";
>>>>> static final String SPECIES_PREFIX_PATH = BMH + "$" +
>>>>> SPECIES_PREFIX_NAME;
>>>>>
>>>>> @Param({"LL,LLL,LLLL,LLLLL,LLLLLL"})
>>>>> public String generate;
>>>>>
>>>>> @Param({"LLI,LLLI,LLLLI,LLLLLI,LLLLLLI"})
>>>>> public String lookup;
>>>>>
>>>>> private String[] generatedTypes;
>>>>> private String[] lookedUpTypes;
>>>>> private Set<String> generatedNames;
>>>>> private String[] lookedUpNames;
>>>>>
>>>>> @Setup(Level.Trial)
>>>>> public void setup() {
>>>>> generatedTypes = generate.trim().split("\\s*,\\s*");
>>>>> lookedUpTypes = lookup.trim().split("\\s*,\\s*");
>>>>> generatedNames = Stream.of(generatedTypes)
>>>>> .map(types -> SPECIES_PREFIX_PATH +
>>>>> shortenSignature(types))
>>>>> .collect(Collectors.toSet());
>>>>> lookedUpNames = Stream.of(lookedUpTypes)
>>>>> .map(types -> SPECIES_PREFIX_PATH +
>>>>> shortenSignature(types))
>>>>> .toArray(String[]::new);
>>>>> }
>>>>>
>>>>> @Benchmark
>>>>> public void classForName(Blackhole bh) {
>>>>> for (String name : lookedUpNames) {
>>>>> try {
>>>>> bh.consume(Class.forName(name, false,
>>>>> MethodHandle.class.getClassLoader()));
>>>>> } catch (ClassNotFoundException e) {
>>>>> bh.consume(e);
>>>>> }
>>>>> }
>>>>> }
>>>>>
>>>>> @Benchmark
>>>>> public void classForNameInModule(Blackhole bh) {
>>>>> for (String name : lookedUpNames) {
>>>>> bh.consume(Class.forName(MethodHandle.class.getModule(), name));
>>>>> }
>>>>> }
>>>>>
>>>>> @Benchmark
>>>>> public void hashSetContains(Blackhole bh) {
>>>>> for (String name : lookedUpNames) {
>>>>> bh.consume(generatedNames.contains(name));
>>>>> }
>>>>> }
>>>>>
>>>>> @Benchmark
>>>>> public void switchStatement(Blackhole bh) {
>>>>> for (String types : lookedUpTypes) {
>>>>> bh.consume(getBMHSwitch(types));
>>>>> }
>>>>> }
>>>>>
>>>>> static Class<?> getBMHSwitch(String types) {
>>>>> // should be in sync with @Param generate above...
>>>>> switch (types) {
>>>>> case "LL": return Object.class;
>>>>> case "LLL": return Serializable.class;
>>>>> case "LLLL": return Iterator.class;
>>>>> case "LLLLL": return Throwable.class;
>>>>> case "LLLLLL": return String.class;
>>>>> default: return null;
>>>>> }
>>>>> }
>>>>>
>>>>> // copied from non-public LambdaForm
>>>>> static String shortenSignature(String signature) {
>>>>> // Hack to make signatures more readable when they show up
>>>>> in method names.
>>>>> final int NO_CHAR = -1, MIN_RUN = 3;
>>>>> int c0, c1 = NO_CHAR, c1reps = 0;
>>>>> StringBuilder buf = null;
>>>>> int len = signature.length();
>>>>> if (len < MIN_RUN) return signature;
>>>>> for (int i = 0; i <= len; i++) {
>>>>> // shift in the next char:
>>>>> c0 = c1; c1 = (i == len ? NO_CHAR : signature.charAt(i));
>>>>> if (c1 == c0) { ++c1reps; continue; }
>>>>> // shift in the next count:
>>>>> int c0reps = c1reps; c1reps = 1;
>>>>> // end of a character run
>>>>> if (c0reps < MIN_RUN) {
>>>>> if (buf != null) {
>>>>> while (--c0reps >= 0)
>>>>> buf.append((char) c0);
>>>>> }
>>>>> continue;
>>>>> }
>>>>> // found three or more in a row
>>>>> if (buf == null)
>>>>> buf = new StringBuilder().append(signature, 0, i -
>>>>> c0reps);
>>>>> buf.append((char) c0).append(c0reps);
>>>>> }
>>>>> return (buf == null) ? signature : buf.toString();
>>>>> }
>>>>> }
>>>>>
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>
>>>>>
>>>>> On 03/30/2016 09:40 AM, Peter Levart wrote:
>>>>>> Hi Claes,
>>>>>>
>>>>>> On 03/30/2016 01:03 AM, Claes Redestad wrote:
>>>>>>> Hi Peter, Mandy,
>>>>>>>
>>>>>>> On 2016-03-26 12:47, Peter Levart wrote:
>>>>>>>>
>>>>>>>> Comparing this with proposed code from webrev:
>>>>>>>>
>>>>>>>> 493 try {
>>>>>>>> 494 return (Class<? extends
>>>>>>>> BoundMethodHandle>)
>>>>>>>> 495
>>>>>>>> Class.forName("java.lang.invoke.BoundMethodHandle$Species_" +
>>>>>>>> LambdaForm.shortenSignature(types));
>>>>>>>> 496 } catch (ClassNotFoundException cnf) {
>>>>>>>> 497 // Not pregenerated, generate
>>>>>>>> the class
>>>>>>>> 498 return
>>>>>>>> generateConcreteBMHClass(types);
>>>>>>>> 499 }
>>>>>>>>
>>>>>>>>
>>>>>>>> ...note that the function passed to CLASS_CACHE.computeIfAbsent
>>>>>>>> is executed just once per distinct 'types' argument. Even if
>>>>>>>> you put the generated switch between a call to
>>>>>>>> getConcreteBMHClass and CLASS_CACHE.computeIfAbsent,
>>>>>>>> getConcreteBMHClass(String types) is executed just once per
>>>>>>>> distinct 'types' argument (except in rare occasions when VM can
>>>>>>>> not initialize the loaded class).
>>>>>>>>
>>>>>>>> In this respect a successful Class.forName() is not any worse
>>>>>>>> than static resolving. It's just that unsuccessful
>>>>>>>> Class.forName() represents some overhead for classes that are
>>>>>>>> not pre-generated. So an alternative way to get rid of that
>>>>>>>> overhead is to have a HashSet of 'types' strings for
>>>>>>>> pre-generated classes at hand in order to decide whether to
>>>>>>>> call Class.forName or generateConcreteBMHClass.
>>>>>>>>
>>>>>>>> What's easier to support is another question.
>>>>>>>>
>>>>>>>> Regards, Peter
>>>>>>>
>>>>>>> to have something to compare with I built a version which, like
>>>>>>> you suggest,
>>>>>>> generates a HashSet<String> with the set of generated classes here:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~redestad/8152641/webrev.02/
>>>>>>>
>>>>>>> This adds a fair bit of complexity to the plugin and requires we
>>>>>>> add a nested
>>>>>>> class in BoundMethodHandle that we can replace. Using the anonymous
>>>>>>> compute Function for this seems like the best choice for this.
>>>>>>
>>>>>> ...what I had in mind as alternative to a pregenerated class with
>>>>>> a switch was a simple textual resource file, generated by plugin,
>>>>>> read-in by BMH into a HashSet. No special-purpose class
>>>>>> generation and much less complexity for the plugin.
>>>>>>
>>>>>>>
>>>>>>> I've not been able to measure any statistical difference in real
>>>>>>> startup terms,
>>>>>>> though, and not figured out a smart way to benchmark the
>>>>>>> overhead of the
>>>>>>> CNFE in relation to the class generation (my guess it adds a
>>>>>>> fraction to the
>>>>>>> total cost) and since this adds ever so little footprint and an
>>>>>>> extra lookup to the
>>>>>>> fast path it would seem that this is the wrong trade-off to do here.
>>>>>>
>>>>>> Yes, perhaps it would be best to just use Class.forName(module,
>>>>>> className) instead. I have created a little benchmark to compare
>>>>>> overheads (just overheads) of unsuccessful lookups for
>>>>>> pregenerated classes (a situation where a BMH class is requested
>>>>>> that has not been pregenerated) and here's the result for
>>>>>> overhead of 5 unsuccessfull lookups:
>>>>>>
>>>>>> Benchmark (generate) (lookup) Mode Cnt
>>>>>> Score Error Units
>>>>>> ClassForNameBench.classForName LL,LLL,LLLL,LLLLL,LLLLLL
>>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 6800.878 ± 421.424 ns/op
>>>>>> ClassForNameBench.classForNameInModule LL,LLL,LLLL,LLLLL,LLLLLL
>>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 209.574 ± 2.114 ns/op
>>>>>> ClassForNameBench.hashSetContains LL,LLL,LLLL,LLLLL,LLLLLL
>>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 6.813 ± 0.317 ns/op
>>>>>> ClassForNameBench.switchStatement LL,LLL,LLLL,LLLLL,LLLLLL
>>>>>> LLI,LLLI,LLLLI,LLLLLI,LLLLLLI avgt 10 6.601 ± 0.061 ns/op
>>>>>>
>>>>>> ...compared to runtime BMH class generation and loading this is
>>>>>> really a very minor overhead. I would just use
>>>>>> Class.forName(module, className) and reduce the complexity of the
>>>>>> plugin.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>>
>>>>>> Here's the benchmark:
>>>>>>
>>>>>> package jdk.test;
>>>>>>
>>>>>> import org.openjdk.jmh.annotations.*;
>>>>>> import org.openjdk.jmh.infra.Blackhole;
>>>>>>
>>>>>> import java.io.Serializable;
>>>>>> import java.lang.invoke.MethodHandle;
>>>>>> import java.util.Iterator;
>>>>>> import java.util.Set;
>>>>>> import java.util.concurrent.TimeUnit;
>>>>>> import java.util.stream.Collectors;
>>>>>> import java.util.stream.Stream;
>>>>>>
>>>>>> @BenchmarkMode(Mode.AverageTime)
>>>>>> @Fork(value = 1, warmups = 0)
>>>>>> @Warmup(iterations = 10)
>>>>>> @Measurement(iterations = 10)
>>>>>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>>>>>> @State(Scope.Thread)
>>>>>> public class ClassForNameBench {
>>>>>>
>>>>>> static final String BMH = "java/lang/invoke/BoundMethodHandle";
>>>>>> static final String SPECIES_PREFIX_NAME = "Species_";
>>>>>> static final String SPECIES_PREFIX_PATH = BMH + "$" +
>>>>>> SPECIES_PREFIX_NAME;
>>>>>>
>>>>>> @Param({"LL,LLL,LLLL,LLLLL,LLLLLL"})
>>>>>> public String generate;
>>>>>>
>>>>>> @Param({"LLI,LLLI,LLLLI,LLLLLI,LLLLLLI"})
>>>>>> public String lookup;
>>>>>>
>>>>>> private String[] generatedTypes;
>>>>>> private String[] lookedUpTypes;
>>>>>> private Set<String> generatedNames;
>>>>>> private String[] lookedUpNames;
>>>>>>
>>>>>> @Setup(Level.Trial)
>>>>>> public void setup() {
>>>>>> generatedTypes = generate.trim().split("\\s+,\\s+");
>>>>>> lookedUpTypes = lookup.trim().split("\\s+,\\s+");
>>>>>> generatedNames = Stream.of(generatedTypes)
>>>>>> .map(types -> SPECIES_PREFIX_PATH
>>>>>> + shortenSignature(types))
>>>>>> .collect(Collectors.toSet());
>>>>>> lookedUpNames = Stream.of(lookedUpTypes)
>>>>>> .map(types -> SPECIES_PREFIX_PATH +
>>>>>> shortenSignature(types))
>>>>>> .toArray(String[]::new);
>>>>>> }
>>>>>>
>>>>>> @Benchmark
>>>>>> public void classForName(Blackhole bh) {
>>>>>> for (String name : lookedUpNames) {
>>>>>> try {
>>>>>> bh.consume(Class.forName(name, false,
>>>>>> MethodHandle.class.getClassLoader()));
>>>>>> } catch (ClassNotFoundException e) {
>>>>>> bh.consume(e);
>>>>>> }
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> @Benchmark
>>>>>> public void classForNameInModule(Blackhole bh) {
>>>>>> for (String name : lookedUpNames) {
>>>>>> bh.consume(Class.forName(MethodHandle.class.getModule(), name));
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> @Benchmark
>>>>>> public void hashSetContains(Blackhole bh) {
>>>>>> for (String name : lookedUpNames) {
>>>>>> bh.consume(generatedNames.contains(name));
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> @Benchmark
>>>>>> public void switchStatement(Blackhole bh) {
>>>>>> for (String types : lookedUpTypes) {
>>>>>> bh.consume(getBMHSwitch(types));
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> static Class<?> getBMHSwitch(String types) {
>>>>>> // should be in sync with @Param generate above...
>>>>>> switch (types) {
>>>>>> case "LL": return Object.class;
>>>>>> case "LLL": return Serializable.class;
>>>>>> case "LLLL": return Iterator.class;
>>>>>> case "LLLLL": return Throwable.class;
>>>>>> case "LLLLLL": return String.class;
>>>>>> default: return null;
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> // copied from non-public LambdaForm
>>>>>> static String shortenSignature(String signature) {
>>>>>> // Hack to make signatures more readable when they show
>>>>>> up in method names.
>>>>>> final int NO_CHAR = -1, MIN_RUN = 3;
>>>>>> int c0, c1 = NO_CHAR, c1reps = 0;
>>>>>> StringBuilder buf = null;
>>>>>> int len = signature.length();
>>>>>> if (len < MIN_RUN) return signature;
>>>>>> for (int i = 0; i <= len; i++) {
>>>>>> // shift in the next char:
>>>>>> c0 = c1; c1 = (i == len ? NO_CHAR : signature.charAt(i));
>>>>>> if (c1 == c0) { ++c1reps; continue; }
>>>>>> // shift in the next count:
>>>>>> int c0reps = c1reps; c1reps = 1;
>>>>>> // end of a character run
>>>>>> if (c0reps < MIN_RUN) {
>>>>>> if (buf != null) {
>>>>>> while (--c0reps >= 0)
>>>>>> buf.append((char) c0);
>>>>>> }
>>>>>> continue;
>>>>>> }
>>>>>> // found three or more in a row
>>>>>> if (buf == null)
>>>>>> buf = new StringBuilder().append(signature, 0, i
>>>>>> - c0reps);
>>>>>> buf.append((char) c0).append(c0reps);
>>>>>> }
>>>>>> return (buf == null) ? signature : buf.toString();
>>>>>> }
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards, Peter
>>>>>>
>>>>>>>
>>>>>>> All-in-all I lean towards moving forward with the first, simpler
>>>>>>> revision of this
>>>>>>> patch[1] and then evaluate if webrev.02 or a
>>>>>>> String-switch+static resolve
>>>>>>> as a follow-up.
>>>>>>>
>>>>>>> A compromise would be to keep the SpeciesLookup class introduced
>>>>>>> here
>>>>>>> to allow removing all overhead in case the plugin is disabled.
>>>>>>>
>>>>>>> Mandy: I've not found any test (under jdk/test/tools/jlink or
>>>>>>> elsewhere) which
>>>>>>> has to be updated when adding a plugin like this.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> /Claes
>>>>>>>
>>>>>>> [1] http://cr.openjdk.java.net/~redestad/8152641/webrev.01/
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the jigsaw-dev
mailing list