LambdaMetafactory requires full privilege access, but doesn't seem to actually restrict functionality

Steven Schlansker stevenschlansker at gmail.com
Wed Jan 19 01:23:00 UTC 2022



> On Jan 18, 2022, at 6:43 AM, Michael Kuhlmann <jdk at fiolino.de> wrote:
> 
> Hi Steven!
> 
> Sorry for coming back so lately; the main reason is that I can't answer your question. The issue you are describing indeed is a bit weird, and I can only speculate that it's like this because LambdaMetaFactory was mainly designed to be used in bootstrap methods for invokedynamic calls. So the checks are probably more restrictive than it should be when being used in standard Java code.

Thanks. I had a suspicion that this would be the answer :)

> But I'm curious about your motivation on using Lambdas at all. Of course, I don't have detailed insights and may miss some important context. Maybe my idea is completely wrong and nonsense; in that case I apologize for being so intrusive. But I was already wondering in the past why Jackson is using bytecode generation when simple MethodHandle combinations would be much easier to use.
> 
> Lambdas like Function and BiConsumer seem to be a bad choice for me. There are two main downsides to it:
> - LambdaMetaFactory only accepts DirectMethodHandles. At least it allows simple type conversions as autoboxing and hierarchical castings, but that's it. Even a simple conversion from a String input (what you probably have when parsing a JSON structure) to a numeric value is impossible to integrate.

Right, the optimization I have written is not so complete as to build the full conversion stack as MethodHandles.

Essentially we are using it only to replace Method[Handle].invoke(), since at the time it was written both carried somewhat of a performance penalty:
https://www.optaplanner.org/blog/2018/01/09/JavaReflectionButMuchFaster.html

Jackson's core implementation reads a stream of tokens, and given a Bean-like object, will react to properties by parsing values and calling a setter.
The tokenizer already takes care of all conversions: there's a hand-rolled UTF8 parser that will emit primitives or String values from the token stream.

This optimization only covers the "calling a setter" or "calling a getter" part of the process in response to tokens from the parser.
We do have specializations for common primitives (int, long, double) to avoid boxing.

So Jackson core will read a property name, deserialize e.g. a long or String or nested object, and then need to call a method on the object to store the (possibly primitive) value.

> - I assume you plan to iterate over the generated lambdas. In this case, Hotspot won't be able to optimize much because each Function/BiConsumer has a different implementation. It's impossible to predict much or even inline much. I would expect the performance to be much lower than the current one.

Interestingly, that wasn't what I found.  At least as of Java 14, the Metafactory generation approach (as limited as my first pass was) was competitive with CGLib hand-generated code in many (but not all) cases, and significantly faster than Method.invoke:
https://github.com/FasterXML/jackson-modules-base/tree/2.14/blackbird/benchmarks
Unfortunately I don't have a MethodHandle.invokeExact comparison in that benchmark. I should go back and finish that work, and re-run it all on 17. If we can call non-static MethodHandles with little overhead, then any reason to use the Metafactory here goes away.

I did try to explore how the inlining worked out using jitwatch but I ran into problems: https://github.com/AdoptOpenJDK/jitwatch/issues/282  

> 
> Why not just concatenating MethodHandles directly? The strategy could be as the following:
> - Use filterArguments to make the getter's return type and the setter's argument convertable
> - Use filterArguments again to directly inject the getter into the setter's argument; so getter and setter will merge into one "transfer" handle
> - Use foldArguments to consecutively perform the transfer handle for each attribute; the resulting handle will transfer all attributes from source to destination
> - Use foldArgument again with MethodHandles::identity to let this transfer handle return its argument
> - And again, use foldArguments with the constructor handle to convert the resulting handle into a simple one-argument-handle expecting the source as a parameter and the newly creation target element as the return value.
> - Use such a handle also for type conversion of nested Pojos (for to-many relations, you have to be creative).
> 
> As a result, you will have only one simple MethodHandle to convert the whole structure at once.
> As a background, back in times I was writing a library to convert Solr result objects to standard Pojos. The main logic was to convert Maps to nested Pojos and vice versa. I used exactly that strategy. It as fun, much better than solving Sudoku ;), and it worked well even with Java 8 features. With the new loop and iterate handles introduced with Java 9, you have even more options.

Cool! Yes, all the new MethodHandle factories are quite interesting.

> Really, I'm curious if this could be an approach for Jackson. Or if not, what could be the obstacles.

I think the problem here is just slightly different: your approach will copy from one Java object to another Java object, but what we are doing is reacting to a token stream and trying to bind it to Java objects with as little overhead as possible.

> 
> 
> On 1/12/22 9:56 PM, Steven Schlansker wrote:
>> Hi core-libs-dev,
>> I am maintaining a module for the popular Jackson JSON library that attempts to simplify code-generation code without losing performance.
>> Long ago, it was a huge win to code-generate custom getter / setter / field accessors rather than use core reflection. Now, the gap is closing a lot with MethodHandles, but there still seems to be some benefit.
>> The previous approach used for code generation relied on the CGLib + ASM libraries, which as I am sure you know leads to horrible-to-maintain code since you essentially write bytecode directly.
>> Feature development basically stopped because writing out long chains of `visitVarInsn(ASTORE, 3)` and the like scares off most contributors, myself included.
>> As an experiment, I started to port the custom class generation logic to use LambdaMetafactory. The idea is to use the factory to generate `Function<Bean, T>` getter and `BiConsumer<Bean, T>` setter implementations.
>> Then, use those during (de)serialization to access or set data.  Eventually hopefully the JVM will inline, removing all (?) reflection overhead.
>> The invocation looks like:
>> var lookup = MethodHandles.privateLookupIn(targetClass, MethodHandles.lookup()); // allow non-public access
>> var getter = lookup.unreflect(someGetterMethod);
>> LambdaMetafactory.metafactory(
>>   lookup,
>>   "apply",
>>   methodType(Function.class),
>>   methodType(Object.class, Object.class),
>>   getter,
>>   getter.type())
>> This works well for classes from the same classloader. However, once you try to generate lambdas with implementations loaded from a different classloader, you run into a check in the AbstractValidatingLambdaMetafactory constructor:
>> if (!caller.hasFullPrivilegeAccess()) {
>>   throw new LambdaConversionException(String.format(
>>     "Invalid caller: %s",
>>     caller.lookupClass().getName()));
>> }
>> The `privateLookupIn` call seems to drop MODULE privilege access when looking across ClassLoaders. This appears to be because the "unnamed module" differs between a ClassLoader and its child.
>> This happens without the use of modulepath at all, only classpath, where I would not expect module restrictions to be in play.
>> Through some experimentation, I discovered that while I cannot call the LambdaMetafactory with this less-privileged lookup, I am still allowed to call defineClass.
>> So, I compile a simple class:
>> package <targetclasspackage>;
>> class AccessCracker { static final Lookup LOOKUP = MethodHandles.lookup(); }
>> and inject it into the target class's existing package:
>> lookup = lookup.defineClass(compiledBytes).getField("LOOKUP").get(null);
>> and now I have a full privileged lookup into the target classloader, and the Metafactory then seems to generate lambdas without complaint.
>> This workaround seems to work well, although it's a bummer to have to generate and inject these dynamic accessor classes.
>> It feels inconsistent that on one hand my Lookup is not powerful enough to generate a simple call-site with the Metafactory,
>> but at the same time it is so powerful that I can load arbitrary bytecode into the target classloader, and thus indirectly do what I wanted to do in the first place (with a fair bit more work)
>> There's a bit of additional context here:
>> https://github.com/FasterXML/jackson-modules-base/issues/138
>> https://github.com/FasterXML/jackson-modules-base/pull/162/files
>> Any chance the Metafactory might become powerful enough to generate call sites even across such unnamed Modules in a future release? Or even more generally across arbitrary Modules, if relevant access checks pass?
>> I'm also curious for any feedback on the overall approach of using the Metafactory, perhaps I am way off in the weeds, and should just trust MethodHandles to perform well if you use invokeExact :) JMH does seem to show some benefit though especially with Graal compiler.
>> Thanks a bunch for any thoughts,
>> Steven Schlansker



More information about the core-libs-dev mailing list