generated code and jigsaw modules

Richard Hillegas rhillegas at comcast.net
Wed Oct 10 15:37:57 UTC 2018


Thanks again to Rémi and Alan for their advice. Unfortunately, I have 
not been able to make either approach work, given another complexity of 
Derby's class loading. Let me explain that additional issue.

Derby lets users load jar files into the database. There they live as 
named blobs of bytes. The jar files contain user-defined data types, 
functions, procedures, and aggregators, which are coded in Java and can 
be used in SQL statements. Derby lets users wire these jar files into a 
custom classpath which drives a custom ClassLoader at query-execution 
time. I have not been able to make this custom ClassLoader work with 
either Rémi's or Alan's approach. Note that a Derby engine manages many 
databases and each database can have its own custom ClassLoader.
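
To make the setup concrete, here is a minimal sketch of a loader that resolves classes from a jar held in memory as a byte array. It is not Derby's actual implementation: the class name JarBlobClassLoader and its constructor are hypothetical, and it assumes the blob has already been read out of the database.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

/** Hypothetical loader (not Derby's): serves classes from an in-memory jar blob. */
class JarBlobClassLoader extends ClassLoader {
    // Binary class name -> class file bytes, filled once from the blob.
    private final Map<String, byte[]> classBytes = new HashMap<>();

    JarBlobClassLoader(byte[] jarBlob, ClassLoader parent) throws IOException {
        super(parent);
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBlob))) {
            for (JarEntry e = in.getNextJarEntry(); e != null; e = in.getNextJarEntry()) {
                String name = e.getName();
                if (!e.isDirectory() && name.endsWith(".class")) {
                    // "a/b/C.class" -> "a.b.C"
                    String className =
                        name.substring(0, name.length() - ".class".length()).replace('/', '.');
                    // readAllBytes() stops at the end of the current jar entry.
                    classBytes.put(className, in.readAllBytes());
                }
            }
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Each database would own one such loader, so a class with the same name can resolve differently in two databases.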

I like the simplicity of Rémi's approach:

java.lang.invoke.MethodHandles.lookup().defineClass(generatedClassBytes)

This approach does indeed put the generated class where I want it: 
inside the Derby engine module. Unfortunately, the ClassLoader of the 
generated class is the application class loader. I can't figure out how 
to force the generated class to use the custom ClassLoader instead. As a 
consequence, the generated class cannot resolve user-defined functions 
which live inside jar files in the database. Poking the custom 
ClassLoader into the thread's context class loader before calling 
MethodHandles.lookup() doesn't work.
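
The behavior described above matches the documented contract of Lookup.defineClass, which defines the class in the same class loader and runtime package as the lookup class. The following sketch illustrates why the context class loader has no effect. The class name LookupLoaderDemo and the method definingLoader are hypothetical names used only for illustration.

```java
import java.lang.invoke.MethodHandles;

class LookupLoaderDemo {
    /** The loader that Lookup.defineClass would define into when called from this class. */
    static ClassLoader definingLoader() {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // defineClass uses the loader of the lookup class, i.e. the caller of lookup().
        return lookup.lookupClass().getClassLoader();
    }

    public static void main(String[] args) {
        // Stand-in for a per-database custom loader.
        ClassLoader custom = new ClassLoader(LookupLoaderDemo.class.getClassLoader()) {};
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        thread.setContextClassLoader(custom);
        try {
            System.out.println(definingLoader() == custom); // false
            System.out.println(
                definingLoader() == LookupLoaderDemo.class.getClassLoader()); // true
        } finally {
            thread.setContextClassLoader(saved);
        }
    }
}
```

Put differently, a lookup can only define classes into the custom loader if the lookup class itself was defined by that loader.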

Alan's approach is a bit more complicated. It follows the pattern in 
com.sun.org.apache.xalan.internal.xsltc.trax.TemplatesImpl: generate a 
temporary module for each generated class, then add export directives 
to the engine module so that the generated module can call back into 
the engine. I have to say I'm unsure whether this approach leaks memory 
slowly. I don't know what happens to these generated modules and export 
directives when the generated class is garbage-collected.
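
For readers unfamiliar with that pattern, here is a rough sketch of defining a one-off module in its own layer, using only the public java.lang.module and ModuleLayer APIs. It is an approximation of the general technique, not a copy of TemplatesImpl. The names GeneratedModuleSketch and defineLayer are hypothetical, and the sketch assumes the supplied loader is the one that will later define the generated class. It leaves out the engine-side Module.addExports call, which must be made from code inside the engine module, and it does not answer the garbage-collection question.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.net.URI;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Stream;

class GeneratedModuleSketch {
    /** Defines a single synthetic module, mapped to the given loader, in a child of the boot layer. */
    static ModuleLayer defineLayer(String moduleName, String packageName, ClassLoader loader) {
        ModuleDescriptor descriptor = ModuleDescriptor.newModule(moduleName)
                .requires("java.base")
                .exports(packageName)
                .build();

        // The module has no backing artifact, so its reader finds nothing;
        // the loader is expected to define the generated class itself.
        ModuleReference reference = new ModuleReference(descriptor, null) {
            @Override public ModuleReader open() {
                return new ModuleReader() {
                    @Override public Optional<URI> find(String name) { return Optional.empty(); }
                    @Override public Stream<String> list() { return Stream.empty(); }
                    @Override public void close() { }
                };
            }
        };

        ModuleFinder finder = new ModuleFinder() {
            @Override public Optional<ModuleReference> find(String name) {
                return name.equals(moduleName) ? Optional.of(reference) : Optional.empty();
            }
            @Override public Set<ModuleReference> findAll() { return Set.of(reference); }
        };

        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(finder, ModuleFinder.of(), Set.of(moduleName));
        return boot.defineModules(cf, name -> loader);
    }
}
```

A caller would then obtain the module with layer.findModule(moduleName) and pass it to the engine so that the engine can export its packages to it.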

More immediately, however, I am up against the same problem which 
plagues Rémi's approach: how do I get the generated module to resolve 
classes in the custom ClassLoader? More specifically, I am stuck trying 
to get the generated module to require the user-written modules, that 
is, the user-written jar files. What I am missing is the ability to 
retrieve the module names of these jar files so that I can craft 
requires directives. The only way I know to get a module name is to use 
ModuleFinder.of(Path...). Unfortunately, the Path interface is an 
abstraction for file systems and is not a good fit for locating a blob 
of bytes stored inside a database.
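
One possible way around the Path limitation, offered as a sketch rather than a tested solution for Derby, is to read the module descriptor straight from the blob with ModuleDescriptor.read(InputStream). The class JarBlobModuleName and the method moduleNameOf are hypothetical. For jars without a module-info.class, the sketch falls back to the Automatic-Module-Name manifest attribute. It cannot derive a name from the jar file name the way ModuleFinder.of does, because a blob has no file name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.lang.module.ModuleDescriptor;
import java.util.Optional;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.Manifest;

class JarBlobModuleName {
    /** Returns the module name declared by an in-memory jar, if it declares one. */
    static Optional<String> moduleNameOf(byte[] jarBlob) throws IOException {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBlob))) {
            // Available when the manifest is at the start of the jar, as the jar tool writes it.
            Manifest manifest = in.getManifest();
            for (JarEntry e = in.getNextJarEntry(); e != null; e = in.getNextJarEntry()) {
                if (e.getName().equals("module-info.class")) {
                    // Explicit module: parse the compiled descriptor from the entry.
                    return Optional.of(ModuleDescriptor.read(in).name());
                }
            }
            if (manifest != null) {
                // Automatic module with a declared name.
                return Optional.ofNullable(
                    manifest.getMainAttributes().getValue("Automatic-Module-Name"));
            }
            return Optional.empty();
        }
    }
}
```

The returned name could then feed a requires directive when building the descriptor of the generated module.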

I would appreciate any further advice about how to get over these speed 
bumps.

Thanks,
-Rick


On 10/4/18 9:10 AM, Richard Hillegas wrote:
> I am looking for advice about how to tighten up module encapsulation 
> while generating byte code on the fly. I ask this question on behalf 
> of Apache Derby, a pure-Java relational database whose original code 
> dates back to Java 1.2. I want to reduce Derby's attack-surface when 
> running with a module path.
>
> First a little context: A relational database is an interpreter for 
> the SQL language. It converts SQL queries into byte code which then 
> runs on a virtual machine embedded in the interpreter. In Derby's 
> case, the virtual machine is the Java VM and the byte code is simply 
> Java byte code. That is, a Derby query plan is a class whose byte code 
> is generated on the fly at run time.
>
> I have converted the Apache Derby codeline into a set of jigsaw 
> modules: https://issues.apache.org/jira/browse/DERBY-6945. 
> Unfortunately, I had to punch holes in the encapsulation of the main 
> Derby module so that the generated query plans could call back into 
> the Derby engine. That is because, by default, generated query plans 
> load into the catch-all, unnamed module. Note that all of these 
> generated classes live in a single package which does not belong to 
> any named module.
>
> 1) Is it possible to load generated code into a named module?
>
> 2) Alternatively, can someone recommend another approach for 
> preserving module encapsulation while generating classes on the fly?
>
> I would appreciate any advice or examples which you can recommend.
>
> Thanks,
> -Rick
>
>



More information about the core-libs-dev mailing list