Call for Discussion: JEP: Java Expression Trees API

Konstantin Triger kostat at gmail.com
Mon Apr 22 01:32:05 UTC 2024


Author: Konstantin Triger
Type: Feature
Scope: SE
Relates to: JEP 466: Class-File API (Second Preview)
<https://openjdk.org/jeps/466>
Template: 2.0

Summary
-------

Provide a standard API for parsing and semantic analysis of Java methods.

Goals
-----

* Provide an API to convert the bytecode of a Java method into an
expression tree (AST) suitable for semantic analysis at runtime.

Non-Goals
---------

* It is not a goal to obsolete existing libraries or the Class-File API.
Rather, the intention is to provide a higher-level abstraction.
* It is not a goal to support every possible language construct.
Non-logical constructs, such as try/catch, are out of scope.

Motivation
----------

Programs usually go beyond the boundaries of a single execution environment
and delegate part of their logic to external servers such as databases.
Even if executed within the same environment (process), there might be
engines that do not understand Java bytecode and require input in another
language, DSL or a configuration file.
In a typical Java program, a lot of configuration is supplied via external
files or spread across annotations, making it hard to find and maintain.

While Java and other ecosystem languages excel at expressing logic, or
instructions in general, this *information* is extremely hard to obtain and
process at runtime. This makes it practically impossible to use Java to
*express the logic* but *delegate* its execution to an external engine, e.g.
to convert Java bytecode to SQL or another DSL, or to provide a "fluent
configuration".

Consider the following use cases:
1. SQL execution
```java
JPA.SQL((Person p) -> {
    // intended for execution inside an SQL database
    SELECT(p);
    FROM(p);
    WHERE(p.getName() == name);
});
```

2. DSL creation
```java
QueryBuilder<Restaurant> builder = Mongo.queryBuilder(Restaurant.class);

// the following lambda is converted to Bson
Bson filter = builder.filter(r -> r.getStars() >= 2 && r.getStars() < 5
                                  && r.getCategories().contains("Bakery"));
```

2.1 What are the options today?
Let's take an example of the new Jakarta API for NoSQL databases:
https://www.jnosql.org/javadoc/jakarta.nosql.core/jakarta/nosql/Template.html

An example of a select method looks like this:
```java
List<Book> books = template.select(Book.class)
         .where("author")
         .eq("Joshua Bloch")
         .and("edition")
         .gt(3)
         .results();
```

But it could look like this:
```java
var author = "Joshua Bloch";
var edition = 3;

// Under the hood use the Expression Trees API to parse the lambda
List<Book> books = template.select((Book b) -> b.getAuthor() == author
                                            && b.getEdition() > edition).results();
```

3. Fluent Configuration
```java
// Configure DateOfBirth column
modelBuilder.entity(Student.class)
            // use the Expression Trees API to get the property name
            .property(Student::getDateOfBirth)
            .hasColumnName("DOB")
            .hasColumnOrder(2)
            .hasColumnType("datetime2");
```

Description
------------

### Design goals and principles of the Expression Trees API
* The expression tree can be created at runtime by passing a lambda
instance, or a `java.lang.reflect.Method` instance together with the
corresponding class instance if the method is an instance method.
* The expression tree is a logical/semantic representation of a method
body, including the method parameters. The possible node types are not tied
to the Java language and can express a construct in any imperative language
compiled to Java bytecode.
* To be parsed successfully, the method must not contain any unsupported
constructs; otherwise a runtime exception is thrown.
* Methods accepting lambdas intended for parsing can be marked with an
annotation, so that tools and IDEs have a fair chance of identifying lambdas
with unsupported constructs.
* The expression tree is immutable. This facilitates reliable sharing when
a tree is being analyzed or transformed.

### Nodes and Expression Types

Nodes are Java classes. Nodes of the same class may have different
Expression Types. For example, a `BinaryExpression` node might have type
`Add` or `Divide`.

#### Nodes
```
   - abstract Expression
       Provides the base class from which the classes that describe
       expression tree nodes are derived. It also contains static factory
       methods to create the various node types.
   - ConstantExpression extends Expression
       Describes an expression that has a constant value.
   - UnaryExpression extends Expression
       Describes an expression that has a unary operator.
   - BinaryExpression extends UnaryExpression
       Describes an expression that has a binary operator.
   - ParameterExpression extends Expression
       Describes an indexed argument or parameter expression.
   - abstract InvocableExpression extends Expression
       Describes an expression that can be invoked by applying 0 or more
       arguments and might return a result.
   - MemberExpression extends InvocableExpression
       Describes accessing a field or method.
   - LambdaExpression<F> extends InvocableExpression
       Describes a lambda expression. Captures a block of code that is
       equivalent to a method body.
   - DelegateExpression extends InvocableExpression
       Describes a higher-order construct, where InvocableExpression is
       chained.
   - InvocationExpression extends Expression
       Describes an expression that applies a list of argument expressions
       to an InvocableExpression.
   - BlockExpression extends Expression
       Describes a sequence of expressions.
   - NewArrayInitExpression extends Expression
       Describes creating a one-dimensional array and initializing it from
       a list of elements.
```

#### Expression Types
```
    - Add                 // A node that represents arithmetic addition.
    - BitwiseAnd          // A node that represents a bitwise AND operation.
    - LogicalAnd          // A node that represents a logical AND operation.
    - ArrayIndex          // A node that represents indexing into an array.
    - ArrayLength         // A node that represents getting the length of an array.
    - Coalesce            // A node that represents a null coalescing operation.
    - Conditional         // A node that represents a conditional operation.
    - Constant            // A node that represents an expression that has a constant value.
    - Convert             // A node that represents a cast or conversion operation.
    - Divide              // A node that represents arithmetic division.
    - Equal               // A node that represents an equality comparison.
    - ExclusiveOr         // A node that represents a bitwise XOR operation.
    - GreaterThan         // A node that represents a "greater than" numeric comparison.
    - GreaterThanOrEqual  // A node that represents a "greater than or equal" numeric comparison.
    - Invoke              // A node that represents application of a list of argument expressions
                          // to an InvocableExpression.
    - IsNull              // A node that represents a null test.
    - IsNonNull           // A node that represents a non-null test.
    - Lambda              // A node that represents a lambda expression.
    - Delegate            // A node that represents a lambda chain expression.
    - LeftShift           // A node that represents a bitwise left-shift operation.
    - LessThan            // A node that represents a "less than" numeric comparison.
    - LessThanOrEqual     // A node that represents a "less than or equal" numeric comparison.
    - FieldAccess         // A node that represents reading from a field.
    - MethodAccess        // A node that represents a method call.
    - Modulo              // A node that represents an arithmetic remainder operation.
    - Multiply            // A node that represents arithmetic multiplication.
    - Negate              // A node that represents an arithmetic negation operation.
    - New                 // A node that represents calling a constructor to create a new object.
    - NewArrayInit        // An operation that creates a new one-dimensional array and
                          // initializes it from a list of elements.
    - BitwiseNot          // A node that represents a bitwise complement operation.
    - LogicalNot          // A node that represents a logical NOT operation.
    - NotEqual            // A node that represents an inequality comparison.
    - BitwiseOr           // A node that represents a bitwise OR operation.
    - LogicalOr           // A node that represents a logical OR operation.
    - Parameter           // A node that represents a parameter index defined in the context
                          // of the expression.
    - RightShift          // A node that represents a bitwise right-shift operation.
    - Subtract            // A node that represents arithmetic subtraction.
    - InstanceOf          // A node that represents a type test.
    - Block               // A node that contains a sequence of other nodes.
```
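
To make the split between node classes and expression types concrete, here is a minimal sketch. It is not the proposed API: the class names mirror the lists above, but the record-based shape, the `ExpressionType` enum, the `toSql` helper and the getter-to-column naming convention are all assumptions made for illustration. It hand-builds the tree for `r.getStars() >= 2 && r.getStars() < 5` and renders it as a SQL-like predicate.

```java
import java.util.Map;

public class ExpressionTypesSketch {
    // Hypothetical subset of the expression types listed above.
    enum ExpressionType { Constant, MethodAccess, GreaterThanOrEqual, LessThan, LogicalAnd }

    // Hypothetical immutable nodes: one node class can carry several expression types.
    sealed interface Expression permits ConstantExpression, MemberExpression, BinaryExpression {
        ExpressionType type();
    }
    record ConstantExpression(Object value) implements Expression {
        public ExpressionType type() { return ExpressionType.Constant; }
    }
    record MemberExpression(String memberName) implements Expression {
        public ExpressionType type() { return ExpressionType.MethodAccess; }
    }
    record BinaryExpression(ExpressionType type, Expression left, Expression right)
            implements Expression {}

    private static final Map<ExpressionType, String> SQL_OPERATORS = Map.of(
            ExpressionType.GreaterThanOrEqual, ">=",
            ExpressionType.LessThan, "<",
            ExpressionType.LogicalAnd, "AND");

    // Renders a tree as a SQL-like predicate; unsupported types raise a runtime exception.
    static String toSql(Expression e) {
        if (e instanceof ConstantExpression c) {
            return String.valueOf(c.value());
        }
        if (e instanceof MemberExpression m) {
            // Assumed convention: getter "getStars" maps to column "stars".
            String name = m.memberName().substring(3);
            return Character.toLowerCase(name.charAt(0)) + name.substring(1);
        }
        BinaryExpression b = (BinaryExpression) e;
        String operator = SQL_OPERATORS.get(b.type());
        if (operator == null) {
            throw new IllegalArgumentException("Unsupported expression type: " + b.type());
        }
        return "(" + toSql(b.left()) + " " + operator + " " + toSql(b.right()) + ")";
    }

    // Hand-built tree for: r.getStars() >= 2 && r.getStars() < 5
    public static String demo() {
        Expression stars = new MemberExpression("getStars");
        Expression tree = new BinaryExpression(ExpressionType.LogicalAnd,
                new BinaryExpression(ExpressionType.GreaterThanOrEqual, stars, new ConstantExpression(2)),
                new BinaryExpression(ExpressionType.LessThan, stars, new ConstantExpression(5)));
        return toSql(tree);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints ((stars >= 2) AND (stars < 5))
    }
}
```

Both comparisons and the conjunction are the same node class and differ only in their expression type, which is what lets a translator handle them with a single lookup.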

### Entry Point

`LambdaExpression` will introduce a static method that returns a parsed
Expression Tree:

```java
public static <T> LambdaExpression<T> parse(T lambda) {...}
```
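
The draft does not say how `parse` would locate the method behind a lambda instance. One mechanism that exists in today's JDK, for serializable lambdas only, is `java.lang.invoke.SerializedLambda`: the runtime-generated class of a serializable lambda has a private `writeReplace` method that returns it. The sketch below is an illustration of that mechanism, not a statement about how the proposed API or any prototype works; `LambdaLocatorSketch` and `SerializableFunction` are hypothetical names.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

public class LambdaLocatorSketch {
    // Hypothetical functional interface; being Serializable is what makes
    // the lambda class carry a writeReplace() method.
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Returns the serialized form, which names the implementation class and method.
    public static SerializedLambda describe(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda", e);
        }
    }

    public static String implMethodName(SerializableFunction<?, ?> f) {
        return describe(f).getImplMethodName();
    }

    public static void main(String[] args) {
        // Method reference: the implementation method is the referenced method itself.
        SerializableFunction<String, Integer> ref = String::length;
        System.out.println(implMethodName(ref)); // prints length

        // Lambda body: javac compiles it to a synthetic method whose name is compiler-specific.
        SerializableFunction<String, Integer> lambda = s -> s.length() + 1;
        SerializedLambda sl = describe(lambda);
        System.out.println(sl.getImplClass() + "." + sl.getImplMethodName()
                + sl.getImplMethodSignature());
    }
}
```

Once the implementation class and method are known, their bytecode could be read with a class-file library and turned into a tree. Lambdas that are not serializable would need a different mechanism, which this sketch does not cover.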

### Analysis and transformation

There is built-in support for the visitor pattern via the `ExpressionVisitor`
interface, which defines `visit` methods for all the non-abstract node
classes.
In addition, there is an `abstract SimpleExpressionVisitor implements
ExpressionVisitor`. By default, it performs a recursive traversal of
expressions and their sub-expressions.
If no sub-expression is modified, the original expression is returned; if any
sub-expression is modified, a new expression is created, recursively.

With this, a member expression transformer might look like this:

```java
var transformer = new SimpleExpressionVisitor() {
    @Override
    public Expression visit(MemberExpression e) {
        ...
        return /* transformed expression */;
    }
};
```
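
The default traversal described above (return the original node when nothing changed, rebuild only along the modified path) can be sketched as follows. This is a reduced illustration under assumptions: the two node records, the `accept` method and the string-valued operator are hypothetical simplifications, and only the names `SimpleExpressionVisitor` and `visit` come from the draft.

```java
public class VisitorSketch {
    // Hypothetical, heavily reduced node model.
    interface Expression {
        Expression accept(SimpleExpressionVisitor visitor);
    }
    record ConstantExpression(Object value) implements Expression {
        public Expression accept(SimpleExpressionVisitor visitor) { return visitor.visit(this); }
    }
    record BinaryExpression(String operator, Expression left, Expression right)
            implements Expression {
        public Expression accept(SimpleExpressionVisitor visitor) { return visitor.visit(this); }
    }

    // Default behaviour: traverse recursively, rebuild a node only if a child changed.
    static abstract class SimpleExpressionVisitor {
        public Expression visit(ConstantExpression e) {
            return e;
        }
        public Expression visit(BinaryExpression e) {
            Expression left = e.left().accept(this);
            Expression right = e.right().accept(this);
            if (left == e.left() && right == e.right()) {
                return e; // nothing changed below: reuse the original node
            }
            return new BinaryExpression(e.operator(), left, right);
        }
    }

    // Tree for: 1 + (2 * 3)
    static BinaryExpression sampleTree() {
        return new BinaryExpression("+", new ConstantExpression(1),
                new BinaryExpression("*", new ConstantExpression(2), new ConstantExpression(3)));
    }

    // A visitor that changes nothing must hand back the very same instance.
    public static boolean unchangedTreeIsReused() {
        Expression tree = sampleTree();
        return tree.accept(new SimpleExpressionVisitor() {}) == tree;
    }

    // Replacing the constant 2 rebuilds the path to the root but reuses the untouched constant 1.
    public static boolean changedTreeSharesUntouchedSubtree() {
        BinaryExpression tree = sampleTree();
        Expression changed = tree.accept(new SimpleExpressionVisitor() {
            @Override
            public Expression visit(ConstantExpression e) {
                return Integer.valueOf(2).equals(e.value()) ? new ConstantExpression(20) : e;
            }
        });
        return changed != tree && ((BinaryExpression) changed).left() == tree.left();
    }

    public static void main(String[] args) {
        System.out.println(unchangedTreeIsReused());
        System.out.println(changedTreeSharesUntouchedSubtree());
    }
}
```

Because nodes are immutable, sharing the untouched subtrees between the old and the new tree is safe.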

Let's consider a hypothetical fluent builder that needs to extract the
property name:
```java
Fluent<Customer> f = new Fluent<Customer>();
f.property(Customer::getData)...;
```

To extract the member, the user needs to override a single `visit()`
method, as follows:

```java
class MemberExtractor extends SimpleExpressionVisitor {
    private MemberExpression memberExpression;

    @Override
    public Expression visit(MemberExpression e) {
        memberExpression = e;
        return e;
    }
}
```

A more complete implementation of the Fluent class above can be found at
<https://github.com/streamx-co/ExTree/blob/master/src/test/java/co/streamx/fluent/extree/Fluent.java>.

### Other Java ecosystem languages and "special" methods

Some constructs do not have a direct translation into Java bytecode
and are implemented using a runtime library method call. The runtime
library developer might be interested in translating such a method call
expression into a different expression.
Consider the following code:
```java
var b1 = new BigDecimal(67891);
var b2 = new BigDecimal(12346);
if (b1.compareTo(b2) == 0) {
...
```

A library developer should be able to transform an expression that calls the
`compareTo()` method and compares the result to zero into a logical
comparison expression between the two decimals.
The mechanism to achieve this is not yet specified in this draft, but it
must eventually be introduced in this or a follow-up JEP.
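
Since the mechanism is unspecified, the following is only a sketch of what such a rewrite could do, written against a hypothetical, simplified node model (the records and the string-valued operator are assumptions, not the draft's classes). It turns `a.compareTo(b) <op> 0` into `a <op> b` for the six comparison operators and leaves everything else untouched.

```java
import java.util.Set;

public class CompareToRewriteSketch {
    // Hypothetical, simplified node model.
    interface Expression {}
    record ConstantExpression(Object value) implements Expression {}
    record ParameterExpression(String name) implements Expression {}
    // Models instance.method(argument)
    record InvocationExpression(Expression instance, String method, Expression argument)
            implements Expression {}
    record BinaryExpression(String operator, Expression left, Expression right)
            implements Expression {}

    private static final Set<String> COMPARISONS = Set.of("==", "!=", "<", "<=", ">", ">=");

    // Rewrites `a.compareTo(b) <op> 0` into `a <op> b`; any other expression is returned as-is.
    static Expression rewrite(Expression e) {
        if (e instanceof BinaryExpression cmp
                && COMPARISONS.contains(cmp.operator())
                && cmp.left() instanceof InvocationExpression call
                && call.method().equals("compareTo")
                && cmp.right() instanceof ConstantExpression zero
                && Integer.valueOf(0).equals(zero.value())) {
            return new BinaryExpression(cmp.operator(), call.instance(), call.argument());
        }
        return e;
    }

    // Checks: b1.compareTo(b2) == 0  becomes  b1 == b2
    public static boolean demo() {
        Expression b1 = new ParameterExpression("b1");
        Expression b2 = new ParameterExpression("b2");
        Expression original = new BinaryExpression("==",
                new InvocationExpression(b1, "compareTo", b2), new ConstantExpression(0));
        return rewrite(original).equals(new BinaryExpression("==", b1, b2));
    }

    // Checks: expressions that do not match the pattern are returned unchanged.
    public static boolean leavesOthersAlone() {
        Expression sum = new BinaryExpression("+",
                new ConstantExpression(1), new ConstantExpression(0));
        return rewrite(sum) == sum;
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(leavesOthersAlone());
    }
}
```

A consumer such as a SQL translator could then treat the rewritten node like any other comparison, without knowing about `BigDecimal` at all.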

### Practical usability / Prior art

To be successful in achieving the JEP goals, there is a need for
comprehensive POC projects.

I have created 3 prototype projects:
1. ExTree <https://github.com/streamx-co/ExTree>. The project prototypes
this JEP on top of ASM.
2. FluentJPA <https://github.com/streamx-co/FluentJPA>. The project uses
Java to write SQL and integrates with JPA.
3. FluentMongo <https://github.com/streamx-co/FluentMongo>. The project
uses Java to write Mongo queries i.e. Mongo BSON documents.

These projects helped to shape this API.

Testing
----

TBD

Alternatives
-----

An obvious alternative is not to integrate this into the JRE and to keep it
as a separate project.
However, being considered a low-level and possibly "fragile" technology, it
would hardly be accepted into mainstream projects, where it could serve the
large Java community.

Dependencies
----
JEP 466

-- 
Regards,
Konstantin Triger