[code-reflection] RFR: 8324789: Add line number information to code models

Paul Sandoz psandoz at openjdk.org
Tue Apr 23 23:11:53 UTC 2024


Enable operations to have originating source location information, specifically source reference, line and column information.

The location information may be set (one or more times) on an operation while it is unbound. Once the operation is bound, that is, once it is a member of a block and that block is bound, its location can no longer be set. Copying an operation also copies its location. This preserves location information across transformations, especially lowering.
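The set-while-unbound, frozen-once-bound, copied-on-copy rules can be illustrated with a minimal sketch. The `Location`, `Block`, and `Operation` types below are hypothetical stand-ins written for this note, not the classes in the PR; the actual names and signatures there may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: a source reference (may be null) plus line and column.
record Location(String sourceRef, int line, int column) {
    static final Location NO_LOCATION = new Location(null, -1, -1);
}

// Hypothetical stand-in for a block that owns operations and can be bound once.
final class Block {
    private final List<Operation> ops = new ArrayList<>();
    private boolean bound;

    void append(Operation op) {
        if (bound) throw new IllegalStateException("block is bound");
        op.parent = this;
        ops.add(op);
    }

    void bind() { bound = true; }

    boolean isBound() { return bound; }
}

// Hypothetical stand-in for an operation carrying a location.
final class Operation {
    final String name;
    Block parent; // null until the operation is appended to a block
    private Location location = Location.NO_LOCATION;

    Operation(String name) { this.name = name; }

    // Bound means: member of a block, and that block is itself bound.
    boolean isBound() { return parent != null && parent.isBound(); }

    // May be called any number of times while unbound; rejected afterwards.
    void setLocation(Location l) {
        if (isBound()) throw new IllegalStateException("operation is bound");
        location = l;
    }

    Location location() { return location; }

    // The copy is unbound and carries the same location as the original.
    Operation copy() {
        Operation c = new Operation(name);
        c.location = location;
        return c;
    }
}

class LocationDemo {
    public static void main(String[] args) {
        Operation add = new Operation("add");
        add.setLocation(new Location(null, 51, 13));

        Block block = new Block();
        block.append(add);
        block.bind();

        Operation copy = add.copy();
        System.out.println(copy.location()); // same location as the original
        copy.setLocation(Location.NO_LOCATION); // allowed: the copy is unbound
    }
}
```

Because a copy starts out unbound, a transformation can either keep the copied location or replace it before the copy is bound into a new block.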

The compiler generates location information from a tree node, and sets it on the associated operation(s). For the operations associated with a reflected method (FuncOp) or quoted lambda expression (LambdaOp or ClosureOp) the source reference (a URI) is added to the location, whereas for all other operations the source reference is absent.

Example:


    @CodeReflection                          // 47
    static int f(int n) {                    // 48
        int sum = 0;                         // 49
        for (int i = 0; i < n; i++)  {       // 50
            sum += i;                        // 51
        }                                    // 52
        return sum;                          // 53
    }                                        // 54


Model:


func @"f" @loc="47:5:file:///Users/sandoz/Projects/jdk/test/babylon-test/src/test/java/T.java" (%0 : int)int -> {
    %1 : Var<int> = var %0 @"n" @loc="47:5";
    %2 : int = constant @"0" @loc="49:19";
    %3 : Var<int> = var %2 @"sum" @loc="49:9";
    java.for @loc="50:9"
        ()Var<int> -> {
            %4 : int = constant @"0" @loc="50:22";
            %5 : Var<int> = var %4 @"i" @loc="50:14";
            yield %5 @loc="50:9";
        }
        (%6 : Var<int>)boolean -> {
            %7 : int = var.load %6 @loc="50:25";
            %8 : int = var.load %1 @loc="50:29";
            %9 : boolean = lt %7 %8 @loc="50:25";
            yield %9 @loc="50:9";
        }
        (%10 : Var<int>)void -> {
            %11 : int = var.load %10 @loc="50:32";
            %12 : int = constant @"1" @loc="50:32";
            %13 : int = add %11 %12 @loc="50:32";
            var.store %10 %13 @loc="50:32";
            yield @loc="50:9";
        }
        (%14 : Var<int>)void -> {
            %15 : int = var.load %3 @loc="51:13";
            %16 : int = var.load %14 @loc="51:20";
            %17 : int = add %15 %16 @loc="51:13";
            var.store %3 %17 @loc="51:13";
            java.continue @loc="50:9";
        };
    %18 : int = var.load %3 @loc="53:16";
    return %18 @loc="53:9";
};
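Judging only from the printed model above, an `@loc` value appears to have the form `line:column`, optionally followed by `:` and the source reference URI. The sketch below parses strings of that shape. The `SourceLocation` record and its `parse` method are hypothetical helpers written for this note, and the format is inferred from the example output rather than from a specification.

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper: parses "line:column" or "line:column:sourceRefUri".
record SourceLocation(int line, int column, Optional<URI> sourceRef) {

    static SourceLocation parse(String text) {
        // Limit of 3 keeps any ':' inside the URI (e.g. "file:///...") intact.
        String[] parts = text.split(":", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected line:column[:uri], got: " + text);
        }
        int line = Integer.parseInt(parts[0]);
        int column = Integer.parseInt(parts[1]);
        Optional<URI> ref = parts.length == 3
                ? Optional.of(URI.create(parts[2]))
                : Optional.empty();
        return new SourceLocation(line, column, ref);
    }
}

class LocParseDemo {
    public static void main(String[] args) {
        // The func operation carries the source reference; other operations do not.
        System.out.println(SourceLocation.parse(
                "47:5:file:///Users/sandoz/Projects/jdk/test/babylon-test/src/test/java/T.java"));
        System.out.println(SourceLocation.parse("50:25"));
    }
}
```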


Lowered model:


func @"f" @loc="47:5:file:///Users/sandoz/Projects/jdk/test/babylon-test/src/test/java/T.java" (%0 : int)int -> {
    %1 : Var<int> = var %0 @"n" @loc="47:5";
    %2 : int = constant @"0" @loc="49:19";
    %3 : Var<int> = var %2 @"sum" @loc="49:9";
    %4 : int = constant @"0" @loc="50:22";
    %5 : Var<int> = var %4 @"i" @loc="50:14";
    branch ^block_0;
  
  ^block_0:
    %6 : int = var.load %5;
    %7 : int = var.load %1;
    %8 : boolean = lt %6 %7 @loc="50:25";
    cbranch %8 ^block_1 ^block_2;
  
  ^block_1:
    %9 : int = var.load %3;
    %10 : int = var.load %5;
    %11 : int = add %9 %10 @loc="51:13";
    var.store %3 %11;
    branch ^block_3;
  
  ^block_3:
    %12 : int = var.load %5;
    %13 : int = constant @"1" @loc="50:32";
    %14 : int = add %12 %13 @loc="50:32";
    var.store %5 %14;
    branch ^block_0;
  
  ^block_2:
    %15 : int = var.load %3;
    return %15 @loc="53:9";
};


Location information is generated only when the compiler is configured to generate line number debug information in bytecode.

Location information may be dropped in one of two ways: by transforming and setting each operation's location to no location, or by writing the operation to the textual form with the writer option that drops location information. The latter is used in tests to ensure location information does not affect comparisons of models.
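The two ways of dropping location information, and why the writer option suits test comparisons, can be sketched as follows. `Op`, `OpTextWriter`, and the `dropLocation` flag are hypothetical names chosen for this note; they only mimic the idea and are not the writer API added in the PR.

```java
import java.util.List;

// Hypothetical stand-in: an operation's text plus an optional location (null = none).
record Op(String text, String loc) {
    Op withoutLocation() { return new Op(text, null); }
}

final class OpTextWriter {
    // Way 1: transform the model, resetting every operation's location.
    static List<Op> dropLocations(List<Op> ops) {
        return ops.stream().map(Op::withoutLocation).toList();
    }

    // Way 2: leave the model alone and omit locations while writing.
    static String write(List<Op> ops, boolean dropLocation) {
        StringBuilder sb = new StringBuilder();
        for (Op op : ops) {
            sb.append(op.text());
            if (!dropLocation && op.loc() != null) {
                sb.append(" @loc=\"").append(op.loc()).append('"');
            }
            sb.append(";\n");
        }
        return sb.toString();
    }
}

class DropLocationDemo {
    public static void main(String[] args) {
        List<Op> actual = List.of(new Op("%2 : int = constant @\"0\"", "49:19"));
        List<Op> expected = List.of(new Op("%2 : int = constant @\"0\"", null));

        // Comparing text written with locations dropped ignores location differences.
        boolean same = OpTextWriter.write(actual, true)
                .equals(OpTextWriter.write(expected, true));
        System.out.println(same);
    }
}
```

Dropping at write time leaves the model under test untouched, so a test can compare textual forms without first transforming either model.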
  
### Potential further work in subsequent PRs

The compiler will likely need some finessing as to when location information should be present on an operation (and, if present, what that information should be) and when it can be absent. When lowering, we can see that operations added by the transformation have no source information; only those that were copied do. Bytecode generation of the line number table will help guide decisions in this respect.

This work has also exposed limitations with operation attributes. Operation attributes are really a mechanism for serialization and deserialization of code models. An operation that has additional state needs to convey and consume that state when serializing and deserializing. Therefore we should consider making attributes part of `OpWithDefinition` rather than `Op`, and likely rename the former to `ExternalizableOp`. We don't need a general open-ended mechanism for adding state to operations. Instead, operations should use all the facilities of the language for managing and exposing specific state, and where applicable use common operation abstractions that facilitate pattern matching.

-------------

Commit messages:
 - Tests
 - Serialize models droping location information
 - Doc updates
 - Merge remote-tracking branch 'upstream/code-reflection' into location
 - Add options to op writer
 - Updates.
 - Add location to Op.

Changes: https://git.openjdk.org/babylon/pull/54/files
  Webrev: https://webrevs.openjdk.org/?repo=babylon&pr=54&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8324789
  Stats: 498 lines in 14 files changed: 456 ins; 9 del; 33 mod
  Patch: https://git.openjdk.org/babylon/pull/54.diff
  Fetch: git fetch https://git.openjdk.org/babylon.git pull/54/head:pull/54

PR: https://git.openjdk.org/babylon/pull/54

