Source location information

Fri Apr 5 21:55:32 UTC 2024

> On Apr 5, 2024, at 1:16 AM, Hannes Greule <hannesgreule at outlook.de> wrote:
> 
> When it comes to transforming, I think it is important to find a sweet spot between "the programmer *has to* deal with location information" and "the programmer *can* deal with location information".
> 
> There are already a few transformations around, so that might help (e.g. the SPIR-V code, but also lowering). Picking lowering as an example, in the best case it's possible to write lowering code without even seeing anything about location information - it should be an implicit but transparent process. I wonder if this similarly applies to most transformations or if it typically gets more difficult pretty early (how often are multiple models merged into one? or is replacing one op with zero or more *new* operations the common case?).
> 

Lowering is an excellent test case: in theory, the lowering code should not have to be explicit about location at all.
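
To make that concrete, here is a rough sketch of one way it could look. All names (`LoweringSketch`, `Op`, `lower`) are hypothetical and are not the Babylon API; ops are reduced to a name plus an optional location string. The lowering rule only sees op names, and the driver copies the location of the replaced op onto whatever replaces it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not the Babylon API.
final class LoweringSketch {
    // An op reduced to its name and an optional location ("line:column"), null if absent.
    record Op(String name, String loc) {}

    // Applies a lowering rule that maps one op name to zero or more new op names.
    // The rule never sees location; the driver carries it over from the replaced op.
    static List<Op> lower(List<Op> ops, Function<String, List<String>> rule) {
        List<Op> out = new ArrayList<>();
        for (Op op : ops) {
            for (String newName : rule.apply(op.name())) {
                out.add(new Op(newName, op.loc()));
            }
        }
        return out;
    }
}
```

A rule such as `n -> n.equals("for") ? List.of("branch", "cbranch") : List.of(n)` then yields replacement ops that carry the location of the original `for` op, while ops without location stay without it.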

> One idea to not pollute the API is to have e.g. a `withLocation(...)` method (or something more generic for any metadata?).
> Maybe it is then possible to e.g. let the Builder add the position information from the current context.
> 

Leveraging the block builder is a good idea. When an op is bound to a block, the current location, if any, could be bound to it as well (updating non-public state just once, after which the location is fixed and cannot change).
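
A minimal sketch of that binding, assuming a builder that tracks a current location. The names (`LocationSketch`, `Location`, `Op`, `BlockBuilder`, `setLocation`) are made up for illustration, and `withLocation` is borrowed from the suggestion above; none of this is existing API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not the Babylon API.
final class LocationSketch {
    record Location(String sourceRef, int line, int column) {}

    static final class Op {
        private final String name;
        private Location location; // non-public state, fixed once the op is bound
        private boolean bound;

        Op(String name) { this.name = name; }

        // Optional explicit location; only permitted before the op is bound.
        Op withLocation(Location l) {
            if (bound) throw new IllegalStateException("op already bound");
            this.location = l;
            return this;
        }

        String name() { return name; }

        Optional<Location> location() { return Optional.ofNullable(location); }

        // Called by the builder exactly once, when the op is appended to a block.
        private void bind(Location current) {
            if (bound) throw new IllegalStateException("op already bound");
            if (location == null) location = current; // take the builder's current location
            bound = true;
        }
    }

    static final class BlockBuilder {
        private final List<Op> ops = new ArrayList<>();
        private Location current; // null when no location is known

        void setLocation(Location l) { current = l; }

        // Appends the op and binds the current location, if any, to it.
        Op op(Op op) {
            op.bind(current);
            ops.add(op);
            return op;
        }

        List<Op> ops() { return List.copyOf(ops); }
    }
}
```

With this shape the op factory methods stay free of location parameters: code that does not care never mentions location, and code that does care either sets the builder's current location or calls `withLocation` before appending.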

> I was also thinking about a third modelling option, trying to combine the two you presented: Make location information an Op, but rather than having it somewhere in a list of other operations, let it *contain* the other Ops (one or many?) to which the location applies. I think this would also allow for transparency, but it might be more difficult to access the information (similar to what Gary already wrote, except that we would now need to track parents).
> 

This could also make it harder to process models. I think we should reserve nesting, as much as possible, for modeling actual code structure, so that the structure is the same with and without location information.

Paul.

> On 02.04.24 22:31, Paul Sandoz wrote:
>> Same here.
>> Yes, we should also capture the source file location information. In general a model might be produced by transforming two or more models from different sources. To avoid repetition we might need some dominance rule.
>> What if operations are intermixed between those with and without location information? For those without should we infer the location from the nearest dominating operation with location? That may be problematic if non-dependent operations are shuffled around. I suppose that inference is up to the consumer of such models? Maybe we only need to state requirements for the models produced by the source compiler and the lowering behavior to core operations (from which we can then generate bytecode)?
>> Transformation-wise, supporting location for copying an operation is easy (and trivially so for removal). What about replacement? Should any replacing operations get the location of the operation being replaced unless they explicitly have location? I must admit I don’t like the way such specifics might impact transformation.
>> API-wise it may be challenging to support operation construction without and with location information. Don’t want to pollute all the operation factory methods with an optional parameter.
>> Paul.
>>> On Apr 2, 2024, at 1:45 AM, Gary Frost <gary.frost at oracle.com> wrote:
>>> 
>>> I prefer the former form, where we tag the op with location info
>>> 
>>> func @"f" @loc="11:5" (%0 : int, %1 : int)int -> {
>>>     %2 : Var<int> = var %0 @"a" @loc="11:18";
>>>     %3 : Var<int> = var %1 @"b" @loc="11:25";
>>>      ....
>>> };
>>> 
>>> Over the latter form
>>> 
>>> func @"f" (%0 : int, %1 : int)int -> {
>>>     line @"11:18";
>>>     %2 : Var<int> = var %0 @"a";
>>>     line @"11:25";
>>>     %3 : Var<int> = var %1 @"b";
>>> };
>>> 
>>> As it is more obvious to me how we might handle transformations. Otherwise we need to track 'prev-sibling' nodes in the tree...
>>> 
>>> Q. though.  Don't we also need to capture the 'source file' somehow.... Can we do this from the model?
>>> 
>>> Maybe the func level @loc also includes the source ?
>>> 
>>> func @"f" @loc="SourceFile.java:11:5" (%0 : int, %1 : int)int -> {
>>>     %2 : Var<int> = var %0 @"a" @loc="11:18";
>>>     %3 : Var<int> = var %1 @"b" @loc="11:25";
>>>      ....
>>> };
>>> 
>>> From: babylon-dev <babylon-dev-retn at openjdk.org> on behalf of Paul Sandoz <paul.sandoz at oracle.com>
>>> Sent: Friday, March 29, 2024 8:29 PM
>>> To: babylon-dev at openjdk.org <babylon-dev at openjdk.org>
>>> Subject: Source location information
>>>  Hi,
>>> 
>>> Attached is a document discussing support for source location in code models. It briefly presents some possible approaches and requirements, and does not (yet) choose a specific approach or describe one in more detail (because I don’t know what that should be).
>>> 
>>> Paul.