Bytecode generation, Source code mappings, JCov, Future (Patch)

Mon May 5 22:26:14 PDT 2008

Hi Alex - sorry for the slow response...

Alex Rau wrote:
> Hi Jim,
>
> thanks for the detailed info. Unfortunately I've not had much time 
> this week to investigate deeply on your proposal (compiler API / 
> debugger API). Here are some things I came up with so far - please correct 
> me in case I got something wrong:
>
> 1) The debugger API is based on a design with two virtual machines 
> involved ( the debugger vm and the vm which gets debugged). While this 
> fits perfectly a debugging or profiling scenario where two virtual 
> machines are always involved this does not properly line up with my 
> scenario where only one instance of a virtual machine exists. Our 
> software is based on top of a readily available (compiled) build. It 
> performs modifications on the byte code of the build, runs all unit 
> tests and generates xml reports (all done in the mentioned single vm 
> in one shot). That's all. A second vm is just not existing and would 
> mean much more overhead to our design just for getting column 
> information.
>
Yes, that is right.   I was just pointing out why we didn't add 
character position info in JDK 6 - debuggers can get by without it, and it
would be a lot of work to add it to the JPDA APIs.   I thought that 
since you have the byte codes and the source code,
you should be able to do the same thing NetBeans does wrt using the 
compiler API to relate the two.

> 2) I could not yet find my way through the compiler and debugger API 
> from a technical point of view to really have the column information in 
> the end. I've already had a look at the NetBeans sources and 
> (probably) found the right code location but I have to investigate 
> that in more detail. However this indicates somehow that it's getting 
> much more tricky compared to the variant where the compiler itself 
> outputs the column information into the byte code via additional 
> attributes. 
Yes, I suspect that what NetBeans has done is tricky, but it seems to 
work ok for them.   And I agree that having the compiler output
character position info probably isn't very difficult (especially since 
it can already do that). 
> A question here: is it necessary to recompile on the fly during 
> debugging to get the line/column information ? If yes then this would 
> make it even more difficult and would mean that we have to support an 
> additional compilation process while up to now we strictly rely on 
> already performed compilations. We work on byte code exclusively and 
> the sources are only required for the report generation.
I haven't investigated what NetBeans does, but it is my understanding 
that the debugger uses the compiler API to parse a method (or file)
but it does that in-process; it doesn't spawn a separate compilation 
process.
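
For illustration, here is a minimal sketch of what an in-process parse
through the compiler API could look like. It is not NetBeans' code. It
assumes a full JDK is on hand (ToolProvider.getSystemJavaCompiler()
returns null on a plain JRE), and the names InProcessPositions,
StringSource and methodSources are hypothetical. It only parses (no
attribution, no class files written) and uses the character offsets from
SourcePositions to cut each method's text out of the source string.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.SourcePositions;
import com.sun.source.util.TreeScanner;
import com.sun.source.util.Trees;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class InProcessPositions {

    // Hypothetical helper: exposes a source string to javac as a file object.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses the source in this VM and returns the source text of each method,
    // located via start/end character offsets.
    public static List<String> methodSources(String className, String code)
            throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler (JRE only?)");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                Collections.singletonList(new StringSource(className, code)));
        SourcePositions positions = Trees.instance(task).getSourcePositions();
        List<String> out = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitMethod(MethodTree method, Void unused) {
                    // Offsets are character positions from the start of the file.
                    int start = (int) positions.getStartPosition(unit, method);
                    int end = (int) positions.getEndPosition(unit, method);
                    out.add(code.substring(start, end));
                    return super.visitMethod(method, unused);
                }
            }.scan(unit, null);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String code = "class A { void f() { int x = 1; } }";
        for (String m : methodSources("A", code)) {
            System.out.println(m);
        }
    }
}
```

Relating these source ranges to particular bytecode offsets is the hard
part and is not shown here.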
>
> 3) I think that line numbers and column information are actually 
> "attributes" of the compiler ( result ) in a broader sense. It always 
> depends on the compiler what values these attributes will have. 
> Compared to for example a duration of a method invocation (profiling) 
> or a certain value of a variable (debugging) the latter are *always* 
> runtime-dependent values. What I'd like to say is: there are static ( 
> runtime-independent, "compiler only"-dependent ) attributes (line and 
> column info) and dynamic attributes ( runtime and execution dependent 
> ) attributes (invocation duration, variable value). I see a "natural" 
> separation between those where static attributes should be stored 
> statically (e.g. in the byte code) and dynamic attributes should be 
> accessible dynamically (like the debugger API allows). This does  
> imply as well that while we are interested in static attributes of the 
> compiler it's really not necessary to reread these attributes with 
> every modification on bytecode level. Having this information at a 
> single point of time (after the compilation is finished) is totally 
> sufficient compared to getting the information during runtime every time.
>
I don't think I can argue with what you are saying.   And, given that 
the Peter von der Ahé scheme allows debuggers to do what they need,
or at least most of what they need, with no modification to javac, the 
VM Spec, or the JPDA APIs, adding the column info is a low
priority item for JPDA.
>
> It looks to me that what I want to achieve belongs more to the 
> compiler than somewhere else. Any comments ?
Seems like that is correct.  This fine grained info is needed by our own 
jcov tool, and now by your tool.  It still seems to me
like the first choice would be to use the Peter von der Ahé scheme and 
if that just isn't feasible, then the second choice
would be to standardize the current CharacterRangeTable attribute (or 
some variation) as an optional attribute in the class file spec.
I can see the advantages of putting the fine-grained info in a separate 
file, but it seems to me that it would be problematic
to specify this in some spec - I don't think we even have a spec in 
which to put such a definition.
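
For a tool that works on byte code only, reading the current attribute is
straightforward. The following is a minimal sketch that decodes the raw
bytes of one CharacterRangeTable attribute, assuming exactly the layout
quoted further down in this message. The class and field names
(CharacterRangeTableReader, Entry) are hypothetical, and the
character_range_start/end values are kept as raw u4 numbers because their
encoding is not described in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CharacterRangeTableReader {

    // One row of character_range_table, fields named after the layout.
    public static final class Entry {
        public final int startPc;
        public final int endPc;
        public final long rangeStart; // raw u4, encoding not interpreted here
        public final long rangeEnd;   // raw u4, encoding not interpreted here
        public final int flags;

        Entry(int startPc, int endPc, long rangeStart, long rangeEnd, int flags) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.rangeStart = rangeStart;
            this.rangeEnd = rangeEnd;
            this.flags = flags;
        }
    }

    // Decodes a whole attribute, starting at attribute_name_index.
    // Class files are big-endian, which is what DataInputStream reads.
    public static List<Entry> read(byte[] attribute) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(attribute));
        in.readUnsignedShort();              // attribute_name_index (unused here)
        in.readInt();                        // attribute_length (unused here)
        int count = in.readUnsignedShort();  // character_range_table_length
        List<Entry> entries = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int startPc = in.readUnsignedShort();
            int endPc = in.readUnsignedShort();
            long rangeStart = in.readInt() & 0xFFFFFFFFL; // treat u4 as unsigned
            long rangeEnd = in.readInt() & 0xFFFFFFFFL;
            int flags = in.readUnsignedShort();
            entries.add(new Entry(startPc, endPc, rangeStart, rangeEnd, flags));
        }
        return entries;
    }
}
```

Since the attribute was never added to the VM Spec, a reader like this
depends on whatever javac happens to emit and may need adjusting.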

- jjh

>
>
> Best regards,
>
> Alex
>
>
> On 24.04.2008, at 04:53, Jim Holmlund wrote:
>
>> Just to summarize:
>> - jcov is a Sun-internal tool.
>> - to support jcov, a .class file attribute called the 
>> CharacterRangeTable attribute was
>> defined and javac was changed to output it in response to the 
>> -Xjcov (I think) command line option:
>> CharacterRangeTable_attribute {
>>     u2 attribute_name_index;
>>     u4 attribute_length;
>>     u2 character_range_table_length;
>>     {   u2 start_pc;
>>         u2 end_pc;
>>         u4 character_range_start;
>>         u4 character_range_end;
>>         u2 flags;
>>     } character_range_table[character_range_table_length];
>> }
>> The 'flags' field item describes the kind of range, eg statement, 
>> block, assignment,
>> flow_controller ..
>>
>> - the CharacterRangeTable was never added to the VM Spec.
>>
>> - jcov used the old JVMPI. Robert rewrote it to do byte code 
>> instrumentation
>> via java.lang.instrument. It still uses the CharacterRangeTable.
>>
>> As Robert mentioned, we have had requests from debuggers to include 
>> this kind of info in the .class file, for example to allow stepping 
>> thru terms of an expression, multiple statements on one line, etc. We 
>> planned to do something for this in JDK 6, eg, formalize the 
>> CharacterRangeTable attribute by adding it to the definition of the 
>> class file in the VM spec, and add functionality to JVM TI, JDWP, and 
>> JDI to allow debuggers to access this information.
>>
>> When Peter von der Ahé heard about this, he suggested that we not do 
>> this and instead proposed a solution that required no changes to be 
>> made to the JDK. His idea was that an IDE has the source code for a 
>> file in which fine grained stepping is desired, and the IDE can get 
>> the bytecodes from the debuggee VM via JDI (Method.bytecodes()). The 
>> IDE can then use the compiler APIs introduced in JDK 6
>> http://www.artima.com/lejava/articles/compiler_api.html
>> to match the source code to the bytecodes to find the bytecodes that 
>> correspond to source constructs of interest. This idea was 
>> investigated by the NetBeans debugger team and found to be effective, 
>> so it was implemented as the 'expression stepping' feature in 
>> NetBeans 6.0:
>> http://www.netbeans.org/features/java/debugger.html
>>
>> So, we ended up not needing character offset information in JPDA and 
>> so we didn't add the CharacterRangeTable attribute to the VM spec. 
>> Adding this information to JPDA would be very low on our list of 
>> things to do, unless
>> some needs arise that can't be handled by Peter's technique.
>>
>> I wonder if Alex could also use Peter's idea. Alex did mention that 
>> the tools he is interested
>> in normally have the source code available so maybe he could.
>>
>> - jjh
>>
>> Jonathan Gibbons wrote:
>>> Hi Serviceability folk,
>>>
>>> The Subject line is from a thread on the compiler-dev list. You 
>>> might be interested to check it out here:
>>> http://mail.openjdk.java.net/pipermail/compiler-dev/2008-April/thread.html#300 
>>>
>>>
>>> The thread concerns an interest in improving the information about 
>>> source location generated by the compiler, javac, and more 
>>> specifically, increasing the resolution of the info from line-based 
>>> coordinates to source-based coordinates. The submitter is also 
>>> talking about using side files for the info, which (if I recall 
>>> correctly) I have heard folk such as Jim discuss before now.
>>>
>>> What would be the interest from the serviceability group about any 
>>> such work? Is it "on your radar", "sometime eventually", or "it'll 
>>> never happen"? :-)
>>>
>>> -- Jon
>>>
>>> P.S. Warning: the submitter has provided a patch on the compiler-dev 
>>> thread but has not yet signed the SCA.
>>>
>>>
>