Bytecode generation, Source code mappings, JCov, Future

Sat Apr 19 08:38:27 PDT 2008

Hello,
I am involved in the development of Sun's internal jcov tool.
As I heard, long ago it was part of Sun's JVM and, as I understand,
that is what you are talking about. Modern jcov is a standalone Java
application (plus some C code to implement JVMPI / JVMTI agents), which
uses different techniques to instrument bytecode before it is loaded
into the VM. The Sun teams that want to know code coverage
metrics use our jcov. I do not know whether anybody uses the JVM
functionality that you call "jcov".
Regarding CharacterRangeTable, jcov uses this information to
mark up source code with coverage results during report generation.

I will gladly respond to any of your questions.

Thanks,
-- Sergey Borodin
Jonathan Gibbons wrote:
> Alex Rau wrote:
>> On 17.04.2008, at 00:12, Jonathan Gibbons wrote:
>>
>>> Jcov is still around, though it has never really got the care and
>>> attention it merited. Every now and then, there is interest in
>>> making it more available. It generally comes down to lack of demand
>>> and internal resources. :-(
>>
>>
>> What exactly is the purpose of jcov in current JDKs? The coverage
>> feature itself has been superseded by JVMTI while the jcov
>> implementation still exists, is that correct? Legacy coverage tools
>> (at least those I know) rely exclusively on line information anyway
>> and are therefore not interested in column information or offsets. So
>> what use case is left for JCov? Am I correct that it just exists
>> more or less as a leftover piece of code from past times, without
>> a real purpose?
> My understanding is that jcov is still used within Sun for
> determining code coverage of the various different test suites.
>
>>
>> Well. Actually I think the implementation deserves more attention
>> from the compiler's point of view. JVMTI and JVMPI focus on making
>> the state of the VM (during runtime) more visible. JCov in its
>> current implementation makes the compiler operation visible (in a
>> different way, via byte code attributes), and that's perfectly fine in
>> my opinion and useful for tools which focus on static information in
>> class files (independently of what happens during runtime).
>>
>> It's however open whether the compiler operation information
>> really needs to be stored in the byte code. Another option would
>> be to create a single file, or one file for each class, which
>> exclusively contains the information. These files could be enhanced
>> with other information from the compiler execution as well (whatever
>> that could be) - some kind of standard compiler tracing output.
>>
>> Due to my personal interest I'd love to work on that feature. If it
>> all goes well and it turns out to be generally useful - perfect.
> We have had internal "hallway"-type discussions on the future of the
> CharacterRangeTable attribute and ways to go forward. Having separate
> files would certainly make it simpler to distribute such files for
> existing class libraries that do not want to incur the hit of the
> extra class file space required.
>
> I think that experiments like this on the compiler, and on related
> tools, are a great idea. To get started, you can just pull down a
> copy of OpenJDK from the Mercurial repositories. However, note that
> to contribute anything back to OpenJDK, you'll need to sign the SCA
> form, available on this page http://openjdk.java.net/legal/
>
>>
>>> The features for jcov support are still in the compiler. The
>>> switch is -Xjcov, it causes a CharacterRangeTable attribute to be
>>> added to the class file. The format of the attribute is not great
>>> -- in particular, it uses the old "packed line/offset" format for
>>> coordinates, which may not work for very long lines (over 1024
>>> characters) that sometimes occur in mechanically generated code.
>>> Internally, javac now uses a simple character offset from the
>>> beginning of the file to define a source position. It would
>>> arguably be good to evolve the CharacterRangeTable attribute (in a
>>> compatible way) to using character offsets.
>>>
>>>
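To illustrate the long-line limitation described above, here is a
minimal sketch of a "packed line/offset" encoding. It assumes that the
low 10 bits of an int hold the column and the remaining bits hold the
line; that split is an assumption chosen to match the 1024-character
figure quoted above, and the class and method names are hypothetical,
not taken from javac or jcov.

```java
public class PackedPosition {
    // Assumption: 10 low bits for the column, the rest for the line.
    static final int LINE_SHIFT = 10;
    static final int COLUMN_MASK = (1 << LINE_SHIFT) - 1; // 1023

    // Pack a line and column into one int; reject columns that do not fit.
    static int encode(int line, int column) {
        if (column < 0 || column > COLUMN_MASK) {
            throw new IllegalArgumentException(
                "column does not fit in " + LINE_SHIFT + " bits: " + column);
        }
        return (line << LINE_SHIFT) | column;
    }

    // Recover the line from a packed position.
    static int line(int packed) {
        return packed >>> LINE_SHIFT;
    }

    // Recover the column from a packed position.
    static int column(int packed) {
        return packed & COLUMN_MASK;
    }

    public static void main(String[] args) {
        int packed = encode(42, 17);
        System.out.println(line(packed) + ":" + column(packed));
        try {
            encode(42, 2000); // a very long, machine-generated line
        } catch (IllegalArgumentException e) {
            System.out.println("cannot encode: " + e.getMessage());
        }
    }
}
```

With such a layout, any column beyond 1023 cannot be represented at
all, whereas a plain character offset from the start of the file has
no per-line limit.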
>>
>> I agree. Given that javac itself uses (as you mentioned) a single
>> offset value, I think it would make sense to stick with it and use
>> this value for generating the appropriate byte code, including these
>> offset attributes. Tools using these values generally have access
>> to the source code anyway, as most often the code is pretty-printed
>> in HTML or something else. So such tools could easily map a single
>> file offset to a line + offset combination while parsing the source
>> code.
>
> The code for getting line and offset from character number is in  
> javac, in util/Log.java.
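As a rough sketch of the mapping discussed above (this is not the code
from util/Log.java; the class and method names are hypothetical), a
tool with access to the source text could record where each line
starts and then binary-search that table. The sketch assumes 1-based
lines and columns and treats only '\n' as a line terminator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineMap {
    // Character offset at which each line starts; index 0 is line 1.
    private final int[] lineStarts;

    LineMap(String source) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                starts.add(i + 1); // the next line begins after the newline
            }
        }
        lineStarts = new int[starts.size()];
        for (int i = 0; i < lineStarts.length; i++) {
            lineStarts[i] = starts.get(i);
        }
    }

    // 1-based line number containing the given character offset.
    int line(int offset) {
        int idx = Arrays.binarySearch(lineStarts, offset);
        if (idx < 0) {
            idx = -idx - 2; // index of the last line start before offset
        }
        return idx + 1;
    }

    // 1-based column of the given character offset within its line.
    int column(int offset) {
        return offset - lineStarts[line(offset) - 1] + 1;
    }

    public static void main(String[] args) {
        LineMap map = new LineMap("int a;\nint b; int c;\n");
        int offset = 14; // start of the second statement on line 2
        System.out.println(map.line(offset) + ":" + map.column(offset));
    }
}
```

Because the table is built once per source file, a report generator
could resolve many offsets cheaply while marking up the source.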
>
>>
>>
>> So in case the usefulness of properly tracing the above-mentioned
>> information is commonly agreed - what would be the best way? Here are
>> some aspects:
>>
>> - interleaved output with byte code vs. separate output (file)
>> interface
>> - replace/reuse/adapt jcov code vs. keep jcov as it is and implement
>> new stuff from scratch in parallel (compatibility?)
>> - absolute file offsets as you mentioned vs. something else?
> It would be interesting to get some jcov folk in the discussion to
> discuss the merits of making jcov available and using that as a basis
> for work, compared to starting over.
>
>>
>> regards,
>>
>> alex
>>
>>
>>>
>>> -- Jon
>>>
>>>
>>>
>>>
>>> Alex Rau wrote:
>>>> Hi all,
>>>>
>>>> I've searched the web and asked at forum.sun.com but my question
>>>> regarding the javac compiler hasn't been answered yet. So here we
>>>> go, with a rationale first:
>>>>
>>>> Rationale: I'd like to achieve the following: the byte code created
>>>> with javac should not only contain line number information in
>>>> debug mode; additionally I want to track exactly which
>>>> statement (including column information, as there can be multiple
>>>> statements in one source code line) leads to a certain bytecode
>>>> instruction or set of instructions. I strongly need this kind of
>>>> functionality in javac as the project I'm working on is in
>>>> the testing area (mutation testing in Java at the bytecode level,
>>>> which you can imagine as some kind of (beta) code coverage
>>>> software with a different and IMHO superior technique -
>>>> http://retroduction.org for more details if you are interested).
>>>> The software must be able to report mutated *statements*, which
>>>> means that line information is insufficient.
>>>>
>>>> My question is: in older JDKs there was something called JCov
>>>> which enhanced byte code with additional information regarding
>>>> which statements finally led to one or multiple bytecode
>>>> instruction(s). I stumbled upon this while debugging javac when I
>>>> wanted to learn its design/code. I think it's mostly what I need
>>>> - however the JCov switch is a hidden feature. It was "more"
>>>> public in JDK 1.2 and was supposedly ported to the
>>>> JVMPI interface later. However I did not find where the port
>>>> should have been integrated - I'm no JVMPI (and JVMTI) guru.
>>>> Honestly I doubt that something similar is in the JVM(P/T)I
>>>> toolset...
>>>>
>>>> So perhaps someone has more knowledge about JCov and what happened
>>>> to it. Furthermore I'd like to start a discussion on whether this
>>>> would be a candidate for the kitchen sink (no JLS changes, just
>>>> internal) with the goal of implementing the above-described
>>>> functionality *including a supported and public interface* (not a
>>>> hidden feature anymore). The first benefit could then obviously be
>>>> that javac could be mutation tested ;) Furthermore other Java
>>>> developers would have solid information and documentation about
>>>> this API instead of relying on mostly unknown and hidden features.
>>>>
>>>> Best Regards,
>>>>
>>>> Alex
>>>>
>>>
>>
>>
>