Proposal for a small language change
Reinier Zwitserloot
reinier at zwitserloot.com
Mon Apr 13 14:58:22 PDT 2009
Bad Idea:
What you're talking about is microbenchmarking. Trying to do it is
almost always a good indicator that you're Doing It Wrong. Java
doesn't microbenchmark well for several reasons. If you really must
benchmark some code, here's how you MUST do it; the following scheme
is necessary, but not sufficient, for getting relevant timing
numbers:
Set up something that can be looped over many times. Then measure how
long it takes to run the entire loop. Make sure the data that is
calculated is used someplace; for example, if you're calculating
lists, add the sizes of the lists to a number, then, at the end, print
the number (do the printing after the timing, of course - it takes
ages!) - otherwise the hotspot compiler will just optimize the entire
computation away!
Time it for many different loop counts, and draw a chart: loop count
on the x-axis, time taken (in total, not per iteration) on the y-axis.
Keep doing this until you get a graph that looks a little like this:
for the first X iterations, the line is roughly linear, with a slope
of m. Then, there's a sudden spike, and after the spike, the line
becomes linear again, with a different slope of n. n is significantly
smaller than m (the line is closer to the horizontal).
Once you've reached this chart, you've got your answer: The spike is
the hotspot compiler kicking in. The m slope gives you the per-
iteration time pre-hotspot and is mostly irrelevant. The n slope gives
you the per-iteration time post-hotspot and is more relevant.
This is better than blind microbenchmarking but I'm leaving out many,
many important details about how to do microbenchmarking appropriately
in Java.
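A minimal sketch of one data point for that chart, to make the scheme
concrete. The class name LoopTiming, the list-building workload and the
default loop count are all made up for illustration; the only parts taken
from the scheme above are timing the whole loop, feeding every result
into a number, and printing only after the clock has stopped.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopTiming {
    // Hypothetical workload: builds a small list and returns its size,
    // so that each iteration produces a value the caller can use.
    static int buildList(int seed) {
        List<Integer> list = new ArrayList<Integer>();
        int length = (seed % 8) + 1;
        for (int i = 0; i < length; i++) {
            list.add(i + seed);
        }
        return list.size();
    }

    // Runs the workload `iterations` times and adds up the list sizes.
    static long runLoop(int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += buildList(i);
        }
        return sink;
    }

    public static void main(String[] args) {
        // Loop count comes from the command line; the default is arbitrary.
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1000000;

        long start = System.nanoTime();
        long sink = runLoop(iterations);            // time the entire loop
        long elapsed = System.nanoTime() - start;

        // Printing happens after the timing, and printing the sink keeps
        // the computed data in use.
        System.out.println(iterations + "\t" + elapsed + "\t" + sink);
    }
}
```

Run it once per loop count (for example 1000, 10000, 100000 and so on),
collect the first two columns, and plot total time against loop count.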
A 'cycle' method really isn't going to help much - the durations you
need to time are many orders of magnitude larger than the smallest
interval System.nanoTime() can reliably measure, so existing tools are
fine.
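To get a rough feel for that lower bound on a given machine, something
like the following can be used. It is only a sketch: the class and method
names are made up, and the number it prints is the smallest step observed
between two consecutive nanoTime() readings (call overhead included), not
a guaranteed resolution.

```java
public class NanoTimeTick {
    // Returns the smallest positive difference seen between two
    // consecutive System.nanoTime() readings over `samples` attempts.
    static long smallestObservedTick(int samples) {
        long smallest = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            long t0 = System.nanoTime();
            long t1 = System.nanoTime();
            while (t1 == t0) {              // spin until the clock visibly advances
                t1 = System.nanoTime();
            }
            smallest = Math.min(smallest, t1 - t0);
        }
        return smallest;
    }

    public static void main(String[] args) {
        System.out.println("smallest observed tick (ns): "
                + smallestObservedTick(100000));
    }
}
```

If the total time for a whole loop is not far larger than this number,
the loop is too short to say anything useful.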
Secondly, and more importantly: That's not where timing is _supposed_
to be done in the first place. That's where JVMTI comes in - that's
where you need to be doing this. Download a profiler (NetBeans has a
nice one!) and work with that. A debug JVM could generate cycle
instructions if that really would be helpful, just like debug JVMs can
print the assembler code generated when hotspotting for the local
architecture.
If the profiler team needs something to improve their ability to
profile, that should certainly be considered, but I seriously doubt
they'll be needing a cycle() JVM bytecode to do their job.
--Reinier Zwitserloot
On Apr 13, 2009, at 23:43, Ulf Zibis wrote:
> +1
>
> -Ulf
>
>
> Am 13.04.2009 23:21, Angelo Borsotti schrieb:
>> Author: Angelo Borsotti, former Senior Director Software Technology,
>> Alcatel-Lucent Optics
>>
>> Overview
>> Feature Summary: fast and accurate measurement of execution time
>> Major Advantage: optimization of algorithms by measuring execution
>> times of small snippets of code executed many times
>> Major Benefit: greatly increases observability of programs
>> Major Disadvantage: very little cost
>> Alternatives: there are no alternatives
>>
>> Examples:
>>
>> Simple example:
>> long c0 = System.cycles();
>> i++;
>> long c = System.cycles() - c0;
>> Advanced example:
>> long cycles = 0;
>> for (int i = 0; i < bound; i++){
>> ...
>> long c0 = System.cycles();
>> some statements
>> cycles += System.cycles()- c0;
>> }
>>
>> Details
>> Specification
>> Add a new method System.cycles().
>> Compilation
>> The method must be compiled inline, translating it into a single
>> instruction, present in most architectures, that reads the cycle
>> counter register (e.g. the TSC in x86 architectures).
>> To measure execution time, System.nanoTime() is currently provided.
>> This, however, is by far too inaccurate to measure the execution time
>> of code which lies inside methods, possibly inside loops. The accuracy
>> that it provides is comparable to the time needed to execute thousands
>> of java statements, which is too low.
>> Moreover, the time spent to execute the nanoTime() method itself makes
>> this tool too much invasive. Execution times become often much higher
>> when nanoTime() is added, to the point to provide useless results.
>> Note that when a piece of code lies inside loops, measuring its
>> execution time means adding many small durations. This means that the
>> invasivity and the accuracy of the tool to measure time is extremely
>> important.
>> Note that also a native method would be too much invasive. The only
>> way to provide a means to measure execution times that introduces an
>> acceptable noise (i.e. an error that is sufficiently lower than the
>> times measured) is to compile the call inline into a machine
>> instruction. Profilers are orders of magnitude coarser than what is
>> needed.
>> Testing
>> A simple test case in which a very simple example (as the one above)
>> is used, that computes the number of cycles needed to perform a simple
>> operation. The result should then be compared against an estimate.
>> Library support
>> None needed
>> Reflective APIs
>> No change
>> Other changes
>> None
>> Migration
>> None
>>
>> Compatibility
>> Completely downward compatible
>>
>> References
>> Bug ID: 6685613
>>
>>
>>
>
>
More information about the coin-dev mailing list