Proposal for two new samples
    Michael Mirwaldt 
    michael.mirwaldt at gmail.com
       
    Mon Jul 27 21:22:47 UTC 2015
    
    
  
Hi,
may I introduce myself: I am Michael Mirwaldt from Germany.
I studied Computer Science and hold a Diplom (roughly equivalent to a master's degree).
I have been programming in Java for nearly ten years now.
I would like to contribute two samples that I miss in the jmh repository.
They could help jmh users experience these effects and demonstrate them
in their lectures.
1) Branch prediction
- it demonstrates how branch prediction/misprediction can lead to 
better/worse performance.
- it "loops" through a sorted and an unsorted array and "consumes" only 
high values
- I got the idea from 
http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array 
where that phenomenon was discussed
- I have not yet checked whether the branch-miss ratio actually increases
for the unsorted array.
If you are interested, I will do so; I should be able to observe it with the
command line tool perf on a Linux machine.
- my sample gives reliable results
e.g.
Benchmark                                                      Mode  Cnt   Score   Error  Units
JMHSample_36_BranchPrediction.benchmark_sortedArray            avgt   25  12,741 ± 0,151  ns/op
JMHSample_36_BranchPrediction.benchmark_sortedArray:counter    avgt   25  12,602 ± 0,143  ns/op
JMHSample_36_BranchPrediction.benchmark_unsortedArray          avgt   25  19,710 ± 0,986  ns/op
JMHSample_36_BranchPrediction.benchmark_unsortedArray:counter  avgt   25  19,524 ± 0,935  ns/op
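To make the idea concrete, here is a minimal sketch of the kind of loop such a sample could measure. The actual sample code is not part of this mail, so the class name, method names, array size, value range and threshold below are all assumptions for illustration only. In a real jmh sample the arrays would presumably be prepared in a @Setup method of a @State class, and the sum would be returned from the @Benchmark method so that it is not optimized away.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only; all names and constants are hypothetical.
final class BranchPredictionSketch {
    static final int SIZE = 32_768;      // assumed array length
    static final int THRESHOLD = 128;    // assumed cut-off for "high" values

    /** Fills an array with pseudo-random values in [0, 256) from a fixed seed. */
    static int[] randomValues(long seed) {
        Random random = new Random(seed);
        int[] values = new int[SIZE];
        for (int i = 0; i < values.length; i++) {
            values[i] = random.nextInt(256);
        }
        return values;
    }

    /** Returns a sorted copy, so both variants process the same values. */
    static int[] sortedCopy(int[] values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    /** "Consumes" only high values; the data-dependent branch is the point of interest. */
    static long sumHighValues(int[] values) {
        long sum = 0;
        for (int value : values) {
            if (value >= THRESHOLD) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] unsorted = randomValues(42L);
        int[] sorted = sortedCopy(unsorted);
        // Same work and same result; only the order of the branch outcomes differs.
        System.out.println(sumHighValues(unsorted) == sumHighValues(sorted)); // prints true
    }
}
```

Note that the JIT compiler is free to turn such a branch into a conditional move, in which case the difference may shrink or vanish; that is one more reason to confirm the effect with hardware counters.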
2) Matrix copy
- it demonstrates that it matters how you iterate through a
two-dimensional array (when you copy a matrix)
- iterating "row by row" is often faster than "column by column" because
it leads to fewer cache misses.
- I was inspired by a talk at a conference where somebody mentioned this
- I have not yet been able to observe whether the cache hit ratio drops
when running the sample, because I only have a Windows machine at hand.
If you are interested, I will do so; I should be able to observe it with the
command line tool perf on a Linux machine.
- my sample gives reliable results
e.g.
Benchmark                                                    Mode  Cnt   Score   Error  Units
JMHSample_37_MatrixCopy.benchmark_transposeColByCol          avgt   25  41,073 ± 2,023  ns/op
JMHSample_37_MatrixCopy.benchmark_transposeColByCol:counter  avgt   25  40,344 ± 2,089  ns/op
JMHSample_37_MatrixCopy.benchmark_transposeRowByRow          avgt   25  28,393 ± 0,571  ns/op
JMHSample_37_MatrixCopy.benchmark_transposeRowByRow:counter  avgt   25  28,148 ± 0,600  ns/op
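For illustration, the two traversal orders could look roughly like the sketch below. Again, the real sample code is not included in this mail: the class and method names are hypothetical, the sketch assumes a rectangular int[][] matrix, and it shows a plain copy rather than whatever the "transpose" methods in the results above do exactly. The relevant detail is that a Java int[][] is an array of row arrays, so the inner loop either walks along one row array or jumps from row array to row array.

```java
import java.util.Arrays;

// Illustrative sketch only; all names are hypothetical.
final class MatrixTraversalSketch {

    /** Builds a rows x cols matrix where each cell holds row * cols + col. */
    static int[][] sampleMatrix(int rows, int cols) {
        int[][] matrix = new int[rows][cols];
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                matrix[row][col] = row * cols + col;
            }
        }
        return matrix;
    }

    /** Copies src into dst; the inner loop walks along a single row array. */
    static void copyRowByRow(int[][] src, int[][] dst) {
        for (int row = 0; row < src.length; row++) {
            for (int col = 0; col < src[row].length; col++) {
                dst[row][col] = src[row][col];
            }
        }
    }

    /** Copies src into dst; the inner loop touches a different row array on each step. */
    static void copyColByCol(int[][] src, int[][] dst) {
        int cols = src[0].length; // assumes a non-empty, rectangular matrix
        for (int col = 0; col < cols; col++) {
            for (int row = 0; row < src.length; row++) {
                dst[row][col] = src[row][col];
            }
        }
    }

    public static void main(String[] args) {
        int[][] src = sampleMatrix(64, 64);
        int[][] byRow = new int[64][64];
        int[][] byCol = new int[64][64];
        copyRowByRow(src, byRow);
        copyColByCol(src, byCol);
        // Both orders produce the same copy; only the memory access pattern differs.
        System.out.println(Arrays.deepEquals(byRow, byCol)); // prints true
    }
}
```

In a jmh sample each copy method would be one benchmark method operating on matrices held in a state object, with the matrix size chosen large enough that the access pattern matters.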
What do you think of these two simple samples?
Are the differences in the numbers significant enough for you?
How can I submit my sample code so that you can try it out and review it?
I would really appreciate your feedback.
Brgds,
Michael
    
    
More information about the jmh-dev
mailing list