Re: Caliper CharMatcher Confusion
Bernd Eckenfels
ecki at zusammenkunft.net
Fri Aug 8 09:12:56 UTC 2014
Hello,
Not a full analysis, but two comments:
First of all, you should level the playing field by outlining your implementation into an external utility class (like Guava's) that works on a CharSequence. Secondly, I wonder whether additionally benchmarking with a much bigger haystack, that is, longer search input driven by a length and type parameter matrix (no match, all match, 10% match; lengths 10, 100, 200, 500, 2k), would give additional insight.
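A minimal sketch of both suggestions, assuming plain Java without the JMH harness. The class name HaystackSketch, the method names removeChar and haystack, and the kind labels "none", "all" and "tenth" are hypothetical and do not come from Guava or JMH. The removal loop is a simplified stand-in, not the implementation from the original message.

```java
import java.util.Arrays;

// Hypothetical helper class; all names here are illustrative assumptions.
final class HaystackSketch {

    // Hand-written removal outlined into a static utility that accepts any
    // CharSequence, mirroring the shape of CharMatcher.removeFrom.
    static String removeChar(CharSequence sequence, char target) {
        String string = sequence.toString();
        char[] chars = string.toCharArray();
        int kept = 0;
        for (char c : chars) {
            if (c != target) {
                chars[kept++] = c; // compact kept characters to the front
            }
        }
        // Return the original string when nothing was removed.
        return kept == chars.length ? string : new String(chars, 0, kept);
    }

    // Builds a haystack of the given length.
    // kind: "none" (no match), "all" (all match), "tenth" (every 10th char matches).
    static String haystack(String kind, int length, char target, char filler) {
        char[] chars = new char[length];
        Arrays.fill(chars, filler);
        switch (kind) {
            case "none":
                break;
            case "all":
                Arrays.fill(chars, target);
                break;
            case "tenth":
                for (int i = 0; i < length; i += 10) {
                    chars[i] = target;
                }
                break;
            default:
                throw new IllegalArgumentException("unknown kind: " + kind);
        }
        return new String(chars);
    }
}
```

In a JMH benchmark, kind and length would presumably become two @Param fields (for example lengths 10, 100, 200, 500 and 2000), with the haystack built once in the @Setup method so that its construction is not measured.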
Regards,
Bernd
--
http://bernd.eckenfels.net
----- Original Message -----
From: "Eugen Rabii" <eugen.rabii at gmail.com>
Sent: 08.08.2014 10:54
To: "jmh-dev at openjdk.java.net" <jmh-dev at openjdk.java.net>
Subject: Caliper CharMatcher Confusion
So basically, there is a weird piece of code in the famous Guava library, in
CharMatcher.removeFrom.
It is a utility method for removing a char from a String, which does not sound
that complicated, until I looked at the sources:
public String removeFrom(CharSequence sequence) {
    String string = sequence.toString();
    int pos = indexIn(string);
    if (pos == -1) {
        return string;
    }
    char[] chars = string.toCharArray();
    int spread = 1;
    // This unusual loop comes from extensive benchmarking
    OUT: while (true) {
        pos++;
        while (true) {
            if (pos == chars.length) {
                break OUT;
            }
            if (matches(chars[pos])) {
                break;
            }
            chars[pos - spread] = chars[pos];
            pos++;
        }
        spread++;
    }
    return new String(chars, 0, pos - spread);
}
That comment, "This unusual loop comes from extensive benchmarking", got me
thinking. I do not trust Caliper very much (JMH is to blame), and as a result
I do not trust their benchmarks too much, so I decided to code my own. In the
end I arrived at almost the same logic that they did, almost, and decided to
test it with JMH.
Can some of you professionals tell me whether, from a *JMH standpoint, this is
a correct testing approach*? I'm thinking of testing at least cold start
too.
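To make the quoted loop easier to follow, here is a standalone sketch of the same compaction idea. It assumes a single target char, so indexIn and matches are replaced by String.indexOf and a plain == comparison; the class name SpreadLoopSketch and the method name removeAll are hypothetical. The variable spread counts how many matches have been seen so far, and every kept character is shifted left by that amount.

```java
// Hypothetical standalone variant of the quoted loop; not Guava's actual class.
final class SpreadLoopSketch {

    static String removeAll(String string, char target) {
        int pos = string.indexOf(target);
        if (pos == -1) {
            return string; // nothing to remove, no array copy needed
        }
        char[] chars = string.toCharArray();
        int spread = 1; // number of matches seen so far (the first is at pos)
        OUT:
        while (true) {
            pos++; // step past the match that was just found
            while (true) {
                if (pos == chars.length) {
                    break OUT; // end of input
                }
                if (chars[pos] == target) {
                    break; // another match: widen the gap
                }
                chars[pos - spread] = chars[pos]; // shift kept char left
                pos++;
            }
            spread++;
        }
        // pos == chars.length here, so the result has length minus all matches.
        return new String(chars, 0, pos - spread);
    }
}
```

For the input "efef" and target 'e', the first match is at index 0, the 'f' at index 1 moves to index 0, the second match raises spread to 2, and the final 'f' moves to index 1, which yields "ff".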
package org.madmonky.guava;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import com.google.common.base.CharMatcher;
@Warmup(iterations=5, time=1, timeUnit=TimeUnit.SECONDS)
@BenchmarkMode(Mode.AverageTime)
@Measurement(iterations=3, time=1, timeUnit=TimeUnit.SECONDS)
@Fork(3)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class CharMatcherTest {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(".*" + CharMatcherTest.class.getSimpleName() + ".*")
                .threads(4)
                .build();
        new Runner(opt).run();
    }

    CharMatcher charMatcher;

    @Param({"e", "eeeee", "efefefefe", "eeeemeeeeseeeeer", "dfgrry",
            "wertyeoiuyeeeeeteee"})
    String input;

    char searched;

    @Setup
    public void prepare() {
        charMatcher = CharMatcher.is('e');
        searched = 'e';
    }

    @Benchmark
    public String mineFirstVersion() {
        char[] array = input.toCharArray();
        boolean reachedTheEnd = false;
        int totalCount = 0;
        for (int i = 0; i < array.length; ++i) {
            int howManyInIteration = 0;
            while (array[i] == searched) {
                ++i;
                ++howManyInIteration;
                if (i == array.length) {
                    reachedTheEnd = true;
                    break;
                }
            }
            totalCount += howManyInIteration;
            if (!reachedTheEnd) array[i - totalCount] = array[i];
        }
        return new String(array, 0, (array.length - totalCount));
    }

    @Benchmark
    public String guavaRemoveFrom() {
        return charMatcher.removeFrom(input);
    }
}
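Before comparing timings, it may be worth confirming that both implementations return the same strings. The sketch below is one way to do that without Guava on the classpath, using String.replace as the reference result. The class name RemovalCheckSketch and the method agreesWithReplace are hypothetical, and mineFirstVersion is copied from the benchmark above with input and searched turned into parameters.

```java
// Hypothetical correctness check; names are illustrative assumptions.
final class RemovalCheckSketch {

    // Same logic as the benchmark method, with the state fields as parameters.
    static String mineFirstVersion(String input, char searched) {
        char[] array = input.toCharArray();
        boolean reachedTheEnd = false;
        int totalCount = 0;
        for (int i = 0; i < array.length; ++i) {
            int howManyInIteration = 0;
            while (array[i] == searched) {
                ++i;
                ++howManyInIteration;
                if (i == array.length) {
                    reachedTheEnd = true;
                    break;
                }
            }
            totalCount += howManyInIteration;
            if (!reachedTheEnd) array[i - totalCount] = array[i];
        }
        return new String(array, 0, array.length - totalCount);
    }

    // Compares against String.replace, which removes every occurrence.
    static boolean agreesWithReplace(String input, char searched) {
        String expected = input.replace(String.valueOf(searched), "");
        return expected.equals(mineFirstVersion(input, searched));
    }
}
```

Running agreesWithReplace over the six @Param inputs is a cheap way to rule out a benchmark that is fast only because it computes something different.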
P.S. The results (if the approach is correct) show that their
implementation is a bit slower.
Thank you very much for your time,
Eugene.