Gap Buffer based AbstractStringBuilder implementation
Marek Kozieł
develop4lasu at gmail.com
Tue Nov 24 09:08:20 UTC 2009
2009/11/17 Goktug Gokdogan <gokdogan at gmail.com>:
> Hi.
> As you know, java.lang.StringXXXX classes favor insertion to the end in
> terms of performance. Adding to beginning or middle of these sequences
> causes most of the array to be shifted toward the end every time. Every ones
> in a while I end up changing my algorithms those are based on prefixing
> strings (ex. building a fully qualified name from class name to root
> direction). Sometimes these kind of changes could result in less readable
> code. And I should note that, there are lots of developers out there who
> does not know the internal details of these classes.
> Anyway, this behavior was frustrating me for a while and I decided to
> suggest a gap buffer based modification to AbstractStringBuilder to solve
> this problem but never had time before to prototype it.
> I tested the original StringBuilder against the prototype. Preliminary
> results for the duration of 100000 insertions of a short string:
> Original | Prototype
> append => ~33 | ~34
> insert beginning => ~32000 | ~38
> insert random => ~16000 | ~10000
>
> A negligible overhead appears for appending (which could be avoided with
> shortcuts), but lots of performance gain achieved for other cases.
> If we handle insertion at zero as a special case in insert method or if we
> add an another method like 'prepend', the insertion at beginning will show
> exactly same performance characteristics of insertion at the end.
> In my opinion, this is a well-worth modification to string building classes.
> If anybody agrees on sponsoring, I can tidy up code and contribute to
> OpenJDK.
> - Goktug
>
> PS: I've used following code for rough testing:
>
> private static final int LOOP_COUNT = 100000;
> public static void main(String[] args) {
> long nanoTime = System.nanoTime();
> testStandardAppend();
> //testAppendBeginning();
> //testAppendRandom();
> long span = System.nanoTime() - nanoTime;
> System.out.println(span / 1000000);
> }
> private static void testStandardAppend() {
> StringBuilder builder = new StringBuilder("initial");
> for (int i = 0; i < LOOP_COUNT; i++) {
> builder.append("tryouts");
> }
> }
> private static void testAppendBeginning() {
> StringBuilder builder = new StringBuilder("initial");
> for (int i = 0; i < LOOP_COUNT; i++) {
> builder.insert(0, "tryouts");
> }
> }
> private static void testAppendRandom() {
> Random random = new Random();
> StringBuilder builder = new StringBuilder("initial");
> for (int i = 0; i < LOOP_COUNT; i++) {
> builder.insert(random.nextInt(builder.length()), "tryouts");
> }
> }
Hello.
I think there is no point to implement that without size manager for
such resizeable structures because this mean that work will have to be
done twice.
So if any one would like to reimplement AbstractStringBuilder he
should also test how much custom memory management impact on used
memory, including decreasing buffer, and exchange size*=2 capacity
increasing.
Smf like:
if (size<some) size =max(size*2,requested*3/2);
else size = max(size+const,requested+const);
There are many reasons why String cannot share memory with changeable Objects.
If anything can be optimized then in my opinion it is:
Returned String can be hold in Builder using Reference (so it could be
garbaged ) and returned again if no modification were performed, and
on modification Reference should be cleaned.
Following String problems If there are concerns about unnecessary
String creations i suggest add one more Interface as standard rather
than hard to control solutions (need still some analysis):
interface ToStringBuilder{
// Appends current object to given builder, or created new builder
if null is given
StringBuilder appendTo(StringBuilder ret);
}
class SampleImplementation implements ToStringBuilder{
String toString(){
return appendTo(null).toString();
}
StringBuilder toBuilder(StringBuilder ret){
if (ret==null) ret= new StringBuilder();
...
return ret;
}
}
This could get caught up with efficiency.
The only thing which need to be considered is easy way to add some
type of simple transaction on this(but i'm not sure about it):
/* If object is appending data to builder and exception occurs (so
proper endAppendTransation will be not called) then all appends after
this call will be discarded */
public void startAppendTransation(Object key)
public void endAppendTransation(Object key)
--
Regards.
Lasu aka Marek Kozieł
http://lasu2string.blogspot.com/
More information about the core-libs-dev
mailing list