Thanks for your valuable feedback!

Here is the current status of the alpha version of my patch:
>> http://jmmc.fr/~bourgesl/share/java2d-pisces/
>> There is still a lot to be done: clean-up, stats, pisces class instance
>> recycling (renderer, stroker ...) and of course sizing correctly initial
>> arrays (dirty or clean) in the RendererContext (thread local storage).
>> For performance reasons, I am using now RendererContext members first
>> (cache for rowAARLE for example) before using ArrayCaches (dynamic arrays).
> Thank you Laurent, those are some nice speedups.
I think it can still be improved: I hope to make it as fast as ductus, or
maybe faster (I have several ideas for aggressive optimizations), but the main
improvement consists in reusing memory (as C / C++ code does) to avoid wasted
memory / GC overhead in a concurrent environment.

> About the thread local storage, that is a sensible choice for highly
> concurrent systems, at the same time, web containers normally complain about
> orphaned thread locals created by an application and not cleaned up.
> Not sure if ones created at the core libs level get special treatment, but
> in general, I guess it would be nice to have some way to clean them up.

You're right, that's why my patch is not ready!
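
To make the container concern concrete, here is a minimal sketch of the
"clean up right away" pattern using ThreadLocal.remove(). The RendererContext
class below is only a hypothetical stand-in (its field and array size are
assumptions, not the patch's code), and render() is an illustrative name:

```java
final class ThreadLocalCleanupDemo {
    /** Hypothetical stand-in for the patch's RendererContext. */
    static final class RendererContext {
        final int[] rowAARLE = new int[4096]; // arbitrary size, assumption
    }

    private static final ThreadLocal<RendererContext> CONTEXT =
            new ThreadLocal<RendererContext>() {
                @Override
                protected RendererContext initialValue() {
                    return new RendererContext();
                }
            };

    /** Simulates one rendering request; returns a per-context counter. */
    static int render() {
        try {
            RendererContext ctx = CONTEXT.get(); // created lazily for this thread
            ctx.rowAARLE[0]++;
            return ctx.rowAARLE[0];
        } finally {
            // Drop the thread's reference so nothing outlives the request.
            CONTEXT.remove();
        }
    }
}
```

Note the trade-off: removing the value after every request means the next
request on the same thread allocates a fresh context, so the memory reuse is
lost. That is part of why a shared pool looks attractive.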

I chose ThreadLocal for simplicity and clarity but I see several issues:
1/ Web container: the ThreadLocal must be cleaned up when stopping an application
to avoid memory leaks (application becomes unloadable due to classloader
2/ ThreadLocal access is the fastest way to get the RendererContext as it
does not require any lock (unsynchronized). As I get the RendererContext
once per rendering request, I think the ThreadLocal can be replaced by a
thread-safe ConcurrentLinkedQueue<RendererContext>, but it may become a
performance bottleneck
3/ Using a ConcurrentLinkedQueue<RendererContext> requires efficient and
proper cache eviction to free memory (Weak or Soft references?), or eviction
based on statistics (last usage timestamp, usage counts)
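
Points 2/ and 3/ could be sketched roughly as follows. This is only an
illustration under assumptions: RendererContext is a hypothetical stand-in,
ContextPool / acquire / release are made-up names, and SoftReference is just
one of the eviction options mentioned above (the GC clears entries under
memory pressure):

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentLinkedQueue;

final class ContextPool {
    /** Hypothetical stand-in for the patch's RendererContext. */
    static final class RendererContext {
        final int[] rowAARLE = new int[4096]; // arbitrary size, assumption
    }

    // Lock-free queue of softly referenced contexts shared by all threads.
    private static final ConcurrentLinkedQueue<SoftReference<RendererContext>> QUEUE =
            new ConcurrentLinkedQueue<>();

    /** Takes a pooled context, or creates one if the pool is empty. */
    static RendererContext acquire() {
        SoftReference<RendererContext> ref;
        while ((ref = QUEUE.poll()) != null) {
            RendererContext ctx = ref.get();
            if (ctx != null) {
                return ctx; // still alive: reuse it
            }
            // Referent was collected: skip this entry and try the next one.
        }
        return new RendererContext();
    }

    /** Returns a context to the pool once the rendering request is done. */
    static void release(RendererContext ctx) {
        QUEUE.offer(new SoftReference<>(ctx));
    }
}
```

A caller would acquire once per rendering request and release in a finally
block. Whether the poll / offer pair is cheap enough compared to a
ThreadLocal lookup is exactly what would need benchmarking.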

Any other ideas (core-libs) for an efficient thread context in a web
container?

> I'm not familiar with the API, but is there any way to clean them up when
> the graphics2d gets disposed of?

The RenderingEngine is instantiated by the JVM once and I do not see in the
RenderingEngine interface any way to perform callbacks for warmup / cleanup
... nor access to the Graphics RenderingHints (other RFE for tuning

> A web application has no guarantee to see the same thread ever again
> during his life, so thread locals have to be cleaned right away.

I agree that ThreadLocal can lead to wasted memory, as only a few concurrent
threads may really use their RendererContext instance while the others
simply answer web requests => let's use a
ConcurrentLinkedQueue<RendererContext> with proper cache eviction.

> Either that, or see if there is any way to store the array caches in a
> global structure backed by a concurrent collection to reduce/eliminate
> contention.

Yes, it is an interesting alternative to benchmark.
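
For the benchmark, a global array cache could look something like the sketch
below. Everything here is an assumption for illustration (class name,
power-of-two bucket sizes, the 1024 minimum, clearing on release); it is not
how the patch's ArrayCaches work, and it has no eviction at all:

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

final class SharedIntArrayCache {
    private static final int MIN_SIZE = 1024; // hypothetical smallest bucket

    // One lock-free queue of free arrays per bucket size.
    private static final ConcurrentHashMap<Integer, ConcurrentLinkedQueue<int[]>> BUCKETS =
            new ConcurrentHashMap<>();

    /** Rounds the requested length up to a power-of-two bucket size. */
    static int bucketSize(int minLength) {
        if (minLength > (1 << 30)) {
            throw new IllegalArgumentException("too large: " + minLength);
        }
        int size = MIN_SIZE;
        while (size < minLength) {
            size <<= 1;
        }
        return size;
    }

    /** Returns a zero-filled array of at least minLength elements. */
    static int[] acquire(int minLength) {
        int size = bucketSize(minLength);
        ConcurrentLinkedQueue<int[]> queue = BUCKETS.get(size);
        int[] array = (queue != null) ? queue.poll() : null;
        return (array != null) ? array : new int[size];
    }

    /** Gives back an array previously obtained from acquire(). */
    static void release(int[] array) {
        Arrays.fill(array, 0); // keep cached arrays "clean"
        ConcurrentLinkedQueue<int[]> queue = BUCKETS.get(array.length);
        if (queue == null) {
            ConcurrentLinkedQueue<int[]> fresh = new ConcurrentLinkedQueue<>();
            queue = BUCKETS.putIfAbsent(array.length, fresh);
            if (queue == null) {
                queue = fresh; // we won the race and installed the new queue
            }
        }
        queue.offer(array);
    }
}
```

Bucketing by size keeps threads that need different array lengths on
different queues, which should reduce contention compared to a single shared
queue, but only measurements can tell.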

