[OpenJDK Rasterizer] RFR: Marlin renderer #2

Tue Jun 9 22:10:55 UTC 2015

Hi Laurent,

BTW, I wanted to point out that this is something we can look at after 
we get the current set of fixes in.  I just have to find an hour where I 
can look over the last webrev and review all of the static methods and 
it looks like it could go in.

I'm not necessarily thinking of solving everyone's caching concerns, 
though that would be nice, I'm mainly looking at making the caching 
solution used here easy to verify, easy to work on without making a 
mistake in the future, and easy to tune without having to play with a 
bunch of variants of static methods.  In the end, all of them boil down 
to 3 actions - allocate, grow, and dispose - but since it is using 
static methods then that means an explosion to deal with things like 
array type, dirtiness and, potentially in the future, different growth 
metrics.  I think that luckily the current difference in metrics happens 
to align with whether the arrays are dirty or not, so the matrix is 
smaller than it might otherwise be, but there would be an exponential 
growth in the number of "very similarly named" static methods should we 
introduce any other reason for one particular growable array instance to 
be different...

			...jim

On 6/9/2015 1:36 PM, Laurent Bourgès wrote:
> Hi Jim,
>
> Your proposal is interesting: I would have loved doing such smart array
> cache for general use !
> Do you think such a general ArrayCache would be helpful for other java2d
> or jdk algorithms needing intensive array allocation ?
>
> However, I focused mostly my work on getting the maximum performance and
> also not changing too much the original pisces code (to let you review
> it more easily).
>
>     Here's a suggestion on how I would set up the array caching code,
>     mainly with regard to making it easier to make sure we are managing
>     then consistently, but also to allow for more flexible tuning later.
>
>
> I agree the array management code may look complicated (but less than
> C/C++).
> As I am using netbeans, I am abusing of its 'Find Usages' feature :
> Usages of Renderer.edgeBuckets [11 occurrences]
> sun.java2d.marlin.Renderer
> addLine
>    394:  final int[] _edgeBuckets      = edgeBuckets;
> Renderer
>    472:  edgeBuckets = edgeBuckets_initial;
> init
> *  523:  edgeBuckets = rdrCtx.getIntArray(edgeBucketsLength);
> *dispose
>    589:  if (edgeBuckets == edgeBuckets_initial) {
>    591:  IntArrayCache.fill(edgeBuckets,      buckets_minY,
> *  597:  rdrCtx.putIntArray(edgeBuckets,      buckets_minY,
> *  599:  edgeBuckets = edgeBuckets_initial;
>    605:  } else if (edgeBuckets != edgeBuckets_initial) {
> *  607:  rdrCtx.putIntArray(edgeBuckets, 0, 0);
> *  608:  edgeBuckets = edgeBuckets_initial;
> _endRendering
>    708:  final int[] _edgeBuckets = edgeBuckets;
>
> So I can quickly check if I am using the appropriate IntArrayCache
> (clean variant) !
>
>     First, make all interactions go through the cache object itself -
>     get, grow, put.  This gets rid of a number of static methods with a
>     bunch of similar names and groups the management methods in a class
>     hierarchy.
>
>
> It seems a good approach but requires me a big refactoring: a Cache_V2
> will gather many Cache_V1 (including bucket / size management).
> Moreover, it raises another issue: for now all caches are gathered into
> the ArrayCacheHolder to wrap them with a single WeakReference => small
> memory footprint + early GC cleanup.
>
> That's the aim of the RendererContext.getArrayCachesHolder() method
> which permits dynamic ArrayCachesHolder retrieval or creation. Actually
> a Cache is needed to both get/grow but also release an array and
> meanwhile, it is possible that GC had freed the Cache memory !
>
> But this point is more problematic below:
>
>     Second, have each consumer of arrays use its own cache object.  Some
>     of them may be the same object reference between different "users",
>     and some of them may be different objects but share their underlying
>     arrays with each other, but a given piece of code could be traced to
>     always get/grow/put its arrays into the same cache object.  That way
>     it's easier to see that you aren't mixing/matching clean/dirty arrays.
>
>
> It seems a bit verbose: 1 cache field per array = adding a lot of class
> fields ! Maybe the same cache reference could be still used for "friend'
> arrays like: edgeBuckets/edgeBucketCounts, crossings/aux_crossings, ...
>
>     DirtyFooCache
>          foo[] get(size)
>          foo[] grow(foo[], newsize)  // or widen?
>          void put(foo[])
>
>     CleanFooCache
>          foo[] get(size)
>          foo[] grow(foo[], newsize, <used parts>)
>          void put(foo[], <used parts>)
>
>
> Looks like a good design.
>
>
>     Then code that manages an array named bar would be managed as:
>     (It may be the same instance as another FooCache
>       or it may share its arrays with another FooCache
>       but the "barCache" reference is only ever used for the bar array...)
>          [Clean|Dirty]FooCache barCache = make/get/share a Cache instance;
>          foo[] bar;
>          ...
>          bar = barCache.get(initSize);
>          ...
>          bar = barCache.grow/widen(bar, newSize[, usage if clean]);
>          ...
>          barCache.put(bar[, usage if clean]);
>          bar = null;
>
>     It's easier to verify that bar is being managed consistently if it
>     always goes back to its own bar object.  You can then decide if too
>     different uses of foo[] arrays should share the same arrays or not
>     based on profiling.  You can also potentially have different growth
>     algorithms for 2 different arrays of the same type, but possibly
>     even sharing the same underlying arrays.  Any degree of
>     customization is possible underneath the covers behind a FooCache
>     basic interface.
>
>
> I like your proposal but I maximized the array sharing ie Cache
> instances for following reasons:
> - maximize array reuse (bucket approach) = reduce the memory footprint
> => lower GC overhead because less memory is wasted by cached arrays
> - each RendererContext has its own Cache instances: in multithreaded
> environment (web server), it becomes critical to have the smaller memory
> footprint and avoid concurrency issue = no shared cache between threads
> - all that work can be seen as an Thread Local Array Allocation (TLAA)
> that could be part of the GC algorithm itself to benefit to the all VM:
> dirty vs clean arrays, reusing vs allocation + GC, partial cleaning
> (Arrays 2.0 ?) ...
>
>     I'd also either make the growth heuristics an argument to
>     creating/fetching a FooCache instance or make it subject to a helper
>     "CacheSizeManager" or "CacheSizeStrategy" object in such a way that
>     different uses of IntCache might be able to have different growth
>     strategies - and to possibly even manage different sizing strategies
>     on top of the same set of underlying arrays (i.e. one has buckets
>     that grow by <<2 and another by <<1 and for the even buckets, they
>     share arrays, for instance).
>
>
> It led me to imagine a nice Array Cache framework at the jdk or jvm
> level ... at least not exclusive to Marlin !
>
>     The main thing for the near term, is that it would be nice to have
>     each array use its own cache object so that we can easily verify
>     that it is being consistent with its handling of arrays...
>
>
> If you are not happy with the latest webrev (better naming convention),
> we should discuss again how to do all these changes in a progressive manner.
>
> Cheers,
> Laurent