performance and memory optimization of layouts
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Aug 13 13:46:53 UTC 2020
I can no longer find your repository.
I think I've suggested something in the past related to a similar issue,
not sure if you acted on it or not.
Basically, the suggestion was to define a set of your own layout
constants, each carrying a special attribute that can be used to
decide whether something is a NativeInteger or something else. This
is the same approach used by the ABI layer, and it works very well.
With something like that there is no need to call equals(): you just
have to get the value of a well-known attribute (e.g. a lookup in a HashMap).
Maurizio
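The suggestion above can be sketched roughly as follows. This is only an illustration, not code from either poster: SketchLayout is a hypothetical stand-in that mimics just the withAttribute/attribute pair of the incubating layout API, so that the example runs on a plain JDK. The attribute name, the NativeKind enum and the wrapper names other than NativeInteger are likewise assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a value layout: a size plus named attributes.
// It only mimics the withAttribute/attribute pair; it is NOT the real API.
final class SketchLayout {
    private final long bitSize;
    private final Map<String, Object> attributes;

    SketchLayout(long bitSize) {
        this(bitSize, Map.of());
    }

    private SketchLayout(long bitSize, Map<String, Object> attributes) {
        this.bitSize = bitSize;
        this.attributes = attributes;
    }

    // Returns a copy of this layout carrying one extra attribute.
    SketchLayout withAttribute(String name, Object value) {
        Map<String, Object> copy = new HashMap<>(attributes);
        copy.put(name, value);
        return new SketchLayout(bitSize, Map.copyOf(copy));
    }

    // Looks up an attribute by name; empty if the layout does not carry it.
    Optional<Object> attribute(String name) {
        return Optional.ofNullable(attributes.get(name));
    }

    long bitSize() {
        return bitSize;
    }
}

// Hypothetical tag stored in the attribute: one constant per wrapper type.
enum NativeKind { BYTE, INTEGER, LONG, POINTER }

final class MyLayouts {
    // Hypothetical attribute name, chosen for this sketch.
    static final String KIND = "abstraction/kind";

    // The abstraction layer's own layout constants, each tagged once up front.
    static final SketchLayout BYTE = new SketchLayout(8).withAttribute(KIND, NativeKind.BYTE);
    static final SketchLayout INT = new SketchLayout(32).withAttribute(KIND, NativeKind.INTEGER);
    static final SketchLayout LONG = new SketchLayout(64).withAttribute(KIND, NativeKind.LONG);
    static final SketchLayout POINTER = new SketchLayout(64).withAttribute(KIND, NativeKind.POINTER);

    // Picks the wrapper from the tag: one map lookup, no chain of equals() calls.
    static String wrapperNameFor(SketchLayout layout) {
        Object tag = layout.attribute(KIND)
                .orElseThrow(() -> new IllegalArgumentException("layout has no kind tag"));
        switch ((NativeKind) tag) {
            case BYTE:    return "NativeByte";
            case INTEGER: return "NativeInteger";
            case LONG:    return "NativeLong";
            case POINTER: return "NativePointer";
            default:      throw new AssertionError(tag);
        }
    }
}
```

With this shape, MyLayouts.wrapperNameFor(MyLayouts.INT) selects the integer wrapper after a single attribute lookup, and a layout that was never tagged fails fast instead of falling through an else-if chain. Note that the sketch still allocates an Optional per lookup, so it addresses the equals() cost but not the allocation concern raised in the quoted message.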
On 13/08/2020 14:06, Ty Young wrote:
> Hi,
>
>
> I took a little time to look into optimizing the performance of my
> abstraction layer as FMA hasn't changed in any radical, breaking way
> and I'm happy with the overall design of my abstraction layer.
>
>
> In order to look into what could be optimized, I set the number of
> worker threads in my JavaFX application to 1 so that Nvidia attribute
> updates are done sequentially, which makes it easier to reason about
> how much of a performance impact any given one has and why. I
> then used NetBeans' built-in profiler to see where the CPU time was
> being spent. Runnables to be updated are given to the worker thread
> pool every 500 ms.
>
>
> Unsurprisingly to me, besides PCIe TX/RX attributes which supposedly
> are hung up within NVML itself, the attribute that represents GPU
> processes is the worst by far (see img1). This attribute is actually
> multiple native function calls jammed into one attribute which all
> utilize arrays of structs.
>
>
> Viewing the call tree (see img2) shows that a major contributor to
> this cost is ValueLayout.equals(), but there is some
> self-time in the upper NativeObject.getNativeObject() and
> NativeValue.ofUnsafeValueeLayout calls as well. ValueLayout.equals()
> is used in an if-else chain because you need to know which NativeValue
> implementation should be returned. If the layout is an integer then
> return NativeInteger, for example. It may be possible to order this
> if-else chain so that common cases return without hitting
> every else-if (e.g. bytes first, then integers, then longs, etc.), but
> that's always going to be a presumptuous, arbitrary order that may not
> actually be faster in some situations.
>
>
> What could be done to improve this? I can't think of any absolute
> fixes but an improvement would be to extend the ValueLayout so that
> you have a NumberLayout and a PointerLayout. You could then use
> instanceof to presumably filter things faster and more cheaply so that
> the mentioned else-if chain does not need to check for a pointer
> layout. The PointerLayout specific checks could be moved to its own
> static method. It's a small change, but presumably still an
> improvement.
>
>
> Unfortunately I can't do this myself because of sealed types so here I
> am.
>
>
> Another thing that needs optimizing is the memory allocation waste of
> getting an attribute. Every call to attribute(String name) allocated a
> new Optional instance, which was often used by my abstraction for
> a check and then immediately discarded. I wanted to do a bunch of
> layout checks to make sure that the MemoryLayout is valid, but after
> seeing the amount of garbage being generated stand out like a sore
> thumb, I decided to remove those checks (they are really important
> too). The amount of memory wasted wasn't worth it. The answer to this
> is presumably going to be value types, but it isn't clear when those
> are going to be delivered.
>
>
> Once again, if MemoryLayout and its extensions weren't sealed I could
> do things to improve both performance and memory waste, as well as fix
> other issues, like attributes being factored into equality checks
> when that isn't wanted. Yes, I realize I'm beating a dead horse at this
> point, but that dead horse is still causing issues.
>
>
> Could the suggested ValueLayout changes be done, at the very least? Or
> maybe some kind of equals() performance optimization or something?
>
More information about the panama-dev
mailing list