Stack allocation prototype for C2
Nikola Grcevski
Nikola.Grcevski at microsoft.com
Wed Jul 8 21:18:06 UTC 2020
Hi Andrew,
>Here's my concern.
>
>Java stacks are, in general, pretty small. This is good, and makes for
>economical memory usage. This is particularly useful for Project Loom,
>where there can be enormous numbers of "virtual" threads. These threads,
>while they are not active, are stored in the heap.
>
>As you might imagine, the idea of embedded objects (which, of course,
>cannot be collected) in these virtual threads does not delight me at all.
>Is this likely to be a real problem, do you think, or are all of the
>stack-allocated objects so small that I shouldn't be concerned?
Your concern about the memory consumption increase is very valid, especially in the context of Project Loom.
We only stack allocate Java objects of size 256 B or less and arrays with fewer than 64 elements. There is also a per-method limit in C2 on how many stack slots can be used for stack allocation; once that limit is reached, we stop stack allocating further objects. These checks bound the overall amount of stack space that can be consumed. We see stack allocation as an addition to scalar replacement. Scalar replacement already increases the stack size, and we expect stack allocation to grow the stack by a similar, though somewhat larger, amount per object, because scalar replacement does not preserve the header words or unused fields, whereas stack-allocated objects do.
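To make those limits concrete, here is a minimal Java sketch (not code from the prototype; the class and method names are invented for illustration) of allocation sites that would and would not fit the size checks described above, assuming escape analysis proves the objects do not escape:

// Illustrative only: the names below are invented, and whether C2 actually
// stack allocates either site also depends on escape analysis and on the
// per-method stack slot budget.
public class StackAllocCandidates {

    // Small value-like object, well under 256 B, which never escapes
    // distance(): the kind of candidate the prototype targets.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    static double distance(double x1, double y1, double x2, double y2) {
        Point a = new Point(x1, y1); // non-escaping, small: candidate
        Point b = new Point(x2, y2); // non-escaping, small: candidate
        double dx = a.x - b.x, dy = a.y - b.y;
        return Math.sqrt(dx * dx + dy * dy);
    }

    static double sumOfSquares(double[] input) {
        // A 64-element array is over the length limit (fewer than 64
        // elements), so this scratch buffer would stay on the heap even
        // though it never escapes.
        double[] scratch = new double[64];
        double sum = 0.0;
        for (int i = 0; i < input.length; i++) {
            scratch[i % scratch.length] = input[i] * input[i];
            sum += scratch[i % scratch.length];
        }
        return sum;
    }
}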
We have collected some static data to understand the amount of increase of the stack size, but perhaps we need to extend the measurement in scenarios that are closer to typical project Loom use cases.
Out of all programs in the Renaissance benchmark suite, ALS is where we stack allocate the most. About 2,500 methods are compiled with C2 in ALS, and the average per-method stack sizes are listed below:
No stack allocation, average per method stack size: 69.9 B
With stack allocation, average per method stack size: 72.2 B
Average stack allocated object size: 25.7 B
MAX stack allocated object size: 96 B
That comes to an increase of less than 2.5 bytes, or about 3%, in the average per-method stack size (72.2 B vs. 69.9 B) in the program where we have seen the most opportunities so far.
We observe similar numbers in the DaCapoScala benchmark suite in the benchmarks where we stack allocate a lot: TMT and FACTORIE.
FACTORIE (around 650 compiled methods)
No stack allocation, average per method stack size: 63.9 B
With stack allocation, average per method stack size: 66.6 B
Average stack allocated object size: 25.5 B
MAX stack allocated object size: 48 B
TMT (around 900 compiled methods)
No stack allocation, average per method stack size: 67.4 B
With stack allocation, average per method stack size: 70.1 B
Average stack allocated object size: 23.5 B
MAX stack allocated object size: 40 B
If there is data from other workloads you would like to see, in particular workloads using Loom, please let us know. Also, if there are any other metrics you would like to see, we can add them to our must-gather list going forward.
If it turns out that the stack size increase is unacceptable, we can add further heuristics to do a cost-benefit analysis when deciding whether to stack allocate a given allocation candidate. For example, we might stack allocate only smaller objects, objects allocated in loops, or only those in high-frequency code; a rough sketch of such a filter follows below.
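Purely as an illustration of what such a filter might look like (the real heuristic would live in C2's escape analysis code in C++; the thresholds and profile inputs below are hypothetical and not taken from the prototype), a sketch in Java:

// Hypothetical cost/benefit filter combining the existing size limits with
// invented loop/frequency inputs. Thresholds are placeholders.
final class StackAllocPolicy {
    static final int MAX_OBJECT_BYTES   = 256; // size limit described above
    static final int MAX_ARRAY_ELEMENTS = 64;  // array length limit described above

    // sizeInBytes:    instance size of the candidate allocation
    // arrayLength:    -1 for non-arrays, element count otherwise
    // inLoop:         true if the allocation sits inside a loop
    // relativeFreq:   profiled execution frequency of the allocating block (0..1)
    // stackBytesUsed: stack space already claimed by earlier candidates
    // stackBudget:    per-method budget for stack-allocated objects
    static boolean shouldStackAllocate(int sizeInBytes, int arrayLength,
                                       boolean inLoop, double relativeFreq,
                                       int stackBytesUsed, int stackBudget) {
        if (sizeInBytes > MAX_OBJECT_BYTES) return false;
        if (arrayLength >= MAX_ARRAY_ELEMENTS) return false;
        if (stackBytesUsed + sizeInBytes > stackBudget) return false;
        // Hypothetical extra filter: accept cold-path candidates only if they
        // are very small, so rarely executed code does not grow every frame.
        boolean hot = inLoop || relativeFreq > 0.1;
        return hot || sizeInBytes <= 64;
    }
}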
Thanks for reviewing.
Nikola
-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Andrew Haley
Sent: July 2, 2020 4:16 AM
To: Charlie Gracie <Charlie.Gracie at microsoft.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: Stack allocation prototype for C2
On 29/06/2020 22:05, Charlie Gracie wrote:
> Here is the prototype code for our work on adding stack allocation to
> the HotSpot C2 compiler. We are looking for any and all feedback as we
> hope to move from a prototype to something that could be contributed.
We certainly need a repo where it can go. It could either be adopted by an existing project or it could have a project of its own. The latter is perhaps a bad idea because it would be too isolated.
> A change of this size is difficult to review so we understand the
> process will be thorough and will take time to complete. Any
> suggestions on how to allow for collaboration with others, if they
> wanted to, would also be appreciated (i.e., a repo somewhere).
Here's my concern.
Java stacks are, in general, pretty small. This is good, and makes for economical memory usage. This is particularly useful for Project Loom, where there can be enormous numbers of "virtual" threads. These threads, while they are not active, are stored in the heap.
As you might imagine, the idea of embedded objects (which, of course, cannot be collected) in these virtual threads does not delight me at all. Is this likely to be a real problem, do you think, or are all of the stack-allocated objects so small that I shouldn't be concerned?
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671