RFR JDK-8059510 Compact symbol table layout inside shared archive

Fri Oct 3 01:57:53 UTC 2014

I like the new compact position-independent format, but I am uncomfortable with
adding #ifdefs unless there is a good reason to do so.  What's the reason here?

Maybe this is really a question for Ioi, but why use two data structures when one will do the job?

That is, if you have to implement and support the position-independent format for LP64,
why not use it also for !LP64?  The extra "addl" instructions and table base pointer are noise.

And it's not just one localized #ifdef; there are several in the proposed changeset.
If we do relocatable images in the future, the divergent relocation rules will cause even more.

Overall, we should be supporting both 32- and 64-bit systems in common code,
and more so over time, not splitting new code with #ifdefs.

— John

P.S. One might think, "what's another #ifdef when there are so many?".
It's a judgement call, of course.  But note these two grep counts:
$ cat $(hg loc -I src/share/vm) | grep -c '#.*LP64'
236
$ cat ~/Downloads/hotspot-7.patch | grep -c '#.*LP64'
9
The proposed change adds, all by itself, 4% to our #ifdef load for LP64.

On Oct 2, 2014, at 2:33 PM, Jiangli Zhou <jiangli.zhou at oracle.com> wrote:

> Please review the webrev for JDK-8059510 <https://bugs.openjdk.java.net/browse/JDK-8059510> for JDK9: http://cr.openjdk.java.net/~jiangli/8059510/webrev.00/.
> 
> The shared classes in the CDS archive and runtime loaded classes used the same symbol table, which was a hashtable with 24-byte entries on 64-bit machine or 12-byte entries on 32-bit machine. It also used a pointer for bucket slot. In the webrev, we separate the symbol table for shared classes and runtime classes into two. While the runtime symbole table remain unchanged, the shared classes use a much compact table, which uses 8-byte per entry on both 32-bit and 64-bit machines. Each entry contains the symbol hash (4-byte). On 32-bit machine, it contains the pointer (4-byte) to the symbol. On 64-bit machine, it uses 4-byte offset from the base of the table.
> 
> // juint hash;
> //#ifdef _LP64
> // juint offset; /* Symbol *sym = (Symbol*)(SharedBaseAddress + offset) */
> //#else
> // Symbol* sym;
> //#endif
> 
> 
> The shared symbol lookup is quick. The targeting bucket is calculated using the hash (bucket index = hash % _bucket_count). The bucket sizes are pre-calculated and also stored in the archive along with the symbol table. So we don't need to calculate the bucket sizes at runtime.
> 
> The separate shared symbol table in the archive is now read-only during runtime. No entry is added/removed from the shared symbol table. Rehashing of the runtime symbol table does not affect the shared symbol table in the archive either. This helps memory sharing by avoid writes to the shared memory.
> 
> As part of the change, two dumping utilities were added to jcmd for dumping symbol table and string table.
> 
> The majority of the code in the webrev were contributed by Ioi Lam.
> 
> Thanks,
> Jiangli
> 
>