on non-static field layout

Dmytro Sheyko dmytro.sheyko at gmail.com
Mon Feb 16 08:04:40 UTC 2015


Hello,

The proposed field layout algorithm is related to the following bug reports:
https://bugs.openjdk.java.net/browse/JDK-8024912
https://bugs.openjdk.java.net/browse/JDK-8024913

Explicitly adding aleksey.shipilev at oracle.com, because it seems he
worked on field layout (when he worked on @Contended).

Thanks,
Dmytro

--------
> From: dmytro_sheyko at hotmail.com
> To: hotspot-dev at openjdk.java.net
> Subject: on non-static field layout
> Date: Thu, 12 Feb 2015 17:53:47 +0200

Hello,

I would like to share a couple of thoughts and proposals about non-static
field layout and get feedback from you.

1. about reference fields

I can see that the VM tries to lay out reference fields together.
Moreover, the reference fields of a subclass can be appended to the block
of reference fields of its superclass, reducing the number of oop_map
entries (especially with -XX:FieldsAllocationStyle=2). However, reference
fields can still be scattered throughout the object, and the oop_map can
have more than one entry.

What if reference fields were allocated BEFORE the header (at negative
offsets)? In that case they would all form a single contiguous cluster.
Maybe we wouldn't need the oop_map at all; knowing just the number of
reference fields would be enough.
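
To illustrate, here is a minimal C++ sketch of how reference slots could be
found from the count alone. It assumes 4-byte heap oops (as with compressed
oops); kBytesPerHeapOop, oop_offset and for_each_oop_slot are hypothetical
names made up for this example, not HotSpot APIs.

```cpp
#include <cassert>

// Assumption: stand-in for HotSpot's BytesPerHeapOop (4 with compressed oops).
constexpr int kBytesPerHeapOop = 4;

// Offset of the k-th reference field relative to the header, k = 1..oops_count.
// All reference fields live at negative offsets, directly before the header.
constexpr int oop_offset(int k) {
    return -(k * kBytesPerHeapOop);
}

// Visit every reference slot of an object, knowing only the number of
// reference fields. No per-class map of offsets is consulted.
template <typename Visitor>
void for_each_oop_slot(char* obj, int oops_count, Visitor visit) {
    for (int k = 1; k <= oops_count; ++k) {
        visit(obj + oop_offset(k));
    }
}
```

With three reference fields the slots are at offsets -4, -8 and -12, so a
single (pointer, count) pair describes them all.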

2. about filling gaps

The current approach to field layout tries to allocate fields densely by
sorting them by size and placing them from largest (long/double) to
smallest (byte/boolean). It tries to fill the gap that may appear before
long/double fields due to alignment with shorter fields of the same class
(-XX:+CompactFields). But this approach is still not perfect, because it
does not fill gaps between fields in superclasses.

I believe we can allocate fields more densely (i.e. without
unnecessary gaps in superclasses) in one pass (i.e. without
sorting). When fields are aligned and packed densely, there can be
at most one 1-byte gap, at most one 2-byte gap and at most one
4-byte gap. So we can just keep track of these gaps and use them when
the occasion arises. E.g. when we need to allocate a 2-byte field, we
first try to use the 2-byte gap, otherwise we try to use the 4-byte gap
(actually only its first half; the second half becomes the 2-byte gap),
otherwise we append the field to the end.
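
The 2-byte case described above can be sketched in standalone C++ as follows.
This is only an illustration under my own assumptions: the Gaps struct and
place_short are hypothetical names, and an offset of 0 is used to mean "no
such gap" (offset 0 always lies inside the header, so it can never be a gap).

```cpp
#include <cassert>

// Hypothetical bookkeeping for a densely packed layout.
struct Gaps {
    int end;   // 8-byte aligned end of the part laid out so far
    int gap4;  // offset of the free 4-byte slot, or 0 if there is none
    int gap2;  // offset of the free 2-byte slot, or 0 if there is none
};

// Place one 2-byte field and return its offset.
inline int place_short(Gaps& g) {
    int off;
    if (g.gap2 != 0) {
        // 1) an exactly fitting 2-byte gap exists: consume it
        off = g.gap2;
        g.gap2 = 0;
    } else if (g.gap4 != 0) {
        // 2) take the first half of the 4-byte gap;
        //    the second half becomes the new 2-byte gap
        off = g.gap4;
        g.gap2 = g.gap4 + 2;
        g.gap4 = 0;
    } else {
        // 3) no usable gap: append a fresh 8-byte unit and
        //    record the unused remainder as gaps
        off = g.end;
        g.gap2 = g.end + 2;
        g.gap4 = g.end + 4;
        g.end += 8;
    }
    return off;
}
```

Starting from {end = 16, gap4 = 12, gap2 = 0}, three consecutive 2-byte
fields land at offsets 12, 14 and 16.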

Finally, the algorithm for nonstatic field layout may look something like this:

int oops_count;    // number of oop fields
int descent_size;  // 8b aligned size of the part of the object that is
                   // below the header (header and primitive fields, but not oop fields)
int vacant_4b_off; // offset of vacant 4 byte space (always 4b aligned),
                   // 0 if there is no such space
int vacant_2b_off; // offset of vacant 2 byte space (always 2b aligned),
                   // 0 if there is no such space
int vacant_1b_off; // offset of vacant 1 byte space,
                   // 0 if there is no such space

// Before laying out nonstatic fields, copy this information (i.e.
// oops_count, descent_size, vacant_?b_off) from the super class.
// For java.lang.Object they have the following values:
//                 | 32 bit | 64 bit                 | 64 bit                 |
//                 |        | -XX:+UseCompressedOops | -XX:-UseCompressedOops |
//   --------------+--------+------------------------+------------------------+
//   *header size* |      8 |                     12 |                     16 |
//   oops_count    |      0 |                      0 |                      0 |
//   descent_size  |      8 |                     16 |                     16 |
//   vacant_4b_off |      0 |                     12 |                      0 |
//   vacant_2b_off |      0 |                      0 |                      0 |
//   vacant_1b_off |      0 |                      0 |                      0 |


for (AllFieldStream fs(_fields, _cp); !fs.done(); fs.next()) {

   int real_offset;
   FieldAllocationType atype = (FieldAllocationType) fs.allocation_type();

   switch (atype) {
   ...
   case NONSTATIC_OOP: {
      // just prepend
      oops_count += 1;
      real_offset = -(oops_count * BytesPerHeapOop);
      break;
   }
   case NONSTATIC_DOUBLE: { // 8 bytes: long or double
      // just append
      real_offset   = descent_size;
      descent_size += BytesPerLong;
      break;
   }
   case NONSTATIC_WORD: { // 4 bytes: int or float
      if (vacant_4b_off != 0) {
         // use vacant 4b space if possible
         real_offset   = vacant_4b_off;
         vacant_4b_off = 0;
      } else {
         // otherwise append...
         real_offset   = descent_size;
         // ... and the second half of appended 8 bytes becomes vacant 4b space
         vacant_4b_off = descent_size + BytesPerInt;
         descent_size += BytesPerLong;
      }
      break;
   }
   case NONSTATIC_SHORT: { // 2 bytes: short or char
      if (vacant_2b_off != 0) {
         // use vacant 2b space if possible...
         real_offset   = vacant_2b_off;
         vacant_2b_off = 0;
      } else if (vacant_4b_off != 0) {
         // then try to use the first half of vacant 4b space...
         real_offset   = vacant_4b_off;
         // ... and the second half becomes vacant 2b space
         vacant_2b_off = vacant_4b_off + BytesPerShort;
         vacant_4b_off = 0;
      } else {
         // otherwise append
         real_offset   = descent_size;
         // the rest becomes vacant
         vacant_2b_off = descent_size + BytesPerShort;
         vacant_4b_off = descent_size + BytesPerInt;
         descent_size += BytesPerLong;
      }
      break;
   }
   case NONSTATIC_BYTE: { // 1 byte: byte or boolean
      if (vacant_1b_off != 0) {
         real_offset   = vacant_1b_off;
         vacant_1b_off = 0;
      } else if (vacant_2b_off != 0) {
         real_offset   = vacant_2b_off;
         vacant_1b_off = vacant_2b_off + 1;
         vacant_2b_off = 0;
      } else if (vacant_4b_off != 0) {
         real_offset   = vacant_4b_off;
         vacant_1b_off = vacant_4b_off + 1;
         vacant_2b_off = vacant_4b_off + BytesPerShort;
         vacant_4b_off = 0;
      } else {
         real_offset   = descent_size;
         vacant_1b_off = descent_size + 1;
         vacant_2b_off = descent_size + BytesPerShort;
         vacant_4b_off = descent_size + BytesPerInt;
         descent_size += BytesPerLong;
      }
      break;
   }
   } // end of switch
   fs.set_offset(real_offset);
}

// ascent_size: 8b aligned size of the part of the object that is above the header
int ascent_size = align_size_up(oops_count * BytesPerHeapOop, BytesPerLong);

// When a class instance is allocated, the pointer to the allocated space
// has to be advanced by ascent_size to get the pointer to the object.
//
// pointer to allocated space--> [ ref field m  ] <-+
//                               ...                | -- ascent size
//                               [ ref field 1  ] <-+
// pointer to object ----------> [ header       ] <-+
//                               [ prim field 1 ]   |
//                               [ prim field 2 ]   + -- descent size
//                               ...                |
//                               [ prim field n ] <-+

int total_size = ascent_size + descent_size;
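
The size computation and the pointer adjustment from the diagram can be
sketched in standalone C++ as below. This is a rough illustration only:
InstanceSizes, instance_sizes, object_from_allocation and
allocation_from_object are hypothetical names, and bytes_per_heap_oop stands
in for BytesPerHeapOop.

```cpp
#include <cassert>

constexpr int kBytesPerLong = 8;

// Round n up to a multiple of alignment (alignment must be a power of two).
constexpr int align_up(int n, int alignment) {
    return (n + alignment - 1) & ~(alignment - 1);
}

// Hypothetical per-class size record.
struct InstanceSizes {
    int ascent_size;   // 8b aligned size of the reference fields above the header
    int descent_size;  // 8b aligned size of header plus primitive fields
    int total_size() const { return ascent_size + descent_size; }
};

// Derive the sizes from the number of reference fields and the
// already computed descent_size.
inline InstanceSizes instance_sizes(int oops_count, int bytes_per_heap_oop,
                                    int descent_size) {
    InstanceSizes s;
    s.ascent_size = align_up(oops_count * bytes_per_heap_oop, kBytesPerLong);
    s.descent_size = descent_size;
    return s;
}

// The allocator hands out the start of the raw space; the object pointer
// (the address of the header) is ascent_size bytes further.
inline char* object_from_allocation(char* raw, const InstanceSizes& s) {
    return raw + s.ascent_size;
}

// Inverse mapping, e.g. for anything that needs the start of the raw space.
inline char* allocation_from_object(char* obj, const InstanceSizes& s) {
    return obj - s.ascent_size;
}
```

For example, 3 reference fields of 4 bytes each and a descent_size of 24 give
an ascent_size of 16 (12 rounded up to 8b alignment) and a total size of 40,
with 4 bytes of padding at the very start of the allocated space.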

Regards,
Dmytro

