review request (M): partial 6711911: remove HeapWord dependency from MemRegion

Thu Jul 10 12:59:54 PDT 2008

Some comments inline.

John Rose (John.Rose at Sun.COM) wrote:
> (Sent to the general list, since MemRegion is a low-level class used  
> in many places.)
> 
> I've been working on object layout extensions, and have run into a  
> limitation in the MemRegion type that I want to fix.  Since a proper  
> fix will require a number of trivial code touches, I thought I'd send  
> out a heads-up.
> 
> Problem:  The MemRegion type is integral to all sorts of address  
> range calculations, but it is unable to resolve offsets or sizes  
> smaller than the native word size.  This is particularly a problem with  
> compressed oops, since they are 32 bits on a 64-bit machine.  It also  
> makes MemRegions less useful (and potentially buggy) for fine-grained  
> address range calculations.

As I understand it, this is a feature: it guarantees HeapWord
alignment.  We in gc-land are the main consumers of MemRegion, so we
are mainly concerned with objects and heap regions (e.g., old gen or
eden), which are constrained to start and end on HeapWord boundaries.
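
A minimal sketch of why the word-typed interface gives alignment for
free.  This is not the real MemRegion; Word, WordRegion and
is_word_aligned are made-up names for illustration, and Word only
stands in for HeapWord as an opaque, pointer-sized unit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HeapWord: an opaque unit the size of a pointer.
struct Word { void* unused; };

// Hypothetical, simplified region type (NOT HotSpot's MemRegion).
// The start is a Word* and the length is counted in words, so a
// sub-word start or size simply cannot be expressed.
class WordRegion {
  Word*  _start;
  size_t _word_size;   // length in words, not bytes
public:
  WordRegion(Word* start, size_t word_size)
    : _start(start), _word_size(word_size) {}

  Word*  start() const     { return _start; }
  Word*  end() const       { return _start + _word_size; }  // scaled by sizeof(Word)
  size_t word_size() const { return _word_size; }
  size_t byte_size() const { return _word_size * sizeof(Word); }
};

// True if p falls on a word boundary.
inline bool is_word_aligned(const void* p) {
  return reinterpret_cast<uintptr_t>(p) % sizeof(Word) == 0;
}
```

As long as the region is built from a properly aligned Word*, end()
and byte_size() are word multiples by construction; no run-time check
is needed.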

> Solution:  Make the dependency on word size more explicit by putting  
> the word "word" into MemRegion member functions that depend somehow  
> on the HeapWord type.  Add byte-wise versions of the member  
> functions, putting the word "byte" into them.  The existing member  
> functions are given a neutral "void*" type (or they could be removed).
> 
> http://webrev.invokedynamic.info/jrose/6711911.memr/
> 
> The slight downside of this is that about half of the uses of the  
> "start" and "end" member functions appear to be linked to an  
> assumption about HeapWord, while the others look like pure (unscaled)  
> addresses.  When I recompile the system with the "start" and "end"  
> changed to return "void*", the places where those pointers are mixed  
> with HeapWord, or subject to address arithmetic, pop up as errors and  
> I change them to "start_word" and "end_word".  These are the trivial  
> code touches.  The benefit of this process is that each code touch  
> can be evaluated for whether it masks a bug with compressed oops.
> 
> Comments?
> 
> Thanks,
> -- John
> 
> P.S.  I think this change moves in the right direction along another  
> path, which is replacing many size computations in the JVM with  
> size_t instead of int scaled by HeapWord.  I suspect (though am not  
> sure) that there is no benefit to using scaled sizes (an int scaled  
> by HeapWordSize).  So eventually I think we should measure object  
> sizes and offsets with an unscaled size_t.  In any event, using int  
> instead of size_t (scaled or not) creates a constant overflow hazard  
> on 64-bit systems.

Amen to banishing int (signed types in general) for sizes.
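
To make the overflow hazard concrete, here is a rough sketch.  It
assumes a 32-bit int and an 8-byte heap word, which is typical for a
64-bit build but is still an assumption; fits_in_int32 and
kAssumedBytesPerWord are made-up names, not HotSpot code.

```cpp
#include <cstdint>
#include <limits>

// Assumption: 8-byte heap words, as on a typical 64-bit build.
constexpr uint64_t kAssumedBytesPerWord = 8;

// True if the given count can be stored in a signed 32-bit int.
constexpr bool fits_in_int32(uint64_t count) {
  return count <= static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
}

// Example sizes in bytes.
constexpr uint64_t k3GiB  = uint64_t(3)  << 30;
constexpr uint64_t k16GiB = uint64_t(16) << 30;
```

Under these assumptions a 3 GiB byte count already overflows an int,
while the same size counted in words still fits.  Scaling only moves
the limit out to 16 GiB; it does not remove it.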

As for scaling, the benefit mentioned above is that HeapWord alignment
is guaranteed.  You can't even represent an unaligned start address or
size.  FWIW, I've always wondered about the cost of the instructions to
scale the values.  But it's a nice form of error prevention.
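
For reference, a sketch of the trade-off, assuming 8-byte words (log2
of 3).  The names below are hypothetical, not the actual HotSpot
helpers.  Converting a scaled size is a single shift; an unscaled
size_t needs no conversion, but alignment then has to be checked
explicitly because the type no longer enforces it.

```cpp
#include <cstddef>

// Assumption: 8-byte words, so the scale factor is a shift by 3.
constexpr size_t kLogBytesPerWord = 3;
constexpr size_t kBytesPerWord    = size_t(1) << kLogBytesPerWord;

// Scaled -> unscaled: one shift, always yields a word multiple.
constexpr size_t words_to_bytes(size_t words) {
  return words << kLogBytesPerWord;
}

// Unscaled -> scaled: one shift, silently drops any sub-word remainder.
constexpr size_t bytes_to_words(size_t bytes) {
  return bytes >> kLogBytesPerWord;
}

// With unscaled sizes, alignment becomes an explicit run-time check.
constexpr bool is_word_multiple(size_t bytes) {
  return (bytes & (kBytesPerWord - 1)) == 0;
}
```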

-John