CRR (M): 7157073: G1: type change size_t -> uint for region counts / indexes

Igor Veresov igor.veresov at oracle.com
Thu Apr 5 17:39:01 UTC 2012


I think if we want to save on the data structure size it's better to narrow the type in the structure, and then add accessors that get and set the value and receive and return a size_t. This was we get rid of unnecessary casts and bugs and save on space. Also, asserts can be put in the setter making sure that the value is not bigger than the storage. 

igor


On Wednesday, April 4, 2012 at 11:37 AM, Tony Printezis wrote:

> So:
> 
> maybe no: Igor and Mikael
> yes: Bengt
> 
> Any more votes? :-)
> 
> Tony
> 
> On 04/03/2012 06:02 PM, Mikael Vidstedt wrote:
> > 
> > I was mostly interested in the background to the change and felt the 
> > same way Igor did about the potential problems of computing a size by 
> > for example shifting the ints. I'm not willing to fight for it though :)
> > 
> > /Mikael
> > 
> > On 2012-04-03 09:11, Tony Printezis wrote:
> > > So, Bengt made his case. Any objections to this getting pushed?
> > > 
> > > Tony
> > > 
> > > On 04/02/2012 10:09 AM, Bengt Rutisson wrote:
> > > > 
> > > > Hi everyone,
> > > > 
> > > > I like this change because I think it adds information. I know I'm 
> > > > probably wrong about this in a strict meaning (based on the 
> > > > definition of size_t) but here's how I see it:
> > > > 
> > > > To me size_t communicates that this is a variable that contains a 
> > > > value that is somehow related to a memory size. For example, I want 
> > > > to allocate an object of a size_t size or a region has a size_t size.
> > > > 
> > > > If I want to have a value that is related to the memory size, but is 
> > > > not a size but rather based on a pointer I would use intptr_t. Even 
> > > > though the compiler might allow me to do similar operations on a 
> > > > size_t and on most (all our supported) platforms they will have the 
> > > > same width. But it adds information to say intptr_t instead of 
> > > > size_t. It tells the reader that this is actually a pointer and not 
> > > > a size.
> > > > 
> > > > In the same fashion I think something that is a "count" - number of 
> > > > something - should be some kind of int-value rather than a size_t. 
> > > > That communicates to the reader that this is a number and not a size 
> > > > or a pointer.
> > > > 
> > > > Personally I think "number of regions" in G1 is just such a case. It 
> > > > is not a memory size, so for me it adds extra information to change 
> > > > it from size_t to int. For this particular value I also think it 
> > > > shows the intent better as well. We are aiming for 5000 regions or 
> > > > something in that order. We are not seriously suggesting billions of 
> > > > regions. And if we were, I think we should use the int64_t type to 
> > > > communicate that.
> > > > 
> > > > Not sure if this made any sense. I won't insist on this being pushed 
> > > > - it just makes much more sense to me.
> > > > 
> > > > Bengt
> > > > 
> > > > On 2012-04-02 15:29, Tony Printezis wrote:
> > > > > PS Peter Kessler suggested the use of specific types for region 
> > > > > counts and indexes, like:
> > > > > 
> > > > > typedef uint region_count_t;
> > > > > typedef uint region_index_t;
> > > > > 
> > > > > to make a little more explicit what variables / fields represent. I 
> > > > > think it's a good idea. If we decide to go with this change, let me 
> > > > > know if this is something I should also add.
> > > > > 
> > > > > Tony
> > > > > 
> > > > > On 04/02/2012 09:27 AM, Tony Printezis wrote:
> > > > > > Mikael and Igor,
> > > > > > 
> > > > > > Members of the GC group objected to the use of size_t for counts 
> > > > > > and indexes. Personally, I don't mind it given that it's less 
> > > > > > error-prone (IMHO) and I think a bit more consistent (less casting 
> > > > > > during heap size -> index calculations). But, I'm OK with the 
> > > > > > change too (if we have lots of counts / indexes in some data 
> > > > > > structures, they will be a little more cache friendly).
> > > > > > 
> > > > > > I'm happy if folks "fight it out" and tell me what to do - I'm OK 
> > > > > > either way.
> > > > > > 
> > > > > > Tony
> > > > > > 
> > > > > > On 03/30/2012 06:41 PM, Mikael Vidstedt wrote:
> > > > > > > 
> > > > > > > I'm also interested in the background to the change. For 
> > > > > > > structures that are very common in memory it makes sense to 
> > > > > > > streamline the size of the fields and structs. Is that the case 
> > > > > > > here or is something else driving the change?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Mikael
> > > > > > > 
> > > > > > > On 2012-03-30 15:06, Igor Veresov wrote:
> > > > > > > > On a second thought, what is the rationale for doing this? This 
> > > > > > > > seems to be quite error prone I have to say. In case where you 
> > > > > > > > have to compute a size of something that is in units that are 
> > > > > > > > less than a region you have to take special care and do the 
> > > > > > > > cast. Like if you say: size_t region_offset_bytes = region_index 
> > > > > > > > << HeapRegion::LogOfHRGrainBytes; that would already be an error 
> > > > > > > > that is not easy to find - you'll just get a bogus answer and 
> > > > > > > > the compiler will be fine with it.
> > > > > > > > 
> > > > > > > > Just my $0.02..
> > > > > > > > 
> > > > > > > > igor
> > > > > > > > 
> > > > > > > > On Tuesday, March 27, 2012 at 9:34 AM, Tony Printezis wrote:
> > > > > > > > 
> > > > > > > > > Hi all,
> > > > > > > > > 
> > > > > > > > > I'd like a couple of reviews for this cleanup. We have been using
> > > > > > > > > size_t's to represent region counts / indexes and I have been 
> > > > > > > > > encouraged
> > > > > > > > > to change those to uint's. This unfortunately turned out to be 
> > > > > > > > > quite an
> > > > > > > > > extensive change that was hard to localize. I've split the 
> > > > > > > > > change into
> > > > > > > > > two webrevs.
> > > > > > > > > 
> > > > > > > > > First, this breaks down a long line in g1CollectedHeap.cpp. I I 
> > > > > > > > > have it
> > > > > > > > > on a separate webrev as it messes up the webrev pages for the 
> > > > > > > > > rest of
> > > > > > > > > the changes (i.e., the rest of the changes are easier to read 
> > > > > > > > > with this
> > > > > > > > > excluded):
> > > > > > > > > 
> > > > > > > > > http://cr.openjdk.java.net/~tonyp/7157073/webrev.0/webrev.0.G1LongLineFix/ 
> > > > > > > > > <http://cr.openjdk.java.net/%7Etonyp/7157073/webrev.0/webrev.0.G1LongLineFix/> 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > The main body of the changes is here:
> > > > > > > > > 
> > > > > > > > > http://cr.openjdk.java.net/~tonyp/7157073/webrev.0/webrev.1.G1Uints/ 
> > > > > > > > > <http://cr.openjdk.java.net/%7Etonyp/7157073/webrev.0/webrev.1.G1Uints/> 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Some important comments:
> > > > > > > > > 
> > > > > > > > > * I tried to change as many of the size_t occurrences as 
> > > > > > > > > possible but I
> > > > > > > > > didn't touch the following:
> > > > > > > > > 
> > > > > > > > > - HeapRegionRemSet - we're in the process of reworking this 
> > > > > > > > > code so I
> > > > > > > > > decided to leave it mostly as is (I only added a few necessary 
> > > > > > > > > casts)
> > > > > > > > > 
> > > > > > > > > - CollectionSetChooser / HeapRegion::sort_index() /
> > > > > > > > > HeapRegion::_sort_index - I have a CR open for a CSetChooser 
> > > > > > > > > cleanup and
> > > > > > > > > I'll do the conversion as part of that cleanup (I did change a 
> > > > > > > > > few types
> > > > > > > > > on the CSetChooser external API to avoid casts from methods 
> > > > > > > > > that access it).
> > > > > > > > > 
> > > > > > > > > - HeapRegion::_young_index_in_cset / 
> > > > > > > > > HeapRegion::young_index_in_cset() -
> > > > > > > > > The code looks as if it relies on the fact that this is an int. 
> > > > > > > > > It seems
> > > > > > > > > well localized so I left it as is and we can look at it on a 
> > > > > > > > > separate CR.
> > > > > > > > > 
> > > > > > > > > * I had to make some changes to the vmStructs_g1.hpp and I need to
> > > > > > > > > ensure that the SA code works with those changes. 
> > > > > > > > > Unfortunately, I'm
> > > > > > > > > going on vacation for a few days tonight and I haven't had a 
> > > > > > > > > chance to
> > > > > > > > > do that yet. I'll look at the SA more closely when I get back 
> > > > > > > > > and, if I
> > > > > > > > > need to make any changes to the SA, I'll publish those 
> > > > > > > > > separately (they
> > > > > > > > > will be self-contained anyway). I just wanted to get the main 
> > > > > > > > > body of
> > > > > > > > > the changes out before I leave so any potential reviewers can get
> > > > > > > > > started on it.
> > > > > > > > > 
> > > > > > > > > * Testing: the source builds on all platforms and passes JPRT. 
> > > > > > > > > And I did
> > > > > > > > > a lot of extra 64-bit testing (as any issues should mainly show 
> > > > > > > > > up in
> > > > > > > > > 64-bit builds).
> > > > > > > > > 
> > > > > > > > > Tony 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20120405/32efc42e/attachment.htm>


More information about the hotspot-gc-dev mailing list