CRR (M): 7157073: G1: type change size_t -> uint for region counts / indexes

Wed Apr 4 18:37:34 UTC 2012

So:

maybe no: Igor and Mikael
yes: Bengt

Any more votes? :-)

Tony

On 04/03/2012 06:02 PM, Mikael Vidstedt wrote:
>
> I was mostly interested in the background to the change and felt the 
> same way Igor did about the potential problems of computing a size by 
> for example shifting the ints. I'm not willing to fight for it though :)
>
> /Mikael
>
> On 2012-04-03 09:11, Tony Printezis wrote:
>> So, Bengt made his case. Any objections to this getting pushed?
>>
>> Tony
>>
>> On 04/02/2012 10:09 AM, Bengt Rutisson wrote:
>>>
>>> Hi everyone,
>>>
>>> I like this change because I think it adds information. I know I'm 
>>> probably wrong about this in a strict meaning (based on the 
>>> definition of size_t) but here's how I see it:
>>>
>>> To me size_t communicates that this is a variable that contains a 
>>> value that is somehow related to a memory size. For example, I want 
>>> to allocate an object of a size_t size or a region has a size_t size.
>>>
>>> If I want to have a value that is related to the memory size, but is 
>>> not a size but rather based on a pointer I would use intptr_t. Even 
>>> though the compiler might allow me to do similar operations on a 
>>> size_t and on most (all our supported) platforms they will have the 
>>> same width. But it adds information to say intptr_t instead of 
>>> size_t. It tells the reader that this is actually a pointer and not 
>>> a size.
>>>
>>> In the same fashion I think something that is a "count" - number of 
>>> something - should be some kind of int-value rather than a size_t. 
>>> That communicates to the reader that this is a number and not a size 
>>> or a pointer.
>>>
>>> Personally I think "number of regions" in G1 is just such a case. It 
>>> is not a memory size, so for me it adds extra information to change 
>>> it from size_t to int. For this particular value I also think it 
>>> shows the intent better as well. We are aiming for 5000 regions or 
>>> something in that order. We are not seriously suggesting billions of 
>>> regions. And if we were, I think we should use the int64_t type to 
>>> communicate that.
>>>
>>> Not sure if this made any sense. I won't insist on this being pushed 
>>> -  it just makes much more sense to me.
>>>
>>> Bengt
>>>
>>> On 2012-04-02 15:29, Tony Printezis wrote:
>>>> PS Peter Kessler suggested the use of specific types for region 
>>>> counts and indexes, like:
>>>>
>>>> typedef uint region_count_t;
>>>> typedef uint region_index_t;
>>>>
>>>> to make a little more explicit what variables / fields represent. I 
>>>> think it's a good idea. If we decide to go with this change, let me 
>>>> know if this is something I should also add.
>>>>
>>>> Tony
>>>>
>>>> On 04/02/2012 09:27 AM, Tony Printezis wrote:
>>>>> Mikael and Igor,
>>>>>
>>>>> Members of the GC group objected to the use of size_t for counts 
>>>>> and indexes. Personally, I don't mind it given that it's less 
>>>>> error-prone (IMHO) and I think a bit more consistent (less casting 
>>>>> during heap size -> index calculations). But, I'm OK with the 
>>>>> change too (if we have lots of counts / indexes in some data 
>>>>> structures, they will be a little more cache friendly).
>>>>>
>>>>> I'm happy if folks "fight it out" and tell me what to do - I'm OK 
>>>>> either way.
>>>>>
>>>>> Tony
>>>>>
>>>>> On 03/30/2012 06:41 PM, Mikael Vidstedt wrote:
>>>>>>
>>>>>> I'm also interested in the background to the change. For 
>>>>>> structures that are very common in memory it makes sense to 
>>>>>> streamline the size of the fields and structs. Is that the case 
>>>>>> here or is something else driving the change?
>>>>>>
>>>>>> Thanks,
>>>>>> Mikael
>>>>>>
>>>>>> On 2012-03-30 15:06, Igor Veresov wrote:
>>>>>>> On a second thought, what is the rationale for doing this? This 
>>>>>>> seems to be quite error prone I have to say. In case where you 
>>>>>>> have to compute a size of something that is in units that are 
>>>>>>> less than a region you have to take special care and do the 
>>>>>>> cast. Like if you say: size_t region_offset_bytes = region_index 
>>>>>>> << HeapRegion::LogOfHRGrainBytes; that would already be an error 
>>>>>>> that is not easy to find - you'll just get a bogus answer and 
>>>>>>> the compiler will be fine with it.
>>>>>>>
>>>>>>> Just my $0.02..
>>>>>>>
>>>>>>> igor
>>>>>>>
>>>>>>> On Tuesday, March 27, 2012 at 9:34 AM, Tony Printezis wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I'd like a couple of reviews for this cleanup. We have been using
>>>>>>>> size_t's to represent region counts / indexes and I have been 
>>>>>>>> encouraged
>>>>>>>> to change those to uint's. This unfortunately turned out to be 
>>>>>>>> quite an
>>>>>>>> extensive change that was hard to localize. I've split the 
>>>>>>>> change into
>>>>>>>> two webrevs.
>>>>>>>>
>>>>>>>> First, this breaks down a long line in g1CollectedHeap.cpp. I I 
>>>>>>>> have it
>>>>>>>> on a separate webrev as it messes up the webrev pages for the 
>>>>>>>> rest of
>>>>>>>> the changes (i.e., the rest of the changes are easier to read 
>>>>>>>> with this
>>>>>>>> excluded):
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~tonyp/7157073/webrev.0/webrev.0.G1LongLineFix/ 
>>>>>>>> <http://cr.openjdk.java.net/%7Etonyp/7157073/webrev.0/webrev.0.G1LongLineFix/> 
>>>>>>>>
>>>>>>>>
>>>>>>>> The main body of the changes is here:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~tonyp/7157073/webrev.0/webrev.1.G1Uints/ 
>>>>>>>> <http://cr.openjdk.java.net/%7Etonyp/7157073/webrev.0/webrev.1.G1Uints/> 
>>>>>>>>
>>>>>>>>
>>>>>>>> Some important comments:
>>>>>>>>
>>>>>>>> * I tried to change as many of the size_t occurrences as 
>>>>>>>> possible but I
>>>>>>>> didn't touch the following:
>>>>>>>>
>>>>>>>> - HeapRegionRemSet - we're in the process of reworking this 
>>>>>>>> code so I
>>>>>>>> decided to leave it mostly as is (I only added a few necessary 
>>>>>>>> casts)
>>>>>>>>
>>>>>>>> - CollectionSetChooser / HeapRegion::sort_index() /
>>>>>>>> HeapRegion::_sort_index - I have a CR open for a CSetChooser 
>>>>>>>> cleanup and
>>>>>>>> I'll do the conversion as part of that cleanup (I did change a 
>>>>>>>> few types
>>>>>>>> on the CSetChooser external API to avoid casts from methods 
>>>>>>>> that access it).
>>>>>>>>
>>>>>>>> - HeapRegion::_young_index_in_cset / 
>>>>>>>> HeapRegion::young_index_in_cset() -
>>>>>>>> The code looks as if it relies on the fact that this is an int. 
>>>>>>>> It seems
>>>>>>>> well localized so I left it as is and we can look at it on a 
>>>>>>>> separate CR.
>>>>>>>>
>>>>>>>> * I had to make some changes to the vmStructs_g1.hpp and I need to
>>>>>>>> ensure that the SA code works with those changes. 
>>>>>>>> Unfortunately, I'm
>>>>>>>> going on vacation for a few days tonight and I haven't had a 
>>>>>>>> chance to
>>>>>>>> do that yet. I'll look at the SA more closely when I get back 
>>>>>>>> and, if I
>>>>>>>> need to make any changes to the SA, I'll publish those 
>>>>>>>> separately (they
>>>>>>>> will be self-contained anyway). I just wanted to get the main 
>>>>>>>> body of
>>>>>>>> the changes out before I leave so any potential reviewers can get
>>>>>>>> started on it.
>>>>>>>>
>>>>>>>> * Testing: the source builds on all platforms and passes JPRT. 
>>>>>>>> And I did
>>>>>>>> a lot of extra 64-bit testing (as any issues should mainly show 
>>>>>>>> up in
>>>>>>>> 64-bit builds).
>>>>>>>>
>>>>>>>> Tony
>>>>>>>
>>>>>>
>>>
>