RFR (xs) 8145940: TempNewSymbol should have correct copy and assignment functions

Lois Foltan lois.foltan at oracle.com
Wed Jan 13 18:21:19 UTC 2016


On 1/13/2016 11:28 AM, Coleen Phillimore wrote:
>
> Hi, Thanks Kim, Markus and Lois.  This is the latest version that I've 
> retested and added to the internal test:
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8145940.03/

I think this looks good.  A few minor comments:

- symbolTable.cpp - you could have the test case contain a 
Symbol*/TempNewSymbol* defined within an inner scope to ensure the 
correct ref counting happens within the inner scope and upon exit from 
the inner scope.

- symbolTable.hpp - Glad to see the comment ahead of the assignment 
operator.  I find the last sentence a bit confusing.  Can you rephrase 
to something like, "rhs is passed by value, within the scope of this 
method it is a copy.  At method exit it contains the former value of 
_temp, triggering the correct refcount decrement upon destruction."

I don't need to see another webrev.

Thanks,
Lois

>
> Thanks,
> Coleen
>
> On 1/13/16 5:35 AM, Markus Gronlund wrote:
>> Hi Coleen,
>>
>> The Test_TempNewSymbol() look fine - thank you for adding it.
>>
>> Maybe you could also complement with a simple self-assignment test as 
>> well?
>> ...
>>
>>    s1 = s1; // self assignment
>>    assert(s1->refcount() == 2, "should still be two abc (s1 and ss)");
>>
>> ...
>>
>> Thanks!
>> Markus
>>
>>
>> -----Original Message-----
>> From: Coleen Phillimore
>> Sent: den 12 januari 2016 22:59
>> To: Markus Gronlund; Kim Barrett
>> Cc: hotspot-runtime-dev
>> Subject: Re: RFR (xs) 8145940: TempNewSymbol should have correct copy 
>> and assignment functions
>>
>>
>>
>> On 1/12/16 7:34 AM, Markus Gronlund wrote:
>>> Hi Kim,
>>>
>>> Thanks a lot for the (detailed) reminder about the copy-and-swap 
>>> idiom. I agree with your reasoning.
>> Two for (strongly?) and two against (myself, not so strongly and we 
>> can always add a comment to save someone else's time).  I can change 
>> it (hoping my test still pass).  Did you look at the test?
>>> About the return value from operator=, I would suggest retaining the 
>>> void as-is as well.
>> Ok
>>> Maybe we should investigate if it is possible to have the functions 
>>> in SymbolTable that increment Symbol*'s return a TempNewSymbol( or 
>>> some other "smartptr" concept) instead of a direct Symbol*?
>> When changing symbolOops symbolHandles, to Symbol* with refcounting I 
>> had an implementation using SymbolHandle as a smart pointer.  I think 
>> I had the assignment operator transfer ownership of the pointer, 
>> iirc.  I had added reference counting statistics at that time (which 
>> I have
>> output in my RFR).   My memory was that the number of Atomic:: add and
>> subtract operations were an order of magnitude larger.  I saved a lot 
>> of email on this except the numbers.  Symbols mostly come into the 
>> JVM through the constant pool, with these exceptions that use 
>> TempNewSymbol.
>>
>>> In that case we could make the type conversion constructor private 
>>> and make SymbolTable a friend of TempNewSymbol - this would avoid 
>>> accidental wrapping of arbitrary Symbol*'s. I will look into the 
>>> possibility of this.
>> This sounds like a good idea.
>>
>> Thanks,
>> Coleen
>>> Thanks again for your clarification
>>> Markus
>>>
>>> -----Original Message-----
>>> From: Kim Barrett
>>> Sent: den 12 januari 2016 05:22
>>> To: Coleen Phillimore
>>> Cc: hotspot-runtime-dev
>>> Subject: Re: RFR (xs) 8145940: TempNewSymbol should have correct copy
>>> and assignment functions
>>>
>>> On Jan 11, 2016, at 8:34 PM, Coleen Phillimore 
>>> <coleen.phillimore at oracle.com> wrote:
>>>> Kim and Lois,
>>>>
>>>> Thanks for looking at this.   This is my latest webrev with the 
>>>> ExecuteInternalVMTest in it:
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/8145940.02/webrev/index.html
>>>>
>>>> See comments below.
>>>>
>>>> On 1/11/16 6:45 PM, Kim Barrett wrote:
>>>>> On Jan 7, 2016, at 3:02 PM, Coleen Phillimore 
>>>>> <coleen.phillimore at oracle.com> wrote:
>>>>>> Summary: Add clear() to the assignment operator and add copy 
>>>>>> constructor.
>>>>>>
>>>>>> Ran all jtreg, colocated and non-colocated tests. RunThese -jck 
>>>>>> with PrintSymbolTableSizeHistogram statistics:
>>>>>>
>>>>>>    Percent removed          1.35
>>>>>>    Reference counts          194583
>>>>>>
>>>>>> clean:
>>>>>>
>>>>>>    Percent removed          1.53
>>>>>>    Reference counts          194245
>>>>>>
>>>>>> Without a reference counting copy constructor, we could remove a 
>>>>>> TempNewSymbol's Symbol if a GC happens. Consider:
>>>>>>
>>>>>> TempNewSymbol ts = SymbolTable::new_symbol("abc"); // Hit GC
>>>>>>
>>>>>> The ref count for "abc" is 1 when created by new_symbol, and the 
>>>>>> destructor could run after the copy into ts, decrementing the 
>>>>>> reference count to 0 again.  GC could unlink that symbol from the 
>>>>>> symbol table.  Fortunately, we haven't seen this bug.
>>>>>>
>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8145940/
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8145940
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>> For the assignment operator, I think better is:
>>>>>
>>>>> void operator=(TempNewSymbol s) {
>>>>>     Symbol* tmp = s._temp;
>>>>>     s._temp = _temp;
>>>>>     _temp = tmp;
>>>>> }
>>>>>
>>>>> This is the well-known copy and swap idiom, with a by-hand "swap".
>>>>>
>>>>> http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap
>>>>> -
>>>>> idiom
>>>> Very tricky.  Lois and I figured out the trick.  It doesn't elide 
>>>> any Atomic::* reference counting though, which is what is worrysome 
>>>> for performance, just hides it in the destructor of 's'.  I don't 
>>>> want to change this.
>>> Copy-swap saves several refcount manipulations in some (common) 
>>> cases, by taking advantage of rvalue copy elision.
>>>
>>> Coleen's:
>>>
>>> void operator=(const TempNewSymbol& s) {
>>>     clear();
>>>     _temp = s._temp;
>>>     if (_temp != NULL) _temp->increment_refcount(); }
>>>
>>> Markus's:
>>>
>>> const TempNewSymbol& operator=(const TempNewSymbol& rhs) {
>>>     if (this != &rhs && _temp != rhs._temp) {
>>>       clear();
>>>       _temp = rhs._temp;
>>>       if (_temp != NULL) {
>>>         _temp->increment_refcount();
>>>       }
>>>     }
>>>     return *this;
>>> }
>>>
>>> Kim's:
>>>
>>> void operator=(TempNewSymbol rhs) {
>>>     Symbol* tmp = rhs._temp;
>>>     rhs._temp = _temp;
>>>     _temp = tmp;
>>> }
>>>
>>> [I'm ignoring differences in the return value in the discussion below.
>>> I think it ought to be either "TempNewSymbol&" (not const) or void.]
>>>
>>> Markus's is mostly the same as Coleen's, except it handles a couple 
>>> of special cases differently.
>>>
>>> Markus's handles self-assignment, where Coleen's doesn't. For 
>>> correctness that's important; in actual practice self-assignment is 
>>> generally rare to non-existent, so always paying overhead for it is 
>>> unpleasant.
>>>
>>> Markus's also specially handles re-assignment to the same symbol, 
>>> avoiding any refcount changes in that case.  I suggest this is 
>>> another rare case, and adding overhead in the normal case to get 
>>> that rare benefit is not the right tradeoff.
>>>
>>> Assuming we're not in one of those special cases and neither Symbol* 
>>> is NULL, Coleen/Markus decrements the old and increments the new. 
>>> And in an important special case where rhs is an rvalue, there will 
>>> be another decrement as the temporary goes out of scope. So two 
>>> refcount changes when rhs is an lvalue, three when it's an rvalue.
>>>
>>> In contrast, Kim's does an increment and a decrement when rhs is an 
>>> lvalue, and *only* a decrement when rhs is an rvalue and rvalue copy 
>>> elision is performed by the compiler, which any decent optimizing 
>>> compiler will do.
>>>
>>> The rhs is an rvalue in the following situation:
>>>
>>>     TempNewSymbol tns = ... ;
>>>     ...
>>>     tns = expr_returning_Symbol* ;
>>>
>>> The rhs gets implicitly converted to a temporary (rvalue) 
>>> TempNewSymbol referring to the symbol (no refcount change), that 
>>> temporary is directly bound to the assignment argument (copy 
>>> elided), swap the symbols, and the rvalue converted temp goes out of 
>>> scope and decrements the refcount of the old Symbol*.
>>>
>>> The copy-swap version also handles the (rare) self-assignment and 
>>> re-assignment to the same symbol cases. In these cases Kim's still 
>>> does the same refcounting changes as otherwise. But it's better to 
>>> do extra work in the rare cases and keep the normal cases fast.
>>>
>>> This is why copy-swap has become the "usual idiom"for assignment 
>>> operators. There might be cases where it's not the best answer, but 
>>> usually it is. And this is why (to answer Lois's issue) one should 
>>> use it even though it's not obvious if one is not previously 
>>> familiar with the idiom; one should instead learn the idiom, and 
>>> then one spends time looking hard at implementations that do 
>>> something else.
>>>
>>> Of course, this may all be entirely in the noise; I hope that 
>>> TempNewSymbol assignment is not a high frequency operation.
>>>
>>>
>>>>> I think the updated description of TempNewSymbol isn't really right.
>>>>> Because the conversion constructor doesn't increment the reference
>>>>> count, it must *not* be used to capture some arbitrary reference to
>>>>> a Symbol*. Only a new symbol or one obtained from one of the lookup
>>>>> functions really should be managed by this class.
>>>> Yes, this is a subtlety of the design and one that I can't really 
>>>> fix right now.  I thought of changing TempNewSymbol to convert from 
>>>> const char * but there are cases where we don't want to add a symbol.
>>>>
>>>> There is a comment above TempNewSymbol.  I don't know what to add 
>>>> to it.   And Markus's comments are really good.
>>> I was confused about which webrev I was looking at.  Coleen's 
>>> doesn't seem to change those comments.
>>> It was (parts of) Markus's comments that worried me.  And the whole 
>>> design is tricky and risky.
>>>
>



More information about the hotspot-runtime-dev mailing list