RFR (xs) 8145940: TempNewSymbol should have correct copy and assignment functions

Lois Foltan lois.foltan at oracle.com
Wed Jan 13 21:14:08 UTC 2016


Looks good!
Lois

On 1/13/2016 3:50 PM, Coleen Phillimore wrote:
>
> Hi, Unfortunately, you're all getting another webrev.  I decided that 
> I should make the test more robust, in the case where java 
> -XX:+ExecuteInternalVMTests -version loads a class with the symbol 
> "abc", "efg", "hij", etc. in it.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8145940.05/
>
>
> Thanks,
> Coleen
>
>
> On 1/13/16 1:21 PM, Lois Foltan wrote:
>>
>> On 1/13/2016 11:28 AM, Coleen Phillimore wrote:
>>>
>>> Hi, Thanks Kim, Markus and Lois.  This is the latest version that 
>>> I've retested and added to the internal test:
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8145940.03/
>>
>> I think this looks good.  A few minor comments:
>>
>> - symbolTable.cpp - you could have the test case contain a 
>> Symbol*/TempNewSymbol* defined within an inner scope to ensure the 
>> correct ref counting happens within the inner scope and upon exit 
>> from the inner scope.
>>
>> - symbolTable.hpp - Glad to see the comment ahead of the assignment 
>> operator.  I find the last sentence a bit confusing. Can you rephrase 
>> to something like, "rhs is passed by value, within the scope of this 
>> method it is a copy.  At method exit it contains the former value of 
>> _temp, triggering the correct refcount decrement upon destruction."
>>
>> I don't need to see another webrev.
>>
>> Thanks,
>> Lois
>>
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 1/13/16 5:35 AM, Markus Gronlund wrote:
>>>> Hi Coleen,
>>>>
>>>> The Test_TempNewSymbol() look fine - thank you for adding it.
>>>>
>>>> Maybe you could also complement with a simple self-assignment test 
>>>> as well?
>>>> ...
>>>>
>>>>    s1 = s1; // self assignment
>>>>    assert(s1->refcount() == 2, "should still be two abc (s1 and ss)");
>>>>
>>>> ...
>>>>
>>>> Thanks!
>>>> Markus
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Coleen Phillimore
>>>> Sent: den 12 januari 2016 22:59
>>>> To: Markus Gronlund; Kim Barrett
>>>> Cc: hotspot-runtime-dev
>>>> Subject: Re: RFR (xs) 8145940: TempNewSymbol should have correct 
>>>> copy and assignment functions
>>>>
>>>>
>>>>
>>>> On 1/12/16 7:34 AM, Markus Gronlund wrote:
>>>>> Hi Kim,
>>>>>
>>>>> Thanks a lot for the (detailed) reminder about the copy-and-swap 
>>>>> idiom. I agree with your reasoning.
>>>> Two for (strongly?) and two against (myself, not so strongly and we 
>>>> can always add a comment to save someone else's time). I can change 
>>>> it (hoping my test still pass).  Did you look at the test?
>>>>> About the return value from operator=, I would suggest retaining 
>>>>> the void as-is as well.
>>>> Ok
>>>>> Maybe we should investigate if it is possible to have the 
>>>>> functions in SymbolTable that increment Symbol*'s return a 
>>>>> TempNewSymbol( or some other "smartptr" concept) instead of a 
>>>>> direct Symbol*?
>>>> When changing symbolOops symbolHandles, to Symbol* with refcounting 
>>>> I had an implementation using SymbolHandle as a smart pointer.  I 
>>>> think I had the assignment operator transfer ownership of the 
>>>> pointer, iirc.  I had added reference counting statistics at that 
>>>> time (which I have
>>>> output in my RFR).   My memory was that the number of Atomic:: add and
>>>> subtract operations were an order of magnitude larger.  I saved a 
>>>> lot of email on this except the numbers.  Symbols mostly come into 
>>>> the JVM through the constant pool, with these exceptions that use 
>>>> TempNewSymbol.
>>>>
>>>>> In that case we could make the type conversion constructor private 
>>>>> and make SymbolTable a friend of TempNewSymbol - this would avoid 
>>>>> accidental wrapping of arbitrary Symbol*'s. I will look into the 
>>>>> possibility of this.
>>>> This sounds like a good idea.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>> Thanks again for your clarification
>>>>> Markus
>>>>>
>>>>> -----Original Message-----
>>>>> From: Kim Barrett
>>>>> Sent: den 12 januari 2016 05:22
>>>>> To: Coleen Phillimore
>>>>> Cc: hotspot-runtime-dev
>>>>> Subject: Re: RFR (xs) 8145940: TempNewSymbol should have correct copy
>>>>> and assignment functions
>>>>>
>>>>> On Jan 11, 2016, at 8:34 PM, Coleen Phillimore 
>>>>> <coleen.phillimore at oracle.com> wrote:
>>>>>> Kim and Lois,
>>>>>>
>>>>>> Thanks for looking at this.   This is my latest webrev with the 
>>>>>> ExecuteInternalVMTest in it:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~coleenp/8145940.02/webrev/index.html
>>>>>>
>>>>>> See comments below.
>>>>>>
>>>>>> On 1/11/16 6:45 PM, Kim Barrett wrote:
>>>>>>> On Jan 7, 2016, at 3:02 PM, Coleen Phillimore 
>>>>>>> <coleen.phillimore at oracle.com> wrote:
>>>>>>>> Summary: Add clear() to the assignment operator and add copy 
>>>>>>>> constructor.
>>>>>>>>
>>>>>>>> Ran all jtreg, colocated and non-colocated tests. RunThese -jck 
>>>>>>>> with PrintSymbolTableSizeHistogram statistics:
>>>>>>>>
>>>>>>>>    Percent removed          1.35
>>>>>>>>    Reference counts          194583
>>>>>>>>
>>>>>>>> clean:
>>>>>>>>
>>>>>>>>    Percent removed          1.53
>>>>>>>>    Reference counts          194245
>>>>>>>>
>>>>>>>> Without a reference counting copy constructor, we could remove 
>>>>>>>> a TempNewSymbol's Symbol if a GC happens. Consider:
>>>>>>>>
>>>>>>>> TempNewSymbol ts = SymbolTable::new_symbol("abc"); // Hit GC
>>>>>>>>
>>>>>>>> The ref count for "abc" is 1 when created by new_symbol, and 
>>>>>>>> the destructor could run after the copy into ts, decrementing 
>>>>>>>> the reference count to 0 again.  GC could unlink that symbol 
>>>>>>>> from the symbol table.  Fortunately, we haven't seen this bug.
>>>>>>>>
>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8145940/
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8145940
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>> For the assignment operator, I think better is:
>>>>>>>
>>>>>>> void operator=(TempNewSymbol s) {
>>>>>>>     Symbol* tmp = s._temp;
>>>>>>>     s._temp = _temp;
>>>>>>>     _temp = tmp;
>>>>>>> }
>>>>>>>
>>>>>>> This is the well-known copy and swap idiom, with a by-hand "swap".
>>>>>>>
>>>>>>> http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap 
>>>>>>>
>>>>>>> -
>>>>>>> idiom
>>>>>> Very tricky.  Lois and I figured out the trick.  It doesn't elide 
>>>>>> any Atomic::* reference counting though, which is what is 
>>>>>> worrysome for performance, just hides it in the destructor of 
>>>>>> 's'.  I don't want to change this.
>>>>> Copy-swap saves several refcount manipulations in some (common) 
>>>>> cases, by taking advantage of rvalue copy elision.
>>>>>
>>>>> Coleen's:
>>>>>
>>>>> void operator=(const TempNewSymbol& s) {
>>>>>     clear();
>>>>>     _temp = s._temp;
>>>>>     if (_temp != NULL) _temp->increment_refcount(); }
>>>>>
>>>>> Markus's:
>>>>>
>>>>> const TempNewSymbol& operator=(const TempNewSymbol& rhs) {
>>>>>     if (this != &rhs && _temp != rhs._temp) {
>>>>>       clear();
>>>>>       _temp = rhs._temp;
>>>>>       if (_temp != NULL) {
>>>>>         _temp->increment_refcount();
>>>>>       }
>>>>>     }
>>>>>     return *this;
>>>>> }
>>>>>
>>>>> Kim's:
>>>>>
>>>>> void operator=(TempNewSymbol rhs) {
>>>>>     Symbol* tmp = rhs._temp;
>>>>>     rhs._temp = _temp;
>>>>>     _temp = tmp;
>>>>> }
>>>>>
>>>>> [I'm ignoring differences in the return value in the discussion 
>>>>> below.
>>>>> I think it ought to be either "TempNewSymbol&" (not const) or void.]
>>>>>
>>>>> Markus's is mostly the same as Coleen's, except it handles a 
>>>>> couple of special cases differently.
>>>>>
>>>>> Markus's handles self-assignment, where Coleen's doesn't. For 
>>>>> correctness that's important; in actual practice self-assignment 
>>>>> is generally rare to non-existent, so always paying overhead for 
>>>>> it is unpleasant.
>>>>>
>>>>> Markus's also specially handles re-assignment to the same symbol, 
>>>>> avoiding any refcount changes in that case.  I suggest this is 
>>>>> another rare case, and adding overhead in the normal case to get 
>>>>> that rare benefit is not the right tradeoff.
>>>>>
>>>>> Assuming we're not in one of those special cases and neither 
>>>>> Symbol* is NULL, Coleen/Markus decrements the old and increments 
>>>>> the new. And in an important special case where rhs is an rvalue, 
>>>>> there will be another decrement as the temporary goes out of 
>>>>> scope. So two refcount changes when rhs is an lvalue, three when 
>>>>> it's an rvalue.
>>>>>
>>>>> In contrast, Kim's does an increment and a decrement when rhs is 
>>>>> an lvalue, and *only* a decrement when rhs is an rvalue and rvalue 
>>>>> copy elision is performed by the compiler, which any decent 
>>>>> optimizing compiler will do.
>>>>>
>>>>> The rhs is an rvalue in the following situation:
>>>>>
>>>>>     TempNewSymbol tns = ... ;
>>>>>     ...
>>>>>     tns = expr_returning_Symbol* ;
>>>>>
>>>>> The rhs gets implicitly converted to a temporary (rvalue) 
>>>>> TempNewSymbol referring to the symbol (no refcount change), that 
>>>>> temporary is directly bound to the assignment argument (copy 
>>>>> elided), swap the symbols, and the rvalue converted temp goes out 
>>>>> of scope and decrements the refcount of the old Symbol*.
>>>>>
>>>>> The copy-swap version also handles the (rare) self-assignment and 
>>>>> re-assignment to the same symbol cases. In these cases Kim's still 
>>>>> does the same refcounting changes as otherwise. But it's better to 
>>>>> do extra work in the rare cases and keep the normal cases fast.
>>>>>
>>>>> This is why copy-swap has become the "usual idiom"for assignment 
>>>>> operators. There might be cases where it's not the best answer, 
>>>>> but usually it is. And this is why (to answer Lois's issue) one 
>>>>> should use it even though it's not obvious if one is not 
>>>>> previously familiar with the idiom; one should instead learn the 
>>>>> idiom, and then one spends time looking hard at implementations 
>>>>> that do something else.
>>>>>
>>>>> Of course, this may all be entirely in the noise; I hope that 
>>>>> TempNewSymbol assignment is not a high frequency operation.
>>>>>
>>>>>
>>>>>>> I think the updated description of TempNewSymbol isn't really 
>>>>>>> right.
>>>>>>> Because the conversion constructor doesn't increment the reference
>>>>>>> count, it must *not* be used to capture some arbitrary reference to
>>>>>>> a Symbol*. Only a new symbol or one obtained from one of the lookup
>>>>>>> functions really should be managed by this class.
>>>>>> Yes, this is a subtlety of the design and one that I can't really 
>>>>>> fix right now.  I thought of changing TempNewSymbol to convert 
>>>>>> from const char * but there are cases where we don't want to add 
>>>>>> a symbol.
>>>>>>
>>>>>> There is a comment above TempNewSymbol.  I don't know what to add 
>>>>>> to it.   And Markus's comments are really good.
>>>>> I was confused about which webrev I was looking at. Coleen's 
>>>>> doesn't seem to change those comments.
>>>>> It was (parts of) Markus's comments that worried me.  And the 
>>>>> whole design is tricky and risky.
>>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list