RFR (xs) 8145940: TempNewSymbol should have correct copy and assignment functions

Coleen Phillimore coleen.phillimore at oracle.com
Wed Jan 13 20:50:45 UTC 2016


Hi, Unfortunately, you're all getting another webrev.  I decided that I 
should make the test more robust, in the case where java 
-XX:+ExecuteInternalVMTests -version loads a class with the symbol 
"abc", "efg", "hij", etc. in it.

open webrev at http://cr.openjdk.java.net/~coleenp/8145940.05/


Thanks,
Coleen


On 1/13/16 1:21 PM, Lois Foltan wrote:
>
> On 1/13/2016 11:28 AM, Coleen Phillimore wrote:
>>
>> Hi, Thanks Kim, Markus and Lois.  This is the latest version that 
>> I've retested and added to the internal test:
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8145940.03/
>
> I think this looks good.  A few minor comments:
>
> - symbolTable.cpp - you could have the test case contain a 
> Symbol*/TempNewSymbol* defined within an inner scope to ensure the 
> correct ref counting happens within the inner scope and upon exit from 
> the inner scope.
>
> - symbolTable.hpp - Glad to see the comment ahead of the assignment 
> operator.  I find the last sentence a bit confusing. Can you rephrase 
> to something like, "rhs is passed by value, within the scope of this 
> method it is a copy.  At method exit it contains the former value of 
> _temp, triggering the correct refcount decrement upon destruction."
>
> I don't need to see another webrev.
>
> Thanks,
> Lois
>
>>
>> Thanks,
>> Coleen
>>
>> On 1/13/16 5:35 AM, Markus Gronlund wrote:
>>> Hi Coleen,
>>>
>>> The Test_TempNewSymbol() look fine - thank you for adding it.
>>>
>>> Maybe you could also complement with a simple self-assignment test 
>>> as well?
>>> ...
>>>
>>>    s1 = s1; // self assignment
>>>    assert(s1->refcount() == 2, "should still be two abc (s1 and ss)");
>>>
>>> ...
>>>
>>> Thanks!
>>> Markus
>>>
>>>
>>> -----Original Message-----
>>> From: Coleen Phillimore
>>> Sent: den 12 januari 2016 22:59
>>> To: Markus Gronlund; Kim Barrett
>>> Cc: hotspot-runtime-dev
>>> Subject: Re: RFR (xs) 8145940: TempNewSymbol should have correct 
>>> copy and assignment functions
>>>
>>>
>>>
>>> On 1/12/16 7:34 AM, Markus Gronlund wrote:
>>>> Hi Kim,
>>>>
>>>> Thanks a lot for the (detailed) reminder about the copy-and-swap 
>>>> idiom. I agree with your reasoning.
>>> Two for (strongly?) and two against (myself, not so strongly and we 
>>> can always add a comment to save someone else's time). I can change 
>>> it (hoping my test still pass).  Did you look at the test?
>>>> About the return value from operator=, I would suggest retaining 
>>>> the void as-is as well.
>>> Ok
>>>> Maybe we should investigate if it is possible to have the functions 
>>>> in SymbolTable that increment Symbol*'s return a TempNewSymbol( or 
>>>> some other "smartptr" concept) instead of a direct Symbol*?
>>> When changing symbolOops symbolHandles, to Symbol* with refcounting 
>>> I had an implementation using SymbolHandle as a smart pointer.  I 
>>> think I had the assignment operator transfer ownership of the 
>>> pointer, iirc.  I had added reference counting statistics at that 
>>> time (which I have
>>> output in my RFR).   My memory was that the number of Atomic:: add and
>>> subtract operations were an order of magnitude larger.  I saved a 
>>> lot of email on this except the numbers.  Symbols mostly come into 
>>> the JVM through the constant pool, with these exceptions that use 
>>> TempNewSymbol.
>>>
>>>> In that case we could make the type conversion constructor private 
>>>> and make SymbolTable a friend of TempNewSymbol - this would avoid 
>>>> accidental wrapping of arbitrary Symbol*'s. I will look into the 
>>>> possibility of this.
>>> This sounds like a good idea.
>>>
>>> Thanks,
>>> Coleen
>>>> Thanks again for your clarification
>>>> Markus
>>>>
>>>> -----Original Message-----
>>>> From: Kim Barrett
>>>> Sent: den 12 januari 2016 05:22
>>>> To: Coleen Phillimore
>>>> Cc: hotspot-runtime-dev
>>>> Subject: Re: RFR (xs) 8145940: TempNewSymbol should have correct copy
>>>> and assignment functions
>>>>
>>>> On Jan 11, 2016, at 8:34 PM, Coleen Phillimore 
>>>> <coleen.phillimore at oracle.com> wrote:
>>>>> Kim and Lois,
>>>>>
>>>>> Thanks for looking at this.   This is my latest webrev with the 
>>>>> ExecuteInternalVMTest in it:
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/8145940.02/webrev/index.html
>>>>>
>>>>> See comments below.
>>>>>
>>>>> On 1/11/16 6:45 PM, Kim Barrett wrote:
>>>>>> On Jan 7, 2016, at 3:02 PM, Coleen Phillimore 
>>>>>> <coleen.phillimore at oracle.com> wrote:
>>>>>>> Summary: Add clear() to the assignment operator and add copy 
>>>>>>> constructor.
>>>>>>>
>>>>>>> Ran all jtreg, colocated and non-colocated tests. RunThese -jck 
>>>>>>> with PrintSymbolTableSizeHistogram statistics:
>>>>>>>
>>>>>>>    Percent removed          1.35
>>>>>>>    Reference counts          194583
>>>>>>>
>>>>>>> clean:
>>>>>>>
>>>>>>>    Percent removed          1.53
>>>>>>>    Reference counts          194245
>>>>>>>
>>>>>>> Without a reference counting copy constructor, we could remove a 
>>>>>>> TempNewSymbol's Symbol if a GC happens. Consider:
>>>>>>>
>>>>>>> TempNewSymbol ts = SymbolTable::new_symbol("abc"); // Hit GC
>>>>>>>
>>>>>>> The ref count for "abc" is 1 when created by new_symbol, and the 
>>>>>>> destructor could run after the copy into ts, decrementing the 
>>>>>>> reference count to 0 again.  GC could unlink that symbol from 
>>>>>>> the symbol table.  Fortunately, we haven't seen this bug.
>>>>>>>
>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8145940/
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8145940
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>> For the assignment operator, I think better is:
>>>>>>
>>>>>> void operator=(TempNewSymbol s) {
>>>>>>     Symbol* tmp = s._temp;
>>>>>>     s._temp = _temp;
>>>>>>     _temp = tmp;
>>>>>> }
>>>>>>
>>>>>> This is the well-known copy and swap idiom, with a by-hand "swap".
>>>>>>
>>>>>> http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap
>>>>>> -
>>>>>> idiom
>>>>> Very tricky.  Lois and I figured out the trick.  It doesn't elide 
>>>>> any Atomic::* reference counting though, which is what is 
>>>>> worrysome for performance, just hides it in the destructor of 
>>>>> 's'.  I don't want to change this.
>>>> Copy-swap saves several refcount manipulations in some (common) 
>>>> cases, by taking advantage of rvalue copy elision.
>>>>
>>>> Coleen's:
>>>>
>>>> void operator=(const TempNewSymbol& s) {
>>>>     clear();
>>>>     _temp = s._temp;
>>>>     if (_temp != NULL) _temp->increment_refcount(); }
>>>>
>>>> Markus's:
>>>>
>>>> const TempNewSymbol& operator=(const TempNewSymbol& rhs) {
>>>>     if (this != &rhs && _temp != rhs._temp) {
>>>>       clear();
>>>>       _temp = rhs._temp;
>>>>       if (_temp != NULL) {
>>>>         _temp->increment_refcount();
>>>>       }
>>>>     }
>>>>     return *this;
>>>> }
>>>>
>>>> Kim's:
>>>>
>>>> void operator=(TempNewSymbol rhs) {
>>>>     Symbol* tmp = rhs._temp;
>>>>     rhs._temp = _temp;
>>>>     _temp = tmp;
>>>> }
>>>>
>>>> [I'm ignoring differences in the return value in the discussion below.
>>>> I think it ought to be either "TempNewSymbol&" (not const) or void.]
>>>>
>>>> Markus's is mostly the same as Coleen's, except it handles a couple 
>>>> of special cases differently.
>>>>
>>>> Markus's handles self-assignment, where Coleen's doesn't. For 
>>>> correctness that's important; in actual practice self-assignment is 
>>>> generally rare to non-existent, so always paying overhead for it is 
>>>> unpleasant.
>>>>
>>>> Markus's also specially handles re-assignment to the same symbol, 
>>>> avoiding any refcount changes in that case.  I suggest this is 
>>>> another rare case, and adding overhead in the normal case to get 
>>>> that rare benefit is not the right tradeoff.
>>>>
>>>> Assuming we're not in one of those special cases and neither 
>>>> Symbol* is NULL, Coleen/Markus decrements the old and increments 
>>>> the new. And in an important special case where rhs is an rvalue, 
>>>> there will be another decrement as the temporary goes out of scope. 
>>>> So two refcount changes when rhs is an lvalue, three when it's an 
>>>> rvalue.
>>>>
>>>> In contrast, Kim's does an increment and a decrement when rhs is an 
>>>> lvalue, and *only* a decrement when rhs is an rvalue and rvalue 
>>>> copy elision is performed by the compiler, which any decent 
>>>> optimizing compiler will do.
>>>>
>>>> The rhs is an rvalue in the following situation:
>>>>
>>>>     TempNewSymbol tns = ... ;
>>>>     ...
>>>>     tns = expr_returning_Symbol* ;
>>>>
>>>> The rhs gets implicitly converted to a temporary (rvalue) 
>>>> TempNewSymbol referring to the symbol (no refcount change), that 
>>>> temporary is directly bound to the assignment argument (copy 
>>>> elided), swap the symbols, and the rvalue converted temp goes out 
>>>> of scope and decrements the refcount of the old Symbol*.
>>>>
>>>> The copy-swap version also handles the (rare) self-assignment and 
>>>> re-assignment to the same symbol cases. In these cases Kim's still 
>>>> does the same refcounting changes as otherwise. But it's better to 
>>>> do extra work in the rare cases and keep the normal cases fast.
>>>>
>>>> This is why copy-swap has become the "usual idiom"for assignment 
>>>> operators. There might be cases where it's not the best answer, but 
>>>> usually it is. And this is why (to answer Lois's issue) one should 
>>>> use it even though it's not obvious if one is not previously 
>>>> familiar with the idiom; one should instead learn the idiom, and 
>>>> then one spends time looking hard at implementations that do 
>>>> something else.
>>>>
>>>> Of course, this may all be entirely in the noise; I hope that 
>>>> TempNewSymbol assignment is not a high frequency operation.
>>>>
>>>>
>>>>>> I think the updated description of TempNewSymbol isn't really right.
>>>>>> Because the conversion constructor doesn't increment the reference
>>>>>> count, it must *not* be used to capture some arbitrary reference to
>>>>>> a Symbol*. Only a new symbol or one obtained from one of the lookup
>>>>>> functions really should be managed by this class.
>>>>> Yes, this is a subtlety of the design and one that I can't really 
>>>>> fix right now.  I thought of changing TempNewSymbol to convert 
>>>>> from const char * but there are cases where we don't want to add a 
>>>>> symbol.
>>>>>
>>>>> There is a comment above TempNewSymbol.  I don't know what to add 
>>>>> to it.   And Markus's comments are really good.
>>>> I was confused about which webrev I was looking at. Coleen's 
>>>> doesn't seem to change those comments.
>>>> It was (parts of) Markus's comments that worried me.  And the whole 
>>>> design is tricky and risky.
>>>>
>>
>



More information about the hotspot-runtime-dev mailing list