[10] RFR: 8184603: Create ObjectStreamField signature lazily when possible

Roger Riggs Roger.Riggs at Oracle.com
Wed Jul 19 13:53:23 UTC 2017


Hi,

On 7/19/2017 5:49 AM, Peter Levart wrote:
> Hi Claes,
>
>
> On 07/17/2017 02:16 PM, Claes Redestad wrote:
>> Hi Peter!
>>
>> On 2017-07-15 14:08, Peter Levart wrote:
>>>
>>> It seems that interning signature(s) is important for correctness 
>>> (for example, in ObjectOutputStream.writeTypeString(str) the 'str' 
>>> is used to lookup a handle so that handles are put into stream 
>>> instead of the type signature(s) for multiple references to the same 
>>> type). Looking up objects in handles table is based on identity 
>>> comparison.
>>
>> Yes, interned signatures are important for correctness (and performance?)
>> of the current serialization implementation.
>>
>>>
>>> But there might be a way to obtain a singleton signature String per 
>>> type and still profit. By adding a field to java.lang.Class and 
>>> caching the JVM signature there. This would also be a useful public 
>>> method, don't you think?
>>
>> I have a nagging feeling that we should be careful about leaking
>> implementation details about the underlying VM through public APIs,
>> since making changes to various specifications is hard enough as it is.
>
> You're right. There's already more than enough "implementation 
> details" that pertain to JVM exposed through reflection API which was 
> supposed to represent Java - the language - view of the world. JVM 
> type signatures just happen to be used in serialization too, which is 
> another implementation detail which might change in the future (with 
> value types etc), so it's better to keep it private.
Right.
>
>>
>>>
>>> Out of 191 ObjectStreamField constructions I found in JDK sources, 
>>> there are only 39 distinct field types involved, so the number of 
>>> intern() calls is reduced by a factor of ~5. There's no need to 
>>> cache signature in ObjectStreamField(s) this way any more, but there 
>>> must still be a single final field for ObjectStreamField(s) 
>>> constructed with explicit signature(s).
>>>
>>> Here's how this looks in code:
>>>
>>> http://cr.openjdk.java.net/~plevart/misc/Class.getJvmTypeSignature/webrev.01/ 
>>>
>>
>> Could this be done as a ClassValue instead of another field on Class? My
>> guess is only a small number of classes in any given app will be directly
>> involved in serialization, so growing Class seems to be a pessimization.
>
> It could be, yes. We are trying to solve two issues here. One is the 
> original 8184603 which is concerned with start-up overhead and your 
> proposal is the right solution for it as it only delays the work to 
> when/if it is needed. The other issue is the overhead of repeated 
> signature interning. Such calls are not frequent enough to matter in cases that just 
> create a bunch of ObjectStreamField instances assigned to static final 
> fields, but I suspect are more frequent when signatures are being 
> de-serialized from stream. At that time, we don't yet have a Class 
> object to go with the signature and to use as a caching anchor, but we 
> still want to keep the invariant of OSF signature(s) being interned 
> Strings. Whether they really need to be interned right away in that case 
> is a question that needs more study of the deserialization code.
The package-private ObjectStreamField constructor (name, signature, unshared)
is used only to create temporary OSF objects during deserialization. Those OSF
instances are compared with the OSF instances created from the local class to
determine common fields. The signature.intern() in that constructor is not
significant.

The signature.intern() in the public constructor is not important for
correctness; comparisons between signatures use equals().

It may have a slight performance or size impact on the object streams,
because without it equal signatures would be serialized as separate strings.
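
To illustrate the size effect, here is a minimal sketch, assuming a simplified
identity-based handle table. HandleTableSketch and its methods are hypothetical
names made up for this example; this is not the ObjectOutputStream code.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a stream's handle table; not the JDK implementation.
final class HandleTableSketch {
    // Keys are compared with ==, mirroring an identity-based handle lookup.
    private final Map<Object, Integer> handles = new IdentityHashMap<>();
    private int fullStrings = 0;
    private int backReferences = 0;

    // Records whether a signature would be written in full or as a handle.
    void writeTypeString(String signature) {
        if (handles.containsKey(signature)) {
            backReferences++;
        } else {
            handles.put(signature, handles.size());
            fullStrings++;
        }
    }

    int fullStrings() { return fullStrings; }
    int backReferences() { return backReferences; }

    public static void main(String[] args) {
        // Two equal but distinct String objects, as produced without intern().
        String a = new String("Ljava/lang/String;");
        String b = new String("Ljava/lang/String;");

        HandleTableSketch plain = new HandleTableSketch();
        plain.writeTypeString(a);
        plain.writeTypeString(b);

        HandleTableSketch interned = new HandleTableSketch();
        interned.writeTypeString(a.intern());
        interned.writeTypeString(b.intern());

        System.out.println("equals: " + a.equals(b));
        System.out.println("without intern, full strings: " + plain.fullStrings());
        System.out.println("with intern, full strings: " + interned.fullStrings());
    }
}
```

In this sketch the two signatures are equal under equals() either way, so a
comparison of fields is unaffected. Only the number of strings written in full
changes.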
>
>>
>>>
>>> What do you think?
>>
>> I wonder what workloads actually see a bottleneck in these String.intern
>> calls, and *which* String.intern calls we are bottlenecking on in these
>> workloads. There's still a couple of constructors here that won't see a
>> speedup.
>
> Right. I suspect the intern() call bottleneck is most problematic when 
> deserializing. All other cases could be optimized by caching the 
> signature on the appropriate Class object(s) via ClassValue for example.
I'd remove the intern() call in the package-private constructor.
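
For the ClassValue idea mentioned above, a rough sketch of what per-Class
caching might look like. SignatureCacheSketch, signatureOf and toSignature are
hypothetical names for illustration only; this is not the code in the webrev,
and the signature computation is a simplified assumption.

```java
// Hypothetical per-Class signature cache; names are illustrative only.
final class SignatureCacheSketch {
    // Computes the signature at most once per Class, on first use.
    private static final ClassValue<String> SIGNATURES = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            // Intern once per type rather than once per ObjectStreamField.
            return toSignature(type).intern();
        }
    };

    static String signatureOf(Class<?> type) {
        return SIGNATURES.get(type);
    }

    // Simplified JVM type signature computation (an assumption of this sketch).
    private static String toSignature(Class<?> cl) {
        if (cl.isPrimitive()) {
            if (cl == int.class) return "I";
            if (cl == long.class) return "J";
            if (cl == boolean.class) return "Z";
            if (cl == byte.class) return "B";
            if (cl == char.class) return "C";
            if (cl == short.class) return "S";
            if (cl == float.class) return "F";
            if (cl == double.class) return "D";
            return "V"; // void.class is the only remaining primitive type
        }
        if (cl.isArray()) {
            // Array names already use descriptor form, e.g. "[Ljava.lang.String;".
            return cl.getName().replace('.', '/');
        }
        return "L" + cl.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(signatureOf(String.class));
        System.out.println(signatureOf(int[].class));
        // Repeated lookups return the same cached String instance.
        System.out.println(signatureOf(String.class) == signatureOf(String.class));
    }
}
```

This only helps when a Class object is available. It does not cover the
deserialization case, where the signature is read from the stream before any
Class is resolved.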
>
>>
>> I think we need more data to ensure this is actually worthwhile to pursue,
>> or whether there are other optimizations on a higher level that could
>> be done.
>
> Ok, we agree that no new public API for JVM signatures is desired and 
> that the intern() call bottleneck when deserializing should be 
> researched more deeply. I agree that your solution is currently the 
> best for the original issue.
Ditto.

Roger




