回复: [8u] RFR(XS) JFR generate same symbol ID for different symbols (high probabily with CDS)(Internet mail)
kalinshi(施慧)
kalinshi at tencent.com
Sat Oct 10 09:04:16 UTC 2020
Not sure why anonymous klass needs create a new cstring name in JFR.
Possible safe fix might concatenating anonymous klass name and hashcode again in every query and pass none-null C string and hashcode in JfrSymbolId::mark(const char* str, uintptr_t hash). And compare c string every time.
I'll wait others to comment first.
Regards
Hui
-----邮件原件-----
发件人: jdk8u-dev <jdk8u-dev-retn at openjdk.java.net> 代表 kalinshi(施慧)
发送时间: 2020年10月10日 16:51
收件人: Andrew Dinn <adinn at redhat.com>; jdk8u-dev at openjdk.java.net; Jaroslav Bachorík <jaroslav.bachorik at datadoghq.com>; Langer, Christoph <christoph.langer at sap.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
主题: Re: [8u] RFR(XS) JFR generate same symbol ID for different symbols (high probabily with CDS)(Internet mail)
" That may be unlikely but I don't think it is impossible. So, I think it is going to be necessary to change both equals methods. Do you have any reasoning that explains why a hash collision is not possible?"
Yes, it could be happen. in my understanding this happens when there is same identify hashcode between anonymous klass.
In JFR 8u, _cstring_table is only used to store anonymous klass name entry id. Each anonymous klass is assigned with a new anonymous name and hashcode in JfrArtifactSet::mark_anonymous_klass_name first.
New anonymous name is concatenating its klass name and hashcode in create_anonymous_klass_symbol. Later when encounter anonymous klass, just retrieve anonymous klass id with hashcode.
Hashcode is generated with mirror oop's identity_hash, reproduce might need create a lot anonymous klass to have a same 64/32 bit hashcode.
Regards
Hui
-----邮件原件-----
发件人: Andrew Dinn <adinn at redhat.com>
发送时间: 2020年10月9日 20:57
收件人: kalinshi(施慧) <kalinshi at tencent.com>; jdk8u-dev at openjdk.java.net; Jaroslav Bachorík <jaroslav.bachorik at datadoghq.com>; Langer, Christoph <christoph.langer at sap.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
主题: Re: 回复: [8u] RFR(XS) JFR generate same symbol ID for different symbols (high probabily with CDS)(Internet mail)
Hi Hui,
On 09/10/2020 12:56, kalinshi(施慧) wrote:
> “I can see that it is possible that two different symbols might
> generate the same hash, which is what appears to motivate your patch.
> However, you have not really explained in what circumstances that
> problem will happen. That makes it impossible to know whether the
> patch is really needed.“
>
> This problem happens when running with CDS and reproduce steps is in
> https://bugs.openjdk.java.net/browse/JDK-8253837.
Apologies for not seeing that explanation. When I opened the JIRA the description tab was not displayed. This now makes a lot more sense.
> " Also, if it is needed then it is important to know whether this is
> the only thing that needs fixing or if more needs to be done. At the
> very least I think you will need to patch both JfrSymbolId::equals
> methods."
>
> I have noticed " bool JfrSymbolId::equals(const char* query, uintptr_t
> hash, const CStringEntry* entry) ". Plan is to investigate this after
> finding a reproducible case, this code is quite different from JDK11.
> In 8u, some caller is from JfrSymbolId::map_cstring(uintptr_t hash)
> with NULL query and hash code, while in 11u map_cstring method has
> removed and a declaration is left.
Well, I see what you mean regarding calls via map_cstring. There is no possibility of query and entry->literal() being different strings with the same hashcode when equals is called via map_cstring because it only calls lookup_only() with query == null.
However, the cstring table has to be updated somewhere and this happens when method id() is called in JfrSymbolId::mark. id() calls lookup_put() with a non null string. Now that looks to me like it can suffer the same problem as the symbol insert. If two symbols can generate the same hash then I don't see why two C strings could not also generate the same hash. That may be unlikely but I don't think it is impossible. So, I think it is going to be necessary to change both equals methods. Do you have any reasoning that explains why a hash collision is not possible?
I'd also still like to see a response from one of the upstream devs who worked on JDK-8231081 to explain why it needed to make allowance for hash collisions and also to ask if they see any good reason to backport JDK-8231081. If we need to do that anyway then we might as well apply all the required changes in one go. Jaroslav, any comments?
regards,
Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill
More information about the jdk8u-dev
mailing list