generational zgc issues
Alen Vrečko
alen.vrecko at gmail.com
Fri Dec 1 15:01:02 UTC 2023
Thanks Stefan for looking into it as well.
For the address, hm, I don't get it. In the non-generational case, the
method works without shaving the color bits off the address. But it doesn't
work with generational.
I printed out some of the pointers.
non-generational pointers:
0000000000000000000100000000000000000111110001100000111001011000
0000000000000000000100000000000000000111110001100001001000001000
0000000000000000000100000000000000000111110001100001001000011000
0000000000000000000100000000000000000111110001100001010111001000
generational pointers:
0000000010000000000000001111001000101100000000000001010100010000
0000000010000000000000001111001000101100000000100001010100010000
0000000010000000000000001111001000101100011101100001010100010000
0000000010000000000000001111001000101100011110000001010100010000
So the 2 low order bytes are the ZGC metadata (the barrier and color bits),
right? And the high order bytes are the heap base bit (part) + offset?
I tried just shaving off the 2 low order bytes but I am not seeing the
correct results. Is there more to extracting the address than just shaving
off the 2 low order bytes?
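
The split described above can be sketched in Java as follows. This only
illustrates the bit manipulation in question (printing a long as 64 binary
digits and separating the low 16 bits from the rest). It assumes, as the
question does, that the two low-order bytes hold the metadata, and it makes
no claim that the remaining bits are directly usable as an address. The
class name, helper names, and sample value are made up for illustration.

```java
class PointerBits {
    // Pads Long.toBinaryString to a fixed 64 characters so dumps line up.
    static String toBinary64(long value) {
        String bits = Long.toBinaryString(value);
        return "0".repeat(64 - bits.length()) + bits;
    }

    // Low 16 bits. ASSUMPTION (from the question): these are the metadata bits.
    static long lowTwoBytes(long value) {
        return value & 0xFFFFL;
    }

    // Everything above the low 16 bits, moved down with an unsigned shift.
    static long withoutLowTwoBytes(long value) {
        return value >>> 16;
    }

    public static void main(String[] args) {
        // Hypothetical sample value, shaped like the generational dump above.
        long sample = 0x008000F22C001510L;
        System.out.println("raw      : " + toBinary64(sample));
        System.out.println("low 16   : " + toBinary64(lowTwoBytes(sample)));
        System.out.println("shifted  : " + toBinary64(withoutLowTwoBytes(sample)));
        System.out.println("shifted  : 0x" + Long.toHexString(withoutLowTwoBytes(sample)));
    }
}
```

Printing the raw, masked, and shifted values side by side makes it easier to
compare against the expected address when the result looks wrong.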
Thanks
Alen
On Fri, 1 Dec 2023 at 13:36, Stefan Karlsson <
stefan.karlsson at oracle.com> wrote:
> Hi Alen,
>
> I'm glad that you figured out what was happening. FWIW, I ran a whole
> bunch of tests on Alma 9.2 and couldn't reproduce any issues.
>
> Cheers,
> StefanK
>
> On 2023-11-29 19:37, Alen Vrečko wrote:
>
> Hi Stefan,
>
> all good. Finally got around to it. My bad in both cases.
>
> o) adding System.gc() solved the problem. Indeed, not a good idea to have
> expectations when working with java.lang.ref.Cleaner. Preferably not use it
> at all.
>
> o) for the corrupted byte[], got a chance to look into it. Not just
> speculate on log output. The issue was in Java Object Layout library (used
> v0.10). It returned something like 500K for the size of an object if
> Generational is enabled (should be in the range of < 100B). This caused a
> failure while processing byte[] and why I assumed that the byte[] is
> corrupted. I updated the jol library to 0.17 and it works fine now.
> Interesting that it looks like JOL v0.10 works fine on CentOS 7 with
> generational but not Alma 9.2 with generational - same 21 jdk.
>
> Time to fix some bad first impressions.
>
> Thanks
> Alen
>
> On Mon, 13 Nov 2023 at 22:21, Alen Vrečko <
> alen.vrecko at gmail.com> wrote:
>
>> Thanks for the fast reply Stefan.
>>
>> For the reference issue. Looks like I misunderstood. Most probably issue
>> with timing in the toy program with major collections. For both G1 and ZGC
>> (non generational) both counters for new Foo() and Cleaner(foo)#clean match
>> after a short while. But not for generational ZGC. I'll add System.gc()
>> call in there and see what happens. Most probably a non-issue then and a
>> misunderstanding on my part.
>>
>> For the corrupted byte[]. Will see how much time I have on my hands to
>> look into it. Like mentioned vanilla ZGC works fine, with generational ZGC
>> seeing funny stuff with byte[].
>>
>> Alen
>>
>> On Mon, 13 Nov 2023 at 20:28, Stefan Karlsson <
>> stefan.karlsson at oracle.com> wrote:
>>
>>> Hi Alen,
>>>
>>> On 2023-11-13 19:05, Alen Vrečko wrote:
>>>
>>> Hello everyone,
>>>
>>> o) young gen reference processor
>>>
>>> A bit puzzled by reading in a thread on the list:
>>>
>>> > mentioning that we decided to not ship a young generation reference
>>> processor for 21
>>> Unless you made changes to ByteBuffer#allocateDirect, it uses the reference
>>> processor to free native memory. If I am not mistaken, just using a standard
>>> library API such as Files.readAllBytes will in some cases do
>>> BB#allocateDirect in the internals.
>>> Or maybe I am misunderstanding something? I made a toy program and
>>> indeed I could easily get a situation where 20% of reference handlers are
>>> not called like ever.
>>> This will cause issues for code that is using reference handlers.
>>>
>>>
>>> The reference processing will happen when the GC performs a major
>>> collection, which collects both the young and old generation. If you add a
>>> System.gc() you should see that the reference processor is kicking in for
>>> your program. Could you share your toy program?
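
A minimal sketch of the kind of toy program discussed here, for readers who
want to try the System.gc() suggestion. It is not the program from this
thread (which was not posted). The class name, the Foo stand-in, the object
count, and the one-second wait are all assumptions made for illustration.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicLong;

class CleanerToy {
    static final Cleaner CLEANER = Cleaner.create();
    static final AtomicLong created = new AtomicLong();
    static final AtomicLong cleaned = new AtomicLong();

    // Hypothetical stand-in for the Foo mentioned in the thread.
    static final class Foo {
        Foo() {
            created.incrementAndGet();
            // The cleaning action must not capture `this`; otherwise the
            // object stays reachable and the action never runs.
            CLEANER.register(this, cleaned::incrementAndGet);
        }
    }

    // Allocates objects that become unreachable immediately.
    static void allocate(int count) {
        for (int i = 0; i < count; i++) {
            new Foo();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        allocate(100_000);
        System.out.println("before System.gc(): created=" + created + " cleaned=" + cleaned);
        System.gc();          // a request for a full collection, not a guarantee
        Thread.sleep(1_000);  // cleaning actions run on the Cleaner's own thread
        System.out.println("after System.gc():  created=" + created + " cleaned=" + cleaned);
    }
}
```

On JDK 21 this can be run with `java -XX:+UseZGC -XX:+ZGenerational CleanerToy.java`
and compared against a run without `-XX:+ZGenerational`. The counters are not
guaranteed to match after any fixed delay, so no expected output is shown.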
>>>
>>> o) seeing weird byte[] corruption in production
>>> On CentOS 7 Generational works fine. No issues observed. But on Alma
>>> Linux 9.2 either reading byte[] from file or sending byte[] over the
>>> network corrupts the byte[]. Didn't investigate at all. Just observed
>>> corruption in some cases for some byte[] arrays - not all - just some. On
>>> the same Alma Linux 9.2 without generational zgc no byte[] corruption is
>>> observed and everything works fine as before.
>>>
>>>
>>> It's hard to say if this is a ZGC bug, compiler bug, OS bug, etc. Here
>>> are some suggestions for how to help pin-point the problem:
>>> 1) Could you provide the output from 'java -version'?
>>> 2) Is it possible to reproduce this with a small reproducer?
>>> 3) What CPU is this running on?
>>> 4) Does it happen with -XX:UseAVX=0?
>>> 5) Do you know the sizes of the corrupted byte[]s? Do you know the
>>> offset to where it is corrupted?
>>>
>>> StefanK
>>>
>>> To me Generational ZGC looks more like an experimental feature for now.
>>> I am a bit surprised it doesn't require the extra flag to unlock
>>> experimental features.
>>> Thanks
>>> Alen
>>>
>>>
>>>
>