RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
Baesken, Matthias
matthias.baesken at sap.com
Thu Jun 4 14:11:59 UTC 2020
HI Severin, with your new patch I run into :
java.lang.RuntimeException: 'OperatingSystemMXBean.getTotalSwapSpaceSize: [1-9][0-9]+' missing from stdout/stderr
at jdk.test.lib.process.OutputAnalyzer.shouldContain(OutputAnalyzer.java:189)
at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:165)
at TestMemoryAwareness.main(TestMemoryAwareness.java:66)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
at java.base/java.lang.Thread.run(Thread.java:832)
Did you really mean in line 165 of TestMemoryAwareness.java
out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: [1-9][0-9]+");
?
When I fix it (to shouldMatch , this was probably meant by you) I get the next error -
java.lang.RuntimeException: 'Metrics.getMemoryLimit() == -1' missing from stdout/stderr
I have
Metrics.getMemoryAndSwapLimit() == -1
Metrics.getMemoryLimit() == 104857600
Metrics.getMemoryAndSwapUsage() == -1
Metrics.getMemoryUsage() == 30670848
Could we omit the
out.shouldContain("Metrics.getMemoryLimit() == -1");
check ?
>> (plus we still need Jie's adjustment to fix getFreeSwapSpaceSize )
>
>Yes, but I'd rather fix this as a separate issue. I don't think it's
>related.
Hmm okay, probably better to open a new one. Jie - should I open a separate JBS issue ? Or do you want to do it ( you proposed the patch for the getFreeSwapSpaceSize issue) ?
Best regards, Matthias
-----Original Message-----
From: Severin Gehwolf <sgehwolf at redhat.com>
Sent: Mittwoch, 3. Juni 2020 15:47
To: Baesken, Matthias <matthias.baesken at sap.com>; jiefu(傅杰) <jiefu at tencent.com>; David Holmes <david.holmes at oracle.com>; 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
Cc: Bob Vandette <Bob.Vandette at oracle.com>
Subject: Re: RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
Hi Matthias,
On Fri, 2020-05-22 at 13:36 +0000, Baesken, Matthias wrote:
> > Anyway, I think we should change the test code similar to this:
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8244500/01/webrev/
>
> Hi Severin, this gave me :
>
> Checking OperatingSystemMXBean
> Exception in thread "main" java.lang.IllegalAccessError: class CheckOperatingSystemMXBean (in unnamed module @0x5acf9800) cannot access class jdk.internal.platform.Container (in module java.base) because module java.base does not export jdk.internal.platform to unnamed module @0x5acf9800
> at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:33)
>
> (a bit unexpected because you added it to the test, I though this would be sufficient !?)
OK, reproduced. FWIW, 01 was an untested patch ;-)
Please try this updated webrev:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8244500/02/webrev/
It works fine here for cgroups v2 and cgroups v1.
> (plus we still need Jie's adjustment to fix getFreeSwapSpaceSize )
Yes, but I'd rather fix this as a separate issue. I don't think it's
related.
Thanks,
Severin
> -----Original Message-----
> From: Severin Gehwolf <sgehwolf at redhat.com>
> Sent: Mittwoch, 20. Mai 2020 21:30
> To: Baesken, Matthias <matthias.baesken at sap.com>; jiefu(傅杰) <
> jiefu at tencent.com>; David Holmes <david.holmes at oracle.com>;
> 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> Cc: Bob Vandette <Bob.Vandette at oracle.com>
> Subject: Re: RFR [XS]: 8244500: jtreg test error in
> test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
>
> Hi Matthias,
>
> On Wed, 2020-05-20 at 11:07 +0000, Baesken, Matthias wrote:
> > Hi Severin ,
> >
> > > FWIW, parts of this work-around got added by JDK-8236617. Looking
> > > at
> > > the bug it was observed that:
> > >
> > > containerMetrics.getMemoryAndSwapLimit() == 0 and
> > > containerMetrics.getMemoryLimit() returned a large value.
> > >
> > > So then with JDK-8231111 I suspect
> > > containerMetrics.getMemoryAndSwapLimit() now returns -1 and,
> > > thus,
> > > getTotalSwapSpaceSize0() is being used instead of the container
> > > values.
> >
> > I added some tracing output
> > to OperatingSystemImpl.getTotalSwapSpaceSize() and get these
> > values :
> >
> > containerMetrics.getMemoryAndSwapLimit():-1
> > containerMetrics.getMemoryLimit():-1
> > ****** getTotalSwapSpaceSize0() returned 16106061824
>
> Thanks for this. This seems to confirm my suspicion.
>
> So my working theory here is that the system where you are observing
> the issue now is similar to the one when JDK-8236617 was discovered.
> For some reason containerMetrics.getMemoryAndSwapLimit() returned 0
> for
> the case when the necessary files were missing (which was wrong IMO).
> After JDK-8231111 there is a clear distinction for this case: -1 is
> being returned. That makes more sense to me, as there wouldn't be any
> container limit to take into account.
>
> Your trace output seems to confirm that:
>
> containerMetrics.getMemoryAndSwapLimit():-1
> containerMetrics.getMemoryLimit():-1
>
> > > So far the theory, I'd like to gather more data before we come up
> > > with
> > > another patch, though. In JDK-8236617 it wasn't investigated why
> > > containerMetrics.getMemoryAndSwapLimit() returned 0 on such a
> > > system.
> > > It would be good to know why. Could it be that file
> > > 'memory.memsw.limit_in_bytes' doesn't exist on such a system?
> >
> > True, it does not exist :
> >
> > ls -alL /sys/fs/cgroup/memory/
> > total 0
> > drwxr-xr-x 3 root root 0 Mar 16 11:00 .
> > drwxr-xr-x 11 root root 260 Mar 16 11:00 ..
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 cgroup.clone_children
> > --w--w--w- 1 root root 0 Mar 16 11:00 cgroup.event_control
> > -rw-r--r-- 1 root root 0 May 20 13:00 cgroup.procs
> > -r--r--r-- 1 root root 0 Mar 16 11:00 cgroup.sane_behavior
> > drwxr-xr-x 2 root root 0 May 20 12:52 docker
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.failcnt
> > --w------- 1 root root 0 Mar 16 11:00 memory.force_empty
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.limit_in_bytes
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.low_limit_in_bytes
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.max_usage_in_bytes
> > -rw-r--r-- 1 root root 0 Mar 16 11:00
> > memory.move_charge_at_immigrate
> > -r--r--r-- 1 root root 0 Mar 16 11:00 memory.numa_stat
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.oom_control
> > ---------- 1 root root 0 Mar 16 11:00 memory.pressure_level
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.soft_limit_in_bytes
> > -r--r--r-- 1 root root 0 Mar 16 11:00 memory.stat
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.swappiness
> > -r--r--r-- 1 root root 0 Mar 16 11:00 memory.usage_in_bytes
> > -rw-r--r-- 1 root root 0 May 20 13:00 memory.use_hierarchy
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 notify_on_release
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 release_agent
> > -rw-r--r-- 1 root root 0 Mar 16 11:00 tasks
>
> Hmm, is this inside a container on this system with --memory and --
> memory-swap being specified on the docker command line?
>
> Anyway, I think we should change the test code similar to this:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8244500/01/webrev/
>
> If this works for you, then that's what I'd suggest as a fix.
>
> Note that I haven't tested this patch. I need to recreate my set-up
> for
> proper docker testing here :-/
>
> Thanks,
> Severin
>
> > Best regards, Matthias
> >
> >
> >
> > -----Original Message-----
> > From: Severin Gehwolf <sgehwolf at redhat.com>
> > Sent: Montag, 18. Mai 2020 12:09
> > To: jiefu(傅杰) <jiefu at tencent.com>; David Holmes <
> > david.holmes at oracle.com>; Baesken, Matthias <
> > matthias.baesken at sap.com>; 'hotspot-dev at openjdk.java.net' <
> > hotspot-dev at openjdk.java.net>
> > Cc: Bob Vandette <Bob.Vandette at oracle.com>
> > Subject: Re: RFR [XS]: 8244500: jtreg test error in
> > test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
> >
> > Hi,
> >
> > On Mon, 2020-05-18 at 06:15 +0000, jiefu(傅杰) wrote:
> > > Hi David,
> > >
> > > This bug should be introduced by JDK-8242480.
> > > I'm sorry for that.
> > >
> > > And I think the logic in
> > > src/jdk.management/unix/classes/com/sun/management/internal/Opera
> > > tingSystemImpl.java should be fixed.
> > >
> > > In my fix for JDK-8242480, I had ignored the case for kernels
> > > which don't support "--memory-swap 150M".
> > > In that case, containerMetrics.getMemoryAndSwapLimit()[1] will
> > > return -1.
> >
> > It appears so that containerMetrics.getMemoryAndSwapLimit()
> > returned 0
> > prior JDK-8231111 and -1 after. I'm not convinced JDK-8242480
> > introduced this.
> >
> > > I didn't notice this bug because all our kernels support --
> > > memory-swap.
> > > Fortunately, Matthias found this bug and will fix it.
> >
> > I'm still not sure what's happening in Matthias' case on that
> > system.
> > It would be best to first try to understand the actual problem
> > better.
> > The info I have is:
> >
> > - It's a cgroup v1 system
> > - The system now returns large values for
> > OperatingSystemMXBean.getTotalSwapSpaceSize() and
> > OperatingSystemMXBean.getFreeSwapSpaceSize()
> >
> > It would be good to know what the actual return values are for the
> > container metrics on this system with swap limit disabled at kernel
> > level. And also whether or not relevant cgroup files exist on such
> > a
> > system (within docker).
> >
> > FWIW, parts of this work-around got added by JDK-8236617. Looking
> > at
> > the bug it was observed that:
> >
> > containerMetrics.getMemoryAndSwapLimit() == 0 and
> > containerMetrics.getMemoryLimit() returned a large value.
> >
> > So then with JDK-8231111 I suspect
> > containerMetrics.getMemoryAndSwapLimit() now returns -1 and, thus,
> > getTotalSwapSpaceSize0() is being used instead of the container
> > values.
> >
> > So far the theory, I'd like to gather more data before we come up
> > with
> > another patch, though. In JDK-8236617 it wasn't investigated why
> > containerMetrics.getMemoryAndSwapLimit() returned 0 on such a
> > system.
> > It would be good to know why. Could it be that file
> > 'memory.memsw.limit_in_bytes' doesn't exist on such a system?
> >
> > Unfortunately, I don't know what config one needs in order to
> > reproduce
> > so we'll have to rely on Matthias to gather the info.
> >
> > Thanks,
> > Severin
> >
> > > Thanks a lot.
> > > Best regards,
> > > Jie
> > >
> > > [1]
> > > http://hg.openjdk.java.net/jdk/jdk/file/a95911422005/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java#l73
> > >
> > > On 2020/5/18, 1:48 PM, "David Holmes" <david.holmes at oracle.com>
> > > wrote:
> > >
> > > Hi Matthias,
> > >
> > > On 16/05/2020 12:59 am, Baesken, Matthias wrote:
> > > > Hello David , I get the warning always on this linux
> > > machine (I think it depends on machine setup and/or
> > > kernel version).
> > > >
> > > >> Surely it must be specified to return 0 or some such
> > > sentinel in such a case?
> > > >
> > > > The doc just says this :
> > > >
> > > > 62 /**
> > > > 63 * Returns the total amount of swap space in bytes.
> > > > 64 *
> > > > 65 * @return the total amount of swap space in bytes.
> > > > 66 */
> > > > 67 public long getTotalSwapSpaceSize();
> > > >
> > > > ( however it returned previously 0 before the latest
> > > changes in the container support code came in ).
> > >
> > > That sounds like a bug was introduced to me then.
> > >
> > > >
> > > Regarding the getFreeSwapSpaceSize issue , Jie provided a
> > > change
> > > to src/jdk.management/unix/classes/com/sun/management/internal/O
> > > peratingSystemImpl.java
> > > > Because of this I could remove the try-construct .
> > > >
> > > > New webrev :
> > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8244500.3/
> > >
> > > I'm quite confused by your changes. Your first version simply
> > > adjusted
> > > the test to check for the warning. But now you are also
> > > changing the
> > > main logic. I can't assess the validity of what you are doing
> > > sorry.
> > >
> > > I've cc'd Severin and Bob as I think they need to evaluate
> > > what the real
> > > problem is here and when it was introduced.
> > >
> > > David
> > > ----
> > >
> > > > Best regards, Matthias
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: David Holmes <david.holmes at oracle.com>
> > > > Sent: Donnerstag, 14. Mai 2020 07:02
> > > > To: Baesken, Matthias <matthias.baesken at sap.com>;
> > > 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> > > > Cc: jiefu(傅杰) <jiefu at tencent.com>
> > > > Subject: Re: RFR [XS]: 8244500: jtreg test error in
> > > test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
> > > >
> > > > On 14/05/2020 12:28 am, Baesken, Matthias wrote:
> > > >> Hello, here is a new webrev removing the comment :
> > > >>
> > > >> http://cr.openjdk.java.net/~mbaesken/webrevs/8244500.1/
> > > >>
> > > >> It removes the comment , and it also handles 0 for
> > > getFreeSwapSpaceSize
> > > >> ( this can occur after JDK-8242480 : Negative value
> > > may be returned by getFreeSwapSpaceSize() in the docker was
> > > pushed recently ).
> > > >
> > > > Sorry Matthias but I still see this partly as a bug in the
> > > underlying
> > > > container support code. If you get the warning then what
> > > does
> > > > getTotalSwapSpaceSize() return? Surely it must be specified
> > > to return 0
> > > > or some such sentinel in such a case?
> > > >
> > > > Other comments:
> > > >
> > > > ! try {
> > > > !
> > > >
> > > out.shouldMatch("OperatingSystemMXBean\\.getFreeSwapSpaceSize:
> > > > [1-9][0-9]+");
> > > > ! } catch(RuntimeException ex) { // allow 0, see
> > > 8242480
> > > > !
> > > >
> > > out.shouldContain("OperatingSystemMXBean.getFreeSwapSpaceSize:
> > > 0");
> > > > ! }
> > > >
> > > > can't you just change the original regex to match >= 0?
> > > >
> > > > Style nit: avoid historical comments like "see 8242480" and
> > > "since
> > > > 8231111 ...".
> > > >
> > > > Thanks,
> > > > David
> > > >
> > > >> Best regards, Matthias
> > > >>
> > > >> -----Original Message-----
> > > >> From: Baesken, Matthias
> > > >> Sent: Mittwoch, 13. Mai 2020 10:11
> > > >> To: 'David Holmes' <david.holmes at oracle.com>;
> > > 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> > > >> Subject: RE: RFR [XS]: 8244500: jtreg test error in
> > > test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
> > > >>
> > > >>
> > > >>> But according to the comment immediately before your
> > > changes:
> > > >>>
> > > >>> // in case of warnings like : "Your kernel does not
> > > support swap limit
> > > >>> capabilities or the cgroup is not mounted. Memory limited
> > > without swap."
> > > >>> // the getTotalSwapSpaceSize does not return the expected
> > > result, but 0
> > > >>>
> > > >>> we should be returning zero. So this seems to be a bug in
> > > the
> > > >>> implementation not in the test.
> > > >>
> > > >>
> > > >> Hi David , I think in case of "Your kernel does not
> > > support swap limit capabilities ... " we just do not get
> > > the values
> > > >> that are normally expected.
> > > >> Previously we indeed got "0" .
> > > >> Probably the comment should be deleted .
> > > >>
> > > >> Best regards, Matthias
> > > >>
> > >
> > >
> > >
More information about the hotspot-dev
mailing list