RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
Baesken, Matthias
matthias.baesken at sap.com
Fri May 22 13:36:14 UTC 2020
>Anyway, I think we should change the test code similar to this:
>http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8244500/01/webrev/
Hi Severin, this gave me :
Checking OperatingSystemMXBean
Exception in thread "main" java.lang.IllegalAccessError: class CheckOperatingSystemMXBean (in unnamed module @0x5acf9800) cannot access class jdk.internal.platform.Container (in module java.base) because module java.base does not export jdk.internal.platform to unnamed module @0x5acf9800
at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:33)
(a bit unexpected because you added it to the test, I though this would be sufficient !?)
Guess adding the export to the module-info.java of java.base would also not be so good.
(plus we still need Jie's adjustment to fix getFreeSwapSpaceSize )
Best regards, Matthias
-----Original Message-----
From: Severin Gehwolf <sgehwolf at redhat.com>
Sent: Mittwoch, 20. Mai 2020 21:30
To: Baesken, Matthias <matthias.baesken at sap.com>; jiefu(傅杰) <jiefu at tencent.com>; David Holmes <david.holmes at oracle.com>; 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
Cc: Bob Vandette <Bob.Vandette at oracle.com>
Subject: Re: RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
Hi Matthias,
On Wed, 2020-05-20 at 11:07 +0000, Baesken, Matthias wrote:
> Hi Severin ,
>
> > FWIW, parts of this work-around got added by JDK-8236617. Looking at
> > the bug it was observed that:
> >
> > containerMetrics.getMemoryAndSwapLimit() == 0 and
> > containerMetrics.getMemoryLimit() returned a large value.
> >
> > So then with JDK-8231111 I suspect
> > containerMetrics.getMemoryAndSwapLimit() now returns -1 and, thus,
> > getTotalSwapSpaceSize0() is being used instead of the container values.
>
> I added some tracing output to OperatingSystemImpl.getTotalSwapSpaceSize() and get these values :
>
> containerMetrics.getMemoryAndSwapLimit():-1 containerMetrics.getMemoryLimit():-1
> ****** getTotalSwapSpaceSize0() returned 16106061824
Thanks for this. This seems to confirm my suspicion.
So my working theory here is that the system where you are observing
the issue now is similar to the one when JDK-8236617 was discovered.
For some reason containerMetrics.getMemoryAndSwapLimit() returned 0 for
the case when the necessary files were missing (which was wrong IMO).
After JDK-8231111 there is a clear distinction for this case: -1 is
being returned. That makes more sense to me, as there wouldn't be any
container limit to take into account.
Your trace output seems to confirm that:
containerMetrics.getMemoryAndSwapLimit():-1
containerMetrics.getMemoryLimit():-1
>
> > So far the theory, I'd like to gather more data before we come up with
> > another patch, though. In JDK-8236617 it wasn't investigated why
> > containerMetrics.getMemoryAndSwapLimit() returned 0 on such a system.
> > It would be good to know why. Could it be that file
> > 'memory.memsw.limit_in_bytes' doesn't exist on such a system?
>
> True, it does not exist :
>
> ls -alL /sys/fs/cgroup/memory/
> total 0
> drwxr-xr-x 3 root root 0 Mar 16 11:00 .
> drwxr-xr-x 11 root root 260 Mar 16 11:00 ..
> -rw-r--r-- 1 root root 0 Mar 16 11:00 cgroup.clone_children
> --w--w--w- 1 root root 0 Mar 16 11:00 cgroup.event_control
> -rw-r--r-- 1 root root 0 May 20 13:00 cgroup.procs
> -r--r--r-- 1 root root 0 Mar 16 11:00 cgroup.sane_behavior
> drwxr-xr-x 2 root root 0 May 20 12:52 docker
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.failcnt
> --w------- 1 root root 0 Mar 16 11:00 memory.force_empty
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.limit_in_bytes
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.low_limit_in_bytes
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.max_usage_in_bytes
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.move_charge_at_immigrate
> -r--r--r-- 1 root root 0 Mar 16 11:00 memory.numa_stat
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.oom_control
> ---------- 1 root root 0 Mar 16 11:00 memory.pressure_level
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.soft_limit_in_bytes
> -r--r--r-- 1 root root 0 Mar 16 11:00 memory.stat
> -rw-r--r-- 1 root root 0 Mar 16 11:00 memory.swappiness
> -r--r--r-- 1 root root 0 Mar 16 11:00 memory.usage_in_bytes
> -rw-r--r-- 1 root root 0 May 20 13:00 memory.use_hierarchy
> -rw-r--r-- 1 root root 0 Mar 16 11:00 notify_on_release
> -rw-r--r-- 1 root root 0 Mar 16 11:00 release_agent
> -rw-r--r-- 1 root root 0 Mar 16 11:00 tasks
Hmm, is this inside a container on this system with --memory and --
memory-swap being specified on the docker command line?
Anyway, I think we should change the test code similar to this:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8244500/01/webrev/
If this works for you, then that's what I'd suggest as a fix.
Note that I haven't tested this patch. I need to recreate my set-up for
proper docker testing here :-/
Thanks,
Severin
> Best regards, Matthias
>
>
>
> -----Original Message-----
> From: Severin Gehwolf <sgehwolf at redhat.com>
> Sent: Montag, 18. Mai 2020 12:09
> To: jiefu(傅杰) <jiefu at tencent.com>; David Holmes <david.holmes at oracle.com>; Baesken, Matthias <matthias.baesken at sap.com>; 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> Cc: Bob Vandette <Bob.Vandette at oracle.com>
> Subject: Re: RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
>
> Hi,
>
> On Mon, 2020-05-18 at 06:15 +0000, jiefu(傅杰) wrote:
> > Hi David,
> >
> > This bug should be introduced by JDK-8242480.
> > I'm sorry for that.
> >
> > And I think the logic in src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java should be fixed.
> >
> > In my fix for JDK-8242480, I had ignored the case for kernels which don't support "--memory-swap 150M".
> > In that case, containerMetrics.getMemoryAndSwapLimit()[1] will return -1.
>
> It appears so that containerMetrics.getMemoryAndSwapLimit() returned 0
> prior JDK-8231111 and -1 after. I'm not convinced JDK-8242480
> introduced this.
>
> > I didn't notice this bug because all our kernels support --memory-swap.
> > Fortunately, Matthias found this bug and will fix it.
>
> I'm still not sure what's happening in Matthias' case on that system.
> It would be best to first try to understand the actual problem better.
> The info I have is:
>
> - It's a cgroup v1 system
> - The system now returns large values for
> OperatingSystemMXBean.getTotalSwapSpaceSize() and
> OperatingSystemMXBean.getFreeSwapSpaceSize()
>
> It would be good to know what the actual return values are for the
> container metrics on this system with swap limit disabled at kernel
> level. And also whether or not relevant cgroup files exist on such a
> system (within docker).
>
> FWIW, parts of this work-around got added by JDK-8236617. Looking at
> the bug it was observed that:
>
> containerMetrics.getMemoryAndSwapLimit() == 0 and
> containerMetrics.getMemoryLimit() returned a large value.
>
> So then with JDK-8231111 I suspect
> containerMetrics.getMemoryAndSwapLimit() now returns -1 and, thus,
> getTotalSwapSpaceSize0() is being used instead of the container values.
>
> So far the theory, I'd like to gather more data before we come up with
> another patch, though. In JDK-8236617 it wasn't investigated why
> containerMetrics.getMemoryAndSwapLimit() returned 0 on such a system.
> It would be good to know why. Could it be that file
> 'memory.memsw.limit_in_bytes' doesn't exist on such a system?
>
> Unfortunately, I don't know what config one needs in order to reproduce
> so we'll have to rely on Matthias to gather the info.
>
> Thanks,
> Severin
>
> > Thanks a lot.
> > Best regards,
> > Jie
> >
> > [1] http://hg.openjdk.java.net/jdk/jdk/file/a95911422005/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java#l73
> >
> > On 2020/5/18, 1:48 PM, "David Holmes" <david.holmes at oracle.com> wrote:
> >
> > Hi Matthias,
> >
> > On 16/05/2020 12:59 am, Baesken, Matthias wrote:
> > > Hello David , I get the warning always on this linux machine (I think it depends on machine setup and/or kernel version).
> > >
> > >> Surely it must be specified to return 0 or some such sentinel in such a case?
> > >
> > > The doc just says this :
> > >
> > > 62 /**
> > > 63 * Returns the total amount of swap space in bytes.
> > > 64 *
> > > 65 * @return the total amount of swap space in bytes.
> > > 66 */
> > > 67 public long getTotalSwapSpaceSize();
> > >
> > > ( however it returned previously 0 before the latest changes in the container support code came in ).
> >
> > That sounds like a bug was introduced to me then.
> >
> > > Regarding the getFreeSwapSpaceSize issue , Jie provided a change to src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
> > > Because of this I could remove the try-construct .
> > >
> > > New webrev :
> > > http://cr.openjdk.java.net/~mbaesken/webrevs/8244500.3/
> >
> > I'm quite confused by your changes. Your first version simply adjusted
> > the test to check for the warning. But now you are also changing the
> > main logic. I can't assess the validity of what you are doing sorry.
> >
> > I've cc'd Severin and Bob as I think they need to evaluate what the real
> > problem is here and when it was introduced.
> >
> > David
> > ----
> >
> > > Best regards, Matthias
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: David Holmes <david.holmes at oracle.com>
> > > Sent: Donnerstag, 14. Mai 2020 07:02
> > > To: Baesken, Matthias <matthias.baesken at sap.com>; 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> > > Cc: jiefu(傅杰) <jiefu at tencent.com>
> > > Subject: Re: RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
> > >
> > > On 14/05/2020 12:28 am, Baesken, Matthias wrote:
> > >> Hello, here is a new webrev removing the comment :
> > >>
> > >> http://cr.openjdk.java.net/~mbaesken/webrevs/8244500.1/
> > >>
> > >> It removes the comment , and it also handles 0 for getFreeSwapSpaceSize
> > >> ( this can occur after JDK-8242480 : Negative value may be returned by getFreeSwapSpaceSize() in the docker was pushed recently ).
> > >
> > > Sorry Matthias but I still see this partly as a bug in the underlying
> > > container support code. If you get the warning then what does
> > > getTotalSwapSpaceSize() return? Surely it must be specified to return 0
> > > or some such sentinel in such a case?
> > >
> > > Other comments:
> > >
> > > ! try {
> > > !
> > > out.shouldMatch("OperatingSystemMXBean\\.getFreeSwapSpaceSize:
> > > [1-9][0-9]+");
> > > ! } catch(RuntimeException ex) { // allow 0, see 8242480
> > > !
> > > out.shouldContain("OperatingSystemMXBean.getFreeSwapSpaceSize: 0");
> > > ! }
> > >
> > > can't you just change the original regex to match >= 0?
> > >
> > > Style nit: avoid historical comments like "see 8242480" and "since
> > > 8231111 ...".
> > >
> > > Thanks,
> > > David
> > >
> > >> Best regards, Matthias
> > >>
> > >> -----Original Message-----
> > >> From: Baesken, Matthias
> > >> Sent: Mittwoch, 13. Mai 2020 10:11
> > >> To: 'David Holmes' <david.holmes at oracle.com>; 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> > >> Subject: RE: RFR [XS]: 8244500: jtreg test error in test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java
> > >>
> > >>
> > >>> But according to the comment immediately before your changes:
> > >>>
> > >>> // in case of warnings like : "Your kernel does not support swap limit
> > >>> capabilities or the cgroup is not mounted. Memory limited without swap."
> > >>> // the getTotalSwapSpaceSize does not return the expected result, but 0
> > >>>
> > >>> we should be returning zero. So this seems to be a bug in the
> > >>> implementation not in the test.
> > >>
> > >>
> > >> Hi David , I think in case of "Your kernel does not support swap limit capabilities ... " we just do not get the values
> > >> that are normally expected.
> > >> Previously we indeed got "0" .
> > >> Probably the comment should be deleted .
> > >>
> > >> Best regards, Matthias
> > >>
> >
> >
> >
More information about the hotspot-dev
mailing list