Unconditional messages on large page reservation errors
Stefan Johansson
stefan.johansson at oracle.com
Sat Apr 24 12:31:32 UTC 2021
Hi Thomas,
Sorry for the late reply.
On 2021-04-17 06:11, Thomas Stüfe wrote:
> Hi all,
>
> In os::reserve_memory_special, we print unconditional warnings to stderr in
> case large page reservation fails. Unconditional printouts like these can
> interfere with parsers parsing VM output, and can accumulate and cause high
> memory footprint (see e.g. https://bugs.openjdk.java.net/browse/JDK-8265332).
I see unconditional warnings only for the shm case; in the hugetlbfs case we
use warn_on_commit_special_failure(), which only prints the warning if
large pages were explicitly requested. Still, we can only end up here if
large pages are enabled, and that needs to be done explicitly, so the
warnings are kind of unconditional.
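
To make the difference between the two paths concrete, here is a minimal sketch. It is not the HotSpot code: Flag, g_warnings, warn_unconditionally and warn_if_explicit are hypothetical names made up for illustration, and warnings are collected in a vector instead of being written to stderr.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a VM flag: its value plus whether the user
// set it on the command line (as opposed to it being a default).
struct Flag {
    bool value;
    bool set_explicitly;
};

// Warnings are collected here rather than printed, so the sketch is easy to inspect.
static std::vector<std::string> g_warnings;

// shm-style path in this sketch: every failure produces a warning.
void warn_unconditionally(const std::string& msg) {
    assert(!msg.empty());
    g_warnings.push_back(msg);
}

// hugetlbfs-style path in this sketch: warn only if large pages are
// enabled and the user asked for them explicitly.
bool warn_if_explicit(const Flag& use_large_pages, const std::string& msg) {
    assert(!msg.empty());
    if (use_large_pages.value && use_large_pages.set_explicitly) {
        g_warnings.push_back(msg);
        return true;
    }
    return false;
}
```

If the flag can only be true when the user set it, the second function behaves like the first, which is what I mean by "kind of unconditional".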
>
> Large page reservations may fail any time the os::reserve_memory_special()
> function is called, e.g. because the large page pool is temporarily
> exhausted. And os::reserve_memory_special() is a general purpose function,
> not only used for the heap. Running out of large pages is not fatal, since
> the caller can just fall back to normal page allocation. Which is what we
> do when reserving the java heap. I think unconditional printouts should
> only happen in case of fatal errors, when the VM is about to die.
>
I don't really agree here since the performance implications of not
using large pages are quite big. I think it is fair to issue the warning
since in most cases it signals that there is an environment
configuration problem. For testing we have the possibility to use
-XX:-PrintWarnings and I saw you used that to fix the issue mentioned above.
Also, what would the use-case for warning() be if not for printing
information in non-fatal but problematic situations (which I think this is).
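
A rough sketch of the behaviour I am arguing for: fall back to small pages, but say so unless warnings have been switched off. Everything here is an assumption for illustration. Reservation and reserve_with_fallback are hypothetical names, the large page attempt is passed in as a callback, std::malloc stands in for a normal small page reservation, and print_warnings only mimics the effect of the PrintWarnings flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>
#include <vector>

// Result of a reservation attempt: where the memory is and how it was obtained.
struct Reservation {
    void* base;
    bool  large_pages;
};

// Try the large page path first. On failure, record a warning (if enabled)
// and fall back to an ordinary allocation, which stands in for small pages.
Reservation reserve_with_fallback(
        std::size_t bytes,
        const std::function<void*(std::size_t)>& reserve_large,
        bool print_warnings,
        std::vector<std::string>& warnings) {
    assert(bytes > 0);
    if (void* p = reserve_large(bytes)) {
        return {p, true};
    }
    if (print_warnings) {
        warnings.push_back("large page reservation failed; using small pages");
    }
    // The caller owns this memory and releases it with std::free.
    return {std::malloc(bytes), false};
}
```

The failure is non-fatal in both cases; the only thing the flag changes is whether the user hears about it.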
> The unconditional warning probably made sense in the context of reserving
> the java heap, if the user explicitly specified UseLargePages. I propose to
> change this to either
> - if large page allocation for the heap fails, trace with info level
> and fall back to small pages. Leave it up to the user to increase UL and
> monitor log output to find out about this. This is what we usually do when
> system APIs fail.
> - continue printing the message with error level, but exit the VM. If
> it's serious enough to unconditionally notify the user, it's serious enough
> to stop the VM.
>
> I prefer the former. What do you think?
As I said above, I see value in warning if you don't get what you are
requesting. But I know others who think exiting is a better strategy.
If I'm not mistaken, ZGC won't start if it can't reserve enough large
pages. One thing that I would like to change in this area is for these
warnings to be converted to use UL. That way we could turn off warnings
at a much finer granularity and we wouldn't have to use jio_snprintf to
compose messages.
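
To illustrate the kind of granularity I have in mind, here is a toy logger filtered per tag and level. It is only a sketch of the idea and not the UL API: TagLogger, Level and the tag names are hypothetical, and the default of warning level for unconfigured tags is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Ordered from least to most verbose; Off disables a tag entirely.
enum class Level { Off = 0, Error, Warning, Info, Debug };

class TagLogger {
public:
    // Set the most verbose level that will still be emitted for a tag.
    void configure(const std::string& tag, Level max_level) {
        levels_[tag] = max_level;
    }

    // Emit msg if the tag's configured level allows it; returns whether it was emitted.
    bool log(const std::string& tag, Level level, const std::string& msg) {
        auto it = levels_.find(tag);
        // Tags that were never configured default to Warning in this sketch.
        Level max_level = (it != levels_.end()) ? it->second : Level::Warning;
        if (level == Level::Off || level > max_level) {
            return false;
        }
        lines_.push_back("[" + tag + "] " + msg);
        return true;
    }

    const std::vector<std::string>& lines() const { return lines_; }

private:
    std::map<std::string, Level> levels_;
    std::vector<std::string> lines_;
};
```

With something like this, the large page warning could be silenced for one tag while warnings for every other tag stay on, instead of turning off all warnings at once.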
Thanks for bringing this up for discussion,
StefanJ
>
> Thanks and best Regards,
>
> Thomas
>