RFR: JDK-8160350 cannot truss jdk9 [solaris]
Alan Burlison
Alan.Burlison at oracle.com
Fri Jul 1 20:31:51 UTC 2016
On 01/07/2016 20:44, Kim Barrett wrote:
> This doesn't look right to me.
>
> The BSD implementation of os::elapsedVTime presently just calls
> elapsedTime, with a comment "better than nothing, but not much". I
> think it's worse than nothing, and should actually fail, with
> os::supports_vtime returning false for this platform. Similarly,
> elapsedVtime on some other platforms falls back to elapsedTime, which
> I suspect is a mistake.
I think os::elapsedVTime() is right on Windows, AIX, Solaris and Linux,
and could be made right on OSX as well if the Mach APIs were used, and
on FreeBSD it could use getrusage(). I think the problem is the other
BSDs, I couldn't find anything suitable to use for those.
And I agree the fallback to elapsedTime() also seems suspect as it's
only ever done if the vtime mechanism for a given platform fails. If
there was ever a transient failure and elapsedVTime() switched between
vtime and elapsed time, you'd get some *very* bizarre values.
> The problem is that the way one deals with vtime is different from the
> way one deals with elapsedTime values. The predicate is needed so
> that code that wants to use vtime if available can be properly written
> to fall back to something else when it's not. (Note that I think
> there might be bugs in how G1 is attempting to do that in some
> places.)
I agree what's there at present is a bit of a mess, which is why the
first version of the patch (attached to the bug) just addressed the
Solaris issue that caused truss to fail - that *definitely* needs fixing
as it will cause a significant observability issue on Solaris.
> I have a note to self to look into this and probably report some bugs,
> but haven't gotten around to doing so.
>
> So while we might no longer need os::enable_vtime and
> os::vtime_enabled, I think getting rid of os::supports_vtime isn't
> right.
Two main options seem feasible:
1. Revert to the V1 patch which just fixes the Solaris issue and leaves
everything else alone.
2. Respin the V2 patch and leave os::supports_vtime() in place, and the
two calls to it (from g1CollectedHeap.cpp and
g1YoungRemSetSamplingThread.cpp). Remove os::enable_vtime() and
os::vtime_enabled().
For #2 we could in addition change BSD code so it returns the correct
value from os::supports_vtime() (false) but I'm cautious about doing
that as it would start using a new codepath that hasn't previously been
exercised.
Also for #2 we could remove the fallback code, the question being what
to replace it with? Returning 0 could potentially cause the generation
of negative values elsewhere, if the call to os::elapsedVTime() had
previously succeeded.
> OTOH, the present usage of vtime is quite limited. As noted, it's
> only used by G1, and there I think only for some logging / stats
> reporting. Maybe we don't really need it at all? I don't have an
> opinion on that, but other folks in the GC team (at least) might.
Who should we ask for a GC opinion?
I'm leaning towards switching back to the V1 of the patch which just
fixes the Solaris issue, and in addition logging another bug which
covers the larger cross-platform issues. That seems like the lowest-risk
approach whist still fixing a problem on Solaris that *absolutely* needs
to be addressed, but I'll welcome any advice.
Thanks,
--
Alan Burlison
--
More information about the hotspot-dev
mailing list