RFR: JDK-8160350 cannot truss jdk9 [solaris]
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Jul 1 20:32:30 UTC 2016
> I'm leaning towards switching back to the V1 of the patch which just
> fixes the Solaris issue, and in addition logging another bug which
> covers the larger cross-platform issues. That seems like the lowest-risk
> approach whist still fixing a problem on Solaris that *absolutely* needs
> to be addressed, but I'll welcome any advice.
Vote: Use JDK-8160350 to fix Solaris and file a new bug for the
larger cross-platform issues.
Dan
On 7/1/16 2:31 PM, Alan Burlison wrote:
> On 01/07/2016 20:44, Kim Barrett wrote:
>
>> This doesn't look right to me.
>>
>> The BSD implementation of os::elapsedVTime presently just calls
>> elapsedTime, with a comment "better than nothing, but not much". I
>> think it's worse than nothing, and should actually fail, with
>> os::supports_vtime returning false for this platform. Similarly,
>> elapsedVtime on some other platforms falls back to elapsedTime, which
>> I suspect is a mistake.
>
> I think os::elapsedVTime() is right on Windows, AIX, Solaris and
> Linux, and could be made right on OSX as well if the Mach APIs were
> used, and on FreeBSD it could use getrusage(). I think the problem is
> the other BSDs, I couldn't find anything suitable to use for those.
>
> And I agree the fallback to elapsedTime() also seems suspect as it's
> only ever done if the vtime mechanism for a given platform fails. If
> there was ever a transient failure and elapsedVTime() switched between
> vtime and elapsed time, you'd get some *very* bizarre values.
>
>> The problem is that the way one deals with vtime is different from the
>> way one deals with elapsedTime values. The predicate is needed so
>> that code that wants to use vtime if available can be properly written
>> to fall back to something else when it's not. (Note that I think
>> there might be bugs in how G1 is attempting to do that in some
>> places.)
>
> I agree what's there at present is a bit of a mess, which is why the
> first version of the patch (attached to the bug) just addressed the
> Solaris issue that caused truss to fail - that *definitely* needs
> fixing as it will cause a significant observability issue on Solaris.
>
>> I have a note to self to look into this and probably report some bugs,
>> but haven't gotten around to doing so.
>>
>> So while we might no longer need os::enable_vtime and
>> os::vtime_enabled, I think getting rid of os::supports_vtime isn't
>> right.
>
> Two main options seem feasible:
>
> 1. Revert to the V1 patch which just fixes the Solaris issue and
> leaves everything else alone.
>
> 2. Respin the V2 patch and leave os::supports_vtime() in place, and
> the two calls to it (from g1CollectedHeap.cpp and
> g1YoungRemSetSamplingThread.cpp). Remove os::enable_vtime() and
> os::vtime_enabled().
>
> For #2 we could in addition change BSD code so it returns the correct
> value from os::supports_vtime() (false) but I'm cautious about doing
> that as it would start using a new codepath that hasn't previously
> been exercised.
>
> Also for #2 we could remove the fallback code, the question being what
> to replace it with? Returning 0 could potentially cause the generation
> of negative values elsewhere, if the call to os::elapsedVTime() had
> previously succeeded.
>
>> OTOH, the present usage of vtime is quite limited. As noted, it's
>> only used by G1, and there I think only for some logging / stats
>> reporting. Maybe we don't really need it at all? I don't have an
>> opinion on that, but other folks in the GC team (at least) might.
>
> Who should we ask for a GC opinion?
>
> I'm leaning towards switching back to the V1 of the patch which just
> fixes the Solaris issue, and in addition logging another bug which
> covers the larger cross-platform issues. That seems like the
> lowest-risk approach whist still fixing a problem on Solaris that
> *absolutely* needs to be addressed, but I'll welcome any advice.
>
> Thanks,
>
More information about the hotspot-dev
mailing list