8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout
Gary Adams
gary.adams at oracle.com
Fri Jun 21 17:03:35 UTC 2019
Any chance this will also fix JDK-8226367 ?
On 6/20/19, 10:39 PM, Chris Plummer wrote:
> Thanks for looking into this.
>
> Chris
>
> On 6/20/19 6:42 PM, Daniil Titov wrote:
>> Thank you, Chris and Serguei, for reviewing this change.
>>
>> I did more testing with a test program on Linux that repeats
>> get_namespace_pid() method and reads from the real file ( the copy of
>> /proc/<pid>/status).
>> I paused the program after the first several lines ( but not "NSpid:"
>> line) where processed and deleted all lines in the status file
>> before " NSpid: ..." line in order
>> to this line became the first in the edited file. After that the
>> program continues in the while loop but it seems as the original file
>> content was
>> already buffered and the program just continues over the original
>> unedited lines and successfully found the match.
>>
>> Best regards,
>> Daniil
>>
>> On 6/19/19, 9:13 PM, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>>
>> Hi Daniil,
>> I think your fix is good, although I wonder about the
>> robustness of this
>> function given that the status file can change while it is
>> reading it. I
>> wonder if it can return a false negative because it never saw the
>> matching line. This could happen if a line gets deleted, causing
>> the
>> matching line to suddenly be earlier in the file, possibly
>> before the
>> current location being read. Anyway, that's not really related
>> to the
>> current issue or fix, but if you think it might be an issue
>> maybe a bug
>> should be filed for it.
>> thanks,
>> Chris
>> On 6/19/19 9:02 PM, Daniil Titov wrote:
>> > Please review the change that fixes an intermittent failure of
>> serviceability/dcmd/framework/* tests on Linux platform.
>> >
>> > The problem here is that get_namespace_pid() method, that is called
>> by mmap_attach_shared () that in turn is called by PerfMemory::attach(),
>> > tries to read the namespace pid information from /proc/<pid>/status
>> file. However, it doesn't check that the error indicator associated with
>> > stream is set that results in the endless loop (lines 664-677) if
>> the process terminates after /proc/<pid>/status was opened (line 659)
>> > and checked for null (line 661).
>> >
>> > 658 snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
>> > 659 FILE *fp = fopen(fname, "r");
>> > 660
>> > 661 if (fp) {
>> > 662 int pid, nspid;
>> > 663 int ret;
>> > 664 while (!feof(fp)) {
>> > 665 ret = fscanf(fp, "NSpid: %d %d", &pid, &nspid);
>> > 666 if (ret == 1) {
>> > 667 break;
>> > 668 }
>> > 669 if (ret == 2) {
>> > 670 retpid = nspid;
>> > 671 break;
>> > 672 }
>> > 673 for (;;) {
>> > 674 int ch = fgetc(fp);
>> > 675 if (ch == EOF || ch == (int)'\n') break;
>> > 676 }
>> > 677 }
>> > 678 fclose(fp);
>> > 679 }
>> >
>> > The fix adds the check for the error indicator to ensure that the
>> "while" loop terminates properly if the file no longer exists.
>> >
>> > Issues [3] and [4] have the same cause and will be closed as
>> duplicates of this issue.
>> >
>> > Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2,
>> and tier3 tests are in progress.
>> >
>> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
>> > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
>> > [3] https://bugs.openjdk.java.net/browse/JDK-8223600
>> > [4] https://bugs.openjdk.java.net/browse/JDK-8217351
>> >
>> > Thanks!
>> > -Daniil
>> >
>> >
>>
>>
>
>
More information about the serviceability-dev
mailing list