8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

Gary Adams gary.adams at oracle.com
Fri Jun 21 17:03:35 UTC 2019


Any chance this will also fix JDK-8226367 ?

On 6/20/19, 10:39 PM, Chris Plummer wrote:
> Thanks for looking into this.
>
> Chris
>
> On 6/20/19 6:42 PM, Daniil Titov wrote:
>> Thank you, Chris and Serguei, for reviewing this change.
>>
>> I did more testing with a test program on Linux that repeats 
>> get_namespace_pid() method and reads from the real file ( the copy of 
>> /proc/<pid>/status).
>> I paused the program after the first several lines ( but not "NSpid:" 
>> line)  where processed and deleted all lines in the status file 
>> before " NSpid: ..." line in order
>> to this line became the first in the edited file. After that the 
>> program continues in the while loop but it seems as the original file 
>> content was
>> already buffered and the program just continues over the original 
>> unedited lines and successfully found the match.
>>
>> Best regards,
>> Daniil
>>
>> On 6/19/19, 9:13 PM, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>>
>>      Hi Daniil,
>>           I think your fix is good, although I wonder about the 
>> robustness of this
>>      function given that the status file can change while it is 
>> reading it. I
>>      wonder if it can return a false negative because it never saw the
>>      matching line. This could happen if a line gets deleted, causing 
>> the
>>      matching line to suddenly be earlier in the file, possibly 
>> before the
>>      current location being read. Anyway, that's not really related 
>> to the
>>      current issue or fix, but if you think it might be an issue 
>> maybe a bug
>>      should be filed for it.
>>           thanks,
>>           Chris
>>           On 6/19/19 9:02 PM, Daniil Titov wrote:
>> > Please review the change that fixes an intermittent failure of 
>> serviceability/dcmd/framework/* tests on Linux platform.
>> >
>> > The problem here is that get_namespace_pid() method, that is called 
>> by mmap_attach_shared () that in turn is called by PerfMemory::attach(),
>> > tries to read the namespace pid information from /proc/<pid>/status 
>> file. However, it doesn't check that the error indicator associated with
>> > stream is set that results in the endless  loop (lines 664-677) if 
>> the process terminates after /proc/<pid>/status was opened (line 659)
>> > and checked for null (line 661).
>> >
>> >    658      snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
>> >     659      FILE *fp = fopen(fname, "r");
>> >     660
>> >     661      if (fp) {
>> >     662        int pid, nspid;
>> >     663        int ret;
>> >     664        while (!feof(fp)) {
>> >     665          ret = fscanf(fp, "NSpid: %d %d", &pid, &nspid);
>> >     666          if (ret == 1) {
>> >     667            break;
>> >     668          }
>> >     669          if (ret == 2) {
>> >     670            retpid = nspid;
>> >     671            break;
>> >     672          }
>> >     673          for (;;) {
>> >     674            int ch = fgetc(fp);
>> >     675            if (ch == EOF || ch == (int)'\n') break;
>> >     676          }
>> >     677        }
>> >     678        fclose(fp);
>> >     679      }
>> >
>> > The fix adds the check for the error indicator to ensure that the 
>> "while" loop terminates properly if the file no longer exists.
>> >
>> > Issues [3] and [4] have the same cause and will be closed as 
>> duplicates of this issue.
>> >
>> > Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, 
>> and tier3 tests are in progress.
>> >
>> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
>> > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
>> > [3] https://bugs.openjdk.java.net/browse/JDK-8223600
>> > [4] https://bugs.openjdk.java.net/browse/JDK-8217351
>> >
>> > Thanks!
>> > -Daniil
>> >
>> >
>>
>>
>
>



More information about the serviceability-dev mailing list