RFR(M): 8140482: Various minor code improvements (runtime)

Wed Nov 4 07:14:32 UTC 2015

Hi Goetz,

On 4/11/2015 12:10 AM, Lindenmaier, Goetz wrote:
> Hi David,
>
> the new scan is already through. I made a new webrev:
> http://cr.openjdk.java.net/~goetz/webrevs/8140482-covRt/webrev.01/
>
> attachListener_linux.cpp:
> I had moved the string termination out of the loop, and read one
> less char from the file. The scan still claims "Passing unterminated
> string buf to strlen" in line 274.  So I will undo it again for the
> webrev.
>
> codeBuffer.cpp
> Doing memset is fine.  I'll use memset().
>
>>>>> ps_core.c:
>>>>> Pread not necessarily terminates interp_name which is printed
>>>>> thereafter.  Increase buffer size by 1 and add '\0'.
>>>>
>>>> Given:
>>>> #define BUF_SIZE     (PATH_MAX + NAME_MAX + 1)
>>>> isn't it impossible to encounter that problem?
>>> As I understand, pread does not null-terminate what it read.  So the
>>> null must come from the file.  This protects against a corrupted file.
>>
>> So are you saying the nul is not present in the file? I'm not familiar
>> with the ELF format.
> There should be a nul in the file, else the code would fail more
> obviously.  The problem is if the file is corrupted.

Thanks for clarifying.

I'm still unclear why the attachListener.hpp changes are needed if we 
have validated the entry points in the attachListener_<os>.cpp files?

Also we still need someone from serviceability to comment on the _NSIG 
issue.

Thanks,
David

> Best regards,
>    Goetz.
>
>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Dienstag, 3. November 2015 11:07
>> To: Lindenmaier, Goetz; hotspot-runtime-dev at openjdk.java.net;
>> serviceability-dev
>> Subject: Re: RFR(M): 8140482: Various minor code improvements (runtime)
>>
>> Hi Goetz,
>>
>> Quick follow up on a couple of things ...
>>
>> On 3/11/2015 7:33 PM, Lindenmaier, Goetz wrote:
>>> Hi David,
>>>
>>>> Sorry, lots of folks very busy at the moment as we try to get features
>>>> finalized before the deadline.
>>> thanks for looking at this!  I know the Dec. 10 deadline, but I guess that also
>>> holds for us ... at least J1 is over now.  (Unfortunately we could not attend
>>> this year.)
>>
>> Me neither :)
>>
>>>>> ps_core.c:
>>>>> Pread not necessarily terminates interp_name which is printed
>> thereafter.
>>>>> Increase buffer size by 1 and add '\0'.
>>>>
>>>> Given:
>>>> #define BUF_SIZE     (PATH_MAX + NAME_MAX + 1)
>>>> isn't it impossible to encounter that problem?
>>> As I understand, pread does not null-terminate what it read.  So the
>>> null must come from the file.  This protects against a corrupted file.
>>
>> So are you saying the nul is not present in the file? I'm not familiar
>> with the ELF format.
>>
>>>>> stubRoutines_x86.cpp:
>>>>> Cast to proper type. This way, left and right of '&' have the same type.
>>>>
>>>> I think you could just have changed the uint64_t to uint32_t as applied
>>>> to the shift rather than casting the 1 to uint_32t. End result is the
>>>> same though.
>>> What you propose did not work.  It was my first fix, too.
>>
>> Hmm okay. The result of the shift must be an unsigned type and the
>> constant 1 is signed, so needs the cast (or use the unsigned constant
>> form - 1ud? )
>>
>>>>> attachListener_linux.cpp:
>>>>> Read does not terminate buf. Size for '\0' is already considered.
>>>>
>>>> Looks a little odd being done on each iteration, but okay I guess.
>>> I'll try to move it out of the loop.  Better: I'll check whether the
>>> scan groks it if I move it out of the loop :)
>>>
>>>>> os_linux.cpp:
>>>>> Array sigflags[] has size MAXSIGNUM==32.  _NSIG is bigger than
>>>>> MAXSIGNUM (_NSIG == 65 on my machine).
>>>>> sig is checked to be smaller than _NSIG. Later, in set_our_sigflags(),
>>>>> sig is used to access sigflags[MAXSIGNUM] which can overflow the array.
>>>>> Should we also increase MAXSIGNUM?
>>>>
>>>> Need to let the SR folk comment here as something definitely seems
>>>> wrong, but I'm not 100% sure the what the correct answer is. If
>>>> _JAVA_SR_SIGNUM is too big it should be validated somewhere and an
>> error
>>>> or warning reported.
>>> I'm also not sure how to best handle this. Might even require a fix
>>> exceeding this change.  But I think this is the best finding.
>>>
>>>>> codeBuffer.cpp:
>>>>> New_capacity is not initialized. Figure_expanded_capacities() handles
>> this
>>>>> correctly, but initializing this is cheap and safe.
>>>>
>>>> Hmmm ... I hate redundancy - this is pure wasted cycles. If we had to do
>>>> it would memset not be better? Or would the code-checker not realize
>>>> what memset was doing?
>>> I guess it would work with memset, too.  But I thought the 3-deep loop
>>> will be unrolled completely so that only three stores remain.
>>
>> I tend not to try and imagine what the compiler may or may not do. Happy
>> to take other opinions. Though again I'd prefer if the checker could be
>> shown that there is no missing initialization.
>>
>>>>> dict.cpp:
>>>>> If j-- is executed for j==0, the loop aborts because j is unsigned (0-- >= b-
>>>>> _cnt).
>>>>> Instead, only do j++ if necessary.
>>>>
>>>> Not at all obvious to me that it is possible to do j-- when j==0, but
>>>> the change seems reasonable.
>>> Yes, the scan does not understand there is j++ right after j-- because
>>> of the loop iteration.  I saw it complaining about this pattern several times.
>>>
>>>> Lots of spacing changes in that code make it hard to see the real changes.
>>> Before, I was asked to fix indentation issues in a function I touch.
>>> Does that only hold for compiler files?
>>
>> Yes/no/maybe :) Fixing up bad formatting when you are touching an area
>> can be convenient, however it can also obscure the real changes, so it
>> depends on the ratio of functional changes to format changes.
>>
>>>> 145     // SAPJVM GL j-- can underflow, which will cause the loop to abort.
>>>> Seems unnecessary with the code change as noone will understand what
>> j--
>>>> you are referring to.
>>> Didn't mean to leave this in here. Removed.
>>>
>>>>     150         nb->_keyvals[nbcnt + nbcnt    ] = key;
>>>>     151         nb->_keyvals[nbcnt + nbcnt + 1] = b->_keyvals[j+j+1];
>>>> hotspot-style doesn't align array index expressions like that. Ditto
>>>> 154/155.
>>> Fixed.
>>>
>>>>> generateOopMap.cpp:
>>>>> Idx is read from String. This is only called with constant strings, so
>> compare
>>>>> should be folded away by optimizing compilers if inlined.
>>>>
>>>> Not a fan of adding conditions that should never be false (hence the
>>>> assert) and relying on the compiler to elide them.
>>> OK, removed.
>>>
>>>>> deoptimization.cpp:
>>>>> If buflen == 0, buf[-1] is accessed.
>>>>
>>>> Okay - but an assert(buflen>0) would be better I think as we should
>>>> never be calling with a zero-length buffer.
>>> Ok, I added the assert.  As this isn't critical code, I would like to leave the
>>> check in there, still.
>>>
>>>>> task.cpp:
>>>>> Fatal can return if -XX:SuppressErrorAt is used. Just don't access the
>>>>> array in this case.
>>>>
>>>> Okay. I would not be surprised if we have a lot of potential errors if a
>>>> fatal/guarantee failure is suppressed.
>>>>
>>>>> attachListener.hpp:
>>>>> Do strncpy to not overflow buffer. Don't write more chars than before.
>>>>
>>>> Again we have the assert to catch an error in the caller using an
>>>> invalid name.
>>> Hmm, the command comes from outside of the VM.  It's not checked
>>> very thoroughly, see, e.g., attachListener_windows.cpp:194.  Arg0 is
>>> checked twice, arg1 and arg2 are not checked at all.
>>
>> The libattach code is still part of our codebase so should be doing the
>> right things. The linux and solaris code seems to be doing the expected
>> name length check. On Windows the name is set using cmd, which is also
>> subject to a length check:
>>
>>    if (strlen(cmd) > AttachOperation::name_length_max) return
>> ATTACH_ERROR_ILLEGALARG;
>>
>>> I add fixes for attachListener_windows.cpp to this change.
>>>
>>>>> heapDumper.cpp:
>>>>> strncpy does not null terminate.
>>>>
>>>> 1973     if (total_length >= sizeof(base_path)) {
>>>>
>>>> total_length already adds +1 for the nul character so the == case is
>>>> fine AFAICS.
>>>>
>>>> strncpy wont nul-terminate if the string exceeds the buffer size. But we
>>>> have already established that total_length <= sizeof(base_path), and
>>>> total_path includes space for a bunch of stuff other than HeapDumpPath,
>>>> so the strncpy of HeapDumpPath has to copy the nul character.
>>> Ok, removed.
>>>
>>>>    > src/share/vm/services/memoryService.cpp
>>>>
>>>> Ok.
>>>>
>>>>    > src/share/vm/utilities/xmlstream.cpp
>>>>
>>>> Ok - I'm more concerned about the "magic" 10 in that piece of code.
>>> I assume the 10 is the space needed for the "_done" plus some waste ...
>>>
>>> I'll do another run of the scan.  That takes a day.  I'll post a new webrev
>> after
>>> that.
>>
>> Thanks,
>> David
>>
>>> Thank again for this thorough review,
>>>     Goetz
>>>
>>>
>>>
>>>
>>>>> Some of these, as the issue in codeBuffer.cpp, are actually handled
>> correctly.
>>>>> Nevertheless this is not that obvious so that somebody changing the
>> code
>>>>> Could oversee he has to add the initialization.
>>>>
>>>> Not an argument I buy en-masse as it leads to a lot of redundancy
>>>> through the code paths. Plus these tools that are being run should show
>>>> if a code changes requires initialization that is not present.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Some of these fixes are part of SAP JVM for a long time.  This change has
>>>>> been tested with our nightly build of openJDK.
>>>>>
>>>>> Best regards,
>>>>>      Goetz,.
>>>>>