RFR(M): 8140482: Various minor code improvements (runtime)

Tue Nov 3 10:07:28 UTC 2015

Hi Goetz,

Quick follow up on a couple of things ...

On 3/11/2015 7:33 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
>> Sorry, lots of folks very busy at the moment as we try to get features
>> finalized before the deadline.
> thanks for looking at this!  I know the Dec. 10 deadline, but I guess that also
> holds for us ... at least J1 is over now.  (Unfortunately we could not attend
> this year.)

Me neither :)

>>> ps_core.c:
>>> Pread not necessarily terminates interp_name which is printed thereafter.
>>> Increase buffer size by 1 and add '\0'.
>>
>> Given:
>> #define BUF_SIZE     (PATH_MAX + NAME_MAX + 1)
>> isn't it impossible to encounter that problem?
> As I understand, pread does not null-terminate what it read.  So the
> null must come from the file.  This protects against a corrupted file.

So are you saying the nul is not present in the file? I'm not familiar 
with the ELF format.

>>> stubRoutines_x86.cpp:
>>> Cast to proper type. This way, left and right of '&' have the same type.
>>
>> I think you could just have changed the uint64_t to uint32_t as applied
>> to the shift rather than casting the 1 to uint_32t. End result is the
>> same though.
> What you propose did not work.  It was my first fix, too.

Hmm okay. The result of the shift must be an unsigned type and the 
constant 1 is signed, so needs the cast (or use the unsigned constant 
form - 1ud? )

>>> attachListener_linux.cpp:
>>> Read does not terminate buf. Size for '\0' is already considered.
>>
>> Looks a little odd being done on each iteration, but okay I guess.
> I'll try to move it out of the loop.  Better: I'll check whether the
> scan groks it if I move it out of the loop :)
>
>>> os_linux.cpp:
>>> Array sigflags[] has size MAXSIGNUM==32.  _NSIG is bigger than
>>> MAXSIGNUM (_NSIG == 65 on my machine).
>>> sig is checked to be smaller than _NSIG. Later, in set_our_sigflags(),
>>> sig is used to access sigflags[MAXSIGNUM] which can overflow the array.
>>> Should we also increase MAXSIGNUM?
>>
>> Need to let the SR folk comment here as something definitely seems
>> wrong, but I'm not 100% sure the what the correct answer is. If
>> _JAVA_SR_SIGNUM is too big it should be validated somewhere and an error
>> or warning reported.
> I'm also not sure how to best handle this. Might even require a fix
> exceeding this change.  But I think this is the best finding.
>
>>> codeBuffer.cpp:
>>> New_capacity is not initialized. Figure_expanded_capacities() handles this
>>> correctly, but initializing this is cheap and safe.
>>
>> Hmmm ... I hate redundancy - this is pure wasted cycles. If we had to do
>> it would memset not be better? Or would the code-checker not realize
>> what memset was doing?
> I guess it would work with memset, too.  But I thought the 3-deep loop
> will be unrolled completely so that only three stores remain.

I tend not to try and imagine what the compiler may or may not do. Happy 
to take other opinions. Though again I'd prefer if the checker could be 
shown that there is no missing initialization.

>>> dict.cpp:
>>> If j-- is executed for j==0, the loop aborts because j is unsigned (0-- >= b-
>>> _cnt).
>>> Instead, only do j++ if necessary.
>>
>> Not at all obvious to me that it is possible to do j-- when j==0, but
>> the change seems reasonable.
> Yes, the scan does not understand there is j++ right after j-- because
> of the loop iteration.  I saw it complaining about this pattern several times.
>
>> Lots of spacing changes in that code make it hard to see the real changes.
> Before, I was asked to fix indentation issues in a function I touch.
> Does that only hold for compiler files?

Yes/no/maybe :) Fixing up bad formatting when you are touching an area 
can be convenient, however it can also obscure the real changes, so it 
depends on the ratio of functional changes to format changes.

>> 145     // SAPJVM GL j-- can underflow, which will cause the loop to abort.
>> Seems unnecessary with the code change as noone will understand what j--
>> you are referring to.
> Didn't mean to leave this in here. Removed.
>
>>    150         nb->_keyvals[nbcnt + nbcnt    ] = key;
>>    151         nb->_keyvals[nbcnt + nbcnt + 1] = b->_keyvals[j+j+1];
>> hotspot-style doesn't align array index expressions like that. Ditto
>> 154/155.
> Fixed.
>
>>> generateOopMap.cpp:
>>> Idx is read from String. This is only called with constant strings, so compare
>>> should be folded away by optimizing compilers if inlined.
>>
>> Not a fan of adding conditions that should never be false (hence the
>> assert) and relying on the compiler to elide them.
> OK, removed.
>
>>> deoptimization.cpp:
>>> If buflen == 0, buf[-1] is accessed.
>>
>> Okay - but an assert(buflen>0) would be better I think as we should
>> never be calling with a zero-length buffer.
> Ok, I added the assert.  As this isn't critical code, I would like to leave the
> check in there, still.
>
>>> task.cpp:
>>> Fatal can return if -XX:SuppressErrorAt is used. Just don't access the
>>> array in this case.
>>
>> Okay. I would not be surprised if we have a lot of potential errors if a
>> fatal/guarantee failure is suppressed.
>>
>>> attachListener.hpp:
>>> Do strncpy to not overflow buffer. Don't write more chars than before.
>>
>> Again we have the assert to catch an error in the caller using an
>> invalid name.
> Hmm, the command comes from outside of the VM.  It's not checked
> very thoroughly, see, e.g., attachListener_windows.cpp:194.  Arg0 is
> checked twice, arg1 and arg2 are not checked at all.

The libattach code is still part of our codebase so should be doing the 
right things. The linux and solaris code seems to be doing the expected 
name length check. On Windows the name is set using cmd, which is also 
subject to a length check:

  if (strlen(cmd) > AttachOperation::name_length_max) return 
ATTACH_ERROR_ILLEGALARG;

> I add fixes for attachListener_windows.cpp to this change.
>
>>> heapDumper.cpp:
>>> strncpy does not null terminate.
>>
>> 1973     if (total_length >= sizeof(base_path)) {
>>
>> total_length already adds +1 for the nul character so the == case is
>> fine AFAICS.
>>
>> strncpy wont nul-terminate if the string exceeds the buffer size. But we
>> have already established that total_length <= sizeof(base_path), and
>> total_path includes space for a bunch of stuff other than HeapDumpPath,
>> so the strncpy of HeapDumpPath has to copy the nul character.
> Ok, removed.
>
>>   > src/share/vm/services/memoryService.cpp
>>
>> Ok.
>>
>>   > src/share/vm/utilities/xmlstream.cpp
>>
>> Ok - I'm more concerned about the "magic" 10 in that piece of code.
> I assume the 10 is the space needed for the "_done" plus some waste ...
>
> I'll do another run of the scan.  That takes a day.  I'll post a new webrev after
> that.

Thanks,
David

> Thank again for this thorough review,
>    Goetz
>
>
>
>
>>> Some of these, as the issue in codeBuffer.cpp, are actually handled correctly.
>>> Nevertheless this is not that obvious so that somebody changing the code
>>> Could oversee he has to add the initialization.
>>
>> Not an argument I buy en-masse as it leads to a lot of redundancy
>> through the code paths. Plus these tools that are being run should show
>> if a code changes requires initialization that is not present.
>>
>> Thanks,
>> David
>>
>>> Some of these fixes are part of SAP JVM for a long time.  This change has
>>> been tested with our nightly build of openJDK.
>>>
>>> Best regards,
>>>     Goetz,.
>>>