Sometimes constraints are questionable

David Holmes david.holmes at oracle.com
Wed Jun 3 06:49:53 UTC 2020


Hi Stuart,

On 3/06/2020 8:08 am, Stuart Marks wrote:
> Hi Jim,
> 
> This was mentioned previously in this thread but not discussed very 
> much. I suggest you take a look at 
> jdk.internal.util.ArraysSupport.newLength(). Ivan Gerasimov and I worked 
> this over fairly closely last year, and I'm pretty sure it does what 
> Martin is saying, which I also think is the right thing.
> 
> The intent is that it be used for things that have growable arrays, 
> where the array might have a larger capacity than the logical number of 
> elements currently stored. Sometimes the array needs to be grown to 
> accommodate an immediate need (minGrowth) but it's desired to grow 
> larger (prefGrowth) in anticipation of future needs. If minGrowth can't 
> be accommodated, it throws OOME, but if prefGrowth can't be 
> accommodated, it might be acceptable to provide a smaller amount of growth.
> 
> (Of course, all this assumes that there is sufficient memory available 
> to allocate the actual array. ArraysSupport.newLength doesn't attempt to 
> ascertain that.)
> 
> One issue is integer wraparound (overflow). This is the primary value 
> that ArraysSupport.newLength provides. It would be good to centralize 
> these computations instead of having them be spread all over.
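The overflow-conscious policy Stuart describes can be sketched roughly as below. The method shape (oldLength/minGrowth/prefGrowth) follows his description, but the exact soft-cap value and the code itself are illustrative, not a copy of the ArraysSupport source:

```java
public class ArrayGrowthSketch {
    // A soft cap a bit below Integer.MAX_VALUE; the exact value used
    // here (MAX_VALUE - 8) is an assumption of this sketch.
    static final int SOFT_MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Returns a capacity of at least oldLength + minGrowth, preferring
    // oldLength + prefGrowth when that fits under the soft cap.
    static int newLength(int oldLength, int minGrowth, int prefGrowth) {
        // Sums are computed in int on purpose: if the preferred sum wraps
        // around, it comes out non-positive and the range check rejects it.
        int prefLength = oldLength + Math.max(minGrowth, prefGrowth);
        if (0 < prefLength && prefLength <= SOFT_MAX_ARRAY_LENGTH) {
            return prefLength;
        }
        // Preferred growth is unavailable: fall back to the minimal request
        // (the real helper delegates this to a separate hugeLength method).
        int minLength = oldLength + minGrowth;
        if (minLength < 0) { // wraparound: even minGrowth cannot fit in an int
            throw new OutOfMemoryError("Required array length too large");
        }
        return Math.max(minLength, SOFT_MAX_ARRAY_LENGTH);
    }
}
```

Centralizing this in one helper means the wraparound check is written (and debugged) once, instead of being re-derived at every growable-array call site.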
> 
> Another issue is the one that MAX_ARRAY_LENGTH (also called 
> MAX_ARRAY_SIZE) is trying to address. This is sort-of a misnomer. It's 
> not the actual maximum array size (which in fact isn't known to the 
> library). It's actually the maximum array size that the library is 
> fairly confident the VM can provide, assuming that enough memory is 
> actually available. What the heck does that mean?
> 
> The theoretical maximum array size is Integer.MAX_VALUE, since the JLS 
> and JVMS don't allow anything larger. However, actual VM implementations 
> will refuse to allocate an array above a certain amount slightly smaller 
> than that, even if there is enough memory available. In practice, I 
> believe the values for current Hotspot are Integer.MAX_VALUE-3 or 
> Integer.MAX_VALUE-2, depending on whether compressed OOPS are in use.
> 
> Why is this significant? Consider the following case, where the capacity 
> of something is currently Integer.MAX_VALUE-100, and it's filled with 
> elements. The application performs some operation that requires 50 
> elements (minGrowth) be added. A new array could certainly be allocated 
> with size Integer.MAX_VALUE-50, but typical growth policies for these 
> kinds of containers want to increase the current capacity by 1.5x or 2x 
> (prefGrowth). Doing this multiplication would exceed Integer.MAX_VALUE, 
> so that won't work. Clearly, we need to clamp the capacity somewhere.
> 
> We don't want to clamp the capacity at Integer.MAX_VALUE, because this 
> allocation would fail on every JVM I'm aware of, even if enough memory 
> is available. So we don't do that. Instead, we clamp the preferred 
> growth at some fairly arbitrary value smaller than Integer.MAX_VALUE, 
> which is here called MAX_ARRAY_LENGTH, and increase the capacity to that 
> instead. This allows the container's requested allocation to proceed: it 
> satisfies minGrowth, but it doesn't satisfy prefGrowth. Instead, it 
> returns a capacity value that's reasonably likely to succeed, given an 
> unknown JVM implementation limit.
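The arithmetic in that scenario can be checked directly; this small demo (not from the thread) shows why the 2x preference must be clamped while the minimal request still fits:

```java
public class GrowthOverflowDemo {
    public static void main(String[] args) {
        int oldCapacity = Integer.MAX_VALUE - 100; // current capacity, full
        // What a 2x growth policy would ask for, computed in long so we
        // can see the true value rather than a wrapped-around int:
        long preferred = (long) oldCapacity * 2;
        System.out.println(preferred > Integer.MAX_VALUE); // true: 2x cannot be an int
        // The minimal request (50 more elements) still fits in an int:
        int minimal = oldCapacity + 50; // Integer.MAX_VALUE - 50
        System.out.println(minimal > 0); // true: no wraparound
    }
}
```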
> 
> Recall that the container now has Integer.MAX_VALUE-50 elements and the 
> capacity is MAX_ARRAY_SIZE, which is currently defined somewhat 
> arbitrarily at Integer.MAX_VALUE-8. Suppose the application now wants to 
> add 43 elements. What should happen?
> 
> We could say, this exceeds MAX_ARRAY_LENGTH, so refuse the request and 
> throw OOME. This is reasonable and consistent in some sense, but perhaps 
> not in another. Suppose there is sufficient memory, and the JVM does 
> allow arrays of Integer.MAX_VALUE-7 to be created. Shouldn't we at 
> least try?
> 
> That's what hugeLength() does -- it returns a value that attempts an 
> allocation beyond the max preferential growth, and leaves it up to the 
> JVM to grant or refuse the request based on its own implementation limits.
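The hugeLength fallback Stuart describes might look roughly like this; the constant's exact value and the error message are illustrative assumptions, not the JDK source:

```java
public class HugeLengthSketch {
    // Illustrative soft cap (MAX_ARRAY_LENGTH / MAX_ARRAY_SIZE in the text).
    static final int SOFT_MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Called only when preferred growth has already been ruled out.
    static int hugeLength(int oldLength, int minGrowth) {
        int minLength = oldLength + minGrowth;
        if (minLength < 0) { // int wraparound: the minimal request itself cannot fit
            throw new OutOfMemoryError("Required array length too large");
        }
        // Grant the soft cap when the minimal request fits under it;
        // otherwise grant exactly the minimal request and let the VM
        // accept or refuse it against its own implementation limit.
        return Math.max(minLength, SOFT_MAX_ARRAY_LENGTH);
    }
}
```

In the running example, a capacity of Integer.MAX_VALUE-50 plus 43 required elements yields Integer.MAX_VALUE-7, which is above the soft cap but is still returned for the VM to judge.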

IIUC what you are saying is that MAX_ARRAY_LENGTH is treated as a 
soft-limit. A request for prefGrowth won't be allowed to exceed it. But 
if minGrowth takes the length past it, then the code tries to do the 
allocation at that size anyway. If it succeeds we win, and if we get OOME 
that is what we would have thrown anyway if we rejected the request as 
too big.

So my misunderstanding was in thinking that MAX_ARRAY_LENGTH attempts 
to define the actual VM hard limit; it's just a large value close to 
that which is expected to always be valid (actual memory permitting).

Thanks for the detailed explanation.

David
-----

> Anyway, this is all quite subtle, and maybe comments in ArraysSupport 
> don't describe this adequately. But the code that implements this kind 
> of policy has been copied to different locations around the JDK, and it 
> uses somewhat different terminology, and it might have slightly 
> different bugs, but they're all essentially trying to implement this 
> policy.
> 
> **
> 
> Several questions could be asked:
> 
> 1) Is this the right policy for implementing growable arrays?
> 
> 2) In cases where a class needs a growable array, can and should it be 
> refactored to use ArraysSupport.newLength()?
> 
> 3) Does ArraysSupport.newLength() need to be modified to accommodate 
> needs of additional call sites?
> 
> 4) We might want to consider refactoring PriorityBlockingQueue and 
> ArrayDeque to use ArraysSupport.newLength, in order to provide a 
> consistent policy for collections. Other growable-array-based 
> collections (Vector, ArrayList, PriorityQueue) do already.
> 
> s'marks
> 
> 
> 
> 
> 
> On 6/1/20 4:47 AM, Jim Laskey wrote:
>> Thanks David will run with that,
>>
>>
>>> On May 31, 2020, at 8:34 PM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> On 31/05/2020 12:29 am, Jim Laskey wrote:
>>>> I'm working through https://bugs.openjdk.java.net/browse/JDK-8230744 
>>>> <https://bugs.openjdk.java.net/browse/JDK-8230744> Several classes 
>>>> throw OutOfMemoryError without message.
>>>> I'm wondering why hugeCapacity in 
>>>> src/jdk.zipfs/share/classes/jdk/nio/zipfs/ByteArrayChannel.java is 
>>>> defined as
>>>>      /**
>>>>       * The maximum size of array to allocate.
>>>>       * Some VMs reserve some header words in an array.
>>>>       * Attempts to allocate larger arrays may result in
>>>>       * OutOfMemoryError: Requested array size exceeds VM limit
>>>>       */
>>>>      private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
>>>>      /**
>>>>       * Increases the capacity to ensure that it can hold at least the
>>>>       * number of elements specified by the minimum capacity argument.
>>>>       *
>>>>       * @param minCapacity the desired minimum capacity
>>>>       */
>>>>      private void grow(int minCapacity) {
>>>>          // overflow-conscious code
>>>>          int oldCapacity = buf.length;
>>>>          int newCapacity = oldCapacity << 1;
>>>>          if (newCapacity - minCapacity < 0)
>>>>              newCapacity = minCapacity;
>>>>          if (newCapacity - MAX_ARRAY_SIZE > 0)
>>>>              newCapacity = hugeCapacity(minCapacity);
>>>>          buf = Arrays.copyOf(buf, newCapacity);
>>>>      }
>>>>      private static int hugeCapacity(int minCapacity) {
>>>>          if (minCapacity < 0) // overflow
>>>>              throw new OutOfMemoryError();
>>>
>>> Not sure how we could have minCapacity < 0 at this point. It should 
>>> have been checked before the call to grow, and grow will not make it 
>>> negative.
>>>
>>>>          return (minCapacity > MAX_ARRAY_SIZE) ?
>>>>              Integer.MAX_VALUE :
>>>>              MAX_ARRAY_SIZE;
>>>
>>> That's a bug plain and simple. It should never report a size > 
>>> MAX_ARRAY_SIZE.
>>>
>>>>      }
>>>> It just seems that it's pushing the inevitable off to
>>>> Arrays.copyOf. Shouldn't it be:
>>>>      private static int hugeCapacity(int minCapacity) {
>>>>          if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE) {
>>>>              throw new OutOfMemoryError(
>>>>                  "ByteArrayChannel exceeds maximum size: " + MAX_ARRAY_SIZE);
>>>>          }
>>>>          return MAX_ARRAY_SIZE;
>>>>      }
>>>
>>> That seems more appropriate to me - modulo the question mark over 
>>> minCapacity being negative.
>>>
>>>> Real question: is there some hidden purpose behind this kind of logic?
>>>
>>> The basic strategy is to double the current capacity unless that will 
>>> trigger an unnecessary exception, in which case just use the 
>>> requested capacity, but again watch for the implementation limits.
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Cheers,
>>>> -- Jim
>>

