RFR (S) 8150465: Unsafe methods to produce uninitialized arrays

Fri Feb 26 10:24:19 UTC 2016

Hi Aleksey,

Agreed on all fronts that Unsafe is playing with fire.

I was not objecting, I just wanted to accurately depict the similarities 
and differences with the existing function before we allow the faulty 
analogy to move that analysis off the table.  "A is like B so we don't 
need to consider any new info" only works if all aspects of A are 
covered by aspects of B.

As you note, we can leak uninitialized memory from Unsafe.allocateMemory 
as well, if we copy some of the uninitialized data out of it on behalf 
of an untrusted caller, but that is one level of "oops" harder than 
storing the reference to the unsafe primitive array in an accidentally 
accessible location.

The potential failures of the existing method are that we dig out 
unwritten data and return it.  It is not a failure to accidentally give 
them the handle, though, because the handle is useless to them.  It 
would be a huge error to let them overwrite or otherwise change any of 
these handles, though, but simply returning the handle from a method 
does not create a leak.

The potential failures of the new method are both that we might dig out 
unwritten data and supply that data, but also that we might accidentally 
give away the handle itself.

It doesn't have to be "uh, oh, I forgot to initialize all of the data 
because my algorithm was too obscure", it can be that the programmer 
stored the array reference in a field that has a getter before they are 
finished initializing it.  You can examine the initialization code in 
fine detail and say "Yep, that's all of the locations in the array", but 
if you didn't notice that there was a getter that someone could call 
while you were doing that work, you've leaked data.  Or maybe you knew 
about the getter, but thought that the object was only held by trusted 
modules - until there's an exploit.  The getter may not even exist when 
they write the initialization code, but someone comes along later and 
thinks "it would be really handy to have a getter for that array 
reference" and doesn't notice that it goes through part of its life span 
as a partially initialized array ("OMG, what's that now?!").

I also don't think that it's a good analogy that these are similar to 
bugs that the compiler could have.  The compiler code is some of the 
most scrutinized code we have, with a fairly narrow and specific set of 
code translation operations it performs, maintained by a fairly limited 
group of engineers who are experts in those issues.  But we have a ton 
of library code in many places and a huge body of engineers working on 
it - a few orders of magnitude more fingers that aren't trained in 
memory barriers and proving accessibility that can slip up ("What do you 
mean they can call a protected method? I thought it was protected for a 
reason!?").

That widens the gap of potential programmer errors that can create a 
data leak.  It sounds like we probably accept that risk, but let's be on 
the table about it.  All of those are programmer errors, but let's at 
least acknowledge the scenarios before we simply say "all programmers 
everywhere shouldn't do bad stuff" and rubber stamp a new feature.

In the end, I'm just saying that the specific comment that we already 
have a mechanism that can create uninitialized memory doesn't imply that 
there are no new security implications to consider here.  That's a 
logical shortcut to minimize discussion, not a valid conclusion.  I'm 
seeing a fair amount of hand waving here and not much objective analysis 
of the risks.  It's important to nail down the issues if for no other 
reason than to provide good advice on usage practices in our 
documentation to our internal developers...

			...jim

On 2/25/2016 11:01 PM, Aleksey Shipilev wrote:
> Unsafe is unsafe, no doubt about it.
>
> But if you accept the premise that Unsafe.allocateUninitializedArray is
> wrong because it can leak data from previous allocation if used
> incorrectly, then you should equally accept that compilers should not do
> the same automatic optimization too. Even more so, because the compiler
> code is much harder to audit for these mistakes; and programmers *do*
> make mistakes in that complicated compiler code.
>
> Notable examples in C2 are OptimizeStringConcat allocating uninitialized
> array for String (pretty much what we are targeting to do here), and
> subsequent arraycopy eliminating the array zeroing in a "tightly
> coupled" allocation. But, we do these optimizations already.
>
> Unsafe users *have* to provide an additional protection to avoid such
> leakage, pretty much how compilers *have* to guarantee safety in those
> situations too.
>
> Even without Unsafe.allocateUninitializedArray, I can certainly
> construct the buggy JDK code that will leak uninitialized memory to
> untrusted code: a simple mistake in offset calculation for
> Unsafe.getX(long,long) is enough to leak. Yet, we don't nuke the raw
> memory accessors from Unsafe, because they are useful, and we do
> understand those are sharp tools, and we should be extra careful using
> them.
>
> The same goes for any other Unsafe method. If you treat
> U.allocateUninitializedArray with the same respect you treating other
> tools in that toolbox, everything is fine. If you don't, then stay away
> from Unsafe to begin with. Unsafe is unsafe, there is no doubt about it.
>
> -Aleksey
>
> On 02/26/2016 01:57 AM, Jim Graham wrote:
>> Just to play devil's advocate here.
>>
>> It's true that from a code correctness-safety perspective Unsafe
>> programmers can already shoot themselves in the foot with uninitialized
>> allocations, but from the security point of view the two methods don't
>> have the same opportunity to leak information.
>>
>> Unsafe.allocateMemory returns a long, which is just a long to any
>> untrusted code since it can't use the Unsafe methods to access the data
>> in it.
>>
>> The new uninitialized array allocation returns a primitive array which
>> can be inspected by untrusted code for any stale elements that hold
>> private information from a previous allocation - should that array ever
>> be leaked to untrusted code...
>>
>>              ...jim
>>
>> On 2/25/2016 7:47 AM, Paul Sandoz wrote:
>>>
>>>> On 25 Feb 2016, at 15:36, Aleksey Shipilev
>>>> <aleksey.shipilev at oracle.com> wrote:
>>>>
>>>> On 02/25/2016 05:13 PM, Andrew Haley wrote:
>>>>> On 02/25/2016 01:44 PM, Aleksey Shipilev wrote:
>>>>>> Of course, you will still see garbage data if after storing the array
>>>>>> elements into the uninitialized array you would publish it racily. But
>>>>>> the same is true for the "regular" allocations and subsequent writes.
>>>>>> The only difference is whether you see "real" garbage, or some
>>>>>> "synthetic" garbage like zeros. It is, of course, a caller
>>>>>> responsibility to publish array safely in both cases, if garbage is
>>>>>> unwanted.
>>>>>
>>>>> Of course, my worry with this optimization assumes that programmers
>>>>> make mistakes.  But you did say "complicated processing is done after
>>>>> the allocation."  And that's where programmers make mistakes.
>>>>
>>>> Of course they do; at least half of the Unsafe methods is suitable for
>>>> shooting oneself in a foot in creative ways. Unsafe is a sharp tool, and
>>>> Unsafe callers are trusted in their madness. This is not your average
>>>> Joe's use case, for sure.
>>>>
>>>
>>> FTR the contents of the memory allocated by Unsafe.allocateMemory are
>>> also uninitialized.
>>>
>>> Paul.
>>>
>
>