RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser
Mikael Vidstedt
mikael.vidstedt at oracle.com
Thu May 18 23:19:38 UTC 2017
> On May 18, 2017, at 3:50 PM, David Holmes <david.holmes at oracle.com> wrote:
>
> Hi Mikael,
>
> On 19/05/2017 8:15 AM, Mikael Vidstedt wrote:
>>
>>> On May 18, 2017, at 2:59 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> On 05/17/2017 03:46 AM, Kim Barrett wrote:
>>>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt <mikael.vidstedt at oracle.com> wrote:
>>>>>
>>>>>
>>>>> Warning: It may be wise to stock up on coffee or tea before reading this.
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032
>>>>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ <http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/>
>>>> Not a review, just a question.
>>>> ------------------------------------------------------------------------------
>>>> src/cpu/x86/vm/bytes_x86.hpp
>>>> 40 template <typename T>
>>>> 41 static inline T get_native(const void* p) {
>>>> 42 assert(p != NULL, "null pointer");
>>>> 43
>>>> 44 T x;
>>>> 45
>>>> 46 if (is_ptr_aligned(p, sizeof(T))) {
>>>> 47 x = *(T*)p;
>>>> 48 } else {
>>>> 49 memcpy(&x, p, sizeof(T));
>>>> 50 }
>>>> 51
>>>> 52 return x;
>>>> I'm looking at this and wondering if there's a good reason to not just
>>>> unconditionally use memcpy here. gcc -O will generate a single move
>>>> instruction for that on x86_64. I'm not sure what happens on 32bit
>>>> with an 8 byte value, but I suspect it will do something similarly
>>>> sensible, e.g. 2 4 byte memory to memory transfers.
>>>
>>> Unconditionally memcpy would be nice!
>>>
>>> Are going to look into that Mikael?
>>
>> It’s complicated…
>>
>> We may be able to switch, but there is (maybe) a subtle reason why the alignment check is in there: to avoid word tearing..
>>
>> Think of two threads racing:
>>
>> * thread 1 is writing to the memory location X
>> * thread 2 is reading from the same memory location X
>>
>> Will thread 2 always see a consistent value (either the original value or the fully updated value)?
>
> We're talking about internal VM load and stores rights? For those we need to use appropriate atomic routine if there are potential races. But we should never be mixing these kind of accesses with Java level field accesses - that would be very broken.
That seems reasonable, but for my untrained eye it’s not trivially true that relaxing the implementation is correct for all the uses of the get/put primitives. I am therefore a bit reluctant to do so without understanding the implications.
> For classFileparser we should no concurrency issues.
That seems reasonable. What degree of certainty does your “should” come with? :)
Cheers,
Mikael
>
> David
>
>> In the unaligned/memcpy case I think we can agree that there’s nothing preventing the compiler from doing individual loads/stores of the bytes making up the data. Especially in something like slowdebug that becomes more or less obvious - memcpy most likely isn’t intrinsified and is quite likely just copying a byte at a time. Given that the data is, in fact, unaligned, there is really no simple way to prevent word tearing, so I’m pretty sure that we never depend on it - if needed, we’re likely to already have some higher level synchronization in place guarding the accesses. And the fact that the other, non-x86 platforms already do individual byte loads/stores when the pointer is unaligned indicates is a further indication that that’s the case.
>>
>> However, the aligned case is where stuff gets more interesting. I don’t think the C/C++ spec guarantees that accessing a memory location using a pointer of type T will result in code which does a single load/store of size >= sizeof(T), but for all the compilers we *actually* use that’s likely to be the case. If it’s true that the compilers don’t splits the memory accesses, that means we won’t have word tearing when using the Bytes::get/put methods with *aligned* pointers.
>>
>> If I switch to always using memcpy, there’s a risk that it introduces tearing problems where earlier we had none. Two questions come to mind:
>>
>> * For the cases where the get/put methods get used *today*, is that a problem?
>> * What happens if somebody in the *future* decides that put_Java_u4 seems like a great thing to use to write to a Java int field on the Java heap, and a Java thread is racing to read that same data?
>>
>>
>> All that said though, I think this is worth exploring and it may well turn out that word tearing really isn’t a problem. Also, I believe there may be opportunities to further clean up this code and perhaps unify it a bit across the various platforms.
>>
>> And *that* said, I think the change as it stands is still an improvement, so I’m leaning towards pushing it and filing an enhancement and following up on it separately. Let me know if you strongly feel that this should be looked into and addressed now and I may reconsider :)
>>
>> Cheers,
>> Mikael
>>
>>>
>>> /Robbin
>>>
>>>> ------------------------------------------------------------------------------
More information about the hotspot-dev
mailing list