RFR: Unaligned pointer dereference crash in ClassFileParser

Wed Apr 12 04:55:44 UTC 2017

This one is interesting, and requires a bit of background. Get yourself a cup of coffee/tea if you haven’t already. I’d actually like to push this this jdk10 as well sometime soon-ish, because it’s an actual/latent bug which may be tickled if our toolchain changes in some way. It’s not actually a musl bug per-se, it just happens to be tickled when compiling for musl.

webrev: http://cr.openjdk.java.net/~mikael/webrevs/portola/copyswapaligned/webrev.00/hotspot/webrev/

The story goes something like this:

x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn’t aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSA2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned.

Normally this isn’t a problem, because after all we don’t typically use 128-bit data types in our C/C++ code. However, just because we don’t use any such data types explicitly, there’s nothing preventing the C compiler to do so under the covers. Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses.

We’ve actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2 [1]. That time the problem was in nio/Bits.c, and it was fixed (by me) by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc.

This time the problem is with ClassFileParser. Or more accurately, it’s in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the “native” endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file. bytes_x86.hpp has this to say about its implementation:

// Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering
// (no special code is needed since x86 CPUs can access unaligned data)

While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction.

So why is this only a problem when using musl? Turns out it sorta isn’t, but bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system <byteswap.h> functions/macros swap_u{16,32,64} to do the actual byte swapping. Now here’s the “funny” & interesting part:

With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to a rotate “rorw” instruction. Since GCC can’t see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional.

With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it’s just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom.

To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay.

That concludes the background. Let’s move over to the suggested fix. There are (at least) two alternatives:

1. Only fix the specific case of classFileStream.cpp/copy_u2_with_conversion
2. Address the larger problem of Bytes providing methods which may not be safe

For better or worse, the webrev I linked to above does the latter - it doesn’t just do the minimal change. I’m happy to revisit that and reduce the scope of the patch if needed.

The key changes are in three different areas, the first two of which are needed (in one way or another):

* copy.[ch]pp

Introducing: conjoint_swap_maybe

conjoint_swap_maybe will copy data, and bytes wap it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap).

* classFile{Parser,Stream}

The key (and only absolutely needed) change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_maybe method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2.

However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there’s nothing wrong with *having* an unaligned pointer in C - as long as it’s not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler.

Changing to void* may or may not be the right thing to do here. In a way I’d actually like to “carry” the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions.

* bytes_x86.hpp

Note: I believe that given the above changes, this one isn’t absolutely necessary. This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren’t checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said:

It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or does a memcpy (if unaligned). There’s some template “magic” in there avoid duplicating code, but hopefully the magic is relatively straightforward.

Appreciate any and all comments and feedback. As mentioned, I think this is worth pushing to jdk10 as well in some not-all-too-distant future.

Cheers,
Mikael

[1] https://bugs.openjdk.java.net/browse/JDK-8141491