RFR: 8151841: Build needs additional flags to compile with GCC 6

Andrew Haley aph at redhat.com
Wed Mar 16 10:12:00 UTC 2016


On 15/03/16 19:15, Kim Barrett wrote:
>> On Mar 15, 2016, at 12:18 AM, Andrew Hughes <gnu.andrew at redhat.com> wrote:
> 
> I’ll probably have more to say later; just responding to one point here.
> 
>>>> 2. A number of optimisations in GCC 6 lead to a broken JVM. We need to
>>>> add -fno-delete-null-pointer-checks and -fno-lifetime-dse to get a
>>>> working JVM.
>>>
>>> That's very disturbing!
>>
>> Andrew Haley (CCed) has more details on the need for these options,
>> as he diagnosed the reason for the crash, with the help of the GCC
>> developers. From what I understand of it, it is a case of more
>> aggressive optimisations in the new GCC running into problems with
>> code that doesn't strictly conform to the specification and exhibits
>> undefined behaviour.
> 
> That is my suspicion too, though without more detail of the failures it’s
> hard to completely discount the possibility of compiler bugs.

They weren't compiler bugs: I analyzed the code and I am sure that the
code in HotSpot isn't valid C++.  We need -fno-lifetime-dse because we
write to a field of an object in operator new, before the constructor
runs.  This is in Node::operator new.  It's been partially fixed in JDK 9 by
8034812, but an illegal write remains if assertions are turned on.
The bug remains in JDK 8.  It might be that there are no similar bugs
elsewhere in HotSpot, but it would take more time than I had to prove
this.
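
To make the lifetime issue concrete, here is a minimal sketch. It is not the
HotSpot code: SketchNode, next_idx and make_node are hypothetical names
invented for illustration. The problematic shape (a store made inside
operator new that the constructor expects to survive) appears only as a
comment, because with lifetime DSE enabled the compiler may treat that store
as dead. The compiled part shows one well-defined alternative, in which the
constructor receives the value and initialises the field itself.

```cpp
#include <cstddef>
#include <new>

// Hypothetical counter standing in for whatever hands out node indices.
inline int next_idx() {
    static int counter = 0;
    return counter++;
}

// Hypothetical stand-in for a compiler IR node; not the real Node class.
struct SketchNode {
    int idx;

    // Problematic pattern, shown as a comment only:
    //
    //   void* operator new(std::size_t n) {
    //       void* p = ::operator new(n);
    //       static_cast<SketchNode*>(p)->idx = next_idx();  // store before lifetime begins
    //       return p;
    //   }
    //   SketchNode() {}  // relies on idx still holding the value stored above
    //
    // The object's value is indeterminate when its constructor begins, so
    // the earlier store may be eliminated unless -fno-lifetime-dse is used.

    // Well-defined alternative: the field is initialised inside the constructor.
    explicit SketchNode(int i) : idx(i) {}
};

// Allocation and index assignment now happen in the right order.
inline SketchNode* make_node() {
    return new SketchNode(next_idx());
}
```

Whether this shape would fit the real allocation path in HotSpot is an
assumption; the point is only that the value reaches the field after the
object's lifetime has begun.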

We dereference null pointers a lot.  I would very much like to clean
all of these out but I didn't detect much enthusiasm from the HotSpot
team.

>> The need for -fno-lifetime-dse is covered in
>> comment #47 of the bug for this [0]; "an object field [is] being
>> accessed before the object is constructed, in breach of
>> C++98 [class.cdtor]".
> 
> Thanks for the pointer to the redhat bug for tracking this work:
> https://bugzilla.redhat.com/show_bug.cgi?id=1306558
> 
> [Though a lot of comments there aren't visible to me.]
> 
> This comment is quite worrisome.
> https://bugzilla.redhat.com/show_bug.cgi?id=1306558#c6
>   I very strongly suspect that -fno-strict-aliasing is broken in this
>   version of GCC.
> 
> Is that still thought to be a concern?

No.  I was wrong.

> And any more information about why -fno-delete-null-pointer-checks
> matters?

As I mentioned above, we dereference null pointers a lot.  For
example, Register rax is defined as (RegisterImpl*)0.  So, if we do
something like

     guarantee(reg->is_valid(), "must be");

     if (reg == rax)
       stuff...

GCC is quite within its rights to delete the call to "stuff".  And it
will.
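
The reasoning behind that deletion: calling a member function through a null
pointer is undefined behaviour, so once reg->is_valid() has been evaluated the
compiler may assume reg is not null and fold the comparison with rax (which is
null) to false. As a hedged sketch of what a cleanup could look like, the
following replaces the pointer with a small value type. Reg, rax_sketch,
noreg_sketch, classify and the limit of 16 registers are all hypothetical
names and values chosen for illustration, not HotSpot's definitions.

```cpp
#include <cassert>

// Hypothetical value-type register handle: the encoding is stored in the
// object rather than being smuggled through a pointer value.
class Reg {
    int enc_;
public:
    constexpr explicit Reg(int enc) : enc_(enc) {}

    // Assumed range of 16 registers, purely for the sketch.
    constexpr bool is_valid() const { return enc_ >= 0 && enc_ < 16; }
    constexpr int encoding() const { return enc_; }

    friend constexpr bool operator==(Reg a, Reg b) { return a.enc_ == b.enc_; }
};

constexpr Reg rax_sketch(0);     // encoding 0, but not a null pointer
constexpr Reg noreg_sketch(-1);  // an invalid register

// Same check-then-compare shape as in the message above.
inline int classify(Reg reg) {
    assert(reg.is_valid());  // no pointer is dereferenced here
    if (reg == rax_sketch)
        return 1;            // the "stuff..." branch stays reachable
    return 0;
}
```

With this shape the comparison against register 0 is an ordinary integer
comparison, so the compiler has no null-pointer assumption to exploit and the
code no longer depends on -fno-delete-null-pointer-checks.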

As I said, I would very much like to clean this stuff up, but I'd need
support from the HotSpot team, and at the moment I feel that this is
lacking.  Even if we do get rid of it, problems will remain for old
versions of OpenJDK for years.

HotSpot is a million lines of code, more or less.  We've found this
kind of problem in several places.  Auditing to show that we don't
have such problems is a huge job, but we should do it.  In the
meantime, we should just consider some compiler options to be a
defence against an increasingly aggressive compiler, and err on the
side of safety.  We're not losing significant performance because
these optimizations are new and in many cases simply delete code we
want.

Andrew.


