RFR(M): 8038498: Fix includes and C inlining after 8035330

Tue Apr 1 11:10:45 UTC 2014

In C/C++ we have the fundamental problem that, as far as defined by
the standard, the compiler can only 'see' the things which are defined
in its current compilation unit where a compilation unit is defined by
a .cpp file together with all the files it includes.

That is exactly the reason why you should not declare a function 'F'
as "inline" in a header file (say "f.h") but define it in a .cpp file
(say "f.cpp"). This will make it impossible for a compiler to really
inline "F" in another compilation unit (say "g.cpp") which includes
"f.hpp" (but of course not "g.cpp").

One simple solution to overcome this problem is to not only declare
but to also define inline functions directly in their corresponding
header files. Unfortunately this only works for very small examples.
Once your program gets bigger, you'll get cyclic dependencies between
such header files. And that's actually clear, because defining
functions in a header file breaks the contract that header files
should contain interface declarations while .cpp files contain the
actual definitions (i.e. implementations).

So the next step (and the solution used in HotSpot) is to restrict
oneself to declarations in header files and place definitions which
are needed in more than one compilation unit into so called
".inline.hpp" files. These ".inline.hpp" include their corresponding
".hpp" files by default (because they actually implement something
which is declared in there - so they need access to it). On the other
hand, every compilation unit which requires a
definition/implementation of a class or function (and not only its
declaration) can now include that ".inline.hpp" file. And they don't
need to include the corresponding ".hpp" file, because the
".inline.hpp" already did that.

All that is the ideal world which would work perfectly fine.
Unfortunately, not all of the current HotSpot code adheres to these
conventions. The usual mistake people make is to define a very simple
inline function right in the header file (i.e. ".hpp") because it is
so simple and has no dependencies at all. But as time goes by, that
function grows, becomes more complicated and requires more and more
dependencies (i.e. ".inline.hpp" files). But instead of refactoring
such functions into their own ".inline.hpp" files, the ingenious
programmer naively includes the ".inline.hpp" file into the header.
And that's where it starts to get ugly!

All that said, there are a few things which influence the situation.
The first one is "precompiled headers". Precompiled headers are a
compiler hack which is not specified by the C/C++ standard and which
is implemented differently by different compilers with the only only
goal to speed up the compilation of a compilation unit. To achieve
this goal the compiler assumes that many compilation units contain a
lot of common code and therefore he "widens" the actual definition of
a "compilation unit" a little bit. With precompiled headers, the
compiler can see beyond its actual compilation unit because he sees
all the declarations and definitions from the precompiled header file
(or database or whatever) and not only the ones which are actually
transitively included by the current ".cpp" file. That's exactly the
reason why you don't realize missing dependencies if you only compile
with precompiled headers turned on (and why it is crucial to have a
sanity build without precompiled headers).

And there's yet another aspect to this: C++ templates. With templates
you have exactly the same problem like with inline functions - they
are declared in a header file, but their definition is potentially
required in several compilation units (actually everywhere they get
instantiated at). Now as long as the template functions are simple,
they can be defined right in their header files. But we will get very
fast to the point where the template definitions get more complicated
and where they would introduce cycles in the header files. Because
templates where specified later then the "inline mechanism" and the
designers were aware of the problem, they tried to somehow address
this within the language for examle with the "export" keyword.
Unfortunately this was not very successful and has meanwhile been
deprecated. As a consequence, a lot of techniques have been developed
to overcome these issues: put all template definitions into the
headers -> may lead to code bloat, use special template instantiation
files -> clumsy, compiler specific solutions like template
instantiation repositories -> proprietary.

One solution which always works and which is currently used in HotSpot
is to treat template definitions like the definition of inline
functions and place them into ".inline.hpp" files. We could of course
use ".template.hpp" but that's probably not worth the effort.

In the end, I absolutely agree with Stefan that we should add these
rules for the implementation of inline and template functions to the
corresponding HotSpot coding styles.

And finally I think that now, just at the beginning of the Java 9
development cycle, it is a perfect point in time to fix the current
code base where it violates these rules.

Regards,
Volker

On Tue, Apr 1, 2014 at 9:36 AM, David Holmes <david.holmes at oracle.com> wrote:
> On 1/04/2014 5:30 PM, Lindenmaier, Goetz wrote:
>>
>> Hi David,
>>
>>> Personally I don't see why a cpp file should have to be aware that a
>>> member function it uses just happens to have been defined in an
>>> inline.hpp file by its author. Seems to break encapsulation to me. :(
>>
>> Because the .cpp file will not compile if the .inline.hpp header with
>> the implementation is not visible during compilation.
>
>
> Understood - but there is more than one way to make the implementation
> visible to the cpp file eg cpp include .hpp and .hpp includes .inline.hpp (I
> suppose that is a "bad" thing to do?)
>
> And that still doesn't change the fact it is breaking encapsulation :)
>
>
>>> I see 277 occurrences of this in the existing OpenJDK hotspot code, so I
>>
>> There are a lot of .hpp files with include methods that don't have
>> a corresponding .inline.hpp file.  Like taskqueue.hpp, which includes
>> the atomic headers.  Would be nice if this was cleaned up.
>> Also, oop.inline.hpp often makes problems, because methods defined
>> there are used in .hpp files, which can not include oop.inline.hpp
>> as it causes cycles.  If you include that header in a cpp file, you also
>> have
>> to include oop.inline.hpp, which is really unintuitive.
>
>
> I find the inline situation to be unintuitive in general. There's a flow on
> affect that an inline function can't use other inline functions unless they
> all get moved to .inline.hpp files. I hope it is okay for one .inline.hpp
> file to include another one? Otherwise you have to expose the internal
> implementation detail of one class to its clients so they know which set of
> .inline.hpp files have to be included!
>
> Cheers,
> David
>
>
>> Best regards,
>>    Goetz.
>>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Dienstag, 1. April 2014 08:03
>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net;
>> hotspot-gc-dev at openjdk.java.net
>> Subject: Re: RFR(M): 8038498: Fix includes and C inlining after 8035330
>>
>> Hi Goetz,
>>
>> Not a review ...
>>
>> On 31/03/2014 10:42 PM, Lindenmaier, Goetz wrote:
>>>
>>> Hi,
>>>
>>> could I please get reviews for this change?  I please also need a
>>> sponsor.
>>> I would like to get this into 8u20, as 8036330 was downported, too.
>>>
>>> I'd like to make the point of this change clear once more, independent
>>> of which compiler does what.
>>>
>>> I think we agree that basically
>>>     - a file should include all headers that define symbols or include
>>>       method bodies used in that file
>>
>>
>> Personally I don't see why a cpp file should have to be aware that a
>> member function it uses just happens to have been defined in an
>> inline.hpp file by its author. Seems to break encapsulation to me. :(
>>
>>>     - a .hpp header should not include a .inline.hpp header
>>
>>
>> I see 277 occurrences of this in the existing OpenJDK hotspot code, so I
>> don't think we can take this as a general rule either.
>>
>> Note I'm not saying this fix isn't right or needed, just that the
>> "rules" you are trying to embody don't seem to generally apply.
>>
>> David
>> -----
>>
>>> This is violated
>>>
>>> 1.)    by defining
>>>    template <class T> void immediate_rs_update(HeapRegion* from, T* p,
>>> int tid) {
>>>      if (!from->is_survivor()) {
>>>         _g1_rem->par_write_ref(from, p, tid);
>>>       }
>>> }
>>>    in g1CollectedHeap.hpp, as par_write_ref in declared 'inline' and only
>>> defined in
>>>       g1RemSet.inline.hpp - which is not included and should not be
>>> included
>>>       in g1CollectedHeap.hpp.
>>> 2. by including g1CollectedHeap.inline.hpp in sparsePRT.hpp
>>>
>>> This change fixes these two points by moving functions into .inline.hpp
>>> files
>>> and fixing the include.
>>>
>>> Best regards,
>>>     Goetz.
>>>
>>>
>>>
>>>
>>> From: Lindenmaier, Goetz
>>> Sent: Donnerstag, 27. März 2014 17:46
>>> To: hotspot-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
>>> Subject: RFR(M): 8038498: Fix includes and C inlining after 8035330
>>>
>>> Hi,
>>>
>>> Please review and test this change.  I please need a sponsor:
>>> http://cr.openjdk.java.net/~goetz/webrevs/8038498-incl/
>>>
>>> Change 8035330: Remove G1ParScanPartialArrayClosure and
>>> G1ParScanHeapEvacClosure
>>> broke the dbg build on AIX. That's because do_oop_partial_array() is
>>> added in
>>> a header, but requires inline function par_write_ref() through several
>>> inlined
>>> calls. In some cpp files, like arguments.cpp, par_write_ref() is not
>>> defined
>>> as the corresponding inline header is not included.  The aix debug VM
>>> does not start.
>>>
>>> This can be solved by including g1RemSet.inline.hpp in
>>> g1CollectedHeap.inline.hpp.
>>>
>>> Unfortunately this now causes a cyclic dependency that breaks the linux
>>> build.
>>> A inline.hpp file is included ahead of a .hpp file, so that the code in
>>> the inline.hpp file
>>> can not see the class declaration.
>>> This is caused because g1CollectedHeap.inline.hpp is included in
>>> sparsePRT.hpp.
>>> But .inline.hpp files never should be included in .hpp files.
>>>
>>> To resolve this, I changed this inlcude to g1CollectedHeap.hpp.  As
>>> consequence,
>>> I had to move a row of functions to existing .inline.hpp files.
>>>
>>> I did debug, fastdebug and product builds on linux_ppc64, aix_ppc64,
>>> sun_64, bsd_x86_64 and linux_x86_64 and tested that the VM starts up.
>>>
>>> Best regards,
>>>     Goetz
>>>
>