Stop using precompiled headers for Linux?
Magnus Ihse Bursie
magnus.ihse.bursie at oracle.com
Fri Nov 2 10:39:53 UTC 2018
On 2018-11-02 00:53, Ioi Lam wrote:
> Maybe precompiled.hpp can be periodically (weekly?) updated by a
> robot, which parses the dependencies files generated by gcc, and pick
> the most popular N files?
I think that's tricky to implement automatically. However, I've done
more or less that, and I've gotten some wonderful results! :-)
I'd still like to run some more tests, but preliminary data indicates
that there is much to be gained by having a more sensible list of files
in the precompiled header.
The fewer files we have on this list, the less likely it is to become
(drastically) outdated. So I don't think we need to do this
automatically, but perhaps manually every now and then, when we feel
build times are increasing.
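The "pick the most popular N files" idea could be sketched roughly like this — a hypothetical script (paths, file layout, and the top-N cutoff are all assumptions) that parses the make-style .d dependency files gcc emits with -MMD and ranks headers by how many object files depend on them:

```python
# Sketch: rank headers by how many object files include them, using the
# make-style .d files gcc writes with -MMD. The build directory layout
# and the top-N cutoff are assumptions, not the actual JDK build setup.
from collections import Counter
from pathlib import Path

def parse_deps(text):
    """Parse one .d file into a list of header paths."""
    # Join backslash-continued lines, then split on whitespace.
    joined = text.replace("\\\n", " ")
    deps = []
    for token in joined.split():
        if token.endswith(":"):
            continue  # skip the 'foo.o:' target itself
        if token.endswith((".hpp", ".h")):
            deps.append(token)
    return deps

def rank_headers(build_dir, top_n=25):
    """Count, per header, the number of object files depending on it."""
    counts = Counter()
    for d_file in Path(build_dir).rglob("*.d"):
        # Use a set so one object file counts each header at most once.
        counts.update(set(parse_deps(d_file.read_text())))
    return counts.most_common(top_n)
```

The ranked output could then be reviewed by hand before updating precompiled.hpp, which fits the "manually, every now and then" approach above.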
/Magnus
>
> - Ioi
>
>
> On 11/1/18 4:38 PM, David Holmes wrote:
>> It's not at all obvious to me that the way we use PCH is the
>> right/best way to use it. We dump every header we think it would be
>> good to precompile into precompiled.hpp and then only ask gcc to
>> precompile it. That results in a ~250MB file that has to be read in
>> and processed for every source file! That doesn't seem very efficient
>> to me.
>>
>> Cheers,
>> David
>>
>> On 2/11/2018 3:18 AM, Erik Joelsson wrote:
>>> Hello,
>>>
>>> My point here, which wasn't very clear, is that Mac and Linux seem
>>> to lose just as much real compile time. The big difference in these
>>> tests was rather the number of cpus in the machine (32 threads in
>>> the linux box vs 8 on the mac). The total amount of work done was
>>> increased when PCH was disabled; that's the user time. Here is my
>>> theory on why the real (wall clock) time was not consistent with
>>> user time between these experiments:
>>>
>>> With PCH, the timeline (simplified) looks like this:
>>>
>>> 1. Single thread creating PCH
>>> 2. All cores compiling C++ files
>>>
>>> When disabling PCH, it's just:
>>>
>>> 1. All cores compiling C++ files
>>>
>>> To gain speed with PCH, the time spent in 1 must be less than the
>>> time saved in 2. The potential time saved in 2 goes down as the
>>> number of CPUs goes up. I'm pretty sure that if I repeated the
>>> experiment on Linux on a smaller box (typically one we use in CI),
>>> the results would look similar to macOS, and similarly, if I had
>>> access to a much bigger Mac, it would behave like the big Linux box.
>>> This is why I'm saying this should be done for both of these
>>> platforms or neither.
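Erik's argument can be put in back-of-envelope form: the serial PCH step pays off only if the CPU time it saves, spread across the cores, exceeds the serial cost. A toy model (all numbers invented, not measured) shows why an 8-core box can win where a 32-thread box loses:

```python
# Toy model of the PCH trade-off (all numbers invented): with PCH, a
# serial precompile step runs first, then the compile phase runs in
# parallel; without PCH there is only the parallel phase.

def wall_clock(total_cpu_seconds, cores, pch_serial_seconds=0.0):
    """Idealized wall time: serial PCH step + perfectly parallel compiles."""
    return pch_serial_seconds + total_cpu_seconds / cores

WORK_NPCH = 3600.0               # total CPU-seconds without PCH
PCH_SAVED = 900.0                # CPU-seconds of parsing that PCH avoids
PCH_SERIAL = 60.0                # serial cost of building the PCH itself
WORK_PCH = WORK_NPCH - PCH_SAVED

# With these numbers, PCH wins on 8 cores but loses on 32.
for cores in (8, 32):
    t_pch = wall_clock(WORK_PCH, cores, PCH_SERIAL)
    t_npch = wall_clock(WORK_NPCH, cores)
    print(f"{cores:2d} cores: pch={t_pch:6.1f}s  npch={t_npch:6.1f}s")
```

In this sketch the break-even point is at PCH_SAVED / PCH_SERIAL = 15 cores, which is consistent with the observation that the 8-core Mac and the 32-thread Linux box land on opposite sides of the trade-off.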
>>>
>>> In addition to this, the experiment only built hotspot. If we
>>> instead built the whole JDK, then the time wasted in 1 in the
>>> PCH case would be negated to a large extent by other build targets
>>> running concurrently, so for a full build, PCH is still providing
>>> value.
>>>
>>> The question here is whether PCH is worth it, if its value isn't
>>> very big and it's also creating as much grief as
>>> described here. There is no doubt that there is value, however. And
>>> given the examination done by Magnus, it seems this value could be
>>> increased.
>>>
>>> The main reason why we haven't disabled PCH in CI before is this: we
>>> really, really want CI builds to be fast. We don't have a ton of
>>> spare capacity to just throw at it. PCH made builds faster, so we
>>> used them. My other reason is consistency between builds. Supporting
>>> multiple different modes of building creates the potential for
>>> inconsistencies. For that reason I would definitely not support
>>> having PCH on by default, but turned off in our CI/dev-submit. We
>>> should pick one or the other as the official build configuration,
>>> and stick with it for all builds in any official capacity (which
>>> includes CI).
>>>
>>> In the current CI setup, we have a bunch of tiers that execute one
>>> after the other. The jdk-submit currently only runs tier1. In tier2
>>> I've put slowdebug builds with PCH disabled, just to help verify a
>>> common developer configuration. These builds are not meant to be
>>> used for testing or anything like that, they are just run for
>>> verification, which is why this is ok. We could argue that it would
>>> make sense to move the linux-x64-slowdebug without pch build to
>>> tier1 so that it's included in dev-submit.
>>>
>>> /Erik
>>>
>>> On 2018-11-01 03:38, Magnus Ihse Bursie wrote:
>>>>
>>>>
>>>> On 2018-10-31 00:54, Erik Joelsson wrote:
>>>>> Below are the corresponding numbers from a Mac, (Mac Pro (Late
>>>>> 2013), 3.7 GHz, Quad-Core Intel Xeon E5, 16 GB). To be clear, the
>>>>> -npch is without precompiled headers. Here we see a slight
>>>>> degradation when disabling on both user time and wall clock time.
>>>>> My guess is that the user time increase is about the same, but
>>>>> because of a lower CPU count, the extra load is not as easily
>>>>> covered.
>>>>>
>>>>> These tests were run with just building hotspot. This means that
>>>>> the precompiled header is generated alone on one core while
>>>>> nothing else is happening, which would explain this degradation in
>>>>> build speed. If we were instead building the whole product, we
>>>>> would see a better correlation between user and real time.
>>>>>
>>>>> Given the very small benefit here, it could make sense to disable
>>>>> precompiled headers by default for Linux and Mac, just as we did
>>>>> with ccache.
>>>>>
>>>>> I do know that the benefit is huge on Windows though, so we cannot
>>>>> remove the feature completely. Any other comments?
>>>>
>>>> Well, if you show that it is a loss of time on macOS to disable
>>>> precompiled headers, and no one (as far as I've seen) has
>>>> complained about PCH on Mac, then why not keep them on by default
>>>> there? That the gain is small is no argument to lose it. (I
>>>> remember a time when you were hunting seconds in the build time ;-))
>>>>
>>>> On Linux, the story seems different, though. People experience PCH
>>>> as a problem, and there is a net loss of time, at least on selected
>>>> test machines. It makes sense to turn it off by default, then.
>>>>
>>>> /Magnus
>>>>
>>>>>
>>>>> /Erik
>>>>>
>>>>> macosx-x64
>>>>> real 4m13.658s
>>>>> user 27m17.595s
>>>>> sys 2m11.306s
>>>>>
>>>>> macosx-x64-npch
>>>>> real 4m27.823s
>>>>> user 30m0.434s
>>>>> sys 2m18.669s
>>>>>
>>>>> macosx-x64-debug
>>>>> real 5m21.032s
>>>>> user 35m57.347s
>>>>> sys 2m20.588s
>>>>>
>>>>> macosx-x64-debug-npch
>>>>> real 5m33.728s
>>>>> user 38m10.311s
>>>>> sys 2m27.587s
>>>>>
>>>>> macosx-x64-slowdebug
>>>>> real 3m54.439s
>>>>> user 25m32.197s
>>>>> sys 2m8.750s
>>>>>
>>>>> macosx-x64-slowdebug-npch
>>>>> real 4m11.987s
>>>>> user 27m59.857s
>>>>> sys 2m18.093s
>>>>>
>>>>>
>>>>> On 2018-10-30 14:00, Erik Joelsson wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On 2018-10-30 13:17, Aleksey Shipilev wrote:
>>>>>>> On 10/30/2018 06:26 PM, Ioi Lam wrote:
>>>>>>>> Is there any advantage of using precompiled headers on Linux?
>>>>>>> I have measured it recently on shenandoah repositories, and
>>>>>>> fastdebug/release build times have not
>>>>>>> improved with or without PCH. Actually, it gets worse when you
>>>>>>> touch a single header that is in the PCH
>>>>>>> list, and you end up recompiling all of HotSpot. I would be
>>>>>>> in favor of disabling it by default.
>>>>>> I just did a measurement on my local workstation (2x8 cores x2 ht
>>>>>> Ubuntu 18.04 using Oracle devkit GCC 7.3.0). I ran "time make
>>>>>> hotspot" with clean build directories.
>>>>>>
>>>>>> linux-x64:
>>>>>> real 4m6.657s
>>>>>> user 61m23.090s
>>>>>> sys 6m24.477s
>>>>>>
>>>>>> linux-x64-npch
>>>>>> real 3m41.130s
>>>>>> user 66m11.824s
>>>>>> sys 4m19.224s
>>>>>>
>>>>>> linux-x64-debug
>>>>>> real 4m47.117s
>>>>>> user 75m53.740s
>>>>>> sys 8m21.408s
>>>>>>
>>>>>> linux-x64-debug-npch
>>>>>> real 4m42.877s
>>>>>> user 84m30.764s
>>>>>> sys 4m54.666s
>>>>>>
>>>>>> linux-x64-slowdebug
>>>>>> real 3m54.564s
>>>>>> user 44m2.828s
>>>>>> sys 6m22.785s
>>>>>>
>>>>>> linux-x64-slowdebug-npch
>>>>>> real 3m23.092s
>>>>>> user 55m3.142s
>>>>>> sys 4m10.172s
>>>>>>
>>>>>> These numbers support your claim. Wall clock time is actually
>>>>>> increased with PCH enabled, but total user time is decreased.
>>>>>> Does not seem worth it to me.
>>>>>>>> It's on by default and we keep having
>>>>>>>> breakage where someone forgets to add an #include. The latest
>>>>>>>> instance is JDK-8213148.
>>>>>>> Yes, we catch most of these breakages in CI, which tells me
>>>>>>> adding it to jdk-submit would cover
>>>>>>> most of the breakage during pre-integration testing.
>>>>>> jdk-submit is currently running what we call "tier1". We do have
>>>>>> builds of Linux slowdebug with precompiled headers disabled in
>>>>>> tier2. We also build solaris-sparcv9 in tier1 which does not
>>>>>> support precompiled headers at all, so to not be caught in
>>>>>> jdk-submit you would have to be in Linux-specific code. The
>>>>>> example bug does not seem to be that. Mach5/jdk-submit was down
>>>>>> over the weekend and yesterday so my suspicion is the offending
>>>>>> code in this case was never tested.
>>>>>>
>>>>>> That said, given that we get practically no benefit from PCH on
>>>>>> Linux/GCC, we should probably just turn it off by default for
>>>>>> Linux and/or GCC. I think we need to investigate macOS as well here.
>>>>>>
>>>>>> /Erik
>>>>>>> -Aleksey
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
More information about the build-dev
mailing list