RFR: JDK-8186694: JShell: speed-up compilation by reusing compiler instances

Fri Sep 1 12:21:50 UTC 2017

Hi,

Thanks. I've updated the patch so that the pool (and both JShell and 
tests) take a worker and the tasks are only valid while the worker is 
running:

http://cr.openjdk.java.net/~jlahoda/8186694/webrev.02/

What do you think?

Thanks,
     Jan

On 31.8.2017 16:17, Maurizio Cimadamore wrote:
> Looks better, thanks.
>
> On the issue of getTask/returnTask, I think that either a
> lambda-accepting method or a TWR approach would be good in enforcing the
> 'right' idiom. Another alternative would be to return a task which
> overrides the parse()/analze()... methods so that the returnTask logic
> is called at the end. But that seems weaker than the lambda based approach.
>
> I'm not too worried w.r.t. inconsistencies with the JavaCompiler.getTask
> because (i) even having a pair of methods is inconsistent with that API
> and (ii) JavacPoolTask represents a pool of compilation tasks, so I'm
> not sure it has to be 100% consistent with what's done in JavaCompiler.
>
> Updating the clients would probably be the biggest chunk of work, so I
> leave with you as to whether this should be addressed. My sense is that
> some of the jshell clients have already changes because of this (with
> try/finally, so adding a method call with a lambda would probably be
> rather similar). But combo tests would also need same treatment (e.g.
> ComboInstance::newCompilationTask should probably become a
> lambda-accepting ComboInstance::withCompilatonTask).
>
> Maurizio
>
>
> On 31/08/17 10:14, Jan Lahoda wrote:
>> Thanks for the comments so far.
>>
>> Updated webrev:
>> http://cr.openjdk.java.net/~jlahoda/8186694/webrev.01/
>>
>> I've moved the class to the javac.api package as suggested, and did
>> more changes as noted below.
>>
>> One of the additional changes is a Scope related fix to TypeEnter and
>> corresponding ScopeClassHeaderTest.java test, which is mostly
>> independent to the rest of this patch, and I can split that out. (Some
>> of the fixes in this webrev caused this bug was revealed.)
>>
>> On 30.8.2017 12:43, Maurizio Cimadamore wrote:
>>> Good work, some comments:
>>>
>>> * in the combo test API, who is calling returnTask()?
>>
>> Ooops. Fixed (when a new task is to be issued, the previous one is
>> closed).
>>
>>>
>>> * would it make sense to use try with resources for JavacPoolTask and
>>> jshell AnalyzeTask, to minimize chances of errors?
>>
>> Makes sense, done.
>>
>>>
>>> * not sure about the getTask/returnTask handshake - the logic seems to
>>> work in this way:
>>>
>>> - when a new task is requested, you check the caches, and if a context
>>> is there (with given options) you return it, otherwise you create a
>>> fresh one
>>> - when a task is returned, if it's not polluted, you add it to the
>>> caches. Then you need to trim the caches to make sure there's room for
>>> the new context
>>
>> Yes, that's the intent.
>>
>>>
>>> We badly miss a circular queue here (not a fault of this patch of course
>>> :-)).
>>>
>>> If we could reduce the state to the options2Context map, then we could
>>> probably use a CHM and get rid of synchronized block.
>>>
>>> Even w/ synchronized map, it seems like 'pool' is only used to get at
>>> the first cached context, so maybe a simple field (which you overwrite)
>>> might suffice?
>>
>> I've got rid of the pool field. But not sure if the code can be
>> rewritten to CHM - might be safer to simply use a simple synchronized
>> block.
>>
>>>
>>> More generally, I tend to view pair of coupled methods such as
>>> getTask/returnTask as suspicious and error prone (in case somebody
>>> forgets to call the second method) - so I tend to try to avoid writing
>>> such APIs unless absolutely necessary - here I'm not 100% on why this is
>>> necessary... on the one hand I get the desire of knowing when a client
>>> is done with a task, on the other hand there's nothing stopping the same
>>> client from keep using same task after returning it, so...
>>
>> In principle I agree having a pair of methods can lead to errors. But
>> the error here is that a Context won't be reused, which just should
>> mean things will be somewhat slower, but the results should be OK. I
>> can see there are some options, but I am not sure if they are better:
>> a) when asked for a task, we could simply reuse the last Context,
>> without checking if the previous user is finished with that. (This is
>> what the tests are doing now.) I think this was OK for tests, and
>> might work for JShell as well. But if the previous user would be still
>> using the Context, the outcomes are hardly predictable (can lead to
>> surprising broken results). So in JShell, each time we would create a
>> new task, we would need to ask ourselves if there's another task
>> currently in use. So this seems like complicating maintenance.
>>
>> b) make the tasks only available inside a worker method. Like:
>> <Z> Z getTask(<getTask params>, Function<JavacTask, Z> worker) {
>> ...
>> }
>>
>> the task is only valid while "worker" is running. This is safer, as
>> one cannot forget to close the task. (And I rewrote JShell to do this
>> as an experiment.) But not sure if this is not too complex as this
>> means significant changes to the callers, and is also inconsistent
>> with JavaCompiler.getTask.
>>
>> Jan
>>
>>>
>>> Maurizio
>>>
>>>
>>> On 29/08/17 21:11, Jonathan Gibbons wrote:
>>>> I don't think JavacTaskPool belongs in the javac.util package.
>>>>
>>>> Unless there are good reasons for doing otherwise, how about using
>>>> javac.api instead ... i..e closed to JavacTaskImpl and friend.
>>>>
>>>> -- Jon
>>>>
>>>>
>>>> On 08/29/2017 12:58 PM, Jan Lahoda wrote:
>>>>> Hello,
>>>>>
>>>>> When starting jshell like this (where "/tmp/script" is a file which
>>>>> has just "/exit"):
>>>>> $ jshell PRINTING --execution local /tmp/script
>>>>>
>>>>> It currently takes approx. 5-6s on my laptop. PRINTING will add quite
>>>>> a few startup snippets. Each of these snippets is processed
>>>>> separately and needs several instances of javac to be created to be
>>>>> processed correctly.
>>>>>
>>>>> So, the idea in this patch is to reuse the javac instances multiple
>>>>> (many) times. This avoids initializing the javac instances just to
>>>>> process a very small ("single line") input.
>>>>>
>>>>> This patch is based on the ReusableContext by Maurizio used in tests.
>>>>> The ReusableContext handling is wrapped in JavacTaskPool, which keeps
>>>>> the ReusableContext and issues JavacTask possibly based on a reused
>>>>> Context. The pool is not currently general purpose, but should work
>>>>> for tests for which ReusableContext worked and for JShell.
>>>>>
>>>>> One tricky thing is that in JShell, it is necessary to prune
>>>>> everything in the "REPL" package when reusing the Context, as it is
>>>>> necessary to reload the correct/updated classes again from the file
>>>>> manager (this is achieved by additional cleaning in TaskFactory).
>>>>>
>>>>> Overall, with this patch, the time to execute the above command is
>>>>> approx. 2.5s
>>>>>
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8186694
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~jlahoda/8186694/webrev.00/
>>>>>
>>>>> How does this look?
>>>>>
>>>>> Thanks,
>>>>>     Jan
>>>>
>>>
>