RFR: JDK-8186694: JShell: speed-up compilation by reusing compiler instances

Thu Aug 31 09:14:06 UTC 2017

Thanks for the comments so far.

Updated webrev:
http://cr.openjdk.java.net/~jlahoda/8186694/webrev.01/

I've moved the class to the javac.api package as suggested, and did more 
changes as noted below.

One of the additional changes is a Scope related fix to TypeEnter and 
corresponding ScopeClassHeaderTest.java test, which is mostly 
independent to the rest of this patch, and I can split that out. (Some 
of the fixes in this webrev caused this bug was revealed.)

On 30.8.2017 12:43, Maurizio Cimadamore wrote:
> Good work, some comments:
>
> * in the combo test API, who is calling returnTask()?

Ooops. Fixed (when a new task is to be issued, the previous one is closed).

>
> * would it make sense to use try with resources for JavacPoolTask and
> jshell AnalyzeTask, to minimize chances of errors?

Makes sense, done.

>
> * not sure about the getTask/returnTask handshake - the logic seems to
> work in this way:
>
> - when a new task is requested, you check the caches, and if a context
> is there (with given options) you return it, otherwise you create a
> fresh one
> - when a task is returned, if it's not polluted, you add it to the
> caches. Then you need to trim the caches to make sure there's room for
> the new context

Yes, that's the intent.

>
> We badly miss a circular queue here (not a fault of this patch of course
> :-)).
>
> If we could reduce the state to the options2Context map, then we could
> probably use a CHM and get rid of synchronized block.
>
> Even w/ synchronized map, it seems like 'pool' is only used to get at
> the first cached context, so maybe a simple field (which you overwrite)
> might suffice?

I've got rid of the pool field. But not sure if the code can be 
rewritten to CHM - might be safer to simply use a simple synchronized block.

>
> More generally, I tend to view pair of coupled methods such as
> getTask/returnTask as suspicious and error prone (in case somebody
> forgets to call the second method) - so I tend to try to avoid writing
> such APIs unless absolutely necessary - here I'm not 100% on why this is
> necessary... on the one hand I get the desire of knowing when a client
> is done with a task, on the other hand there's nothing stopping the same
> client from keep using same task after returning it, so...

In principle I agree having a pair of methods can lead to errors. But 
the error here is that a Context won't be reused, which just should mean 
things will be somewhat slower, but the results should be OK. I can see 
there are some options, but I am not sure if they are better:
a) when asked for a task, we could simply reuse the last Context, 
without checking if the previous user is finished with that. (This is 
what the tests are doing now.) I think this was OK for tests, and might 
work for JShell as well. But if the previous user would be still using 
the Context, the outcomes are hardly predictable (can lead to surprising 
broken results). So in JShell, each time we would create a new task, we 
would need to ask ourselves if there's another task currently in use. So 
this seems like complicating maintenance.

b) make the tasks only available inside a worker method. Like:
<Z> Z getTask(<getTask params>, Function<JavacTask, Z> worker) {
...
}

the task is only valid while "worker" is running. This is safer, as one 
cannot forget to close the task. (And I rewrote JShell to do this as an 
experiment.) But not sure if this is not too complex as this means 
significant changes to the callers, and is also inconsistent with 
JavaCompiler.getTask.

Jan

>
> Maurizio
>
>
> On 29/08/17 21:11, Jonathan Gibbons wrote:
>> I don't think JavacTaskPool belongs in the javac.util package.
>>
>> Unless there are good reasons for doing otherwise, how about using
>> javac.api instead ... i..e closed to JavacTaskImpl and friend.
>>
>> -- Jon
>>
>>
>> On 08/29/2017 12:58 PM, Jan Lahoda wrote:
>>> Hello,
>>>
>>> When starting jshell like this (where "/tmp/script" is a file which
>>> has just "/exit"):
>>> $ jshell PRINTING --execution local /tmp/script
>>>
>>> It currently takes approx. 5-6s on my laptop. PRINTING will add quite
>>> a few startup snippets. Each of these snippets is processed
>>> separately and needs several instances of javac to be created to be
>>> processed correctly.
>>>
>>> So, the idea in this patch is to reuse the javac instances multiple
>>> (many) times. This avoids initializing the javac instances just to
>>> process a very small ("single line") input.
>>>
>>> This patch is based on the ReusableContext by Maurizio used in tests.
>>> The ReusableContext handling is wrapped in JavacTaskPool, which keeps
>>> the ReusableContext and issues JavacTask possibly based on a reused
>>> Context. The pool is not currently general purpose, but should work
>>> for tests for which ReusableContext worked and for JShell.
>>>
>>> One tricky thing is that in JShell, it is necessary to prune
>>> everything in the "REPL" package when reusing the Context, as it is
>>> necessary to reload the correct/updated classes again from the file
>>> manager (this is achieved by additional cleaning in TaskFactory).
>>>
>>> Overall, with this patch, the time to execute the above command is
>>> approx. 2.5s
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8186694
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~jlahoda/8186694/webrev.00/
>>>
>>> How does this look?
>>>
>>> Thanks,
>>>     Jan
>>
>