Getting a live view of environment variables (Gradle and JDK 9)

Tue May 16 18:11:10 UTC 2017

Hi Uwe,

I already explained multiple times why we need this. Short answer: because
we must. Slightly longer answer: because the build environment, the daemon,
has to reflect the environment from the CLI (or the IDE, in case we call
using the tooling API), at the moment the build is executed. Why? Because a
user wouldn't understand that a script which does:

println System.getenv('MY_VAR')

doesn't print "foo" after doing:

MY_VAR=foo gradle printVar

(that's a silly example, but the simplest we can come with). Not everything
runs in a separate process: there are things that execute in the daemon
itself. A lot of things (starting from build scripts). And yes, we can
provide a different API to get environment variables, but:

1. it's a breaking change
2. there are lots of plugins which use libraries, which do NOT depend on
the Gradle API, so that API wouldn't be available

I'm honestly starting to get tired of explaining again and again WHY we
need this. If it was easy, we would have done it differently for years. We
worked around a bug in the JDK, which doesn't return the true environment
variables but the ones snapshotted when the process started. Now in JDK 9
we cannot workaround anymore, which either just kills Gradle performance or
forces us to write even nastier code with bytecode manipulating agents to
replace what `System.getenv` does.

2017-05-16 19:46 GMT+02:00 Uwe Schindler <uschindler at apache.org>:

> Hi,
>
> I still don't understand why you need to change the environment variables
> of the actual process. I was talking with Rémi about that last week, and
> it's not obvious to me why you need that. Sure, it is easier to change the
> environment of the actual process and then spawn a child process for some
> non-java build tool like GCC or even another standalone java program with
> option fork=true. But: Why not use the ProcessBuilder API when spawning
> those childs? So you just add an "official" build API inside Gradle (an
> official one, documented that allows Gradle plugins to modify the
> environment variables for the current build running). If you missed to add
> such an API from the beginning (IMHO its essential for a build system like
> Gradle), then you now have to tell your users: "Sorry we did something
> wrong and all our (bad) hacks to allow you to change environment variables
> are no longer working in the future, so please change your code. We are so
> sorry!"
>
> If some plugin is not using that new API, then it's not your problem. You
> document that you *have* to use the Environment API, because plugins cannot
> rely on the fact that another plugin or maybe another build running at a
> later time is changing the Gradle process environment.
>
> At some point you have to break backwards compatibility and tell users:
> Please update your code, otherwise plugin won't work anymore with Gradle
> version x.y (the one that's compatible to Java 9).
>
> Uwe
>
> -----
> Uwe Schindler
> uschindler at apache.org
> ASF Member, Apache Lucene PMC / Committer
> Bremen, Germany
> http://lucene.apache.org/
>
> > -----Original Message-----
> > From: core-libs-dev [mailto:core-libs-dev-bounces at openjdk.java.net] On
> > Behalf Of Cédric Champeau
> > Sent: Thursday, May 11, 2017 9:02 AM
> > To: Martin Buchholz <martinrb at google.com>
> > Cc: core-libs-dev <core-libs-dev at openjdk.java.net>
> > Subject: Re: Getting a live view of environment variables (Gradle and
> JDK 9)
> >
> > Thanks for the answers, folks, but I think they are kind of missing the
> > point. The fact is that environment variables *are* mutable. Java happens
> > to return an immutable view of the environment variables when the VM was
> > started, which is not the reality. We cannot trust what `System.geteenv`
> > returns. The Javadocs say "current system environment" but it's just not
> > true. The method should be called `getEnvWhenTheVMWasStarted`.
> >
> > We also have the problem that the environment of the client and the
> > daemon
> > differ over time, as I explained in the previous email. And it is pretty
> > easy to understand that _when the build runs_, we need to get the
> > environment variables of the _client_, not the daemon. Imagine something
> > as
> > simple as this:
> >
> > String toolPath = System.getenv('SOMETOOL_HOME')
> >
> > and imagine that this code runs in the daemon. When the daemon is
> started,
> > there's absolutely no guarantee that this variable is going to exist.
> > However, we know that when we're going to execute the build, this
> variable
> > *has* to be defined. Obviously, it's going to be done from the CLI:
> >
> > $ export SOMETOOL_HOME = ...
> > $ ./gradlew run ---> connects to the daemon, passes environment
> variables,
> > mutates those of the daemon, which then executes the build
> >
> > Similarly, from a *single* CLI/terminal, it's very common to change the
> > values of environment variables (because, hey, they are mutable!):
> >
> > $ export SOMETOOL_HOME = another path I want to try out
> > $ ./gradlew run --> ... must update env vars again
> >
> > So, to work around the problem that Java doesn't provide a way to mutate
> > the current environment variables, we perform a JNI call to do it. *Then*
> > we need to mutate the "immutable view" that Java provides, or all Java
> code
> > is going to get wrong information, and we would never succeed. Note that
> > having to resort on JNI to mutate the environment is not the problem. We
> > can live with that. Once can think it's a bad idea to allow mutation, in
> > practice, it is necessary, but I reckon it could be a bad idea to have an
> > API for this. The *real* issue is that `System.getenv` does *not* return
> > the real values, because it's an immutable view of what existed at the VM
> > startup.
> >
> > Note that it's not recommended to mutate environment variables in a
> > multi-threaded environment. But in practice, you can assimilate what we
> do
> > to the VM startup: we get environment variables, mutate them, _then_ the
> > build runs (and only at that moment we would effectively be
> > multi-threaded). We can live with potential issues of mutating the
> > environment: the benefits outperforms the drawbacks by orders of
> > magnitude.
> > When the daemon is activated, we're not talking about 10% faster builds.
> > They can effectively be 3, 5 or 10x faster!
> >
> > Now, back to the problem with JDK 9:
> >
> > - first, the implementation has changed. But we could adapt our code.
> > - second, calling `setAccessible` is not allowed anymore. And this is a
> > MAJOR pita. The problem is that we're trying to access the java base
> > module, and reflection on that module is not allowed anymore. There are
> no
> > good solutions for this: we could write a JVM agent, like you suggested,
> > that rewrites bytecode. But since we're trying to rewrite a core class,
> it
> > would have to be found on the invocation of `java` command itself, and
> > cannot be dynamically loaded. In addition, we would have to add a flag to
> > open core classes to rewrite. There are multiple problems with this
> > approach: first, we don't know on which environment we run before we're
> > started. Gradle can run on JDK 7, 8, 9, ... and we cannot have a startup
> > script which tries to infer the version from whatever it finds, this is
> not
> > realistic and would add unacceptable latency on the client side (plus, a
> > lot of headaches to support the various environments we support: Linux,
> > Mac, Windows, ...). Second, we may not have a hand on the CLI at all. For
> > example, if the build runs in Travis CI, Amazon, or whatever cloudish
> thing
> > that spawns its own containers, we cannot tweak the VM arguments. We just
> > have to use whatever is there. Last, rewriting bytecode has a cost, that
> > you pay at startup.
> >
> > Again, what we need is that Java just returns the live view of the
> > environment variables. Allowing *mutation* is not necessary, technically
> > speaking, because we can rely on JNI. However, I reckon that returning an
> > immutable view is done for performance reasons. That's why adding a way
> > to
> > mutate "the view" would be an acceptable thing I think. No reflection, no
> > bytecode rewriting, just give a way to mutate the map that
> > `ProcessEnvironment` retains. The advantage of this is that you would not
> > effectively mutate the environment variables, so you, on the JVM side,
> > would be safe. What you would mutate is the view you are retaining.
> > Alternatively, provide us with an API to "refresh" the view. Like,
> > `System.getenv(true)`. The benefit would be that you would only have to
> get
> > the new view of environment variables if the user asks for it. And,
> > subsequent callers would get the refreshed view, so people using
> `gettenv`
> > in their code would be up-to-date.
> >
> > I'm really concerned that this problem is not taken seriously. It's a
> major
> > performance problem for the most widely used build tool out there
> > (considering all users, from native to Java and Android). Just saying
> > "you're doing something nasty" is not a valid answer: we have to do it,
> and
> > do it for good reasons.
> >
> >
> > 2017-05-11 4:50 GMT+02:00 Martin Buchholz <martinrb at google.com>:
> >
> > > Since you're OK with doing brain surgery on the JDK and you control the
> > > entire process, consider controlling your daemon with a bytecode
> rewriting
> > > agent (e.g. byteman) that changes the definition of System.getenv etc.
> > >
> > > (Whose side are you on Martin?! ...  I confess I also wish for a faster
> > > gradle ...)
> > >
>
>