Getting a live view of environment variables (Gradle and JDK 9)

Tue May 16 19:56:26 UTC 2017

Hi,

Another example is not having your problem is Jenkins CI: It maintains a map of environment variables throughout the build. And you have to use that a a state container throughout the build. I have never seen a plug-in not using that, because it's fundamental to the whole build server.

If you don't have that, then you have a fundamental API design problem! I don't think Java should take care of that. Backwards compatibility for some broken plugins is a no-go. You have to communicate that to developers. I'd run the build in a security manager and whenever somebody calls getenv() print a warning and in later version fail completely. Elasticsearch is doing similar stuff for their plugins.

Uwe

Am 16. Mai 2017 20:11:10 MESZ schrieb "Cédric Champeau" <cedric.champeau at gmail.com>:
>Hi Uwe,
>
>I already explained multiple times why we need this. Short answer:
>because
>we must. Slightly longer answer: because the build environment, the
>daemon,
>has to reflect the environment from the CLI (or the IDE, in case we
>call
>using the tooling API), at the moment the build is executed. Why?
>Because a
>user wouldn't understand that a script which does:
>
>println System.getenv('MY_VAR')
>
>doesn't print "foo" after doing:
>
>MY_VAR=foo gradle printVar
>
>(that's a silly example, but the simplest we can come with). Not
>everything
>runs in a separate process: there are things that execute in the daemon
>itself. A lot of things (starting from build scripts). And yes, we can
>provide a different API to get environment variables, but:
>
>1. it's a breaking change
>2. there are lots of plugins which use libraries, which do NOT depend
>on
>the Gradle API, so that API wouldn't be available
>
>I'm honestly starting to get tired of explaining again and again WHY we
>need this. If it was easy, we would have done it differently for years.
>We
>worked around a bug in the JDK, which doesn't return the true
>environment
>variables but the ones snapshotted when the process started. Now in JDK
>9
>we cannot workaround anymore, which either just kills Gradle
>performance or
>forces us to write even nastier code with bytecode manipulating agents
>to
>replace what `System.getenv` does.
>
>2017-05-16 19:46 GMT+02:00 Uwe Schindler <uschindler at apache.org>:
>
>> Hi,
>>
>> I still don't understand why you need to change the environment
>variables
>> of the actual process. I was talking with Rémi about that last week,
>and
>> it's not obvious to me why you need that. Sure, it is easier to
>change the
>> environment of the actual process and then spawn a child process for
>some
>> non-java build tool like GCC or even another standalone java program
>with
>> option fork=true. But: Why not use the ProcessBuilder API when
>spawning
>> those childs? So you just add an "official" build API inside Gradle
>(an
>> official one, documented that allows Gradle plugins to modify the
>> environment variables for the current build running). If you missed
>to add
>> such an API from the beginning (IMHO its essential for a build system
>like
>> Gradle), then you now have to tell your users: "Sorry we did
>something
>> wrong and all our (bad) hacks to allow you to change environment
>variables
>> are no longer working in the future, so please change your code. We
>are so
>> sorry!"
>>
>> If some plugin is not using that new API, then it's not your problem.
>You
>> document that you *have* to use the Environment API, because plugins
>cannot
>> rely on the fact that another plugin or maybe another build running
>at a
>> later time is changing the Gradle process environment.
>>
>> At some point you have to break backwards compatibility and tell
>users:
>> Please update your code, otherwise plugin won't work anymore with
>Gradle
>> version x.y (the one that's compatible to Java 9).
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> uschindler at apache.org
>> ASF Member, Apache Lucene PMC / Committer
>> Bremen, Germany
>> http://lucene.apache.org/
>>
>> > -----Original Message-----
>> > From: core-libs-dev [mailto:core-libs-dev-bounces at openjdk.java.net]
>On
>> > Behalf Of Cédric Champeau
>> > Sent: Thursday, May 11, 2017 9:02 AM
>> > To: Martin Buchholz <martinrb at google.com>
>> > Cc: core-libs-dev <core-libs-dev at openjdk.java.net>
>> > Subject: Re: Getting a live view of environment variables (Gradle
>and
>> JDK 9)
>> >
>> > Thanks for the answers, folks, but I think they are kind of missing
>the
>> > point. The fact is that environment variables *are* mutable. Java
>happens
>> > to return an immutable view of the environment variables when the
>VM was
>> > started, which is not the reality. We cannot trust what
>`System.geteenv`
>> > returns. The Javadocs say "current system environment" but it's
>just not
>> > true. The method should be called `getEnvWhenTheVMWasStarted`.
>> >
>> > We also have the problem that the environment of the client and the
>> > daemon
>> > differ over time, as I explained in the previous email. And it is
>pretty
>> > easy to understand that _when the build runs_, we need to get the
>> > environment variables of the _client_, not the daemon. Imagine
>something
>> > as
>> > simple as this:
>> >
>> > String toolPath = System.getenv('SOMETOOL_HOME')
>> >
>> > and imagine that this code runs in the daemon. When the daemon is
>> started,
>> > there's absolutely no guarantee that this variable is going to
>exist.
>> > However, we know that when we're going to execute the build, this
>> variable
>> > *has* to be defined. Obviously, it's going to be done from the CLI:
>> >
>> > $ export SOMETOOL_HOME = ...
>> > $ ./gradlew run ---> connects to the daemon, passes environment
>> variables,
>> > mutates those of the daemon, which then executes the build
>> >
>> > Similarly, from a *single* CLI/terminal, it's very common to change
>the
>> > values of environment variables (because, hey, they are mutable!):
>> >
>> > $ export SOMETOOL_HOME = another path I want to try out
>> > $ ./gradlew run --> ... must update env vars again
>> >
>> > So, to work around the problem that Java doesn't provide a way to
>mutate
>> > the current environment variables, we perform a JNI call to do it.
>*Then*
>> > we need to mutate the "immutable view" that Java provides, or all
>Java
>> code
>> > is going to get wrong information, and we would never succeed. Note
>that
>> > having to resort on JNI to mutate the environment is not the
>problem. We
>> > can live with that. Once can think it's a bad idea to allow
>mutation, in
>> > practice, it is necessary, but I reckon it could be a bad idea to
>have an
>> > API for this. The *real* issue is that `System.getenv` does *not*
>return
>> > the real values, because it's an immutable view of what existed at
>the VM
>> > startup.
>> >
>> > Note that it's not recommended to mutate environment variables in a
>> > multi-threaded environment. But in practice, you can assimilate
>what we
>> do
>> > to the VM startup: we get environment variables, mutate them,
>_then_ the
>> > build runs (and only at that moment we would effectively be
>> > multi-threaded). We can live with potential issues of mutating the
>> > environment: the benefits outperforms the drawbacks by orders of
>> > magnitude.
>> > When the daemon is activated, we're not talking about 10% faster
>builds.
>> > They can effectively be 3, 5 or 10x faster!
>> >
>> > Now, back to the problem with JDK 9:
>> >
>> > - first, the implementation has changed. But we could adapt our
>code.
>> > - second, calling `setAccessible` is not allowed anymore. And this
>is a
>> > MAJOR pita. The problem is that we're trying to access the java
>base
>> > module, and reflection on that module is not allowed anymore. There
>are
>> no
>> > good solutions for this: we could write a JVM agent, like you
>suggested,
>> > that rewrites bytecode. But since we're trying to rewrite a core
>class,
>> it
>> > would have to be found on the invocation of `java` command itself,
>and
>> > cannot be dynamically loaded. In addition, we would have to add a
>flag to
>> > open core classes to rewrite. There are multiple problems with this
>> > approach: first, we don't know on which environment we run before
>we're
>> > started. Gradle can run on JDK 7, 8, 9, ... and we cannot have a
>startup
>> > script which tries to infer the version from whatever it finds,
>this is
>> not
>> > realistic and would add unacceptable latency on the client side
>(plus, a
>> > lot of headaches to support the various environments we support:
>Linux,
>> > Mac, Windows, ...). Second, we may not have a hand on the CLI at
>all. For
>> > example, if the build runs in Travis CI, Amazon, or whatever
>cloudish
>> thing
>> > that spawns its own containers, we cannot tweak the VM arguments.
>We just
>> > have to use whatever is there. Last, rewriting bytecode has a cost,
>that
>> > you pay at startup.
>> >
>> > Again, what we need is that Java just returns the live view of the
>> > environment variables. Allowing *mutation* is not necessary,
>technically
>> > speaking, because we can rely on JNI. However, I reckon that
>returning an
>> > immutable view is done for performance reasons. That's why adding a
>way
>> > to
>> > mutate "the view" would be an acceptable thing I think. No
>reflection, no
>> > bytecode rewriting, just give a way to mutate the map that
>> > `ProcessEnvironment` retains. The advantage of this is that you
>would not
>> > effectively mutate the environment variables, so you, on the JVM
>side,
>> > would be safe. What you would mutate is the view you are retaining.
>> > Alternatively, provide us with an API to "refresh" the view. Like,
>> > `System.getenv(true)`. The benefit would be that you would only
>have to
>> get
>> > the new view of environment variables if the user asks for it. And,
>> > subsequent callers would get the refreshed view, so people using
>> `gettenv`
>> > in their code would be up-to-date.
>> >
>> > I'm really concerned that this problem is not taken seriously. It's
>a
>> major
>> > performance problem for the most widely used build tool out there
>> > (considering all users, from native to Java and Android). Just
>saying
>> > "you're doing something nasty" is not a valid answer: we have to do
>it,
>> and
>> > do it for good reasons.
>> >
>> >
>> > 2017-05-11 4:50 GMT+02:00 Martin Buchholz <martinrb at google.com>:
>> >
>> > > Since you're OK with doing brain surgery on the JDK and you
>control the
>> > > entire process, consider controlling your daemon with a bytecode
>> rewriting
>> > > agent (e.g. byteman) that changes the definition of System.getenv
>etc.
>> > >
>> > > (Whose side are you on Martin?! ...  I confess I also wish for a
>faster
>> > > gradle ...)
>> > >
>>
>>