ScopedValue.runWhere not returning scope

Fri Jun 7 13:03:09 UTC 2024

> I read your initial mail about instrumentation use-cases, you may need to
define that a bit more as it's not clear if this is about bytecode
instrumentation to add hooks for telemetry or something else.  In your
Proxy example it looks like the user's runnable operation is called from
Proxy.run. It's not immediately clear from that example why it's not using
ScopedValue.runWhere where ScopedValue's value is the map.

This is not bytecode instrumentation. Let me try to be more specific.
Different libraries give users options to run some code before and after
some processing is happening. They do that by giving some interceptor
interfaces that the user can implement. Such solutions are available in
Apache CXF [1], Apache Commons for Http client [2], Datasource Proxy [3],
Jersey [4][5], R2DbcProxy [6] etc.

Let me provide the same example again but using HTTP terms.

Interceptor
```
interface Interceptor {

  /**
   * Called before actual method execution
   * @param map mutable request to pass between methods
   */
  void before(Map<Object, Object> map);

  /**
   * Called after actual method execution
   * @param exception potential exception - can be null
   * @param map mutable request to pass between methods
   */
  void after(Exception exception, Map<Object, Object> map) throws Exception;

}
```

ThreadLocalInterceptor
```
class ThreadLocalInterceptor implements Interceptor {

  static final ThreadLocal<String> threadLocal = new ThreadLocal<>();

  @Override
  public void before(Map<Object, Object> map) {
    threadLocal.set("foo");
    Closeable scope = threadLocal::remove;
    map.put("scope", scope);
  }

  @Override
  public void after(Exception exception, Map<Object, Object> map) throws
Exception {
    Closeable scope = (Closeable) map.get("scope");
    scope.close();
  }
}
```

RestClient
```
class RestClient {

  private final Interceptor interceptor;

  RestClient(Interceptor interceptor) {
    this.interceptor = interceptor;
  }

  void makeAnHttpCall()  throws Exception {
    Map<Object, Object> request = new HashMap<>();
    Exception exception = null;
    interceptor.before(request); // E.g. enrich MDC with some values
    try {
      // Log sth out - MDC could be enriched with values with the
interceptor
      // in case of Distributed Tracing MDC get enriched with Span and
Trace Ids.
      // Then they are available in the logs. For simplicity, I'm using
System.out
      System.out.println("HELLO [" +
ThreadLocalInterceptor.threadLocal.get() + "]"); // MDC would be enriched
and logs would contain that set value
      httpCall();
    } catch (Exception ex) {
      exception = ex;
      throw ex;
    } finally {
      interceptor.after(exception, request); // e.g. clear MDC
    }
  }

  void httpCall() {
    // Make an HTTP call
  }

}
```

App
```
public class App {

  public static void main(String[] args) throws Exception {
    new RestClient(new ThreadLocalInterceptor()).makeAnHttpCall();
    System.out.println("Goodbye [" +
ThreadLocalInterceptor.threadLocal.get() + "]");
  }
}
```

> It's not immediately clear from that example why it's not
ScopedValue.runWhere where ScopedValue's value is the map.

What if the code doesn't want to have direct dependencies to the latest JDK
but still needs to support older versions? Typically library maintainers
would extract an interface and have a separate module that would be running
e.g. JDK21 with an implementation that would use ScopedValues. That's what
I meant by the SPI approach.

> I don't understand this. Why would it be a shame? The differences between
scoped values and thread-local variables are that scoped values are one-way
from caller to callee and strictly nested. The use pattern that you are
presenting loses those properties that make scoped values distinct. If you
need an unstructured context object you've got a solution already,
thread-local variables. Why do you want to use scoped values?

Because I want to leverage the performance benefit. Maybe I misinterpreted
the use case for Scoped Values but my assumption was that when using Scoped
Values I will get better performance in comparison to the pure Thread Local
approach. In e.g. distributed tracing or security you often need to access
current span or current principal and those are retrieved from Thread
Local. Each Thread Local call is costly.

> Can you explain that a little more? Why can't isolated functionality be
added without separate before and after logic?

I mentioned that above but I'll try to rephrase it. There are already
solutions such as Apache CXF [1], Apache Commons for Http client [2],
Datasource Proxy [3], Jersey [4][5], R2DbcProxy [6] that work with such
interceptors. All that code would have to be rewritten. I'm not saying that
this is not feasible but let's be realistic. Also if all of those libraries
wouldn't rewrite their code then distributed tracing would be broken. If
one service is not working then instead of one correlation you will get two
or more.

> Out of curiosity, what do you think should happen, specification wise, if
scoped value bindings are closed out of order?

That's a fantastic question and I of course had such issues before. If they
close out of order then scope hierarchy breaks and you might have memory
leaks. That's because scopes know about their parents and wrong parents
might be set, and some will never be closed. In Brave [7] you have
misalignment assertion errors (only when assertions are turned on) [7] and
in OpenTelemetry they are not closing such misaligned scopes [8]. From the
spec perspective honestly I don't know what the best solution is. I would
have to invest time in doing a proper pros and cons analysis. Of course
when someone is manually taking care of opening and closing scopes they
must close the scope properly. It's the same case as with all of the
Closeables and Autocloseables.

> The point of any structured programming construct is to restrict
programmers. Any program may be written with conditional jumps, but we
don't do that any more because it makes programs hard to reason about.
Anything that you can do with a scoped value you can also do with a
thread-local variable, albeit a little less efficiently. The added
efficiency of scoped values is a direct result of the restrictions that we
place on the way they may be used.

I mean if there's no distinction in terms of performance between Scoped
Values and Thread Locals then we should just need to ensure that there's an
interop between Thread Locals and Scoped Values and current implementations
could stay on Thread Locals. But I thought that the very idea is to use
Scoped Values preferably with Virtual Threads to have best performance in
comparison to using Thread Locals. Maybe I made a mistake in this
assumption?

[1] - https://cxf.apache.org/docs/interceptors.html
[2] -
https://hc.apache.org/httpcomponents-core-4.4.x/current/httpcore/apidocs/org/apache/http/HttpRequestInterceptor.html
(there's one for response too)
[3] -
https://github.com/jdbc-observations/datasource-proxy/blob/datasource-proxy-1.10/src/main/java/net/ttddyy/dsproxy/listener/QueryExecutionListener.java
[4] -
https://github.com/eclipse-ee4j/jersey/blob/3.1.7/core-server/src/main/java/org/glassfish/jersey/server/monitoring/ApplicationEventListener.java
[5] -
https://github.com/eclipse-ee4j/jersey/blob/3.1.7/core-server/src/main/java/org/glassfish/jersey/server/monitoring/RequestEventListener.java
[6] -
https://github.com/r2dbc/r2dbc-proxy/blob/v1.1.5.RELEASE/src/main/java/io/r2dbc/proxy/listener/ProxyExecutionListener.java
[7] -
https://github.com/openzipkin/brave/blob/6.0.3/brave/src/main/java/brave/propagation/ThreadLocalSpan.java#L155
[8] -
https://github.com/open-telemetry/opentelemetry-java/blob/v1.39.0/context/src/main/java/io/opentelemetry/context/ThreadLocalContextStorage.java#L48

Pozdrawiam / Best regards,
Marcin Grzejszczak

https://marcin.grzejszczak.pl
https://toomuchcoding.com

pt., 7 cze 2024 o 09:17 Andrew Haley <aph-open at littlepinkcloud.com>
napisał(a):

> Hi,
>
> On 6/6/24 21:53, Marcin Grzejszczak wrote:
>
>  >  > If you consider an execution of a program as a downwards-growing
>  >  > stack, a scoped value is bound to some value at one point in the
>  >  > stack, and the bound value is visible to all frames below that
>  >  > point. That is all: there's no need to consider scopes opening
>  >  > and closing.
>  >
>  > In theory that's true. But if we take into consideration frameworks
>  > like Brave [1], Micrometer [2], OpenTelemetry [3] , Apache Camel
>  > [4], Apache Skywalking [5] etc. (so those that work with distributed
>  > tracing) that theoretical viewpoint does not hold. It would be a
>  > gigantic shame if they would need to always rely on ThreadLocals
>  > instead of being able to use the ScopedValues mechanism.
>
> I don't understand this. Why would it be a shame? The differences
> between scoped values and thread-local variables are that scoped
> values are one-way from caller to callee and strictly nested. The use
> pattern that you are presenting loses those properties that make
> scoped values distinct. If you need an unstructured context object
> you've got a solution already, thread-local variables. Why do you want
> to use scoped values?
>
>  >  > It's true that this decision makes scoped values less versatile
>  >  > than thread-local variables, but on the other hand the hard
>  >  > guarantees it provides make programs easier to reason about. On
>  >  > balance, we consider the latter more important than the former.
>  >
>  > I completely agree with that however for library instrumentors this
>  > separation is extremely important. Also that would cancel out all
>  > libraries that want to use scoped values and work in an SPI model
>  > where users can add functionalities in an isolated fashion (separate
>  > before and after logic).
>
> Can you explain that a little more? Why can't isolated functionality
> be added without separate before and after logic?
>
>  > It would require all of the frameworks that I mentioned to
>  > completely rewrite their internals so that the user's code execution
>  > would be wrapped in a e.g. `ScopedValue.runWhere`.
>
> That's true.
>
>  >  > We did consider (and even implement) an API with open() and
>  >  > close() methods, but we discarded it in order to make the use you
>  >  > describe impossible
>  >
>  > I understand that with great power comes great responsibility and
>  > you want to guard users from cases like "opening and not closing".
>
> Exactly, and it makes a difference to code quality when we know that a
> scoped value cannot change during the lifetime of a method, even if
> other methods are called or if a (virtual) thread is suspended and
> resumed. A scoped value is effectively constant for the lifetime of a
> method.
>
> Out of curiosity, what do you think should happen, specification wise,
> if scoped value bindings are closed out of order?
>
>  > I fully understand that because I've been working in the domain of
>  > distributed tracing for a decade now and I've seen that particular
>  > scenario happen every now and then. It's not a common situation
>  > though. Since you've already implemented it once then you did see
>  > reasons for doing that. Is there a chance to come back to that
>  > discussion? What would need to happen to reopen that conversation
>  > once again?
>
> Given that it would lose the most fundamental distinction between
> thread-local variables and scoped values, I'd say that it'd take some
> kind of miracle.
>
> The point of any structured programming construct is to restrict
> programmers. Any program may be written with conditional jumps, but we
> don't do that any more because it makes programs hard to reason about.
> Anything that you can do with a scoped value you can also do with a
> thread-local variable, albeit a little less efficiently. The added
> efficiency of scoped values is a direct result of the restrictions
> that we place on the way they may be used.
>
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/loom-dev/attachments/20240607/143faa29/attachment-0001.htm>