ScopedValue.runWhere not returning scope

Wed Jun 19 14:43:19 UTC 2024

> To be clear, I have no control whatsoever over what does or does not
happen with Java. I am only stating my opinion - and trying to help!

I'm really happy about that. We're talking about a complicated problem and
I'm sure we'll reach some compromise whatever that will be ;)

> I do think it is strange that you want to take advantage of a new api,
but you don’t want to do any work to do so??? This wouldn’t be an api
change then, it would be a performance optimization behind the scenes. A
primary purpose of the ScopedValue api addition is the automatic closing of
the ScopeValue for better resource management in a VirtualThread system.
Your requested changes eliminate a (the?) primary purpose of the api
change! So, if you don’t want this, continue with ThreadLocal.

That's a very interesting point.

As a developer I shouldn't care whether ThreadLocal or ScopedVariable is
being used. Let's say that I have an MVC application and would like to have
better performance so I would just configure my framework (e.g. Spring
Boot) to use Virtual Threads. Hopefully we are in agreement here that this
is the intended use case of Virtual Threads - improve performance. Now,
correct me if I'm wrong, but in order to maintain that performance gain
behind the scenes Scoped Values would be used to retrieve the information
about e.g. the current span (for tracing). As a user I don't care what is
happening under the hood. From the point of view of the library creator I
very much do.

If the only purpose of Scoped Values is to allow a value to be there for
the life of a runnable / callable then indeed this whole discussion is
pointless. However we are already asked by our users whether we will
support Scoped Values and I'm pretty sure that their assumption is that it
will work in a transparent way (i.e. again they won't care about whether TL
or SV are used under the hood - they want performance gains).

Assumption that everyone has to do some migration to use this feature is
acceptable, but still if only the scope would be returned by this API, no
changes would be required. I think we constantly go in circles with this
and I still do not understand why is it such a big problem (I understood
the topic of security, but I'm personally not convinced that this is such a
big deal; it's similar to a case if someone didn't close a stream or
something like that. If they don't they will have problems too)

> Still, if the tracing library code is properly structured I would doubt
it would be more than a global search and replace in the library to switch
from ThreadLocal to ScopedValues. You keep stating “rewrite all of our
code” without providing any evidence of why that would be the case.

For Micrometer Observation I need to introduce new API that instead of
allowing to create a Scope and close it I need to create new API that works
more or less like FilterChain and Filters for HTTP (we have an analogous
approach of ObservationHandlers that would need to work with HandlerChain).
In the Micrometer repo only I need to do 60+ changes of the API usage. Of
course I can't break compatibility so I need to leave the old API. Plus I
need to add new ObservationHandler methods and not break compatibility.
That's only for Micrometer Observation. I will need to have similar changes
in Micrometer Tracing plus Brave and OpenTelemetry. Only after all of these
projects are changed will we have to go to all the projects that actually
use us to change their usage to the new one.

I don't have to mention that for all of these changes we need to ensure
that compatibility will not be broken and that things will work with and
without Scoped Values. There also must be a TL and SV interop because not
everyone will use SV only. For me it sounds like a lot of work.

> It is highly doubtful any instrumented projects would need to change. I
looked at how the tracing support is done with Spring and the proxies and
that is certainly wouldn’t be required there.

I'm sorry but in the vast majority of cases it was me doing or co-authoring
the Spring instrumentation (plus dozens of other projects) and what you
wrote is not entirely correct. Micrometer Observation API is being used to
do the instrumentation. You can check an example here [1]. No proxies are
being involved.

> If you wanted no changes to instrumented projects whatsoever, you would
need to use the “check if ScopedValue context exists, if not use context
from ThreadLocal” design as was suggested - which is meaningless overhead
compared to the overhead of tracing itself. And if the instrumented project
is being changed to use VirtualThreads it SHOULD be changed at the same
time to update the tracing library setup code.

So do I understand correctly that you should either use TL or SV ? No
library will migrate directly to use only this or that. There must be some
bridging or an option to use gradual migration.

> I updated the my code to demonstrate the fallback to ThreadLocal when the
top-level ScopedValue context is not available.
> I also updated the code to work when the ScopedValue setup is not
performed. I am pretty sure it works the way most tracing libraries are
interfaced with.

I looked at your sample and have a couple of comments:
- You assume that you know what the entry port to your application is (a
"request" comes in). You can't assume that in a production application. You
might have an MVC app but maybe you have a command line application. We
would have to try to guess all those entry points so that we wrap these
calls in `where(...)`
- Your approach is equivalent to creation of a TL entry that is mutable and
all the scopes are put there inside that entry. I'm not saying this is
wrong but tracing libraries do not work like this. Maybe they need to be
rewritten completely to work like that? I will need to think about this but
this would require major rewrite of Micrometer Tracing, Brave and
Opentelemetry which I highly doubt would happen.
- I understand that the SV will contain the TraceContext. The span from
TraceContext can be shared between threads. How will you guard that span
against concurrent writes?

> I still think you are stuck and looking at the problem the wrong way.

That is possible. The problem I have is that as I see it when SVs become
GA, users will ask us, library creators to have tracing work with SV and it
won't work out of the box unless we tell them to use TL. That might be the
proper answer but I'm trying to find a solution that would be beneficial
for the community.

[1] -
https://github.com/spring-projects/spring-framework/blob/v6.1.9/spring-web/src/main/java/org/springframework/web/filter/ServerHttpObservationFilter.java#L102-L128

Pozdrawiam / Best regards,
Marcin Grzejszczak

https://marcin.grzejszczak.pl
https://toomuchcoding.com

pon., 17 cze 2024 o 16:08 Robert Engels <robaho at icloud.com> napisał(a):

> To be clear, I have no control whatsoever over what does or does not
> happen with Java. I am only stating my opinion - and trying to help!
>
> From the beginning I want to implement everything using Scoped Values but
> I don't want to need to rewrite all my code. Neither as a library
> implementor nor in the instrumented projects.
>
>
> I do think it is strange that you want to take advantage of a new api, but
> you don’t want to do any work to do so??? This wouldn’t be an api change
> then, it would be a performance optimization behind the scenes. A primary
> purpose of the ScopedValue api addition is the automatic closing of the
> ScopeValue for better resource management in a VirtualThread system. *Your
> requested changes eliminate a (the?) primary purpose of the api change!*
> So, if you don’t want this, continue with ThreadLocal.
>
> Still, if the tracing library code is properly structured I would doubt it
> would be more than a global search and replace in the library to switch
> from ThreadLocal to ScopedValues. You keep stating “*rewrite all of our
> code*” without providing any evidence of why that would be the case.
>
> It is highly doubtful any instrumented projects would need to change. I
> looked at how the tracing support is done with Spring and the proxies and
> that is certainly wouldn’t be required there.
>
> If you wanted no changes to instrumented projects whatsoever, you would
> need to use the “check if ScopedValue context exists, if not use context
> from ThreadLocal” design as was suggested - which is meaningless overhead
> compared to the overhead of tracing itself. And if the instrumented project
> is being changed to use VirtualThreads it SHOULD be changed at the same
> time to update the tracing library setup code.
>
> I updated the my code to demonstrate the fallback to ThreadLocal when the
> top-level ScopedValue context is not available.
>
> I also updated the code to work when the ScopedValue setup is not
> performed. I am pretty sure it works the way most tracing libraries are
> interfaced with.
>
> I still think you are stuck and looking at the problem the wrong way.
>
> The code I shared emits all of the data to performed nested traces and
> analysis.
>
> event Event[start=1718639998618, end=1718639998807, description=close span
> handleRequest, span=SpanID 1985, tags [request 943],parent SpanId -1],
> duration 189ms
> event Event[start=1718639998709, end=1718639998709, description=open span
> callServer2, span=SpanID 6021, tags [],parent SpanId 1943], duration 0ms
> event Event[start=1718639998709, end=1718639998807, description=close span
> callServer2, span=SpanID 6021, tags [],parent SpanId 1943], duration 98ms
> event Event[start=1718639998807, end=1718639998807, description=open span
> making jdbc call, span=SpanID 7920, tags [jdbc],parent SpanId 1943],
> duration 0ms
> event Event[start=1718639998807, end=1718639998807, description=call jdbc,
> span=SpanID 7920, tags [jdbc],parent SpanId 1943], duration 0ms
> event Event[start=1718639998617, end=1718639998722, description=close span
> handleRequest, span=SpanID 1872, tags [request 885],parent SpanId -1],
> duration 105ms
> event Event[start=1718639998808, end=1718639998808, description=open span
> making jdbc call, span=SpanID 7921, tags [jdbc],parent SpanId 1715],
> duration 0ms
> event Event[start=1718639998808, end=1718639998808, description=call jdbc,
> span=SpanID 7921, tags [jdbc],parent SpanId 1715], duration 0ms
> event Event[start=1718639998808, end=1718639998808, description=close span
> making jdbc call, span=SpanID 7921, tags [jdbc],parent SpanId 1715],
> duration 0ms
> event Event[start=1718639998597, end=1718639998722, description=close span
> handleRequest, span=SpanID 71, tags [request 35],parent SpanId -1],
> duration 125ms
>
>
>
> On Jun 17, 2024, at 10:19 AM, Marcin Grzejszczak <
> marcin.grzejszczak at gmail.com> wrote:
>
> > Right, because we don’t want to use thread locals!!!. The code I posted
> supports nested spans. If you are using a ThreadLocal.CURRENT_TRACER - you
> only have a single value per thread - so either the spans are chained, or
> there is a stack.
>
> I never said that I want you to use thread locals!!! From the beginning I
> want to implement everything using Scoped Values but I don't want to need
> to rewrite all my code. Neither as a library implementor nor in the
> instrumented projects.
>
> > I guarantee I can implement a full tracing library without using
> ThreadLocal and have it support nested spans, etc. How can I guarantee
> this? Because, you can implement tracing by passing “context” as a
> parameter to every method. If you read the ScopedValue documentation it
> states that the purpose of ScopedValue is to avoid needing to pass these
> method parameters. So given that, you can replace
> callMethod(context,parm1…) with callMethod(parm1,…). A tracing library is
> trivially implemented with a Context parameter.
>
> I fully believe you but I don't want to need to rewrite everything!
>
> > There is absolutely no need to add open/close semantics to ScopedValue
> to support tracing libraries.
>
> Maybe someone else can do better at explaining this, I give up. I
> understand that this CAN be done. I just doubt that forcing us to rewrite
> all of our code even in a trivial fashion is OK, but I understood the
> message.
>
> Pozdrawiam / Best regards,
> Marcin Grzejszczak
>
> https://marcin.grzejszczak.pl
> https://toomuchcoding.com
>
>
> pon., 17 cze 2024 o 15:14 Robert Engels <robaho at icloud.com> napisał(a):
>
>> That is not correct. There is a “trace context” per request at the top.
>> The “trace context” supports nested spans (which I call scopes). I have
>> changed the name to Span to make this clearer, and I also added Span tag
>> support. If the parentScopeId ==-1, then you are at the top-level
>> (request). As an example:
>>
>> request 409, event Event[start=1718636994650, end=1718636994650,
>> description=open span making jdbc call, spanId=3824, parentSpanId=-1],
>> duration 0ms
>> request 409, event Event[start=1718636994650, end=1718636994650,
>> description=call jdbc, spanId=3824, *parentSpanId=825*], duration 0ms
>> request 409, event Event[start=1718636994650, end=1718636994650,
>> description=close span making jdbc call, spanId=3824, *parentSpanId=825*],
>> duration 0ms
>> request 409, event Event[start=1718636994494, end=1718636994650,
>> description=close span handleRequest, spanId=825, parentSpanId=-1],
>> duration 156ms
>> request 148, event Event[start=1718636994487, end=1718636994598,
>> description=close span handleRequest, spanId=299, parentSpanId=-1],
>> duration 111ms
>> request 409, total duration 166ms
>> request 786, total duration 167ms
>> request 75, event Event[start=1718636994485, end=1718636994655,
>> description=close span handleRequest, spanId=149, parentSpanId=-1],
>> duration 170ms
>> request 148, total duration 119ms
>> request 452, total duration 171ms
>> request 75, total duration 177ms
>> request 496, event Event[start=1718636994602, end=1718636994602,
>> description=call jdbc, spanId=3367, parentSpanId=1000], duration 0ms
>> request 496, event Event[start=1718636994602, end=1718636994602,
>> description=close span making jdbc call, spanId=3367, parentSpanId=1000],
>> duration 0ms
>> *request 496*, event Event[start=1718636994495, end=1718636994602,
>> description=close span handleRequest, spanId=1000, *parentSpanId=-1*],
>> duration 107ms
>> request 496, total duration 117ms
>> request 635, event Event[start=1718636994497, end=1718636994658,
>> description=close span handleRequest, spanId=1301, parentSpanId=-1],
>> duration 161ms
>> request 308, total duration 176ms
>> request 576, event Event[start=1718636994658, end=1718636994658,
>> description=close span making jdbc call, spanId=3901, parentSpanId=1163],
>> duration 0ms
>> request 635, total duration 171ms
>> request 576, event Event[start=1718636994497, end=1718636994658,
>> description=close span handleRequest, spanId=1163, parentSpanId=-1],
>> duration 161ms
>> request 267, event Event[start=1718636994658, end=1718636994658,
>> description=call jdbc, spanId=3899, parentSpanId=538], duration 0ms
>>
>>
>> Looking at the above what happens is that in pre, a new span is created
>> and put in thread local. It is then closed in the post and removed from
>> ThreadLocal. That's not what you're doing in beforeStatement and
>> afterStatement in your example.
>>
>>
>> Right, because we don’t want to use thread locals!!!. The code I posted
>> supports nested spans. If you are using a ThreadLocal.CURRENT_TRACER - you
>> only have a single value per thread - so either the spans are chained, or
>> there is a stack.
>>
>> I guarantee I can implement a full tracing library without using
>> ThreadLocal and have it support nested spans, etc. How can I guarantee
>> this? Because, you can implement tracing by passing “context” as a
>> parameter to every method. If you read the ScopedValue documentation it
>> states that the purpose of ScopedValue is to avoid needing to pass these
>> method parameters. So given that, you can replace
>> callMethod(context,parm1…) with callMethod(parm1,…). A tracing library is
>> trivially implemented with a Context parameter.
>>
>> There is absolutely no need to add open/close semantics to ScopedValue to
>> support tracing libraries.
>>
>>
>> On Jun 17, 2024, at 9:43 AM, Marcin Grzejszczak <
>> marcin.grzejszczak at gmail.com> wrote:
>>
>> > I updated the github.com/robaho/scope_trace code to demonstrate
>> before/after tracing using ScopedValues. There are no ThreadLocals in the
>> code. It works fine.
>>
>> Thanks for doing this, I really appreciate it!
>>
>> You create a SV at the beginning when the "request" is received. Try to
>> create a "span" with a new SV for the JDBC statement. With how JDBC is
>> being instrumented you actually don't have a handle on when and how the
>> JDBC code is called. You only have access to BEFORE and AFTER. So in the
>> BEFORE you would create a new span, put it in scope so that e.g. MDC get
>> enriched with tracing data so that you will have subsequent logs with span
>> and trace ids and then in the afterStatement you would stop the scope,
>> clear MDC and close that new span.
>>
>> I think that over the course of those 40 emails or so we're talking all
>> the time about the same thing. You're trying to convince me that it's
>> enough to create a SV at the beginning of code execution and wrap the
>> actual execution in the runWhere which is not the case. When you do that
>> you have in fact 1 span per request and then you just retrieve it in all
>> your methods, there's no span nesting, no parent-child relationship.
>>
>> In a normal application you would have 1 span for getting the request, at
>> least 1 span for the JDBC call and 1 for each of the http calls. In your
>> case you have 1 span for all 4.
>>
>> > The problem with open/close semantics is that it is easy to close the
>> scopes out of order - which corrupts the tracing information.
>>
>> Yes, I mentioned that a couple of times already.
>>
>> > Again referring to brave, look at
>> https://github.com/openzipkin/brave/blob/69003dfc811418f0dbc42e9e17ff880ebe1f4b02/instrumentation/mysql/src/main/java/brave/mysql/TracingStatementInterceptor.java#L39
>>
>> This line creates a new span, whose parent was the current one in
>> ThreadLocal and it also puts the newly created span in ThreadLocal. That's
>> because the ThreadLocalSpan is being used. You don't have to use that API,
>> you could create a span and put it in scope, that way the information about
>> the underlying ThreadLocal storage mechanism wouldn't leak.
>>
>> > This is an example of before/after tracing. It REQUIRES that the
>> preStatement() and postStatment() will ALWAYS be executed in the correct
>> sequence (with no intervening spans left open).
>>
>> Looking at the above what happens is that in pre, a new span is created
>> and put in thread local. It is then closed in the post and removed from
>> ThreadLocal. That's not what you're doing in beforeStatement and
>> afterStatement in your example.
>>
>> > This code would need to changed regardless, most likely to something
>> like Brave.currentTracer() which would be able to obtain the current tracer
>> appropriately. I don’t see how adding open/close to ScopedValue values
>> would alleviate this.
>>
>> Typically this information gets passed via a constructor. Most likely in
>> this case some JDBC SPI is being called and that class is instantiated
>> without any dependencies.
>>
>> > I am guessing most libraries have these hard-coded dependencies on
>> ThreadLocal.
>>
>> They do not. They use abstractions with scopes. Micrometer Observation
>> gets the current observation from an ObservationRegistry. Underneath the
>> default implementation does look into TL but it could look into SV.
>>
>> > Thus, someone will write a VirtualThread/ScopedValue “native” tracing
>> library, and that will be what people use - or developers will need to
>> modify their libraries to use ScopedValues in an APPROPRIATE manner.
>>
>> I'm trying to agree with you what APPROPRIATE means because I still don't
>> understand because I think we're comparing apples to oranges at this point.
>>
>> >  You seem to believe that a tracing library user will want to create a
>> new span on every ScopedValue.runWhere() - I don’t think that is the case -
>> as it will generate a lot of noise.
>>
>> I know a few things about tracing libraries and instrumenting projects
>> because I've been doing it for a decade. I never said that you should
>> create a span on every runWhere. Assuming that tracing libraries would be
>> using SV then I can imagine that when a Span would be put in scope then if
>> it's not there in scope already then runWhere should be run.
>>
>> Pozdrawiam / Best regards,
>> Marcin Grzejszczak
>>
>> https://marcin.grzejszczak.pl
>> https://toomuchcoding.com
>>
>>
>> pon., 17 cze 2024 o 14:14 Robert Engels <robaho at icloud.com> napisał(a):
>>
>>> I updated the github.com/robaho/scope_trace code to demonstrate
>>> before/after tracing using ScopedValues. There are no ThreadLocals in the
>>> code. It works fine.
>>>
>>> The problem with open/close semantics is that it is easy to close the
>>> scopes out of order - which corrupts the tracing information.
>>>
>>> Again referring to brave, look at
>>> https://github.com/openzipkin/brave/blob/69003dfc811418f0dbc42e9e17ff880ebe1f4b02/instrumentation/mysql/src/main/java/brave/mysql/TracingStatementInterceptor.java#L39
>>>
>>> This is an example of before/after tracing. It REQUIRES that the
>>> preStatement() and postStatment() will ALWAYS be executed in the correct
>>> sequence (with no intervening spans left open).
>>>
>>> Notice also
>>>
>>> Span span = ThreadLocalSpan.CURRENT_TRACER.next();
>>>
>>> This code would need to changed regardless, most likely to something
>>> like Brave.currentTracer() which would be able to obtain the current tracer
>>> appropriately. *I don’t see how adding open/close to ScopedValue values
>>> would alleviate this.*
>>>
>>> I am guessing most libraries have these hard-coded dependencies on
>>> ThreadLocal.
>>>
>>> Thus, someone will write a VirtualThread/ScopedValue “native” tracing
>>> library, and that will be what people use - or developers will need to
>>> modify their libraries to use ScopedValues in an APPROPRIATE manner. You
>>> seem to believe that a tracing library user will want to create a new span
>>> on every ScopedValue.runWhere() - I don’t think that is the case - as it
>>> will generate a lot of noise.
>>>
>>> “scope values” are not only for tracing scopes.
>>>
>>>
>>> On Jun 17, 2024, at 7:01 AM, robert engels <robaho at icloud.com> wrote:
>>>
>>> Your statements like “a user puts a scope in a SV won’t be honored”. Of
>>> course not. I think you are trying to conflate scoped values with tracing
>>> scopes. They are not the same thing and can’t be treated as such due to the
>>> volume.
>>>
>>> The user will manage scopes for tracing the way they do now. It is an
>>> implementation detail on how the tracing library manages these scopes. They
>>> currently use thread locals. As I’ve pointed out several times the library
>>> can use a scoped value to carry the scope context - and mange the active
>>> scopes within in
>>>
>>> You are trying to make the scoped values be able to support tracing
>>> directly. I don’t believe that was ever a design goal.
>>>
>>> On Jun 17, 2024, at 5:59 AM, Marcin Grzejszczak <
>>> marcin.grzejszczak at gmail.com> wrote:
>>>
>>> 
>>> > Definitely, yes. If you need to create a context for every virtual
>>> thread, scoped values versus thread-local variables is the least of your
>>> problems.
>>>
>>> Can you elaborate please?
>>>
>>> Pozdrawiam / Best regards,
>>> Marcin Grzejszczak
>>>
>>> https://marcin.grzejszczak.pl
>>> https://toomuchcoding.com
>>>
>>>
>>> czw., 13 cze 2024 o 19:23 Andrew Haley <aph-open at littlepinkcloud.com>
>>> napisał(a):
>>>
>>>> On 6/13/24 16:03, Robert Engels wrote:
>>>> > 3. More importantly, if you use VT effectively, you are talking about
>>>> > orders of magnitude more “trace contexts” (in the default brave
>>>> handler
>>>> > they are shared by thread so they are highly limited since the only
>>>> > effective Java threading implementation prior to VT required pooled
>>>> > threads) which reinforces my point about the increase in the volume of
>>>> > unique data is the major problem for existing tracing libraries, which
>>>> > will probably require architectural changes anyway.
>>>>
>>>> Definitely, yes. If you need to create a context for every virtual
>>>> thread,
>>>> scoped values versus thread-local variables is the least of your
>>>> problems.
>>>>
>>>> --
>>>> Andrew Haley  (he/him)
>>>> Java Platform Lead Engineer
>>>> Red Hat UK Ltd. <https://www.redhat.com>
>>>> https://keybase.io/andrewhaley
>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/loom-dev/attachments/20240619/9862a0f7/attachment-0001.htm>