[External] : Re: JEP draft: Integrity and Strong Encapsulation

Mon May 22 15:57:57 UTC 2023

> On 22 May 2023, at 15:01, Andrew Dinn <adinn at redhat.com> wrote:
> 
> Hi Ron,
> 
> I've been studiously reading all the discussion on the integrity JEP (currently numbered 8305968) -- whether direct comments in this thread or indirect comments from the various threads that relate to changing the dynamic agent load default. It seems obvious to me that there is a rather unfortunate amount of confusion regarding the goals and strategic direction of the JEP. I believe that this has led to many objections that I think miss their target. However, I think it is incontrovertible that this implies the JEP needs some refinement and perhaps to consider some of its conclusions.

Yes. This is still a draft.

> 
> I'd like to propose two very specific changes to the text that might help to steer readers in the intended direction and thereby avoid a lot of the confusion. I also have some suggestions for extra material (but not yet any explicit edits) that I think could be added to the JEP in order to clarify and amplify the need for certain of the legitimate use cases for breaking encapsulation that the JEP *already acknowledges*.
> 
> 1) My first suggestion is to modify the summary so as to clarify and highlight the issue of control.
> 
>  "As Java continues to move forward, it is appropriate to restrict all APIs so that they cannot break strong encapsulation /without explicit end user permission/, while still accommodating use cases that need to operate beyond encapsulation boundaries."
> 
> Why this change? A key element of the argument in the JEP is that the end user must be in a position to decide whether and where encapsulation can be restricted (or at least, in the case of naive end users, the program run script or tool that initiates execution). The above edit not only makes this explicit, it also underlines up front a key truth that the JEP acknowledges: there are use cases where an end user may legitimately decide to break encapsulation (in some controlled manner). Without this qualification it is easy for a reader to misread this sentence and think that the capability is being taken away entirely -- yes, despite that immediately following clause -- as the discussion has shown. Its presence also sets up the user for subsequent discussion as to what guiding principles and language or implementation mechanisms are needed and available to manage that permission process.

Good suggestion. Done. (It may take some minutes for the changes in JBS to reflect in the rendered JEP)

> 
> 2) My second suggestion is to change the first sentence in the section headed "Strong Encapsulation by Default".
> 
> 
>  "Because Java since JDK 1.1 had allowed encapsulation to be broken via deep reflection, a number of /production/ libraries came to depend on the ability to break it."
> 
> Why this change? This section rightly focuses on historic, invalid use cases for breaking encapsulation in order to motivate the discussion of how the general integrity goals outlined in the previous section have been broken and dangerously so. The cited example usages identify behaviours that are definitely problematic in a production deployment. However, the first example, working around a missing API, is not only less problematic in a development or test-time environment, it may indeed in some cases be a necessary step, the only way to achieve certain kinds of test coverage -- a point that is even made later on in the discussion of white box testing.
> 
> Without this change a reader may easily draw the wrong conclusion that the most obvious example of legitimate, development time encapsulation breakage, the use of white box testing libraries, is being proscribed by the JEP. The change more precisely highlights the much more pernicious abuse, libraries that bypass end user regulation of encapsulation breakage in /deployed/ software, without running the risk of conflating this highlighted problem case with legitimate cases.

Agreed. Done.

> 
> 3) My third concern is that the JEP is missing an important aspect of why and how 'encapsulation' is broken by agents. The JEP concentrates on the use of deep reflection and agents to bypass data and behaviour access restrictions defined using module visibility in combination with the available combinations of package- and class-level (public, protected or private) access. This misses another level of encapsulation that has always been present in Java and is becoming ever more important with the greater adoption of micro-services i.e. thread and process encapsulation. This is highly relevant for development time testing but also especially needs to be taken into account when it comes to assessing the vital importance of agents for observability and discussing/clarifying their legitimacy.
> 
> Many important component and integration test scenarios require cross-validation of actions performed, or data produced and consumed, in disparate threads (a fortiori, processes). However, this is difficult because processes and threads encapsulate data and behaviours. This is not achieved through enforcement of private or protected accesses to a (directly or indirectly) referenced object by type. The encapsulation occurs by ensuring that references held by one thread are just never made available to some other thread or threads through a shared data path. Direct cross-validation in test code, whether by the code executed by these threads or by third-party test code, is usually impossible. In most cases there is no desire and, just as often, there are many impediments, to enabling synchronization or exchange of data between threads in the deployed application.
> 
> The impracticality of implementing such validation via a general API exposed by the relevant library code is obvious. The need for a given synchronization and data exchange is very much determined by the scheme the client app adopts to distribute the work and data amongst threads (or processes). Even granted a library that can provide a suitably general API it may well be that a client thread which initiates an action has no handle on the library thread which performs the action or, more problematically, an indirect thread which co-operates with the latter thread.
> 
> Yet, it is very easy for an agent to ensure that specific actions in arbitrary threads are correlated and that conditions at the point of action are compared and validated. This is indeed precisely the kind of use case Byteman was originally developed to support. The motivating use case for Byteman was to validate a sequence of asynchronous message exchanges in a variety of success and failure scenarios as a distributed web transaction manager and/or client proceeded through the stages of a two phase commit.
> 
> A similar concern arises with the use of agents for monitoring in production. One of the reasons observability has recently become such a major concern and, likewise, why agents have become so attractive as a way to achieve observability, is precisely because Java deployments are making so much more use of multi-threading and multi-processing.
> 
> Legacy solutions, like logging, are creaking at the seams: they make it very hard to collate the information they collect from multiple, independent data sources by forcing it to be deposited in multiple independent data sinks. This problem is only exacerbated by the clear impossibility for developers in a library eco-system as large as that of Java to provide a common model for logging that an app can be sure all its libraries will coherently support in all the various versions it needs to rely on.
> 
> Agent instrumentation, including in some cases code that breaks process, thread, module, package and class encapsulation boundaries, provides a far more realistic means of addressing both the distribution problem and the versioning problem than the pipe dream of a one size fits all API. So, once again a discussion of encapsulation that ignores thread and process encapsulation and the need for mechanisms to selectively bypass it at runtime is going to omit important considerations that many readers of this JEP will be highly aware of.
> 
> 

I’m not sure I understand this. While there are certainly integrity invariants related to threads (the JMM, and the removal of Thread.stop in JDK 20 that’s mentioned in the JEP), thread encapsulation is not a concept in the Java platform (perhaps indirectly with ThreadLocals/ScopedValues?). As such, something like it would be yet another higher-level integrity guarantee that can be programmed thanks to strong encapsulation (indeed, it is strong encapsulation that gives ThreadLocals and ScopedValues their invariants).
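
To make the ThreadLocal point concrete, here is a minimal sketch (not taken from the JEP; the class name ThreadConfinementSketch, the field SLOT and the method observe are hypothetical names chosen for illustration). It only shows the per-thread invariant itself: a value set by one thread is not visible to another thread reading the same ThreadLocal. It assumes nothing bypasses the private field via deep reflection or an agent.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ThreadConfinementSketch {
    // Hypothetical per-thread slot: each thread sees only the value it set itself.
    private static final ThreadLocal<String> SLOT = new ThreadLocal<>();

    // Sets a value in the calling thread, then reads the same slot from a second thread.
    static String observe() {
        SLOT.set("main-value");
        AtomicReference<String> seenByOther = new AtomicReference<>("unset");
        Thread other = new Thread(() -> seenByOther.set(SLOT.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        String result = SLOT.get() + "|" + seenByOther.get();
        SLOT.remove(); // clear the calling thread's value so repeated calls start clean
        return result;
    }

    public static void main(String[] args) {
        // The second thread never set the slot, so it reads the default (null).
        System.out.println(observe());
    }
}
```

The private field is what keeps other code from swapping or inspecting the slot; that guarantee holds only as long as encapsulation is not broken.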

As to your point about the use of agents for observability, I believe it’s covered in the third bullet in the Disabling Strong Encapsulation section with the APM example. However, I added a few more words to explain this usage more generally.

— Ron