JEP draft: Integrity and Strong Encapsulation

Andrew Dinn adinn at redhat.com
Mon May 22 14:01:31 UTC 2023


Hi Ron,

I've been studiously reading all the discussion on the integrity JEP 
(currently numbered 8305968) -- whether direct comments in this thread 
or indirect comments from the various threads that relate to changing 
the dynamic agent load default. It seems obvious to me that there is a 
rather unfortunate amount of confusion regarding the goals and strategic 
direction of the JEP. I believe that this has led to many objections 
that I think miss their target. However, I think it is incontrovertible 
that this implies the JEP needs some refinement and perhaps needs to 
reconsider some of its conclusions.

I'd like to propose two very specific changes to the text that might 
help to steer readers in the intended direction and thereby avoid a lot 
of the confusion. I also have some suggestions for extra material (but 
not yet any explicit edits) that I think could be added to the JEP in 
order to clarify and amplify the need for certain of the legitimate use 
cases for breaking encapsulation that the JEP *already acknowledges*.

1) My first suggestion is to modify the summary so as to clarify and 
highlight the issue of control.

   "As Java continues to move forward, it is appropriate to restrict all 
APIs so that they cannot break strong encapsulation /without explicit 
end user permission/, while still accommodating use cases that need to 
operate beyond encapsulation boundaries."

Why this change? A key element of the argument in the JEP is that the 
end user must be in a position to decide whether and where encapsulation 
can be restricted (or at least, in the case of naive end users, the 
program run script or tool that initiates execution). The above edit not 
only makes this explicit, it also underlines up front a key truth that 
the JEP acknowledges: there are use cases where an end user may 
legitimately decide to break encapsulation (in some controlled manner). 
Without this qualification it is easy for a reader to misread this 
sentence and think that the capability is being taken away entirely -- 
yes, despite that immediately following clause -- as the discussion has 
shown. Its presence also sets up the user for subsequent discussion as 
to what guiding principles and language or implementation mechanisms are 
needed and available to manage that permission process.

2) My second suggestion is to change the first sentence in the section 
headed "Strong Encapsulation by Default".


   "Because Java since JDK 1.1 had allowed encapsulation to be broken 
via deep reflection, a number of /production/ libraries came to depend 
on the ability to break it."

Why this change? This section rightly focuses on historic, invalid use 
cases for breaking encapsulation in order to motivate the discussion of 
how the general integrity goals outlined in the previous section have 
been broken and dangerously so. The cited example usages identify 
behaviours that are definitely problematic in a production deployment. 
However, the first example, working around a missing API, is not only 
less problematic in a development, test-time environment, it may indeed 
in some cases be a necessary step, the only step to achieve certain 
kinds of test coverage -- a point that is even made later on in the 
discussion of white box testing.

Without this change a reader may easily draw the wrong conclusion that 
the most obvious example of legitimate, development time encapsulation 
breakage, the use of white box testing libraries, is being proscribed by 
the JEP. The change more precisely highlights the much more pernicious 
abuse, libraries that bypass end user regulation of encapsulation 
breakage in /deployed/ software, without running the risk of conflating 
this highlighted problem case with legitimate cases.
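
For readers less familiar with the mechanism, the following is a minimal 
sketch of the kind of deep reflection a white box test might rely on. It 
is not taken from the JEP: the class Counter, its field hits and the 
helper peekHits are hypothetical names invented for illustration.

```java
import java.lang.reflect.Field;

public class DeepReflectionSketch {
    // Hypothetical library class: encapsulated state, no accessor.
    public static class Counter {
        private int hits;

        public void record() {
            hits++;
        }
    }

    // White box helper: reads the private field via deep reflection.
    public static int peekHits(Counter counter) {
        try {
            Field field = Counter.class.getDeclaredField("hits");
            // Permitted here because caller and target share a module.
            field.setAccessible(true);
            return field.getInt(counter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection failed", e);
        }
    }

    public static void main(String[] args) {
        Counter counter = new Counter();
        counter.record();
        counter.record();
        System.out.println("hits = " + peekHits(counter));
    }
}
```

In this sketch setAccessible succeeds because the test code and the 
target class live in the same (unnamed) module. When the target sits in 
a package that a named module does not open, the same call throws 
InaccessibleObjectException unless whoever launches the JVM grants 
access, for example with --add-opens.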

3) My third concern is that the JEP is missing an important aspect of 
why and how 'encapsulation' is broken by agents. The JEP concentrates on 
the use of deep reflection and agents to bypass data and behaviour 
access restrictions defined using module visibility in combination with 
the available combinations of package- and class-level (public, 
protected or private) access. This misses another level of encapsulation 
that has always been present in Java and is becoming ever more important 
with the greater adoption of micro-services i.e. thread and process 
encapsulation. This is highly relevant for development time testing but 
also especially needs to be taken into account when it comes to 
assessing the vital importance of agents for observability and 
discussing/clarifying their legitimacy.

Many important component and integration test scenarios require 
cross-validation of actions performed, or data produced and consumed, in 
disparate threads (a fortiori, processes). However, this is difficult 
because processes and threads encapsulate data and behaviours. This is 
not achieved through enforcement of private or protected access to 
a (directly or indirectly) referenced object by type. The encapsulation 
occurs by ensuring that references held by one thread are just never 
made available to some other thread or threads through a shared data 
path. Direct cross-validation at test time, whether by the code executed 
by these threads or by third-party test code, is usually impossible. In 
most cases there is no desire and, just as often, there are many 
impediments, to enabling synchronization or exchange of data between 
threads in the deployed application.

The impracticality of implementing such validation via a general API 
exposed by the relevant library code is obvious. The need for a given 
synchronization and data exchange is very much determined by the scheme 
the client app adopts to distribute the work and data amongst 
threads (or processes). Even granted a library that can provide a 
suitably general API it may well be that a client thread which initiates 
an action has no handle on the library thread which performs the action 
or, more problematically, an indirect thread which co-operates with the 
latter thread.

Yet, it is very easy for an agent to ensure that specific actions in 
arbitrary threads are correlated and that conditions at the point of 
action are compared and validated. This is indeed precisely the kind of 
use case Byteman was originally developed to support. The motivating use 
case for Byteman was to validate a sequence of asynchronous message 
exchanges in a variety of success and failure scenarios as a distributed 
web transaction manager and/or client proceeded through the stages of a 
two-phase commit.
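
The idea can be sketched in plain Java, under loud assumptions. This is 
not Byteman code and not a real agent: the probe method, the shared 
EVENTS list and the latch are hand-written stand-ins for what an agent 
would inject at class load time, and the thread names and event strings 
are hypothetical. In a real deployment none of these calls would appear 
in the application source.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CrossThreadProbeSketch {
    // Stand-in for agent-added state: an event log shared by all threads.
    private static final List<String> EVENTS = new CopyOnWriteArrayList<>();

    // Stand-in for an injected trace action.
    private static void probe(String event) {
        EVENTS.add(Thread.currentThread().getName() + ":" + event);
    }

    // Runs one scenario and returns the events in recorded order.
    public static List<String> runScenario() {
        EVENTS.clear();
        CountDownLatch prepared = new CountDownLatch(1);

        Thread participant = new Thread(() -> {
            String vote = "prepared"; // thread-confined: the app never shares it
            probe(vote);              // "injected": record the confined value
            prepared.countDown();     // "injected": signal the coordinator
        }, "participant");

        Thread coordinator = new Thread(() -> {
            try {
                // "injected": hold the commit until the participant has voted
                if (!prepared.await(5, TimeUnit.SECONDS)) {
                    probe("timeout");
                    return;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            probe("commit");
        }, "coordinator");

        coordinator.start();
        participant.start();
        try {
            participant.join();
            coordinator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scenario interrupted", e);
        }
        return List.copyOf(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

Test code can then assert on the returned list, for instance that the 
participant's vote was recorded before the coordinator's commit, even 
though neither thread ever hands the other a reference in the 
application's own data paths.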

A similar concern arises with the use of agents for monitoring in 
production. One of the reasons observability has recently become such a 
major concern and, likewise, why agents have become so attractive as a 
way to achieve observability, is precisely because Java deployments are 
making so much more use of multi-threading and multi-processing.

Legacy solutions, like logging, are creaking at the seams: they make it 
very hard to collate the information they collect from multiple, 
independent data sources by forcing it to be deposited in multiple 
independent data sinks. This problem is only exacerbated by the clear 
impossibility for developers in a library eco-system as large as that of 
Java to provide a common model for logging that an app can be sure all 
its libraries will coherently support in all the various versions it 
needs to rely on.

Agent instrumentation, including in some cases code that breaks process, 
thread, module, package and class encapsulation boundaries, provides a 
far more realistic means of addressing both the distribution problem and 
the versioning problem than the pipe dream of a one-size-fits-all API. 
So, once again a discussion of encapsulation that ignores thread and 
process encapsulation and the need for mechanisms to selectively bypass 
it at runtime is going to omit important considerations that many 
readers of this JEP will be highly aware of.
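
For reference, a minimal sketch of the two entry points through which an 
agent obtains its Instrumentation handle is given below. The class name 
AgentEntryPointsSketch and the CountingTransformer are hypothetical; the 
transformer only counts the classes it is offered and changes nothing. 
A real agent jar would also need Premain-Class and/or Agent-Class 
entries in its manifest, which are omitted here.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

public class AgentEntryPointsSketch {
    // Counts how many classes were offered to the transformer.
    public static final AtomicInteger SEEN = new AtomicInteger();

    // Observes class loading but leaves every class untouched.
    public static final class CountingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            SEEN.incrementAndGet();
            return null; // null means "keep the original class bytes"
        }
    }

    // Invoked before main when the launcher is given -javaagent:<jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }

    // Invoked when the agent is loaded into an already running JVM.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }

    // Exercises the transformer directly, without starting an agent.
    public static void main(String[] args) {
        byte[] result = new CountingTransformer()
                .transform(null, "demo/Example", null, null, new byte[0]);
        System.out.println("changed=" + (result != null) + " seen=" + SEEN.get());
    }
}
```

The distinction matters for the control question: premain runs only 
because whoever launched the JVM put -javaagent on the command line, 
whereas agentmain runs when an agent is attached to a JVM that is 
already running.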


regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill



More information about the jigsaw-dev mailing list