JEP draft: Integrity and Strong Encapsulation
Andrew Dinn
adinn at redhat.com
Mon May 22 14:01:31 UTC 2023
Hi Ron,
I've been studiously reading all the discussion on the integrity JEP
(currently numbered 8305968) -- whether direct comments in this thread
or indirect comments from the various threads that relate to changing
the dynamic agent load default. It seems obvious to me that there is a
rather unfortunate amount of confusion regarding the goals and strategic
direction of the JEP. I believe that this has led to many objections
that I think miss their target. However, I think it is incontrovertible
that this implies the JEP needs some refinement and perhaps needs to
reconsider some of its conclusions.
I'd like to propose two very specific changes to the text that might
help to steer readers in the intended direction and thereby avoid a lot
of the confusion. I also have some suggestions for extra material (but
not yet any explicit edits) that I think could be added to the JEP in
order to clarify and amplify the need for certain of the legitimate use
cases for breaking encapsulation that the JEP *already acknowledges*.
1) My first suggestion is to modify the summary so as to clarify and
highlight the issue of control.
"As Java continues to move forward, it is appropriate to restrict all
APIs so that they cannot break strong encapsulation /without explicit
end user permission/, while still accommodating use cases that need to
operate beyond encapsulation boundaries."
Why this change? A key element of the argument in the JEP is that the
end user must be in a position to decide whether and where encapsulation
can be restricted (or at least, in the case of naive end users, the
program run script or tool that initiates execution). The above edit not
only makes this explicit, it also underlines up front a key truth that
the JEP acknowledges: there are use cases where an end user may
legitimately decide to break encapsulation (in some controlled manner).
Without this qualification it is easy for a reader to misread this
sentence and think that the capability is being taken away entirely --
yes, despite that immediately following clause -- as the discussion has
shown. Its presence also sets up the reader for subsequent discussion as
to what guiding principles and language or implementation mechanisms are
needed and available to manage that permission process.
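
To make "explicit end user permission" concrete, here is a minimal
sketch (the class and method names are illustrative, not taken from the
JEP). It probes whether the running JVM allows deep reflection into
java.lang. On JDKs that enforce strong encapsulation by default I would
expect it to report false unless the launcher was given something like
--add-opens java.base/java.lang=ALL-UNNAMED, but the result depends on
the JDK version and command line, so no output is shown.

```java
import java.lang.reflect.Field;

// Hypothetical probe class; the names are illustrative only.
class EncapsulationProbe {

    // Returns true only if deep reflection into String's private
    // 'value' field is permitted for this caller.
    static boolean canOpenStringInternals() {
        try {
            Field value = String.class.getDeclaredField("value");
            // trySetAccessible() reports failure by returning false
            // rather than throwing InaccessibleObjectException.
            return value.trySetAccessible();
        } catch (NoSuchFieldException e) {
            // The field is an implementation detail and may not exist.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.lang open to this code: "
                + canOpenStringInternals());
    }
}
```

The point of the sketch is only that the answer is decided at launch
time, by whoever controls the command line, and not by the library code.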
2) My second suggestion is to change the first sentence in the section
headed "Strong Encapsulation by Default".
"Because Java since JDK 1.1 had allowed encapsulation to be broken
via deep reflection, a number of /production/ libraries came to depend
on the ability to break it."
Why this change? This section rightly focuses on historic, invalid use
cases for breaking encapsulation in order to motivate the discussion of
how the general integrity goals outlined in the previous section have
been broken and dangerously so. The cited example usages identify
behaviours that are definitely problematic in a production deployment.
However, the first example, working around a missing API, is not only
less problematic in a development, test-time environment, it may indeed
in some cases be a necessary step, the only step to achieve certain kinds
of test coverage -- a point that is even made later on in the
discussion of white box testing.
Without this change a reader may easily draw the wrong conclusion that
the most obvious example of legitimate, development time encapsulation
breakage, the use of white box testing libraries, is being proscribed by
the JEP. The change more precisely highlights the much more pernicious
abuse, libraries that bypass end user regulation of encapsulation
breakage in /deployed/ software, without running the risk of conflating
this highlighted problem case with legitimate cases.
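
As a rough illustration of the legitimate, development time case, the
sketch below shows the kind of thing a white box test does: it reads a
private field of the application's own class in order to check state
that no API exposes. Counter and WhiteBoxPeek are hypothetical names, and
the sketch assumes both classes are loaded from the class path (the
unnamed module), where this form of deep reflection is permitted.

```java
import java.lang.reflect.Field;

// Hypothetical test helper; stands in for a white box testing library.
class WhiteBoxPeek {

    // Reads Counter's private 'hits' field via deep reflection.
    static int peekHits(Counter c) {
        try {
            Field f = Counter.class.getDeclaredField("hits");
            f.setAccessible(true); // bypasses the 'private' modifier
            return f.getInt(c);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.record();
        c.record();
        System.out.println(peekHits(c)); // prints 2
    }
}

// Hypothetical application class with no accessor for its state.
class Counter {
    private int hits;

    void record() {
        hits++;
    }
}
```

The same call made by a production library against JDK internals is the
pernicious case; here it is confined to test code and to the
application's own classes.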
3) My third concern is that the JEP is missing an important aspect of
why and how 'encapsulation' is broken by agents. The JEP concentrates on
the use of deep reflection and agents to bypass data and behaviour
access restrictions defined using module visibility in combination with
the available combinations of package- and class-level (public,
protected or private) access. This misses another level of encapsulation
that has always been present in Java and is becoming ever more important
with the greater adoption of micro-services i.e. thread and process
encapsulation. This is highly relevant for development time testing but
also especially needs to be taken into account when it comes to
assessing the vital importance of agents for observability and
discussing/clarifying their legitimacy.
Many important component and integration test scenarios require
cross-validation of actions performed, or data produced and consumed, in
disparate threads (a fortiori, processes). However, this is difficult
because processes and threads encapsulate data and behaviours. This is
not achieved through enforcement of private or protected accesses to
a (directly or indirectly) referenced object by type. The encapsulation
occurs by ensuring that references held by one thread are just never
made available to some other thread or threads through a shared data
path. Direct cross-validation in test code, whether by the code executed
by these threads or by third-party test code, is usually impossible. In
most cases there is no desire and, just as often, there are many
impediments, to enabling synchronization or exchange of data between
threads in the deployed application.
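
A minimal sketch of this kind of encapsulation, using hypothetical names:
nothing below is declared private or protected, yet the intermediate
'steps' list is unreachable from the calling thread because the
reference never leaves the worker. Only the final result crosses the
thread boundary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example of thread confinement; names are illustrative.
class ThreadConfinement {

    static int sumOnWorker(int n) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> result = pool.submit(() -> {
                // Confined to the worker: no other thread is ever
                // handed a reference to this list.
                List<Integer> steps = new ArrayList<>();
                int total = 0;
                for (int i = 1; i <= n; i++) {
                    total += i;
                    steps.add(total);
                }
                return total; // only the final value is published
            });
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOnWorker(4)); // prints 10
    }
}
```

A test that wants to validate the intermediate values has no handle on
them short of changing the code or instrumenting it.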
The impracticality of implementing such validation via a general API
exposed by the relevant library code is obvious. The need for a given
synchronization and data exchange is very much determined by the
scheme the client app adopts to distribute the work and data amongst
threads (or processes). Even granted a library that can provide a
suitably general API, it may well be that a client thread which initiates
an action has no handle on the library thread which performs the action
or, more problematically, an indirect thread which co-operates with the
latter thread.
Yet, it is very easy for an agent to ensure that specific actions in
arbitrary threads are correlated and that conditions at the point of
action are compared and validated. This is indeed precisely the kind of
use case Byteman was originally developed to support. The motivating use
case for Byteman was to validate a sequence of asynchronous message
exchanges in a variety of success and failure scenarios as a distributed
web transaction manager and/or client proceeded through the stages of a
two phase commit.
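
The following is a hand-written stand-in for the effect of such
instrumentation, not Byteman's rule language or API. It assumes an agent
has injected calls to a hypothetical record() method at the points of
interest in two otherwise unrelated threads, so that test code can
compare what happened where. The latch exists only to make the demo
deterministic; all names are illustrative.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: simulates what injected trace calls could collect.
class CrossThreadTrace {

    // Shared sink that injected code in any thread could write to.
    static final List<String> EVENTS = new CopyOnWriteArrayList<>();

    // Stand-in for a call an agent would inject at a point of action.
    static void record(String event) {
        EVENTS.add(Thread.currentThread().getName() + ":" + event);
    }

    static List<String> runScenario() {
        EVENTS.clear();
        CountDownLatch prepared = new CountDownLatch(1);

        Thread participant = new Thread(() -> {
            record("prepare");
            prepared.countDown();
        }, "participant");

        Thread coordinator = new Thread(() -> {
            try {
                prepared.await(); // commit only after prepare is seen
                record("commit");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "coordinator");

        coordinator.start();
        participant.start();
        try {
            participant.join();
            coordinator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.copyOf(EVENTS);
    }

    public static void main(String[] args) {
        // prints [participant:prepare, coordinator:commit]
        System.out.println(runScenario());
    }
}
```

In a real test the two threads would belong to the application and its
libraries, and neither would know about the shared sink; that is what
the agent supplies.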
A similar concern arises with the use of agents for monitoring in
production. One of the reasons observability has recently become such a
major concern and, likewise, why agents have become so attractive as a
way to achieve observability, is precisely because Java deployments are
making so much more use of multi-threading and multi-processing.
Legacy solutions, like logging, are creaking at the seams: they make it
very hard to collate the information they collect from multiple,
independent data sources by forcing it to be deposited in multiple
independent data sinks. This problem is only exacerbated by the clear
impossibility for developers in a library eco-system as large as that of
Java to provide a common model for logging that an app can be sure all
its libraries will coherently support in all the various versions it
needs to rely on.
Agent instrumentation, including in some cases code that breaks process,
thread, module, package and class encapsulation boundaries, provides a
far more realistic means of addressing both the distribution problem and
the versioning problem than the pipe dream of a one size fits all API.
So, once again a discussion of encapsulation that ignores thread and
process encapsulation and the need for mechanisms to selectively bypass
it at runtime is going to omit important considerations that many
readers of this JEP will be highly aware of.
regards,
Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill