Avoid some GCC 10.X warnings in HotSpot

Tue Jul 7 16:45:05 UTC 2020

On Jul 7, 2020, at 7:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
> 
> Using memcpy instead of byte_at_put (and getting rid of byte_at_put)
> seems like a good idea to me.

We have had problems with memcpy in the long past, and I’m
personally still nervous when I see a call to it.  The problem is
subtle, and escaped everybody’s notice the first time around,
and I’d prefer not to make more bugs of the same kind.

What problem?  Well, memcpy does not guarantee integrity
of the copy from the point of view of racing threads.  Most
specifically, it is allowed to copy machine words (in memory)
byte-by-byte.  This leads to race conditions (word tearing)
if the destination of the memcpy is being read by another
thread while it is being written by the memcpy thread.
If the data being written is in the Java heap, the Java
memory model will be violated.  If the data is somewhere
else, perhaps some assumption about lock-free programming
could be violated.  And (here is the important point) the
choice of memcpy to copy using bytes instead of words
is private to memcpy and can change (essentially at any
moment.  Clearly a well-groomed memcpy will use word-wise
copies when it can, and so *in practice* memcpy will never
cause word tearing.  But *in rare cases* we have seen bugs
where some edge case is touched and a word gets torn.
These bugs *being rare* are exceedingly difficult to detect.

All this happened long ago, and was a painful learning
experience for the whole team.  One of our responses was
to pivot away from using memcpy and use our own loops
instead, even though we knew they were less performant
in some cases.  Another response was to codify our best
practices in the Copy class, and to more clearly document
which copy operations required exactly which behaviors.

Nowadays memcpy participates fully in C++ optimizations,
so there are new reasons to reach for it.  But the old reasons
to be cautious of it are still present.  (AFAIK all platforms
preserve memcpy’s “right” to tear words.  But even if some
platform made it better behaved, HotSpot’s cross-platform
code must make worst-case assumptions.)  So I want to
add this cautionary note about memcpy; I don’t want us
to use memcpy as a cure-all for copy problems, and
forget that it’s not always the right tool for the job.

I have a specific ask:  Most parts of the HotSpot code base
are under our direct and detailed control, and many parts
of those have heightened concurrency requirements.
(By heightened, I mean that the usual C++ rules about
undetermined behavior in the case of races or word
tearing, etc., are unacceptable to us.  We are building
a runtime for a language with different rules than C++.)
So my ask is that we don’t just say “memcpy” in (most
of) our source code; we should say something like
Copy::private_bytes, not just memcpy.  (There’s a
bikeshed color here, which I don’t care about.)
The Copy:: routine can be just an alias for memcpy.

So what’s the advantage of having a private reserve
Special Name for good old memcpy?  Isn’t this just
obscuring what’s going on in the code?  No, it’s not.
Bitter experience tells us that memcpy has some
hidden sharp edges, and our best move for protecting
ourselves is to wrap it in a custom-made handle.

By asking maintainers to refer to a new file, copy.hpp,
we are *also* giving them the chance to read some
informative *documentation* on the memcpy-alias,
that explains, for *our* source base, what are the
best practices for using this tricky primitive.  Yes,
memcpy is documented adequately, and surely
competent maintainers know that documentation,
but (as you can see) we need HotSpot-specific
documentation for tricky primitives like memcpy,
and the best way to ensure that such documentation
is appropriately referenced is to make a local alias,
a better handle, for the tricky primitive.

All that said, I have no problem with memcpy being
used (though I prefer a locally documented alias!) in
the places where it is appearing in our source base.
Those places are, yes, private to some construction
process, or singly-threaded for some other reason,
perhaps a safepoint.  But I think you understand now
that I am nervous when I see more and more uses
of memcpy in our code, because I think that now
it’s just a matter of time before someone concludes
that memcpy is the new (old) best way to copy data,
and the hard work of codifying our practices in
Copy will start to be neglected.  A new programmer
in our code base could make the mistake of just
reaching for memcpy instead of doing the work of
deciding which kind of Copy to use.  And that will
eventually cause a difficult-to-detect bug.  Let’s not.

— John