RFR: 8240110: Improve NULL
Kim Barrett
kim.barrett at oracle.com
Thu Apr 23 09:30:34 UTC 2020
> On Apr 19, 2020, at 11:39 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
I've not yet covered all the items; I'll note inline which ones I
haven't addressed yet.
> On 17/04/2020 09:46, Kim Barrett wrote:
>>> On Apr 14, 2020, at 9:53 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>> On 06/04/2020 01:37, David Holmes wrote:
>>>>> I could change CHECK_0 to use NULL_WORD though and if I do, I get compiler errors. CHECK_0 is used for both pointers and integral zeroes and types that implicitly converts from either. What the original thought was is unclear to me; is CHECK_0 to be used to return things convertible from pointers? If so, what is CHECK_NULL for?
>>>> No CHECK_0 is intended for int-returning functions (which would also extend to intptr_t).
>>>
>>> It might be intended for it, but it is not used in such a way (therefore my remark on compile failures when using NULL_WORD)
>>>
>>> I will do no change here.
>> It sounds like there are some places using CHECK_0 that should be
>> using CHECK_NULL, and they should be fixed. But that can be a separate
>> change. It might be aided (and backsliding prevented) by doing
>> something similar to what I suggest for JVMCI_CHECK_0/NULL below.
>
> I believe that there are places that could use neither (the macro handles both integral and pointer types), but I do not want to fix that macro nest.
I'm not sure if you are saying anything beyond "CHECK_0 is being used
in places where CHECK_NULL should be used instead"?
Unfortunately, as discussed in the part of the thread with David, my
suggestion for JVMCI_CHECK_0 turns out to not have the desired effect.
So I agree, let's just leave these macros alone (both the JVMCI_ and
unprefixed variants).
>> In the end you seem to have backed away from fixing the definition of
>> NULL_WORD, and I'm not sure why. Though I think the definition that
>> was proposed is at least unnecessarily baroque, possibly wrong. It
>> seems to me that a good and sufficient change for that would be to
>> delete all the existing platform definitions and put the following in
>> globalDefinitions.hpp:
>> const intptr_t NULL_WORD = 0;
>> Then it's no longer a null pointer constant, so can't be misused in
>> contexts where a pointer is required, will be treated as an integral
>> value by templates, &etc.
>
> my first try was (in part2):
>
> struct null_word_t {
> operator intptr_t() const { return 0; }
> };
>
> which IMO is nice as it:
> 1) mirrors my (initial) implementation of null_t
> 2) can be template specialized against
> 3) is defined in one place
>
> Unfortunately we do not yet have constexpr.
>
> So my (temporary) fix was (in part3):
> const intptr_t NULL_WORD = 0;
>
> This is /identical/ to what you propose. I think this is a good compromise. However after much effort from my side trying to get this solution you did not want my changes to NULL (and maybe I misunderstood you regarding NULL_WORD) so I reverted that in part5.
>
> I am very much confused to what you want. Should I once again revert NULL_WORD and put it in globalDefinitions.hpp?
Sorry, but I haven't looked at all the different stages of your
changes. I've only looked carefully through 8240110/1, to most recent
total combined change.
And it seems the discussion of NULL and the discussion of NULL_WORD
have gotten confused and mixed up. So I'll be explicit here.
I think / agree the following definition of NULL_WORD in
globalDefinitions.hpp is fine:
const intptr_t NULL_WORD = 0;
// The Standard only guarantees intptr_t is at least big enough to
// hold a pointer value, and purportedly there are platforms where
// this will fail. If this ever trips we'll need to think about
// whether it really matters and what to do if it does.
STATIC_ASSERT(sizeof(NULL_WORD) == sizeof(void*));
(The comment for the static assert is only intended to be suggestive
of wording; feel free to reword to suit.)
I agree there are some places that are using NULL that should be using
NULL_WORD, and vice versa.
I wonder if it would be worthwhile splitting out the NULL_WORD
defintion and the NULL <-> NULL_WORD changes as a separate smaller
change that is (I think) easily dealt with. It will then no longer
clutter the remainder.
>> I wouldn't worry about this comment in globalDefinitions_solstudio.hpp:
>> "On some platforms, sizeof(intptr_t) > sizeof(void*)"
>> or you can be really paranoid and STATIC_ASSERT that they are the same
>> size. Though I really wonder what platform(s) that's talking about.
>> That comment comes from 6452081: 3/4 Allow for Linux builds with Sun
>> Studio Linux compilers. So maybe 32bit Solaris Studio Linux is weird?
>
> I did this originally (disregard the typo of uint vs int) in part1.
>
> STATIC_ASSERT(sizeof(uintptr_t) == sizeof(char *)); // uintptr_t is only guaranteed to be at /least/ as big as a pointer
>
> I later reverted this change as I thought you dismissed all of my NULL and NULL_WORD definitions. I really wanted this.
See above.
>> src/hotspot/cpu/sparc/c1_CodeStubs_sparc.cpp
>> 303 AddressLiteral addrlit(NULL_WORD, metadata_Relocation::spec(_index));
>> 318 AddressLiteral addrlit(NULL_WORD, oop_Relocation::spec(_index));
>> I think these should be left as NULL rather than being changed to
>> NULL_WORD. The expected type is the address type, and other types are
>> converted to that via casts in constructor overloads.
>> Similarly here:
>> src/hotspot/cpu/sparc/c1_LIRAssembler_sparc.cpp
>> 426 AddressLiteral addrlit(NULL_WORD, oop_Relocation::spec(oop_index));
>> 445 AddressLiteral addrlit(NULL_WORD, metadata_Relocation::spec(index));
>
> I believe those do not at all auto convert, but promote to (integral) zero as I (quite thoroughly) described in my reply to David (and thus /should/ be NULL_WORD. I believe this is one prime example when our NULL definitions results in unintuitive selection of integer constructor.
I'd failed to notice that this one is in Solaris/sparc code, where the
gcc rules don't apply.
I think the expectation here is that NULL would select the pointer
overload, but it doesn't on this platform because NULL is a literal
0/0L, and as you say, the integer overload is better.
Similar code compiled with gcc would also select the integer overload,
but with a -Wconversion-null warning. Apparently gcc's __null, while
fairly similar to nullptr, does support conversion to an integer type
but with a warning. And that's intentional; quoting from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35669#c13
0 is still a valid definition of NULL so conversion from NULL to int
is OK. And __null does not have type nullptr_t, changing it to have
that type would break a lot of code
nullptr would do the right thing here.
But note that this code is slated to go away real soon (JEP 381).
My inclination would be to not make this change, in case there is a
repo of changes for that removal that will have a merge conflict if
this change were to go in first. But I no longer care all that much.
>> src/hotspot/os/linux/cgroupV1Subsystem_linux.hpp
>> [old code]
>> 100 CachingCgroupController* _memory = NULL;
>> 101 CgroupV1Controller* _cpuset = NULL;
>> 102 CachingCgroupController* _cpu = NULL;
>> 103 CgroupV1Controller* _cpuacct = NULL;
>> How does this even compile? Non-static member initializers are a
>> C++11 feature. Weirdly, g++ with -std=gnu++98 warns about this for
>> some types/initializers but not for others.
>
> As I remarked in my original mail, I believe it to be a gcc bug, but I am not sure.
Agreed that it’s a gcc bug / weirdness. Deleting that stuff is fine with me.
>> src/hotspot/os/solaris/attachListener_solaris.cpp
>> 461 status = ::sema_init(&_wakeup, 0, NULL, NULL);
>> =>
>> 461 status = ::sema_init(&_wakeup, 0, 0, NULL);
>> The old code appears to have been cribbed from the documentation:
>> https://docs.oracle.com/cd/E36784_01/html/E36874/sema-init-3c.html
>> See the example of "Default semaphore initialization (intra-process)".
>> Rather than a literal 0, better would be USYNC_THREAD. Presumably the
>> value of USYNC_THREAD is 0.
>
> I could not easily find the value of the symbols, so I only changed it to the correct type, with resent changes to Java regarding Solaris do you care enough for me to check this out, or is 0 good enough?
This is another bit of Solaris code that is slated to go away real
soon now. So I don't care all that much.
>> src/hotspot/os/windows/os_windows.cpp
>> 1659 if (!GetFileVersionInfo(kernel32_path, 0, version_size, version_info)) {
>> Given that the second argument, which was changed from NULL to 0, is a
>> "handle" (though unused), I think using NULL was appropriate and this
>> change shouldn't be made.
>
> I have no windows machine, so I can not check this out 100% at the moment, but according to the compile error I got, and the documentation of GetFileVersionInfoA or GetFileVersionInfoW I find on the net:
> https://docs.microsoft.com/en-us/windows/win32/api/winver/nf-winver-getfileversioninfoa
>
> The type of dwHandle is DWORD. Defined as:
> https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/262627d8-3418-4627-9218-4ffe110850b2
>
> What makes you think the handle is of pointer type, to me it looks like a long?
A Windows "handle" is generally a pointer, so NULL makes sense
semantically, and that's presumably what the original author thought
too. OTOH, the examples of calling that function that I found via a
web search just now all seem to pass literal 0. So I'm okay with the
change after all.
>> src/hotspot/share/classfile/javaClasses.cpp
>> 851 k->set_java_mirror_handle(NULL);
>> =>
>> 851 k->set_java_mirror_handle(OopHandle(NULL));
>> PATTERN CHANGE
>> Similarly here:
>> src/hotspot/share/classfile/moduleEntry.hpp
>> 82 _module = OopHandle(NULL);
>> 84 _pd = OopHandle(NULL); >
>> And here:
>> src/hotspot/share/classfile/modules.cpp
>> 156 version_symbol = TempNewSymbol(NULL); > 161 TempNewSymbol location_symbol(NULL);
>> 368 version_symbol = TempNewSymbol(NULL);
>> 373 TempNewSymbol location_symbol(NULL);
>> Although I'm not sure why the assignments on line 156 and line 368 is
>> even being done, since each version_symbol was already default
>> initialized.
> Probably to /show/ that it has an initialized value (I do not necessarily agree that it is good). Do you want me to change any of those, and if so to what?
Unfinished.
>> src/hotspot/share/code/codeHeapState.cpp
>> 140 if (((_termString) != 0) && (strlen(_termString) > 0)){ \
>> 154 if (((_termString) != 0) && (strlen(_termString) > 0)){ \
>> I think these changes to compare _termstring with 0 rather than NULL
>> should not be made. _termString is supposed to be a `[const] char*`,
>> so comparing to NULL is preferable to comparing to 0.
>
> _termString is not a pointer (or at least not most of the time), see my thorough discussion of the topic with David. I believe my solution is the next safest one (after rewriting that macro nest which I am not going to do now).
Unfinished.
>> Though at a brief skim it looks like _termString is always provided
>> and never NULL, so it might be better to assert that _termString is
>> non-NULL. But that's a completely different change.
>
> I believe we should use less macros, and I agree that is different change.
Unfinished.
>> src/hotspot/share/gc/parallel/psAdaptiveSizePolicy.cpp
>> src/hotspot/share/gc/parallel/psParallelCompact.cpp
>> src/hotspot/share/gc/shared/memset_with_concurrent_readers.hpp
>> src/hotspot/share/gc/z/zErrno.cpp
>> These #include changes seem unrelated to NULL.
>
> They are all (unnecessary) including standard headers, It is hard to know which headers that (indirectly) redefines NULL. It is better to not include them if you do not have to. If I remember correctly I removed all but one standard library include in the "gc" folder. The one I left was for <new>.
Regarding knowing which standard headers might redefine NULL, we
should not care. We should be using the native NULL and don't care who
might define it.
For zErrno.cpp, the #include of <string.h> seems unnecessary, since it
is using os::strerror. But I think it is appropriate to #include
<errno.h> here, rather than relying on it being implicitly included by
some dependency that could change, breaking this unrelated code.
For memset_with_concurrent_readers.hpp, I'm not a big fan of
globalDefinitions.hpp (I really dislike big omnibus random collections
of stuff headers). But okay. I didn't re-examine the others.
>> src/hotspot/share/jfr/jni/jfrGetAllEventClasses.cpp
>> 172 assert(add_method_signature != NULL, "invariant");
>> =>
>> 172 assert(add_method_sig_sym != NULL, "invariant");
>> This seems to be an unrelated bug fix; the wrong value was being
>> compared to NULL.
>
> It is highly related as it did not compile on my original definition of NULL and was thus found.
The change is fine, but I would have filed and fixed the bug rather than
burying it in this large change. Much better for archeology.
>> src/hotspot/share/jfr/jni/jfrJavaCall.cpp
>> 57 assert(_storage != NULL, "invariant");
>> =>
>> 57 assert(_storage_index < (int) ARRAY_SIZE(_storage), "invariant");
>> _storage is an array, so the original assert is definitionally true,
>> but it's harmless to verify, in case future maintenance makes it an
>> allocated array rather than inline.
>
> It is so statically true that it should not be a runtime check.
>
> My replacement check is not only implicitly checking that the variable is not NULL, it does so at compile time (ARRAY_SIZE will fail on NULL) and it does an even better runtime check that actually adds value.
Unfinished.
>> The size of _storage is definitionally SIZE, and _storage_index is
>> checked against SIZE a couple lines later, making the proposed
>> replacement just a duplicative check.
>
> I have no problem with duplicate checks, but if anything should be removed, it is the SIZE assert. My assert clearly shows that the access is okay even though someone would change the size of the array to not be of size SIZE. My assert can be understood without looking outside the method.
Unfinished.
>> So I think the proposed change should not be made.
>> Similarly at lines 65, 73, and 91.
> I strongly disagree. Each of my replacement asserts is strictly superior if I am not missing something. Not only that, but they might not compile on windows in the future depending on how we define NULL in the future.
>
> Most of the changes I have made are because of compile errors after my new NULL, not "just" because of cosmetic issues.
Unfinished.
>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
>> 1562 JVMCIENV->set_HotSpotStackFrameReference_localIsVirtual(hs_frame, 0);
>> This change to use literal 0 rather than NULL as a null pointer
>> constant does not seem like an improvement to me. I think this change
>> should not be made.
>> ------------------------------------------------------------------------------
>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
>> I looked at the changes from JVMCI_CHECK_0 to JVMCI_CHECK_NULL and
>> think they are correct, assuming a suitable definition of the latter.
>> (Though see below regarding that.)
>> ------------------------------------------------------------------------------
>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
>> 2572 JVMCIObjectArray current_array = NULL;
>> =>
>> 2572 JVMCIObjectArray current_array(NULL);
>> PATTERN CHANGE
>> ------------------------------------------------------------------------------
>> src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp
>> 267 if ((value) != 0) { \
>> Value is a char*; I don't think this change should be made.
>> ------------------------------------------------------------------------------
>> src/hotspot/share/jvmci/jvmciExceptions.hpp
>> 50 #define JVMCI_CHECK_NULL JVMCI_CHECK_(NULL)
>> =>
>> 50 #define JVMCI_CHECK_NULL JVMCI_CHECK_(0)
>> For platforms that warn about converting NULL to an integral value
>> this is a type safety regression. I think this change should not be
>> made.
>> ------------------------------------------------------------------------------
>> src/hotspot/share/jvmci/jvmciExceptions.hpp
>> 49 #define JVMCI_CHECK_0 JVMCI_CHECK_(0)
>> =>
>> 49 #define JVMCI_CHECK_0 JVMCI_CHECK_(NULL_WORD)
>> I kind of think this is an abuse of NULL_WORD. It also serves no real
>> purpose with the current definitions of NULL_WORD (though would with
>> the suggestion far above). If the goal is to have something that isn't
>> a null pointer constant so it can't be used in inappropriate
>> contexts (which is a reasonable goal), then replace "0" with "int(0)"
>> (and comment appropriately).
>
> Please read my conversation with David and see if you after that still have any issue with what I am proposing.
Unfinished.
>> src/hotspot/share/oops/access.hpp
>> 296 class ArrayAccess ...
>> How about adding the following private functions to this class, and
>> use them in the obvious places:
>> template<typename T> static const T* const_raw_null() { return NULL; }
>> template<typename T> static T* raw_null() { return NULL; }
>
> I am lacking c++ and template skills, but the problem here is that we need to make the type of the null pointer concrete for the template system. I fail to see how these template methods achieve this from static context and no parameters. The code fails to compile for me, and I am probably missing something, but I fail to see the idea behind you construction. The only type parameter is the return one, and that is the one we do not know, right?
>
> As another solution we could specialize the type explicitly at the call-site instead of in the delivered parameters (I tried that first), but that was IMO /much/ harder to read.
>
> Even if raw_null() does work (assuming failure from my side), the original is cleaner IMO.
I guess my suggestion left too much as an exercise for the reader. Here’s the
change I was contemplating, which seems to work fine:
diff -r 1fa1ec0e7048 -r 08ee68a3d2a4 src/hotspot/share/oops/access.hpp
--- a/src/hotspot/share/oops/access.hpp Sat Apr 18 20:19:45 2020 +0200
+++ b/src/hotspot/share/oops/access.hpp Sun Apr 19 21:08:43 2020 -0400
@@ -295,13 +295,17 @@
template <DecoratorSet decorators = DECORATORS_NONE>
class ArrayAccess: public HeapAccess<IS_ARRAY | decorators> {
typedef HeapAccess<IS_ARRAY | decorators> AccessT;
+
+ template<typename T> static const T* const_raw_null() { return NULL; }
+ template<typename T> static T* raw_null() { return NULL; }
+
public:
template <typename T>
static inline void arraycopy(arrayOop src_obj, size_t src_offset_in_bytes,
arrayOop dst_obj, size_t dst_offset_in_bytes,
size_t length) {
- AccessT::arraycopy(src_obj, src_offset_in_bytes, reinterpret_cast<const T*>(NULL),
- dst_obj, dst_offset_in_bytes, reinterpret_cast<T*>(NULL),
+ AccessT::arraycopy(src_obj, src_offset_in_bytes, const_raw_null<T>(),
+ dst_obj, dst_offset_in_bytes, raw_null<T>(),
length);
}
@@ -309,7 +313,7 @@
static inline void arraycopy_to_native(arrayOop src_obj, size_t src_offset_in_bytes,
T* dst,
size_t length) {
- AccessT::arraycopy(src_obj, src_offset_in_bytes, reinterpret_cast<const T*>(NULL),
+ AccessT::arraycopy(src_obj, src_offset_in_bytes, const_raw_null<T>(),
NULL, 0, dst,
length);
}
@@ -319,15 +323,15 @@
arrayOop dst_obj, size_t dst_offset_in_bytes,
size_t length) {
AccessT::arraycopy(NULL, 0, src,
- dst_obj, dst_offset_in_bytes, reinterpret_cast<T*>(NULL),
+ dst_obj, dst_offset_in_bytes, raw_null<T>(),
length);
}
static inline bool oop_arraycopy(arrayOop src_obj, size_t src_offset_in_bytes,
arrayOop dst_obj, size_t dst_offset_in_bytes,
size_t length) {
- return AccessT::oop_arraycopy(src_obj, src_offset_in_bytes, reinterpret_cast<const HeapWord*>(NULL),
- dst_obj, dst_offset_in_bytes, reinterpret_cast<HeapWord*>(NULL),
+ return AccessT::oop_arraycopy(src_obj, src_offset_in_bytes, const_raw_null<HeapWord>(),
+ dst_obj, dst_offset_in_bytes, raw_null<HeapWord>(),
length);
}
>> src/hotspot/share/oops/constantPool.cpp
>> 270 set_resolved_references(OopHandle(NULL));
>> 302 set_resolved_references(OopHandle(NULL));
>> 376 set_resolved_references(OopHandle(NULL));
>> PATTERN CHANGE
>> Similarly here:
>> src/hotspot/share/oops/cpCache.cpp
>> 766 set_resolved_references(OopHandle(NULL));
>> src/hotspot/share/oops/klass.cpp
>> 549 _java_mirror = OopHandle(NULL);
>> src/hotspot/share/prims/jni.cpp
>> 319 TempNewSymbol class_name(NULL);
>> src/hotspot/share/prims/jvm.cpp
>> 954 TempNewSymbol class_name(NULL);
>> src/hotspot/share/prims/methodHandles.cpp
>> 1311 TempNewSymbol name(NULL);
>> 1312 TempNewSymbol sig(NULL);
>
> I understand you have no problem with those changes, is that correct?
Not correct. See “PATTERN CHANGE”. Unfinished.
More information about the hotspot-dev
mailing list