Sentinels in collections, was RE: Primitive Queue<any T> considerations
Rezaei, Mohammad A.
Mohammad.Rezaei at gs.com
Wed Nov 18 21:18:39 UTC 2015
It’s not “lack of efficiently expressing a nullable primitive”. It’s a need for one or more sentinel values. (Please note that I changed the subject because this is a very different concern from the original thread, where the return value required something special.) The sentinels in this thread are entirely internal and completely invisible to the user and the API. Sentinel values are also prevalent in plain object collections, where creating a sentinel is easy because reference equality can be guaranteed to be false for an object that’s visible only to the data structure. For example, UnifiedMap in GS collections has two object sentinels (NULL_KEY and CHAINED_KEY) as well as null (meaning empty). ConcurrentHashMap in GS collections has 4 sentinels (RESIZE_SENTINEL, RESIZED, RESIZING and null). TObjectHash from Trove uses two sentinels (REMOVED and FREE).
The checking against 0/1 is far faster than losing a byte (and possibly more due to alignment) for every entry or having a header or having a bit set, etc.
Sentinels are useful for many collections that use arrays to store stuff. Node based collections generally need one sentinel, the terminal node (and they cannot be implemented with values as proposed here due to the self-referential nature of such structures).
From: Vitaly Davidovich [mailto:vitalyd at gmail.com]
Sent: Wednesday, November 18, 2015 3:51 PM
To: Rezaei, Mohammad A. [Tech]
Cc: Richard Warburton; valhalla-dev at openjdk.java.net
Subject: Re: Sentinels in collections, was RE: Primitive Queue<any T> considerations
This may work for a set, but doesn't necessarily generalize to any data structure. Moreover, you're now checking the argument for being one of the two sentinels on all these operations. There are times when side data or out-of-band data is warranted, but those occasions should be dictated by things other than lack of efficiently expressing a nullable primitive.
On Wed, Nov 18, 2015 at 3:11 PM, Rezaei, Mohammad A. <Mohammad.Rezaei at gs.com<mailto:Mohammad.Rezaei at gs.com>> wrote:
Many primitive collections use sentinels *without* requiring removal of those in the collection. This is done by storing a bit of side information. For example, for a set of int, GS Collections uses 0 and 1 as sentinels (meaning empty and removed, respectively). Every call to methods like get/contains/add/etc first checks to see if the value is a sentinel, and if so, queries/updates the side data. For a set, the side data is tiny and trivial: there is a single byte that encode if the set contains 0, 1, both or none.
>From: valhalla-dev [mailto:valhalla-dev-bounces at openjdk.java.net<mailto:valhalla-dev-bounces at openjdk.java.net>] On Behalf Of
>Sent: Wednesday, November 18, 2015 2:31 PM
>To: Richard Warburton
>Cc: valhalla-dev at openjdk.java.net<mailto:valhalla-dev at openjdk.java.net>
>Subject: Re: Primitive Queue<any T> considerations
>Although I agree that sometimes a sentinel or even an entire range of
>values is available to indicate absence, I don't think it's the right
>answer to the question. In particular, if you're writing a general purpose
>collection, you cannot expect your users to remove one possible value from
>their universe to signal absence. The collections that do this today are
>simply working around lack of nullable primitives.
>On Wed, Nov 18, 2015 at 2:18 PM, Richard Warburton <
>richard.warburton at gmail.com<mailto:richard.warburton at gmail.com>> wrote:
>> I'm not sure about the default null-indicator because every time a queue is
>> > needed in our library, that can serve a source with unique choice of a
>> > default-null value. It is unlikely the users of the library want or can
>> > specify that all the time.
>> I appreciate that this is diverging directly form your language related
>> question, but it is pertinent to writing anyfied collections using
>> Valhalla. My experience using primitive specialised collections is that it
>> has always been possible to have a sentinel value, though obviously my
>> experience won't necessary reflect that of every single person in the Java
>> community. It isn't always appropriate for it to be 0, which is what I
>> think T.default is currently going to return for primitives and when I've
>> done primitive specialised collections before they allow you to specify the
>> missing value sentinel.
>> Have you got of an actual genuine use case where a sentinel value doesn't
>> work well, rather than a theoretical "hey, maybe you want to have a set
>> that contains every possible int?"
>> Richard Warburton
>> @RichardWarburto <http://twitter.com/richardwarburto>
More information about the valhalla-dev