Sentinels in collections, was RE: Primitive Queue<any T> considerations
Rezaei, Mohammad A.
Mohammad.Rezaei at gs.com
Thu Nov 19 02:57:45 UTC 2015
The simple answer is, I don’t know, or rather, I have a few really bad ideas. I know what I would do in C, but that’s the realm of headerless-structs, not immutable-value-types.
Here is my current list of bad ideas:
1) For pure value types (no references), it’s safe to fill the bits with arbitrary values, so long as that doesn’t leak outside the collection. Let me construct any pure user defined value type with a single byte (I don’t think I need more than 256 sentinels), fill the rest with zero. This can be extended to value types that have at least 1 byte of primitive storage, but doesn’t work for value types that are just references.
a. Additionally, let me store these not as class members, but per-reified-type-static members.
2) Some sort of union type. I think this can be formalized better than (1), but I’m not sure all safety can be guaranteed without overhead. It inherently also requires a subject that I haven’t seen discussed: casting and maybe its uglier cousin, coercion. Primitives have both and that’ll be an interesting story for user defined value types.
3) Give up and use one of the following:
a. Flat, but fat: define my own value type that’s a byte+V. Decent cache line behavior, but significant memory overhead and alignment concerns.
b. Flat, reasonably slim, but painful: define my own value type that’s a byte + 4 values (to get up to 4 sentinels, 2 bits per value). Address calculation is a pain, but addressing the individual values is an exercise in patience (and 4x code bloat).
c. Unflat, fast-ish: keep a separate bit set (with as many bits per value as I need sentinel types).
d. Unflat, compact: since most of the time the special sentinels (e.g. REMOVED) are sparse, use a sparse side structure instead of a bitset. I think this will be quite slow and the sparseness assumption better be right, or it’ll also be bigger than a bitset.
I’ve been shaking my head as I wrote this down and I’m questioning my sanity for mailing this out, but there you have it.
From: Vitaly Davidovich [mailto:vitalyd at gmail.com]
Sent: Wednesday, November 18, 2015 9:13 PM
To: Rezaei, Mohammad A. [Tech]
Cc: valhalla-dev at openjdk.java.net; Richard Warburton
Subject: RE: Sentinels in collections, was RE: Primitive Queue<any T> considerations
How do you imagine multiple sentinels would/could work in a generic collection? Specialization for some known types? How would user defined value types fit there?
sent from my phone
On Nov 18, 2015 8:15 PM, "Rezaei, Mohammad A." <Mohammad.Rezaei at gs.com<mailto:Mohammad.Rezaei at gs.com>> wrote:
Richard asked for examples of collections that had to have sentinels and I provided some, but realizing that these are internal ones, changed the subject.
From a Valhalla perspective, the fact that one cannot manufacture a sentinel besides T.default is a real problem for collections that need more than one sentinel. The current types (primitives, references) don’t have this limitation. (This is also not a problem in languages like C where I can cast a struct to char* and fill it with arbitrary bit patterns).
From: Vitaly Davidovich [mailto:vitalyd at gmail.com<mailto:vitalyd at gmail.com>]
Sent: Wednesday, November 18, 2015 6:32 PM
To: Richard Warburton
Cc: valhalla-dev at openjdk.java.net<mailto:valhalla-dev at openjdk.java.net>; Rezaei, Mohammad A. [Tech]
Subject: Re: Sentinels in collections, was RE: Primitive Queue<any T> considerations
I'm still unclear on how we ended up discussing internal sentinels that are effectively implementation details. I don't think there's any debate on this part.
I do, however, disagree on sentinels that leak out of the implementation and require user to select a magic number. And I say this as someone who constantly exploits domain specific constraints on values to optimize code. In particular, in a world where optional<value_type> is cheap asking user to declare a magic number is unfortunate; this is required today, but it's a language limitation. I'd hate to have that perpetuated once better language features are available.
sent from my phone
On Nov 18, 2015 6:19 PM, "Richard Warburton" <richard.warburton at gmail.com<mailto:richard.warburton at gmail.com>> wrote:
Agreed, so maybe I misinterpreted Richard’s email (I tried to mitigate by changing the subject). Richard, which type of sentinel did you mean?
Well I was originally thinking in terms of the Queue interface itself, which needed a sentinel taken out of the user's domain. I don't think its a problem to expand the discussion though and you're certainly right to observe that there are two different kinds of sentinel.
More information about the valhalla-dev