[PATCH] improve javadoc in TreeSet#add api documentation

Stuart Marks stuart.marks at oracle.com
Fri Dec 21 22:34:16 UTC 2018


Hi Kishor,

I took a look at the patch you sent. It looks like it was created against an 
older repository, paerhaps jdk8u-dev. We don't make spec changes to the older 
releases. Please base your changes on the latest mainline repository, which is

     http://hg.openjdk.java.net/jdk/jdk/

Notably, the source file hierarchy has been reorganized since jdk8u. Even after 
accounting for filenames, the patch didn't apply, because conflicting changes 
had been introduced, mainly changes from <tt> to {@code} markup. The latter is 
now preferred for introducing code fragments in javadoc. Also use <p> instead of 
<br> to denote paragraphs, consistent with the rest of the javadoc in this file.

(Tip: for attaching files that are primarily text -- including patches -- use a 
.txt suffix. While non-txt attachments often make it through email, they tend to 
get corrupted in the archives. Unfortunately, few people use .txt suffixes, so 
there are lots of corrupted attachments in the OpenJDK mail archives.)

**

All that aside, the patch was basically a replacement for the first three 
paragraphs of the SortedSet class doc and as such wasn't difficult to deal with. 
Comments on the new text follow.

>  * A <b>comparison method</b> is either {@link Comparator#compare} or
>  * {@link Comparable#compareTo}, implemented by elements of this {@link Set}.<br/><br/>

The rule for javadoc is that the first sentence (actually sentence fragment) of 
a doc comment is considered as summary text for the class and is pulled into 
various index pages. So, keep the initial sentence fragment as it is.

It's indeed good practice to introduce a term's definition ("comparison method") 
before its first use, but it can't displace the summary sentence.

>  * A {@link Set} that additionally provides <i>ordering</i> on its elements, using

The original text said "total ordering". Not sure why this is changed simply to 
"ordering". The term "total ordering" has a precise definition and is important 
for this specification.

>  * comparison method. Client <b>must</b> provide comparison method either by making set
>  * elements {@link Comparable} or providing custom {@link Comparator} while set creation
>  * eg, {@linkplain TreeSet#TreeSet(Comparator)}. Furthermore, all set elements must be

It's incorrect that the *client must provide* a Comparator or use Comparable 
elements. The original wording said that a Comparator is "typically provided" at 
construction time. The JDK implementations of SortedSet such as TreeSet and 
ConcurrentSkipListSet allow for this. However, it's reasonable for a subclass to 
provide its own comparison method, or for it to provide some other means 
(possibly implicitly) of selecting a comparison method, so it's not the case 
that the client must provide the comparison method.

For the definition of "comparison method" it would be nice to define it as

     e1.compareTo(e2)

if the elements are Comparable, or as

     compare(e1, e2)

if a Comparator is in use. That way, these method names need not be repeated 
later, as they are in the original text.

>  * <i>mutually comparable</i>, that is comparison method <b>must</b> not throw
>  * <tt>ClassCastException</tt> for any two elements. Several additional operations are
>  * provided to take advantage of the ordering. (This interface is the set analogue of
>  * {@link SortedMap}.) <br/><br/>

The sentence from the original text,

     The set's iterator will traverse the set in ascending element order.

seems to have been deleted. This is a very important part of the specification! 
It must remain.

Another sentence from the original, regarding mutual comparability,

     Attempts to violate this restriction will cause the
     offending method or constructor invocation to throw a
     {@code ClassCastException}.

has also been deleted. This is important too. In this case the issue is what 
happens if the elements aren't mutually comparable. Without this assertion, a 
SortedSet implementation would be free to ignore ClassCastExceptions. This 
sentence is important because it requires that proper SortedSet implementations 
propagate these exceptions to the caller.

The "Attempts to violate" wording from the original is a bit odd, though. Not 
only is it possible to attempt to violate the restriction, such attempts will 
always succeed! I'd reword this to say "Violations of this restriction...."

>  * Unlike other {@link Set} implementations, eligibility of an element is decided by
>  * comparison method and not using {@link Object#equals}.

OK, but "eligibility" needs to be defined.

A Set has the invariant that no has no pair of elements e1 and e2 such that 
e1.equals(e2). A similar statement should be made here for SortedSet, except 
using the comparison method instead of equals().

>  * The equals <b>must</b>
>  * be consistent with comparison method, ie if elementA is equals to elementB (according
>  * to equals), comparison method must return 0 (zero) for them and vice versa. If two
>  * elements are not equal (according to equals method), but returns 0 (zero) when
>  * compared (using comparison method) will be considered equal.<br/><br/>

This is unfortunately not correct at all.

It's NOT a requirement that the comparison method be consistent with equals. You 
have to read the original text carefully to understand what it's saying. It does 
not say,

     Note that the ordering maintained by a sorted set...must be
     consistent with equals.

Instead, it says,

     Note that the ordering maintained by a sorted set...must be
     consistent with equals if the sorted set is to correctly implement
     the Set interface.

It's not terribly clear what this means -- which is one of the reasons this 
specification needs work -- but it's clear that consistency with equals is a 
conditional, not absolute requirement. The original spec attempts to clarify 
this (not very successfully, in my opinion) by later stating:

     The behavior of a sorted set <i>is</i> well-defined even if its
     ordering is inconsistent with equals; it just fails to obey the general
     contract of the {@code Set} interface.

There are a bunch of things going on here all at once. Let me try to unpack them.


1. Set has a well-defined behavior based on equals()

2. SortedSet has a well-defined behavior based on its comparison method,
    assuming the comparison method itself is well-behaved

3. the well-behavedness of a comparison method is defined in the Comparable
    and Comparator interface specifications

4. a comparison method is consistent with equals iff it returns zero when
    equals() returns true

5. a well-behaved comparison method need not be consistent with equals

6. it is strongly recommended that comparison methods be consistent with equals


7. a SortedSet conforms to the general contract of Set iff the SortedSet's
    comparison method is consistent with equals


(Not all of this needs to be said in this specification; some of it is in the 
Set specification and some in the Comparator/Comparable specifications)

Note that it is NOT AN ERROR for a comparison method to be inconsistent with equals.

But what are the consequences of a comparison method being inconsistent with 
equals? If this occurs, the SortedSet still has well-defined and correct 
behavior according to its own specification, but it violates the specification 
requirements imposed by its superinterface Set.

(Briefly restated, if a SortedSet uses a comparison method that is inconsistent 
with equals, it is not Liskov-substitutable for a Set.)

That's what this section is trying to say. It's really trying to thread a 
needle. It's describing a situation that is unusual and that might result in 
unexpected behavior, but one that is not incorrect and one that isn't illegal.

**

So, what next? You probably have to start over, given that you need to switch 
over to work against the mainline source base anyway. I'd suggest starting from 
the existing text and working on a single issue, such as the definition of 
"comparison method". Work a definition of this into the first paragraph, and 
then edit the subsequent paragraphs to use "comparison method" instead of the 
clumsy compareTo-or-compare text that occurs later on.

This will help set up for subsequent revisions, including the "consistent with 
equals" discussion and the definition of eligibility for set membership.

Thanks,

s'marks



More information about the core-libs-dev mailing list