RFR: 8249783: Simplify DerValue and DerInputStream [v2]
Weijun Wang
weijun at openjdk.java.net
Sat Sep 26 01:00:21 UTC 2020
On Sat, 19 Sep 2020 00:36:23 GMT, Valerie Peng <valeriep at openjdk.org> wrote:
>> Weijun Wang has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Enhance DerValue::getOctetString to be able to read multi-level constructed value.
>
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 62:
>
>> 60: final byte[] data;
>> 61: final int start;
>> 62: final int end;
>
> Well, end is really "start + length", the data range is from start to (end -1) (inclusive).
Will add some comments.
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 69:
>
>> 67: int mark;
>> 68:
>> 69: public DerInputStream(byte[] data, int start, int length, boolean allowBER) {
>
> Add javadoc for the new impl? Do we need to validate things like data != null, start and length have valid values? Or
> assuming that since all its callers are internal, no need to check.
I don't intend to add any check because callers are internal. I can add a comment.
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 100:
>
>> 98: * @return the read DerValue.
>> 99: * @throws IOException if a DerValue cannot be constructed starting from this position
>> 100: * because of byte shortage or encoding error.
>
> Add javadoc for the new impl?
OK.
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 90:
>
>> 88: }
>> 89:
>> 90: public byte[] toByteArray() {
>
> Add javadoc? Returns the remaining unread bytes? The name may lead people to expect the bytes returned are the one
> passing to the constructor.
Well. This method is sometimes used to read remaining bytes and sometimes used to read all (suppose none has been read
yet). I will clarify.
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 103:
>
>> 101: */
>> 102: public DerValue getDerValue() throws IOException {
>> 103: DerValue result = new DerValue(
>
> Check (this.end - this.pos > 0) before calling DerValue()?
new DerValue() would fail in this case.
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 139:
>
>> 137: // does not accept constructed OCTET STRING.
>> 138: DerValue v = getDerValue();
>> 139: if (v.tag != DerValue.tag_OctetString) {
>
> Only this method checks tag of the DerValue?
Yes, this is because `DerValue::getOctetString` allows constructed value but `DerInputStream::getOctetString` only
accepts primitive value. This is for compatibility.
All other methods will have tag checked inside the corresponding DerValue method. Do you prefer a fast fail?
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 207:
>
>> 205: }
>> 206:
>> 207: static int getLength(InputStream in) throws IOException {
>
> This and other getLength(...) methods seems out-of-place with the new implementation code?
Yes, I should do some refactoring. Both methods are internal (to these 2 classes).
> src/java.base/share/classes/sun/security/util/DerInputStream.java line 280:
>
>> 278: * a later call to <code>reset</code> will return here.
>> 279: */
>> 280: public void mark(int value) { mark = pos; }
>
> This seems very strange, having an int argument which is not used?
`ByteArrayInoutStream::mark` also does this. When the data is all there and no need to decide how many to remember,
this argument is useless. I'll rename it to dummy.
> src/java.base/share/classes/sun/security/util/DerValue.java line 163:
>
>> 161: // should call withTag() or data() instead (do not use the data field).
>> 162:
>> 163: public /*final*/ byte tag;
>
> Conceptually, the class is far from immutable if the tag is public and non-final. Any caller could have changed the tag
> value. Is it that we are keeping all public fields/methods the same for max compatibility-sake? It'd be nice if it can
> be real immutable.
Yes, I should have said that this class can be used as an immutable class if you do not use this and that. We will
decide if we can remove those usages and make this really immutable in the future.
> src/java.base/share/classes/sun/security/util/DerValue.java line 170:
>
>> 168:
>> 169: // Unsafe. Legacy. Never null.
>> 170: final public DerInputStream data;
>
> nit: public final for sake of consistency? If not meant to be used anymore, mark it clearly across all relevant methods
> with deprecated javadoc tag?
public for compatibility, and final because it won't need to change. I'll see how many `@deprecated` I can add. Or,
I'll see if I can move all will-be-deprecated code together.
> src/java.base/share/classes/sun/security/util/DerValue.java line 226:
>
>> 224: this.end = end;
>> 225: this.allowBER = allowBER;
>> 226: this.data = new DerInputStream(this);
>
> this.data used to just contain the "value" part of the DerValue, now it's the whole DerValue (including tag and length
> in addition to value). Is its usage still compatible?
The constructor `DerInputStream(DerValue)` has not read tag at all. I can rewrite it the last line to `new
DerInputStream(buffer, start, end - start, allowBER)`.
> src/java.base/share/classes/sun/security/util/DerValue.java line 284:
>
>> 282:
>> 283: /**
>> 284: * Parse an ASN.1/BER encoded datum. The entire encoding must hold exactly
>
> Are you intentionally put BER here because allowBER=true?
Yes. In fact I'm not quite confident on our usage of allowBER. In a lot of cases we are actually quite strict. Since
this code change is meant to be a refactoring and does not intend to fix many things, I don't intent to make many
behavior change.
> src/java.base/share/classes/sun/security/util/DerValue.java line 294:
>
>> 292:
>> 293: /**
>> 294: * Parse an ASN.1/BER encoded datum from a byte array.
>
> DER instead of BER?
There is allowBER, so BER.
> src/java.base/share/classes/sun/security/util/DerValue.java line 301:
>
>> 299: * @param allowBER whether BER is allowed
>> 300: * @param allowMore whether extra bytes are allowed after the encoded datum.
>> 301: * If true, {@code len} can be bigger than the length og
>
> typo: og =>of?
Oops.
> src/java.base/share/classes/sun/security/util/DerValue.java line 305:
>
>> 303: *
>> 304: * @throws IOException if it's an invalid encoding or there are extra bytes
>> 305: * after the encoded datum but {@code allowMore} is false.
>
> but => and or when?
and.
> src/java.base/share/classes/sun/security/util/DerValue.java line 318:
>
>> 316:
>> 317: int length;
>> 318: if (lenByte == (byte)0x80) {
>
> Add comment for this being the indefinite case?
Sure.
> src/java.base/share/classes/sun/security/util/DerValue.java line 322:
>
>> 320: } else if ((lenByte & 0x080) == 0x00) { // short form, 1 byte datum
>> 321: length = lenByte;
>> 322: } else { // long form or indefinite
>
> This should just the long form. Isn't the indefinite case covered by the first if-check?
Correct.
> src/java.base/share/classes/sun/security/util/DerValue.java line 325:
>
>> 323: lenByte &= 0x07f;
>> 324: if (lenByte == 0) {
>> 325: length = -1;
>
> This should never happen given the first if-check?
>
> There is a lot of duplicated code (line 318-348)with the various length-parsing methods in DerInputStream class. Have
> you considered consolidating them? It seems that you are keeping track of the pos value for the subsequent
> "Too-little"/"Too much" checks. Is "allowMore" really used? If "allowBER" is false, could "allowMore" be true?
Correct. I didn't really modify the old code much here. Will see if worth fixing.
allowMore is still useful for allowBER == true unless indefinite length is used. For indefinite length, our current
DerIndefLenConverter always consumes all available bytes. This is not perfect, let's see if we can enhance it later.
> src/java.base/share/classes/sun/security/util/DerValue.java line 446:
>
>> 444: }
>> 445:
>> 446: public final DerInputStream getData() {
>
> Mark this method as deprecated javadoc tag? Is it possible to just change this method impl with the new impl under
> data()? The data field is already public, can't we just update the impl of this method? Having the data field,
> getData() method and now data() looks very confusing.
There are a lot of classes calling this method and it's different from `data()`. `data()` returns a new stream always
pointing to the 1st sub-value in the content, but `getData()` returns the data field which could advance after a read
call. This means you can call `getData().getDerValue()` to read the 1st sub-value, and then `getData().getDerValue()`
again to read the 2nd. Quite unobvious but the actual behavior. I can add some comments. I named the new method
`data()` because I want the future nice-behavior method looks simple.
> src/java.base/share/classes/sun/security/util/DerValue.java line 584:
>
>> 582: }
>> 583: // TODO
>> 584: return new BigInteger(1, buffer, start, end - start).intValue();
>
> More info for what to do? - // TODO
> Enumerated is just integer, is there any spec/doc saying it's non-negative? All I can find is just integer.
Maybe that's what I meant? The positive thing?
I'll refactor `getBigIntegerInternal()` so this method will call it.
> src/java.base/share/classes/sun/security/util/DerValue.java line 670:
>
>> 668: }
>> 669: if (end == start) {
>> 670: throw new IOException("No padding");
>
> Same as above comment regarding "No Padding".
Yes.
> src/java.base/share/classes/sun/security/util/DerValue.java line 675:
>
>> 673: if (end == start + 1) {
>> 674: return new BitArray(0);
>> 675: } else {
>
> This is different from the existing impl in DerInputBuffer.getUnalignedBitString() which throws IOException if the
> numberOfPadBits is invalid. It's also different from the getBitString() impl in this PR where the numOfPadBits is still
> checked. Is this difference in behavior intentional?
No. I didn't intend to make any change. I'll investigate. Maybe I copied from an old version?
> src/java.base/share/classes/sun/security/util/DerValue.java line 680:
>
>> 678: throw new IOException("Invalid number of padding bits");
>> 679: }
>> 680: return new BitArray((end - start - 1) * 8 - numOfPadBits,
>
> << 3
Sure.
> src/java.base/share/classes/sun/security/util/DerValue.java line 768:
>
>> 766: */
>> 767: public String getUniversalString() throws IOException {
>> 768: return readStringInternal(tag_UniversalString, new UTF_32BE());
>
> It looks like UTF_32BE now is always supported? If yes, perhaps adding a constant for it just like for other charsets?
> Also the "return" javadoc tag for this method needs to be updated?
That's a public Java SE API and I don't intent to add one. I'll fix the javadoc.
> src/java.base/share/classes/sun/security/util/DerValue.java line 777:
>
>> 775: * if UTC Time is to be read.
>> 776: */
>> 777: private Date getTime(int len, boolean generalized) throws IOException {
>
> You have the Internal suffix on other private helper routines. Maybe add it here as well?
Good idea.
> src/java.base/share/classes/sun/security/util/DerValue.java line 976:
>
>> 974: throw new IOException("DER UTC Time length error");
>> 975: }
>> 976: }
>
> nit: add a javadoc for this like other public methods of this class.
> nit: move getNull() up before getTime() so all the time-related methods are together.
> The IOException messages in this method has UTC Time related words, maybe copy-n-paste error?
Ah, yes.
> src/java.base/share/classes/sun/security/util/DerValue.java line 991:
>
>> 989:
>> 990: data.pos = data.end; // Compatibility. Reach end.
>> 991: return getTime(end - start, false);
>
> Given that getTime() have access to all the fields, it seems redundant to have to specify (end-start) when calling it.
Good idea.
> src/java.base/share/classes/sun/security/util/DerValue.java line 1043:
>
>> 1041: @Override
>> 1042: public String toString() {
>> 1043: return String.format("DerValue(%02x, %s, %d, 5d)",
>
> 5d should be %d?
Ouch.
> src/java.base/share/classes/sun/security/util/DerValue.java line 1154:
>
>> 1152: * @return a new DerValue
>> 1153: */
>> 1154: public DerValue withTag(byte newTag) {
>
> Seems like a somewhat dangerous method. The value may not match the new tag? The caller is expected to know what it's
> doing?
I intend to use it to turn back IMPLICIT element to its original tag, otherwise getXyz() calls would fail because they
are checking the tag. In the old design, checking the tag (in DerValue) and reading the value (in DerInputBuffer) are
separated but now they are in a single method (in DerValue).
The caller pattern should always look like `withTag(INTEGER).getInteger()`.
> src/java.base/share/classes/sun/security/util/DerValue.java line 1173:
>
>> 1171:
>> 1172: DerValue[] subs(byte expectedTag, int startLen) throws IOException {
>> 1173: if (expectedTag != 0 && expectedTag != tag) {
>
> so if expectedTag == 0, then we don't check for tag match? Looks strange. If this is intentional, we should add comment
> to clarify why.
For IMPLICIT. Will comment.
> test/jdk/sun/security/util/DerValue/DeepOctets.java line 56:
>
>> 54: Utils.runAndCheckException(
>> 55: () -> new DerInputStream(input).getOctetString(),
>> 56: IOException.class);
>
> It seems that the existing impl already differs and you are just adding a regression test to highlight the difference,
> right?
The existing impl already differs, but `DerValue::getOctetString` can only read one level of constructed OCTET STRING.
My second commit can read arbitrary levels and hence this this regression test. There are two levels of 0x24 here.
-------------
PR: https://git.openjdk.java.net/jdk/pull/232
More information about the security-dev
mailing list