RFR: JDK-8153781 Issue in XMLScanner: EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping large DOCTYPE section with CRLF at wrong place
huizhe wang
huizhe.wang at oracle.com
Tue Apr 12 17:53:32 UTC 2016
Also, EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET was the wrong
message ID. It would be good to change it to DoctypedeclNotClosed and
add a message to XMLMessages.properties right before
DoctypedeclUnterminated, something like the following:
DoctypedeclNotClosed = The document type declaration for root element
type \"{0}\" must be closed with '']''.
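For reference, here is a minimal sketch of how such an entry would render, assuming the message text is passed through java.text.MessageFormat (which is why the quotes around the bracket are doubled). The class name, the render helper and the sample element name "root" are made up for illustration and are not part of the webrev.

```java
import java.text.MessageFormat;

public class DoctypedeclNotClosedDemo {
    // The pattern as it would look once loaded from the properties file.
    // In a MessageFormat pattern, '' is the escape for one literal single quote.
    static final String PATTERN =
        "The document type declaration for root element type \"{0}\" "
        + "must be closed with '']''.";

    // Hypothetical helper: substitutes the root element name for {0}.
    static String render(String rootElement) {
        return MessageFormat.format(PATTERN, rootElement);
    }

    public static void main(String[] args) {
        System.out.println(render("root"));
    }
}
```

With a single quote on each side instead of the doubled form, MessageFormat would treat the quotes as delimiters of a quoted section and drop them from the output.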
Thanks,
Joe
On 4/11/2016 5:49 PM, huizhe wang wrote:
>
> On 4/7/2016 1:45 PM, Langer, Christoph wrote:
>> Hi,
>>
>>
>>
>> I've run into a peculiar issue with Xerces.
>>
>>
>>
>> The problem occurs when a DTD is to be skipped, the DTD is larger
>> than the buffer of the current entity, and a CRLF sequence occurs
>> just one char before the buffer end.
>>
>>
>>
>> The reason is that method skipDTD of class XMLDTDScannerImpl (about
>> line 389) calls XMLEntityScanner.scanData() to scan for the next
>> occurrence of ']'. The scanData method might return true, which
>> indicates that the delimiter ']' was not found yet and more data
>> remains to be scanned. Other users of scanData handle this by calling
>> the method in a loop until it returns false or some other condition
>> occurs. So I've fixed skipDTD to do the same as the other callers of
>> scanData.
>
> This part of the change looks good.
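
The calling pattern discussed above can be sketched roughly as follows. This is a self-contained illustration, not the Xerces code: ChunkScanner is a hypothetical stand-in for the entity scanner, its scanData only mimics the "true means delimiter not found yet, more data to scan" contract described in the mail, and the buffer size and input are invented.

```java
public class ScanLoopSketch {
    /** Hypothetical stand-in for a scanner that reads through a small buffer. */
    static class ChunkScanner {
        private final String input;
        private final int bufferSize;
        private int pos = 0;
        boolean delimiterFound = false;

        ChunkScanner(String input, int bufferSize) {
            this.input = input;
            this.bufferSize = bufferSize;
        }

        /**
         * Consumes at most one buffer's worth of input, stopping after the
         * delimiter if it is seen. Returns true when the delimiter was not
         * found and more input remains, false otherwise.
         */
        boolean scanData(char delimiter, StringBuilder out) {
            int end = Math.min(pos + bufferSize, input.length());
            while (pos < end) {
                char c = input.charAt(pos++);
                if (c == delimiter) {
                    delimiterFound = true;
                    return false;
                }
                out.append(c);
            }
            return pos < input.length();
        }
    }

    /** Keeps calling scanData until it reports that no more scanning is needed. */
    static boolean skipTo(ChunkScanner scanner, char delimiter) {
        StringBuilder skipped = new StringBuilder();
        boolean more;
        do {
            skipped.setLength(0); // skipped content is discarded
            more = scanner.scanData(delimiter, skipped);
        } while (more);
        return scanner.delimiterFound;
    }

    public static void main(String[] args) {
        String dtd = "<!ELEMENT a EMPTY>\r\n]>";
        System.out.println(skipTo(new ChunkScanner(dtd, 8), ']'));
    }
}
```

A single scanData call on input longer than the buffer returns true without having seen the delimiter, which is the situation a caller without the loop would misreport as a missing ']'.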
>>
>>
>>
>> Nevertheless, the scanData method would usually load more data when
>> it is at the end of the buffer. But in the special case when CRLF is
>> found at the end of buffer - 1, scanData would just return true. So I
>> also removed that check at line 1374 in XMLEntityScanner. Do you see
>> any problem with that? Is there any reason behind it which I'm
>> overlooking?
>
> No need to remove this after the above change. The parser needs to
> retain what's in the xml, e.g., not removing new lines.
>>
>> Furthermore I took the chance for further little cleanups. I've added
>> the new copyright header to the files... is that the correct one?
>
> Yes, that's the right license header. However,
>>
>>
>> I also aligned the calls to invokeListeners(position) in
>> XMLEntityScanner to always pass the actual position from which the
>> load is started. Do you think this is correct?
>
> Yes.
>>
>>
>>
>> Here is the bug:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8153781
>>
>>
>>
>> Here is the webrev:
>>
>> http://cr.openjdk.java.net/~clanger/webrevs/8153781.0/
>>
>>
>>
>> Please give me some comments before I finalize my change including a
>> jtreg testcase.
>
> It would be better if you had included the testcase so that the review
> can be done together with the code change.
>
> Thanks,
> Joe
>
>>
>>
>>
>> Thanks & Best regards
>>
>> Christoph
>>
>>
>>
>