RFR: JDK-8153781 Issue in XMLScanner: EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping large DOCTYPE section with CRLF at wrong place

Tue Apr 12 17:53:32 UTC 2016

Also, EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET was a wrong msg 
id. It would be good to change that to DoctypedeclNotClosed and add a 
message to XMLMessages.properties right before DoctypedeclUnterminated, 
sth. like the following:

DoctypedeclNotClosed = The document type declaration for root element 
type \"{0}\" must be closed with '']''.

Thanks,
Joe

On 4/11/2016 5:49 PM, huizhe wang wrote:
>
> On 4/7/2016 1:45 PM, Langer, Christoph wrote:
>> Hi,
>>
>>
>>
>> I've run into a peculiar issue with Xerces.
>>
>>
>>
>> The problem is happening when a DTD shall be skipped, the DTD is 
>> larger than the buffer of the current entity and a CRLF sequence 
>> occurs just one char before the buffer end.
>>
>>
>>
>> The reason is that method skipDTD of class XMLDTDScannerImpl (about 
>> line 389) calls XMLEntityScanner.scanData() to scan for the next 
>> occurrence of ']'. The scanData method might return true which 
>> indicates that the delimiter ']' was not found yet and more data is 
>> to scan. Other users of scanData would handle this and call this 
>> method in a loop until it returns false or some other condition 
>> happens. So I've fixed that place like at the other callers of scanData.
>
> This part of the change looks good.
>>
>>
>>
>> Nevertheless, the scanData method would usually load more data when 
>> it is at the end of the buffer. But in the special case when CRLF is 
>> found at the end of buffer - 1, scanData would just return true. So I 
>> also removed that check at line 1374 in XMLEntityScanner. Do you see 
>> any problem with that? Is there any reason behind it which I'm 
>> overseeing?
>
> No need to remove this after the above change. The parser needs to 
> retain what's in the xml, e.g., not removing new lines.
>>
>> Furthermore I took the chance for further little cleanups. I've added 
>> the new copyright header to the files... is that the correct one?
>
> Yes, that's the right license header. However,
>>
>>
>> I also aligned the calls to invokeListeners(position) in 
>> XMLEntityScanner to always pass the actual position from which the 
>> load is started. Do you think this is correct?
>
> Yes.
>>
>>
>>
>> Here is the bug:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8153781
>>
>>
>>
>> Here is the webrev:
>>
>> http://cr.openjdk.java.net/~clanger/webrevs/8153781.0/
>>
>>
>>
>> Please give me some comments before I finalize my change including a 
>> jtreg testcase.
>
> It would be better if you had included the testcase so that the review 
> can be done together with the code change.
>
> Thanks,
> Joe
>
>>
>>
>>
>> Thanks & Best regards
>>
>> Christoph
>>
>>
>>
>