RFR: 8169631: [JAXP] XALAN: transformation of XML via namespace-unaware SAX input yields a different result than namespace-unaware DOM input

Langer, Christoph christoph.langer at sap.com
Fri Nov 18 22:51:11 UTC 2016


Hi Joe,

thanks for the feedback.

I've now understood the testcase that you've sent over and the reason that it is reporting failure after my fix is that the output of its transform operation is rather correct now. And before it was wrong. :)
The test is comparing the actual result to a "golden" result file in the end and both of these were not looking healthy so far. The reason is that your test is using a namespace unaware SAX Parser as input. With the current JDK XALAN, you could already modify your smoketest to use a namespace aware parser.

E.g. replace lines 

82        // Use the JAXP way to get an XMLReader
83        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

with 

82        // Use the JAXP way to get an XMLReader
83        SAXParserFactory spf = SAXParserFactory.newInstance();
84        spf.setNamespaceAware(true);
85        XMLReader reader = spf.newSAXParser().getXMLReader();

...and you would already get correct results that also DOM input or Stream Input would yield.

So, are there other concerns/issues with this fix? Do you want me to include a transformation operation like the one that your SmokeTest does to TransformerTest which would illustrate the problem with namespace unaware SAX input data?

Best regards
Christoph

> -----Original Message-----
> From: Joe Wang [mailto:huizhe.wang at oracle.com]
> Sent: Freitag, 18. November 2016 05:53
> To: Langer, Christoph <christoph.langer at sap.com>
> Cc: core-libs-dev at openjdk.java.net
> Subject: Re: RFR: 8169631: [JAXP] XALAN: transformation of XML via
> namespace-unaware SAX input yields a different result than namespace-
> unaware DOM input
> 
> 
> 
> On 11/14/16, 11:43 PM, Langer, Christoph wrote:
> > Hi Joe,
> >
> > thanks for looking.
> >
> > Can you let me know which smoke test is failing? I didn't see issues so far - I
> was merely running the jtreg unittests for transformer.
> 
> I sent the test to your mailbox.
> >
> > I stepped back from replacing Vector with ArrayList for m_prefixMappings
> because the code is using methods indexOf() with a start index and setSize() for
> which ArrayList has no direct matchings. One could, for sure, add some other
> coding, e.g. use ArrayList's subList() method for the index based search - but I
> wouldn't want to run the risk of adding a regression here just because I
> modified the code and did not well test it. But if you insist, I could have another
> look.
> 
> Ok, that's fine. subList would do, but setSize may need a bit more work.
> 
> Best,
> Joe
> >
> > Best regards
> > Christoph
> >
> >> -----Original Message-----
> >> From: Joe Wang [mailto:huizhe.wang at oracle.com]
> >> Sent: Dienstag, 15. November 2016 03:23
> >> To: Langer, Christoph<christoph.langer at sap.com>
> >> Cc: core-libs-dev at openjdk.java.net
> >> Subject: Re: RFR: 8169631: [JAXP] XALAN: transformation of XML via
> >> namespace-unaware SAX input yields a different result than namespace-
> >> unaware DOM input
> >>
> >> Hi Christoph,
> >>
> >> Not all tests have finished yet, but there's at least one failure in the
> >> smoke test. I'll get to the details when I have time.
> >>
> >> Any reason why m_prefixMappings can not be replaced with ArrayList?
> >>
> >> Thanks,
> >> Joe
> >>
> >> On 11/14/16, 6:10 AM, Langer, Christoph wrote:
> >>> Hi,
> >>>
> >>> please review this fix for bug 8169631.
> >>>
> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8169631
> >>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8169631.0/
> >>>
> >>> When XALAN is handling namespace unaware input, it behaves differently
> >> while using SAX input compared to DOM input.
> >>> With both input source types, the class
> >> com.sun.org.apache.xml.internal.dtm.ref.sax2dtm.SAX2DTM2 converts SAX
> >> input into a DTM representation for processing by the XALAN transformer.
> Its
> >> method startElement takes URI, localname and qName as attribute. In case
> of
> >> missing feature namespaces, startElement and localname can be empty.
> >> However, the function uses the localname value for the call to
> >> m_expandedNameTable.getExpandedTypeID() and further processing. In the
> >> case where only qName has data, this leads to issues.
> >>> When using DOM input, the class
> >> com.sun.org.apache.xalan.internal.xsltc.trax.DOM2SAX converts the DOM
> input
> >> into SAX input. In the case of empty localname, it fills localname with qname
> >> data. See method getLocalName() [1], called by parse() [2].
> >>> When directly using SAX input, the SAX parser calls the startElement()
> >> function on XALAN's handler with empty uri and localname - which seems
> >> correct, as per the spec.
> >>> Both paths end up in SAX2DTM2's startElement(). So I suggest to change
> this
> >> method to handle the case when uri and localname are empty and then set
> >> qname as localname. Maybe one should even change DOM2SAX's
> >> getLocalName handling to not fill localname with qname in case it is empty
> >> after SAX2DTM was changed..
> >>> Generally, JavaDoc for SAXSource says that "Attempting to transform an
> input
> >> source that is not generated with a namespace-aware parser may result in
> >> errors." But why not fix some of these :)
> >>> Furthermore I did some cleanups in the code.
> >>>
> >>> Thanks and best regards
> >>> Christoph
> >>>
> >>> [1]
> >>
> http://hg.openjdk.java.net/jdk9/dev/jaxp/file/71558b38bad7/src/java.xml/shar
> >>
> e/classes/com/sun/org/apache/xalan/internal/xsltc/trax/DOM2SAX.java#l139
> >>> [2]
> >>
> http://hg.openjdk.java.net/jdk9/dev/jaxp/file/71558b38bad7/src/java.xml/shar
> >>
> e/classes/com/sun/org/apache/xalan/internal/xsltc/trax/DOM2SAX.java#l279
> >>> [3]
> >>
> https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/sax/SAXSource
> >> .html
> >>>


More information about the core-libs-dev mailing list