[icedtea-web] RFC: use tagsoup to try and parse malformed JNLP files

Jiri Vanek jvanek at redhat.com
Fri Apr 5 06:59:54 PDT 2013


pinging this to not let it die

-------- Original Message --------
Subject: RE: [icedtea-web] RFC: use tagsoup to try and parse malformed JNLP files
Date: Wed, 06 Jun 2012 15:12:17 -0400
From: Adam Domurad <adomurad at redhat.com>
To: distro-pkg-dev at openjdk.java.net

I have attached the patch I ended up with after applying the original
patch to the recent code, hopefully making further review easier.

I have also attached only the updated test changes from the same patch.

(Note that my changes to Makefile.am are dubious; I was just doing what
was necessary to test the patch.)

I'd appreciate it if another reviewer ran the tests, as I'm having trouble
running a few of them (in general). From what I see, though, this patch
passes more tests than normal, including two of the malformed XML unit
tests.

From what I've seen, the changes to the code look solid. I'd comment more,
but I'm not too sure of the impact of some of the changes.

> Hi,
>
> I have come across a number of JNLP files that are not valid XML. Netx
> cannot parse these files using an XML parser, and fails to run them. I
> spent some time looking for a solution and came across TagSoup[1]. The
> TagSoup library parses a malformed HTML document into a well-formed
> XML-like HTML document, but it works almost perfectly for our purposes too.
>
> The attached patch makes use of TagSoup for parsing input jnlp files.
>
> Parsing is currently implemented in two passes. In the first pass,
> TagSoup reads the "xml" (which can be malformed and hence not really
> XML) and outputs valid XML. In the second pass, Netx parses this valid
> XML with its own XML parser.
>
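
The two-pass idea described above can be sketched with JDK classes only.
This is a minimal illustration under stated assumptions, not the code from
the patch: the class name TwoPassJnlpSketch, the pluggable repair step, and
the toy escapeBareAmpersands function are all hypothetical, and the JDK DOM
parser stands in for Netx's own XML parser. In a real build, TagSoup would
sit where the repair step is.

```java
import java.io.StringReader;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TwoPassJnlpSketch {

    /** Second pass: strict parse that rejects anything not well-formed. */
    public static Element parseStrict(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML: " + e.getMessage(), e);
        }
    }

    /**
     * First pass (optional): run the hypothetical repair step, then hand the
     * result to the strict parser. With allowMalformed == false the input
     * goes straight to the strict parser.
     */
    public static Element parse(String input, boolean allowMalformed,
                                UnaryOperator<String> repair) {
        String xml = allowMalformed ? repair.apply(input) : input;
        return parseStrict(xml);
    }

    /** Toy stand-in for TagSoup: escapes ampersands that do not start an entity. */
    public static String escapeBareAmpersands(String s) {
        return s.replaceAll("&(?!(?:amp|lt|gt|quot|apos|#\\d+|#x[0-9a-fA-F]+);)", "&amp;");
    }

    public static void main(String[] args) {
        // The unescaped '&' in the href makes this input malformed XML.
        String broken = "<jnlp href=\"app.jnlp?a=1&b=2\"><information/></jnlp>";
        Element root = parse(broken, true, TwoPassJnlpSketch::escapeBareAmpersands);
        System.out.println(root.getTagName() + " " + root.getAttribute("href"));
        // prints: jnlp app.jnlp?a=1&b=2
    }
}
```

Calling parse(broken, false, ...) on the same input throws, which mirrors
the failure on invalid JNLP files that motivated the patch.
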
> The patch requires TagSoup as an optional dependency. To use TagSoup,
> run configure (--with-tagsoup can be used to point to a TagSoup jar). To
> not use TagSoup (even if it is installed), use --with-tagsoup=no.
>
> The patch also adds an additional command line option, -xml, to the
> javaws binary. This option can be used to force Netx to use the normal
> XML parser instead of TagSoup to parse the JNLP file.
>
> Any thoughts or comments?
>
> ChangeLog:
> 2011-01-10  Omair Majid  <omajid at redhat.com>
>
>      * Makefile.am: Add NETX_EXCLUDE_SRCS, NETX_DUMMY_CLASSPATH
>      (netx-source-files.txt): Selectively exclude some sources from
>      compilation.
>      (stamps/netx.stamp): Depend on netx-dummy.jar
>      (netx-dummy.jar): New target. Empty jar. Used so there is always at
>      least one class on the classpath.
>      ($(NETX_DIR)/launcher/%.o): Add classpath.
>      * NEWS: Update with fix.
>      * acinclude.m4: Add IT_CHECK_FOR_TAGSOUP.
>      * configure.ac: Call IT_CHECK_FOR_TAGSOUP.
>      * netx/net/sourceforge/jnlp/JNLPFile.java: Add new member
>      parserSettings.
>      (JNLPFile(URL)): Pass a ParserSettings object.
>      (JNLPFile(URL,boolean)): Refactored into...
>      (JNLPFile(URL,ParserSettings)): New method.
>      (JNLPFile(URL,Version,boolean)): Refactored into...
>      (JNLPFile(URL,Version,ParserSettings)): New method.
>      (JNLPFile(URL,Version,boolean,UpdatePolicy)): Refactored into...
>      (JNLPFile(URL,Version,ParserSettings,UpdatePolicy)): New method.
>      (JNLPFile(URL,String,Version,boolean,UpdatePolicy)): Refactored
>      into...
>      (JNLPFile(URL,String,Version,ParserSettings,UpdatePolicy)): New
>      method.
>      (JNLPFile(InputStream,boolean)): Refactored into...
>      (JNLPFile(InputStream,ParserSettings)): New method.
>      (getParserSettings): New method.
>      (parse(Node,boolean,URL)): Refactored into...
>      (parse(InputStream,URL)): New method. Invoke parser to get the root
>      node and then parse it.
>      * netx/net/sourceforge/jnlp/Launcher.java
>      (toFile): Use new ParserSettings object.
>      * netx/net/sourceforge/jnlp/Parser.java
>      (Parser(JNLPFile,URL,Node,boolean,boolean)): Refactored into...
>      (Parser(JNLPFile,URL,Node,ParserSettings)): New method.
>      (getRootNode): Implementation moved into XMLParser.getRootNode.
>      Selects the right subclass of XMLParser to use.
>      (getEncoding): Moved to XMLParser.
>      * netx/net/sourceforge/jnlp/ParserSettings.java: New file.
>      (ParserSettings): New method.
>      (ParserSettings(boolean,boolean,boolean)): New method.
>      (isExtensionAllowed): New method.
>      (isMalfromedXmlAllowed): New method.
>      (isStrict): New method.
>      * netx/net/sourceforge/jnlp/XMLParser.java
>      (getRootNode): New method. Contains implementation from
>      Parser.getRootNode.
>      (getEncoding): New method. Moved from Parser.
>      * netx/net/sourceforge/jnlp/MalformedXMLParser.java: New file.
>      (getRootNode): New method. Transform input into valid xml and
>      delegate to parent to parse it.
>      (xmlizeInputStream): New method. Read contents from an input stream
>      and transform it into valid xml.
>      * netx/net/sourceforge/jnlp/resources/Messages.properties: Add
>      BOXml.
>      * netx/net/sourceforge/jnlp/runtime/Boot.java: Add -xml option.
>      (getFile): Parse -xml option and create a new ParserSettings object
>      based on it.
>      * netx/net/sourceforge/jnlp/runtime/JNLPClassLoader.java
>      (getInstance(URL,String,Version,UpdatePolicy)): Refactored into...
>      (getInstance(URL,String,Version,ParserSettings,UpdatePolicy)): New
>      method.
>      (initializeExtensions): Use the same parser settings to parse the
>      extension as used in the original file.
>
> Cheers,
> Omair
>
> [1] http://home.ccil.org/~cowan/XML/tagsoup/
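
For readers skimming the ChangeLog, here is a rough sketch of how a
settings object with three boolean flags and the -xml option might fit
together. Only the three accessor names and the -xml flag come from the
message above. The class name ParserSettingsSketch, the fromArgs helper,
the constructor argument order, the accessor spelling isMalformedXmlAllowed,
and the default values are assumptions made for illustration and may not
match the patch.

```java
import java.util.Arrays;

public class ParserSettingsSketch {

    /** Immutable holder for the three parser flags named in the ChangeLog. */
    public static final class ParserSettings {
        private final boolean strict;
        private final boolean extensionAllowed;
        private final boolean malformedXmlAllowed;

        // Argument order is an assumption; the ChangeLog only gives the types.
        public ParserSettings(boolean strict, boolean extensionAllowed,
                              boolean malformedXmlAllowed) {
            this.strict = strict;
            this.extensionAllowed = extensionAllowed;
            this.malformedXmlAllowed = malformedXmlAllowed;
        }

        public boolean isStrict() { return strict; }
        public boolean isExtensionAllowed() { return extensionAllowed; }
        public boolean isMalformedXmlAllowed() { return malformedXmlAllowed; }
    }

    /**
     * Hypothetical helper: -xml forces the normal XML parser, so malformed
     * input is allowed only when the flag is absent. The other two defaults
     * (not strict, extensions allowed) are assumed.
     */
    public static ParserSettings fromArgs(String... args) {
        boolean forceXml = Arrays.asList(args).contains("-xml");
        return new ParserSettings(false, true, !forceXml);
    }

    public static void main(String[] args) {
        System.out.println(fromArgs("app.jnlp").isMalformedXmlAllowed());         // prints: true
        System.out.println(fromArgs("-xml", "app.jnlp").isMalformedXmlAllowed()); // prints: false
    }
}
```

Passing one settings object around, rather than separate booleans, is what
lets an extension be parsed with the same settings as the original file.
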

-------------- next part --------------
A non-text attachment was scrubbed...
Name: update_to_patch.patch
Type: text/x-patch
Size: 41913 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20130405/0e261339/update_to_patch.patch 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_changes.patch
Type: text/x-patch
Size: 9777 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20130405/0e261339/test_changes.patch 

