[icedtea-web] RFC: use tagsoup to try and parse malformed JNLP files
Dr Andrew John Hughes
ahughes at redhat.com
Tue Jan 11 06:03:58 PST 2011
On 19:10 Mon 10 Jan , Omair Majid wrote:
> On 01/10/2011 05:33 PM, Dr Andrew John Hughes wrote:
> > On 16:39 Mon 10 Jan , Omair Majid wrote:
> >> Hi,
> >>
> >> I have come across a number of JNLP files that are not valid XML. Netx
> >> cannot parse these files using an XML parser, and fails to run them. I
> >> spent some time looking for a solution and came across TagSoup[1]. The
> >> TagSoup library parses a malformed HTML document into a well-formed
> >> xml-like HTML document, but it works almost perfectly for our purposes too.
> >>
> >> The attached patch makes use of TagSoup for parsing input jnlp files.
> >>
> >> Parsing is currently implemented in two passes. In the first pass,
> >> TagSoup reads the "xml" (which can be malformed and hence not really
> >> xml), and outputs valid XML. Netx then takes this valid XML and uses its
> >> own XML parser to parse the file.
> >>
> >> The patch requires TagSoup as an optional dependency. To use TagSoup,
> >> run configure (--with-tagsoup can be used to point to a TagSoup jar). To
> >> not use TagSoup (even if it is installed), use --with-tagsoup=no
> >>
> >> The patch also adds an additional command line option, -xml, to the
> >> javaws binary. This option can be used to force Netx to use the normal
> >> xml parser instead of TagSoup to parse the jnlp file.
> >>
> >> Any thoughts or comments?
> >>
> >> ChangeLog:
> >> 2011-01-10 Omair Majid<omajid at redhat.com>
> >>
> >> * Makefile.am: Add NETX_EXCLUDE_SRCS, NETX_DUMMY_CLASSPATH
> >> (netx-source-files.txt): Selectively exclude some sources from
> >> compilation.
> >> (stamps/netx.stamp): Depend on netx-dummy.jar
> >> (netx-dummy.jar): New target. Empty jar. Used so there is always at
> >> least one class on the classpath.
> >> ($(NETX_DIR)/launcher/%.o): Add classpath.
> >> * NEWS: Update with fix.
> >> * acinclude.m4: Add IT_CHECK_FOR_TAGSOUP.
> >> * configure.ac: Call IT_CHECK_FOR_TAGSOUP.
> >> * netx/net/sourceforge/jnlp/JNLPFile.java: Add new member
> >> parserSettings.
> >> (JNLPFile(URL)): Pass a ParserSettings object.
> >> (JNLPFile(URL,boolean)): Refactored into...
> >> (JNLPFile(URL,ParserSettings)): New method.
> >> (JNLPFile(URL,Version,boolean)): Refactored into...
> >> (JNLPFile(URL,Version,ParserSettings)): New method.
> >> (JNLPFile(URL,Version,boolean,UpdatePolicy)): Refactored into...
> >> (JNLPFile(URL,Version,ParserSettings,UpdatePolicy)): New method.
> >> (JNLPFile(URL,String,Version,boolean,UpdatePolicy)): Refactored
> >> into...
> >> (JNLPFile(URL,String,Version,ParserSettings,UpdatePolicy)): New
> >> method.
> >> (JNLPFile(InputStream,boolean)): Refactored into...
> >> (JNLPFile(InputStream,ParserSettings)): New method.
> >> (getParserSettings): New method.
> >> (parse(Node,boolean,URL)): Refactored into...
> >> (parse(InputStream,URL)): New method. Invoke parser to get the root
> >> node and then parse it.
> >> * netx/net/sourceforge/jnlp/Launcher.java
> >> (toFile): Use new ParserSettings object.
> >> * netx/net/sourceforge/jnlp/Parser.java
> >> (Parser(JNLPFile,URL,Node,boolean,boolean)): Refactored into...
> >> (Parser(JNLPFile,URL,Node,ParserSettings)): New method.
> >> (getRootNode): Implementation moved into XMLParser.getRootNode.
> >> Selects the right subclass of XMLParser to use.
> >> (getEncoding): Moved to XMLParser.
> >> * netx/net/sourceforge/jnlp/ParserSettings.java: New file.
> >> (ParserSettings): New method.
> >> (ParserSettings(boolean,boolean,boolean)): New method.
> >> (isExtensionAllowed): New method.
> >> (isMalfromedXmlAllowed): New method.
> >> (isStrict): New method.
> >> * netx/net/sourceforge/jnlp/XMLParser.java
> >> (getRootNode): New method. Contains implementation from
> >> Parser.getRootNode.
> >> (getEncoding): New method. Moved from Parser.
> >> * netx/net/sourceforge/jnlp/MalformedXMLParser.java: New file.
> >> (getRootNode): New method. Transform input into valid xml and
> >> delegate to parent to parse it.
> >> (xmlizeInputStream): New method. Read contents from an input stream
> >> and transform it into valid xml.
> >> * netx/net/sourceforge/jnlp/resources/Messages.properties: Add
> >> BOXml.
> >> * netx/net/sourceforge/jnlp/runtime/Boot.java: Add -xml option.
> >> (getFile): Parse -xml option and create a new ParserSettings object
> >> based on it.
> >> * netx/net/sourceforge/jnlp/runtime/JNLPClassLoader.java
> >> (getInstance(URL,String,Version,UpdatePolicy)): Refactored into...
> >> (getInstance(URL,String,Version,ParserSettings,UpdatePolicy)): New
> >> method.
> >> (initializeExtensions): Use the same parser settings to parse the
> >> extension as used in the original file.
> >>
> >> Cheers,
> >> Omair
> >>
> >> [1] http://home.ccil.org/~cowan/XML/tagsoup/
> >
> > I've just looked at the build changes. I'll leave someone with better knowledge
> > of the source code to look at those changes.
> >
>
> Thanks for looking over the changes so quickly!
>
> > With Makefile.am, I don't see why NETX_DUMMY_CLASSPATH is needed or the additional
> > rule that creates a JAR file. Neither do you need to set NETX_EXCLUDE_SRCS to empty;
> > this is the default.
> >
>
> Automake complains if a variable is not set before using += :
>
> NETX_EXCLUDE_SRCS must be set with `=' before using `+='
>
Yeah, so we just use '=' now :-)
> > if HAVE_TAGSOUP
> > NETX_CLASSPATH_ARG=-classpath $(TAGSOUP_JAR)
> > NETX_LAUNCHER_ARG="-Xbootclasspath/a:$(TAGSOUP_JAR)"
> > else
> > NETX_EXCLUDE_SRCS+=net.sourceforge.jnlp.MalformedXMLParser.java
> > endif
> >
> > would work fine and you can drop the netx-dummy.jar rule.
> >
>
> Thanks for the idea. What I wanted to do (a little prematurely, I
> suppose) was to make sure that more dependencies could be added in the
> future (with their own configure flags, if necessary) without changing
> the code too much. I also wanted all build code-paths to be as close as
> possible. Which is why I wanted to always have a classpath for
> netx-building (even if it was effectively blank using netx-dummy.jar)
> But my approach just makes the Makefile look like a mess.
>
Yeah, I guessed your motivation. I'm just not sure it's worth bending over
backwards to accommodate it. I guess we can scratch our heads over a good
solution should we need a second dependency.
> > Should we really be putting tagsoup on the bootclasspath? What's wrong with the classpath?
> >
>
> I have tested it out now with classpath and it looks like the javaws
> launcher does not like it:
> $ javaws XEtchedButtonDemo.jnlp
> Unrecognized option: -classpath /usr/share/java/tagsoup.jar
> Could not create the Java virtual machine.
>
> There is probably a way around this, I will see if I can find it.
>
After writing the last reply, it also came to mind that setting this might
cause issues with a classpath passed to javaws (if that's possible).
So it needs some testing. I'm just wary that tagsoup includes unknown code and
it's a bit dangerous to put it on the privileged bootclasspath. Then again,
I'm not sure any of javaws should be on the bootclasspath.
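For anyone who wants to check where a given class actually ends up, here is a
minimal Java sketch. It relies on Class.getClassLoader() returning null for
classes defined by the bootstrap loader, which is how HotSpot/OpenJDK behave.
LoaderCheck and isBootstrapLoaded are names made up for illustration; they are
not part of the patch.

```java
// Hypothetical helper, not part of the patch: reports whether a class
// was defined by the bootstrap class loader.
class LoaderCheck {

    // On typical JVMs, getClassLoader() returns null for classes that
    // were loaded from the boot class path.
    static boolean isBootstrapLoaded(Class<?> c) {
        return c.getClassLoader() == null;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name (for example org.ccil.cowan.tagsoup.Parser,
        // if that jar is visible) to see which loader defined it.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> c = Class.forName(name);
        System.out.println(name + " bootstrap-loaded: " + isBootstrapLoaded(c));
    }
}
```

Running this with the tagsoup class name under both launcher configurations
would show whether the jar is being picked up from the boot class path or from
the ordinary classpath.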
> > As to excluding the file, have you tested this? Are you sure no other Java files pull
> > that class in?
> >
>
> Yup. This is one code path I made sure to test. MalformedXMLParser is a
> new file I added in this patch. The class is never used directly. Only
> net.sourceforge.jnlp.Parser uses it, and that too through reflection.
> Building (and running) without tagsoup works just fine.
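The load-by-name approach described here can be sketched in isolation as
follows. This is only an illustration of the pattern, under the assumption
that the optional class may be missing at run time; BasicParser,
OptionalLoader and example.MissingParser are hypothetical names, and the real
logic lives in Parser.getRootNode in the patch.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the parser that is always compiled in.
class BasicParser {
    public String parse(String input) {
        return "basic:" + input;
    }
}

// Hypothetical sketch of loading an optional class by name and falling
// back to a class that is known to exist.
class OptionalLoader {

    static Object newInstanceOrFallback(String preferred, Class<?> fallback) {
        Class<?> klass;
        try {
            klass = Class.forName(preferred);
        } catch (ClassNotFoundException e) {
            // The optional class was excluded from the build.
            klass = fallback;
        }
        try {
            return klass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + klass, e);
        }
    }

    static String parseWith(String preferred, String input) {
        Object parser = newInstanceOrFallback(preferred, BasicParser.class);
        try {
            // No compile-time reference to the optional class is needed.
            Method m = parser.getClass().getMethod("parse", String.class);
            return (String) m.invoke(parser, input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // example.MissingParser does not exist, so the fallback is used.
        System.out.println(parseWith("example.MissingParser", "<jnlp>"));
    }
}
```

Because nothing references the optional class at compile time, the build
succeeds whether or not that source file is included.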
>
> > For configure, the argument should be the path to the jar file. Otherwise, the JAR file
> > always has to be 'tagsoup.jar' which may not be the case.
> >
>
> Isn't this already the case? Perhaps I missed something, but the code
> does this: if --with-tagsoup=no then HAVE_TAGSOUP is set to false. if
> --with-tagsoup=somevar then somevar is used as the location of the
> tagsoup.jar. If --with-tagsoup is not used, then /usr/share/java (and
> other locations) are searched for a tagsoup.jar.
>
Yes sorry, you're right. The block I was looking at is only used if
no option is provided by the user, in which case we know the predefined paths.
> > You should also check /usr/share/tagsoup/lib/tagsoup.jar which is the Gentoo installation path.
> > Debian uses /usr/share/java/tagsoup.jar as already checked.
> >
>
> Ah, thanks. Updated patch attached.
>
Thanks. Don't know why they don't use /usr/share/java.
> Cheers,
> Omair
> diff -r dc02a605f905 Makefile.am
> --- a/Makefile.am Fri Jan 07 08:00:08 2011 -0500
> +++ b/Makefile.am Mon Jan 10 19:09:30 2011 -0500
> @@ -31,6 +31,8 @@
> net.sourceforge.jnlp.services net.sourceforge.jnlp.tools \
> net.sourceforge.jnlp.util net.sourceforge.jnlp.controlpanel
>
> +NETX_EXCLUDE_SRCS=
> +
> # Conditional defintions
> if ENABLE_PLUGIN
> ICEDTEAPLUGIN_CLEAN = clean-IcedTeaPlugin
> @@ -68,6 +70,13 @@
> endif
> endif
>
> +if HAVE_TAGSOUP
> +NETX_CLASSPATH_ARG=-classpath $(TAGSOUP_JAR)
> +NETX_LAUNCHER_ARG="-Xbootclasspath/a:$(TAGSOUP_JAR)",
> +else
> +NETX_EXCLUDE_SRCS+=net.sourceforge.jnlp.MalformedXMLParser.java
> +endif
> +
> # Launcher
>
> LAUNCHER_SRCDIR = $(abs_top_srcdir)/launcher
> @@ -279,14 +288,19 @@
> # a patch applied to sun.plugin.AppletViewerPanel and generated sources
>
> netx-source-files.txt:
> - find $(NETX_SRCDIR) -name '*.java' | sort > $@
> + find $(NETX_SRCDIR) -name '*.java' | sort > $@ ; \
> + for src in $(NETX_EXCLUDE_SRCS) ; \
> + do \
> + sed -i "/$${src}/ d" $@ ; \
> + done
>
> -stamps/netx.stamp: netx-source-files.txt stamps/bootstrap-directory.stamp
> +stamps/netx.stamp: netx-source-files.txt stamps/bootstrap-directory.stamp netx-dummy.jar
> mkdir -p $(NETX_DIR)
> $(BOOT_DIR)/bin/javac $(IT_JAVACFLAGS) \
> -d $(NETX_DIR) \
> -sourcepath $(NETX_SRCDIR) \
> -bootclasspath $(RUNTIME) \
> + $(NETX_CLASSPATH_ARG) \
> @netx-source-files.txt
> (cd $(NETX_RESOURCE_DIR); \
> for files in $$(find . -type f); \
> @@ -349,7 +363,7 @@
> $(NETX_DIR)/launcher/%.o: $(LAUNCHER_SRCDIR)/%.c
> mkdir -p $(NETX_DIR)/launcher && \
> $(CC) $(LAUNCHER_FLAGS) \
> - -DJAVA_ARGS='{ "-J-ms8m", "-J-Djava.icedtea-web.bin=$(DESTDIR)$(bindir)/javaws", "net.sourceforge.jnlp.runtime.Boot", }' \
> + -DJAVA_ARGS='{ "-J-ms8m", "-J-Djava.icedtea-web.bin=$(DESTDIR)$(bindir)/javaws", $(NETX_LAUNCHER_ARG) "net.sourceforge.jnlp.runtime.Boot", }' \
> -DPROGNAME='"javaws"' -c -o $@ $<
>
> $(NETX_DIR)/launcher/controlpanel/%.o: $(LAUNCHER_SRCDIR)/%.c
> diff -r dc02a605f905 NEWS
> --- a/NEWS Fri Jan 07 08:00:08 2011 -0500
> +++ b/NEWS Mon Jan 10 19:09:30 2011 -0500
> @@ -8,7 +8,12 @@
>
> CVE-XXXX-YYYY: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=XXXX-YYYY
>
> -New in release 1.0 (2010-XX-XX):
> +New in release 1.1 (2011-XX-XX):
> +
> +* NetX
> + - Netx can now parse malformed jnlp files using tagsoup
> +
> +New in release 1.0 (2011-XX-XX):
>
> * Initial release of IcedTea-Web
> * Security updates
> diff -r dc02a605f905 acinclude.m4
> --- a/acinclude.m4 Fri Jan 07 08:00:08 2011 -0500
> +++ b/acinclude.m4 Mon Jan 10 19:09:30 2011 -0500
> @@ -297,6 +297,36 @@
> fi
> ])
>
> +
> +AC_DEFUN_ONCE([IT_CHECK_FOR_TAGSOUP],
> +[
> + AC_MSG_CHECKING([for tagsoup])
> + AC_ARG_WITH([tagsoup],
> + [AS_HELP_STRING([--with-tagsoup],
> + [support malformed jnlp files])],
> + [ TAGSOUP_JAR=${withval} ],
> + [ TAGSOUP_JAR= ])
> + if test x"${TAGSOUP_JAR}" = xyes ; then
> + TAGSOUP_JAR=
> + fi
> + if test -z "${TAGSOUP_JAR}" ; then
> + for dir in /usr/share/java /usr/local/share/java \
> + /usr/share/tagsoup/lib/ ; do
> + if test -f $dir/tagsoup.jar; then
> + TAGSOUP_JAR=$dir/tagsoup.jar
> + break
> + fi
> + done
> + fi
> + if test x"${TAGSOUP_JAR}" = x ; then
> + TAGSOUP_JAR=no
> + fi
> + AC_MSG_RESULT(${TAGSOUP_JAR})
> + AC_SUBST(TAGSOUP_JAR)
> + AM_CONDITIONAL([HAVE_TAGSOUP], [test x$TAGSOUP_JAR != xno])
> +])
> +
> +
> dnl Generic macro to check for a Java class
> dnl Takes the name of the class as an argument. The macro name
> dnl is usually the name of the class with '.'
> diff -r dc02a605f905 configure.ac
> --- a/configure.ac Fri Jan 07 08:00:08 2011 -0500
> +++ b/configure.ac Mon Jan 10 19:09:30 2011 -0500
> @@ -80,4 +80,6 @@
> IT_CHECK_FOR_CLASS(SUN_APPLET_APPLETIMAGEREF, [sun.applet.AppletImageRef])
> IT_CHECK_FOR_APPLETVIEWERPANEL_HOLE
>
> +IT_CHECK_FOR_TAGSOUP
> +
> AC_OUTPUT
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/JNLPFile.java
> --- a/netx/net/sourceforge/jnlp/JNLPFile.java Fri Jan 07 08:00:08 2011 -0500
> +++ b/netx/net/sourceforge/jnlp/JNLPFile.java Mon Jan 10 19:09:30 2011 -0500
> @@ -67,6 +67,9 @@
> /** the network location of this JNLP file */
> protected URL fileLocation;
>
> + /** the ParserSettings which were used to parse this file */
> + protected ParserSettings parserSettings = null;
> +
> /** A key that uniquely identifies connected instances (main jnlp+ext) */
> protected String uniqueKey = null;
>
> @@ -132,7 +135,7 @@
> * @throws ParseException if the JNLP file was invalid
> */
> public JNLPFile(URL location) throws IOException, ParseException {
> - this(location, false); // not strict
> + this(location, new ParserSettings());
> }
>
> /**
> @@ -140,12 +143,12 @@
> * default policy.
> *
> * @param location the location of the JNLP file
> - * @param strict whether to enforce the spec when
> + * @param settings the parser settings to use while parsing the file
> * @throws IOException if an IO exception occurred
> * @throws ParseException if the JNLP file was invalid
> */
> - public JNLPFile(URL location, boolean strict) throws IOException, ParseException {
> - this(location, (Version) null, strict);
> + public JNLPFile(URL location, ParserSettings settings) throws IOException, ParseException {
> + this(location, (Version) null, settings);
> }
>
> /**
> @@ -154,12 +157,12 @@
> *
> * @param location the location of the JNLP file
> * @param version the version of the JNLP file
> - * @param strict whether to enforce the spec when
> + * @param settings the parser settings to use while parsing the file
> * @throws IOException if an IO exception occurred
> * @throws ParseException if the JNLP file was invalid
> */
> - public JNLPFile(URL location, Version version, boolean strict) throws IOException, ParseException {
> - this(location, version, strict, JNLPRuntime.getDefaultUpdatePolicy());
> + public JNLPFile(URL location, Version version, ParserSettings settings) throws IOException, ParseException {
> + this(location, version, settings, JNLPRuntime.getDefaultUpdatePolicy());
> }
>
> /**
> @@ -168,14 +171,15 @@
> *
> * @param location the location of the JNLP file
> * @param version the version of the JNLP file
> - * @param strict whether to enforce the spec when
> + * @param settings the parser settings to use while parsing the file
> * @param policy the update policy
> * @throws IOException if an IO exception occurred
> * @throws ParseException if the JNLP file was invalid
> */
> - public JNLPFile(URL location, Version version, boolean strict, UpdatePolicy policy) throws IOException, ParseException {
> - Node root = Parser.getRootNode(openURL(location, version, policy));
> - parse(root, strict, location);
> + public JNLPFile(URL location, Version version, ParserSettings settings, UpdatePolicy policy) throws IOException, ParseException {
> + this.parserSettings = settings;
> +
> + parse(openURL(location, version, policy), location);
>
> //Downloads the original jnlp file into the cache if possible
> //(i.e. If the jnlp file being launched exist locally, but it
> @@ -202,13 +206,13 @@
> * @param location the location of the JNLP file
> * @param uniqueKey A string that uniquely identifies connected instances
> * @param version the version of the JNLP file
> - * @param strict whether to enforce the spec when
> + * @param settings the parser settings to use while parsing the file
> * @param policy the update policy
> * @throws IOException if an IO exception occurred
> * @throws ParseException if the JNLP file was invalid
> */
> - public JNLPFile(URL location, String uniqueKey, Version version, boolean strict, UpdatePolicy policy) throws IOException, ParseException {
> - this(location, version, strict, policy);
> + public JNLPFile(URL location, String uniqueKey, Version version, ParserSettings settings, UpdatePolicy policy) throws IOException, ParseException {
> + this(location, version, settings, policy);
> this.uniqueKey = uniqueKey;
>
> if (JNLPRuntime.isDebug())
> @@ -218,11 +222,14 @@
> /**
> * Create a JNLPFile from an input stream.
> *
> + * @param input input stream to read the JNLP file from
> + * @param settings the parser settings to use while parsing the file
> * @throws IOException if an IO exception occurred
> * @throws ParseException if the JNLP file was invalid
> */
> - public JNLPFile(InputStream input, boolean strict) throws ParseException {
> - parse(Parser.getRootNode(input), strict, null);
> + public JNLPFile(InputStream input, ParserSettings settings) throws ParseException {
> + this.parserSettings = settings;
> + parse(input, null);
> }
>
> /**
> @@ -288,6 +295,13 @@
> }
>
> /**
> + * Returns the ParserSettings that was used to parse this file
> + */
> + public ParserSettings getParserSettings() {
> + return parserSettings;
> + }
> +
> + /**
> * Returns the JNLP file's version.
> */
> public Version getFileVersion() {
> @@ -548,12 +562,13 @@
> * @param strict whether to enforce the spec when
> * @param location the file location or null
> */
> - private void parse(Node root, boolean strict, URL location) throws ParseException {
> + private void parse(InputStream input, URL location) throws ParseException {
> try {
> //if (location != null)
> // location = new URL(location, "."); // remove filename
>
> - Parser parser = new Parser(this, location, root, strict, true); // true == allow extensions
> + Node root = Parser.getRootNode(input, parserSettings);
> + Parser parser = new Parser(this, location, root, parserSettings);
>
> // JNLP tag information
> specVersion = parser.getSpecVersion();
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/Launcher.java
> --- a/netx/net/sourceforge/jnlp/Launcher.java Fri Jan 07 08:00:08 2011 -0500
> +++ b/netx/net/sourceforge/jnlp/Launcher.java Mon Jan 10 19:09:30 2011 -0500
> @@ -360,9 +360,11 @@
> JNLPFile file = null;
>
> try {
> - file = new JNLPFile(location, (Version) null, true, updatePolicy); // strict
> + ParserSettings settings = new ParserSettings(true, true, false);
> + file = new JNLPFile(location, (Version) null, settings, updatePolicy); // strict
> } catch (ParseException ex) {
> - file = new JNLPFile(location, (Version) null, false, updatePolicy);
> + ParserSettings settings = new ParserSettings(false, true, true);
> + file = new JNLPFile(location, (Version) null, settings, updatePolicy);
>
> // only here if strict failed but lax did not fail
> LaunchException lex =
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/MalformedXMLParser.java
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/netx/net/sourceforge/jnlp/MalformedXMLParser.java Mon Jan 10 19:09:30 2011 -0500
> @@ -0,0 +1,101 @@
> +// Copyright (C) 2011 Red Hat, Inc.
> +//
> +// This library is free software; you can redistribute it and/or
> +// modify it under the terms of the GNU Lesser General Public
> +// License as published by the Free Software Foundation; either
> +// version 2.1 of the License, or (at your option) any later version.
> +//
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +// Lesser General Public License for more details.
> +//
> +// You should have received a copy of the GNU Lesser General Public
> +// License along with this library; if not, write to the Free Software
> +// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> +
> +package net.sourceforge.jnlp;
> +
> +import static net.sourceforge.jnlp.runtime.Translator.R;
> +
> +import java.io.ByteArrayInputStream;
> +import java.io.ByteArrayOutputStream;
> +import java.io.IOException;
> +import java.io.InputStream;
> +import java.io.OutputStreamWriter;
> +import java.io.Writer;
> +
> +import net.sourceforge.jnlp.runtime.JNLPRuntime;
> +
> +import org.ccil.cowan.tagsoup.HTMLSchema;
> +import org.ccil.cowan.tagsoup.Parser;
> +import org.ccil.cowan.tagsoup.XMLWriter;
> +import org.xml.sax.InputSource;
> +import org.xml.sax.SAXException;
> +import org.xml.sax.XMLReader;
> +
> +/**
> + * A specialized {@link XMLParser} that uses TagSoup[1] to parse
> + * malformed XML
> + *
> + * Used by net.sourceforge.jnlp.Parser
> + *
> + * [1] http://home.ccil.org/~cowan/XML/tagsoup/
> + */
> +public class MalformedXMLParser extends XMLParser {
> +
> + /**
> + * Parses the data from an {@link InputStream} to create a XML tree.
> + * Returns a {@link Node} representing the root of the tree.
> + *
> + * @param input the {@link InputStream} to read data from
> + * @throws ParseException if an exception occurs while parsing the input
> + */
> + @Override
> + public Node getRootNode(InputStream input) throws ParseException {
> + if (JNLPRuntime.isDebug()) {
> + System.out.println("Using MalformedXMLParser");
> + }
> + InputStream xmlInput = xmlizeInputStream(input);
> + return super.getRootNode(xmlInput);
> + }
> +
> + /**
> + * Reads malformed XML from the InputStream original and returns a new
> + * InputStream which can be used to read a well-formed version of the input
> + *
> + * @param original
> + * @return an {@link InputStream} which can be used to read a well-formed
> + * version of the input XML
> + * @throws ParseException
> + */
> + private InputStream xmlizeInputStream(InputStream original) throws ParseException {
> + try {
> + ByteArrayOutputStream out = new ByteArrayOutputStream();
> +
> + HTMLSchema schema = new HTMLSchema();
> + XMLReader reader = new Parser();
> +
> + reader.setProperty(Parser.schemaProperty, schema);
> + reader.setFeature(Parser.bogonsEmptyFeature, false);
> + reader.setFeature(Parser.ignorableWhitespaceFeature, true);
> + reader.setFeature(Parser.ignoreBogonsFeature, false);
> +
> + Writer writeger = new OutputStreamWriter(out);
> + XMLWriter x = new XMLWriter(writeger);
> +
> + reader.setContentHandler(x);
> +
> + InputSource s = new InputSource(original);
> +
> + reader.parse(s);
> + return new ByteArrayInputStream(out.toByteArray());
> + } catch (SAXException e) {
> + throw new ParseException(R("PBadXML"), e);
> + } catch (IOException e) {
> + throw new ParseException(R("PBadXML"), e);
> + }
> +
> + }
> +
> +}
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/Parser.java
> --- a/netx/net/sourceforge/jnlp/Parser.java Fri Jan 07 08:00:08 2011 -0500
> +++ b/netx/net/sourceforge/jnlp/Parser.java Mon Jan 10 19:09:30 2011 -0500
> @@ -1,5 +1,5 @@
> // Copyright (C) 2001-2003 Jon A. Maxwell (JAM)
> -// Copyright (C) 2009 Red Hat, Inc.
> +// Copyright (C) 2011 Red Hat, Inc.
> //
> // This library is free software; you can redistribute it and/or
> // modify it under the terms of the GNU Lesser General Public
> @@ -20,15 +20,13 @@
> import static net.sourceforge.jnlp.runtime.Translator.R;
>
> import java.io.*;
> +import java.lang.reflect.InvocationTargetException;
> +import java.lang.reflect.Method;
> import java.net.*;
> import java.util.*;
> -//import javax.xml.parsers.*; // commented to use right Node
> -//import org.w3c.dom.*; // class for using Tiny XML | NanoXML
> -//import org.xml.sax.*;
> -//import gd.xml.tiny.*;
> +
> import net.sourceforge.jnlp.UpdateDesc.Check;
> import net.sourceforge.jnlp.UpdateDesc.Policy;
> -import net.sourceforge.nanoxml.*;
>
> /**
> * Contains methods to parse an XML document into a JNLPFile.
> @@ -105,15 +103,14 @@
> * @param file the (uninitialized) file reference
> * @param base if codebase is not specified, a default base for relative URLs
> * @param root the root node
> - * @param strict whether to enforce strict compliance with the JNLP spec
> - * @param allowExtensions whether to allow extensions to the JNLP spec
> + * @param settings the parser settings to use when parsing the JNLP file
> * @throws ParseException if the JNLP file is invalid
> */
> - public Parser(JNLPFile file, URL base, Node root, boolean strict, boolean allowExtensions) throws ParseException {
> + public Parser(JNLPFile file, URL base, Node root, ParserSettings settings) throws ParseException {
> this.file = file;
> this.root = root;
> - this.strict = strict;
> - this.allowExtensions = allowExtensions;
> + this.strict = settings.isStrict();
> + this.allowExtensions = settings.isExtensionAllowed();
>
> // ensure it's a JNLP node
> if (root == null || !root.getNodeName().equals("jnlp"))
> @@ -1205,116 +1202,33 @@
> *
> * @throws ParseException if the JNLP file is invalid
> */
> - public static Node getRootNode(InputStream input) throws ParseException {
> + public static Node getRootNode(InputStream input, ParserSettings settings) throws ParseException {
> + String className = null;
> + if (settings.isMalfromedXmlAllowed()) {
> + className = "net.sourceforge.jnlp.MalformedXMLParser";
> + } else {
> + className = "net.sourceforge.jnlp.XMLParser";
> + }
> +
> try {
> - /* SAX
> - DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
> - factory.setValidating(false);
> - factory.setNamespaceAware(true);
> - DocumentBuilder builder = factory.newDocumentBuilder();
> - builder.setErrorHandler(errorHandler);
> + Class<?> klass = null;
> + try {
> + klass = Class.forName(className);
> + } catch (ClassNotFoundException e) {
> + klass = Class.forName("net.sourceforge.jnlp.XMLParser");
> + }
> + Object instance = klass.newInstance();
> + Method m = klass.getMethod("getRootNode", InputStream.class);
>
> - Document doc = builder.parse(input);
> - return doc.getDocumentElement();
> - */
> -
> - /* TINY
> - Node document = new Node(TinyParser.parseXML(input));
> - Node jnlpNode = getChildNode(document, "jnlp"); // skip comments
> - */
> -
> - //A BufferedInputStream is used to allow marking and reseting
> - //of a stream.
> - BufferedInputStream bs = new BufferedInputStream(input);
> -
> - /* NANO */
> - final XMLElement xml = new XMLElement();
> - final PipedInputStream pin = new PipedInputStream();
> - final PipedOutputStream pout = new PipedOutputStream(pin);
> - final InputStreamReader isr = new InputStreamReader(bs, getEncoding(bs));
> - // Clean the jnlp xml file of all comments before passing
> - // it to the parser.
> - new Thread(
> - new Runnable() {
> - public void run() {
> - (new XMLElement()).sanitizeInput(isr, pout);
> - try {
> - pout.close();
> - } catch (IOException ioe) {
> - ioe.printStackTrace();
> - }
> - }
> - }).start();
> - xml.parseFromReader(new InputStreamReader(pin));
> - Node jnlpNode = new Node(xml);
> - return jnlpNode;
> - } catch (Exception ex) {
> - throw new ParseException(R("PBadXML"), ex);
> + return (Node) m.invoke(instance, input);
> + } catch (InvocationTargetException e) {
> + if (e.getCause() instanceof ParseException) {
> + throw (ParseException)(e.getCause());
> + }
> + throw new ParseException(R("PBadXML"), e);
> + } catch (Exception e) {
> + throw new ParseException(R("PBadXML"), e);
> }
> }
>
> - /**
> - * Returns the name of the encoding used in this InputStream.
> - *
> - * @param input the InputStream
> - * @return a String representation of encoding
> - */
> - private static String getEncoding(InputStream input) throws IOException {
> - //Fixme: This only recognizes UTF-8, UTF-16, and
> - //UTF-32, which is enough to parse the prolog portion of xml to
> - //find out the exact encoding (if it exists). The reason being
> - //there could be other encodings, such as ISO 8859 which is 8-bits
> - //but it supports latin characters.
> - //So what needs to be done is to parse the prolog and retrieve
> - //the exact encoding from it.
> -
> - int[] s = new int[4];
> - String encoding = "UTF-8";
> -
> - //Determine what the first four bytes are and store
> - //them into an int array.
> - input.mark(4);
> - for (int i = 0; i < 4; i++) {
> - s[i] = input.read();
> - }
> - input.reset();
> -
> - //Set the encoding base on what the first four bytes of the
> - //inputstream turn out to be (following the information from
> - //www.w3.org/TR/REC-xml/#sec-guessing).
> - if (s[0] == 255) {
> - if (s[1] == 254) {
> - if (s[2] != 0 || s[3] != 0) {
> - encoding = "UnicodeLittle";
> - } else {
> - encoding = "X-UTF-32LE-BOM";
> - }
> - }
> - } else if (s[0] == 254 && s[1] == 255 && (s[2] != 0 ||
> - s[3] != 0)) {
> - encoding = "UTF-16";
> -
> - } else if (s[0] == 0 && s[1] == 0 && s[2] == 254 &&
> - s[3] == 255) {
> - encoding = "X-UTF-32BE-BOM";
> -
> - } else if (s[0] == 0 && s[1] == 0 && s[2] == 0 &&
> - s[3] == 60) {
> - encoding = "UTF-32BE";
> -
> - } else if (s[0] == 60 && s[1] == 0 && s[2] == 0 &&
> - s[3] == 0) {
> - encoding = "UTF-32LE";
> -
> - } else if (s[0] == 0 && s[1] == 60 && s[2] == 0 &&
> - s[3] == 63) {
> - encoding = "UTF-16BE";
> - } else if (s[0] == 60 && s[1] == 0 && s[2] == 63 &&
> - s[3] == 0) {
> - encoding = "UTF-16LE";
> - }
> -
> - return encoding;
> - }
> -
> }
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/ParserSettings.java
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/netx/net/sourceforge/jnlp/ParserSettings.java Mon Jan 10 19:09:30 2011 -0500
> @@ -0,0 +1,55 @@
> +// Copyright (C) 2011 Red Hat, Inc.
> +//
> +// This library is free software; you can redistribute it and/or
> +// modify it under the terms of the GNU Lesser General Public
> +// License as published by the Free Software Foundation; either
> +// version 2.1 of the License, or (at your option) any later version.
> +//
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +// Lesser General Public License for more details.
> +//
> +// You should have received a copy of the GNU Lesser General Public
> +// License along with this library; if not, write to the Free Software
> +// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> +
> +package net.sourceforge.jnlp;
> +
> +/**
> + * Encapsulates settings to use with the JNLP Parser
> + */
> +public class ParserSettings {
> +
> + private final boolean strict;
> + private final boolean extensionAllowed;
> + private final boolean malformedXmlAllowed;
> +
> + /** Create a new ParserSettings with the default parser settings */
> + public ParserSettings() {
> + this(false, true, true);
> + }
> +
> + /** Create a new ParserSettings object */
> + public ParserSettings(boolean strict, boolean extensionAllowed, boolean malformedXmlAllowed) {
> + this.strict = strict;
> + this.extensionAllowed = extensionAllowed;
> + this.malformedXmlAllowed = malformedXmlAllowed;
> + }
> +
> + /** @return true if extensions to the spec are allowed */
> + public boolean isExtensionAllowed() {
> + return extensionAllowed;
> + }
> +
> + /** @return true if parsing malformed xml is allowed */
> + public boolean isMalfromedXmlAllowed() {
> + return malformedXmlAllowed;
> + }
> +
> + /** @return true if strict parsing mode is to be used */
> + public boolean isStrict() {
> + return strict;
> + }
> +
> +}
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/XMLParser.java
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/netx/net/sourceforge/jnlp/XMLParser.java Mon Jan 10 19:09:30 2011 -0500
> @@ -0,0 +1,163 @@
> +// Copyright (C) 2001-2003 Jon A. Maxwell (JAM)
> +// Copyright (C) 2011 Red Hat, Inc.
> +//
> +// This library is free software; you can redistribute it and/or
> +// modify it under the terms of the GNU Lesser General Public
> +// License as published by the Free Software Foundation; either
> +// version 2.1 of the License, or (at your option) any later version.
> +//
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +// Lesser General Public License for more details.
> +//
> +// You should have received a copy of the GNU Lesser General Public
> +// License along with this library; if not, write to the Free Software
> +// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> +
> +package net.sourceforge.jnlp;
> +
> +import static net.sourceforge.jnlp.runtime.Translator.R;
> +
> +import java.io.BufferedInputStream;
> +import java.io.IOException;
> +import java.io.InputStream;
> +import java.io.InputStreamReader;
> +import java.io.PipedInputStream;
> +import java.io.PipedOutputStream;
> +
> +import net.sourceforge.nanoxml.XMLElement;
> +
> +//import javax.xml.parsers.*; // commented to use right Node
> +//import org.w3c.dom.*; // class for using Tiny XML | NanoXML
> +//import org.xml.sax.*;
> +//import gd.xml.tiny.*;
> +
> +/**
> + * A gateway to the actual implementation of the parsers.
> + *
> + * Used by net.sourceforge.jnlp.Parser
> + */
> +class XMLParser {
> +
> + /**
> + * Parses input from an InputStream and returns a Node representing the
> + * root of the parse tree.
> + *
> + * @param input the {@link InputStream} containing the XML
> + * @return a {@link Node} representing the root of the parsed XML
> + * @throws ParseException if the input cannot be read or parsed as XML
> + */
> + public Node getRootNode(InputStream input) throws ParseException {
> +
> + try {
> + /* SAX
> + DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
> + factory.setValidating(false);
> + factory.setNamespaceAware(true);
> + DocumentBuilder builder = factory.newDocumentBuilder();
> + builder.setErrorHandler(errorHandler);
> +
> + Document doc = builder.parse(input);
> + return doc.getDocumentElement();
> + */
> +
> + /* TINY
> + Node document = new Node(TinyParser.parseXML(input));
> + Node jnlpNode = getChildNode(document, "jnlp"); // skip comments
> + */
> +
> + //A BufferedInputStream is used to allow marking and resetting
> + //of a stream.
> + BufferedInputStream bs = new BufferedInputStream(input);
> +
> + /* NANO */
> + final XMLElement xml = new XMLElement();
> + final PipedInputStream pin = new PipedInputStream();
> + final PipedOutputStream pout = new PipedOutputStream(pin);
> + final InputStreamReader isr = new InputStreamReader(bs, getEncoding(bs));
> + // Clean the jnlp xml file of all comments before passing
> + // it to the parser.
> + new Thread(
> + new Runnable() {
> + public void run() {
> + (new XMLElement()).sanitizeInput(isr, pout);
> + try {
> + pout.close();
> + } catch (IOException ioe) {
> + ioe.printStackTrace();
> + }
> + }
> + }).start();
> + xml.parseFromReader(new InputStreamReader(pin));
> + Node jnlpNode = new Node(xml);
> + return jnlpNode;
> + } catch (Exception ex) {
> + throw new ParseException(R("PBadXML"), ex);
> + }
> + }
> +
> + /**
> + * Returns the name of the encoding used in this InputStream.
> + *
> + * @param input the InputStream
> + * @return a String representation of encoding
> + */
> + private static String getEncoding(InputStream input) throws IOException {
> + //FIXME: This only recognizes UTF-8, UTF-16, and UTF-32,
> + //which is enough to read the prolog portion of the xml and
> + //find the declared encoding (if there is one). Other
> + //encodings are possible, such as ISO 8859, which is 8-bit
> + //and ASCII-compatible but is not UTF-8.
> + //So what needs to be done is to parse the prolog and retrieve
> + //the exact encoding from it.
> +
> + int[] s = new int[4];
> + String encoding = "UTF-8";
> +
> + //Determine what the first four bytes are and store
> + //them into an int array.
> + input.mark(4);
> + for (int i = 0; i < 4; i++) {
> + s[i] = input.read();
> + }
> + input.reset();
> +
> + //Set the encoding based on what the first four bytes of the
> + //inputstream turn out to be (following the information from
> + //www.w3.org/TR/REC-xml/#sec-guessing).
> + if (s[0] == 255) {
> + if (s[1] == 254) {
> + if (s[2] != 0 || s[3] != 0) {
> + encoding = "UnicodeLittle";
> + } else {
> + encoding = "X-UTF-32LE-BOM";
> + }
> + }
> + } else if (s[0] == 254 && s[1] == 255 && (s[2] != 0 ||
> + s[3] != 0)) {
> + encoding = "UTF-16";
> +
> + } else if (s[0] == 0 && s[1] == 0 && s[2] == 254 &&
> + s[3] == 255) {
> + encoding = "X-UTF-32BE-BOM";
> +
> + } else if (s[0] == 0 && s[1] == 0 && s[2] == 0 &&
> + s[3] == 60) {
> + encoding = "UTF-32BE";
> +
> + } else if (s[0] == 60 && s[1] == 0 && s[2] == 0 &&
> + s[3] == 0) {
> + encoding = "UTF-32LE";
> +
> + } else if (s[0] == 0 && s[1] == 60 && s[2] == 0 &&
> + s[3] == 63) {
> + encoding = "UTF-16BE";
> + } else if (s[0] == 60 && s[1] == 0 && s[2] == 63 &&
> + s[3] == 0) {
> + encoding = "UTF-16LE";
> + }
> +
> + return encoding;
> + }
> +}
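
The FIXME in getEncoding() says the prolog should be parsed for the declared encoding. A rough sketch of one way that could look, under some assumptions: EncodingSniffer and guessEncoding are hypothetical names, only byte order marks and ASCII-compatible encodings are handled (BOM-less UTF-16/UTF-32 would still need the byte-pattern checks from the patch), and the returned names are assumed to be charsets the runtime supports.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the patch. */
final class EncodingSniffer {

    // Matches the encoding pseudo-attribute of an XML declaration that
    // sits at the very start of the document.
    private static final Pattern ENCODING_DECL = Pattern.compile(
            "^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Guesses the encoding of a mark-capable stream, then rewinds it. */
    static String guessEncoding(InputStream input) {
        if (!input.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[128];
        int n = 0;
        try {
            input.mark(head.length);
            while (n < head.length) {
                int r = input.read(head, n, head.length - n);
                if (r < 0) {
                    break; // end of stream before the buffer filled
                }
                n += r;
            }
            input.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Byte order marks win. The 32-bit checks come first because
        // FF FE 00 00 also begins with the UTF-16 little-endian mark.
        if (n >= 4 && startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32";
        if (n >= 4 && startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32";
        if (n >= 3 && startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (n >= 2 && startsWith(head, 0xFE, 0xFF)) return "UTF-16";
        if (n >= 2 && startsWith(head, 0xFF, 0xFE)) return "UTF-16";

        // No BOM: read the prolog as ISO-8859-1 (one byte per char) and
        // look for an explicit encoding declaration.
        String prolog = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING_DECL.matcher(prolog);
        return m.find() ? m.group(1) : "UTF-8";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The 128-byte window is an arbitrary choice; a declaration longer than that would be missed and the sketch would fall back to UTF-8.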
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/resources/Messages.properties
> --- a/netx/net/sourceforge/jnlp/resources/Messages.properties Fri Jan 07 08:00:08 2011 -0500
> +++ b/netx/net/sourceforge/jnlp/resources/Messages.properties Mon Jan 10 19:09:30 2011 -0500
> @@ -158,6 +158,7 @@
> BOHeadless = Disables download window, other UIs.
> BOStrict = Enables strict checking of JNLP file format.
> BOViewer = Shows the trusted certificate viewer.
> +BOXml = Uses an XML parser to parse the JNLP file.
> BXnofork = Do not create another JVM.
> BXclearcache= Clean the JNLP application cache.
> BOHelp = Print this message and exit.
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/runtime/Boot.java
> --- a/netx/net/sourceforge/jnlp/runtime/Boot.java Fri Jan 07 08:00:08 2011 -0500
> +++ b/netx/net/sourceforge/jnlp/runtime/Boot.java Mon Jan 10 19:09:30 2011 -0500
> @@ -34,6 +34,7 @@
> import net.sourceforge.jnlp.LaunchException;
> import net.sourceforge.jnlp.Launcher;
> import net.sourceforge.jnlp.ParseException;
> +import net.sourceforge.jnlp.ParserSettings;
> import net.sourceforge.jnlp.PropertyDesc;
> import net.sourceforge.jnlp.ResourcesDesc;
> import net.sourceforge.jnlp.cache.CacheUtil;
> @@ -104,6 +105,7 @@
> + " -noupdate " + R("BONoupdate") + "\n"
> + " -headless " + R("BOHeadless") + "\n"
> + " -strict " + R("BOStrict") + "\n"
> + + " -xml " + R("BOXml") + "\n"
> + " -Xnofork " + R("BXnofork") + "\n"
> + " -Xclearcache " + R("BXclearcache") + "\n"
> + " -help " + R("BOHelp") + "\n";
> @@ -262,13 +264,22 @@
> e.printStackTrace();
> }
>
> - boolean strict = (null != getOption("-strict"));
> + boolean strict = false;
> + boolean malformedXmlAllowed = true;
>
> - JNLPFile file = new JNLPFile(url, strict);
> + if (null != getOption("-strict")) {
> + strict = true;
> + }
> + if (null != getOption("-xml")) {
> + malformedXmlAllowed = false;
> + }
> + ParserSettings settings = new ParserSettings(strict, true, malformedXmlAllowed);
> +
> + JNLPFile file = new JNLPFile(url, settings);
>
> // Launches the jnlp file where this file originated.
> if (file.getSourceLocation() != null) {
> - file = new JNLPFile(file.getSourceLocation(), strict);
> + file = new JNLPFile(file.getSourceLocation(), settings);
> }
>
> // add in extra params from command line
> diff -r dc02a605f905 netx/net/sourceforge/jnlp/runtime/JNLPClassLoader.java
> --- a/netx/net/sourceforge/jnlp/runtime/JNLPClassLoader.java Fri Jan 07 08:00:08 2011 -0500
> +++ b/netx/net/sourceforge/jnlp/runtime/JNLPClassLoader.java Mon Jan 10 19:09:30 2011 -0500
> @@ -50,6 +50,7 @@
> import net.sourceforge.jnlp.JNLPFile;
> import net.sourceforge.jnlp.LaunchException;
> import net.sourceforge.jnlp.ParseException;
> +import net.sourceforge.jnlp.ParserSettings;
> import net.sourceforge.jnlp.PluginBridge;
> import net.sourceforge.jnlp.ResourcesDesc;
> import net.sourceforge.jnlp.SecurityDesc;
> @@ -324,12 +325,12 @@
> * @param version the file's version
> * @param policy the update policy to use when downloading resources
> */
> - public static JNLPClassLoader getInstance(URL location, String uniqueKey, Version version, UpdatePolicy policy)
> + public static JNLPClassLoader getInstance(URL location, String uniqueKey, Version version, ParserSettings settings, UpdatePolicy policy)
> throws IOException, ParseException, LaunchException {
> JNLPClassLoader loader = urlToLoader.get(uniqueKey);
>
> if (loader == null || !location.equals(loader.getJNLPFile().getFileLocation()))
> - loader = getInstance(new JNLPFile(location, uniqueKey, version, false, policy), policy);
> + loader = getInstance(new JNLPFile(location, uniqueKey, version, settings, policy), policy);
>
> return loader;
> }
> @@ -348,7 +349,7 @@
> for (int i = 0; i < ext.length; i++) {
> try {
> String uniqueKey = this.getJNLPFile().getUniqueKey();
> - JNLPClassLoader loader = getInstance(ext[i].getLocation(), uniqueKey, ext[i].getVersion(), updatePolicy);
> + JNLPClassLoader loader = getInstance(ext[i].getLocation(), uniqueKey, ext[i].getVersion(), file.getParserSettings(), updatePolicy);
> loaderList.add(loader);
> } catch (Exception ex) {
> ex.printStackTrace();
--
Andrew :)
Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)
Support Free Java!
Contribute to GNU Classpath and IcedTea
http://www.gnu.org/software/classpath
http://icedtea.classpath.org
PGP Key: 94EFD9D8 (http://subkeys.pgp.net)
Fingerprint = F8EF F1EA 401E 2E60 15FA 7927 142C 2591 94EF D9D8