Fwd: [rfc][icedtea-web] renewed tagsoup

Adam Domurad adomurad at redhat.com
Mon Jun 10 08:39:50 PDT 2013


On 06/04/2013 09:21 AM, Jiri Vanek wrote:
> attached missing attachment
>
> -------- Original Message --------
> Subject: [rfc][icedtea-web] renewed tagsoup
> Date: Tue, 04 Jun 2013 15:18:40 +0200
> From: Jiri Vanek <jvanek at redhat.com>
> To: IcedTea Distro List <distro-pkg-dev at openjdk.java.net>
>
> Again, I'm trying to reincarnate old Omair's  tagsoup patch.
> As this was developed by Omair, then rewritten by Adam, and now by Me, I

I wouldn't say I rewrote it.

> think it is quite reviwed :)
> tests are adapted, and few added
>
> Ok for head? Some tests included.
>
> Also do not forget  taht the results of this patch are visible only with
> tagsoup installed.  Without it, or with -xml it behaves as normlay.
> (note, this will need to go to wiki)
> By the way the PR1026 is fixed by this, and there can come more
>
> I have already made quite deep testing, and some more testing will come,
> so please do not halt this for tests. Soem reproducers are in row.
>
> J.

Thank you very much for doing this!

We should collect what we could be fixed by this:

1. Mario noticed invalid XML warnings here:

javaws 
http://docs.oracle.com/javase/tutorialJWS/uiswing/misc/ex7/GradientTranslucentWindowDemo.jnlp
Status: Still broken. Unfortunately tagsoup cannot fix this unclosed 
string. I wonder what if anything we should do about this ?

2. From PR1026, applet at http://www.knuddels.de:8080/index.html?v=90aez&c=7
Status: Now works (as you have noted)

Actually I'm having trouble finding more. We should aim to support as 
much broken XML as there is out there, so hopefully we have more than 
one confirmed fixed case before deciding to go this route.


Comments inline:

> diff -r 6209aba5fa14 NEWS
> --- a/NEWS	Tue Jun 04 13:22:38 2013 +0200
> +++ b/NEWS	Tue Jun 04 14:46:38 2013 +0200
> @@ -9,8 +9,10 @@
>  CVE-XXXX-YYYY: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=XXXX-YYYY
>
>  New in release 1.5 (2013-XX-XX):
> +* IcedTea-Web now using tagsoup as default (tagsoup depndence) sanitizer for input

[typo] depndence -> dependence

>  * NetX
>   - Netx can now parse malformed jnlp files using tagsoup

Huh, why is this already in the diff ? This is a better message, IMO. 
'for input' is pretty vague.

> + - PR1026 - Apps fail to run because of the nanoxml parser's strict XML validation
>  * Plugin
>    - PR854: Resizing an applet several times causes 100% CPU load
>
> diff -r 01b48fc84f85 -r 6209aba5fa14 Makefile.am
> --- a/Makefile.am	Mon Jun 03 14:16:42 2013 +0200
> +++ b/Makefile.am	Tue Jun 04 13:22:38 2013 +0200
> @@ -119,9 +119,9 @@
>  #    IllegalAccessException
>  #  - we want full privileges
>  #
> -export LAUNCHER_BOOTCLASSPATH="-Xbootclasspath/a:$(datadir)/$(PACKAGE_NAME)/netx.jar$(RHINO_RUNTIME)"
> -export PLUGIN_BOOTCLASSPATH='"-Xbootclasspath/a:$(datadir)/$(PACKAGE_NAME)/netx.jar:$(datadir)/$(PACKAGE_NAME)/plugin.jar$(RHINO_RUNTIME)"'
> -export PLUGIN_COVERAGE_BOOTCLASSPATH='"-Xbootclasspath/a:$(datadir)/$(PACKAGE_NAME)/netx.jar:$(datadir)/$(PACKAGE_NAME)/plugin.jar$(RHINO_RUNTIME):$(JACOCO_CLASSPATH)"'
> +export LAUNCHER_BOOTCLASSPATH="-Xbootclasspath/a:$(datadir)/$(PACKAGE_NAME)/netx.jar$(RHINO_RUNTIME):$(TAGSOUP_JAR)"
> +export PLUGIN_BOOTCLASSPATH='"-Xbootclasspath/a:$(datadir)/$(PACKAGE_NAME)/netx.jar:$(datadir)/$(PACKAGE_NAME)/plugin.jar$(RHINO_RUNTIME):$(TAGSOUP_JAR)"'
> +export PLUGIN_COVERAGE_BOOTCLASSPATH='"-Xbootclasspath/a:$(datadir)/$(PACKAGE_NAME)/netx.jar:$(datadir)/$(PACKAGE_NAME)/plugin.jar$(RHINO_RUNTIME):$(JACOCO_CLASSPATH):$(TAGSOUP_JAR)"'
>
>  # Fake update version to work with the Deployment Toolkit script used by Oracle
>  # http://download.oracle.com/javase/tutorial/deployment/deploymentInDepth/depltoolkit_index.html
> @@ -138,7 +138,15 @@
>  	net.sourceforge.jnlp.security.viewer net.sourceforge.jnlp.services \
>  	net.sourceforge.jnlp.tools net.sourceforge.jnlp.util
>
> +NETX_EXCLUDE_SRCS=
> +
>  # Conditional defintions
> +if HAVE_TAGSOUP
> +NETX_CLASSPATH_ARG=-classpath $(TAGSOUP_JAR)
> +else
> +NETX_EXCLUDE_SRCS+=net.sourceforge.jnlp.MalformedXMLParser.java
> +endif
> +
>  if ENABLE_PLUGIN
>  export ICEDTEAPLUGIN_CLEAN = clean-IcedTeaPlugin
>  export LIVECONNECT_DIR = netscape sun/applet
> @@ -415,6 +423,7 @@
>  	  $(BOOT_DIR)/bin/javac $(IT_JAVACFLAGS) \
>  	      -d $(abs_top_builddir)/liveconnect \
>  	      -bootclasspath $(NETX_DIR):$(RUNTIME) \
> +	      $(NETX_CLASSPATH_ARG) \
>  	      -sourcepath $(LIVECONNECT_SRCS) \
>  	      @liveconnect-source-files.txt ; \
>  	fi
> @@ -445,7 +454,11 @@
>  # a patch applied to sun.plugin.AppletViewerPanel and generated sources
>
>  netx-source-files.txt:
> -	find $(NETX_SRCDIR) -name '*.java' | sort > $@
> +	find $(NETX_SRCDIR) -name '*.java' | sort > $@ ; \
> +	for src in $(NETX_EXCLUDE_SRCS) ; \
> +	do \
> +	  sed -i "/$${src}/ d" $@ ; \
> +	done
>  if !WITH_RHINO
>  	sed -i '/RhinoBasedPacEvaluator/ d' $@
>  endif
> @@ -459,6 +472,7 @@
>  	    -d $(NETX_DIR) \
>  	    -sourcepath $(NETX_SRCDIR) \
>  	    -bootclasspath $(RUNTIME) \
> +	    $(NETX_CLASSPATH_ARG) \
>  	    @netx-source-files.txt
>  	(cd $(NETX_RESOURCE_DIR); \
>  	 for files in $$(find . -type f); \
> @@ -472,6 +486,10 @@
>  	mkdir -p stamps
>  	touch $@
>
> +netx-dummy.jar:
> +	echo "Manifest-Version: 1.0" > netx-dummy-manifest.mf
> +	jar cfm $@ netx-dummy-manifest.mf
> +

What is this used for ?

>  stamps/netx-dist.stamp: stamps/netx.stamp $(abs_top_builddir)/netx.manifest
>  	(cd $(NETX_DIR) ; \
>  	 mkdir -p lib ; \
> @@ -486,6 +504,7 @@
>
>  clean-netx:
>  	rm -rf $(NETX_DIR)
> +	rm -f netx-dummy.jar
>  	rm -f stamps/netx-dist.stamp
>  	rm -f netx-source-files.txt
>  	rm -f stamps/netx.stamp
> @@ -1006,7 +1025,7 @@
>  	mkdir -p $(NETX_UNIT_TEST_DIR) && \
>  	$(BOOT_DIR)/bin/javac $(IT_JAVACFLAGS) \
>  	 -d $(NETX_UNIT_TEST_DIR) \
> -	 -classpath $(JUNIT_JAR):$(abs_top_builddir)/liveconnect/lib/classes.jar:$(NETX_DIR)/lib/classes.jar:$(TEST_EXTENSIONS_DIR) \
> +	 -classpath $(JUNIT_JAR):$(abs_top_builddir)/liveconnect/lib/classes.jar:$(NETX_DIR)/lib/classes.jar:$(TEST_EXTENSIONS_DIR):$(TAGSOUP_JAR) \
>  	 @netx-unit-tests-source-files.txt && \
>  	mkdir -p stamps && \
>  	touch $@
> @@ -1036,7 +1055,7 @@
>  	done ; \
>  	cd $(NETX_UNIT_TEST_DIR) ; \
>  	class_names=`cat $(UNIT_CLASS_NAMES)` ; \
> -	CLASSPATH=$(NETX_DIR)/lib/classes.jar:$(abs_top_builddir)/liveconnect/lib/classes.jar:$(JUNIT_JAR):$(JUNIT_RUNNER_JAR):$(TEST_EXTENSIONS_DIR):.:$(TEST_EXTENSIONS_SRCDIR) \
> +	CLASSPATH=$(NETX_DIR)/lib/classes.jar:$(abs_top_builddir)/liveconnect/lib/classes.jar:$(JUNIT_JAR):$(JUNIT_RUNNER_JAR):$(TEST_EXTENSIONS_DIR):.:$(TEST_EXTENSIONS_SRCDIR):$(TAGSOUP_JAR) \
>  	  $(BOOT_DIR)/bin/java -Xbootclasspath:$(RUNTIME) CommandLine $$class_names
>  if WITH_XSLTPROC
>  	-$(XSLTPROC) --stringparam logs logs_unit.html $(TESTS_SRCDIR)/$(REPORT_STYLES_DIRNAME)/jreport.xsl $(NETX_UNIT_TEST_DIR)/tests-output.xml > $(TESTS_DIR)/index_unit.html
> @@ -1073,6 +1092,9 @@
>  	 -cp $(RHINO_RUNTIME) \
>  	 -cp $(TEST_EXTENSIONS_DIR) \
>  	 -cp $(TEST_EXTENSIONS_SRCDIR) \
> +if HAVE_TAGSOUP
> +	 -cp $(TAGSOUP_JAR) \
> +endif
>  	 -cp . \
>  	 -ix "-org.junit.*" \
>  	 -ix "-junit.*" \
> @@ -1117,7 +1139,7 @@
>  	  mv $(NETX_UNIT_TEST_DIR)/$$file  $(NETX_UNIT_TEST_DIR)/"$$file""$(EMMA_BACKUP_SUFFIX)" ; \
>  	done ;\
>  	class_names=`cat $(UNIT_CLASS_NAMES)` ; \
> -	CLASSPATH=$(NETX_DIR)/lib/classes.jar:$(abs_top_builddir)/liveconnect/lib/classes.jar:$(JUNIT_JAR):$(JUNIT_RUNNER_JAR):$(TEST_EXTENSIONS_DIR):$(JACOCO_CLASSPATH):.:$(TEST_EXTENSIONS_SRCDIR)  \
> +	CLASSPATH=$(NETX_DIR)/lib/classes.jar:$(abs_top_builddir)/liveconnect/lib/classes.jar:$(JUNIT_JAR):$(JUNIT_RUNNER_JAR):$(TEST_EXTENSIONS_DIR):$(JACOCO_CLASSPATH):.:$(TEST_EXTENSIONS_SRCDIR):$(TAGSOUP_JAR)  \
>  	  $(BOOT_DIR)/bin/java $(JACOCO_AGENT_SWITCH) -Xbootclasspath:$(RUNTIME)  CommandLine $$class_names ; \
>  	for file in $(EMMA_MODIFIED_FILES) ; do \
>  	  mv $(NETX_UNIT_TEST_DIR)/$$file  $(NETX_UNIT_TEST_DIR)/"$$file""$(EMMA_SUFFIX)" ; \



> diff -r 01b48fc84f85 -r 6209aba5fa14 NEWS
> --- a/NEWS	Mon Jun 03 14:16:42 2013 +0200
> +++ b/NEWS	Tue Jun 04 13:22:38 2013 +0200

Two diffs to the same file ? How'd that happen :-)

> @@ -9,6 +9,8 @@
>  CVE-XXXX-YYYY: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=XXXX-YYYY
>
>  New in release 1.5 (2013-XX-XX):
> +* NetX
> + - Netx can now parse malformed jnlp files using tagsoup
>  * Plugin
>    - PR854: Resizing an applet several times causes 100% CPU load
>
> diff -r 01b48fc84f85 -r 6209aba5fa14 acinclude.m4
> --- a/acinclude.m4	Mon Jun 03 14:16:42 2013 +0200
> +++ b/acinclude.m4	Tue Jun 04 13:22:38 2013 +0200
> @@ -403,6 +403,31 @@
>  fi
>  ])
>
> +AC_DEFUN_ONCE([IT_CHECK_FOR_TAGSOUP],
> +[
> +  AC_MSG_CHECKING([for tagsoup])
> +  AC_ARG_WITH([tagsoup],
> +             [AS_HELP_STRING([--with-tagsoup],
> +                             [tagsoup.jar])],
> +             [
> +                TAGSOUP_JAR=${withval}
> +             ],
> +             [
> +                TAGSOUP_JAR=
> +             ])
> +  if test -z "${TAGSOUP_JAR}"; then
> +    for dir in /usr/share/java /usr/local/share/java ; do
> +      if test -f $dir/tagsoup.jar; then
> +        TAGSOUP_JAR=$dir/tagsoup.jar
> +	    break
> +      fi
> +    done
> +  fi
> +  AC_MSG_RESULT(${TAGSOUP_JAR})
> +  AC_SUBST(TAGSOUP_JAR)
> +  AM_CONDITIONAL([HAVE_TAGSOUP], [test x$TAGSOUP_JAR != xno])
> +])
> +
>  dnl Generic macro to check for a Java class
>  dnl Takes the name of the class as an argument.  The macro name
>  dnl is usually the name of the class with '.'
> diff -r 01b48fc84f85 -r 6209aba5fa14 configure.ac
> --- a/configure.ac	Mon Jun 03 14:16:42 2013 +0200
> +++ b/configure.ac	Tue Jun 04 13:22:38 2013 +0200
> @@ -111,6 +111,8 @@
>  IT_FIND_OPTIONAL_JAR([asm], ASM,
>      [/usr/share/java/objectweb-asm4/asm-all.jar /usr/share/java/objectweb-asm4/asm-all-4.0.jar /usr/share/java/objectweb-asm/asm-all.jar])
>
> +IT_CHECK_FOR_TAGSOUP
> +
>  AC_CONFIG_FILES([jrunscript], [chmod u+x jrunscript])
>  AC_CONFIG_FILES([build.properties])
>
> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/JNLPCreator.java
> --- a/netx/net/sourceforge/jnlp/JNLPCreator.java	Mon Jun 03 14:16:42 2013 +0200
> +++ b/netx/net/sourceforge/jnlp/JNLPCreator.java	Tue Jun 04 13:22:38 2013 +0200
> @@ -28,8 +28,8 @@
>  import net.sourceforge.jnlp.cache.UpdatePolicy;
>
>  public class JNLPCreator {
> -    public JNLPFile create(URL location, Version version, boolean strict,
> +    public JNLPFile create(URL location, Version version, ParserSettings settings,
>              UpdatePolicy policy, URL forceCodebase) throws IOException, ParseException {
> -        return new JNLPFile(location, version, strict, policy, forceCodebase);
> +        return new JNLPFile(location, version, settings, policy, forceCodebase);
>      }
>  }
> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/JNLPFile.java
> --- a/netx/net/sourceforge/jnlp/JNLPFile.java	Mon Jun 03 14:16:42 2013 +0200
> +++ b/netx/net/sourceforge/jnlp/JNLPFile.java	Tue Jun 04 13:22:38 2013 +0200
> @@ -67,6 +67,9 @@
>      /** the network location of this JNLP file */
>      protected URL fileLocation;
>
> +    /** the ParserSettings which were used to parse this file */
> +    protected ParserSettings parserSettings = null;
> +
>      /** A key that uniquely identifies connected instances (main jnlp+ext) */
>      protected String uniqueKey = null;
>
> @@ -145,7 +148,7 @@
>       * @throws ParseException if the JNLP file was invalid
>       */
>      public JNLPFile(URL location) throws IOException, ParseException {
> -        this(location, false); // not strict
> +        this(location, new ParserSettings());
>      }
>
>      /**
> @@ -153,12 +156,12 @@
>       * default policy.
>       *
>       * @param location the location of the JNLP file
> -     * @param strict whether to enforce the spec when
> +     * @param settings the parser settings to use while parsing the file
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    public JNLPFile(URL location, boolean strict) throws IOException, ParseException {
> -        this(location, (Version) null, strict);
> +    public JNLPFile(URL location, ParserSettings settings) throws IOException, ParseException {
> +        this(location, (Version) null, settings);
>      }
>
>      /**
> @@ -167,12 +170,12 @@
>       *
>       * @param location the location of the JNLP file
>       * @param version the version of the JNLP file
> -     * @param strict whether to enforce the spec when
> +     * @param settings the parser settings to use while parsing the file
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    public JNLPFile(URL location, Version version, boolean strict) throws IOException, ParseException {
> -        this(location, version, strict, JNLPRuntime.getDefaultUpdatePolicy());
> +    public JNLPFile(URL location, Version version, ParserSettings settings) throws IOException, ParseException {
> +        this(location, version, settings, JNLPRuntime.getDefaultUpdatePolicy());
>      }
>
>      /**
> @@ -186,8 +189,8 @@
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    public JNLPFile(URL location, Version version, boolean strict, UpdatePolicy policy) throws IOException, ParseException {
> -        this(location, version, strict, policy, null);
> +    public JNLPFile(URL location, Version version, ParserSettings settings, UpdatePolicy policy) throws IOException, ParseException {
> +	this(location, version, settings, policy, null);
>      }
>
>      /**
> @@ -196,15 +199,16 @@
>       *
>       * @param location the location of the JNLP file
>       * @param version the version of the JNLP file
> -     * @param strict whether to enforce the spec when
> +     * @param settings the parser settings to use while parsing the file
>       * @param policy the update policy
>       * @param forceCodebase codebase to use if not specified in JNLP file.
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    protected JNLPFile(URL location, Version version, boolean strict, UpdatePolicy policy, URL forceCodebase) throws IOException, ParseException {
> -        Node root = Parser.getRootNode(openURL(location, version, policy));
> -        parse(root, strict, location, forceCodebase);
> +    protected JNLPFile(URL location, Version version, ParserSettings settings, UpdatePolicy policy, URL forceCodebase) throws IOException, ParseException {
> +        InputStream input = openURL(location, version, policy);
> +        this.parserSettings = settings;
> +        parse(input, location, forceCodebase);
>
>          //Downloads the original jnlp file into the cache if possible
>          //(i.e. If the jnlp file being launched exist locally, but it
> @@ -231,13 +235,13 @@
>       * @param location the location of the JNLP file
>       * @param uniqueKey A string that uniquely identifies connected instances
>       * @param version the version of the JNLP file
> -     * @param strict whether to enforce the spec when
> +     * @param settings the parser settings to use while parsing the file
>       * @param policy the update policy
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    public JNLPFile(URL location, String uniqueKey, Version version, boolean strict, UpdatePolicy policy) throws IOException, ParseException {
> -        this(location, version, strict, policy);
> +    public JNLPFile(URL location, String uniqueKey, Version version, ParserSettings settings, UpdatePolicy policy) throws IOException, ParseException {
> +        this(location, version, settings, policy);
>          this.uniqueKey = uniqueKey;
>
>          if (JNLPRuntime.isDebug())
> @@ -250,8 +254,9 @@
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    public JNLPFile(InputStream input, boolean strict) throws ParseException {
> -        this(input, null, strict);
> +    public JNLPFile(InputStream input, ParserSettings settings) throws ParseException {
> +        this.parserSettings = settings;
> +        parse(input, null, null);
>      }
>
>      /**
> @@ -263,22 +268,23 @@
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    public JNLPFile(InputStream input, URL codebase, boolean strict) throws ParseException {
> -        parse(Parser.getRootNode(input), strict, null, codebase);
> +    public JNLPFile(InputStream input, URL codebase, ParserSettings settings) throws ParseException {
> +        this.parserSettings = settings;
> +        parse(input, null, codebase);
>      }
>
>      /**
>       * Create a JNLPFile from a character stream.
>       *
>       * @param input the stream
> -     * @param strict whether to enforce the spec when
> +     * @param settings the parser settings to use while parsing the file
>       * @throws IOException if an IO exception occurred
>       * @throws ParseException if the JNLP file was invalid
>       */
> -    private JNLPFile(Reader input, boolean strict) throws ParseException {
> -        // todo: now that we are using NanoXML we can use a Reader
> -        //parse(Parser.getRootNode(input), strict, null);
> -    }
> +//    private JNLPFile(Reader input, ParserSettings settings) throws ParseException {
> +//        todo: now that we are using NanoXML we can use a Reader
> +//        parse(Parser.getRootNode(input), settings, null);
> +//    }

I'm in favour of just axing this.

[..snip..]
> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/MalformedXMLParser.java
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/netx/net/sourceforge/jnlp/MalformedXMLParser.java	Tue Jun 04 13:22:38 2013 +0200
> @@ -0,0 +1,121 @@
> +/*
> +   Copyright (C) 2013 Red Hat, Inc.
> +
> +This file is part of IcedTea.
> +
> +IcedTea is free software; you can redistribute it and/or
> +modify it under the terms of the GNU General Public License as published by
> +the Free Software Foundation, version 2.
> +
> +IcedTea is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with IcedTea; see the file COPYING.  If not, write to
> +the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> +02110-1301 USA.
> +
> +Linking this library statically or dynamically with other modules is
> +making a combined work based on this library.  Thus, the terms and
> +conditions of the GNU General Public License cover the whole
> +combination.
> +
> +As a special exception, the copyright holders of this library give you
> +permission to link this library with independent modules to produce an
> +executable, regardless of the license terms of these independent
> +modules, and to copy and distribute the resulting executable under
> +terms of your choice, provided that you also meet, for each linked
> +independent module, the terms and conditions of the license of that
> +module.  An independent module is a module which is not derived from
> +or based on this library.  If you modify this library, you may extend
> +this exception to your version of the library, but you are not
> +obligated to do so.  If you do not wish to do so, delete this
> +exception statement from your version.
> + */
> +package net.sourceforge.jnlp;
> +
> +import static net.sourceforge.jnlp.runtime.Translator.R;
> +
> +import java.io.ByteArrayInputStream;
> +import java.io.ByteArrayOutputStream;
> +import java.io.IOException;
> +import java.io.InputStream;
> +import java.io.OutputStreamWriter;
> +import java.io.Writer;
> +
> +import net.sourceforge.jnlp.runtime.JNLPRuntime;
> +
> +import org.ccil.cowan.tagsoup.HTMLSchema;
> +import org.ccil.cowan.tagsoup.Parser;
> +import org.ccil.cowan.tagsoup.XMLWriter;
> +import org.xml.sax.InputSource;
> +import org.xml.sax.SAXException;
> +import org.xml.sax.XMLReader;
> +
> +/**
> + * An specialized {@link XMLParser} that uses TagSoup[1] to parse
> + * malformed XML
> + *
> + * Used by net.sourceforge.jnlp.Parser
> + *
> + * [1] http://home.ccil.org/~cowan/XML/tagsoup/
> + */
> +public class MalformedXMLParser extends XMLParser {
> +
> +    /**
> +     * Parses the data from an {@link InputStream} to create a XML tree.
> +     * Returns a {@link Node} representing the root of the tree.
> +     *
> +     * @param input the {@link InputStream} to read data from
> +     * @throws ParseException if an exception occurs while parsing the input
> +     */
> +    @Override
> +    public Node getRootNode(InputStream input) throws ParseException {
> +        if (JNLPRuntime.isDebug()) {
> +            System.out.println("Using MalformedXMLParser");
> +        }
> +        InputStream xmlInput = xmlizeInputStream(input);
> +        return super.getRootNode(xmlInput);
> +    }
> +
> +    /**
> +     * Reads malformed XML from the InputStream original and returns a new
> +     * InputStream which can be used to read a well-formed version of the input
> +     *
> +     * @param original
> +     * @return an {@link InputStream} which can be used to read a well-formed
> +     * version of the input XML
> +     * @throws ParseException
> +     */
> +    private InputStream xmlizeInputStream(InputStream original) throws ParseException {
> +        try {
> +            ByteArrayOutputStream out = new ByteArrayOutputStream();
> +
> +            HTMLSchema schema = new HTMLSchema();
> +            XMLReader reader = new Parser();
> +
> +            reader.setProperty(Parser.schemaProperty, schema);
> +            reader.setFeature(Parser.bogonsEmptyFeature, false);
> +            reader.setFeature(Parser.ignorableWhitespaceFeature, true);
> +            reader.setFeature(Parser.ignoreBogonsFeature, false);

It'd be nice to get some comments here about why these are chosen.

[..snip..]
> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/XmlParser.java
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/netx/net/sourceforge/jnlp/XmlParser.java	Tue Jun 04 13:22:38 2013 +0200
> @@ -0,0 +1,183 @@
> +/*
> +   Copyright (C) 2013 Red Hat, Inc.
> +
> +This file is part of IcedTea.
> +
> +IcedTea is free software; you can redistribute it and/or
> +modify it under the terms of the GNU General Public License as published by
> +the Free Software Foundation, version 2.
> +
> +IcedTea is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with IcedTea; see the file COPYING.  If not, write to
> +the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> +02110-1301 USA.
> +
> +Linking this library statically or dynamically with other modules is
> +making a combined work based on this library.  Thus, the terms and
> +conditions of the GNU General Public License cover the whole
> +combination.
> +
> +As a special exception, the copyright holders of this library give you
> +permission to link this library with independent modules to produce an
> +executable, regardless of the license terms of these independent
> +modules, and to copy and distribute the resulting executable under
> +terms of your choice, provided that you also meet, for each linked
> +independent module, the terms and conditions of the license of that
> +module.  An independent module is a module which is not derived from
> +or based on this library.  If you modify this library, you may extend
> +this exception to your version of the library, but you are not
> +obligated to do so.  If you do not wish to do so, delete this
> +exception statement from your version.
> + */
> +
> +package net.sourceforge.jnlp;
> +
> +import static net.sourceforge.jnlp.runtime.Translator.R;
> +
> +import java.io.BufferedInputStream;
> +import java.io.IOException;
> +import java.io.InputStream;
> +import java.io.InputStreamReader;
> +import java.io.PipedInputStream;
> +import java.io.PipedOutputStream;
> +
> +import net.sourceforge.nanoxml.XMLElement;
> +
> +//import javax.xml.parsers.*; // commented to use right Node
> +//import org.w3c.dom.*;       // class for using Tiny XML | NanoXML
> +//import org.xml.sax.*;
> +//import gd.xml.tiny.*;

[nit] Hm can we remove these ? I don't insist but they aren't clarifying 
anything for me, they're just confusing.

[..snip..]
> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/resources/Messages.properties
> --- a/netx/net/sourceforge/jnlp/resources/Messages.properties	Mon Jun 03 14:16:42 2013 +0200
> +++ b/netx/net/sourceforge/jnlp/resources/Messages.properties	Tue Jun 04 13:22:38 2013 +0200
> @@ -189,6 +189,7 @@
>  BOHeadless  = Disables download window, other UIs.
>  BOStrict    = Enables strict checking of JNLP file format.
>  BOViewer    = Shows the trusted certificate viewer.
> +BOXml       = Uses an strict XML parser to parse the JNLP file.

[typo] 'Uses an strict' -> 'Uses a strict'

[..snip..]

Overall looks good to me.

Cheers,
-Adam



More information about the distro-pkg-dev mailing list