Fwd: [rfc][icedtea-web] renewed tagsoup
Jiri Vanek
jvanek at redhat.com
Thu Jun 20 08:17:44 PDT 2013
On 06/10/2013 05:39 PM, Adam Domurad wrote:
> On 06/04/2013 09:21 AM, Jiri Vanek wrote:
>> attached missing attachment
>>
>> -------- Original Message --------
>> Subject: [rfc][icedtea-web] renewed tagsoup
>> Date: Tue, 04 Jun 2013 15:18:40 +0200
>> From: Jiri Vanek <jvanek at redhat.com>
>> To: IcedTea Distro List <distro-pkg-dev at openjdk.java.net>
>>
>> Again, I'm trying to reincarnate old Omair's tagsoup patch.
>> As this was developed by Omair, then rewritten by Adam, and now by Me, I
>
> I wouldn't say I rewrote it.
Thank for review anyway :)
>
>> think it is quite reviwed :)
>> tests are adapted, and few added
>>
>> Ok for head? Some tests included.
>>
>> Also do not forget taht the results of this patch are visible only with
>> tagsoup installed. Without it, or with -xml it behaves as normlay.
>> (note, this will need to go to wiki)
>> By the way the PR1026 is fixed by this, and there can come more
>>
>> I have already made quite deep testing, and some more testing will come,
>> so please do not halt this for tests. Soem reproducers are in row.
>>
>> J.
>
> Thank you very much for doing this!
>
> We should collect what we could be fixed by this:
>
> 1. Mario noticed invalid XML warnings here:
>
> javaws http://docs.oracle.com/javase/tutorialJWS/uiswing/misc/ex7/GradientTranslucentWindowDemo.jnlp
> Status: Still broken. Unfortunately tagsoup cannot fix this unclosed string. I wonder what if
> anything we should do about this ?
>
> 2. From PR1026, applet at http://www.knuddels.de:8080/index.html?v=90aez&c=7
> Status: Now works (as you have noted)
>
> Actually I'm having trouble finding more. We should aim to support as much broken XML as there is
> out there, so hopefully we have more than one confirmed fixed case before deciding to go this route.
I believe there is not osu many of them. But they exists. I recently found my old one (but already
fixed)
>
>
> Comments inline:
>
>> diff -r 6209aba5fa14 NEWS
>> --- a/NEWS Tue Jun 04 13:22:38 2013 +0200
>> +++ b/NEWS Tue Jun 04 14:46:38 2013 +0200
>> @@ -9,8 +9,10 @@
>> CVE-XXXX-YYYY: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=XXXX-YYYY
>>
>> New in release 1.5 (2013-XX-XX):
>> +* IcedTea-Web now using tagsoup as default (tagsoup depndence) sanitizer for input
>
> [typo] depndence -> dependence
fixed
>
>> * NetX
>> - Netx can now parse malformed jnlp files using tagsoup
>
> Huh, why is this already in the diff ? This is a better message, IMO. 'for input' is pretty vague.
Because it have fixed the bug. And xml parsing is no longer stricxt with tagsoup
>
>> + - PR1026 - Apps fail to run because of the nanoxml parser's strict XML validation
>> * Plugin
>> - PR854: Resizing an applet several times causes 100% CPU load
>> @@ -472,6 +486,10 @@
>> mkdir -p stamps
>> touch $@
>>
>> +netx-dummy.jar:
>> + echo "Manifest-Version: 1.0" > netx-dummy-manifest.mf
>> + jar cfm $@ netx-dummy-manifest.mf
>> +
>
> What is this used for ?
Good catch. It was in one older versions. And I removed all traces of this - and missed this two.
Removed. thanx!
>
>> stamps/netx-dist.stamp: stamps/netx.stamp $(abs_top_builddir)/netx.manifest
>> (cd $(NETX_DIR) ; \
>> mkdir -p lib ; \
>> @@ -486,6 +504,7 @@
>>
>> clean-netx:
>> rm -rf $(NETX_DIR)
>> + rm -f netx-dummy.jar
>> rm -f stamps/netx-dist.stamp
>> rm -f netx-source-files.txt
>> rm -f stamps/netx.stamp
>> @@ -1006,7 +1025,7 @@
>> mkdir -p $(NETX_UNIT_TEST_DIR) && \
>
>
>
>> diff -r 01b48fc84f85 -r 6209aba5fa14 NEWS
>> --- a/NEWS Mon Jun 03 14:16:42 2013 +0200
>> +++ b/NEWS Tue Jun 04 13:22:38 2013 +0200
>
> Two diffs to the same file ? How'd that happen :-)
?? I'm wondering to!
>
>> @@ -9,6 +9,8 @@
>> CVE-XXXX-YYYY: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=XXXX-YYYY
>>
>> New in release 1.5 (2013-XX-XX):
>> +* NetX
>> + - Netx can now parse malformed jnlp files using tagsoup
>> * Plugin
>> - PR854: Resizing an applet several times causes 100% CPU load
>>
>> diff -r 01b48fc84f85 -r 6209aba5fa14 acinclude.m4
>> --- a/acinclude.m4 Mon Jun 03 14:16:42 2013 +0200
>> +++ b/acinclude.m4 Tue Jun 04 13:22:38 2013 +0200
>> @@ -403,6 +403,31 @@
>> fi
>> ])
>>
snip
>> - private JNLPFile(Reader input, boolean strict) throws ParseException {
>> - // todo: now that we are using NanoXML we can use a Reader
>> - //parse(Parser.getRootNode(input), strict, null);
>> - }
>> +// private JNLPFile(Reader input, ParserSettings settings) throws ParseException {
>> +// todo: now that we are using NanoXML we can use a Reader
>> +// parse(Parser.getRootNode(input), settings, null);
>> +// }
>
> I'm in favour of just axing this.
ok
>
> [..snip..]
>> + HTMLSchema schema = new HTMLSchema();
>> + XMLReader reader = new Parser();
>> +
>> + reader.setProperty(Parser.schemaProperty, schema);
>> + reader.setFeature(Parser.bogonsEmptyFeature, false);
>> + reader.setFeature(Parser.ignorableWhitespaceFeature, true);
>> + reader.setFeature(Parser.ignoreBogonsFeature, false);
>
> It'd be nice to get some comments here about why these are chosen.
just hinted.. Needs more investigations
>
> [..snip..]
>> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/XmlParser.java
>> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
>> +++ b/netx/net/sourceforge/jnlp/XmlParser.java Tue Jun 04 13:22:38 2013 +0200
>> @@ -0,0 +1,183 @@
>> +/*
>> + Copyright (C) 2013 Red Hat, Inc.
>> +
>> +This file is part of IcedTea.
>> +
>> +IcedTea is free software; you can redistribute it and/or
>> +modify it under the terms of the GNU General Public License as published by
>> +the Free Software Foundation, version 2.
>> +
>> +IcedTea is distributed in the hope that it will be useful,
>> +but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +General Public License for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with IcedTea; see the file COPYING. If not, write to
>> +the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>> +02110-1301 USA.
>> +
>> +Linking this library statically or dynamically with other modules is
>> +making a combined work based on this library. Thus, the terms and
>> +conditions of the GNU General Public License cover the whole
>> +combination.
>> +
>> +As a special exception, the copyright holders of this library give you
>> +permission to link this library with independent modules to produce an
>> +executable, regardless of the license terms of these independent
>> +modules, and to copy and distribute the resulting executable under
>> +terms of your choice, provided that you also meet, for each linked
>> +independent module, the terms and conditions of the license of that
>> +module. An independent module is a module which is not derived from
>> +or based on this library. If you modify this library, you may extend
>> +this exception to your version of the library, but you are not
>> +obligated to do so. If you do not wish to do so, delete this
>> +exception statement from your version.
>> + */
>> +
>> +package net.sourceforge.jnlp;
>> +
>> +import static net.sourceforge.jnlp.runtime.Translator.R;
>> +
>> +import java.io.BufferedInputStream;
>> +import java.io.IOException;
>> +import java.io.InputStream;
>> +import java.io.InputStreamReader;
>> +import java.io.PipedInputStream;
>> +import java.io.PipedOutputStream;
>> +
>> +import net.sourceforge.nanoxml.XMLElement;
>> +
>> +//import javax.xml.parsers.*; // commented to use right Node
>> +//import org.w3c.dom.*; // class for using Tiny XML | NanoXML
>> +//import org.xml.sax.*;
>> +//import gd.xml.tiny.*;
>
> [nit] Hm can we remove these ? I don't insist but they aren't clarifying anything for me, they're
> just confusing.
I would rather keep them in. as remainder of our shame - and as remainder that we can use DOM once :)
>
> [..snip..]
>> diff -r 01b48fc84f85 -r 6209aba5fa14 netx/net/sourceforge/jnlp/resources/Messages.properties
>> --- a/netx/net/sourceforge/jnlp/resources/Messages.properties Mon Jun 03 14:16:42 2013 +0200
>> +++ b/netx/net/sourceforge/jnlp/resources/Messages.properties Tue Jun 04 13:22:38 2013 +0200
>> @@ -189,6 +189,7 @@
>> BOHeadless = Disables download window, other UIs.
>> BOStrict = Enables strict checking of JNLP file format.
>> BOViewer = Shows the trusted certificate viewer.
>> +BOXml = Uses an strict XML parser to parse the JNLP file.
>
> [typo] 'Uses an strict' -> 'Uses a strict'
Fixed (how come! not my sentence! :) )
>
> [..snip..]
>
> Overall looks good to me.
>
> Cheers,
> -Adam
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tagsoup2.diff
Type: text/x-patch
Size: 99429 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20130620/8c96dd19/tagsoup2.diff
More information about the distro-pkg-dev
mailing list