[rfc] [icedtea-web] cleaning XML logging from wrong character

Mon Dec 3 07:06:00 PST 2012

Hi Jiri,

I have some comments re: your patch, please look to the notes below:

+    
+    public static String clearChars(String ss) {
+        StringBuilder s = new StringBuilder(ss);
+        for (int i = 0; i < s.length(); i++) {
+            char q = s.charAt(i);
+            if (q == '\n') {
+                continue;
+            }
+            int iq = (int) q;
+            if (iq == 9) {

*** 9 is a good old friendly :-) TAB character, so it should be simply tested as q == '\t' in the previous "if" clause

+                continue;
+            }
+            if (iq <= 31 || iq > 65533) {

*** Well 31 value is understandable (you are afraid of '\0' especially, aren't you?),
*** but I'm unsure about value 65533. Given that "char q" is an unsigned value in a range from 0 to 65535 and
*** you are converting it to "int", do you still need to filter out the last two chars positions only?
*** It is true that FFFE or FFFF positions can not be in Unicode, but the same applies to positions
*** FDD0 to FDEF.

+                s.setCharAt(i, 'I');
+                s.insert(i + 1, "NVALID_CHAR_" + iq);
+                i -= 1;

*** i-- :-)

+            }
+        }
+        return s.toString();
+    }

*** Still I think it's not a good idea to use String.charAt() to read characters from a string because
you will miss all surrogate pairs. char is 16 bit wide only and you AFAIK needs at least 21 bits to represent
all Unicode characters. So this code will fail in case of some asian character sets, for example (and Tengwar too :)

+            try {
+                Class clazz = Class.forName(stack[i].getClassName());
+                String path = null;
+                try {
+                    path = clazz.getProtectionDomain().getCodeSource().getLocation().getPath();
+                } catch (NullPointerException ex) {
+                    //silently ignoring and continuing with null path

*** you can't be sure where the NPE occurs, is not it better to print stack trace at least?

+                }
+                if (path != null && path.contains("/tests.build/")) {
+                    if (!path.contains("/test-extensions/")) {
+                        break;
+                    }
+                } else {
+                    //running from ide 
+                    if (!stack[i].getClassName().startsWith("net.sourceforge.jnlp.")) {
+                        break;
+                    }
+                }
+            } catch (ClassNotFoundException ex) {
+                ///should not happen, laoding only loaded 

*** sorry, can't understand this comment (laoding->loading, but what do you mean pls?)

+                ex.printStackTrace();

Cheers,
Pavel

----- Original Message -----
> Epiphany is littel bit confuse dby JS<->applet tests and is returning
> negative, zero  or to high
> retyped chars.
> Those chars break xml processing, so I would like to get them out
> from xml by this patch (kept in
> plain text).
> 
> Also I have realised I have broken unittest logging some tie ago.
> Fixed also:(
> 
> J.
>