[9] RFC JDK-8068373: (prefs) FileSystemPreferences writes \0 to XML storage, causing loss of all preferences

Paul Sandoz paul.sandoz at oracle.com
Thu Feb 12 17:29:12 UTC 2015


On Feb 12, 2015, at 5:27 PM, Brian Burkhalter <brian.burkhalter at oracle.com> wrote:

> 
> On Feb 12, 2015, at 8:18 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> 
>>> This is a morass and I hope that someone who knows it better will comment. I do know, though, that the U+0000 null control character is always illegal.
>> 
>> Yes. IIRC XML 1.1 basically allows any character except U+0000.
> 
> "Note that the code point U+0000, assigned to the null control character, is the only character encoded in Unicode and ISO/IEC 10646 that is always invalid in any XML 1.0 and 1.1 document."
> 
> http://en.wikipedia.org/wiki/Valid_characters_in_XML#Characters_allowed_but_discouraged
> 
>>> 
>>> The problem is that on OSX and Windows prefs are not stored to XML
>> 
>> What are they stored in? name/value pairs?
> 
> Yes. On OSX I think Property List (.plist) files; on Windows I do not know yet.
> 
>>> whereas on Unix they are.
>> 
>> Is that a specification requirement?
> 
> No, I think it’s an artifact of no inherent DB-like facility in the system.
> 
>>> That would make it an error to add such a value to the prefs on some platforms but not on others.
>> 
>> Yes, for an interoperable format potentially read by other tools having U+0000 is a really bad idea.
> 
> Yep.
> 

And I think that applies to plist files too.


>> My inclination is that if properties are written out to a text file then it should fail if a key/value contains U+0000. (Binary data should be base64 encoded in such cases.) Replacing just subtly hides or defers the issue.
> 
> That was my original idea in fact (webrev.00, unpublished). It would however require a spec update to
> 
> http://docs.oracle.com/javase/8/docs/api/java/util/prefs/Preferences.html#put-java.lang.String-java.lang.String-
> 
> to allow for an IAE in this case. This would also be a backward-incompatible change for platforms which do allow storing such values.
> 

> Note that a similar situation applies to Properties.
> 

My recommendation is that serialization of properties to any textual format should barf if a U+0000 is encountered. Otherwise it's just hiding bugs. In such cases I think there is a strong justification to introduce such an incompatible change. I expect it is rare to encounter in practice.
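
A minimal sketch of such a fail-fast check, for illustration only. The class TextStoreGuard and its method names are hypothetical and are not part of java.util.prefs or of any webrev; the choice of IllegalArgumentException simply follows the IAE mentioned above. Binary data goes through java.util.Base64 so that it can never carry a U+0000 into the text store.

```java
import java.util.Base64;

// Hypothetical helper, not an existing JDK class: rejects U+0000 before a
// key or value is handed to a text-based store such as an XML file.
final class TextStoreGuard {
    private TextStoreGuard() {}

    /** Returns s unchanged, or throws if it contains U+0000. */
    static String requireNoNul(String role, String s) {
        int i = s.indexOf('\u0000');
        if (i >= 0) {
            // Fail loudly instead of silently replacing the character.
            throw new IllegalArgumentException(
                role + " contains U+0000 at index " + i);
        }
        return s;
    }

    /** Encodes arbitrary bytes as Base64 text, which contains no U+0000. */
    static String encodeBinary(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Reverses encodeBinary. */
    static byte[] decodeBinary(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

A caller would run requireNoNul on both key and value before writing, and would store raw bytes via encodeBinary rather than as a String. Whether the check belongs in put() or in the flush path is a separate design question that this sketch does not settle.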

Paul.



More information about the core-libs-dev mailing list