[rfc][icedtea-web] Replace single quote with two single quotes when using MessageFormat in Translator

Jacob Wisor gitne at gmx.de
Mon Mar 16 20:06:20 UTC 2015


On 03/16/2015 07:20 PM CET Jie Kang wrote:
> ----- Original Message -----
>> On 03/16/2015 05:32 PM CET Jie Kang wrote:
>>> ----- Original Message -----
>>>> Hello,
>>>>
>>>> When looking into reproducer failures I noticed that the single quote '
>>>> was
>>>> not being displayed in the logs.
>>>>
>>>> E.g:
>>>>
>>>> don't --> dont
>>>>
>>>> This was caused by the use of MessageFormat in the Translator class.
>>>>
>>>>   From [1]: Within a String, "''" represents a single quote. [...]
>>>>
>>>> [1] http://docs.oracle.com/javase/6/docs/api/java/text/MessageFormat.html
>>>>
>>>> Out of the various possible fixes, I have decided to replace single quote
>>>> instances with two quotes only prior to using the MessageFormat class.
>>>> Afaict, this doesn't cause any issues and doesn't affect things that don't
>>>> go through the MessageFormat class.
>>>>
>>>> This bug caused a number of reproducers to fail (ex.
>>>> CodeBaseManifestEntryUnsignedNotMatching.BrowserJNLPHrefRemoteTest) as
>>>> they
>>>> expected a string like "don't" but the log output contained a string
>>>> "dont".
>>>> With this fix, those reproducers now pass.
>>>>
>>>> Thoughts?
>>
>> Hmm, the solution to this problem actually depends on the coding style of
>> message property files. For example, OpenJDK works around this issue by
>> actually putting double apostrophe characters into their tool's message
>> property
>> files. So, IcedTea-Web should have or decide on some sort of coding style for
>> its message property files. Which one it should be is arguable. Your approach
>> is
>> certainly valid, however whether it is also acceptable depends on the coding
>> style
>> for message property files in IcedTea-Web. Personally, I would prefer
>> OpenJDK's
>> approach, that is, modifying the property files and not getting into possibly
>> error-prone mangling of strings.
>
> I prefer users of the property files not having to deal with how the code formats the strings. I don't care which method goes in as long as one goes in so I'll wait for a third opinion.

I understand and I would like this to happen too. However, you would also have 
to provide for proper handling of explicit occurrences of curly brackets (U+007B 
and U+007D). I am sure we do not want to get into this game.
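
A minimal sketch of the curly bracket concern, assuming the replacement is a
blanket String.replace of every apostrophe before the pattern reaches
MessageFormat. The class and method names here are hypothetical and are not
taken from the Translator class or the patch:

```java
import java.text.MessageFormat;

public class QuoteDoublingSketch {

    // Hypothetical helper: doubles every apostrophe (U+0027) and then formats,
    // roughly what a blanket replacement in front of MessageFormat would do.
    public static String formatWithDoubling(String pattern, Object... args) {
        return MessageFormat.format(pattern.replace("'", "''"), args);
    }

    public static void main(String[] args) {
        // A plain apostrophe survives once it has been doubled.
        System.out.println(formatWithDoubling("don't open {0}", "it")); // don't open it

        // A pattern that already quotes a literal brace the MessageFormat way.
        String quotedBrace = "use '{' in {0}";
        System.out.println(MessageFormat.format(quotedBrace, "JSON")); // use { in JSON

        // After blanket doubling, '{' turns into ''{'' and the brace is no
        // longer quoted, so MessageFormat rejects the pattern.
        try {
            System.out.println(formatWithDoubling(quotedBrace, "JSON"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

So any message that needs a literal curly bracket would require its own
escaping rule on top of the apostrophe replacement.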

The MessageFormat's documentation also states:
"Warning:
	The rules for using quotes within message format patterns unfortunately have 
shown to be somewhat confusing. In particular, it isn't always obvious to 
localizers whether single quotes need to be doubled or not. Make sure to inform 
localizers about the rules, and tell them (for example, by using comments in 
resource bundle source files) which strings will be processed by MessageFormat. 
Note that localizers may need to use single quotes in translated strings where 
the original version doesn't have them."
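
What that warning means in practice can be sketched as follows. The message
strings are made up for illustration and the class name is hypothetical; only
MessageFormat.format itself is real API:

```java
import java.text.MessageFormat;

public class ApostropheRules {

    // Thin wrapper so the MessageFormat cases below read uniformly.
    public static String viaMessageFormat(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // A lone apostrophe opens a quoted section: the apostrophe is dropped
        // and the rest of the pattern, including {0}, is taken literally.
        System.out.println(viaMessageFormat("don't open {0}", "it"));  // dont open {0}

        // A doubled apostrophe yields one apostrophe and {0} is substituted.
        System.out.println(viaMessageFormat("don''t open {0}", "it")); // don't open it

        // A string that never goes through MessageFormat must not be doubled,
        // otherwise both apostrophes are shown.
        System.out.println("don''t"); // don''t
    }
}
```

This is why a localizer has to know, per string, whether it will be processed
by MessageFormat.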

>> Btw, the proper term for ' (U+0027) is APOSTROPHE. There does not exist such
>> a
>> thing as a "single quote".
>
> Sure, apostrophe is more apt here.
>
> 'This sentence is surrounded by single quotation marks (quotes).'

Jacob

