<i18n dev> Date parsing issues with SimpleDateFormat and DateTimeFormatter

Lothar Kimmeringer job at kimmeringer.de
Thu Oct 12 08:58:27 UTC 2023



On 11.10.2023 at 20:23, Naoto Sato wrote:

> I think the difference between MMM and LLL does not matter here, as they both
> represent short months, either context or standalone. The fundamental issue,
> whether the dot in the pattern is parsed as part of the month name or as the
> literal in the pattern, is the same for both M/L.

I've looked up the meaning of standalone by now and agree ;-) Again, I think
the examples in the Javadoc description should be extended to contain one
non-English one that shows the difference, e.g. "settembre" (M) and
"Settembre" (L) for Italian.

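The M/L distinction can be checked against whatever locale data the running
JDK ships. A minimal sketch: the class name MonthStyles and the chosen date
are made up for illustration, and the printed strings depend on the JDK's
CLDR version, so I'm not quoting any output here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of the JDK or of our application.
final class MonthStyles {

    // Formats the month of the given date with pattern letters such as "MMMM" (format
    // style) or "LLLL" (standalone style) in the given locale.
    static String month(String pattern, LocalDate date, Locale locale) {
        return DateTimeFormatter.ofPattern(pattern, locale).format(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2023, 9, 3);
        for (Locale locale : new Locale[] {Locale.ENGLISH, Locale.ITALIAN, Locale.GERMAN}) {
            System.out.printf("%s: MMMM=%s LLLL=%s MMM=%s LLL=%s%n",
                    locale,
                    month("MMMM", date, locale),
                    month("LLLL", date, locale),
                    month("MMM", date, locale),
                    month("LLL", date, locale));
        }
    }
}
```
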
>> This time, though, an update from Java 11.0.14 to 11.0.20 caused a
>> date-parsing process to produce different results (I assume due to the
>> non-deterministic nature of map.keySet, but I stopped going down that
>> road when "discovering" LLL).
> 
> This is interesting, as I expect no difference between update releases,
> since the CLDR version remains the same for both.

I wondered about that myself, but the reason I dug into the internals of these
classes at all was a user who could not parse an eBay revenue-summary CSV in which
invoice dates have the form "12 Dez 2023" (i.e. German month names), and I wasn't
able to reproduce the problem. That only changed after I tried it with a more recent
version of Java 11.

>> For our particular application I've solved it by setting up a SimpleDateFormat,
>> getting the abbreviated months from its DateFormatSymbols with dfs.getShortMonths(),
>> cutting them down to three characters and resetting them using dfs.setShortMonths.
> 
> Thanks for letting us know the workaround. This kind of information helps our future work.

I'm not sure how because it's quite specific, but you're welcome ;-)
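
For the archives, the workaround looks roughly like this. It is a sketch from
memory, not our production code: the class and method names are made up, and
cutting every abbreviation to three characters is an assumption that happens
to fit the CSV we had to read.

```java
import java.text.DateFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper illustrating the workaround; names are not from any library.
final class ShortMonthWorkaround {

    // Builds a SimpleDateFormat whose short month names are cut to at most three characters.
    static SimpleDateFormat withThreeLetterMonths(String pattern, Locale locale) {
        DateFormatSymbols dfs = DateFormatSymbols.getInstance(locale);
        String[] months = dfs.getShortMonths(); // returns a copy, safe to modify
        for (int i = 0; i < months.length; i++) {
            if (months[i].length() > 3) {
                months[i] = months[i].substring(0, 3); // e.g. "Dez." becomes "Dez"
            }
        }
        dfs.setShortMonths(months);
        return new SimpleDateFormat(pattern, dfs);
    }

    // Convenience wrapper that turns the checked ParseException into an unchecked one.
    static Date parse(String text, String pattern, Locale locale) {
        try {
            return withThreeLetterMonths(pattern, locale).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("12 Dez 2023", "dd MMM yyyy", Locale.GERMANY));
    }
}
```

Because the symbols are passed to the SimpleDateFormat constructor explicitly,
the result no longer depends on whether the JDK's locale data says "Dez" or
"Dez." for December.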

>>   "23. Dez. 2016 11:12:13.456" with template "dd. LLL. yyyy HH:mm:ss.S"
>>   and
>>   "23. Dez. 2016 11:12:13.4" with template "dd. LLL. yyyy HH:mm:ss.SSS"
> 
> The default parsing mode is "strict" in DateTimeFormatter, so the number of digits in
> the nanoseconds field must match the pattern. You will need to build the formatter like this:
> 
> new DateTimeFormatterBuilder().parseLenient().appendPattern("S").toFormatter()
> 
> to accept 0 to 9 digits of nanoseconds without padding.
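
To try that suggestion in isolation, here is a small sketch that compares a
strict and a partly lenient formatter. It is limited to the time portion so
that locale-specific month text stays out of the picture; the class name and
constants are made up, and switching to lenient only right before the
fraction is my own choice, not something prescribed above.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;

// Hypothetical demo class; only the java.time calls are real API.
final class FractionParsing {

    // Strict (default) parsing: "S" stands for exactly one fraction digit.
    static final DateTimeFormatter STRICT = DateTimeFormatter.ofPattern("HH:mm:ss.S");

    // Lenient parsing is switched on only for the fraction, which then takes 0 to 9 digits.
    static final DateTimeFormatter LENIENT_FRACTION = new DateTimeFormatterBuilder()
            .appendPattern("HH:mm:ss.")
            .parseLenient()
            .appendPattern("S")
            .toFormatter();

    // Returns true if the text can be parsed into a LocalTime with the given formatter.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            LocalTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String text : new String[] {"11:12:13.4", "11:12:13.456"}) {
            System.out.println(text + " strict=" + parses(STRICT, text)
                    + " lenient=" + parses(LENIENT_FRACTION, text));
        }
    }
}
```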

Why is it strict only for nanoseconds? Other parts of the time are parsed
leniently as well, i.e.

   assertEquals("check parsing of short month",
    "Sun Sep 03 01:02:03 CEST 2023",
    String.valueOf(parser.executeParser("03.09.2023 01:02:03", "d.M.y H:m:s", new Locale("en", "US"))));

passes. That's a bit inconsistent. ;-)
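
The same observation can be reproduced without our own executeParser helper.
The sketch below assumes plain DateTimeFormatter.ofPattern in its default
(strict) parsing mode, which may differ from what our parser does internally;
the class name is made up. As far as I read the Javadoc, single-letter number
fields like d, M and H are variable-width, so zero-padded input is accepted
even in strict mode, whereas the count of S letters fixes the number of
fraction digits.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class; it only wraps DateTimeFormatter.ofPattern and LocalDateTime.parse.
final class VariableWidthFields {

    // Parses the text with a formatter in its default (strict) parsing mode.
    static LocalDateTime parse(String text, String pattern) {
        return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // Both the padded and the unpadded form are accepted by the single-letter pattern.
        System.out.println(parse("03.09.2023 01:02:03", "d.M.y H:m:s"));
        System.out.println(parse("3.9.2023 1:2:3", "d.M.y H:m:s"));
    }
}
```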


Thanks and best regards,

Lothar Kimmeringer

