<i18n dev> Date parsing issues with SimpleDateFormat and DateTimeFormatter
Lothar Kimmeringer
job at kimmeringer.de
Thu Oct 12 08:58:27 UTC 2023
On 11.10.2023 at 20:23, Naoto Sato wrote:
> I think the difference between MMM and LLL does not matter here, as they both
> represent short months, either context or standalone. The fundamental issue,
> whether the dot is parsed as part of the month name or as the literal in the
> pattern, is the same for both M/L.
I've looked up the meaning of standalone by now and agree ;-) Again, I think
the examples in the Javadoc description should be extended with at least one
non-English example that shows the difference, e.g. "settembre" (M) and
"Settembre" (L) for Italian.
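
To see the two forms side by side for a given locale, something like the following can be used. It is only a sketch: the class name MonthNameForms and the helper describe are made up for illustration, and the actual strings depend on the locale data of the JDK in use, so no output is shown.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.Locale;

class MonthNameForms {
    // Returns the format (M) and standalone (L) names of a month, long and short.
    static String describe(Month month, Locale locale) {
        return "format=" + month.getDisplayName(TextStyle.FULL, locale)
                + "/" + month.getDisplayName(TextStyle.SHORT, locale)
                + " standalone=" + month.getDisplayName(TextStyle.FULL_STANDALONE, locale)
                + "/" + month.getDisplayName(TextStyle.SHORT_STANDALONE, locale);
    }

    public static void main(String[] args) {
        // Results vary with the JDK version and the locale provider in use.
        System.out.println(describe(Month.SEPTEMBER, Locale.ITALIAN));
        System.out.println(describe(Month.DECEMBER, Locale.GERMAN));
    }
}
```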
>> This time though it happened between an update of Java 11.0.14 to 11.0.20 that
>> the result of a date-parsing process produced different results (I assume due
>> to the non-deterministic nature of map.keySet but I stopped debugging that
>> road when "discovering" LLL).
>
> This is interesting, as I expect no difference between update releases as
> CLDR version remains the same for both.
I wondered about that myself, but the reason I dug into the internals of these classes
at all was a user who could not parse an eBay revenue-summary CSV in which invoice
dates have the form "12 Dez 2023" (i.e. German month names), and I wasn't able
to reproduce the problem. That only changed after I tried it with a more recent
version of Java 11.
>> For our particular application I've solved it by setting up a SimpleDateFormat,
>> get the abbreviated months with sdf.getShortMonths(), cut them down to three
>> characters and reset them using dfs.setShortMonths.
>
> Thanks for letting us know the workaround. This kind of information helps our future work.
I'm not sure how because it's quite specific, but you're welcome ;-)
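
For the archive, the workaround described above could look roughly like this. It is a sketch and not the application's actual code: the class and method names are invented, and note that getShortMonths() and setShortMonths() live on DateFormatSymbols, which is obtained from the SimpleDateFormat.

```java
import java.text.DateFormatSymbols;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class ShortMonthWorkaround {
    // Builds a SimpleDateFormat whose short month names are cut to at most three characters.
    static SimpleDateFormat withThreeLetterMonths(String pattern, Locale locale) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, locale);
        DateFormatSymbols dfs = sdf.getDateFormatSymbols();
        String[] months = dfs.getShortMonths();
        for (int i = 0; i < months.length; i++) {
            if (months[i].length() > 3) {
                months[i] = months[i].substring(0, 3); // e.g. "Dez." becomes "Dez"
            }
        }
        dfs.setShortMonths(months);
        sdf.setDateFormatSymbols(dfs);
        return sdf;
    }

    public static void main(String[] args) {
        SimpleDateFormat sdf = withThreeLetterMonths("dd MMM yyyy", Locale.GERMAN);
        // parse(String, ParsePosition) returns null instead of throwing on failure.
        Date date = sdf.parse("12 Dez 2023", new ParsePosition(0));
        System.out.println(date);
    }
}
```

Cutting names to three characters is a blunt instrument. It fits inputs like "12 Dez 2023" but would stop matching abbreviations that are legitimately longer in a given locale.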
>> "23. Dez. 2016 11:12:13.456" with template "dd. LLL. yyyy HH:mm:ss.S"
>> and
>> "23. Dez. 2016 11:12:13.4" with template "dd. LLL. yyyy HH:mm:ss.SSS"
>
> The default parsing mode is "strict" in DateTimeFormatter, so the number of digits in
> those nano seconds should match. You will need to build the formatter such that:
>
> new DateTimeFormatterBuilder().parseLenient().appendPattern("S").toFormatter()
>
> to accept 0 to 9 digits nanosecs without padding.
Why is it strict only for nanoseconds? Other parts of the time are parsed
leniently as well, e.g.
assertEquals("check parsing of short month",
"Sun Sep 03 01:02:03 CEST 2023",
String.valueOf(parser.executeParser("03.09.2023 01:02:03", "d.M.y H:m:s", new Locale("en", "US"))));
passes. That's a bit inconsistent. ;-)
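
A possible explanation, sketched with plain java.time formatters (parser.executeParser from the test above is not reproduced here, so whether it behaves the same way is an assumption): single pattern letters such as "d" or "H" are variable-width numeric fields even in strict mode, while doubled letters are fixed-width. For the fraction, the sketch uses appendFraction() with a width of 0 to 9, which is a different route from the parseLenient() suggestion quoted above. The class and field names are made up.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

class FieldWidthDemo {
    // Single letters: variable-width numeric fields, so "03" and "3" both parse.
    static final DateTimeFormatter SINGLE = DateTimeFormatter.ofPattern("d.M.y H:m:s");
    // Doubled letters: fixed two-digit fields (four or more digits for the year).
    static final DateTimeFormatter DOUBLED = DateTimeFormatter.ofPattern("dd.MM.yyyy HH:mm:ss");
    // Optional fraction of 0 to 9 digits, decimal point included.
    static final DateTimeFormatter FRACTION = new DateTimeFormatterBuilder()
            .appendPattern("dd.MM.yyyy HH:mm:ss")
            .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
            .toFormatter();

    // True if the text parses completely with the given formatter.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses(SINGLE, "03.09.2023 01:02:03"));
        System.out.println(parses(SINGLE, "3.9.2023 1:2:3"));
        System.out.println(parses(DOUBLED, "3.9.2023 1:2:3"));
        System.out.println(LocalDateTime.parse("23.12.2016 11:12:13.456", FRACTION));
        System.out.println(LocalDateTime.parse("23.12.2016 11:12:13.4", FRACTION));
    }
}
```

Seen this way, "S" is strict in the same sense as "dd" is: the letter count fixes the width. What looks like leniency in "d.M.y H:m:s" comes from the pattern itself, not from the parsing mode.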
Thanks and best regards,
Lothar Kimmeringer