<i18n dev> Date parsing issues with SimpleDateFormat and DateTimeFormatter
Naoto Sato
naoto.sato at oracle.com
Wed Oct 11 18:23:04 UTC 2023
Hi,
On 10/11/23 2:29 AM, Lothar Kimmeringer wrote:
> Hi,
>
> thanks for the answer.
>
> Am 10.10.2023 um 19:24 schrieb Naoto Sato:
>
>> The issue you are seeing here is one we have been aware of for a long
>> time: https://bugs.openjdk.org/browse/JDK-8194289
>
> These issues share keywords, but I think they are different. The bug
> report you've linked to uses MMM as the template (as our user base
> currently does), but the solution for that particular problem (as far
> as I understood) is to use LLL instead. As shown in my JUnit test when
> using Java 11, these tests go through:
I think the difference between MMM and LLL does not matter here, as they
both represent short month names, in either format (context) or
stand-alone form. The fundamental issue, whether the dot in the input is
parsed as part of the month name or as the literal in the pattern, is the
same for both M and L.
>
> assertEquals("check parsing of short month",
>     "Tue Sep 12 00:00:00 CEST 2023",
>     String.valueOf(parser.executeParser("12. Sep 2023", "dd. LLL yyyy",
>         new Locale("de", "DE"))));
> assertEquals("check parsing of short month",
>     "Sun Mar 12 00:00:00 CET 2023",
>     String.valueOf(parser.executeParser("12 Mär 2023", "dd LLL yyyy",
>         new Locale("de", "DE"))));
>
> are passing.
>
> What's failing is the case where you always have an abbreviated month
> with a dot (i.e. in English as well):
>
> assertEquals("check parsing of short month",
>     "Fri Dec 23 11:12:13 CET 2016",
>     String.valueOf(parser.executeParser("23. Dec. 2016 11:12:13.445",
>         "dd. LLL. yyyy HH:mm:ss.SSS", new Locale("en", "US"))));
>
> is passing while
>
> assertEquals("check parsing of short month",
>     "Fri Dec 23 11:12:13 CET 2016",
>     String.valueOf(parser.executeParser("23. Dez. 2016 11:12:13.456",
>         "dd. LLL. yyyy HH:mm:ss.SSS", new Locale("de", "DE"))));
>
> is not. My understanding is that LLL is supposed to represent abbreviated
> months with three letters without dot, so there shouldn't be different
> outcomes when using different Locales. At least I would expect format and
> parse are bijective which is not the case with this particular example.
The pattern "LLL" does not mean "without dot." It merely represents
stand-alone short month names, so depending on the locale the name may or
may not include a trailing dot (cf.
https://www.unicode.org/cldr/charts/43/by_type/date_&_time.gregorian.html#5a9ad242364bdb22).
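
To see what a given JDK actually uses for a locale, the short month names
can be listed directly. The following is only a minimal sketch: the class
and method names are made up for illustration, and the printed names
depend on the JDK version and the locale data provider in use, so no
output is shown. TextStyle.SHORT corresponds to "MMM" and
TextStyle.SHORT_STANDALONE to "LLL" in java.time patterns.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the JDK or of the application discussed here.
final class ShortMonthNames {
    /** Returns the 12 month names of the given text style for a locale. */
    static List<String> shortNames(TextStyle style, Locale locale) {
        List<String> names = new ArrayList<>();
        for (Month m : Month.values()) {
            names.add(m.getDisplayName(style, locale));
        }
        return names;
    }

    public static void main(String[] args) {
        for (Locale locale : new Locale[] {Locale.US, Locale.GERMANY}) {
            // SHORT is the style behind "MMM", SHORT_STANDALONE the one behind "LLL".
            System.out.println(locale + " MMM: " + shortNames(TextStyle.SHORT, locale));
            System.out.println(locale + " LLL: " + shortNames(TextStyle.SHORT_STANDALONE, locale));
        }
    }
}
```

Any name in the output that ends in "." would consume the dot in the input
text, leaving nothing to match a literal "." that follows LLL or MMM in
the pattern.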
>
>> The locale data changes from time to time, from small ones such as
>> translation changes to the somewhat significant ones
>
> I know. That's why I have plenty of unit tests to see these changes
> between Java releases ;-) and this isn't the first time I had to find
> workarounds to keep the functionality that is provided to our user base.
>
> This time, though, an update from Java 11.0.14 to 11.0.20 caused a
> date-parsing process to produce different results (I assume due to the
> non-deterministic nature of map.keySet, but I stopped debugging down that
> road when "discovering" LLL).
This is interesting. I would expect no difference between update
releases, since the CLDR version remains the same for both.
>
>> so I understand you will need to adapt to whatever the date format the
>> app receives.
>>
>> As the immediate temporary solution, you could choose the legacy
>> locale data over CLDR,
>
> For our particular application I've solved it by setting up a
> SimpleDateFormat, getting the abbreviated months from its
> DateFormatSymbols with dfs.getShortMonths(), cutting them down to three
> characters and resetting them using dfs.setShortMonths.
Thanks for letting us know the workaround. This kind of information
helps our future work.
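
For readers of the archive, the workaround described above might look
roughly like the sketch below. This is a reconstruction from the
description only, not the original poster's code: the class and method
names are invented, and it assumes an "MMM"-style pattern, since
DateFormatSymbols.getShortMonths() holds the format-context short names.

```java
import java.text.DateFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper illustrating the described workaround.
final class ShortMonthWorkaround {
    /** Builds a SimpleDateFormat whose short month names are cut to at most 3 characters. */
    static SimpleDateFormat withThreeLetterMonths(String pattern, Locale locale) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, locale);
        DateFormatSymbols dfs = sdf.getDateFormatSymbols();
        String[] months = dfs.getShortMonths(); // a copy; may contain an empty 13th entry
        for (int i = 0; i < months.length; i++) {
            if (months[i].length() > 3) {
                months[i] = months[i].substring(0, 3); // e.g. drops a trailing dot
            }
        }
        dfs.setShortMonths(months);
        sdf.setDateFormatSymbols(dfs); // install the modified symbols on the formatter
        return sdf;
    }

    public static void main(String[] args) throws ParseException {
        SimpleDateFormat sdf = withThreeLetterMonths("dd. MMM. yyyy", Locale.GERMANY);
        Date date = sdf.parse("23. Dez. 2016");
        System.out.println(date);
    }
}
```

With the names trimmed, the dot after the month in the input is left for
the literal "." in the pattern to match, regardless of whether the locale
data includes it in the abbreviation.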
>
> I don't suggest that as general solution, it's just a workaround "here" to
> keep the application's behavior to the way it's been the last decades. ;-)
> So my actual "problem" has been solved with this workaround but I'm simply
> confused about the "LLL." issue shown above and the ".S" issue that got
> lost in your mail, i.e. the issue with java.time.DateTimeFormatter.parse
> not being able to parse
>
> "23. Dez. 2016 11:12:13.456" with template "dd. LLL. yyyy HH:mm:ss.S"
> and
> "23. Dez. 2016 11:12:13.4" with template "dd. LLL. yyyy HH:mm:ss.SSS"
The default parsing mode in DateTimeFormatter is "strict", so the number
of fraction digits in the input must match the number of "S" letters in
the pattern. To accept 0 to 9 fraction digits without padding, you will
need to build the formatter in lenient mode, such as:
new
DateTimeFormatterBuilder().parseLenient().appendPattern("S").toFormatter()
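
As a slightly fuller sketch of the same idea, the formatter below parses
only the time part of the examples, which keeps the month-name question
out of the picture. The class name and the choice to switch to lenient
mode only for the fraction are assumptions made for this illustration.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

// Hypothetical demo class; only the JDK calls are real API.
final class LenientFraction {
    // Hours, minutes and seconds stay strict; only the fraction is lenient.
    static final DateTimeFormatter TIME = new DateTimeFormatterBuilder()
            .appendPattern("HH:mm:ss.")
            .parseLenient()          // applies to the elements appended after this call
            .appendPattern("S")      // in lenient mode, accepts 0 to 9 fraction digits
            .toFormatter(Locale.ROOT);

    static LocalTime parse(String text) {
        return LocalTime.parse(text, TIME);
    }

    public static void main(String[] args) {
        System.out.println(parse("11:12:13.456"));
        System.out.println(parse("11:12:13.4"));
    }
}
```

Both inputs parse with the same formatter, whereas a formatter created
with DateTimeFormatter.ofPattern("HH:mm:ss.S") accepts exactly one
fraction digit.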
Naoto