RFR: 8065138 - Encodings.isRecognizedEnconding sometimes fails to recognize 'UTF8'
Daniel Fuchs
daniel.fuchs at oracle.com
Thu Nov 20 17:25:03 UTC 2014
On 20/11/14 14:36, Erik Joelsson wrote:
> Here is my proposal for fixing the particular issue of generating the
> correct properties files. I'm simply adding LC_ALL=C to the whole
> command line instead of just the sort at the end. It seems to require
> using "export" to get picked up.
Hi Erik,
Looks good to me.
I have applied your patch (manually, because the copy/paste
seems to have eaten the tab characters away, which caused patch to
reject the diff) - and I confirm that the issue has disappeared.
Thanks for solving this!
Do you think it would be worth to commit this test modification
later on, in a followup Bug/RFE?
http://cr.openjdk.java.net/~dfuchs/webrev_8065138/webrev.00/jdk/test/javax/xml/jaxp/Encodings/CheckEncodingPropertiesFile.java.frames.html
If so I will take care of it.
best regards,
-- daniel
> Bug: https://bugs.openjdk.java.net/browse/JDK-8065138
> Patch:
> diff --git a/make/common/JavaCompilation.gmk
> b/make/common/JavaCompilation.gmk
> --- a/make/common/JavaCompilation.gmk
> +++ b/make/common/JavaCompilation.gmk
> @@ -400,13 +400,15 @@
> # Now we can setup the depency that will trigger the copying.
> $$($1_BIN)$$($2_TARGET) : $2
> $(MKDIR) -p $$(@D)
> - $(CAT) $$< | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e
> 's/\([^\\]\)=/\1\\=/g' \
> + export LC_ALL=C ; $(CAT) $$< \
> + | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e 's/\([^\\]\)=/\1\\=/g' \
> -e 's/\([^\\]\)!/\1\\!/g' -e 's/#.*/#/g' \
> | $(SED) -f "$(SRC_ROOT)/make/common/support/unicode2x.sed" \
> | $(SED) -e '/^#/d' -e '/^$$$$/d' \
> -e :a -e '/\\$$$$/N; s/\\\n//; ta' \
> -e 's/^[ ]*//;s/[ ]*$$$$//' \
> - -e 's/\\=/=/' | LC_ALL=C $(SORT) > $$@
> + -e 's/\\=/=/' \
> + | $(SORT) > $$@
> $(CHMOD) -f ug+w $$@
>
> # And do not forget this target
>
>
> I filed a separate issue [1] for investigating the use of pipefail.
>
> /Erik
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8065576
>
> On 2014-11-20 10:34, Daniel Fuchs wrote:
>> On 11/20/14 10:26 AM, Erik Joelsson wrote:
>>> Hello,
>>>
>>> On 2014-11-20 02:20, Martin Buchholz wrote:
>>>> Amusingly, the $(SORT) has an LC_ALL=C carefully placed before it, but
>>>> the $(SED)s need it too!
>>> Yes, I think that's the correct fix in this case.
>>>> On Wed, Nov 19, 2014 at 5:18 PM, Martin Buchholz
>>>> <martinrb at google.com> wrote:
>>>>> [+ build-dev]
>>>>>
>>>>> I think I see the problem. By default, a Unix pipeline sadly fails
>>>>> only when the last command fails. In bash, you can change this to a
>>>>> more sensible default via
>>>>> set -o pipefail
>>>>> but that's not portable enough for openjdk.
>>> This is interesting and something I had missed. I will experiment
>>> with enabling pipefail if configure finds support for it. This will
>>> include setting the SHELL to bash. We really should fail instead of
>>> silently generating broken builds.
>>>
>>> Daniel, I can take over this bug if you want and work on a proper
>>> build fix.
>>
>> Thanks Erik! You are welcome!
>> So the whole issue seems to be that my default setting is
>> LC_ALL=en_US.UTF-8
>> whereas sed requires LC_ALL=C to work properly on these property files...
>>
>> When the test first failed I tried to rerun the test with LC_ALL=C -
>> with no success
>> of course. But I never thought of rebuilding with LC_ALL=C :-(
>>
>> My apologies for the red herring :-(
>>
>> best regards
>>
>> -- daniel
>>
>>>
>>> /Erik
>>>>> $(CAT) $$< | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e
>>>>> 's/\([^\\]\)=/\1\\=/g' \
>>>>> -e 's/\([^\\]\)!/\1\\!/g' -e 's/#.*/#/g' \
>>>>> | $(SED) -f "$(SRC_ROOT)/make/common/support/unicode2x.sed" \
>>>>> | $(SED) -e '/^#/d' -e '/^$$$$/d' \
>>>>> -e :a -e '/\\$$$$/N; s/\\\n//; ta' \
>>>>> -e 's/^[ ]*//;s/[ ]*$$$$//' \
>>>>> -e 's/\\=/=/' | LC_ALL=C $(SORT) > $$@
>>>>>
>>>>> On Wed, Nov 19, 2014 at 5:07 PM, huizhe wang
>>>>> <huizhe.wang at oracle.com> wrote:
>>>>>> On 11/19/2014 4:09 PM, Mandy Chung wrote:
>>>>>>>
>>>>>>> On 11/19/2014 3:49 PM, Mandy Chung wrote:
>>>>>>>>
>>>>>>>> On 11/19/2014 12:50 PM, Daniel Fuchs wrote:
>>>>>>>>> On 11/19/14 9:36 PM, Mandy Chung wrote:
>>>>>>>>>> resources.jar will be gone when we move to the modular runtime
>>>>>>>>>> image
>>>>>>>>>> (JEP 220 [1]).
>>>>>>>>>> JDK-8065138 and JDK-8065365 will become non-issue in JDK 9.
>>>>>>>>> Do you mean that the property files will no longer be stripped
>>>>>>>>> of their
>>>>>>>>> comments?
>>>>>>>>
>>>>>>>> (sorry for my delay in reply as I was trying to get the number
>>>>>>>> of the
>>>>>>>> resources in the modular image vs resources.jar but got
>>>>>>>> distracted.)
>>>>>>>>
>>>>>>>> The current version copies all bytes when generating the modular
>>>>>>>> image.
>>>>>>>> This is a good question whether we should strip off the comments
>>>>>>>> when
>>>>>>>> writing to the modular runtime image. I think we should look
>>>>>>>> at the
>>>>>>>> footprint and any performance saving and determine if we should
>>>>>>>> do the same
>>>>>>>> in JDK 9.
>>>>>>>>
>>>>>>> I looked at the exploded image under
>>>>>>> BUILD_OUTPUTDIR/jdk/modules/java.xml
>>>>>>> and found that the comments of Encodings.properties are stripped.
>>>>>>> I was
>>>>>>> confused with the mention of resources.jar that I assume it was a
>>>>>>> step
>>>>>>> stripping the comments before writing to resources.jar. This is
>>>>>>> still
>>>>>>> an issue in jigsaw m2 I believe.
>>>>>>>
>>>>>>> Where does the build strip the comments?
>>>>>>
>>>>>> A previous issue was that the build process strips off anything
>>>>>> after '#'
>>>>>> when copying properties files. In JDK8:
>>>>>> jaxp/make/BuildJaxp.gmk:
>>>>>> $(JAXP_OUTPUTDIR)/classes/%.properties:
>>>>>> $(JAXP_TOPDIR)/src/%.properties
>>>>>> $(MKDIR) -p $(@D)
>>>>>> $(RM) $@ $@.tmp
>>>>>> $(CAT) $< | LANG=C $(NAWK) '{ sub(/#.*$$/,"#"); print }'
>>>>>> > $@.tmp
>>>>>> $(MV) $@.tmp $@
>>>>>>
>>>>>> This was fixed in JDK 9. The per-repo process was removed. It
>>>>>> looks like
>>>>>> the properties processing process is now consolidated into
>>>>>> make/common/JavaCompilation.gmk. So the issue Daniel found is new
>>>>>> (in terms
>>>>>> of stripping). Search 'properties files' to see the macro.
>>>>>>
>>>>>> Joe
>>>>>>
>>>>>>> Mandy
>>>>>>>
>>>
>>
>
More information about the build-dev
mailing list