RFR: 8065138 - Encodings.isRecognizedEnconding sometimes fails to recognize 'UTF8'

huizhe wang huizhe.wang at oracle.com
Thu Nov 20 17:57:36 UTC 2014


Awesome!  Glad to see the fix in the build process. It's a relief to 
know we don't have to worry about individual properties files.

Daniel, your test looks good to me. That's how jaxp loads 
Encodings.properties.

Thanks,
Joe

On 11/20/2014 9:25 AM, Daniel Fuchs wrote:
> On 20/11/14 14:36, Erik Joelsson wrote:
>> Here is my proposal for fixing the particular issue of generating the
>> correct properties files. I'm simply adding LC_ALL=C to the whole
>> command line instead of just the sort at the end. It seems to require
>> using "export" to get picked up.
>
> Hi Erik,
>
> Looks good to me.
>
> I have applied your patch (manually, because the copy/paste
> seems to have eaten the tab characters away, which caused patch to
> reject the diff) - and I confirm that the issue has disappeared.
>
> Thanks for solving this!
>
> Do you think it would be worth to commit this test modification
> later on, in a followup Bug/RFE?
>
> http://cr.openjdk.java.net/~dfuchs/webrev_8065138/webrev.00/jdk/test/javax/xml/jaxp/Encodings/CheckEncodingPropertiesFile.java.frames.html 
>
>
> If so I will take care of it.
>
> best regards,
>
> -- daniel
>
>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8065138
>> Patch:
>> diff --git a/make/common/JavaCompilation.gmk
>> b/make/common/JavaCompilation.gmk
>> --- a/make/common/JavaCompilation.gmk
>> +++ b/make/common/JavaCompilation.gmk
>> @@ -400,13 +400,15 @@
>>     # Now we can setup the depency that will trigger the copying.
>>     $$($1_BIN)$$($2_TARGET) : $2
>>       $(MKDIR) -p $$(@D)
>> -    $(CAT) $$< | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e
>> 's/\([^\\]\)=/\1\\=/g' \
>> +    export LC_ALL=C ; $(CAT) $$< \
>> +        | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e 's/\([^\\]\)=/\1\\=/g' \
>>               -e 's/\([^\\]\)!/\1\\!/g' -e 's/#.*/#/g' \
>>           | $(SED) -f "$(SRC_ROOT)/make/common/support/unicode2x.sed" \
>>           | $(SED) -e '/^#/d' -e '/^$$$$/d' \
>>               -e :a -e '/\\$$$$/N; s/\\\n//; ta' \
>>               -e 's/^[     ]*//;s/[     ]*$$$$//' \
>> -            -e 's/\\=/=/' | LC_ALL=C $(SORT) > $$@
>> +            -e 's/\\=/=/' \
>> +        | $(SORT) > $$@
>>       $(CHMOD) -f ug+w $$@
>>
>>     # And do not forget this target
>>
>>
>> I filed a separate issue [1] for investigating the use of pipefail.
>>
>> /Erik
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8065576
>>
>> On 2014-11-20 10:34, Daniel Fuchs wrote:
>>> On 11/20/14 10:26 AM, Erik Joelsson wrote:
>>>> Hello,
>>>>
>>>> On 2014-11-20 02:20, Martin Buchholz wrote:
>>>>> Amusingly, the $(SORT) has an LC_ALL=C carefully placed before it, 
>>>>> but
>>>>> the $(SED)s need it too!
>>>> Yes, I think that's the correct fix in this case.
>>>>> On Wed, Nov 19, 2014 at 5:18 PM, Martin Buchholz
>>>>> <martinrb at google.com> wrote:
>>>>>> [+ build-dev]
>>>>>>
>>>>>> I think I see the problem.  By default, a Unix pipeline sadly fails
>>>>>> only when the last command fails.  In bash, you can change this to a
>>>>>> more sensible default via
>>>>>> set -o pipefail
>>>>>> but that's not portable enough for openjdk.
>>>> This is interesting and something I had missed. I will experiment
>>>> with enabling pipefail if configure finds support for it. This will
>>>> include setting the SHELL to bash. We really should fail instead of
>>>> silently generating broken builds.
>>>>
>>>> Daniel, I can take over this bug if you want and work on a proper
>>>> build fix.
>>>
>>> Thanks Erik! You are welcome!
>>> So the whole issue seems to be that my default setting is
>>> LC_ALL=en_US.UTF-8
>>> whereas sed requires LC_ALL=C to work properly on these property 
>>> files...
>>>
>>> When the test first failed I tried to rerun the test with LC_ALL=C -
>>> with no success
>>> of course. But I never thought of rebuilding with LC_ALL=C :-(
>>>
>>> My apologies for the red herring :-(
>>>
>>> best regards
>>>
>>> -- daniel
>>>
>>>>
>>>> /Erik
>>>>>> $(CAT) $$< | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e
>>>>>> 's/\([^\\]\)=/\1\\=/g' \
>>>>>>         -e 's/\([^\\]\)!/\1\\!/g' -e 's/#.*/#/g' \
>>>>>>     | $(SED) -f "$(SRC_ROOT)/make/common/support/unicode2x.sed" \
>>>>>>     | $(SED) -e '/^#/d' -e '/^$$$$/d' \
>>>>>>         -e :a -e '/\\$$$$/N; s/\\\n//; ta' \
>>>>>>         -e 's/^[ ]*//;s/[ ]*$$$$//' \
>>>>>>         -e 's/\\=/=/' | LC_ALL=C $(SORT) > $$@
>>>>>>
>>>>>> On Wed, Nov 19, 2014 at 5:07 PM, huizhe wang
>>>>>> <huizhe.wang at oracle.com> wrote:
>>>>>>> On 11/19/2014 4:09 PM, Mandy Chung wrote:
>>>>>>>>
>>>>>>>> On 11/19/2014 3:49 PM, Mandy Chung wrote:
>>>>>>>>>
>>>>>>>>> On 11/19/2014 12:50 PM, Daniel Fuchs wrote:
>>>>>>>>>> On 11/19/14 9:36 PM, Mandy Chung wrote:
>>>>>>>>>>> resources.jar will be gone when we move to the modular runtime
>>>>>>>>>>> image
>>>>>>>>>>> (JEP 220 [1]).
>>>>>>>>>>> JDK-8065138 and JDK-8065365 will become non-issue in JDK 9.
>>>>>>>>>> Do you mean that the property files will no longer be stripped
>>>>>>>>>> of their
>>>>>>>>>> comments?
>>>>>>>>>
>>>>>>>>> (sorry for my delay in reply as I was trying to get the number
>>>>>>>>> of the
>>>>>>>>> resources in the modular image vs resources.jar but got
>>>>>>>>> distracted.)
>>>>>>>>>
>>>>>>>>> The current version copies all bytes when generating the modular
>>>>>>>>> image.
>>>>>>>>> This is a good question whether we should strip off the comments
>>>>>>>>> when
>>>>>>>>> writing to the modular runtime image.   I think we should look
>>>>>>>>> at the
>>>>>>>>> footprint and any performance saving and determine if we should
>>>>>>>>> do the same
>>>>>>>>> in JDK 9.
>>>>>>>>>
>>>>>>>> I looked at the exploded image under
>>>>>>>> BUILD_OUTPUTDIR/jdk/modules/java.xml
>>>>>>>> and found that the comments of Encodings.properties are stripped.
>>>>>>>> I was
>>>>>>>> confused with the mention of resources.jar that I assume it was a
>>>>>>>> step
>>>>>>>> stripping the comments before writing to resources.jar. This is
>>>>>>>> still
>>>>>>>> an issue in jigsaw m2 I believe.
>>>>>>>>
>>>>>>>> Where does the build strip the comments?
>>>>>>>
>>>>>>> A previous issue was that the build process strips off anything
>>>>>>> after '#'
>>>>>>> when copying properties files. In JDK8:
>>>>>>> jaxp/make/BuildJaxp.gmk:
>>>>>>> $(JAXP_OUTPUTDIR)/classes/%.properties:
>>>>>>> $(JAXP_TOPDIR)/src/%.properties
>>>>>>>          $(MKDIR) -p $(@D)
>>>>>>>          $(RM) $@ $@.tmp
>>>>>>>          $(CAT) $< | LANG=C $(NAWK) '{ sub(/#.*$$/,"#"); print }'
>>>>>>> > $@.tmp
>>>>>>>          $(MV) $@.tmp $@
>>>>>>>
>>>>>>> This was fixed in JDK 9.  The per-repo process was removed. It
>>>>>>> looks like
>>>>>>> the properties processing process is now consolidated into
>>>>>>> make/common/JavaCompilation.gmk. So the issue Daniel found is new
>>>>>>> (in terms
>>>>>>> of stripping). Search 'properties files' to see the macro.
>>>>>>>
>>>>>>> Joe
>>>>>>>
>>>>>>>> Mandy
>>>>>>>>
>>>>
>>>
>>
>




More information about the build-dev mailing list