RFR: 8065138 - Encodings.isRecognizedEnconding sometimes fails to recognize 'UTF8'
Erik Joelsson
erik.joelsson at oracle.com
Thu Nov 20 13:36:48 UTC 2014
Here is my proposal for fixing the particular issue of generating the
correct properties files. I'm simply applying LC_ALL=C to the whole
command line instead of just to the sort at the end. It seems to require
using "export" for every command in the pipeline to pick it up.
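To illustrate why the "export" matters, here is a minimal sketch, not part of the patch. It assumes a POSIX sh; the "sed" and "sort" stage names are only labels for this demo, and each stage is a small sh -c command that reports the LC_ALL it sees. A VAR=value prefix applies only to the one command it precedes, so the earlier pipeline stages keep the caller's locale.

```shell
#!/bin/sh
# Start from a known state: no LC_ALL inherited from the caller.
unset LC_ALL

# Each stage reports on stderr which LC_ALL it sees, then passes stdin through.
# "$0" is the label given after the command string (hypothetical stage names).
stage='echo "$0 stage: LC_ALL=${LC_ALL:-unset}" >&2; cat'

# Prefix form: only the last stage gets LC_ALL=C.
prefix_report=$( { echo data | sh -c "$stage" sed \
    | LC_ALL=C sh -c "$stage" sort >/dev/null; } 2>&1 )

# Export form: every stage of the pipeline inherits LC_ALL=C.
# The command substitution runs in a subshell, so the export does not leak out.
export_report=$( { export LC_ALL=C; echo data | sh -c "$stage" sed \
    | sh -c "$stage" sort >/dev/null; } 2>&1 )

echo "with prefix on last stage only:"
echo "$prefix_report"
echo "with export in front of the pipeline:"
echo "$export_report"
```

In the prefix form the "sed" stage reports LC_ALL as unset, while in the export form both stages report C. The order of the two report lines within one pipeline is not guaranteed, since the stages run concurrently.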
Bug: https://bugs.openjdk.java.net/browse/JDK-8065138
Patch:
diff --git a/make/common/JavaCompilation.gmk b/make/common/JavaCompilation.gmk
--- a/make/common/JavaCompilation.gmk
+++ b/make/common/JavaCompilation.gmk
@@ -400,13 +400,15 @@
# Now we can setup the depency that will trigger the copying.
$$($1_BIN)$$($2_TARGET) : $2
$(MKDIR) -p $$(@D)
- $(CAT) $$< | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e 's/\([^\\]\)=/\1\\=/g' \
+ export LC_ALL=C ; $(CAT) $$< \
+ | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e 's/\([^\\]\)=/\1\\=/g' \
-e 's/\([^\\]\)!/\1\\!/g' -e 's/#.*/#/g' \
| $(SED) -f "$(SRC_ROOT)/make/common/support/unicode2x.sed" \
| $(SED) -e '/^#/d' -e '/^$$$$/d' \
-e :a -e '/\\$$$$/N; s/\\\n//; ta' \
-e 's/^[ ]*//;s/[ ]*$$$$//' \
- -e 's/\\=/=/' | LC_ALL=C $(SORT) > $$@
+ -e 's/\\=/=/' \
+ | $(SORT) > $$@
$(CHMOD) -f ug+w $$@
# And do not forget this target
I filed a separate issue [1] for investigating the use of pipefail.
/Erik
[1] https://bugs.openjdk.java.net/browse/JDK-8065576
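For reference, a small sketch of the pipefail behaviour discussed in the quoted thread below. It assumes bash is installed and on the PATH, and it uses "false" as a stand-in for a failing sed. It is only an illustration, not what the build does today.

```shell
#!/bin/sh
# Default behaviour: the pipeline's exit status is that of the last command,
# so the failure of the first stage is silently lost.
default_status=$(bash -c 'false | sort >/dev/null; echo $?')

# With pipefail, the pipeline fails if any stage fails.
pipefail_status=$(bash -c 'set -o pipefail; false | sort >/dev/null; echo $?')

echo "default:  $default_status"    # prints "default:  0"
echo "pipefail: $pipefail_status"   # prints "pipefail: 1"
```

Each variant runs in its own bash -c, so the option does not affect the calling shell and the sketch also works when the outer shell has no pipefail support.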
On 2014-11-20 10:34, Daniel Fuchs wrote:
> On 11/20/14 10:26 AM, Erik Joelsson wrote:
>> Hello,
>>
>> On 2014-11-20 02:20, Martin Buchholz wrote:
>>> Amusingly, the $(SORT) has an LC_ALL=C carefully placed before it, but
>>> the $(SED)s need it too!
>> Yes, I think that's the correct fix in this case.
>>> On Wed, Nov 19, 2014 at 5:18 PM, Martin Buchholz
>>> <martinrb at google.com> wrote:
>>>> [+ build-dev]
>>>>
>>>> I think I see the problem. By default, a Unix pipeline sadly fails
>>>> only when the last command fails. In bash, you can change this to a
>>>> more sensible default via
>>>> set -o pipefail
>>>> but that's not portable enough for openjdk.
>> This is interesting and something I had missed. I will experiment
>> with enabling pipefail if configure finds support for it. This will
>> include setting the SHELL to bash. We really should fail instead of
>> silently generating broken builds.
>>
>> Daniel, I can take over this bug if you want and work on a proper
>> build fix.
>
> Thanks Erik! You are welcome!
> So the whole issue seems to be that my default setting is
> LC_ALL=en_US.UTF-8
> whereas sed requires LC_ALL=C to work properly on these property files...
>
> When the test first failed I tried to rerun the test with LC_ALL=C -
> with no success
> of course. But I never thought of rebuilding with LC_ALL=C :-(
>
> My apologies for the red herring :-(
>
> best regards
>
> -- daniel
>
>>
>> /Erik
>>>> $(CAT) $$< | $(SED) -e 's/\([^\\]\):/\1\\:/g' -e 's/\([^\\]\)=/\1\\=/g' \
>>>> -e 's/\([^\\]\)!/\1\\!/g' -e 's/#.*/#/g' \
>>>> | $(SED) -f "$(SRC_ROOT)/make/common/support/unicode2x.sed" \
>>>> | $(SED) -e '/^#/d' -e '/^$$$$/d' \
>>>> -e :a -e '/\\$$$$/N; s/\\\n//; ta' \
>>>> -e 's/^[ ]*//;s/[ ]*$$$$//' \
>>>> -e 's/\\=/=/' | LC_ALL=C $(SORT) > $$@
>>>>
>>>> On Wed, Nov 19, 2014 at 5:07 PM, huizhe wang
>>>> <huizhe.wang at oracle.com> wrote:
>>>>> On 11/19/2014 4:09 PM, Mandy Chung wrote:
>>>>>>
>>>>>> On 11/19/2014 3:49 PM, Mandy Chung wrote:
>>>>>>>
>>>>>>> On 11/19/2014 12:50 PM, Daniel Fuchs wrote:
>>>>>>>> On 11/19/14 9:36 PM, Mandy Chung wrote:
>>>>>>>>> resources.jar will be gone when we move to the modular runtime
>>>>>>>>> image
>>>>>>>>> (JEP 220 [1]).
>>>>>>>>> JDK-8065138 and JDK-8065365 will become non-issue in JDK 9.
>>>>>>>> Do you mean that the property files will no longer be stripped
>>>>>>>> of their
>>>>>>>> comments?
>>>>>>>
>>>>>>> (sorry for my delay in reply as I was trying to get the number
>>>>>>> of the
>>>>>>> resources in the modular image vs resources.jar but got
>>>>>>> distracted.)
>>>>>>>
>>>>>>> The current version copies all bytes when generating the modular
>>>>>>> image.
>>>>>>> This is a good question whether we should strip off the comments
>>>>>>> when
>>>>>>> writing to the modular runtime image. I think we should look
>>>>>>> at the
>>>>>>> footprint and any performance saving and determine if we should
>>>>>>> do the same
>>>>>>> in JDK 9.
>>>>>>>
>>>>>> I looked at the exploded image under
>>>>>> BUILD_OUTPUTDIR/jdk/modules/java.xml
>>>>>> and found that the comments of Encodings.properties are stripped.
>>>>>> I was
>>>>>> confused with the mention of resources.jar that I assume it was a
>>>>>> step
>>>>>> stripping the comments before writing to resources.jar. This is
>>>>>> still
>>>>>> an issue in jigsaw m2 I believe.
>>>>>>
>>>>>> Where does the build strip the comments?
>>>>>
>>>>> A previous issue was that the build process strips off anything
>>>>> after '#'
>>>>> when copying properties files. In JDK8:
>>>>> jaxp/make/BuildJaxp.gmk:
>>>>> $(JAXP_OUTPUTDIR)/classes/%.properties: $(JAXP_TOPDIR)/src/%.properties
>>>>> $(MKDIR) -p $(@D)
>>>>> $(RM) $@ $@.tmp
>>>>> $(CAT) $< | LANG=C $(NAWK) '{ sub(/#.*$$/,"#"); print }' > $@.tmp
>>>>> $(MV) $@.tmp $@
>>>>>
>>>>> This was fixed in JDK 9. The per-repo process was removed. It
>>>>> looks like
>>>>> the properties processing process is now consolidated into
>>>>> make/common/JavaCompilation.gmk. So the issue Daniel found is new
>>>>> (in terms
>>>>> of stripping). Search 'properties files' to see the macro.
>>>>>
>>>>> Joe
>>>>>
>>>>>> Mandy
>>>>>>
>>
>