RFR: JDK-8061777, , (zipfs) IllegalArgumentException in ZipCoder.toString when using Shitft_JIS
Ulf Zibis
Ulf.Zibis at CoSoCo.de
Tue May 31 13:27:16 UTC 2016
Hi Sherman,
1.) wouldn't it be better to have a public getBytes() in AbstractStringBuilder?
Then you could save the array copy with sb.toString() here:
178 return new ZipPath(this, sb.toString(), zc.isUTF8());
525 return zfs.getBytes(to.toString());
You could simplify even more.
2.) No need for fork on isUTF8 in ZFS.iteratorOf() and for parameter isUTF8:
55 ZipPath(ZipFileSystem zfs, byte[] path, boolean normalized)
56 {
57 this.zfs = zfs;
58 if (normalized)
59 this.path = path;
60 // if zfs is NOT in utf8, normalize the path as "String"
61 // to avoid incorrectly normalizing byte '0x5c' (as '\')
62 // to '/'.
63 else if (zfs.zc.isUTF8())
64 this.path = normalize(path);
65 else
66 this.path = normalize(zfs.getString(path));
67 }
64 ZipPath(ZipFileSystem zfs, String path) {
65 this.zfs = zfs;
66 // if zfs is NOT in utf8, normalize the path as "String"
67 // to avoid incorrectly normalizing byte '0x5c' (as '\')
68 // to '/'.
69 if (zfs.zc.isUTF8()) {
70 this.path = normalize(zfs.getBytes(path));
71 } else {
72 this.path = normalize(path);
73 }
74 }
75
3.) We could benefit from better byte array performance for all one byte charsets, e.g. common
Windows-1252:
With:
if (zfs.zc.isOneByte())
this.path = normalize(path, zfs.zc.backSlashCode, zfs.zc.slashCode);
4.) We could do the same for all double byte charsets and benefit from the new intrinsic accessor
methods on byte arrays/buffers with VarHandles.
5.) As alternative to 2.) consider moving the string concatenation to ZipPath constructor. I
believe, this would make things more simple and less redundant:
ZipPath(ZipFileSystem zfs, String first, String... more)
6.) Avoid String instatiation especially when called from child paths iterator *loop*.
Instead:
500 if (len > 1 && prevC == '/')
501 path = path.substring(0, len - 1);
502 return zfs.getBytes(path);
provide:
500 return zfs.getBytes(path, 0, len - (len > 1 && prevC == '/' ? 1 : 0 ));
7.) Last but not least, the normalize algorithm could be an additional usecase for
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6695386
with '\' as terminator.
-Ulf
Am 31.05.2016 um 07:07 schrieb Xueming Shen:
> Thanks Paul,
>
> updated accordingly.
>
> http://cr.openjdk.java.net/~sherman/8061777/webrev
>
> -sherman
>
> On 5/30/16 2:31 AM, Paul Sandoz wrote:
>> Hi Sherman,
>>
>> Do you consider modifying the new ZipPath constructor you added to accept a boolean value for
>> UTF-8 encoding?
>>
>> If so you can more clearly document the behaviour and avoid duplication of the operators in
>> ZipFileSystem e.g.:
>>
>> return new ZipPath(this, first, zc.isUTF8());
>>
>> Paul.
More information about the core-libs-dev
mailing list