From claes.redestad at oracle.com Sat Jun 1 00:13:35 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Sat, 1 Jun 2019 02:13:35 +0200
Subject: RFR: 8225061: Performance regression in Regex
Message-ID: <4f665c8c-5afa-3121-4671-3cc92f8b1a48@oracle.com>

Hi,

recent Unicode 12.1 updates caused a noticeable regression to Mac OS X
build times.

Quoting Naoto:
"The regression was caused by the call to Grapheme.nextBoundary() in
NFCCharProperty.match() method, which got slower with the fix to
JDK-8221431 / JDK-8222978 (Unicode 12.1 / Grapheme 12.0 support). The
purpose of issuing nextBoundary() is to detect whether to call (much
heavy weight) Normalizer.normalize() call or not. Since this fast check
does not require fully fledged boundary detection, including stateful
segmentation check such as Emoji sequence, simply checking the break
possibility between two code points as before should suffice. Suggested
fix is to bring back the isBoundary(cp1, cp2) method from the previous
revision in Grapheme.java, and issue it only from
NFCCharProperty.match() method for the fast check."

Bug:    https://bugs.openjdk.java.net/browse/JDK-8225061
Webrev: http://cr.openjdk.java.net/~redestad/8225061/open.01/

While narrowing this down, I created a couple of microbenchmarks and
experimented with a sequence of optimizations that got the regression of
using the heavier nextBoundary() check down from about 300x to just
about 2x as costly as before JDK-8221431. These improvements were then
bypassed by reverting to isBoundary in some micros, but still helps a
lot in other cases that has taken a toll from making the grapheme logic
more complete/correct, so I'd like to leave them in.

Testing: tier1-3, verified a 300x speedup in the complex
Pattern.CANON_EQ micro, and a 2x speedup on the simpler Grapheme/\\b{g}
micro.

Thanks!

/Claes

From naoto.sato at oracle.com Sat Jun 1 00:23:08 2019
From: naoto.sato at oracle.com (naoto.sato at oracle.com)
Date: Fri, 31 May 2019 17:23:08 -0700
Subject: RFR: 8225061: Performance regression in Regex
In-Reply-To: <4f665c8c-5afa-3121-4671-3cc92f8b1a48@oracle.com>
References: <4f665c8c-5afa-3121-4671-3cc92f8b1a48@oracle.com>
Message-ID: <4ebb190d-aa71-650f-c74f-5f77da72c2bd@oracle.com>

Hi Claes,

Looks good to me. Thanks for catching this on so quickly!

Naoto

On 5/31/19 5:13 PM, Claes Redestad wrote:
> Hi,
>
> recent Unicode 12.1 updates caused a noticeable regression to Mac OS X
> build times.
>
> Quoting Naoto:
> "The regression was caused by the call to Grapheme.nextBoundary() in
> NFCCharProperty.match() method, which got slower with the fix to
> JDK-8221431 / JDK-8222978 (Unicode 12.1 / Grapheme 12.0 support). The
> purpose of issuing nextBoundary() is to detect whether to call (much
> heavy weight) Normalizer.normalize() call or not. Since this fast check
> does not require fully fledged boundary detection, including stateful
> segmentation check such as Emoji sequence, simply checking the break
> possibility between two code points as before should suffice. Suggested
> fix is to bring back the isBoundary(cp1, cp2) method from the previous
> revision in Grapheme.java, and issue it only from
> NFCCharProperty.match() method for the fast check."
>
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8225061
> Webrev: http://cr.openjdk.java.net/~redestad/8225061/open.01/
>
> While narrowing this down, I created a couple of microbenchmarks and
> experimented with a sequence of optimizations that got the regression of
> using the heavier nextBoundary() check down from about 300x to just
> about 2x as costly as before JDK-8221431. These improvements were then
> bypassed by reverting to isBoundary in some micros, but still helps a
> lot in other cases that has taken a toll from making the grapheme logic
> more complete/correct, so I'd like to leave them in.
>
> Testing: tier1-3, verified a 300x speedup in the complex
> Pattern.CANON_EQ micro, and a 2x speedup on the simpler Grapheme/\\b{g}
> micro.
>
> Thanks!
>
> /Claes

From claes.redestad at oracle.com Sat Jun 1 00:58:59 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Sat, 1 Jun 2019 02:58:59 +0200
Subject: RFR: 8225061: Performance regression in Regex
In-Reply-To: <4ebb190d-aa71-650f-c74f-5f77da72c2bd@oracle.com>
References: <4f665c8c-5afa-3121-4671-3cc92f8b1a48@oracle.com>
 <4ebb190d-aa71-650f-c74f-5f77da72c2bd@oracle.com>
Message-ID:

Hi Naoto,

thanks for reviewing!

/Claes

On 2019-06-01 02:23, naoto.sato at oracle.com wrote:
> Hi Claes,
>
> Looks good to me. Thanks for catching this on so quickly!
>
> Naoto
>
> On 5/31/19 5:13 PM, Claes Redestad wrote:
>> Hi,
>>
>> recent Unicode 12.1 updates caused a noticeable regression to Mac OS X
>> build times.
>>
>> Quoting Naoto:
>> "The regression was caused by the call to Grapheme.nextBoundary() in
>> NFCCharProperty.match() method, which got slower with the fix to
>> JDK-8221431 / JDK-8222978 (Unicode 12.1 / Grapheme 12.0 support). The
>> purpose of issuing nextBoundary() is to detect whether to call (much
>> heavy weight) Normalizer.normalize() call or not. Since this fast check
>> does not require fully fledged boundary detection, including stateful
>> segmentation check such as Emoji sequence, simply checking the break
>> possibility between two code points as before should suffice. Suggested
>> fix is to bring back the isBoundary(cp1, cp2) method from the previous
>> revision in Grapheme.java, and issue it only from
>> NFCCharProperty.match() method for the fast check."
>>
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8225061
>> Webrev: http://cr.openjdk.java.net/~redestad/8225061/open.01/
>>
>> While narrowing this down, I created a couple of microbenchmarks and
>> experimented with a sequence of optimizations that got the regression of
>> using the heavier nextBoundary() check down from about 300x to just
>> about 2x as costly as before JDK-8221431. These improvements were then
>> bypassed by reverting to isBoundary in some micros, but still helps a
>> lot in other cases that has taken a toll from making the grapheme logic
>> more complete/correct, so I'd like to leave them in.
>>
>> Testing: tier1-3, verified a 300x speedup in the complex
>> Pattern.CANON_EQ micro, and a 2x speedup on the simpler Grapheme/\\b{g}
>> micro.
>>
>> Thanks!
>>
>> /Claes

From peter.levart at gmail.com Sun Jun 2 07:21:36 2019
From: peter.levart at gmail.com (Peter Levart)
Date: Sun, 2 Jun 2019 09:21:36 +0200
Subject: RFR 8220238 : Enhancing j.l.Runtime/System::gc specification with an explicit 'no guarantee' statement
In-Reply-To: <99120787-64db-449b-1df2-9e69c14efb03@oracle.com>
References: <0f7c965f-e07b-67d4-2d37-ba911ca0c66e@oracle.com>
 <8ce9af61-98a2-3497-d59e-6daa00d951a4@redhat.com>
 <939dc43a-2080-c055-51a7-f7f66bd44e2c@redhat.com>
 <61a8908b-1b5f-40d0-8f01-feef59f27d52@oracle.com>
 <449aa33b-39e2-8f8f-90f2-ffcb04c7e6a0@redhat.com>
 <47be64b7-f88b-8125-ac55-8fb65cfd0cbc@redhat.com>
 <99120787-64db-449b-1df2-9e69c14efb03@oracle.com>
Message-ID: <51381bab-2d30-26aa-8bd9-0d1211af4f9c@gmail.com>

On 5/31/19 7:24 PM, Roger Riggs wrote:
> Hi Martin,
>
> True, calling System.gc() and then checking for its hoped-for/expected
> side-effects is the norm.
> But its robustness depends on a combination of gc implementation
> behavior and
> the particular side-effect expected: allocation, reference processing,
> etc.
>
> Roger
>
> On 05/30/2019 01:30 PM, Martin Buchholz wrote:
>> If you are calling System.gc() for correctness (e.g. in a test), it
>> is probably because some sort of finalization is being triggered.
>> And that happens in some Java thread (e.g. Reference Handler) that
>> System.gc() has no control over. So in practice, users need to call
>> System.gc() and then wait for subsequent reference processing somehow.
>

...there is an internal API
(java.lang.ref.Reference#waitForReferenceProcessing) and a usage of it
(java.nio.Bits#reserveMemory) where it is hoped that there is a "happens
before" between making the newly discovered and cleared Reference(s)
available for enqueue-ing/processing (in Reference Handler thread) and
System.gc() returning. In such case, there is a minimum possible latency
between requesting System.gc() and actual processing of Reference(s)
such that there is no Thread.sleep() involved. But there's also delay
based fallback in case the "effort" by System.gc() is not synchronous...

So if some (future) GC(s) make System.gc() method (partly) asynchronous,
the above usage will still work, but perhaps with more latency when
native memory is exhausted.

Regards, Peter

From christoph.langer at sap.com Sun Jun 2 21:35:33 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Sun, 2 Jun 2019 21:35:33 +0000
Subject: RFR 8213031: (zipfs) Add support for POSIX file permissions
In-Reply-To: <33E52896-E491-4660-A932-46D5A270A63E@oracle.com>
References: <46af4f9d-60f9-c60d-e6f1-8fb97df3ba2e@oracle.com>
 <5d28b0715d2041ff892a3c44e024f120@sap.com>
 <8e9231ff-b7d5-bc2f-2643-713f3c670a1d@oracle.com>
 <3aeba9a64a434a968fc1d82a44077745@sap.com>
 <953449f0913340f6a94ae3d83611fd92@sap.com>
 <9b3c9cfe-63e9-94ea-1af0-7ba9e2db52fd@oracle.com>
 <62a26037-a991-dc31-a972-a82386f63b92@oracle.com>
 <9806f4b1-9b55-d1d8-4511-5db0ef6786a5@oracle.com>
 <33E52896-E491-4660-A932-46D5A270A63E@oracle.com>
Message-ID:

Hi Alan, Lance,

thanks for the updated wording in module-info, that really looks good.
I incorporated it into my change, now we'd be here:
http://cr.openjdk.java.net/~clanger/webrevs/8213031.13/

To be honest, I was hoping it would still make it into JDK13. I guess
now I shall update the CSR to get it reviewed, correct?

Thanks
Christoph

From: Lance Andersen
Sent: Friday, May 31, 2019 20:01
To: Alan Bateman
Cc: Langer, Christoph; nio-dev; Java Core Libs
Subject: Re: RFR 8213031: (zipfs) Add support for POSIX file permissions

On May 31, 2019, at 12:32 PM, Alan Bateman wrote:

On 29/05/2019 13:16, Langer, Christoph wrote:

Hi Alan,

The table items in L119-150 look fine, we just need to avoid really long
lines. One minor comment on L123 is that it might be clearer if you drop
"created" from the sentence. L48-78 is a "wall of text" and links that I
don't think will be easy for most developers to read. Can I provide
suggested wording for this part of the spec? I'm just thinking that an
alternative wording might help avoid too much iteration on this.

I have created a new webrev to add some linebreaks and pick up your
suggestion to drop the word "created" in L123.
http://cr.openjdk.java.net/~clanger/webrevs/8213031.12/
Waiting on your update for the other part.

Attached is alternative wording for the "Support for POSIX file
permissions" section. My concern with the proposed in webrev.12 is that
it's dense and not easy to read. It also misses a few things - one
important one is that access permissions aren't enforced.

I think the wording below is looking good. A few minor suggestions below.

So overall I think you've got this feature into reasonable shape (I
realize it has taken 7 months to get here, this is perhaps a good
example of something that needs a lot of up front discussion before
going near the code).

Once we finalize the CSR, we should look towards getting the changes in
early in the JDK 14 cycle so that we have time to vet/catch potential
issues.

-Alan

import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttributeView;
import java.nio.file.attribute.PosixFileAttributes;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFileAttributeView;
import java.util.Set;

 *

 * POSIX file attributes
 *
 * A Zip file system supports a file attribute {@link FileAttributeView view}
 * named "{@code zip}" that defines the following file attribute:
 *
 * Supported attributes
 *
 *   Name           Type
 *   permissions    {@link Set}<{@link PosixFilePermission}>
 *
 * The "permissions" attribute is the set of access permissions that are optionally
 * stored for entries in a Zip file. The value of the attribute is {@code null}
 * for entries that do not have access permissions. Zip file systems do not
 * enforce access permissions.
 *
 * The "permissions" attribute can be read and set using the
                                       can-> may
 * {@linkplain Files#getAttribute(Path, String, LinkOption...) Files.getAttribute} and
 * {@linkplain Files#setAttribute(Path, String, Object, LinkOption...) Files.setAttribute}
 * methods. The following example uses these methods to read and set the attribute:
 *

 {@code
 *     Set<PosixFilePermission> perms = (Set<PosixFilePermission>) Files.getAttribute(entry, "zip:permissions");
 *     if (perms == null) {
 *         perms = PosixFilePermissions.fromString("rw-rw-rw-");
 *         Files.setAttribute(entry, "zip:permissions", perms);
 *     }
 * } 
* *
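For anyone who wants to try the snippet above outside of a Javadoc comment, here is a minimal, self-contained sketch. It assumes a JDK in which the proposed "zip" view is available; the class name `ZipPermissionsDemo`, the method `roundTrip`, and the scratch file and entry names are made up for illustration and are not part of the webrev.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Map;
import java.util.Set;

public class ZipPermissionsDemo {

    // Hypothetical helper: creates a scratch zip file, stores permissions on
    // one entry through the "zip" view, and returns what is read back.
    @SuppressWarnings("unchecked")
    static Set<PosixFilePermission> roundTrip() throws IOException {
        Path dir = Files.createTempDirectory("zipfs-demo");
        Path zip = dir.resolve("demo.zip");
        try (FileSystem fs = FileSystems.newFileSystem(zip, Map.of("create", "true"))) {
            Path entry = fs.getPath("entry.txt");
            Files.writeString(entry, "hello");

            // Files.getAttribute returns Object, so a cast is needed here.
            Set<PosixFilePermission> perms =
                (Set<PosixFilePermission>) Files.getAttribute(entry, "zip:permissions");
            if (perms == null) {
                // Nothing stored for this entry yet, so store a value.
                perms = PosixFilePermissions.fromString("rw-rw-rw-");
                Files.setAttribute(entry, "zip:permissions", perms);
            }
            return (Set<PosixFilePermission>) Files.getAttribute(entry, "zip:permissions");
        } finally {
            // Clean up the scratch files once the file system is closed.
            Files.deleteIfExists(zip);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(PosixFilePermissions.toString(roundTrip()));
    }
}
```

Note that the file system here is opened without "enablePosixFileAttributes"; per the wording above, the "zip" view does not depend on that property.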

 * In addition to the "{@code zip}" view, a Zip file system optionally supports
 * the {@link PosixFileAttributeView POSIX file attribute view} ("{@code posix}").
 * This view extends the "{@code basic}" view with type safe access to the
 * {@link PosixFileAttributes#owner() owner}, {@link PosixFileAttributes#group() group-owner},
 * and {@link PosixFileAttributes#permissions() permissions} attributes. The
 * "{@code posix}" view is only supported when the Zip file system is created with
 * the provider property "{@code enablePosixFileAttributes}" set to "{@code true}".
 * The following creates a file system with this property and reads the access
 * permissions of a file:
 *

 {@code
 *     var env = Map.of("enablePosixFileAttributes", "true");
 *     try (FileSystem fs = FileSystems.newFileSystem(file, env)) {
 *         Path entry = fs.getPath("entry");
 *         Set<PosixFilePermission> perms = Files.getPosixFilePermissions(entry);
 *     }
 * } 
* *
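A runnable variant of the "posix" view example might look like the sketch below. It assumes a JDK that supports the "enablePosixFileAttributes" property described above; `ZipPosixViewDemo`, `writeAndRead`, and the scratch file names are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Map;
import java.util.Set;

public class ZipPosixViewDemo {

    // Hypothetical helper: sets the given permissions on a fresh entry via the
    // "posix" view and returns the permissions read back from that entry.
    static Set<PosixFilePermission> writeAndRead(String perms) throws IOException {
        Path dir = Files.createTempDirectory("zipfs-posix");
        Path zip = dir.resolve("demo.zip");
        Map<String, String> env = Map.of(
            "create", "true",                      // create the zip file if missing
            "enablePosixFileAttributes", "true");  // opt in to the "posix" view
        try (FileSystem fs = FileSystems.newFileSystem(zip, env)) {
            Path entry = fs.getPath("entry.txt");
            Files.writeString(entry, "hello");
            Files.setPosixFilePermissions(entry, PosixFilePermissions.fromString(perms));
            return Files.getPosixFilePermissions(entry);
        } finally {
            // Clean up the scratch files once the file system is closed.
            Files.deleteIfExists(zip);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(PosixFilePermissions.toString(writeAndRead("rwxr-x---")));
    }
}
```

Using `Files.setPosixFilePermissions` and `Files.getPosixFilePermissions` keeps the calling code identical to what one would write against the default file system, which seems to be the point of offering the "posix" view in addition to "zip".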

 * The file owner and group owner attributes are not persisted, meaning they are
 * not stored in the zip file. The "{@code defaultOwner}" and "{@code defaultGroup}"
 * provider properties (listed below) can be used to configure the default values
 * for these attributes. If these properties are not set then the file owner
 * defaults to the owner of the zip file, and the group owner defaults to the
 * zip file's group owner (or the file owner on platforms that don't support a
 * group owner).
 *
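As a rough illustration of the default owner and group behaviour, the sketch below configures both properties and reads them back from an entry. It assumes the properties accept plain String names, which this excerpt does not spell out (the property table is only referred to as "listed below"); `ZipDefaultOwnerDemo`, `ownerAndGroup`, and the names "builduser" and "buildgroup" are invented for the example.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFileAttributes;
import java.util.Map;

public class ZipDefaultOwnerDemo {

    // Hypothetical helper: returns { owner name, group name } as reported by
    // the "posix" view for an entry in a freshly created zip file.
    static String[] ownerAndGroup() throws IOException {
        Path dir = Files.createTempDirectory("zipfs-owner");
        Path zip = dir.resolve("demo.zip");
        Map<String, String> env = Map.of(
            "create", "true",
            "enablePosixFileAttributes", "true",
            "defaultOwner", "builduser",    // assumption: a String name is accepted
            "defaultGroup", "buildgroup");  // assumption: a String name is accepted
        try (FileSystem fs = FileSystems.newFileSystem(zip, env)) {
            Path entry = fs.getPath("entry.txt");
            Files.writeString(entry, "hello");
            PosixFileAttributes attrs = Files.readAttributes(entry, PosixFileAttributes.class);
            return new String[] { attrs.owner().getName(), attrs.group().getName() };
        } finally {
            // Clean up the scratch files once the file system is closed.
            Files.deleteIfExists(zip);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        String[] og = ownerAndGroup();
        System.out.println(og[0] + ":" + og[1]);
    }
}
```

Since owner and group are not persisted, reopening the same zip file without these properties would be expected to report the defaults derived from the zip file itself instead.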

 * The "{@code permissions}" attribute is not optional in the "{@code posix}"
 * view so a default of set of permissions are used for entries that do not have
                     ^^^ of seems out of place
 * access permissions stored in the Zip file. The default set of permissions
 * is {@link PosixFilePermission#OWNER_READ OWNER_READ}, {@link PosixFilePermission#OWNER_WRITE
   is -> are
 * OWNER_WRITE} and {@link PosixFilePermission#GROUP_READ GROUP_READ}.
perhaps consider using a