From ron.pressler at oracle.com Mon May 1 09:57:58 2023
From: ron.pressler at oracle.com (Ron Pressler)
Date: Mon, 1 May 2023 09:57:58 +0000
Subject: JEP draft: Disallow the Dynamic Loading of Agents by Default
In-Reply-To: 
References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com>
Message-ID: 

Hi Volker!

> On 28 Apr 2023, at 16:38, Volker Simonis wrote:
>
> I think it is a little unfortunate to put the usage of s.m.Unsafe and
> JNI/Instrumentation/JVMTI into the same category, especially when it
> comes to blaming developers for their usage. While s.m.Unsafe has
> always been an internal, undocumented and unsupported API, the latter
> three are part of the Java Platform (e.g. "native" is a Java keyword
> and Runtime.loadLibrary() is part of the Java API).

To have integrity by default, these must all become restricted. In fact, not just them. Even the fully-official and brand-new FFM API, that a lot of investment has gone into very recently, must also be restricted.

That these features must be restricted doesn't mean they're wrong or bad. It just means that they're superpowers, and so the user must acknowledge the choice to use them over the loss of integrity. Native libraries are good and integrity is good, but because they're in contradiction, there must be a switch expressing the user's preference like other switches offered by the runtime to select between alternatives.

Unsafe, on the other hand, may become more than just restricted over time. It may gradually be emptied out until it's gone.

>
> Do you really plan to make JNI an optional feature which will have to
> be manually enabled at startup?

Not optional at all, but an important, useful feature that is restricted; JNI's replacement, FFM will be restricted, too (in its use of native libraries). The restriction of FFM is already mentioned in JEP 442. Another JEP addressing JNI will be published soonish.

> What will be the benefit?

Integrity.
The ability to bypass encapsulation when needed is not being taken away, but we need the new ability to establish and enforce invariants. We don't yet have it.

> I understand
> that in an ideal world where you had no user-supplied JNI libraries at
> all, you might be able to perform more/better optimizations. But as
> you'd have to support JNI anyway, wouldn't the maintenance of the
> resulting code become a nightmare. How many "if (JNI) {..} else {..}"
> would we get?

There's no need for such code. Modules that need JNI will use JNI. The application will simply give them permission to do so with --enable-native-access=MODULE-NAME, as it would also do to allow FFM to use native libraries.

> And what would be the benefit of disabling it by default
> for the user except increased "integrity"?

Not disabled, restricted, and integrity is the benefit for the user, e.g. in the form of programs not breaking (or breaking less) when upgrading the JDK. Integrity is required for the platform to continue to evolve while keeping the ecosystem sustainable.

> I.e. do you have some
> concrete examples of planned features X, Y, Z which will only work
> with disabled JNI?

Not disabled, restricted. Like all encapsulation-breaking restricted superpowers, allowing them might have implications on possible Leyden features. For example, if a private method could be accessed from outside a module - whether through deep reflection or JNI - private methods could not be removed at link time.

> Will these features be Java SE features or
> implementation specific OpenJDK-only features?

As with all integrity and strong encapsulation features, all limitations will be part of the platform spec.

We realise that in each individual case there might be good reasons to allow knocking down encapsulation barriers.
But whereas every application and library author rightfully wants to minimise the burden on their particular code, such individual decisions inevitably lead to a tragedy of the commons (as they already have). We must strive to minimise the overall burden integrated over the entire ecosystem *as a whole*. So the platform will have the right defaults for the ecosystem, and every application would be able to relax encapsulation to suit its particular needs.

Most Java programs don't use native libraries, agents (startup or dynamic), or deep reflection. Many do, and these features are very powerful and can be very useful, but with great power comes great responsibility, and that responsibility falls on the *application*. Libraries must not silently impose that responsibility on the application in a way that makes it infeasible to exercise. Moreover, most encapsulation boundaries are never bypassed, but without integrity by default, the platform and its users still can't be certain that code means what it says as long as any fourth-level dependency can decide on its own that any line of code in the program might mean something else.

>
> I don't think it is fair to assume that profilers are the only "valid"
> use case for agents and imply that all other use cases are a mis-use
> of the API.

We are not assuming that at all. Only the use of *dynamically loaded* agents *by libraries* is misuse. Dynamically loaded agents were specifically designed to support serviceability tools, not to allow libraries to circumvent the need to ask the application for permission to break encapsulation.

>
> I don't understand this "Non-Goal"? The Attach API [1] allows to
> dynamically attach to a running JVM and "Once a reference to a virtual
> machine is obtained, the loadAgent, loadAgentLibrary, and
> loadAgentPath methods are used to load agents into target virtual
> machine". So how can you achieve this JEP's goals without
> changing/restricting the Attach API?
> I therefore think this "Non-Goal"
> should be rephrased to explain which parts of the Attach API will be
> changed and moved to the "Goal" section instead.

It says "for monitoring and management purposes." These purposes don't require dynamically loaded agents. They rarely require agents at all, but when they do, they only need agents loaded at startup.

>
> General comments:
>
> - You go into great detail to explain why a human-operated tool is
> "superior" (in the sense of trust and security) to a library and
> "would ideally not be subject to the integrity constraints imposed on
> the application". I can't follow this argument, because both, the
> decision to use a specific tool as well as the decision to rely on a
> library is taken by a human.

A tool is not superior. Only:

1. Most libraries that break encapsulation are not chosen by application authors. They are usually low-level libraries chosen by the authors of the libraries that the application uses, i.e. they're transitive dependencies. I don't think that applications in the JDK 8 timeframe became non-portable as a result of a conscious choice. Moreover, it is practically infeasible to actually know everything the code you use may do even if you want to. So not only do application authors not know what libraries do (especially deep dependencies), but they *can't* feasibly know.

2. You expect a mechanic to tune your car engine but you'd probably be surprised to learn that the little tree air freshener climbs down from the rearview mirror at night and crawls into the engine to make modifications. When an operator uses a serviceability tool, they expect it to open up the box and rummage through internals. That's what servicing often means, in software as in the physical world. They do not expect that of libraries.

> I'd even argue that the decision to
> depend on a specific library which requires the dynamic attach
> mechanism is taken by a more knowledgeable user (i.e. the developer
> himself).
> Of course both, a tool as well as a library can contain
> malicious code, but I don't see a fundamental difference between the
> two.

Malicious code is not a concern at all; we assume all code - whether in tools or libraries - is trusted and benevolent. (Even when looking at the security aspect in the server side ecosystem overall, malicious code amounts to a minuscule portion of the danger, judging by the number of attacks. When it comes to server security, benevolent code poses a much greater risk than malicious code, as the vast majority of security attacks exploit vulnerabilities in benevolent, trusted code. Of course, benevolent code imposes other risks covered in the JEP that are unrelated to security).

Knowledgeable users who want to allow a library to arbitrarily change the meaning of code in the application are free to give it the permission to do so. But too many applications don't even know that a dependency of a dependency of a dependency of theirs does it, and so the permission to do it cannot be the default.

>
> - You may argue that users have to be protected from malicious
> libraries which gain their superpowers by secretly loading agents at
> runtime.

Again, malicious code is largely a non-issue for Java since Applets were removed. Since you brought up malicious code in previous conversations, too, let me repeat that again: Even though there have been some software supply chain attacks on various language ecosystems, malicious code poses a relatively small risk to Java nowadays and it is *not* a major concern (at least for the moment); most risks - including security risks - are due to nice, helpful code.

> But users who don't know and don't care about their library
> dependencies will just as easy and without reflection (pun intended :)
> add the -XX:+EnableDynamicAgentLoading to their command line arguments
> (making this the new, most often used command line option even
> surpassing the usage of --add-opens :)

Adding permissions by "cargo cult"
is, indeed, a problem, but at the very least the command line would still offer an auditable record of the risks taken up by the application. Responsible companies know that in some situations they may be held accountable for their technical decisions and deviations from recommended practices, and will have mechanisms in place to review command-line permissions just as they review code.

As a general rule, while we certainly want to help users do the right thing, we must first give those who want to do the right thing the ability to do so. Without strengthening strong encapsulation, even someone who really wants to know the integrity risks is unable to do so without an infeasible analysis of every line of code in the application and all of its dependencies.

Moreover, because quite a few application authors do want to carefully consider risks, the fact that they need to explicitly accept more risk to use certain libraries would put pressure on libraries to reduce their superpower demands.

>
> - I still can't understand the benefit of "only" changing the default
> behavior for dynamic agent loading. I could understand this if you'd
> do it with a plan to deprecate and completely remove the dynamic agent
> loading capability. But what are the benefits of changing the default
> if you'll have to support the functionality anyway?

The application can choose to knock down encapsulation barriers as it wishes (after all, it can even modify the Java runtime as it controls it), but we want the command line to offer a map of the codebase and its abilities. You get integrity by default, and an auditable record of the encapsulation choices always.

We want tools to have superpowers, and it's even arguably okay for certain libraries to be granted superpowers in certain situations provided that it's done with the application's explicit consent. It's just that the situation where superpowers are given silently and by default has become untenable for the ecosystem as a whole.
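
To make the "auditable record" idea concrete, here is a minimal sketch, assuming one simply wants to list which integrity-relevant options a running JVM was started with. The class name `FlagAudit` and the method `integrityFlags` are hypothetical; the prefixes are the options named in this thread plus `-javaagent:` for agents loaded at startup, and the list is illustrative rather than exhaustive. How a launcher reports an option that was passed as two separate tokens is not handled here.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: reports which "superpower" options appear on the command line.
class FlagAudit {
    // Option prefixes mentioned in this thread, plus -javaagent (startup agents).
    // Illustrative only; a real audit would need a reviewed, complete list.
    private static final List<String> PREFIXES = List.of(
            "--add-opens",
            "--enable-native-access",
            "-XX:+EnableDynamicAgentLoading",
            "-javaagent:");

    /** Returns, in order, the arguments that start with one of the prefixes. */
    static List<String> integrityFlags(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> PREFIXES.stream().anyMatch(arg::startsWith))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // getInputArguments() returns the options passed to the JVM itself,
        // not the arguments passed to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String flag : integrityFlags(jvmArgs)) {
            System.out.println("integrity-relevant option: " + flag);
        }
    }
}
```

Run with no such options, the program prints nothing, which is the "integrity by default" case; each printed line corresponds to a choice the application made explicitly.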
> As mentioned in
> earlier discussions, my main concern with the proposed change is the
> impact it will have on the evolution of Java. Java's dynamic features
> are one of its biggest strengths and a major reason for its success.

That's right, but the Why Now? section covers that in detail. In short, the times - they are a changin', and Java must be a changin' with them. Even putting aside the new requirements and more Java-in-Java, the old situation has become untenable as we saw in the 8 -> 9+ migration. The reason the old way worked - until it didn't - was that for a long while (the 6-8 timeframe) Java was relatively stagnant. However, that relative stagnation didn't just allow the encapsulation free-for-all to work; it's also what made much of it necessary in the first place, to work around shortcomings in the JDK's development. So not only can we not continue with the old regime, but there's not as much need for it anymore.

However useful dynamism is at times, we must have the ability to control it. The faster Java evolves, the more important that control becomes.

The most important thing to remember is that the need for integrity doesn't come from some theoretical desire for architectural cleanliness, but from real user needs. Users want smoother upgrades; they want robust security; they want more features, and they want new kinds of features that reduce startup time. All these things require control over Java's dynamism. Some users may want dynamism, too, but since these desires are in conflict, applications must choose between them, and that's the idea of integrity by default (dynamism by default and integrity by choice can't work because of the structure of the Java ecosystem).

> Sacrificing some of them or making their usage increasingly expensive
> requires a broader discussion in the community and shouldn't happen
> "under the hood" of a discussion about the default setting of a
> command line flag.
First, while requiring an auditable map of the codebase certainly does require some effort, let's not exaggerate it. We're talking about cost that is negligible compared to that of developing software, cost that is imposed only on those who want or need to use relatively advanced superpowered features, and cost that results in something of value: a map of the codebase and its permissions.

Yes, some will be inconvenienced by this, but so are those who cannot easily upgrade JDK versions due to non-portable libraries. You can call it a sacrifice if you wish, but whatever we do or don't do *someone's* convenience will be sacrificed, and this direction reduces rather than increases the overall sacrifice. That's exactly why this is the direction: we want to reduce the overall pain for users and increase their value.

Second, that lengthy discussion about this direction already took place over years when Jigsaw was under development (as did the discussion about agents in particular). What was missing was a summary in JEP form, hence the informational JEP. We'll post the JEP to jdk-dev once we finish writing it, but it describes a path that Java has already been on for several years.

>
> - I don't understand why this JEP has scope "SE". As you rightly
> mentioned, the Attach API is a "non-standard" API which can be changed
> at any time and without affecting the Java SE specification, so this
> JEP should rather have scope "JDK" instead. On the other hand, the
> fact that this functionality is not governed by the SE specification
> will allow different OpenJDK distributors to use a different default
> setting for -XX:EnableDynamicAgentLoading which has the potential to
> cause a lot of confusion if we can't settle on a common strategy.
The Attach API is JDK-specific, but agents are an SE feature (well, JVM TI is optional), and the platform spec will say something along the lines of "if an implementation offers a way to attach an agent to a running JVM instance, that capability must be disabled by default and enabled with an explicit flag".

>
> - If doing this change at all, I think it would be better to do it in
> a non-LTS release first.

LTS is a service governed by Oracle Sales or something like that on the business side of things. OpenJDK has no concept of LTS. Nevertheless, given that for various business reasons more people are likely to be using JDK 21 than other versions, we can take that expected popularity into account. See my reply to Dan.

- Ron

From ron.pressler at oracle.com Mon May 1 11:21:37 2023
From: ron.pressler at oracle.com (Ron Pressler)
Date: Mon, 1 May 2023 11:21:37 +0000
Subject: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default
In-Reply-To: 
References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com>
Message-ID: 

On 28 Apr 2023, at 20:14, Eirik Bjørsnøs wrote:

> Agents are used by profiling tools to instrument Java applications,
> but agents can also be misused to undermine the integrity of the
> Java Platform.

I don't think it is fair to assume that profilers are the only "valid" use case for agents and imply that all other use cases are a mis-use of the API.

First, I don't read the JEP as implying that all non-profiler use cases are misuse.

Having said that, I do think that agents can in fact strengthen the integrity of the platform. Case in point is that when the Java serialization vulnerabilities hit around 2015, I could very quickly (a few hours) whip together the "NotSoSerial" serialization firewall agent [1] to efficiently prevent exploits. I later got word that a large CMS vendor deployed it to their platform which included some of the world's busiest websites.
I don't know if they used the attach mechanism to plug their serialization holes, but they surely could at the time.

With microservices gaining popularity over the years, restarts are probably more common and automated now, including configuration of JVM options. So attaching to long-running instances to prevent restarts is probably becoming less useful over time.

The agent misuse that the JEP is referring to here is perhaps mostly concerning libraries using the attach mechanism to get access they otherwise would not have in a running JVM? Perhaps the JEP could be updated to be more clear on this?

Cheers,
Eirik.

[1] https://github.com/kantega/notsoserial/

Keep in mind two things:

1. Dynamically loaded agents are more limited in their capabilities than agents loaded at startup because redefinition/retransformation is limited to changing the body of existing methods. Redefinition can only fix issues if you're lucky.

2. Java offers no general mechanism to make patches applied through redefinition persistent. They are reverted at the next startup.

Due to these two facts, patching code in production to change its logic (as opposed to benign instrumentation with profiling events) has never been a sanctioned usage of dynamic agents. It's simply not a generally-effective mechanism for that. Tools that offer less restricted dynamic patching (e.g. JRebel) require an agent *loaded at startup*.

- Ron

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From coleenp at openjdk.org Mon May 1 11:49:24 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 May 2023 11:49:24 GMT Subject: RFR: 8306851: Move Method access flags [v5] In-Reply-To: <9mcZrjg-k3wLBxbR3dCguWSBKxZkZJVGtQLsV30bMhI=.e9b2c774-c968-46e4-9b92-3e090edc07d5@github.com> References: <9mcZrjg-k3wLBxbR3dCguWSBKxZkZJVGtQLsV30bMhI=.e9b2c774-c968-46e4-9b92-3e090edc07d5@github.com> Message-ID: <1EOjCTT4zhwfv9O1Nfw3mevCOOc9Uihbhlqh4CAcnxg=.5ea8147b-6107-4b19-9acc-0fa5f7c64881@github.com> On Fri, 28 Apr 2023 19:59:53 GMT, Coleen Phillimore wrote: >> This change moves the flags from AccessFlags to either ConstMethodFlags or MethodFlags, depending on whether they are set at class file parse time, which makes them essentially const, or at runtime, which makes them needing atomic access. >> >> This leaves AccessFlags int size because Klass still has JVM flags that are more work to move, but this change doesn't increase Method size. I didn't remove JVM_RECOGNIZED_METHOD_MODIFIERS with this change since there are several of these in other places, and with this change the code is benign. >> >> Tested with tier1-6, and some manual verification of printing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix constMethod printing. Thanks David, Chris, Doug, Matias and Fred for reviewing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13654#issuecomment-1529606901 From coleenp at openjdk.org Mon May 1 11:49:27 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 May 2023 11:49:27 GMT Subject: Integrated: 8306851: Move Method access flags In-Reply-To: References: Message-ID: On Tue, 25 Apr 2023 19:09:23 GMT, Coleen Phillimore wrote: > This change moves the flags from AccessFlags to either ConstMethodFlags or MethodFlags, depending on whether they are set at class file parse time, which makes them essentially const, or at runtime, which makes them needing atomic access. > > This leaves AccessFlags int size because Klass still has JVM flags that are more work to move, but this change doesn't increase Method size. I didn't remove JVM_RECOGNIZED_METHOD_MODIFIERS with this change since there are several of these in other places, and with this change the code is benign. > > Tested with tier1-6, and some manual verification of printing. This pull request has now been integrated. Changeset: 316d303c Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/316d303c1da550c9589c9be56b65650964e3886b Stats: 781 lines in 27 files changed: 316 ins; 297 del; 168 mod 8306851: Move Method access flags Reviewed-by: cjplummer, dholmes, dnsimon, matsaave, fparain ------------- PR: https://git.openjdk.org/jdk/pull/13654 From coleenp at openjdk.org Mon May 1 13:58:25 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 May 2023 13:58:25 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option In-Reply-To: References: Message-ID: On Thu, 27 Apr 2023 22:17:30 GMT, Daniel D. Daugherty wrote: > A trivial fix to remove broken EnableThreadSMRExtraValidityChecks option. Yes, this looks good and also trivial. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13704#pullrequestreview-1407611225 From rriggs at openjdk.org Mon May 1 15:41:24 2023 From: rriggs at openjdk.org (Roger Riggs) Date: Mon, 1 May 2023 15:41:24 GMT Subject: Integrated: 8304915: Create jdk.internal.util.Architecture enum and apply In-Reply-To: <7m7tWvmLzDchLaIvsJDDT0zrQaT4KaYPkZM87F2qrjs=.94301c48-a73d-4fd4-9cec-64754e574a97@github.com> References: <7m7tWvmLzDchLaIvsJDDT0zrQaT4KaYPkZM87F2qrjs=.94301c48-a73d-4fd4-9cec-64754e574a97@github.com> Message-ID: On Wed, 5 Apr 2023 15:58:08 GMT, Roger Riggs wrote: > Define an internal jdk.internal.util.Architecture enumeration and static methods to replace uses of the system property `os.arch`. > The enumeration values are defined to match those used in the build. > The initial values are: `X64, X86, AARCH64, RISCV64, S390, PPC64` > Note that `amd64` and `x86_64` in the build are represented by `X64`. > The value of the system property `os.arch` is unchanged. > > The API is similar to the jdk.internal.util.OperatingSystem enum created by #[12931](https://git.openjdk.org/jdk/pull/12931). > Uses in `java.base` and a few others are included but other modules will be done in separate PRs. This pull request has now been integrated. 
Changeset: f00a748b Author: Roger Riggs URL: https://git.openjdk.org/jdk/commit/f00a748bc5b708d4f8f277d075859b058f9d575c Stats: 411 lines in 7 files changed: 343 ins; 57 del; 11 mod 8304915: Create jdk.internal.util.Architecture enum and apply Reviewed-by: erikj, mdoerr, amitkumar ------------- PR: https://git.openjdk.org/jdk/pull/13357 From coleenp at openjdk.org Mon May 1 15:52:54 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 May 2023 15:52:54 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v67] In-Reply-To: <3Iabuiks5W03nXCOPejWEQAZMz1GqlvaZUmuvs5Bczs=.b8433f00-9394-437f-a7e1-db407bbba983@github.com> References: <3Iabuiks5W03nXCOPejWEQAZMz1GqlvaZUmuvs5Bczs=.b8433f00-9394-437f-a7e1-db407bbba983@github.com> Message-ID: On Fri, 28 Apr 2023 19:23:24 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). >> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. 
>>
>> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>>
>> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>>
>> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>>
>> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor.
>> However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>>
>> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol.
>> This would be implemented in a separate PR
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>> | | x86_64 | | | | aarch64 | | |
>> | -- | -- | -- | -- | -- | -- | -- | -- |
>> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 164 commits:
>
> - Merge commit '452cb8432f4d45c3dacd4415bc9499ae73f7a17c' into JDK-8291555-v2
> - Fix arm and ppcle builds
> - Merge branch 'master' into JDK-8291555-v2
> - Fix formatting
> - Suggestios by @dcubed-ojdk
> - Suggested changes by @merykitty
> - Remove unnecessary comments
> - Simple build fix for extra arches
> - Merge remote-tracking branch 'upstream/master' into JDK-8291555-v2
> - A few more LM_ prefixes in 32bit code
> - ... and 154 more: https://git.openjdk.org/jdk/compare/452cb843...39b199b6

I had a couple of drive-by comments.
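
The per-thread lock-stack described in the quoted PR text can be modelled roughly as follows. This is a toy sketch in Java under stated assumptions, not the HotSpot implementation: the class `LockStackSketch` and its method names are hypothetical, the capacity of 8 is the figure given in the description, and the header CAS, inflation and GC-root handling are left out entirely. It only shows the three behaviours the description names: the ownership check by scanning a small array, falling back to the slow path on overflow, and falling back on a recursive lock attempt.

```java
// Toy model of a per-thread lock-stack; names are hypothetical, not HotSpot code.
final class LockStackSketch {
    static final int CAPACITY = 8; // fixed size mentioned in the PR description

    private final Object[] slots = new Object[CAPACITY];
    private int top = 0; // number of occupied slots

    /** "Does the current thread own me?": an identity scan of the small array. */
    boolean contains(Object o) {
        for (int i = 0; i < top; i++) {
            if (slots[i] == o) {
                return true;
            }
        }
        return false;
    }

    /**
     * Attempts the fast path. Returns false when the caller would have to take
     * the slow path (inflate to a monitor): either the stack is full, or the
     * object is already on the stack (no recursive fast-locking in this model).
     */
    boolean tryPush(Object o) {
        if (top == CAPACITY || contains(o)) {
            return false;
        }
        slots[top++] = o;
        return true;
    }

    /** Removes o on unlock and nulls the vacated slot so no stale reference remains. */
    boolean remove(Object o) {
        for (int i = top - 1; i >= 0; i--) {
            if (slots[i] == o) {
                System.arraycopy(slots, i + 1, slots, i, top - i - 1);
                slots[--top] = null;
                return true;
            }
        }
        return false;
    }

    int size() {
        return top;
    }
}
```

Because the array is tiny and thread-local, the scan in `contains` needs no synchronization in this model; answering "which thread owns X?" would mean walking every thread's stack, which matches the description's point that such queries belong on slower paths.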
------------- PR Review: https://git.openjdk.org/jdk/pull/10907#pullrequestreview-1407685113 From coleenp at openjdk.org Mon May 1 15:52:55 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 May 2023 15:52:55 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v56] In-Reply-To: References: <2J4SoXF42zWujj5jjDllPGCHVLxuuT44tO-Oiz1PFNI=.a7bfa89d-3f4d-49b8-81ae-cd416cb5d263@github.com> <78es_NBdhW3jSDDYRHU8wcmuV53gwrvd4SB5i6g2HC4=.b93cd4c4-f0ac-44e0-b36a-854ce2f0cfac@github.com> <6vD1PFLLelAVWsCl3YpuPBhd_tuc-xlE3wH _HCp7Lu8=.6b9ed684-f94c-434e-82df-15003ded284d@github.com> Message-ID: <1qR1v6blUYOYHfR5nlceKqwHSIMhIgj6NdgXQgC37Ds=.8cb79274-56f3-4875-bf53-95bb311451d7@github.com> On Wed, 12 Apr 2023 05:26:23 GMT, Stefan Karlsson wrote: >> The old code is "racy but safe - it basically answers the question "what thread held the lock at the time I was asking?" and if we get a stack-addr as the owner at the time we ask, and that stack-address belongs to a given thread t then we report t as the owner. The fact t may have released the lock as soon as we read the stack-addr is immaterial. >> >> The new code may be a different matter however. Now the race involves oops, and potentially stale ones IIUC what Stefan is saying. So now the race is not safe, and potentially may crash. > >> That seems fine to me, as long as we don't crash. But my understanding is that Generational ZGC will crash if it sees a stale oop. Isn't it possible that the racing read sees junk that looks to Generational ZGC like a stale oop? To avoid this, unused slots may need to be set to nullptr even in product builds. But I'm not a GC expert so maybe there's no problem. > > Generational ZGC has verification code in fastdebug builds that try to detect stale oops. However, the current LockStack implementation seems to always clear unused slots when running in debug builds. That minimizes the risk that the verification code would find stale oops in the LockStack. 
> > Regarding release build, given that the LockStack code doesn't dereference any of the contained oops and we don't have oop verification code in release builds, I don't see how ZGC would crash because of this race. Note however that these kinds of races are technically undefined behavior, so I wouldn't be too confident that this code is safe. Can you add a comment and file a CR describing this issue? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1181630364 From Alan.Bateman at oracle.com Mon May 1 15:56:00 2023 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 1 May 2023 16:56:00 +0100 Subject: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com> Message-ID: <94f3c6e4-24cd-cfca-65c9-c18b56508239@oracle.com> On 01/05/2023 10:57, Ron Pressler wrote: > : >> Do you really plan to make JNI an optional feature which will have to >> be manually enabled at startup? > Not optional at all, but an important, useful feature that is restricted; JNI's replacement, FFM, will be restricted, too (in its use of native libraries). The restriction of FFM is already mentioned in JEP 442. Another JEP addressing JNI will be published soonish. Just to add that "Restricted methods" have been in the Java SE spec since Java 19. So far it has just been the restricted methods in the FFM API but it's hard to see how this would be extended to list Runtime.load/loadLibrary at some point. -Alan From coleenp at openjdk.org Mon May 1 15:52:57 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 May 2023 15:52:57 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v62] In-Reply-To: References: Message-ID: On Wed, 26 Apr 2023 16:07:33 GMT, Daniel D.
Daugherty wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unnecessary comments > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 231: > >> 229: >> 230: public JavaThread owningThreadFromMonitor(ObjectMonitor monitor) { >> 231: if (VM.getVM().getCommandLineFlag("LockingMode").getInt() == 2) { > > Please put a comment after that literal '2': > > if (VM.getVM().getCommandLineFlag("LockingMode").getInt() == 2 /* LM_LIGHTWEIGHT */) { You could add the LM_LEGACY, LM_LIGHTWEIGHT literals to vmStructs.cpp and compare with them. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1181662882 From heidinga at redhat.com Mon May 1 16:15:00 2023 From: heidinga at redhat.com (Dan Heidinga) Date: Mon, 1 May 2023 12:15:00 -0400 Subject: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: <1D74CB9A-1040-4992-954E-AAA1430FE4F0@oracle.com> References: <1D74CB9A-1040-4992-954E-AAA1430FE4F0@oracle.com> Message-ID: On Sun, Apr 30, 2023 at 10:19 AM Ron Pressler wrote: > Hi Dan! > > > On 29 Apr 2023, at 03:30, Dan Heidinga wrote: > > > > Hi Ron, > > > > Thanks for writing up the JEP draft outlining the proposal to disallow > dynamic loading of agents by default. The Red Hat Java team has continued > to discuss this proposal internally and with our stakeholders. > > > > While there is a general agreement (or at least acceptance) with the > overall direction of this proposal, we still see the concerns raised by > Andrew [0] as needing to be addressed. > > > > So let's start with the high-order bit: timing. > > > > The JEP draft states it "intends to adopt the proposal [to disable > agents by default] six years later, in 2023, for JDK 21."
We would like to > see this targeted to JDK 22 instead as the change has not been widely > evangelized yet and comes as a surprise to many, both to those actively > developing OpenJDK and to those monitoring its development. > > > > We owe it to the community to give this proposal enough bake time, > especially given that JDK 21 is an LTS release. Though the official > position is that LTS releases are no different than any other release, the > actions of JDK consumers make them special. Users will be surprised by > this change when they migrate from JDK 17 to 21. If we delay till JDK 22, > we give the ecosystem time to adapt to the change before the next LTS. > > > > Additionally, users who have tested with the > -XX:-EnableDynamicAgentLoading option will have false expectations if they > validated their use of jcmd to load the agent as the behaviour was not > correct prior to JDK 21 [1]. > > > > The next concern is one you call out in the draft that "Java's excellent > serviceability has long been a source of pride for the platform." We > agree! > > > > Java currently has an excellent, prime position in Observability > capabilities. For better or for worse, there are many Observability tools > that have relied on dynamic attach to provide the necessary monitoring for > workloads. > > > > It's important we give Java's monitoring tools sufficient time to > migrate comfortably without shocking the ecosystem by changing the default > in an LTS. By delaying the change till JDK 22, we give the ecosystem 2 > years to migrate and to prepare users for the change. > > > > Additionally, this provides the time for Java's profiling tools to adapt > as well. And for the "ad-hoc troubleshooting" tools - like Byteman - to > retrain their users. > > That's a fair point. Even though the change was announced some years ago, > some strong encapsulation features had a transition period where they > emitted warnings before changing defaults.
Since we can reasonably expect > 21 to see relatively high adoption, we could take that opportunity to > educate more users and only emit a warning when an agent is loaded > dynamically (otherwise, since many users unfortunately skip versions, they > would be equally surprised and unprepared at the next version they adopt as > they would be if the default change were made in 21). Would you find that > reasonable? > This "print a warning" approach makes a lot of sense - as you say, it educates users of dynamic agents that action will be required while not impeding the uptake of JDK 21. It also follows the precedent set by the --illegal-access option in JDK 9+. Users who don't want to see the warning in their logs can specify -XX:+EnableDynamicAgentLoading and are then well prepared for JDK 22+. Seems like a win-win approach. > > If so, we may perhaps be able to also emit warnings on JNI use in 21, thus > bringing agents, JNI, and FFM to the same baseline in 21, i.e. they would > all issue warnings unless sanctioned by the application. > I'm still working through the integrity JEP so I'll hold off on responding regarding JNI for now. > > > > > Finally, while it's easy to agree with the principle that "the > application owner is given the final say on whether to allow dynamic > loading of agents", the argument can (and should!) be made that those > application owners have made that final decision by deploying the libraries > and tools that use dynamic attach. A JVM command line argument is not the > only way to express their consent for code to run on their system. > > > > For many production uses, the reality is more complicated than a single > "application owner". Take a Kubernetes cluster for example. > > > > Who is the application owner when deploying to a Kubernetes cluster? > The dev team that develops the application? The operations team that > manages the cluster? The monitoring team that monitors the applications?
> The Support team that troubleshoots issues with the deployment? > > > > We should be careful not to understate or misrepresent the complex web > of "owners" that are currently able (and, for business reasons, need) to > apply agents dynamically. Downplaying the complexity many organizations > experience when dealing with changes to command line options (as an > example) weakens the argument for changing today's status quo. > > Right. In this case, "owner" means any person who has been given the > sufficient OS privileges to attach a dynamic agent (and who then also has > sufficient privileges to stop or start the process). > > Because the ideal is not to disrupt tools at all but rather to prevent > libraries from escalating their powers without the application's knowledge > and consent, we've begun to explore means other than the flag to allow a > tool to load an agent at runtime. Two ideas we've had so far are a > challenge-response mechanism that would verify there's a person in the loop > or issuing certificates to tools that would be used by the VM to verify > that it is an approved tool that's loading an agent (revoking certificates > that find their way to libraries). These mechanisms are, however, complex, > so they (or perhaps some other alternative) may appear only later. > I'm still a little dubious of the distinction between tools and libraries being drawn here. In both cases, a responsible person has chosen to deploy the library or the tool in their environment. There's a human in the loop, albeit at different stages as one decision is made during development and the other during deployment. While I understand the benefits to the runtime in not allowing dynamic attach as J9 operated in that model (or with limited capabilities for dynamically attached agents) for many years, I also saw the frequent requests to enable more dynamic capabilities for such agents from both vendors and users.
It was surprising that users frequently requested it and were so unhappy with attaching at launch, even though that would have resolved their issue. > > > > Dynamically attached agents have been a "superpower" of Java and their > use has greatly benefited the ecosystem as they've helped service and > support use cases that otherwise aren't possible, as they helped propel > Java to the forefront of the Observability tooling, and allowed many other > useful libraries to be developed. > > > > Let's delay flipping the default until JDK 22 to give the breadth of the > ecosystem time to absorb this change. > > Very well. If a warning is acceptable, we can do that in 21 and delay the > default change to 22. > This gets a +1 from me. > > -- Ron > > P.S. > > > We also know that in many cases customers and users may not be in a > position to modify startup scripts (e.g. even to add in an extra parameter) > as to do so may invalidate support contracts, etc. > > Could you expand more on that? Even if the default change happens in 22, > it would not apply retroactively. Upgrading to a new JDK requires changing > startup scripts, as does adding/upgrading libraries, which happens at least > as frequently as upgrading a JDK version. How can a Java application be > developed and deployed without the ability to change the command line? I > can't see how an application is expected to change its runtime version and > yet not be able to change the command line? I mean, I can imagine setups > where that could sometimes *happen* to work, but not a way for this to be > *expected* to work. Certainly since the JRE was removed it's been the > assumption that upgrading a JDK version may require changing the command > line. > Users believed - rightly or wrongly - that some applications restricted the set of options that could be modified when deploying some Java-based applications if they wanted support.
I don't have more specifics here but this concern was raised more than once. Setting the heap size was OK; modifying other -XX options was considered not OK. Historically, there has also been massive resistance to changes to command lines due to the complexity in updating launchers, launch scripts, etc. Applications that provide launcher scripts that must work across different Java releases have struggled with detecting the correct version of Java to set the options. The difficulty with command line options is why things like "-XX:+IgnoreUnrecognizedVMOptions" exist. Since the new option has existed since JDK 9, only those supporting both JDK 8 & a newer release should experience this kind of issue. --Dan > > There are more important mechanisms than loading agents dynamically that > require setting VM options, such as selecting a GC/heap configuration > tailored to the application's particular needs. Even in third-party hosting > situations, the application needs some level of control over the command > line and the host will appreciate more control that allows it to select > what capabilities it offers hosted applications. > > > From alex.buckley at oracle.com Mon May 1 16:44:34 2023 From: alex.buckley at oracle.com (Alex Buckley) Date: Mon, 1 May 2023 09:44:34 -0700 Subject: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: <94f3c6e4-24cd-cfca-65c9-c18b56508239@oracle.com> References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com> <94f3c6e4-24cd-cfca-65c9-c18b56508239@oracle.com> Message-ID: <87746fa3-f6ed-0fea-f19c-38e65d770721@oracle.com> On 5/1/2023 8:56 AM, Alan Bateman wrote: > Just to add that "Restricted methods" have been in the Java SE spec > since Java 19. So far it has just been the restricted methods in the FFM > API but it's hard to see how this would be extended to list > Runtime.load/loadLibrary at some point. Clarification: ...
but it's NOT hard to see ... FYI the section about restricted methods in the Java SE Platform Spec: - https://cr.openjdk.org/~iris/se/19/spec/latest/index.html#Restricted-methods - https://cr.openjdk.org/~iris/se/20/spec/latest/index.html#Restricted-methods Alex From eirbjo at gmail.com Mon May 1 17:04:31 2023 From: eirbjo at gmail.com (Eirik Bjørsnøs) Date: Mon, 1 May 2023 19:04:31 +0200 Subject: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com> Message-ID: > > Keep in mind two things: > > 1. Dynamically loaded agents are more limited in their capabilities than > agents loaded at startup because redefinition/retransformation is limited > to changing the body of existing methods. Redefinition can only fix issues > if you're lucky. > > 2. Java offers no general mechanism to make patches applied through > redefinition persistent. They are reverted at the next startup. > Ron, My concern was more the observation that the Summary of the draft JEP can be misunderstood (it seems to have happened in this thread): Disallow the dynamic loading of agents into a running JVM by default. > Agents are used by profiling tools to instrument Java applications, but > agents can also be misused to undermine the integrity of the Java Platform. I think the second sentence here ("agents can also be misused..") was meant to refer specifically to dynamically loaded agents, not agents in general. To help avoid confusion, this sentence could be updated to clarify that it's the dynamic loading that is misused, not the agent mechanism per se. Something like: "but *dynamically loaded agents* can also be misused to.." I know the leading sentence defines the context, but I think a bit of redundancy here would help reduce misunderstandings and prevent people from thinking "now they're taking away agents too". Thanks, Eirik.
From pchilanomate at openjdk.org Mon May 1 17:19:53 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 1 May 2023 17:19:53 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v5] In-Reply-To: References: Message-ID: <4yGc6aKmFKk8rf3Aqg3EY_ayzU5nCPqgY1ANU5FL2jM=.e6c99211-5a05-455e-8aa6-fed2a52330ea@github.com> On Thu, 27 Apr 2023 04:52:53 GMT, Serguei Spitsyn wrote: >> This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It makes it easier to put this code on the slow path. >> >> Testing: mach5 tiers 1-6 were successful. > > Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'br29' of https://github.com/sspitsyn/jdk into br29 > merge with branch29 > - move code a little bit Hi Serguei, Changes look good to me. Thanks for taking care of the refactoring. Patricio src/hotspot/share/runtime/sharedRuntime.cpp line 639: > 637: JRT_END > 638: > 639: JRT_ENTRY(void, SharedRuntime::notify_jvmti_vthread_start(oopDesc* vt, jboolean dummy, JavaThread* current)) Maybe rename dummy to hide and just assert it is false in this case and true for the vthread_end case? ------------- Marked as reviewed by pchilanomate (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13484#pullrequestreview-1407836173 PR Review Comment: https://git.openjdk.org/jdk/pull/13484#discussion_r1181722432 From Alan.Bateman at oracle.com Mon May 1 18:22:07 2023 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 1 May 2023 19:22:07 +0100 Subject: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: <41BFCE0A-AE21-468B-8CF0-57710034845C@oracle.com> References: <1D74CB9A-1040-4992-954E-AAA1430FE4F0@oracle.com> <41BFCE0A-AE21-468B-8CF0-57710034845C@oracle.com> Message-ID: <3a0460a6-739f-eefb-87b4-01a20542b5e2@oracle.com> On 30/04/2023 23:24, Ron Pressler wrote: > Hi Mike! > >> On 30 Apr 2023, at 19:59, Mike Hearn wrote: >> >>> we've begun to explore means other than the flag to allow a tool to >>> load an agent at runtime >> >> How about restricting access to the jcmd socket? For in-VM code it can >> be blocked at the filesystem implementation level, and for >> sub-processes by using the operating system APIs to determine if the >> other side of the socket is part of the same process tree at connect >> time. This would avoid the need for new UI to re-enable existing jcmd >> functionality, whilst preventing code loaded into the VM from >> connecting back to that same VM. Only truly external tools could >> trigger agent loading, or modules that had been given permission to do >> that. >> > > Determining the process on the other side and/or maintaining the > integrity of the process tree is not easy on all OSes. > Right, it's feasible to get the peer pid on some platforms but you can't rely on the process tree due to re-parenting when a parent terminates. -Alan
From amenkov at openjdk.org Mon May 1 18:26:30 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Mon, 1 May 2023 18:26:30 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: > The fix updates JVMTI FollowReferences implementation to report references from virtual threads: > - unmounted vthreads are detected, their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; > - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; > - common code to handle stack frames is moved into a separate class; > > Threads are reported as: > - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); > - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; > - unmounted vthreads: not reported as heap roots.
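As a reading aid, the "Threads are reported as" rules above can be modelled in a few lines of plain Java. This is only an illustrative sketch under stated assumptions, not HotSpot code: the class `HeapRootModel`, the enum `ThreadKind`, and the method `rootKindFor` are hypothetical names invented here; only the `JVMTI_HEAP_REFERENCE_*` constant names come from the description.

```java
import java.util.Optional;

public class HeapRootModel {
    /** Hypothetical classification of a thread as seen by the heap walker. */
    public enum ThreadKind { PLATFORM, MOUNTED_VTHREAD, UNMOUNTED_VTHREAD }

    /**
     * Returns the name of the JVMTI reference kind used to report the thread
     * as a heap root, or empty if the thread is not reported as a root.
     */
    public static Optional<String> rootKindFor(ThreadKind kind) {
        switch (kind) {
            case PLATFORM:
                // Platform threads are reported as before.
                return Optional.of("JVMTI_HEAP_REFERENCE_THREAD");
            case MOUNTED_VTHREAD:
                // Synthetic reference: treated as a root because the carrier thread is a root.
                return Optional.of("JVMTI_HEAP_REFERENCE_OTHER");
            default:
                // Unmounted virtual threads are not reported as heap roots.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        for (ThreadKind kind : ThreadKind.values()) {
            System.out.println(kind + " -> "
                + rootKindFor(kind).orElse("(not reported as a heap root)"));
        }
    }
}
```

The sketch deliberately says nothing about how stack references of unmounted vthreads are discovered; it only captures the root-reporting table.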
Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: Added "no continuations" test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13254/files - new: https://git.openjdk.org/jdk/pull/13254/files/d149be41..dd3be3b1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=07-08 Stats: 26 lines in 1 file changed: 23 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13254.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254 PR: https://git.openjdk.org/jdk/pull/13254 From cjplummer at openjdk.org Mon May 1 18:30:31 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 1 May 2023 18:30:31 GMT Subject: RFR: 8282384: [LOOM] Need test for ThreadReference.interrupt() on a vthread [v4] In-Reply-To: References: Message-ID: On Fri, 28 Apr 2023 17:45:59 GMT, Chris Plummer wrote: >> Convert this ThreadReference.interrupt() test to support virtual threads. I believe this is the only test for ThreadReference.interrupt() that we have. >> >> Tested by running with and without -Dmain.wrapper=Virtual on all supported platforms. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > update comment Thanks for the reviews Serguei and Leonid! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13696#issuecomment-1530032710 From cjplummer at openjdk.org Mon May 1 18:30:44 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 1 May 2023 18:30:44 GMT Subject: Integrated: 8282384: [LOOM] Need test for ThreadReference.interrupt() on a vthread In-Reply-To: References: Message-ID: On Thu, 27 Apr 2023 17:42:41 GMT, Chris Plummer wrote: > Convert this ThreadReference.interrupt() test to support virtual threads. I believe this is the only test for ThreadReference.interrupt() that we have. 
> > Tested by running with and without -Dmain.wrapper=Virtual on all supported platforms. This pull request has now been integrated. Changeset: ae5f678f Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/ae5f678fbafcd643a5a74447ed718636a53f9e2b Stats: 21 lines in 2 files changed: 3 ins; 2 del; 16 mod 8282384: [LOOM] Need test for ThreadReference.interrupt() on a vthread Reviewed-by: lmesnik, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13696 From greggwon at cox.net Mon May 1 19:35:27 2023 From: greggwon at cox.net (Gregg Wonderly) Date: Mon, 1 May 2023 14:35:27 -0500 Subject: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: <94f3c6e4-24cd-cfca-65c9-c18b56508239@oracle.com> References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com> <94f3c6e4-24cd-cfca-65c9-c18b56508239@oracle.com> Message-ID: > On May 1, 2023, at 10:56 AM, Alan Bateman wrote: > > On 01/05/2023 10:57, Ron Pressler wrote: >> : >>> Do you really plan to make JNI an optional feature which will have to >>> be manually enabled at startup? >> Not optional at all, but an important, useful feature that is restricted; JNI's replacement, FFM, will be restricted, too (in its use of native libraries). The restriction of FFM is already mentioned in JEP 442. Another JEP addressing JNI will be published soonish. > Just to add that "Restricted methods" have been in the Java SE spec since Java 19. So far it has just been the restricted methods in the FFM API but it's hard to see how this would be extended to list Runtime.load/loadLibrary at some point. In many different places that I am aware of, there are still people using serial port connected devices. Because Sun stopped supporting their JNI based access, there have been other versions of such things done. I have packed .dll and .so libraries into jar files, and copied them to "temp"
files and loaded them from there to provide OS independent jars that could provide applications that use serial ports, USB or otherwise connected. If Runtime.load/loadLibrary are limited, a nice way to provide a description of the details should be part of any implementation that prompts the user to accept the use of JNI code. Realistically, I don't know how you do this and provide the users any guarantees of what OS functions are actually going to be used. This is the number one reason for me, that I don't understand why this support was removed instead of being made a platform feature. Runtime.load/loadLibrary are the way that the community supports the needs of their users when the platform doesn't extend to such functionality. Using it as a safety "gateway" feature is a steep path... Gregg Wonderly From ron.pressler at oracle.com Mon May 1 20:14:02 2023 From: ron.pressler at oracle.com (Ron Pressler) Date: Mon, 1 May 2023 20:14:02 +0000 Subject: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: <151F4784-C3CB-4751-A56E-9C0914C9DAA4@kleczek.org> References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com> <151F4784-C3CB-4751-A56E-9C0914C9DAA4@kleczek.org> Message-ID: <6B91AFF4-9983-4ABE-A0E2-8273831867BA@oracle.com> > On 1 May 2023, at 18:08, Michał Kłeczek wrote: > > > I wonder if you are planning to define a formal grammar for all these command line options defining "integrity policies" as it surely looks to me like... > We already have! With the exception of --enable-native-access=M1,M2,M3, the access policy is declared by modules in their module-info.java files, using a grammar that is now part of the Java language.
Flags such as --add-opens, --add-exports, and --patch-module, when used *in production* (as opposed to in whitebox testing, where the configuration should be created automatically by tools), are not a policy but an emergency override of the policy that signifies some technical debt in the code that needs to be resolved. -- Ron From sspitsyn at openjdk.org Mon May 1 21:09:24 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 1 May 2023 21:09:24 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v5] In-Reply-To: References: Message-ID: On Thu, 27 Apr 2023 04:52:53 GMT, Serguei Spitsyn wrote: >> This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It makes it easier to put this code on the slow path. >> >> Testing: mach5 tiers 1-6 were successful. > > Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'br29' of https://github.com/sspitsyn/jdk into br29 > merge with branch29 > - move code a little bit Patricio, thank you a lot for review!
------------- PR Comment: https://git.openjdk.org/jdk/pull/13484#issuecomment-1530272166 From sspitsyn at openjdk.org Mon May 1 21:09:27 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 1 May 2023 21:09:27 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v5] In-Reply-To: <4yGc6aKmFKk8rf3Aqg3EY_ayzU5nCPqgY1ANU5FL2jM=.e6c99211-5a05-455e-8aa6-fed2a52330ea@github.com> References: <4yGc6aKmFKk8rf3Aqg3EY_ayzU5nCPqgY1ANU5FL2jM=.e6c99211-5a05-455e-8aa6-fed2a52330ea@github.com> Message-ID: On Mon, 1 May 2023 17:02:04 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'br29' of https://github.com/sspitsyn/jdk into br29 >> merge with branch29 >> - move code a little bit > > src/hotspot/share/runtime/sharedRuntime.cpp line 639: > >> 637: JRT_END >> 638: >> 639: JRT_ENTRY(void, SharedRuntime::notify_jvmti_vthread_start(oopDesc* vt, jboolean dummy, JavaThread* current)) > > Maybe rename dummy to hide and just assert it is false in this case and true for the vthread_end case? Good suggestion. Thank you. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13484#discussion_r1181893509 From michal at kleczek.org Mon May 1 17:08:04 2023 From: michal at kleczek.org (Michał Kłeczek) Date: Mon, 1 May 2023 19:08:04 +0200 Subject: JEP draft: Disallow the Dynamic Loading of Agents by Default In-Reply-To: References: <168352AB-C660-484F-BAA6-31A2B6F0D0C8@oracle.com> Message-ID: <151F4784-C3CB-4751-A56E-9C0914C9DAA4@kleczek.org> > On 1 May 2023, at 11:57, Ron Pressler wrote: > [...] > > There's no need for such code. Modules that need JNI will use JNI. The application will simply give them permission to do so with --enable-native-access=MODULE-NAME, as it would also do to allow FFM to use native libraries.
I wonder if you are planning to define a formal grammar for all these command line options defining "integrity policies" as it surely looks to me like...

grant MODULE-NAME { AllPermission }
grant MODULE-NAME { OpenModulePermission("module-to-open-name") }

Wouldn't it be better to reconsider JEP 411 and just make running under security manager the default? -- Michal From sspitsyn at openjdk.org Mon May 1 23:42:28 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 1 May 2023 23:42:28 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v6] In-Reply-To: References: Message-ID: > This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It makes it easier to put this code on the slow path. > > Testing: mach5 tiers 1-6 were successful. Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: addressed review comment: add a couple of asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13484/files - new: https://git.openjdk.org/jdk/pull/13484/files/debe49c3..157f33af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13484&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13484&range=04-05 Stats: 6 lines in 2 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13484/head:pull/13484 PR: https://git.openjdk.org/jdk/pull/13484 From lmesnik at openjdk.org Tue May 2 00:57:15 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 2 May 2023 00:57:15 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v6] In-Reply-To: References: Message-ID: On Mon, 1 May 2023 23:42:28 GMT, Serguei Spitsyn wrote: >> This refactoring to separate ThreadStart/ThreadEnd events posting code in the
JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It makes it easier to put this code on the slow path. >> >> Testing: mach5 tiers 1-6 were successful. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > addressed review comment: add a couple of asserts Please update copyrights, at least in symbols-unix. ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13484#pullrequestreview-1408264864 From sspitsyn at openjdk.org Tue May 2 01:07:29 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 01:07:29 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v6] In-Reply-To: References: Message-ID: On Mon, 1 May 2023 23:42:28 GMT, Serguei Spitsyn wrote: >> This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It makes it easier to put this code on the slow path. >> >> Testing: mach5 tiers 1-6 were successful. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > addressed review comment: add a couple of asserts Leonid, thank you a lot for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13484#issuecomment-1530734063 From sspitsyn at openjdk.org Tue May 2 01:09:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 01:09:40 GMT Subject: Withdrawn: 8297286: runtime/vthread tests crashing after JDK-8296324 In-Reply-To: References: Message-ID: On Wed, 23 Nov 2022 00:24:28 GMT, Serguei Spitsyn wrote: > This problem has two sides. > One is that `VirtualThread::run()` caches the value of the field `notifyJvmtiEvents`.
> As a result, the native method `notifyJvmtiUnmountBegin()` was not called after the field `notifyJvmtiEvents` > value had been set to `true` when an agent library was loaded into a running VM. > The fix is to get rid of this caching. > Another is that enabling `notifyJvmtiEvents` notifications needs synchronization. > Otherwise, a VTMS transition start can be missed, which will cause some asserts to fire. > The fix is to use a JvmtiVTMSTransitionDisabler helper for sync. > > Testing: > The originally failing tests now pass: > > runtime/vthread/RedefineClass.java > runtime/vthread/TestObjectAllocationSampleEvent.java > > In progress: > Run the tiers 1-6 to make sure there are no regressions. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/11304 From sspitsyn at openjdk.org Tue May 2 01:23:22 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 01:23:22 GMT Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v7] In-Reply-To: References: Message-ID: > This enhancement adds support of virtual threads to the JVMTI `StopThread` function. > In preview releases before this enhancement the StopThread returned the JVMTI_ERROR_UNSUPPORTED_OPERATION error code for virtual threads. > > The `StopThread` supports sending an asynchronous exception to a virtual thread only if it is current or suspended at mounted state. For instance, a virtual thread can be suspended at a JVMTI event. If the virtual thread is not suspended and is not current then the `JVMTI_ERROR_THREAD_NOT_SUSPENDED` error code is returned. If the virtual thread was suspended at unmounted state then the `JVMTI_ERROR_OPAQUE_FRAME` error code is returned. > > The `StopThread` has the following description for `JVMTI_ERROR_OPAQUE_FRAME` error code: >> The thread is a suspended virtual thread and the implementation >> was unable to throw an asynchronous exception from this frame.
> > A couple of the `serviceability/jvmti/vthread` tests have been updated to adapt to the new `StopThread` behavior. > > The CSR is: https://bugs.openjdk.org/browse/JDK-8306434 > > Testing: > The mach5 tiers 1-6 are in progress. > Preliminary test runs were good in general. > The JDB test `vmTestbase/nsk/jdb/kill/kill001/kill001.java` has been problem-listed and will be fixed by the corresponding debugger enhancement which is going to adapt the JDWP/JDI specs to the new behavior of the JVMTI `StopThread` related to virtual threads. > > Also, two JCK JVMTI tests are failing in tier-6: >> vm/jvmti/StopThread/stop001/stop00103/stop00103.html >> vm/jvmti/StopThread/stop001/stop00103/stop00103a.html > > These two tests will be excluded from the test runs by the JCK team and then adjusted to the new `StopThread` behavior. Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge - install_async_exception: set interrupt status for platform threads only - minor tweak in new test - 1. Address review comments 2.
Clear interrupt bit in the TestTaskThread - corrections for BoundVirtualThread and test typos - addressed review comments on new test - fixed trailing spaces - 8306034: add support of virtual threads to JVMTI StopThread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13546/files - new: https://git.openjdk.org/jdk/pull/13546/files/0113f034..50e615eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=05-06 Stats: 58946 lines in 964 files changed: 40128 ins; 12285 del; 6533 mod Patch: https://git.openjdk.org/jdk/pull/13546.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13546/head:pull/13546 PR: https://git.openjdk.org/jdk/pull/13546 From sspitsyn at openjdk.org Tue May 2 01:53:49 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 01:53:49 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v7] In-Reply-To: References: Message-ID: > This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It is to easier put this code on slow path. > > Testing: mach5 tiers 1-6 were successful. Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: - Merge - minor correction in sharedRuntime.cpp - addressed review comment: add a couple of asserts - Merge branch 'br29' of https://github.com/sspitsyn/jdk into br29 merge with branch29 - Merge branch 'master' into br29 - move code a little bit - do more refactoring including VirtualThread class - Merge - 8304444: Reappearance of NULL in jvmtiThreadState.cpp - 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions ------------- Changes: https://git.openjdk.org/jdk/pull/13484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13484&range=06 Stats: 333 lines in 16 files changed: 184 ins; 71 del; 78 mod Patch: https://git.openjdk.org/jdk/pull/13484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13484/head:pull/13484 PR: https://git.openjdk.org/jdk/pull/13484 From sspitsyn at openjdk.org Tue May 2 02:01:44 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 02:01:44 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v8] In-Reply-To: References: Message-ID: > This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It is to easier put this code on slow path. > > Testing: mach5 tiers 1-6 were successful. 
Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: update copyright comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13484/files - new: https://git.openjdk.org/jdk/pull/13484/files/02b27601..f4227c7a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13484&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13484&range=06-07 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13484/head:pull/13484 PR: https://git.openjdk.org/jdk/pull/13484 From sspitsyn at openjdk.org Tue May 2 02:01:44 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 02:01:44 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v6] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 00:54:07 GMT, Leonid Mesnik wrote: > Please update copyrights, at least in symbols-unix. Done. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13484#issuecomment-1530762860 From sspitsyn at openjdk.org Tue May 2 02:44:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 02:44:46 GMT Subject: Integrated: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions In-Reply-To: References: Message-ID: <4HBGM1LMxLP3QICIjPSjRmgbqAEpAKwbSipcIsun7F0=.80d9c95c-9f81-4a6a-bf25-1731681f7f1e@github.com> On Fri, 14 Apr 2023 22:01:23 GMT, Serguei Spitsyn wrote: > This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It makes it easier to put this code on the slow path. > > Testing: mach5 tiers 1-6 were successful. This pull request has now been integrated.
Changeset: 1227a275 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/1227a275a1c1e82b9a6410843f32534d7e841f54 Stats: 335 lines in 16 files changed: 184 ins; 71 del; 80 mod 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions 8304444: Reappearance of NULL in jvmtiThreadState.cpp Reviewed-by: pchilanomate, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/13484 From sspitsyn at openjdk.org Tue May 2 03:22:20 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 03:22:20 GMT Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8] In-Reply-To: References: Message-ID: > This enhancement adds support of virtual threads to the JVMTI `StopThread` function. > In preview releases before this enhancement the StopThread returned the JVMTI_ERROR_UNSUPPORTED_OPERATION error code for virtual threads. > > The `StopThread` supports sending an asynchronous exception to a virtual thread only if it is current or suspended at mounted state. For instance, a virtual thread can be suspended at a JVMTI event. If the virtual thread is not suspended and is not current then the `JVMTI_ERROR_THREAD_NOT_SUSPENDED` error code is returned. If the virtual thread was suspended at unmounted state then the `JVMTI_ERROR_OPAQUE_FRAME` error code is returned. > > The `StopThread` has the following description for `JVMTI_ERROR_OPAQUE_FRAME` error code: >> The thread is a suspended virtual thread and the implementation >> was unable to throw an asynchronous exception from this frame. > > A couple of the `serviceability/jvmti/vthread` tests have been updated to adapt to the new `StopThread` behavior. > > The CSR is: https://bugs.openjdk.org/browse/JDK-8306434 > > Testing: > The mach5 tiers 1-6 are in progress. > Preliminary test runs were good in general.
> The JDB test `vmTestbase/nsk/jdb/kill/kill001/kill001.java` has been problem-listed and will be fixed by the corresponding debugger enhancement which is going to adopt JDWP/JDI specs to new behavior of the JVMTI `StopThread` related to virtual threads. > > Also, two JCK JVMTI tests are failing in the tier-6 : >> vm/jvmti/StopThread/stop001/stop00103/stop00103.html >> vm/jvmti/StopThread/stop001/stop00103/stop00103a.html > > These two tests will be excluded from the test runs by the JCK team and then adjusted to new `StopThread` behavior. Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: minor tweak of JVMTI_ERROR_OPAQUE_FRAME description ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13546/files - new: https://git.openjdk.org/jdk/pull/13546/files/50e615eb..0ad9a6cc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13546.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13546/head:pull/13546 PR: https://git.openjdk.org/jdk/pull/13546 From sspitsyn at openjdk.org Tue May 2 03:22:21 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 03:22:21 GMT Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v6] In-Reply-To: References: <7fdlC2euVU0tBa91ZqEuLj9QLVNXe5hTT0KnImBaGgw=.e0a45607-2a7b-462c-98b6-16d5982ec495@github.com> <9XF3Y1s-QPZYzNu335PSoVIny_NvhIBEquY4qegGmXk=.e648f206-d58b-49b7-bf58-6360d275394d@github.com> Message-ID: On Fri, 28 Apr 2023 00:50:54 GMT, Serguei Spitsyn wrote: >> We have two suggestions: >>> - "or a function on a thread cannot be performed at the thread's current frame". >>> - "the function cannot be performed on the thread's current frame." >> >> So, we need to pick one. 
The second one looks simpler to me but >> I'm not completely sure that it reflects the full meaning correctly. >> I wonder about a mix of the two suggestions above: >> >>> "the function cannot be performed at the thread's current frame." > > We need to account for the `SetLocalXXX` functions with the `depth` parameter which also return `OPAQUE_FRAME` error code for virtual frames. My concern is if the "current frame" part is fully correct. I've pushed variant from Chris which is a rephrase of what Alan suggested. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1182047387 From alanb at openjdk.org Tue May 2 06:31:26 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 2 May 2023 06:31:26 GMT Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v6] In-Reply-To: References: <7fdlC2euVU0tBa91ZqEuLj9QLVNXe5hTT0KnImBaGgw=.e0a45607-2a7b-462c-98b6-16d5982ec495@github.com> <9XF3Y1s-QPZYzNu335PSoVIny_NvhIBEquY4qegGmXk=.e648f206-d58b-49b7-bf58-6360d275394d@github.com> Message-ID: <1CiuncDd2MDNP-jjJel1tWLwmjgLXjVqCL8aiBVZ4H8=.dab456e2-fba6-4968-8bc8-6e25600bc58c@github.com> On Tue, 2 May 2023 03:17:42 GMT, Serguei Spitsyn wrote: >> We need to account for the `SetLocalXXX` functions with the `depth` parameter which also return `OPAQUE_FRAME` error code for virtual frames. My concern is if the "current frame" part is fully correct. > > I've pushed variant from Chris which is a rephrase of what Alan suggested. I can't help thinking we can do better than "on the thread's current frame" but in the absence of a better suggestion then I think what you have is okay. I think the CSR will need to be edited to sync it up with the wording that has been agreed here. 
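For reference while reading the wording discussion, the `StopThread` rules for virtual threads quoted from the PR description earlier in this thread can be modelled as a small decision function. This is only an illustrative Java sketch, not HotSpot code: the enum `StopThreadOutcome`, the class `StopThreadRules` and its boolean parameters are hypothetical names, and only the JVMTI error code names come from the JVMTI spec and the quoted text. Other error codes that `StopThread` can return (invalid thread, missing capability and so on) are deliberately ignored.

```java
// Hypothetical model of the virtual-thread rules described for JVMTI StopThread.
// Not VM code: names and parameters are invented for illustration.
enum StopThreadOutcome {
    JVMTI_ERROR_NONE,
    JVMTI_ERROR_THREAD_NOT_SUSPENDED,
    JVMTI_ERROR_OPAQUE_FRAME
}

final class StopThreadRules {
    private StopThreadRules() {}

    // Assumes the target is a virtual thread and all other preconditions hold.
    static StopThreadOutcome forVirtualThread(boolean isCurrent,
                                              boolean isSuspended,
                                              boolean isMounted) {
        if (isCurrent) {
            // The current thread can always receive the asynchronous exception.
            return StopThreadOutcome.JVMTI_ERROR_NONE;
        }
        if (!isSuspended) {
            // Neither current nor suspended.
            return StopThreadOutcome.JVMTI_ERROR_THREAD_NOT_SUSPENDED;
        }
        if (!isMounted) {
            // Suspended while unmounted.
            return StopThreadOutcome.JVMTI_ERROR_OPAQUE_FRAME;
        }
        // Suspended while mounted, e.g. at a JVMTI event.
        return StopThreadOutcome.JVMTI_ERROR_NONE;
    }
}
```

For example, `StopThreadRules.forVirtualThread(false, true, false)` models a virtual thread suspended while unmounted and yields `JVMTI_ERROR_OPAQUE_FRAME`.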
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1182132180 From sspitsyn at openjdk.org Tue May 2 06:36:15 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 06:36:15 GMT Subject: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared In-Reply-To: References: Message-ID: On Fri, 28 Apr 2023 12:48:44 GMT, Stefan Johansson wrote: > Hi all, > > Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint when there is nothing that can be cleaned up. > > **Summary** > When transforming/redefining classes a previous version list is linked together in the InstanceKlass. The original class is added to this list if it is still used or shared. The difference between shared and used is not currently noted. This leads to a problem when doing concurrent class unloading, because during that we postpone some potential work to a safepoint (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it is triggered by the ServiceThread if there is work to be done, for example if InstanceKlass::_has_previous_versions is true. > > Since we currently does not differentiate between shared and "in use" we always set _has_previous_versions if anything is on this list. This together with the fact that shared previous versions should never be cleaned out leads to this safepoint being triggered after every concurrent class unloading even though there is nothing that can be cleaned out. > > This can be avoided by making sure the _previous_versions list is only cleaned when there are non-shared classes on it. This change renames `_has_previous_versions` to `_clean_previous_versions` and only updates it if we have non-shared classes on the list. > > **Testing** > * A lot of manual testing verifying that we do get the safepoint when we should. > * Added new test to verify expected behavior by parsing the logs. 
The test uses JFR to trigger redefinition of some shared classes (when -Xshare:on). > * Mach5 run of new test and tier 1-3 src/hotspot/share/oops/instanceKlass.hpp line 718: > 716: > 717: private: > 718: static bool _clean_previous_versions; Nit: I'd suggest to name it as `_should_clean_previous_versions`. Then the corresponding function needs to be named as `should_clean_previous_versions()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13716#discussion_r1182136483 From sspitsyn at openjdk.org Tue May 2 06:58:15 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 06:58:15 GMT Subject: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared In-Reply-To: References: Message-ID: On Fri, 28 Apr 2023 12:48:44 GMT, Stefan Johansson wrote: > Hi all, > > Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint when there is nothing that can be cleaned up. > > **Summary** > When transforming/redefining classes a previous version list is linked together in the InstanceKlass. The original class is added to this list if it is still used or shared. The difference between shared and used is not currently noted. This leads to a problem when doing concurrent class unloading, because during that we postpone some potential work to a safepoint (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it is triggered by the ServiceThread if there is work to be done, for example if InstanceKlass::_has_previous_versions is true. > > Since we currently does not differentiate between shared and "in use" we always set _has_previous_versions if anything is on this list. This together with the fact that shared previous versions should never be cleaned out leads to this safepoint being triggered after every concurrent class unloading even though there is nothing that can be cleaned out. 
> > This can be avoided by making sure the _previous_versions list is only cleaned when there are non-shared classes on it. This change renames `_has_previous_versions` to `_clean_previous_versions` and only updates it if we have non-shared classes on the list. > > **Testing** > * A lot of manual testing verifying that we do get the safepoint when we should. > * Added new test to verify expected behavior by parsing the logs. The test uses JFR to trigger redefinition of some shared classes (when -Xshare:on). > * Mach5 run of new test and tier 1-3 Thank you for taking care of it. I've posted a couple of comments but it looks good anyway. Thanks, Serguei test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/RedefineSharedClassJFR.java line 94: > 92: .shouldNotContain("scratch class added; one of its methods is on_stack.") > 93: .shouldHaveExitValue(0); > 94: return; The fragments 61-74 and 79-93 have a big common part which can be a good base for a refactoring. But it may not be worth it. So, I leave it up to you. ------------- PR Review: https://git.openjdk.org/jdk/pull/13716#pullrequestreview-1408498631 PR Review Comment: https://git.openjdk.org/jdk/pull/13716#discussion_r1182156381 From sspitsyn at openjdk.org Tue May 2 07:05:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 07:05:18 GMT Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v6] In-Reply-To: <1CiuncDd2MDNP-jjJel1tWLwmjgLXjVqCL8aiBVZ4H8=.dab456e2-fba6-4968-8bc8-6e25600bc58c@github.com> References: <7fdlC2euVU0tBa91ZqEuLj9QLVNXe5hTT0KnImBaGgw=.e0a45607-2a7b-462c-98b6-16d5982ec495@github.com> <9XF3Y1s-QPZYzNu335PSoVIny_NvhIBEquY4qegGmXk=.e648f206-d58b-49b7-bf58-6360d275394d@github.com> <1CiuncDd2MDNP-jjJel1tWLwmjgLXjVqCL8aiBVZ4H8=.dab456e2-fba6-4968-8bc8-6e25600bc58c@github.com> Message-ID: On Tue, 2 May 2023 06:27:04 GMT, Alan Bateman wrote: >> I've pushed variant from Chris which is a rephrase of what Alan suggested.
> > I can't help thinking we can do better than "on the thread's current frame" but in the absence of a better suggestion then I think what you have is okay. I think the CSR will need to be edited to sync it up with the wording that has been agreed here. Thank you, Alan. Updated the CSR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1182163356 From dholmes at openjdk.org Tue May 2 07:36:17 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 May 2023 07:36:17 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: Message-ID: On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote: >> The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests Not sure about keeping the "finalize" terminology - though perhaps with more extensive commenting in the key classes it may be okay. test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 48: > 46: * It is implemented in FinalizableObject. > 47: */ > 48: public void registerCleanup(); Can you not implement this as a default method? test/hotspot/jtreg/vmTestbase/nsk/share/FinalizableObject.java line 39: > 37: > 38: /** > 39: * This method will be invoked by Finalizer when virtual mashine You copied the `mashine` typo here. test/hotspot/jtreg/vmTestbase/nsk/share/FinalizableObject.java line 49: > 47: public void cleanup() {} > 48: /** > 49: * This method will be invoked by Finalizer when virtual mashine Another `mashine` typo. 
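The default-method question above could look roughly like the following. This is a hedged Java sketch, not the code under review: `CleanupRegistrar`, `cleanupAction()` and `TempResource` are hypothetical names, and it assumes the cleanup is driven by `java.lang.ref.Cleaner`, as the PR description says. One detail worth showing is that the action handed to a `Cleaner` must not capture the registered object itself, otherwise that object never becomes phantom reachable and the action never runs automatically.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface: registration is a default method, so implementors
// only supply the cleanup action.
interface CleanupRegistrar {
    // Interface fields are implicitly public static final: one shared Cleaner.
    Cleaner CLEANER = Cleaner.create();

    // Must return an action that does NOT reference `this`.
    Runnable cleanupAction();

    // Registers this object with the shared Cleaner and returns the handle,
    // which also allows explicit, at-most-once cleanup via clean().
    default Cleaner.Cleanable registerCleanup() {
        return CLEANER.register(this, cleanupAction());
    }
}

// Hypothetical implementor that counts how often its cleanup ran.
final class TempResource implements CleanupRegistrar {
    private final AtomicInteger releases;

    TempResource(AtomicInteger releases) {
        this.releases = releases;
    }

    @Override
    public Runnable cleanupAction() {
        // Copy the field to a local so the method reference captures only the
        // counter, not the TempResource instance.
        AtomicInteger counter = releases;
        return counter::incrementAndGet;
    }
}
```

Calling `clean()` on the returned `Cleanable` runs the action immediately; `Cleaner` guarantees the action runs at most once, whether it is triggered explicitly or after the object becomes unreachable.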
------------- PR Review: https://git.openjdk.org/jdk/pull/13420#pullrequestreview-1408541140 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1182188265 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1182182712 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1182183297 From dholmes at openjdk.org Tue May 2 08:08:16 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 May 2023 08:08:16 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option In-Reply-To: References: Message-ID: On Thu, 27 Apr 2023 22:17:30 GMT, Daniel D. Daugherty wrote: > A trivial fix to remove broken EnableThreadSMRExtraValidityChecks option. One minor nit but otherwise good. Thanks. src/hotspot/share/runtime/threadSMR.cpp line 828: > 826: if (java_thread != JavaThread::current()) { > 827: // java_thread is not the current JavaThread so we have to verify it > 828: // against the ThreadsList: Colon at the end of the comment seems odd ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13704#pullrequestreview-1408596858 PR Review Comment: https://git.openjdk.org/jdk/pull/13704#discussion_r1182218848 From sspitsyn at openjdk.org Tue May 2 08:31:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 08:31:18 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: <0cQDniTsL610poR0coQ_ilDBCGcwp-LTMYdzESTd6FI=.f4f52d76-8413-44c9-a1ec-3b1ed8da34df@github.com> On Mon, 1 May 2023 18:26:30 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Added "no continuations" test case src/hotspot/share/prims/jvmtiTagMap.cpp line 2796: > 2794: if (!java_thread->has_last_Java_frame()) { > 2795: // this may be only platform thread > 2796: assert(mounted_vt == nullptr, "must be"); I'm not sure this assert is right. I think, a virtual thread may have an empty stack observable from a VM_op, for instance when it is in a process of being terminated. 
Though, it is not that easy to make this assert fired with a test case and prove this can happen. Another danger is that a virtual thread can be observed from a VM_op as in a VTMS (mount/unmount) transition. I need to think a little bit about possible consequences. Is it better to treat current thread identity as of a carrier thread in such a case? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1182242390 From sspitsyn at openjdk.org Tue May 2 09:38:17 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 09:38:17 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option In-Reply-To: References: Message-ID: On Thu, 27 Apr 2023 22:17:30 GMT, Daniel D. Daugherty wrote: > A trivial fix to remove broken EnableThreadSMRExtraValidityChecks option. Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13704#pullrequestreview-1408752268 From sspitsyn at openjdk.org Tue May 2 09:43:19 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 09:43:19 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Mon, 1 May 2023 18:26:30 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads 
(synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Added "no continuations" test case test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/VThreadStackRefTest.java line 208: > 206: > 207: private static void verifyVthreadMounted(Thread t, boolean expectedMounted) { > 208: // Hucky, but simple. Nit: Hucky => Hacky ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1182325468 From sspitsyn at openjdk.org Tue May 2 09:50:26 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 09:50:26 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Mon, 1 May 2023 18:26:30 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. 
> > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Added "no continuations" test case test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/VThreadStackRefTest.java line 38: > 36: * @test id=no-vmcontinuations > 37: * @requires vm.jvmti > 38: * @enablePreview We do not @enablePreview at lines 28 and 38 anymore. test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/VThreadStackRefTest.java line 41: > 39: * @run main/othervm/native > 40: * -XX:+UnlockExperimentalVMOptions -XX:-VMContinuations > 41: * -Djdk.virtualThreadScheduler.parallelism=1 Why do we need the line 41 in this case? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1182331454 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1182328988 From sspitsyn at openjdk.org Tue May 2 10:13:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 10:13:18 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Mon, 1 May 2023 18:26:30 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - 
unmounted vthreads: not reported as heap roots. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Added "no continuations" test case src/hotspot/share/prims/jvmtiTagMap.cpp line 2245: > 2243: bool is_top_frame; > 2244: int depth; > 2245: frame* last_entry_frame; The field names of a helper class are usually started with '_' symbol. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1182355013 From sspitsyn at openjdk.org Tue May 2 10:23:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 10:23:18 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: <9YaXP7ZdK8KKjcm6sLlatsammtgIlNG9shPhhp2UQ3Y=.f990013e-9a2d-4383-8364-02260791469e@github.com> On Mon, 1 May 2023 18:26:30 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. 
>
> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:
>
>   Added "no continuations" test case

src/hotspot/share/prims/jvmtiTagMap.cpp line 2319:

> 2317: }
> 2318: }
> 2319: }

The fragments 2289-2303 and 2305-2319 are based on the `StackValueCollection` and look very similar. It may be worth refactoring these fragments into two calls of a shared function:

    bool report_stack_value_collection(jmethodID method, int idx_base,
                                       StackValueCollection* elems, jlocation bci) {
      for (int index = 0; index < elems->size(); index++) {
        if (elems->at(index)->type() == T_OBJECT) {
          oop obj = elems->obj_at(index)();
          if (obj == nullptr) {
            continue;
          }
          // stack reference
          if (!CallbackInvoker::report_stack_ref_root(thread_tag, tid, depth, method,
                                                      bci, idx_base + index, obj)) {
            return false;
          }
        }
      }
      return true; // ???
    }
    . . . . .
    jlocation bci = (jlocation)jvf->bci();
    StackValueCollection* locals = jvf->locals();
    if (!report_stack_value_collection(method, 0 /* idx_base */, locals, bci)) {
      return false;
    }
    StackValueCollection* exprs = jvf->expressions();
    if (!report_stack_value_collection(method, locals->size(), exprs, bci)) {
      return false;
    }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1182363174

From rkennke at openjdk.org Tue May 2 12:40:17 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 12:40:17 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v68]
In-Reply-To:
References:
Message-ID:

> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding overloading the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity.)
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are performed only outside performance-critical paths (usually in code like JVMTI or deadlock detection), where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time).
We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>
> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc. as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now always be done in the fast path, if the hashcode has been installed. Fast-locked headers can be used directly; for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR.
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
> #### Renaissance
>
> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 change | aarch64 stack-locking | aarch64 fast-locking | aarch64 change |
> | -- | -- | -- | -- | -- | -- | -- |
> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Address @coleenp's review

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/39b199b6..a3e41c41

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=67
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=66-67

  Stats: 14 lines in 3 files changed: 12 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From rkennke at openjdk.org Tue May 2 12:43:43 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 12:43:43 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v69]
In-Reply-To:
References:
Message-ID:

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Fix copyright on new files

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/a3e41c41..9b25681f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=68
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=67-68

  Stats: 6 lines in 3 files changed: 3 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From ron.pressler at oracle.com Tue May 2 14:14:26 2023
From: ron.pressler at oracle.com (Ron Pressler)
Date: Tue, 2 May 2023 14:14:26 +0000
Subject: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default
In-Reply-To:
References: <1D74CB9A-1040-4992-954E-AAA1430FE4F0@oracle.com>
Message-ID:

On 1 May 2023, at 17:15, Dan Heidinga wrote:

> This "print a warning" approach makes a lot of sense - as you say, it educates users of dynamic agents that action will be required while not impeding the uptake of JDK 21. It also follows the precedent set by the --illegal-access option in JDK 9+. Users who don't want to see the warning in their logs can specify -XX:+EnableDynamicAgentLoading and are then well prepared for JDK 22+. Seems like a win-win approach.

Great. I've amended the JEP to propose a warning rather than an error: https://openjdk.org/jeps/8306275

> I'm still a little dubious of the distinction between tools and libraries being drawn here. In both cases, a responsible person has chosen to deploy the library or the tool in their environment. There's a human in the loop, albeit at different stages as one decision is made during development and the other during deployment.

What we meant in the JEP is that a library operates as part of the application and its functionality is part of the application's functionality (and so a library can make the application non-portable), while the kind of deep troubleshooting that requires the loading of an agent at runtime requires an operator to trigger its functionality. But another distinction would be one of expectation. People expect a deep troubleshooting tool to peer into and perhaps even rummage arbitrarily in the deep internals of the runtime, but they don't expect that of most libraries. For example, people were surprised when their applications broke on JDK upgrades even though the non-portable libraries they used were non-portable by design.

> While I understand the benefits to the runtime of not allowing dynamic attach, as J9 operated in that model (or with limited capabilities for dynamically attached agents) for many years, I also saw frequent requests to enable more dynamic capabilities for such agents from both vendors and users. The fact that users were frequently requesting it - even though attaching at launch would have resolved their issue - was surprising, given how unhappy they were with that solution even though it resolved the issue.

Users may demand arbitrary dynamism and users may demand integrity. But because the two are in inherent conflict, and any choice taken on their behalf will make some users unhappy, users must be able to choose between the conflicting options. Prior to strong encapsulation, the effects of a library on integrity were unknowable, so users did not have that choice.

> Users believed - rightly or wrongly - that some applications restricted the set of options that could be modified when deploying some Java-based applications if they wanted support. I don't have more specifics here but this concern was raised more than once. Setting the heap size was OK, modifying other -XX options was considered not OK.

I've heard such stories, too, but an actual report of a difficulty remains elusive. Until we get reports of actual problems that people have encountered, it's hard to see how a Java application with sophisticated and relatively unusual needs could be deployed and maintained without control over the command line. Java has always required -XX flags for an application to specify its specialised GC needs; I don't see how requiring such a flag for even more specialised tooling needs could be onerous (quotidian tool use for debugging, monitoring, control, APM, or most in-production profiling does not require the loading of agents at runtime).

- Ron

From coleenp at openjdk.org Tue May 2 15:44:18 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 2 May 2023 15:44:18 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References:
Message-ID: <1mWZqnMxg3gZ4rPnhtAtreL91yv7xwIq9lptf0I0k0I=.196bdd2d-ac63-408c-89c8-cb7b24006a1e@github.com>

On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote:

>> The `finalize()` method is removed from base classes/interfaces and is replaced by a Cleaner callback.
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
>   8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests

This looks good to me once the "mashine" typo is corrected. We should file another RFE to rename these classes and methods and fix all the comments, many of which contain typos, and let hotspot/srv take care of that.

-------------

Marked as reviewed by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13420#pullrequestreview-1409376718

From coleenp at openjdk.org Tue May 2 15:44:20 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 2 May 2023 15:44:20 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 07:32:18 GMT, David Holmes wrote:

>> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests
>
> test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 48:
>
>> 46:      * It is implemented in FinalizableObject.
>> 47:      */
>> 48:     public void registerCleanup();
>
> Can you not implement this as a default method?

It seems much better to force implementors to provide this method.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1182724791

From tschatzl at openjdk.org Tue May 2 15:53:17 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 2 May 2023 15:53:17 GMT
Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v5]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> please review this change that removes the pinned tag from `HeapRegion`.
>
> The "pinned" tag for G1 heap regions indicates that the region should not move during (young) GC. This applies to the now-removed archive regions and to humongous objects/regions.
>
> With "real" G1 region pinning upcoming, to deal with the GCLocker in G1 once and for all, we need a refcount; a single bit is not sufficient anymore. Further, there will be a naming conflict, as this kind of "pinning" is different from G1 region pinning. The former indicates "contents cannot be moved, but can be reclaimed", while the latter means "contents can neither be moved nor reclaimed".
>
> The (current) pinned flag is surprisingly little used, only for policy decisions.
>
> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future G1 region pinning (which needs to store the pinning attribute differently, as a refcount, anyway).
>
> Testing: tier1-3, gha
>
> Thanks,
> Thomas

Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:

 - Merge branch 'master' into 8306836-remove-pinned-tag
 - remove is_young_gc_movable in full gc code
 - cplummer review
 - ayang review
 - Fix hsdb
 - compilation fixes
 - Initial implementation

-------------

Changes: https://git.openjdk.org/jdk/pull/13643/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13643&range=04
  Stats: 69 lines in 20 files changed: 12 ins; 30 del; 27 mod
  Patch: https://git.openjdk.org/jdk/pull/13643.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13643/head:pull/13643

PR: https://git.openjdk.org/jdk/pull/13643

From tschatzl at openjdk.org Tue May 2 16:47:06 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 2 May 2023 16:47:06 GMT
Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6]
In-Reply-To:
References:
Message-ID:

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  Remove is_young_gc_movable

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13643/files
  - new: https://git.openjdk.org/jdk/pull/13643/files/3577054b..3516e982

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13643&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13643&range=04-05

  Stats: 17 lines in 6 files changed: 1 ins; 9 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/13643.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13643/head:pull/13643

PR: https://git.openjdk.org/jdk/pull/13643

From tschatzl at openjdk.org Tue May 2 16:47:36 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 2 May 2023 16:47:36 GMT
Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v5]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 15:53:17 GMT, Thomas Schatzl wrote:

>> Hi all,
>>
>> please review this change that removes the pinned tag from `HeapRegion`.
>>
>> The "pinned" tag for G1 heap regions indicates that the region should not move during (young) GC. This applies to the now-removed archive regions and to humongous objects/regions.
>>
>> With "real" G1 region pinning upcoming, to deal with the GCLocker in G1 once and for all, we need a refcount; a single bit is not sufficient anymore. Further, there will be a naming conflict, as this kind of "pinning" is different from G1 region pinning. The former indicates "contents cannot be moved, but can be reclaimed", while the latter means "contents can neither be moved nor reclaimed".
>>
>> The (current) pinned flag is surprisingly little used, only for policy decisions.
>>
>> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future G1 region pinning (which needs to store the pinning attribute differently, as a refcount, anyway).
>>
>> Testing: tier1-3, gha
>>
>> Thanks,
>> Thomas
>
> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>
>  - Merge branch 'master' into 8306836-remove-pinned-tag
>  - remove is_young_gc_movable in full gc code
>  - cplummer review
>  - ayang review
>  - Fix hsdb
>  - compilation fixes
>  - Initial implementation

I removed the `young_gc_is_movable()` predicate; it is probably the wrong time to introduce more abstract concepts like this in this change. I also deferred the refactoring of `G1CollectionSetChooser::should_add()` and its caller to a later time; it's not relevant to this change either.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13643#issuecomment-1531806113

From kevinw at openjdk.org Tue May 2 18:05:11 2023
From: kevinw at openjdk.org (Kevin Walls)
Date: Tue, 2 May 2023 18:05:11 GMT
Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl
Message-ID: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com>

Removal of the class; it looks like it was missed in the JDK 9 removal of RMI-IIOP. This class is not referenced by other classes or tests.
------------- Commit messages: - 8307244: Remove redundant class RMIIIOPServerImpl Changes: https://git.openjdk.org/jdk/pull/13758/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13758&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307244 Stats: 102 lines in 2 files changed: 0 ins; 101 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13758.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13758/head:pull/13758 PR: https://git.openjdk.org/jdk/pull/13758 From rkennke at openjdk.org Tue May 2 18:38:11 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 2 May 2023 18:38:11 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v70] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). > > What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. > > This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. 
This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are only performed in paths that are not very performance-critical (usually in code like JVMTI or deadlock detection), where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER.
When the thread that currently holds the lock arrives at monitorexit and observes ANONYMOUS_OWNER, it knows the owner must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>
> As an alternative, I considered removing stack-locking altogether and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc. as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly; for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol.
This would be implemented in a separate PR
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
>
> #### Renaissance
>
>   | x86_64 |   |   |   | aarch64 |   |  
> -- | -- | -- | -- | -- | -- | -- | --
>   | stack-locking | fast-locking |   |   | stack-locking | fast-locking |  
> AkkaUct | 841.884 | 836.948 | 0.59% |   | 1475.774 | 1465.647 | 0.69%
> Reactors | 11041.427 | 11181.451 | -1.25% |   | 11381.751 | 11521.318 | -1.21%
> Als | 1367.183 | 1359.358 | 0.58% |   | 1678.103 | 1688.067 | -0.59%
> ChiSquare | 577.021 | 577.398 | -0.07% |   | 986.619 | 988.063 | -0.15%
> GaussMix | 817.459 | 819.073 | -0.20% |   | 1154.293 | 1155.522 | -0.11%
> LogRegression | 598.343 | 603.371 | -0.83% |   | 638.052 | 644.306 | -0.97%
> MovieLens | 8248.116 | 8314.576 | -0.80% |   | 7569.219 | 7646.828 | -1.01%
> NaiveBayes | 587.607 | 581.608 | 1.03% |   | 541.583 | 550.059 | -1.54%
> PageRank | 3260.553 | 3263.472 | -0.09% |   | 4376.405 | 4381.101 | -0.11%
> FjKmeans | 979.978 | 976.122 | 0.40% |   | 774.312 | 771.235 | 0.40%
> FutureGenetic | 2187.369 | 2183.271 | 0.19% |   | 2685.722 | 2689.056 | -0.12%
> ParMnemonics | 2434.551 | 2468.763 | -1.39% |   | 4278.225 | 4263.863 | 0.34%
> Scrabble | 111.882 | 111.768 | 0.10% |   | 151.796 | 153.959 | -1.40%
> RxScrabble | 210.252 | 211.38 | -0.53% |   | 310.116 | 315.594 | -1.74%
> Dotty | 750.415 | 752.658 | -0.30% |   | 1033.636 | 1036.168 | -0.24%
> ScalaDoku | 3072.05 | 3051.2 | 0.68% |   | 3711.506 | 3690.04 | 0.58%
> ScalaKmeans | 211.427 | 209.957 | 0.70% |   | 264.38 | 265.788 | -0.53%
> ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% |   | 1088.182 | 1092.266 | -0.37%
> Philosophers | 6450.124 | 6565.705 | -1.76% |   | 12017.964 | 11902.559 | 0.97%
> FinagleChirper | 3953.623 | 3972.647 | -0.48% |   | 4750.751 | 4769.274 | -0.39%
> FinagleHttp | 3970.526 | 4005.341 | -0.87% |   | 5294.125 | 5296.224 | -0.04%

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Add missing new file

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/9b25681f..423dbcdb

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=69
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=68-69

  Stats: 60 lines in 1 file changed: 60 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From sspitsyn at openjdk.org Tue May 2 18:46:15 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 2 May 2023 18:46:15 GMT
Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl
In-Reply-To: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com>
References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com>
Message-ID:

On Tue, 2 May 2023 17:57:14 GMT, Kevin Walls wrote:

> Removal of class, looks like it was missed in the JDK9 removal of RMIIIOP.
> This class is not referenced by other classes or tests.

As I see, this transport was deprecated for some time.
But it is not clear in what release.
The fix looks good to me.

Thanks,
Serguei

-------------

Marked as reviewed by sspitsyn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13758#pullrequestreview-1409689868

From cjplummer at openjdk.org Tue May 2 19:00:30 2023
From: cjplummer at openjdk.org (Chris Plummer)
Date: Tue, 2 May 2023 19:00:30 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 03:22:20 GMT, Serguei Spitsyn wrote:

>> This enhancement adds support of virtual threads to the JVMTI `StopThread` function.
>> In preview releases before this enhancement the StopThread returned the JVMTI_ERROR_UNSUPPORTED_OPERATION error code for virtual threads.
>>
>> The `StopThread` supports sending an asynchronous exception to a virtual thread only if it is current or suspended in a mounted state. For instance, a virtual thread can be suspended at a JVMTI event. If the virtual thread is not suspended and is not current then the `JVMTI_ERROR_THREAD_NOT_SUSPENDED` error code is returned. If the virtual thread was suspended in an unmounted state then the `JVMTI_ERROR_OPAQUE_FRAME` error code is returned.
>>
>> The `StopThread` has the following description for `JVMTI_ERROR_OPAQUE_FRAME` error code:
>>> The thread is a suspended virtual thread and the implementation
>>> was unable to throw an asynchronous exception from this frame.
>>
>> A couple of the `serviceability/jvmti/vthread` tests have been updated to adapt to the new `StopThread` behavior.
>>
>> The CSR is: https://bugs.openjdk.org/browse/JDK-8306434
>>
>> Testing:
>> The mach5 tiers 1-6 are in progress.
>> Preliminary test runs were good in general.
>> The JDB test `vmTestbase/nsk/jdb/kill/kill001/kill001.java` has been problem-listed and will be fixed by the corresponding debugger enhancement which is going to adapt the JDWP/JDI specs to the new behavior of the JVMTI `StopThread` related to virtual threads.
>>
>> Also, two JCK JVMTI tests are failing in tier-6:
>>> vm/jvmti/StopThread/stop001/stop00103/stop00103.html
>>> vm/jvmti/StopThread/stop001/stop00103/stop00103a.html
>>
>> These two tests will be excluded from the test runs by the JCK team and then adjusted to new `StopThread` behavior.
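
The rules quoted above for `StopThread` on a virtual thread can be read as a small decision function. What follows is only an illustrative Java model of that description, not the HotSpot implementation: the class `StopThreadModel`, the enum `Outcome`, the method `forVirtualThread` and its three boolean parameters are hypothetical names chosen for this sketch. Only the two error conditions come from the quoted description.

```java
/** Illustrative model of the quoted StopThread rules; not HotSpot code. */
final class StopThreadModel {

    /** Hypothetical stand-ins for the JVMTI result codes named in the description. */
    enum Outcome { NONE, THREAD_NOT_SUSPENDED, OPAQUE_FRAME }

    /**
     * Decides the outcome for a virtual thread target, following the quoted text:
     * the target must be the current thread, or suspended while mounted.
     */
    static Outcome forVirtualThread(boolean isCurrent, boolean isSuspended, boolean isMounted) {
        if (isCurrent) {
            return Outcome.NONE;                 // the current thread can always be targeted
        }
        if (!isSuspended) {
            return Outcome.THREAD_NOT_SUSPENDED; // not current and not suspended
        }
        if (!isMounted) {
            return Outcome.OPAQUE_FRAME;         // suspended, but in an unmounted state
        }
        return Outcome.NONE;                     // suspended while mounted
    }

    public static void main(String[] args) {
        System.out.println(forVirtualThread(false, true, true));   // prints NONE
        System.out.println(forVirtualThread(false, false, true));  // prints THREAD_NOT_SUSPENDED
        System.out.println(forVirtualThread(false, true, false));  // prints OPAQUE_FRAME
    }
}
```

The checks are ordered so that each quoted condition maps to exactly one branch, which makes it easy to compare the model against the spec wording when that wording changes during review.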
> > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > minor tweak of JVMTI_ERROR_OPAQUE_FRAME description src/hotspot/share/prims/jvmti.xml line 1925: > 1923: > 1924: The thread is a suspended virtual thread and the implementation was unable > 1925: to throw an asynchronous exception from this frame. This part no longer has wording similar to the general description of JVMTI_ERROR_OPAQUE_FRAME below. Maybe that was understood and intended when the rewording was done. Just want to make sure you are aware of it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1182936329 From sspitsyn at openjdk.org Tue May 2 19:02:22 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 19:02:22 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 16:47:06 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). 
>> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove is_young_gc_movable This looks good in general. I can't judge on the GC side decision about this removal and all updated comments but it looks consistent. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13643#pullrequestreview-1409712756 From sspitsyn at openjdk.org Tue May 2 19:17:21 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 19:17:21 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: On Fri, 28 Apr 2023 21:30:23 GMT, Chris Plummer wrote: >> Note this PR depends on the #13546 PR for the following: >> >> [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread >> >> So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. >> >> >> [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. >> >> Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. 
These errors are not new nor need any special handling. >> >> Our existing testing for ThreadReference.stop() is fairly weak: >> >> - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. >> - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() >> - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly >> >> I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Some test logging improvements. src/java.se/share/data/jdwp/jdwp.spec line 2024: > 2022: (Error THREAD_NOT_SUSPENDED "The thread is a virtual thread and was not suspended.") > 2023: (Error OPAQUE_FRAME "The thread is a suspended virtual thread and the implementation " > 2024: "was unable to throw an asynchronous exception from this frame.") Should it be aligned with JVMTI and some other places in your fix and say "from the current frame" instead of "from this frame"? src/jdk.jdi/share/classes/com/sun/jdi/ThreadReference.java line 132: > 130: * @throws OpaqueFrameException if the thread is a suspended > 131: * virtual thread and the implementation was unable to throw an > 132: * asynchronous exception from this frame The same comment as for the jdwp spec: Should it be aligned with JVMTI and some other places in your fix and say "from the current frame" instead of "from this frame"? 
src/jdk.jdi/share/classes/com/sun/tools/jdi/ThreadReferenceImpl.java line 279:

> 277: case JDWP.Error.OPAQUE_FRAME:
> 278: assert isVirtual(); // can only happen with virtual threads
> 279: throw new OpaqueFrameException();

Should the OpaqueFrameException also provide a message?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182950871
PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182951575
PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182954262

From alanb at openjdk.org Tue May 2 19:37:16 2023
From: alanb at openjdk.org (Alan Bateman)
Date: Tue, 2 May 2023 19:37:16 GMT
Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl
In-Reply-To:
References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com>
Message-ID:

On Tue, 2 May 2023 18:43:26 GMT, Serguei Spitsyn wrote:

> As I see, this transport was deprecated for some time.
> But it is not clear in what release.

There were a couple of steps in this. The JSR-160 specification was updated for Java 8 so that the RMI connector wasn't required to support the IIOP transport. RMI-IIOP was eventually removed in Java 11 (JEP 320). As I recall, RMIIIOPServerImpl was for implementers but was/is part of the API, so it couldn't be removed. I've added the csr label to the PR as this will need to be tracked as an API change.
------------- PR Comment: https://git.openjdk.org/jdk/pull/13758#issuecomment-1532038573 From sspitsyn at openjdk.org Tue May 2 19:40:16 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 19:40:16 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: On Fri, 28 Apr 2023 21:30:23 GMT, Chris Plummer wrote: >> Note this PR depends on the #13546 PR for the following: >> >> [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread >> >> So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. >> >> >> [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. >> >> Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. These errors are not new nor need any special handling. >> >> Our existing testing for ThreadReference.stop() is fairly weak: >> >> - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. 
>> - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() >> - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly >> >> I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Some test logging improvements. test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 39: > 37: /** > 38: * The test checks that the JDI method:
com.sun.jdi.ThreadReference.stop()
> 39: * behaves properly in various situations. I consists of 5 subtests. Typo: "I consists" => "It consists". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182973906 From cjplummer at openjdk.org Tue May 2 20:09:14 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 2 May 2023 20:09:14 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v5] In-Reply-To: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: > Note this PR depends on the #13546 PR for the following: > > [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread > > So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. > > > [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. > > Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. These errors are not new nor need any special handling. > > Our existing testing for ThreadReference.stop() is fairly weak: > > - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. 
It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. > - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() > - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly > > I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: fix minor comment typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13548/files - new: https://git.openjdk.org/jdk/pull/13548/files/801362e5..478b6b93 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13548&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13548&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13548.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13548/head:pull/13548 PR: https://git.openjdk.org/jdk/pull/13548 From cjplummer at openjdk.org Tue May 2 20:09:21 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 2 May 2023 20:09:21 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: On Tue, 2 May 2023 19:11:14 GMT, Serguei Spitsyn wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> Some test logging improvements. 
> > src/java.se/share/data/jdwp/jdwp.spec line 2024: > >> 2022: (Error THREAD_NOT_SUSPENDED "The thread is a virtual thread and was not suspended.") >> 2023: (Error OPAQUE_FRAME "The thread is a suspended virtual thread and the implementation " >> 2024: "was unable to throw an asynchronous exception from this frame.") > > Should it be aligned with JVMTI and some other places in your fix and say > "from the current frame" instead of "from this frame"? I'm waiting for your JVMTI PR to finish review. I don't want to have to change this more than once. > src/jdk.jdi/share/classes/com/sun/jdi/ThreadReference.java line 132: > >> 130: * @throws OpaqueFrameException if the thread is a suspended >> 131: * virtual thread and the implementation was unable to throw an >> 132: * asynchronous exception from this frame > > The same comment as for the jdwp spec: > Should it be aligned with JVMTI and some other places in your fix and say > "from the current frame" instead of "from this frame"? I'm waiting for your JVMTI PR to finish review. I don't want to have to change this more than once. > src/jdk.jdi/share/classes/com/sun/tools/jdi/ThreadReferenceImpl.java line 279: > >> 277: case JDWP.Error.OPAQUE_FRAME: >> 278: assert isVirtual(); // can only happen with virtual threads >> 279: throw new OpaqueFrameException(); > > Should the OpaqueFrameException also provide a message? The implementation tends to only include an exception message when it helps to clarify the reason for the exception. The spec only gives one possible reason for this exception, so no clarification should be needed. Plus it would be hard to meaningfully convey this reason in a short message. 
Here's what the spec says: * @throws OpaqueFrameException if the thread is a suspended * virtual thread and the implementation was unable to throw an * asynchronous exception from this frame > test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002t.java line 78: > >> 76: /* >> 77: * TEST #2: async exception while suspended at a breakpoint. >> 78: */ > > Where is a similar comment for TEST #1 ? > Would it make sense to implement each subtest as a separate method? Test #1 does not involve the debuggee since it is suppose to fail on the JDI side with an exception. I don't think separate methods helps much here. It might even make it a bit harder to understand the flow. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182992243 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182992404 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182995311 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182997188 From sspitsyn at openjdk.org Tue May 2 20:09:23 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 20:09:23 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: On Fri, 28 Apr 2023 21:30:23 GMT, Chris Plummer wrote: >> Note this PR depends on the #13546 PR for the following: >> >> [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread >> >> So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. >> >> >> [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. 
This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. >> >> Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. These errors are not new nor need any special handling. >> >> Our existing testing for ThreadReference.stop() is fairly weak: >> >> - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. >> - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() >> - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly >> >> I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Some test logging improvements. test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 63: > 61: // debuggee local var used to find needed non-throwable object > 62: static final String DEBUGGEE_NON_THROWABLE_VAR= "stop002tNonThrowable"; > 63: // debuggee local var used to find needed non-throwable object This comment is a little bit confusing. Should it be one line up or it is intentionally placed before `DEBUGGEE_THROWABLE_VAR`? 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 300: > 298: IntegerValue ival; > 299: do { > 300: ival = (IntegerValue)mainClass.getValue(mainClass.fieldByName("testNumReady")); Do we need a sleep at each iteration? test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002t.java line 78: > 76: /* > 77: * TEST #2: async exception while suspended at a breakpoint. > 78: */ Where is a similar comment for TEST #1 ? Would it make sense to implement each subtest as a separate method? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182993575 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182999880 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1182989578 From cjplummer at openjdk.org Tue May 2 20:17:21 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 2 May 2023 20:17:21 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v6] In-Reply-To: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: > Note this PR depends on the #13546 PR for the following: > > [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread > > So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. > > > [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. 
> > Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. These errors are not new nor need any special handling. > > Our existing testing for ThreadReference.stop() is fairly weak: > > - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. > - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() > - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly > > I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. 
Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: fix minor comment typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13548/files - new: https://git.openjdk.org/jdk/pull/13548/files/478b6b93..fd89fd3a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13548&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13548&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13548.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13548/head:pull/13548 PR: https://git.openjdk.org/jdk/pull/13548 From cjplummer at openjdk.org Tue May 2 20:17:25 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 2 May 2023 20:17:25 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: <8i8I_TKRhP5x7GUICwq-Ask20zjk96C1Z71fNFqv83U=.732d9b9a-d1c0-41cb-ac2d-f2ea80fc987d@github.com> On Tue, 2 May 2023 19:57:32 GMT, Serguei Spitsyn wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> Some test logging improvements. > > test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 63: > >> 61: // debuggee local var used to find needed non-throwable object >> 62: static final String DEBUGGEE_NON_THROWABLE_VAR= "stop002tNonThrowable"; >> 63: // debuggee local var used to find needed non-throwable object > > This comment is a little bit confusing. > Should it be one line up or it is intentionally placed before `DEBUGGEE_THROWABLE_VAR`? 
It should be "throwable", not "non-throwable" > test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 300: > >> 298: IntegerValue ival; >> 299: do { >> 300: ival = (IntegerValue)mainClass.getValue(mainClass.fieldByName("testNumReady")); > > Do we need a sleep at each iteration? I intentionally did not do one since `getValue()` is fairly slow and involves sending and receiving a JDWP packet. It didn't seem worth the extra noise of a sleep call and the needed exception handling. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183002318 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183006005 From sspitsyn at openjdk.org Tue May 2 20:17:27 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 20:17:27 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: On Fri, 28 Apr 2023 21:30:23 GMT, Chris Plummer wrote: >> Note this PR depends on the #13546 PR for the following: >> >> [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread >> >> So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. >> >> >> [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. >> >> Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. 
The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. These errors are not new nor need any special handling. >> >> Our existing testing for ThreadReference.stop() is fairly weak: >> >> - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. >> - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() >> - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly >> >> I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Some test logging improvements. test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 148: > 146: /* > 147: * Test #1: verify using a non-throwable object with stop() fails appropriately. > 148: */ The same suggestion about refactoring each subtest case into a separate method. There is some overhead in doing this, so it is up to you but worth to consider. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183003897 From kevinw at openjdk.org Tue May 2 20:31:16 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 2 May 2023 20:31:16 GMT Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl In-Reply-To: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> Message-ID: <9_nK21tIoGI1CEjQEnQaLzHPWwk4E1EyoIJswT3t2DQ=.dc264e8d-66a7-4580-8f44-29a90bb57485@github.com> On Tue, 2 May 2023 17:57:14 GMT, Kevin Walls wrote: > Removal of class, looks like it was missed in the JDK9 removal of RMIIIOP. > This class is not referenced by other classes or tests. Thanks Alan - I had read JDK-8043937 as being where we remove IIOP completely as a transport for JMX (after earlier changes to not build it by default, JDK-8001048 and JDK-8033366). I see https://bugs.openjdk.org/browse/CCC-8043937 records that we deprecated javax.management.remote.rmi.RMIIIOPServerImpl.java I can add a CSR for removal to make this final and complete as I think was intended. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13758#issuecomment-1532105717 From sspitsyn at openjdk.org Tue May 2 20:34:24 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 20:34:24 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: <0o3GWuNIu2VtKEmbA92gGnXwcvSoh4p8mJGpm6UlDtA=.d4a10b6c-ddda-4edb-acf0-06a3cdd4be87@github.com> On Tue, 2 May 2023 19:56:14 GMT, Chris Plummer wrote: >> src/jdk.jdi/share/classes/com/sun/jdi/ThreadReference.java line 132: >> >>> 130: * @throws OpaqueFrameException if the thread is a suspended >>> 131: * virtual thread and the implementation was unable to throw an >>> 132: * asynchronous exception from this frame >> >> The same comment as for the jdwp spec: >> Should it be aligned with JVMTI and some other places in your fix and say >> "from the current frame" instead of "from this frame"? > > I'm waiting for your JVMTI PR to finish review. I don't want to have to change this more than once. Okay. >> src/jdk.jdi/share/classes/com/sun/tools/jdi/ThreadReferenceImpl.java line 279: >> >>> 277: case JDWP.Error.OPAQUE_FRAME: >>> 278: assert isVirtual(); // can only happen with virtual threads >>> 279: throw new OpaqueFrameException(); >> >> Should the OpaqueFrameException also provide a message? > > The implementation tends to only include an exception message when it helps to clarify the reason for the exception. The spec only gives one possible reason for this exception, so no clarification should be needed. Plus it would be hard to meaningfully convey this reason in a short message. Here's what the spec says: > > * @throws OpaqueFrameException if the thread is a suspended > * virtual thread and the implementation was unable to throw an > * asynchronous exception from this frame Okay, thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183022704 PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183023149 From sspitsyn at openjdk.org Tue May 2 20:37:20 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 20:37:20 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: <8i8I_TKRhP5x7GUICwq-Ask20zjk96C1Z71fNFqv83U=.732d9b9a-d1c0-41cb-ac2d-f2ea80fc987d@github.com> References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> <8i8I_TKRhP5x7GUICwq-Ask20zjk96C1Z71fNFqv83U=.732d9b9a-d1c0-41cb-ac2d-f2ea80fc987d@github.com> Message-ID: On Tue, 2 May 2023 20:12:15 GMT, Chris Plummer wrote: >> test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002.java line 300: >> >>> 298: IntegerValue ival; >>> 299: do { >>> 300: ival = (IntegerValue)mainClass.getValue(mainClass.fieldByName("testNumReady")); >> >> Do we need a sleep at each iteration? > > I intentionally did not do one since `getValue()` is fairly slow and involves sending and receiving a JDWP packet. It didn't seem worth the extra noise of a sleep call and the needed exception handling. Okay, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183025283 From dcubed at openjdk.org Tue May 2 20:37:18 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 2 May 2023 20:37:18 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option [v2] In-Reply-To: References: Message-ID: > A trivial fix to remove broken EnableThreadSMRExtraValidityChecks option. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: dholmes CR - change ':' to '.'. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/13704/files - new: https://git.openjdk.org/jdk/pull/13704/files/bfb24496..59f7a22e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13704&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13704&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13704.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13704/head:pull/13704 PR: https://git.openjdk.org/jdk/pull/13704 From dcubed at openjdk.org Tue May 2 20:37:20 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 2 May 2023 20:37:20 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option [v2] In-Reply-To: References: Message-ID: On Mon, 1 May 2023 13:44:22 GMT, Coleen Phillimore wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes CR - change ':' to '.'. > > Yes, this looks good and also trivial. @coleenp, @dholmes-ora and @sspitsyn - thanks for the reviews! I suppose there isn't any way to give @robehn credit for his reviews when this fix was part of https://github.com/openjdk/jdk/pull/13519. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13704#issuecomment-1532107167 From dcubed at openjdk.org Tue May 2 20:37:21 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 2 May 2023 20:37:21 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option [v2] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 08:04:52 GMT, David Holmes wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes CR - change ':' to '.'. 
> > src/hotspot/share/runtime/threadSMR.cpp line 828: > >> 826: if (java_thread != JavaThread::current()) { >> 827: // java_thread is not the current JavaThread so we have to verify it >> 828: // against the ThreadsList: > > Colon at the end of the comment seems odd Nice catch! Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13704#discussion_r1183022658 From sspitsyn at openjdk.org Tue May 2 20:41:20 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 May 2023 20:41:20 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: On Tue, 2 May 2023 20:01:36 GMT, Chris Plummer wrote: >> test/hotspot/jtreg/vmTestbase/nsk/jdi/ThreadReference/stop/stop002t.java line 78: >> >>> 76: /* >>> 77: * TEST #2: async exception while suspended at a breakpoint. >>> 78: */ >> >> Where is a similar comment for TEST #1 ? >> Would it make sense to implement each subtest as a separate method? > > Test #1 does not involve the debuggee since it is suppose to fail on the JDI side with an exception. > > I don't think separate methods helps much here. It might even make it a bit harder to understand the flow. > Test https://github.com/openjdk/jdk/pull/1 does not involve the debuggee > since it is suppose to fail on the JDI side with an exception. Still some comment is needed to explain this. Otherwise, it is confusing that the subtests start from the TEST #2. :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183027318 From dcubed at openjdk.org Tue May 2 20:59:19 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 2 May 2023 20:59:19 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option [v2] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 20:37:18 GMT, Daniel D. 
Daugherty wrote:

>> A trivial fix to remove broken EnableThreadSMRExtraValidityChecks option.
>
> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision:
>
>   dholmes CR - change ':' to '.'.

This fix was tested with Mach5 Tier[1-8] when it was part of:
https://github.com/openjdk/jdk/pull/13519
and is also being tested with Mach5 Tier[1-8] combined with the fix from JDK-8307068 and what remains in JDK-8305670.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13704#issuecomment-1532139191

From lmesnik at openjdk.org Tue May 2 21:28:16 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Tue, 2 May 2023 21:28:16 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode
Message-ID:

The debugger tests might start all debuggees using virtual threads when the property "main.wrapper" is set.
However, the new JTREG_TEST_THREAD_FACTORY mode does the same thing. The only difference is that it uses predefined problem-list names and doesn't set the property "main.wrapper" as part of the jtreg properties. So the nsk wrapper should set it additionally.
------------- Commit messages: - 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode Changes: https://git.openjdk.org/jdk/pull/13763/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13763&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307305 Stats: 2 lines in 3 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13763.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13763/head:pull/13763 PR: https://git.openjdk.org/jdk/pull/13763 From ayang at openjdk.org Tue May 2 22:08:18 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 2 May 2023 22:08:18 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 16:47:06 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). >> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove is_young_gc_movable Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13643#pullrequestreview-1409951167 From cjplummer at openjdk.org Tue May 2 22:42:23 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 2 May 2023 22:42:23 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v7] In-Reply-To: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: > Note this PR depends on the #13546 PR for the following: > > [JDK-8306434](https://bugs.openjdk.org/browse/JDK-8306434): add support of virtual threads to JVMTI StopThread > > So it can't be finalized and push until after JDK-8306434 is pushed. There will also be GHA failures until then. If JDK-8306434 results in any additional spec changes, they will likely impact this CR also. > > > [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding some virtual thread support to JVMTI StopThread. This will allow JDWP ThreadReference.Stop and JDI ThreadReference.stop() to have the same level support for virtual threads. > > Mostly this is a spec update for JDWP and JDI, but there are also some minor changes needed to the ThreadReference.stop() handling of JDWP errors, and throwing the appropriate exceptions. Also some minor cleanup in jdb. The debug agent doesn't need changes since JVMTI errors are just passed through as the corresponding JDWP errors, and the code for this is already in place. These errors are not new nor need any special handling. > > Our existing testing for ThreadReference.stop() is fairly weak: > > - nsk/jdb/kill/kill001, which tests stop() when the thread is suspended at a breakpoint. It will get problem listed by [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034). I have fixes for it already working and will push it separately. 
> - nsk/jdi/stop/stop001, which is problem listed and only tests when the thread is blocked in Object.wait() > - nsk/jdi/stop/stop002, which only tests that throwing an invalid exception fails properly > > I decided to take stop002 and make it a more thorough test of ThreadReference.stop(). See the comment at the top for a list of what is tested. As for reviewing this test, it's probably best to look at it as a completely new test rather than looking at diffs since so much has been changed and added. Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: clarify test #1 in the debuggee. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13548/files - new: https://git.openjdk.org/jdk/pull/13548/files/fd89fd3a..d9722d1c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13548&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13548&range=05-06 Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13548.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13548/head:pull/13548 PR: https://git.openjdk.org/jdk/pull/13548 From cjplummer at openjdk.org Tue May 2 22:42:24 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 2 May 2023 22:42:24 GMT Subject: RFR: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop() [v4] In-Reply-To: References: <2iIOo1G9Tr-U5dzL82xfGQ5CkEA0V40nGXpKzmST_BM=.9b0ab2ed-996f-4163-b5e2-45a70ff18d9a@github.com> Message-ID: <_RXfyuiXFaiWBbsLRLHgZ8RmqIeqC3hW7vWybPHjJR8=.4e6e1dc9-d6fd-4c29-b1cd-63e1661246aa@github.com> On Tue, 2 May 2023 20:37:21 GMT, Serguei Spitsyn wrote: >> Test #1 does not involve the debuggee since it is suppose to fail on the JDI side with an exception. >> >> I don't think separate methods helps much here. It might even make it a bit harder to understand the flow. 
>
>> Test #1 does not involve the debuggee
>> since it is supposed to fail on the JDI side with an exception.
>
> Still some comment is needed to explain this.
> Otherwise, it is confusing that the subtests start from the TEST #2. :)

Fixed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13548#discussion_r1183109387

From cjplummer at openjdk.org Tue May 2 22:43:17 2023
From: cjplummer at openjdk.org (Chris Plummer)
Date: Tue, 2 May 2023 22:43:17 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode
In-Reply-To:
References:
Message-ID: <1YZqQUsulIUKS0qZirmlvBS1m1827bDqa4hGEaF24DY=.e05bd223-51e0-4690-8dec-c9f7b5a51e9b@github.com>

On Tue, 2 May 2023 21:16:20 GMT, Leonid Mesnik wrote:

> The debugger tests might start all debuggees using virtual threads when the property "main.wrapper" is set.
> However, the new JTREG_TEST_THREAD_FACTORY mode does the same thing. The only difference is that it uses predefined problem-list names and doesn't set the property "main.wrapper" as part of the jtreg properties. So the nsk wrapper should set it additionally.

What about com/sun/jdi tests?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13763#issuecomment-1532244254

From dlong at openjdk.org Tue May 2 22:49:58 2023
From: dlong at openjdk.org (Dean Long)
Date: Tue, 2 May 2023 22:49:58 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v70]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 18:38:11 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word.
And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>>
>> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>>
>> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>>
>> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>>
>> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking.
>>
>> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>>
>> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now always be done in the fast path, if the hashcode has been installed. Fast-locked headers can be used directly; for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR.
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>> | | x86_64 | | | | aarch64 | | |
>> | -- | -- | -- | -- | -- | -- | -- | -- |
>> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add missing new file

My review applies to the aarch64 changes.
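For readers following the quoted proposal, the lock-stack bookkeeping it describes can be modelled in a few lines of plain Java. This is only an illustrative sketch, not HotSpot code: the class `LockStackSketch` and its methods `enter`, `exit` and `owns` are hypothetical names, the object-header CAS and the ANONYMOUS_OWNER hand-over are left out, and only the three decisions named in the description are shown (ownership by array scan, inflation on overflow, inflation on recursive locking).

```java
// Hypothetical model of a per-thread lock-stack; names are illustrative only.
final class LockStackSketch {
    // Fixed capacity quoted in the proposal ("currently with 8 elements").
    static final int CAPACITY = 8;

    // FAST_LOCKED: object recorded on the lock-stack.
    // INFLATE: the caller would fall back to a full monitor (slow path).
    enum Outcome { FAST_LOCKED, INFLATE }

    private final Object[] slots = new Object[CAPACITY];
    private int top = 0;

    // "Does the current thread own me?" answered by a linear identity scan.
    boolean owns(Object obj) {
        for (int i = 0; i < top; i++) {
            if (slots[i] == obj) {
                return true;
            }
        }
        return false;
    }

    // Overflow and recursive locking both take the slow path in this model.
    Outcome enter(Object obj) {
        if (top == CAPACITY || owns(obj)) {
            return Outcome.INFLATE;
        }
        slots[top++] = obj;
        return Outcome.FAST_LOCKED;
    }

    // Removes obj from the stack; returns false if it was not fast-locked here.
    boolean exit(Object obj) {
        for (int i = top - 1; i >= 0; i--) {
            if (slots[i] == obj) {
                System.arraycopy(slots, i + 1, slots, i, top - i - 1);
                slots[--top] = null;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LockStackSketch stack = new LockStackSketch();
        Object a = new Object();
        System.out.println(stack.enter(a)); // first acquisition
        System.out.println(stack.enter(a)); // recursive acquisition
        System.out.println(stack.owns(a));
        stack.exit(a);
        System.out.println(stack.owns(a));
    }
}
```

In this model a linear scan is acceptable because the array never holds more than eight entries, which matches the observation in the proposal that most workloads stay at or below five active locks per thread.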
I have looked at the aarch64 changes twice and the latest version still looks good. All of my questions or comments have been addressed.

-------------

Marked as reviewed by dlong (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/10907#pullrequestreview-1409981759

From cjplummer at openjdk.org Wed May 3 00:41:18 2023
From: cjplummer at openjdk.org (Chris Plummer)
Date: Wed, 3 May 2023 00:41:18 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 21:16:20 GMT, Leonid Mesnik wrote:

> The debugger tests might start all debuggees using virtual threads when the property "main.wrapper" is set.
> However, the new JTREG_TEST_THREAD_FACTORY mode does the same thing. The only difference is that it uses predefined problem-list names and doesn't set the property "main.wrapper" as part of the jtreg properties. So the nsk wrapper should set it additionally.

I think MainWrapper.java could use some documentation on why/how/when it is used. This is something that should have been done before first committing it when the loom changes were integrated.

Also, I see what looks like a bug w.r.t. this code in Launcher.java:

    if (System.getProperty("main.wrapper") != null) {
        cmdline = MainWrapper.class.getName() + " " + System.getProperty("main.wrapper") + " " + cmdline;
    }

It gets the main.wrapper property in order to pass it to MainWrapper.main(), which is in charge of setting the property, so how could it ever already be set when this Launcher.java code is executed? Same thing in Binder.java and DebugeeBinder.java.
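To make the ordering question concrete, the command-line logic quoted from Launcher.java can be isolated in a small stand-alone sketch. The class `WrapperCmdlineSketch`, the method `buildCmdline` and the constant `WRAPPER_CLASS` are hypothetical; the wrapper class name is only assumed from the path test/hotspot/jtreg/vmTestbase/nsk/share/MainWrapper.java. The sketch shows that the wrapper is prepended only if "main.wrapper" is already visible in the JVM that builds the command line, for example because that JVM was itself started with -Dmain.wrapper=Virtual.

```java
// Hypothetical stand-alone model of the quoted Launcher.java logic.
final class WrapperCmdlineSketch {
    // Assumed fully qualified name of the wrapper class (not confirmed by the thread).
    static final String WRAPPER_CLASS = "nsk.share.MainWrapper";

    // Prepends "<wrapper class> <wrapper name>" when the property value is present.
    static String buildCmdline(String wrapperProperty, String cmdline) {
        if (wrapperProperty != null) {
            return WRAPPER_CLASS + " " + wrapperProperty + " " + cmdline;
        }
        return cmdline;
    }

    public static void main(String[] args) {
        // Null unless the launching JVM was given -Dmain.wrapper=... or set it itself.
        String fromProperty = System.getProperty("main.wrapper");
        System.out.println(buildCmdline(fromProperty, "MyDebuggee arg1"));
    }
}
```

Run without -Dmain.wrapper, the command line is passed through unchanged, which is the situation the review comment is asking about.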
test/hotspot/jtreg/vmTestbase/nsk/share/MainWrapper.java line 50:

> 48:
> 49:         // Some tests use this property to understand if virtual threads are used
> 50:         System.setProperty("main.wrapper", "Virtual");

Shouldn't this be:

`System.setProperty("main.wrapper", wrapperName);`

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13763#issuecomment-1532316167
PR Review Comment: https://git.openjdk.org/jdk/pull/13763#discussion_r1183151407

From dholmes at openjdk.org Wed May 3 02:43:04 2023
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 3 May 2023 02:43:04 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v70]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 18:38:11 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>>
>> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>>
>> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>>
>> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>>
>> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking.
>>
>> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>>
>> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR.
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>>
>> |  | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>>  | x86_64 |  |  |  | aarch64 |  | 
>> -- | -- | -- | -- | -- | -- | -- | --
>>  | stack-locking | fast-locking |  |  | stack-locking | fast-locking | 
>> AkkaUct | 841.884 | 836.948 | 0.59% |  | 1475.774 | 1465.647 | 0.69%
>> Reactors | 11041.427 | 11181.451 | -1.25% |  | 11381.751 | 11521.318 | -1.21%
>> Als | 1367.183 | 1359.358 | 0.58% |  | 1678.103 | 1688.067 | -0.59%
>> ChiSquare | 577.021 | 577.398 | -0.07% |  | 986.619 | 988.063 | -0.15%
>> GaussMix | 817.459 | 819.073 | -0.20% |  | 1154.293 | 1155.522 | -0.11%
>> LogRegression | 598.343 | 603.371 | -0.83% |  | 638.052 | 644.306 | -0.97%
>> MovieLens | 8248.116 | 8314.576 | -0.80% |  | 7569.219 | 7646.828 | -1.01%
>> NaiveBayes | 587.607 | 581.608 | 1.03% |  | 541.583 | 550.059 | -1.54%
>> PageRank | 3260.553 | 3263.472 | -0.09% |  | 4376.405 | 4381.101 | -0.11%
>> FjKmeans | 979.978 | 976.122 | 0.40% |  | 774.312 | 771.235 | 0.40%
>> FutureGenetic | 2187.369 | 2183.271 | 0.19% |  | 2685.722 | 2689.056 | -0.12%
>> ParMnemonics | 2434.551 | 2468.763 | -1.39% |  | 4278.225 | 4263.863 | 0.34%
>> Scrabble | 111.882 | 111.768 | 0.10% |  | 151.796 | 153.959 | -1.40%
>> RxScrabble | 210.252 | 211.38 | -0.53% |  | 310.116 | 315.594 | -1.74%
>> Dotty | 750.415 | 752.658 | -0.30% |  | 1033.636 | 1036.168 | -0.24%
>> ScalaDoku | 3072.05 | 3051.2 | 0.68% |  | 3711.506 | 3690.04 | 0.58%
>> ScalaKmeans | 211.427 | 209.957 | 0.70% |  | 264.38 | 265.788 | -0.53%
>> ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% |  | 1088.182 | 1092.266 | -0.37%
>> Philosophers | 6450.124 | 6565.705 | -1.76% |  | 12017.964 | 11902.559 | 0.97%
>> FinagleChirper | 3953.623 | 3972.647 | -0.48% |  | 4750.751 | 4769.274 | -0.39%
>> FinagleHttp | 3970.526 | 4005.341 | -0.87% |  | 5294.125 | 5296.224 | -0.04%
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add missing new file

src/hotspot/share/runtime/globals.hpp line 1986:

> 1984:           "0: monitors only, "                                              \
> 1985:           "1: monitors & legacy stack-locking (default), "                  \
> 1986:           "2: monitors & new lightweight locking")                          \

Can we include the `LM_XXX` values in the description string so it is clear which maps to what?
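
For readers following the quoted PR description, the lock-stack idea can be pictured with a small Java model. This is only a sketch under stated assumptions: the class `LockStackSketch` and its methods are hypothetical names invented for illustration, the real structure lives in HotSpot's C++ code and holds oops, and the header CAS, monitor inflation and GC-root handling are deliberately not modelled.

```java
/**
 * Toy model of the per-thread lock-stack from the quoted description.
 * All names are hypothetical; this is not HotSpot code.
 */
final class LockStackSketch {
    // The description says the lock-stack currently has a fixed size of 8 elements.
    static final int CAPACITY = 8;

    private final Object[] slots = new Object[CAPACITY];
    private int top = 0; // number of occupied slots

    /** Answers "does the current thread own this lock?" with a short identity scan. */
    boolean contains(Object lockee) {
        for (int i = top - 1; i >= 0; i--) {
            if (slots[i] == lockee) {
                return true;
            }
        }
        return false;
    }

    /**
     * Fast-path lock attempt. Returns false when the stack is full or the
     * object is already on it (recursive locking); in the described scheme
     * both cases fall back to inflating the lock to a full monitor.
     */
    boolean tryPush(Object lockee) {
        if (top == CAPACITY || contains(lockee)) {
            return false;
        }
        slots[top++] = lockee;
        return true;
    }

    /** Fast-path unlock: drops the entry if present and keeps the array dense. */
    boolean remove(Object lockee) {
        for (int i = top - 1; i >= 0; i--) {
            if (slots[i] == lockee) {
                System.arraycopy(slots, i + 1, slots, i, top - i - 1);
                slots[--top] = null;
                return true;
            }
        }
        return false;
    }

    int size() {
        return top;
    }
}
```

A caller would treat a `false` result from `tryPush` as "take the slow path", which mirrors the overflow and recursion cases in the description; the linear scan stays cheap only because the array is capped at a handful of entries.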

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1183194309

From lmesnik at openjdk.org Wed May 3 02:48:12 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 3 May 2023 02:48:12 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 00:38:17 GMT, Chris Plummer wrote:

> I think MainWrapper.java could use some documentation on why/how/when it is used. This is something that should have been done before first committing it when the loom changes were integrated.
>
> Also, I see what looks like a bug w.r.t. this code in Launcher.java:
>
> ```
>         if (System.getProperty("main.wrapper") != null) {
>             cmdline = MainWrapper.class.getName() + " " + System.getProperty("main.wrapper") + " " + cmdline;
>         }
> ```
>
> It gets the main.wrapper property in order to pass it to MainWrapper.main(), which is in charge of setting the property, so how could it ever already be set when this Launcher.java code is executed? Same thing in Binder.java and DebugeeBinder.java.

Let me file a sub-task to add documentation about using "main.wrapper" as a part of https://bugs.openjdk.org/browse/JDK-8303773.
Currently, it is propagated to the debugee as a part of `test.vm.opts`. However, the goal is to use `JTREG_TEST_THREAD_FACTORY`, which sets the Virtual test thread factory and sets all required properties. So they need to be set manually in the debugee when launching it using the nsk wrapper or TestScaffold.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13763#issuecomment-1532394277

From lmesnik at openjdk.org Wed May 3 02:56:57 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 3 May 2023 02:56:57 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode [v2]
In-Reply-To:
References:
Message-ID:

> The debugger tests might start all debugees using virtual threads when the property "main.wrapper" is set.
> However, the new mode JTREG_TEST_THREAD_FACTORY does the same thing. The only difference is that it uses predefined problemlist names and doesn't set the property "main.wrapper" as a part of the jtreg properties. So the nsk wrapper should set it additionally.

Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:

  Fixed main.wrapper property usage.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13763/files
  - new: https://git.openjdk.org/jdk/pull/13763/files/b5f06cb2..8c000319

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13763&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13763&range=00-01

  Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/13763.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13763/head:pull/13763

PR: https://git.openjdk.org/jdk/pull/13763

From lmesnik at openjdk.org Wed May 3 02:57:00 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 3 May 2023 02:57:00 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 00:14:50 GMT, Chris Plummer wrote:

>> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Fixed main.wrapper property usage.
>
> test/hotspot/jtreg/vmTestbase/nsk/share/MainWrapper.java line 50:
>
>> 48:
>> 49:         // Some tests use this property to understand if virtual threads are used
>> 50:         System.setProperty("main.wrapper", "Virtual");
>
> Shouldn't this be:
>
> `System.setProperty("main.wrapper", wrapperName);`

Thanks, it is better to use wrapperName, as well as to set it in TestScaffold.
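
The point settled in this exchange, recording the actual wrapper name in "main.wrapper" rather than a hard-coded "Virtual", might look roughly like the sketch below. It is a minimal illustration, not the real test library: `MainWrapperSketch`, `run` and `buildCommandLine` are hypothetical names, and a plain thread stands in for whatever thread kind the real wrapper would select. Only the property name and the shape of the Launcher.java snippet come from the thread.

```java
/**
 * Sketch only: a hypothetical stand-in for the nsk MainWrapper/Launcher pair.
 */
final class MainWrapperSketch {

    /** Records the wrapper name in "main.wrapper", then runs the wrapped main. */
    static void run(String wrapperName, Runnable wrappedMain) {
        // Use the name that was passed in instead of a hard-coded "Virtual".
        System.setProperty("main.wrapper", wrapperName);
        // A real wrapper would pick a virtual or platform thread from wrapperName;
        // a plain thread keeps this sketch independent of the JDK version.
        Thread t = new Thread(wrappedMain, "wrapped-main");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Mirrors the quoted Launcher.java snippet: prefix only if the property is set. */
    static String buildCommandLine(String cmdline) {
        String wrapper = System.getProperty("main.wrapper");
        if (wrapper != null) {
            cmdline = MainWrapperSketch.class.getName() + " " + wrapper + " " + cmdline;
        }
        return cmdline;
    }
}
```

With this shape, code inside the wrapped main that checks `System.getProperty("main.wrapper")` sees the name that was actually requested, and the command-line prefix is added only in processes where the property has already been set.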

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13763#discussion_r1183199536

From sspitsyn at openjdk.org Wed May 3 03:03:24 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 3 May 2023 03:03:24 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 18:57:15 GMT, Chris Plummer wrote:

>> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   minor tweak of JVMTI_ERROR_OPAQUE_FRAME description
>
> src/hotspot/share/prims/jvmti.xml line 1925:
>
>> 1923:
>> 1924:           The thread is a suspended virtual thread and the implementation was unable
>> 1925:           to throw an asynchronous exception from this frame.
>
> This part no longer has wording similar to the general description of JVMTI_ERROR_OPAQUE_FRAME below. Maybe that was understood and intended when the rewording was done. Just want to make sure you are aware of it.

What part of the statement does not match?
Should we say "from the current frame" instead of "from this frame"?

The general description has this:
  "... or the function cannot be performed on the thread's current frame."

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1183201230

From cjplummer at openjdk.org Wed May 3 03:14:33 2023
From: cjplummer at openjdk.org (Chris Plummer)
Date: Wed, 3 May 2023 03:14:33 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To:
References:
Message-ID: <-of_WZARcf8b50SO3evk94KMlP_C9QVbIUngPbk_8m4=.e80d168d-a33e-43f0-b481-5ca7a813d476@github.com>

On Wed, 3 May 2023 02:58:21 GMT, Serguei Spitsyn wrote:

>> src/hotspot/share/prims/jvmti.xml line 1925:
>>
>>> 1923:
>>> 1924:           The thread is a suspended virtual thread and the implementation was unable
>>> 1925:           to throw an asynchronous exception from this frame.
>>
>> This part no longer has wording similar to the general description of JVMTI_ERROR_OPAQUE_FRAME below. Maybe that was understood and intended when the rewording was done. Just want to make sure you are aware of it.
>
> What part of the statement does not match?
> Should we say "from the current frame" instead of "from this frame"?
>
> The general description has this:
>   "... or the function cannot be performed on the thread's current frame."

They are both trying to convey the same thing, but using completely different wording to do so. One says "the implementation was unable to throw an asynchronous exception", and the other says "the function cannot be performed". One says "from this frame", and the other says "on the thread's current frame". The meaning is the same, but the wording should be consistent.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1183204950

From sspitsyn at openjdk.org Wed May 3 03:23:16 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 3 May 2023 03:23:16 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To: <-of_WZARcf8b50SO3evk94KMlP_C9QVbIUngPbk_8m4=.e80d168d-a33e-43f0-b481-5ca7a813d476@github.com>
References: <-of_WZARcf8b50SO3evk94KMlP_C9QVbIUngPbk_8m4=.e80d168d-a33e-43f0-b481-5ca7a813d476@github.com>
Message-ID:

On Wed, 3 May 2023 03:12:01 GMT, Chris Plummer wrote:

>> What part of the statement does not match?
>> Should we say "from the current frame" instead of "from this frame"?
>>
>> The general description has this:
>>   "... or the function cannot be performed on the thread's current frame."
>
> They are both trying to convey the same thing, but using completely different wording to do so. One says "the implementation was unable to throw an asynchronous exception", and the other says "the function cannot be performed". One says "from this frame", and the other says "on the thread's current frame". The meaning is the same, but the wording should be consistent.

I feel that it is a feature in the spec to say differently in specific case vs common case.
It should help to understand each case better.
In this particular case, it can be useful to align wording with "the current frame".

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1183207895

From cjplummer at openjdk.org Wed May 3 04:16:18 2023
From: cjplummer at openjdk.org (Chris Plummer)
Date: Wed, 3 May 2023 04:16:18 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To:
References: <-of_WZARcf8b50SO3evk94KMlP_C9QVbIUngPbk_8m4=.e80d168d-a33e-43f0-b481-5ca7a813d476@github.com>
Message-ID:

On Wed, 3 May 2023 03:20:51 GMT, Serguei Spitsyn wrote:

>> They are both trying to convey the same thing, but using completely different wording to do so. One says "the implementation was unable to throw an asynchronous exception", and the other says "the function cannot be performed". One says "from this frame", and the other says "on the thread's current frame". The meaning is the same, but the wording should be consistent.
>
> I feel that it is a feature in the spec to say differently in specific case vs common case.
> It should help to understand each case better.
> In this particular case, it can be useful to align wording with "the current frame".

I can see that reasoning for "unable to throw an asynchronous exception" and "cannot be performed", but what about "the implementation" vs "the function". Can't they both be the same?
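
As background for the wording debate, the `StopThread` rules for virtual threads given in this PR's description (current or suspended-while-mounted succeeds, not suspended and not current gives `JVMTI_ERROR_THREAD_NOT_SUSPENDED`, suspended while unmounted gives `JVMTI_ERROR_OPAQUE_FRAME`) can be written as a small decision function. This is a hedged sketch: `StopThreadSketch`, its `Result` enum and the boolean parameters are hypothetical and only model the stated rules, they are not a JVMTI binding.

```java
/**
 * Hypothetical model of the StopThread outcome for a virtual thread,
 * following only the rules stated in the PR description.
 */
final class StopThreadSketch {
    enum Result { NONE, THREAD_NOT_SUSPENDED, OPAQUE_FRAME }

    static Result stopVirtualThread(boolean isCurrent, boolean isSuspended, boolean isMounted) {
        if (isCurrent) {
            return Result.NONE;                  // the current thread can always be targeted
        }
        if (!isSuspended) {
            return Result.THREAD_NOT_SUSPENDED;  // neither current nor suspended
        }
        if (!isMounted) {
            return Result.OPAQUE_FRAME;          // suspended while unmounted
        }
        return Result.NONE;                      // suspended while mounted
    }
}
```

The `OPAQUE_FRAME` branch is the case whose description text ("from this frame" versus "on the thread's current frame") is being discussed here.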

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1183223249

From cjplummer at openjdk.org Wed May 3 04:17:13 2023
From: cjplummer at openjdk.org (Chris Plummer)
Date: Wed, 3 May 2023 04:17:13 GMT
Subject: RFR: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 02:56:57 GMT, Leonid Mesnik wrote:

>> The debugger tests might start all debugees using virtual threads when the property "main.wrapper" is set.
>> However, the new mode JTREG_TEST_THREAD_FACTORY does the same thing. The only difference is that it uses predefined problemlist names and doesn't set the property "main.wrapper" as a part of the jtreg properties. So the nsk wrapper should set it additionally.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fixed main.wrapper property usage.

Marked as reviewed by cjplummer (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/13763#pullrequestreview-1410164531

From sspitsyn at openjdk.org Wed May 3 04:37:19 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 3 May 2023 04:37:19 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To:
References: <-of_WZARcf8b50SO3evk94KMlP_C9QVbIUngPbk_8m4=.e80d168d-a33e-43f0-b481-5ca7a813d476@github.com>
Message-ID:

On Wed, 3 May 2023 04:13:32 GMT, Chris Plummer wrote:

>> I feel that it is a feature in the spec to say differently in specific case vs common case.
>> It should help to understand each case better.
>> In this particular case, it can be useful to align wording with "the current frame".
>
> I can see that reasoning for "unable to throw an asynchronous exception" and "cannot be performed", but what about "the implementation" vs "the function". Can't they both be the same?

I was thinking about the same.
The problem is the spec has several variations for it:
 - function, operation, implementation...

It is hard or impossible to make this completely consistent.
But I doubt it is very important to polish it like this.
The spec might be boring to read if it is fully consistent. :)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1183229181

From sspitsyn at openjdk.org Wed May 3 05:15:21 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 3 May 2023 05:15:21 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v9]
In-Reply-To:
References:
Message-ID:

> This enhancement adds support of virtual threads to the JVMTI `StopThread` function.
> In preview releases before this enhancement, `StopThread` returned the JVMTI_ERROR_UNSUPPORTED_OPERATION error code for virtual threads.
>
> `StopThread` supports sending an asynchronous exception to a virtual thread only if it is current or suspended in the mounted state. For instance, a virtual thread can be suspended at a JVMTI event. If the virtual thread is not suspended and is not current then the `JVMTI_ERROR_THREAD_NOT_SUSPENDED` error code is returned. If the virtual thread was suspended in the unmounted state then the `JVMTI_ERROR_OPAQUE_FRAME` error code is returned.
>
> `StopThread` has the following description for the `JVMTI_ERROR_OPAQUE_FRAME` error code:
>> The thread is a suspended virtual thread and the implementation
>> was unable to throw an asynchronous exception from this frame.
>
> A couple of the `serviceability/jvmti/vthread` tests have been updated to adapt to the new `StopThread` behavior.
>
> The CSR is: https://bugs.openjdk.org/browse/JDK-8306434
>
> Testing:
> The mach5 tiers 1-6 are in progress.
> Preliminary test runs were good in general.
> The JDB test `vmTestbase/nsk/jdb/kill/kill001/kill001.java` has been problem-listed and will be fixed by the corresponding debugger enhancement, which is going to adapt the JDWP/JDI specs to the new behavior of the JVMTI `StopThread` related to virtual threads.
>
> Also, two JCK JVMTI tests are failing in tier-6:
>> vm/jvmti/StopThread/stop001/stop00103/stop00103.html
>> vm/jvmti/StopThread/stop001/stop00103/stop00103a.html
>
> These two tests will be excluded from the test runs by the JCK team and then adjusted to the new `StopThread` behavior.

Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:

  StopThread spec: minor tweek in description of OPAQUE_FRAME error code

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13546/files
  - new: https://git.openjdk.org/jdk/pull/13546/files/0ad9a6cc..940cda74

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=08
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=07-08

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/13546.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13546/head:pull/13546

PR: https://git.openjdk.org/jdk/pull/13546

From sspitsyn at openjdk.org Wed May 3 05:19:20 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 3 May 2023 05:19:20 GMT
Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v8]
In-Reply-To:
References: <-of_WZARcf8b50SO3evk94KMlP_C9QVbIUngPbk_8m4=.e80d168d-a33e-43f0-b481-5ca7a813d476@github.com>
Message-ID:

On Wed, 3 May 2023 04:31:34 GMT, Serguei Spitsyn wrote:

>> I can see that reasoning for "unable to throw an asynchronous exception" and "cannot be performed", but what about "the implementation" vs "the function". Can't they both be the same?
>
> I was thinking about the same.
> The problem is the spec has several variations for it:
>  - function, operation, implementation...
>
> It is hard or impossible to make this completely consistent.
> But I doubt it is very important to polish it like this.
> The spec might be boring to read if it is fully consistent. :)

I've pushed an update with the change:
  `from this frame` => `from the current frame`

Also, updated the CSR.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13546#discussion_r1183245340

From dholmes at openjdk.org Wed May 3 05:28:59 2023
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 3 May 2023 05:28:59 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v70]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 18:38:11 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. [...]
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add missing new file

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 659:

> 657:     // Invariant: tmpReg == 0.  tmpReg is EAX which is the implicit cmpxchg comparand.
> 658:     lock();
> 659:     cmpxchgptr(thread, Address(boxReg, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));

Sorry I don't quite follow the changes here as this appears to be changing the logic for all locking modes - aren't we still supposed to be cas'ing in the "box" (scrReg) in legacy mode rather than the "thread"?
src/hotspot/share/runtime/javaThread.hpp line 1157:

> 1155: static ByteSize lock_stack_offset() { return byte_offset_of(JavaThread, _lock_stack); }
> 1156: static ByteSize lock_stack_top_offset() { return lock_stack_offset() + LockStack::top_offset(); }
> 1157: static ByteSize lock_stack_base_offset() { return lock_stack_offset() + LockStack::base_offset(); }

Some commentary about why the offsets are all defined relative to the base of the JavaThread would be nice.

src/hotspot/share/runtime/lockStack.hpp line 56:

> 54: inline JavaThread* get_thread() const;
> 55:
> 56: bool is_self() const;

We've been (slowly) weeding out much of the "self" terminology in the threading and sync code, can we use `is_current` instead?

Some comments on each API method would be nice too.

src/hotspot/share/runtime/lockStack.inline.hpp line 50:

> 48:
> 49: inline bool LockStack::is_self() const {
> 50: Thread* thread = Thread::current();

Should use JavaThread here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1183204942
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1183241855
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1183248726
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1183244575

From alanb at openjdk.org Wed May 3 05:38:15 2023
From: alanb at openjdk.org (Alan Bateman)
Date: Wed, 3 May 2023 05:38:15 GMT
Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl
In-Reply-To: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com>
References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com>
Message-ID:

On Tue, 2 May 2023 17:57:14 GMT, Kevin Walls wrote:

> Removal of class, looks like it was missed in the JDK9 removal of RMIIIOP.
> This class is not referenced by other classes or tests.

JSR-160 MR3 changed RMIConnector to make the IIOP transport optional.
For Java 8 this was JDK-8001048. For Java 9, the IIOP transport was removed via JDK-8043937. It should be okay to remove RMIIIOPServerImpl now. There's an example in the RMIConnector javadoc that uses a service URL and the "iiop" protocol, which I think is okay to leave.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13758#issuecomment-1532472950

From dholmes at openjdk.org Wed May 3 05:45:15 2023
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 3 May 2023 05:45:15 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References:
Message-ID: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com>

On Tue, 2 May 2023 15:39:49 GMT, Coleen Phillimore wrote:

>> test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 48:
>>
>>> 46: * It is implemented in FinalizableObject.
>>> 47: */
>>> 48: public void registerCleanup();
>>
>> Can you not implement this as a default method?
>
> It seems much better to force implementors to provide this method.

Why? They all seem to do exactly the same thing.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1183256509

From rkennke at openjdk.org Wed May 3 06:15:06 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 3 May 2023 06:15:06 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v70]
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 03:12:00 GMT, David Holmes wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Add missing new file
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 659:
>
>> 657: // Invariant: tmpReg == 0. tmpReg is EAX which is the implicit cmpxchg comparand.
>> 658: lock();
>> 659: cmpxchgptr(thread, Address(boxReg, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
>
> Sorry I don't quite follow the changes here as this appears to be changing the logic for all locking modes - aren't we still supposed to be cas'ing in the "box" (scrReg) in legacy mode rather than the "thread"?

IIRC, I did that in response to an earlier review by somebody. The previous logic transiently stored box into the owner and later, if the CAS succeeded, fetched the current thread* and stored that into owner, a few lines down from here. However, I just noticed that I did not remove that other code. So, for the sake of cleanliness of the legacy path, I'm going to revert this (we can and should make that change in a follow-up).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1183273733

From sjohanss at openjdk.org Wed May 3 08:36:40 2023
From: sjohanss at openjdk.org (Stefan Johansson)
Date: Wed, 3 May 2023 08:36:40 GMT
Subject: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 28 Apr 2023 15:57:49 GMT, Coleen Phillimore wrote:

>> Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Test refactor
>>  - Serguei review
>
> This looks good. Thanks for all the testing and adding the new test.

Thanks for the reviews @coleenp and @sspitsyn. Pushed two changes according to Serguei's suggestions.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13716#issuecomment-1532646363

From sjohanss at openjdk.org Wed May 3 08:36:22 2023
From: sjohanss at openjdk.org (Stefan Johansson)
Date: Wed, 3 May 2023 08:36:22 GMT
Subject: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2]
In-Reply-To:
References:
Message-ID: <9NU8MPRH1I0Bp-cxlDzYH5AWkVvde-GdlO3QfcQ4U4k=.abb31d82-81d7-4c8b-af08-6145bde05ec6@github.com>

> Hi all,
>
> Please review this change to avoid the CleanClassLoaderDataMetaspaces safepoint when there is nothing that can be cleaned up.
>
> **Summary**
> When transforming/redefining classes a previous version list is linked together in the InstanceKlass. The original class is added to this list if it is still used or shared. The difference between shared and used is not currently noted. This leads to a problem when doing concurrent class unloading, because during that we postpone some potential work to a safepoint (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it is triggered by the ServiceThread if there is work to be done, for example if InstanceKlass::_has_previous_versions is true.
>
> Since we currently do not differentiate between shared and "in use", we always set _has_previous_versions if anything is on this list. This, together with the fact that shared previous versions should never be cleaned out, leads to this safepoint being triggered after every concurrent class unloading even though there is nothing that can be cleaned out.
>
> This can be avoided by making sure the _previous_versions list is only cleaned when there are non-shared classes on it. This change renames `_has_previous_versions` to `_clean_previous_versions` and only updates it if we have non-shared classes on the list.
>
> **Testing**
> * A lot of manual testing verifying that we do get the safepoint when we should.
> * Added new test to verify expected behavior by parsing the logs.
The test uses JFR to trigger redefinition of some shared classes (when -Xshare:on).
> * Mach5 run of new test and tier 1-3

Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision:

 - Test refactor
 - Serguei review

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/13716/files
 - new: https://git.openjdk.org/jdk/pull/13716/files/39c3a1c1..834174f9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13716&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13716&range=00-01

Stats: 47 lines in 5 files changed: 13 ins; 2 del; 32 mod
Patch: https://git.openjdk.org/jdk/pull/13716.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13716/head:pull/13716

PR: https://git.openjdk.org/jdk/pull/13716

From rkennke at openjdk.org Wed May 3 09:33:24 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 3 May 2023 09:33:24 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v71]
In-Reply-To:
References:
Message-ID:

> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked').
The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection), so it is ok to do more complex operations there (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor.
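
The lock-stack behaviour described so far (fixed capacity of 8, overflow and recursion falling back to the slow path, ownership answered by a scan) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `ObjRef`, `LockStackSketch`, `try_push`, `try_pop` and `contains` are hypothetical names, the CAS on the object header and the actual inflation are left out, and the LIFO unlock check is an assumption about balanced locking rather than something the description states.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an object reference ("oop"); not HotSpot's type.
using ObjRef = const void*;

// Illustrative per-thread lock-stack. Names and layout are assumptions made
// for this sketch; only the capacity of 8 comes from the description above.
class LockStackSketch {
 public:
  static constexpr std::size_t kCapacity = 8;

  // Fast-path lock. Returns false when the caller would fall back to the
  // slow path and inflate to a full monitor: the stack is full, or the
  // object is already held by this thread (recursion is not supported).
  bool try_push(ObjRef obj) {
    assert(top_ <= kCapacity);
    if (top_ == kCapacity || contains(obj)) {
      return false;
    }
    slots_[top_++] = obj;
    return true;
  }

  // Fast-path unlock. Assumes balanced, LIFO locking, so the object being
  // unlocked is expected on top; anything else is left to the slow path.
  bool try_pop(ObjRef obj) {
    if (top_ == 0 || slots_[top_ - 1] != obj) {
      return false;
    }
    --top_;
    return true;
  }

  // "Does the current thread own me?" answered by a linear scan.
  bool contains(ObjRef obj) const {
    for (std::size_t i = 0; i < top_; ++i) {
      if (slots_[i] == obj) {
        return true;
      }
    }
    return false;
  }

  std::size_t size() const { return top_; }

 private:
  std::array<ObjRef, kCapacity> slots_{};
  std::size_t top_ = 0;
};
```

With at most 8 entries, the linear scan in `contains` touches a single small array, which is presumably why the description calls the ownership check quick; a caller that gets `false` from `try_push` would take whatever slow path the runtime provides.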
Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>
> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol.
This would be implemented in a separate PR
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
>
> #### Renaissance
>
> | | x86_64 | | | | aarch64 | | |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Address @dholmes-ora's review comments

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/10907/files
 - new: https://git.openjdk.org/jdk/pull/10907/files/423dbcdb..5d5a43dd

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=70
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=69-70

Stats: 38 lines in 5 files changed: 25 ins; 0 del; 13 mod
Patch: https://git.openjdk.org/jdk/pull/10907.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From stefank at openjdk.org Wed May 3 09:58:52 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 3 May 2023 09:58:52 GMT
Subject: RFR: 8307058: Implementation of Generational ZGC
Message-ID:

Hi all,

Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439.

The development of Generational ZGC started around the same time as the development of JDK 17.
At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. 
This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational ZGC have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach.

Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published. The patches that could be published separately are:

* 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class
* ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject
* 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp
* b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics
* a2824734d23 UPSTREAM: lir_xchg
* 36cd39c0126 UPSTREAM: assembler_ppc CMPLI
* 447259cea42 UPSTREAM: assembler_ppc ANDI
* 9417323499a UPSTREAM: Add VMErrorCallback infrastructure

Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional compared to what you are used to seeing in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR.
Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against:

git fetch https://github.com/openjdk/zgc zgc_master
git diff zgc_master...

There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it.

Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing.

-------------

Commit messages:
 - Whitespace fixes
 - Copyright fixes
 - Style, cleanups, and copyright years
 - Disable ThreadMemoryLeakTest.java for generational ZGC
 - Fix single gen too early verify_oop
 - Add vm.opt.final.ZGenerational to JFR event tests
 - Fix tenuring threshold bounds calculation
 - Sub code size x86_64
 - Stub code size aarch64
 - Fix TestStringDeduplicationTools.java for X
 - ... and 892 more: https://git.openjdk.org/jdk/compare/750bece0...62a4f788

Changes: https://git.openjdk.org/jdk/pull/13771/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8307058
Stats: 67415 lines in 690 files changed: 58209 ins; 4275 del; 4931 mod
Patch: https://git.openjdk.org/jdk/pull/13771.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771

PR: https://git.openjdk.org/jdk/pull/13771

From eosterlund at openjdk.org Wed May 3 10:01:28 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Wed, 3 May 2023 10:01:28 GMT
Subject: RFR: 8307058: Implementation of Generational ZGC
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 09:04:50 GMT, Stefan Karlsson wrote:

> Hi all,
>
> Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC.
> [...]

I have obviously stared at this code since its inception. To me it doesn't just look good, it looks fantastic.

-------------

Marked as reviewed by eosterlund (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13771#pullrequestreview-1410554817 From stefank at openjdk.org Wed May 3 10:55:49 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 10:55:49 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: > Hi all, > > Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. > > The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. 
We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention i s to give the users time to validate and deploy their workloads with the new GC implementation. > > Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued develop ment of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach. 
> > Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published. The patches that could be published separately are: > > * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class > * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject > * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp > * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics > * a2824734d23 UPSTREAM: lir_xchg > * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI > * 447259cea42 UPSTREAM: assembler_ppc ANDI > * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure > > Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: > > > git fetch https://github.com/openjdk/zgc zgc_master > git diff zgc_master... > > > There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. 
> > Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Fix PPC build after 8305668 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13771/files - new: https://git.openjdk.org/jdk/pull/13771/files/62a4f788..da7fdde5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771 From dfuchs at openjdk.org Wed May 3 11:31:14 2023 From: dfuchs at openjdk.org (Daniel Fuchs) Date: Wed, 3 May 2023 11:31:14 GMT Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl In-Reply-To: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> Message-ID: On Tue, 2 May 2023 17:57:14 GMT, Kevin Walls wrote: > Removal of class, looks like it was missed in the JDK9 removal of RMIIIOP. > This class is not referenced by other classes or tests. Marked as reviewed by dfuchs (Reviewer). Looks good to me. I probably wouldn't have bothered with removing the example of IIOP JMXServiceURL as arguably functional implementations of that might still exist (as long as they don't extend RMIIIOPServerImpl, which is not a requirement). The constructor of RMIIIOPServerImpl throws UnsupportedOperationException unconditionally, so there can't exist any functional subclasses of that class that could be instantiated on Java versions posterior to Java 9. As such removing that class from the public API sounds reasonable. 
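The reasoning about the unconditionally throwing constructor can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `AlwaysFailingBase` and `Sub` are hypothetical stand-ins, not the real `RMIIIOPServerImpl` or any JDK class. Every subclass constructor has to invoke a superclass constructor, so if the only superclass constructor always throws, no subclass instance can ever be created.

```java
public class UnconditionalThrowDemo {
    // Hypothetical stand-in for a class whose only constructor always throws.
    static class AlwaysFailingBase {
        protected AlwaysFailingBase() {
            throw new UnsupportedOperationException("no longer supported");
        }
    }

    // Any subclass constructor must call a superclass constructor first,
    // so constructing Sub always ends in the exception thrown above.
    static class Sub extends AlwaysFailingBase {
        Sub() {
            super();
        }
    }

    // Returns true only if a Sub instance could actually be created.
    static boolean canInstantiate() {
        try {
            new Sub();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("subclass instantiable: " + canInstantiate());
    }
}
```

Running `main` prints `subclass instantiable: false`, which is the property the review relies on when calling the removal safe.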
------------- PR Review: https://git.openjdk.org/jdk/pull/13758#pullrequestreview-1410697538 PR Comment: https://git.openjdk.org/jdk/pull/13758#issuecomment-1532865857 From duke at openjdk.org Wed May 3 12:04:15 2023 From: duke at openjdk.org (Afshin Zafari) Date: Wed, 3 May 2023 12:04:15 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com> References: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com> Message-ID: On Wed, 3 May 2023 05:42:29 GMT, David Holmes wrote: >> It seems much better to force implementors to provide this method. > > Why? They all seem to do exactly the same thing. After I moved the `registerCleanup` to the body of a `default` method in the interface, there is no need for the implementors of the `Finalizable` interface to provide this method. All of them can use the default one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1183592572 From mdoerr at openjdk.org Wed May 3 12:32:31 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 3 May 2023 12:32:31 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: <6A1nfkn9o4N_h6W4aY_0XT_jW5h478GmIF8B-ZNI4wk=.232e8290-55fd-4a7a-9341-ebb1522423e4@github.com> On Wed, 3 May 2023 10:55:49 GMT, Stefan Karlsson wrote: > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix PPC build after 8305668 Thanks for fixing PPC64! With this, the VM compiles and the `test/hotspot/jtreg/gc` tests are passing on linux PPC64le.
I'm glad to see this PR for JDK 21 LTS. It's a big step forward for ZGC. Congratulations! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1532942815 From stefank at openjdk.org Wed May 3 12:45:27 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 12:45:27 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: <6A1nfkn9o4N_h6W4aY_0XT_jW5h478GmIF8B-ZNI4wk=.232e8290-55fd-4a7a-9341-ebb1522423e4@github.com> References: <6A1nfkn9o4N_h6W4aY_0XT_jW5h478GmIF8B-ZNI4wk=.232e8290-55fd-4a7a-9341-ebb1522423e4@github.com> Message-ID: On Wed, 3 May 2023 12:29:15 GMT, Martin Doerr wrote: > Thanks for fixing PPC64! With this, the VM compiles and the `test/hotspot/jtreg/gc` tests are passing on linux PPC64le. > > I'm glad to see this PR for JDK 21 LTS. It's a big step forward for ZGC. Congratulations! Thanks for porting Generational ZGC to PPC! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1532964490 From mdoerr at openjdk.org Wed May 3 13:44:29 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 3 May 2023 13:44:29 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 10:55:49 GMT, Stefan Karlsson wrote: > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix PPC build after 8305668 "test/hotspot/jtreg/gc" and "test/hotspot/jtreg/compiler/gcbarriers" are also passing with JTREG="VM_OPTIONS=-XX:+UseZGC -XX:+ZGenerational" on linux PPC64le. I've quickly checked Spec JBB 2005 with ZGC performance. Generational mode was about 7% faster on Power10. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1533053221 From tschatzl at openjdk.org Wed May 3 13:53:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 13:53:27 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v3] In-Reply-To: References: Message-ID: On Wed, 26 Apr 2023 17:28:49 GMT, Chris Plummer wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> cplummer review > > SA changes look good. Thanks @plummercj @sspitsyn @albertnetymk for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/13643#issuecomment-1533064129 From tschatzl at openjdk.org Wed May 3 13:53:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 13:53:28 GMT Subject: Integrated: 8306836: Remove pinned tag for G1 heap regions In-Reply-To: References: Message-ID: <4wdBNSgTzWoVKhbSXY8vlBwj_3eE2pyB3knxVGWKDHk=.0225c1ae-6a26-4170-b2ea-1e85ea6e6a64@github.com> On Tue, 25 Apr 2023 13:49:05 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that removes the pinned tag from `HeapRegion`. > > So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. > > With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore.
Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". > > The (current) pinned flag is surprisingly little used, only for policy decisions. > > The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). > > Testing: tier1-3, gha > > Thanks, > Thomas This pull request has now been integrated. Changeset: fc76687c Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/fc76687c2fac39fcbf706c419bfa170b8efa5747 Stats: 62 lines in 18 files changed: 5 ins; 31 del; 26 mod 8306836: Remove pinned tag for G1 heap regions Reviewed-by: ayang, cjplummer, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13643 From kevinw at openjdk.org Wed May 3 14:12:29 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 3 May 2023 14:12:29 GMT Subject: RFR: 8305913: com/sun/jdi/JdbLastErrorTest.java failed with "'lastError = 42' missing from stdout/stderr" [v2] In-Reply-To: References: Message-ID: On Sat, 15 Apr 2023 10:15:20 GMT, Kevin Walls wrote: >> This test is failing often since 8304725 added a call to Thread::current_in_asgct(). This can end up being called e.g. when resolving calls, and then the OS last error value is lost. >> >> The test is reliable with a single warm-up call to getLastError.invoke() before the loop. >> >> The test was introduced when in JDK-8292302 a change was undone that had made JavaThread::threadObj call Thread::current_or_null_safe, as the use of TLS upset this case of accessing last error directly. >> >> This new Thread::current_in_asgct() case shows that the VM will find new ways to interfere with the last error value, or at least new VM code keeps wanting to call Thread::current. 
This testcase is kind of niche usage, so it is not an argument that VM code should not be calling Thread::current. If this test is to stay active, it needs to have this warm-up getLastError call. (If there are more issues, it might mean removing the test.) > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > comment update feedback Thanks for the comments and reviews - the updated Panama situation is that this test as written will still fail sometimes, but that is because the test is doing it wrong. An actual call to the native GetLastError can still overwrite the last error value. (Making a new call is likely to break the last error value just doing method resolution, at least the first time it happens.) But the answer to the original problem is that we now have Linker.Option.CaptureCallState which gives us the chance to capture last error when calling a MethodHandle, and read the stored last error code in a VarHandle. I should remove the test: it is redundant, calling set/get last error directly is not the way to do this, and CaptureCallState is tested in test/jdk/java/foreign/capturecallstate/TestCaptureCallState.java ------------- PR Comment: https://git.openjdk.org/jdk/pull/13481#issuecomment-1533093670 From kevinw at openjdk.org Wed May 3 14:12:30 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 3 May 2023 14:12:30 GMT Subject: Withdrawn: 8305913: com/sun/jdi/JdbLastErrorTest.java failed with "'lastError = 42' missing from stdout/stderr" In-Reply-To: References: Message-ID: On Fri, 14 Apr 2023 19:23:05 GMT, Kevin Walls wrote: > This test is failing often since 8304725 added a call to Thread::current_in_asgct(). This can end up being called e.g. when resolving calls, and then the OS last error value is lost. > > The test is reliable with a single warm-up call to getLastError.invoke() before the loop.
> > The test was introduced when in JDK-8292302 a change was undone that had made JavaThread::threadObj call Thread::current_or_null_safe, as the use of TLS upset this case of accessing last error directly. > > This new Thread::current_in_asgct() case shows that the VM will find new ways to interfere with the last error value, or at least new VM code keeps wanting to call Thread::current. This testcase is kind of niche usage, so it is not an argument that VM code should not be calling Thread::current. If this test is to stay active, it needs to have this warm-up getLastError call. (If there are more issues, it might mean removing the test.) This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/13481 From kevinw at openjdk.org Wed May 3 15:50:14 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 3 May 2023 15:50:14 GMT Subject: RFR: 8307362: Remove test com/sun/jdi/JdbLastErrorTest.java Message-ID: This should be a trivial change, to remove a test which is unreliable, and to remove its problem list entry. The test is unreliable, but also the updated Panama situation is that we now have Linker.Option.CaptureCallState which gives us the chance to capture last error when calling a MethodHandle, and read the stored last error code in a VarHandle. This is the way to reliably capture a last error value, and is tested in test/jdk/java/foreign/capturecallstate/TestCaptureCallState.java. The JdbLastErrorTest should be removed.
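Why a separate "get last error" call is fragile, while capturing the value as part of the call is not, can be shown with a plain-Java simulation. This sketch does not use the foreign-function API at all: the thread-local slot, `runtimeHousekeeping`, and every method name below are hypothetical stand-ins chosen for illustration, and the real mechanism is the one covered by TestCaptureCallState.java.

```java
public class LastErrorCaptureSketch {
    // Simulated per-thread "last error" slot, standing in for the OS-level value.
    private static final ThreadLocal<Integer> LAST_ERROR = ThreadLocal.withInitial(() -> 0);

    // Simulated native call that sets the error slot as a side effect.
    static void nativeCallThatFails(int code) {
        LAST_ERROR.set(code);
    }

    // Simulated runtime-internal work (for example call resolution) that
    // happens to overwrite the slot.
    static void runtimeHousekeeping() {
        LAST_ERROR.set(0);
    }

    // Fragile pattern: the error is read later, after other work may have run.
    static int readWithSeparateCall(int code) {
        nativeCallThatFails(code);
        runtimeHousekeeping();
        return LAST_ERROR.get();
    }

    // Capture pattern: the value is copied into caller-owned storage
    // immediately after the call, before any other work can run.
    static int readWithCapture(int code) {
        int[] captured = new int[1];
        nativeCallThatFails(code);
        captured[0] = LAST_ERROR.get();
        runtimeHousekeeping();
        return captured[0];
    }

    public static void main(String[] args) {
        System.out.println("separate call: " + readWithSeparateCall(42));
        System.out.println("captured:      " + readWithCapture(42));
    }
}
```

In this model the separate read returns 0 and the captured read returns 42, mirroring the "'lastError = 42' missing" failure mode and the reason capturing at call time is the reliable approach.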
------------- Commit messages: - 8307362: Remove test com/sun/jdi/JdbLastErrorTest.java Changes: https://git.openjdk.org/jdk/pull/13781/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13781&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307362 Stats: 104 lines in 2 files changed: 0 ins; 104 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13781/head:pull/13781 PR: https://git.openjdk.org/jdk/pull/13781 From sspitsyn at openjdk.org Wed May 3 18:01:17 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 3 May 2023 18:01:17 GMT Subject: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2] In-Reply-To: <9NU8MPRH1I0Bp-cxlDzYH5AWkVvde-GdlO3QfcQ4U4k=.abb31d82-81d7-4c8b-af08-6145bde05ec6@github.com> References: <9NU8MPRH1I0Bp-cxlDzYH5AWkVvde-GdlO3QfcQ4U4k=.abb31d82-81d7-4c8b-af08-6145bde05ec6@github.com> Message-ID: On Wed, 3 May 2023 08:36:22 GMT, Stefan Johansson wrote: >> Hi all, >> >> Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint when there is nothing that can be cleaned up. >> >> **Summary** >> When transforming/redefining classes a previous version list is linked together in the InstanceKlass. The original class is added to this list if it is still used or shared. The difference between shared and used is not currently noted. This leads to a problem when doing concurrent class unloading, because during that we postpone some potential work to a safepoint (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it is triggered by the ServiceThread if there is work to be done, for example if InstanceKlass::_has_previous_versions is true. >> >> Since we currently do not differentiate between shared and "in use" we always set _has_previous_versions if anything is on this list.
This together with the fact that shared previous versions should never be cleaned out leads to this safepoint being triggered after every concurrent class unloading even though there is nothing that can be cleaned out. >> >> This can be avoided by making sure the _previous_versions list is only cleaned when there are non-shared classes on it. This change renames `_has_previous_versions` to `_clean_previous_versions` and only updates it if we have non-shared classes on the list. >> >> **Testing** >> * A lot of manual testing verifying that we do get the safepoint when we should. >> * Added new test to verify expected behavior by parsing the logs. The test uses JFR to trigger redefinition of some shared classes (when -Xshare:on). >> * Mach5 run of new test and tier 1-3 > > Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision: > > - Test refactor > - Serguei review Thank you for the update. Looks good. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13716#pullrequestreview-1411506472 From dcubed at openjdk.org Wed May 3 18:54:15 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 3 May 2023 18:54:15 GMT Subject: RFR: 8307362: Remove test com/sun/jdi/JdbLastErrorTest.java In-Reply-To: References: Message-ID: On Wed, 3 May 2023 14:18:00 GMT, Kevin Walls wrote: > This should be a trivial change, to remove a test which is unreliable, and to remove its problem list entry. > > The test is unreliable, but also the updated Panama situation is that we now have Linker.Option.CaptureCallState which gives us the chance to capture last error when calling a MethodHandle, and read the stored last error code in a VarHandle. This is the way to reliably capture a last error value, and is tested in test/jdk/java/foreign/capturecallstate/TestCaptureCallState.java. > > The JdbLastErrorTest should be removed. Thumbs up. I agree this is a trivial fix. 
------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13781#pullrequestreview-1411586683 From cjplummer at openjdk.org Wed May 3 19:05:30 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 3 May 2023 19:05:30 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 10:55:49 GMT, Stefan Karlsson wrote: > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix PPC build after 8305668 test/hotspot/jtreg/ProblemList-generational-zgc.txt line 32: > 30: # Quiet all SA tests > 31: > 32: resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java 8000000 generic-all I'd suggest filing a bug calling out the lack of SA support for generational ZGC and add a comment that there are no plans to address this. test/jdk/ProblemList-generational-zgc.txt line 27: > 25: # > 26: # List of quarantined tests for testing with Generational ZGC. > 27: # Are the tests in `test/jdk/sun/tools/jhsdb/` not failing? test/jdk/com/sun/jdi/ThreadMemoryLeakTest.java line 30: > 28: * > 29: * @comment Don't allow -Xcomp or -Xint as they impact memory useage and number of iterations > 30: * @requires (vm.compMode == "Xmixed") & !(vm.gc.Z & vm.opt.final.ZGenerational) Seems like a bug should be filed for this failure and then problem listed. This test is a bit finicky w.r.t. the specified max heap size and how much memory ends up actually being used by the test. I can probably get it working without much of a problem.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184124372 PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184126199 PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184128793
From stefank at openjdk.org Wed May 3 19:36:55 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 19:36:55 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v3] In-Reply-To: References: Message-ID: Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Update SA ProblemList entries ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13771/files - new: https://git.openjdk.org/jdk/pull/13771/files/da7fdde5..40e8583b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=01-02 Stats: 81 lines in 1 file changed: 0 ins; 0 del; 81 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771
From stefank at openjdk.org Wed May 3 19:37:00 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 19:37:00 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 18:52:19 GMT, Chris Plummer wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix PPC build after 8305668 > > test/hotspot/jtreg/ProblemList-generational-zgc.txt line 32: > >> 30: # Quiet all SA tests >> 31: >> 32: resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java 8000000 generic-all > > I'd suggest filing a bug calling out the lack of SA support for generational ZGC and add a comment that there are no plans to address this. Sounds like a good idea. I've created JDK-8307393 and will update the problem list.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184167988 From stefank at openjdk.org Wed May 3 19:45:31 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 19:45:31 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 18:57:22 GMT, Chris Plummer wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix PPC build after 8305668 > > test/jdk/com/sun/jdi/ThreadMemoryLeakTest.java line 30: > >> 28: * >> 29: * @comment Don't allow -Xcomp or -Xint as they impact memory useage and number of iterations >> 30: * @requires (vm.compMode == "Xmixed") & !(vm.gc.Z & vm.opt.final.ZGenerational) > > Seems like a bug should be filed for this failure and then problem listed. This test is a bit finicky w.r.t. the specified max heap size and how much memory ends up actually being used by the test. I can probably get it working without much of a problem. Yes, the test was finicky with the heap size. Given that the leak it tries to provoke would be provoked by other GCs as well, we didn't think it was that important to run this particular test with Generational ZGC. If you still think that we should create a Bug and ProblemList it, I'll do so. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184180837 From cjplummer at openjdk.org Wed May 3 20:03:31 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 3 May 2023 20:03:31 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: <6mUJaCFaOVO8V3S6gtr6EZ7uRplgVXXJialmuWqdegM=.e114c23c-28e4-4b12-a99b-15b3a350d4f8@github.com> On Wed, 3 May 2023 19:42:01 GMT, Stefan Karlsson wrote: >> test/jdk/com/sun/jdi/ThreadMemoryLeakTest.java line 30: >> >>> 28: * >>> 29: * @comment Don't allow -Xcomp or -Xint as they impact memory useage and number of iterations >>> 30: * @requires (vm.compMode == "Xmixed") & !(vm.gc.Z & vm.opt.final.ZGenerational) >> >> Seems like a bug should be filed for this failure and then problem listed. This test is a bit finicky w.r.t. the specified max heap size and how much memory ends up actually being used by the test. I can probably get it working without much of a problem. > > Yes, the test was finicky with the heap size. Given that the leak it tries to provoke would be provoked by other GCs as well, we didn't think it was that important to run this particular test with Generational ZGC. If you still think that we should create a Bug and ProblemList it, I'll do so. When I first wrote this test, it ended up failing with ZGC because I hadn't tested it. I considered excluding it for the same reason you've given, but then considered that the test might expose a leak with one GC, but not others, so I decided to fix it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184197845 From stefank at openjdk.org Wed May 3 21:11:30 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 21:11:30 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 18:54:24 GMT, Chris Plummer wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix PPC build after 8305668 > > test/jdk/ProblemList-generational-zgc.txt line 27: > >> 25: # >> 26: # List of quarantined tests for testing with Generational ZGC. >> 27: # > > Are the tests in `test/jdk/sun/tools/jhsdb/` not failing?
It seems like these tests are only run with all GCs at the end of the development cycle. I've run them manually and verified that these tests fail as well. I'm going to problem list them.
That run also revealed that jstat doesn't like when we report the initial capacity of the old generation as zero. See the calculation in: src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options
  column {
    header "^O^" /* Old Space - Percent Used */
    data (1-((sun.gc.generation.1.space.0.capacity - sun.gc.generation.1.space.0.used)/sun.gc.generation.1.space.0.capacity)) * 100
    align right
    scale raw
    width 6
    format "0.00"
  }
I can work around the test problem by faking the capacity to be non-zero, but that's not a pretty solution IMO.
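The jstat expression quoted above divides by the old-generation capacity, so a reported capacity of zero is the problematic input. Below is a minimal Java sketch of that arithmetic. It assumes the expression is evaluated in double-precision floating point (an assumption; the thread does not say how jstat evaluates it), and the class and method names are hypothetical, chosen only for illustration.

```java
// Hypothetical illustration of the "Old Space - Percent Used" expression
// from jstat_options. Not taken from the JDK sources.
class OldGenPercentUsed {

    // Mirrors the quoted expression:
    //   (1 - ((capacity - used) / capacity)) * 100
    // Assumption: operands are treated as doubles.
    static double percentUsed(double capacity, double used) {
        return (1 - ((capacity - used) / capacity)) * 100;
    }

    public static void main(String[] args) {
        // A non-empty old generation: 256 of 1024 units used.
        System.out.println(percentUsed(1024, 256)); // prints 25.0

        // An old generation reported with zero capacity and zero usage:
        // the division becomes 0.0 / 0.0.
        System.out.println(percentUsed(0, 0)); // prints NaN
    }
}
```

With capacity and used both zero, the division is 0.0/0.0, which IEEE 754 arithmetic defines as NaN, and the NaN propagates through the remaining subtraction and multiplication. Under the floating-point assumption, that would explain why a zero initial capacity cannot be rendered with the `0.00` format.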
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184285686 From stefank at openjdk.org Wed May 3 21:30:34 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 21:30:34 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: <6mUJaCFaOVO8V3S6gtr6EZ7uRplgVXXJialmuWqdegM=.e114c23c-28e4-4b12-a99b-15b3a350d4f8@github.com> References: <6mUJaCFaOVO8V3S6gtr6EZ7uRplgVXXJialmuWqdegM=.e114c23c-28e4-4b12-a99b-15b3a350d4f8@github.com> Message-ID: On Wed, 3 May 2023 20:00:42 GMT, Chris Plummer wrote: >> Yes, the test was finicky with the heap size. Given that the leak it tries to provoke would be provoked by other GCs as well, we didn't think it was that important to run this particular test with Generational ZGC. If you still think that we should create a Bug and ProblemList it, I'll do so. > > When I first wrote this test, it ended up failing with ZGC because I hadn't tested it. I considered excluding it for the same reason you've given, but then considered that the test might expose a leak with one GC, but not others, so I decided to fix it. I've created JDK-8307402. I'll push a problem list entry. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184308134 From mdoerr at openjdk.org Wed May 3 21:35:28 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 3 May 2023 21:35:28 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v3] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 19:36:55 GMT, Stefan Karlsson wrote:
> > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Update SA ProblemList entries I'm getting build warnings on all linux platforms with gcc-11.3.0:
```
src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use "minor", include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro "minor", you should undefine it after including <sys/types.h>. [-Werror]
   84 | ZDriverMinor* ZDriver::minor() {
```
------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1533781342
From stefank at openjdk.org Wed May 3 21:48:12 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 21:48:12 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v4] In-Reply-To: References: Message-ID: Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: ProblemList ThreadMemoryLeakTest.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13771/files - new: https://git.openjdk.org/jdk/pull/13771/files/40e8583b..9cb32f4c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=02-03 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771
From stefank at openjdk.org Wed May 3 22:01:10 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 22:01:10 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v5] In-Reply-To: References: Message-ID: > Hi all, > > Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to.
The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. > > The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. > > Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before.
When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach. > > Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published.
The patches that could be published separately are: > > * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class > * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject > * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp > * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics > * a2824734d23 UPSTREAM: lir_xchg > * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI > * 447259cea42 UPSTREAM: assembler_ppc ANDI > * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure > > Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: > > > git fetch https://github.com/openjdk/zgc zgc_master > git diff zgc_master... > > > There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. > > Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. 
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: ProblemList jhsdb tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13771/files - new: https://git.openjdk.org/jdk/pull/13771/files/9cb32f4c..d65523f5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=03-04 Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771 From stefank at openjdk.org Wed May 3 22:01:42 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 May 2023 22:01:42 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v3] In-Reply-To: References: Message-ID: <45EiQagy_IO6JBPslCPdMF0_Ab5tGpaPLPr-AtgmleI=.159d0eb4-f759-4d28-8872-407598dec193@github.com> On Wed, 3 May 2023 21:32:54 GMT, Martin Doerr wrote: > I'm getting build warnings on all linux platforms with gcc-11.3.0: > > ``` > src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined > by <sys/sysmacros.h>. For historical compatibility, it is > currently defined by <sys/types.h> as well, but we plan to > remove this soon. To use "minor", include <sys/sysmacros.h> > directly. If you did not intend to use a system-defined macro > "minor", you should undefine it after including <sys/types.h>. [-Werror] > 84 | ZDriverMinor* ZDriver::minor() { > ``` That's unfortunate as minor and major are quite central to Generational ZGC and having to rename those functions will make the code look worse. I wonder if we should undef minor and major where needed.
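For readers following along, the "undef where needed" idea might look roughly like the sketch below. This is only an illustration under stated assumptions, not the change that went into the PR: glibc's <sys/sysmacros.h> defines `major` and `minor` as macros, and older glibc versions also expose them through <sys/types.h>. The `Driver` type and `sanity_check` function are hypothetical stand-ins, not HotSpot code.

```cpp
#include <cassert>
#include <sys/types.h>   // on older glibc this also drags in the major/minor macros

// Drop the system macros so the names can be used as ordinary identifiers.
// #undef of a name that is not currently a macro is a no-op, so this is
// harmless on platforms (or glibc versions) that never defined them.
#undef minor
#undef major

// Hypothetical stand-in for ZDriver; the return values are arbitrary.
struct Driver {
  static int minor() { return 1; }
  static int major() { return 2; }
};

// Small self-check showing that both names now resolve to the member functions.
inline void sanity_check() {
  assert(Driver::minor() != Driver::major());
}
```

The trade-off is that any code after the `#undef` can no longer use the device-number macros, so a sketch like this only fits translation units that do not need them.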
------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1533806231 From amenkov at openjdk.org Wed May 3 22:02:30 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 3 May 2023 22:02:30 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v10] In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: > The fix updates JVMTI FollowReferences implementation to report references from virtual threads: > - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; > - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; > - common code to handle stack frames are moved into separate class; > > Threads are reported as: > - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); > - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; > - unmounted vthreads: not reported as heap roots. 
Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13254/files - new: https://git.openjdk.org/jdk/pull/13254/files/dd3be3b1..1e6ca207 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=08-09 Stats: 87 lines in 1 file changed: 22 ins; 28 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/13254.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254 PR: https://git.openjdk.org/jdk/pull/13254 From amenkov at openjdk.org Wed May 3 22:07:26 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 3 May 2023 22:07:26 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Tue, 2 May 2023 10:10:32 GMT, Serguei Spitsyn wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> Added "no continuations" test case > > src/hotspot/share/prims/jvmtiTagMap.cpp line 2245: > >> 2243: bool is_top_frame; >> 2244: int depth; >> 2245: frame* last_entry_frame; > > The field names of a helper class are usually started with '_' symbol. renamed all fields > src/hotspot/share/prims/jvmtiTagMap.cpp line 2319: > >> 2317: } >> 2318: } >> 2319: } > > The fragments 2289-2303 and 2305-2319 are based on the `StackValueCollection` and look very similar. 
> It can be worth to refactor these fragments into two function calls: > > bool report_stack_value_collection(jmethodID method, int idx_base, > StackValueCollection* elems, jlocation bci) { > for (int index = 0; index < exprs->size(); index++) { > if (exprs->at(index)->type() == T_OBJECT) { > oop obj = elems->obj_at(index)(); > if (obj == nullptr) { > continue; > } > // stack reference > if (!CallbackInvoker::report_stack_ref_root(thread_tag, tid, depth, method, > bci, idx_base + index, obj)) { > return false; > } > } > } > return true; // ??? > > . . . . . > jlocation bci = (jlocation)jvf->bci(); > StackValueCollection* locals = jvf->locals(); > if (!report_stack_value_collection(method, locals, 0 /* idx_base*/, bci)) { > return false; > } > StackValueCollection* exprs = jvf->expressions(); > if (!report_stack_value_collection(method, exprs, locals->size(), bci)) { > return false; > } > > Other complete fragments can be also implemented as separate functions: > 2321-2328 (?), 2330-2351 refactored. > src/hotspot/share/prims/jvmtiTagMap.cpp line 2796: > >> 2794: if (!java_thread->has_last_Java_frame()) { >> 2795: // this may be only platform thread >> 2796: assert(mounted_vt == nullptr, "must be"); > > I'm not sure this assert is right. > I think, a virtual thread may have an empty stack observable from a VM_op, > for instance when it is in a process of being terminated. > Though, it is not that easy to make this assert fired with a test case and prove this can happen. > Another danger is that a virtual thread can be observed from a VM_op as in a VTMS (mount/unmount) transition. I need to think a little bit about possible consequences. Is it better to treat current thread identity as of a carrier thread in such a case? removed the assert for safety. 
I have no idea how vthread stack (frames on carrier thread and stack chunks) can look like during VTMS transitions (and it's very hard to reproduce the case by test) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184336378 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184337458 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184335758 From sspitsyn at openjdk.org Thu May 4 01:58:20 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 4 May 2023 01:58:20 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v10] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Wed, 3 May 2023 22:02:30 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > feedback src/hotspot/share/prims/jvmtiTagMap.cpp line 2231: > 2229: > 2230: // Helper class to collect/report stack roots. 
> 2231: class StackRootCollector { We discussed privately about the following renamings: - `StackRootCollector` => `StackRefCollector` - `collect_stack_roots` => `collect_stack_refs` - `collect_vthread_stack_roots` => `collect_vthread_stack_refs` src/hotspot/share/prims/jvmtiTagMap.cpp line 2284: > 2282: for (int index = 0; index < values->size(); index++) { > 2283: if (values->at(index)->type() == T_OBJECT) { > 2284: oop o = values->obj_at(index)(); I'd suggest to get rid of one-letter identifier like `o` and `c`. They variables can be renamed to `obj` and `cont` instead. It'd better to rename `slot_offset` to `offset`. src/hotspot/share/prims/jvmtiTagMap.cpp line 2893: > 2891: HandleMark hm(current_thread); > 2892: > 2893: StackChunkFrameStream fs(chunk); There are ways to avoid using the `StackChunkFrameStream`. You can find good examples in the jvmtiEnvBase.cpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184469330 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184466352 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184470111 From sspitsyn at openjdk.org Thu May 4 01:58:22 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 4 May 2023 01:58:22 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Wed, 3 May 2023 22:04:37 GMT, Alex Menkov wrote: >> src/hotspot/share/prims/jvmtiTagMap.cpp line 2319: >> >>> 2317: } >>> 2318: } >>> 2319: } >> >> The fragments 2289-2303 and 2305-2319 are based on the `StackValueCollection` and look very similar. 
>> It can be worth to refactor these fragments into two function calls: >> >> bool report_stack_value_collection(jmethodID method, int idx_base, >> StackValueCollection* elems, jlocation bci) { >> for (int index = 0; index < exprs->size(); index++) { >> if (exprs->at(index)->type() == T_OBJECT) { >> oop obj = elems->obj_at(index)(); >> if (obj == nullptr) { >> continue; >> } >> // stack reference >> if (!CallbackInvoker::report_stack_ref_root(thread_tag, tid, depth, method, >> bci, idx_base + index, obj)) { >> return false; >> } >> } >> } >> return true; // ??? >> >> . . . . . >> jlocation bci = (jlocation)jvf->bci(); >> StackValueCollection* locals = jvf->locals(); >> if (!report_stack_value_collection(method, locals, 0 /* idx_base*/, bci)) { >> return false; >> } >> StackValueCollection* exprs = jvf->expressions(); >> if (!report_stack_value_collection(method, exprs, locals->size(), bci)) { >> return false; >> } >> >> Other complete fragments can be also implemented as separate functions: >> 2321-2328 (?), 2330-2351 > > refactored. It'd be nice to do even more factoring + renaming. The lines 2326-2345 can be refactored to a function: bool StackRootCollector::report_native_frame_refs(jmethodID method) { _blk->set_context(_thread_tag, _tid, _depth, method); if (_is_top_frame) { // JNI locals for the top frame. assert(_java_thread != nullptr, "sanity"); _java_thread->active_handles()->oops_do(_blk); if (_blk->stopped()) { return false; } } else { if (_last_entry_frame != nullptr) { // JNI locals for the entry frame assert(_last_entry_frame->is_entry_frame(), "checking"); _last_entry_frame->entry_frame_call_wrapper()->handles()->oops_do(_blk); if (_blk->stopped()) { return false; } } } return true; } The function `report_stack_refs` can be renamed to `report_java_frame_refs` to make function name more consistent. 
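The shared-helper refactoring discussed in this thread (one function that walks a value collection and is called once for locals and once for expressions, with an index base) can be sketched in isolation as follows. This is a simplified illustration, not the jvmtiTagMap.cpp code: `Slot`, `RefCallback`, `report_slots` and `report_frame` are hypothetical names, and a plain `std::vector` stands in for `StackValueCollection`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for one stack slot.
struct Slot {
  bool is_object;     // models type() == T_OBJECT
  const void* obj;    // nullptr models a null oop
};

// Callback receives (slot index, object); returning false stops the walk.
using RefCallback = std::function<bool(int, const void*)>;

// Report every non-null object slot in 'values', offsetting indices by idx_base.
// Returns false as soon as the callback asks to stop, true otherwise.
bool report_slots(const std::vector<Slot>& values, int idx_base, const RefCallback& cb) {
  for (std::size_t i = 0; i < values.size(); i++) {
    const Slot& s = values[i];
    if (!s.is_object || s.obj == nullptr) {
      continue;
    }
    if (!cb(idx_base + static_cast<int>(i), s.obj)) {
      return false;
    }
  }
  return true;
}

// Locals are reported first; expression-stack slots follow, offset by locals.size().
bool report_frame(const std::vector<Slot>& locals,
                  const std::vector<Slot>& exprs,
                  const RefCallback& cb) {
  return report_slots(locals, 0, cb) &&
         report_slots(exprs, static_cast<int>(locals.size()), cb);
}
```

Note that the helper iterates over the collection it was given and that both call sites pass arguments in the same order, the two details that are easy to get wrong when the loops are first merged.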
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1184463655 From lmesnik at openjdk.org Thu May 4 03:54:20 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 4 May 2023 03:54:20 GMT Subject: Integrated: 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode In-Reply-To: References: Message-ID: On Tue, 2 May 2023 21:16:20 GMT, Leonid Mesnik wrote: > The debugger tests might start all debugee using virtual threads when property "main.wrapper" is set. > However, the new mode JTREG_TEST_THREAD_FACTORY do the same thing. The only difference is that it use predefined problemlist names and doesn't set property "main.wrapper" as a part of jtreg properties. So nsk wrapper should set it additionally. This pull request has now been integrated. Changeset: caee1bea Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/caee1beaaff7c11d5cc07fe31d04d8bf656b7a36 Stats: 3 lines in 4 files changed: 2 ins; 0 del; 1 mod 8307305: Update debugger tests to support JTREG_TEST_THREAD_FACTORY mode Reviewed-by: cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/13763 From rehn at openjdk.org Thu May 4 05:14:18 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 4 May 2023 05:14:18 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option [v2] In-Reply-To: References: Message-ID: <89mSdHMNu940PF6SHvxp7dyvTA6rENTZUcJiPn1fv0Y=.e8c2b8e2-3978-4a74-884a-8effab7a1b55@github.com> On Tue, 2 May 2023 20:37:18 GMT, Daniel D. Daugherty wrote: >> A trivial fix to remove broken EnableThreadSMRExtraValidityChecks option. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > dholmes CR - change ':' to '.'. Looks good, thanks! ------------- Marked as reviewed by rehn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13704#pullrequestreview-1412295770 From kbarrett at openjdk.org Thu May 4 05:35:29 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 May 2023 05:35:29 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v4] In-Reply-To: References: Message-ID: <6sav8G_h5tJF6Chc-hLzW2k_7WtHPc6uk5Fr7zmuGSM=.bcece9ab-0f31-4f84-8dcf-05e530cac9df@github.com> On Thu, 27 Apr 2023 12:31:24 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). >> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > remove is_young_gc_movable in full gc code src/hotspot/share/gc/g1/g1CollectionSetChooser.hpp line 57: > 55: // Determine whether to add the given region to the collection set candidates or > 56: // not. Currently, we skip regions that we will never move during young gc, and > 57: // regions which liveness is below the occupancy threshold. 
s/liveness is below/liveness is over/ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13643#discussion_r1181174243 From kbarrett at openjdk.org Thu May 4 05:35:26 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 May 2023 05:35:26 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 16:47:06 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). >> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove is_young_gc_movable Sorry to be late to the review. I noticed a problem in a comment. 
------------- PR Review: https://git.openjdk.org/jdk/pull/13643#pullrequestreview-1407061087 From alanb at openjdk.org Thu May 4 05:53:20 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 4 May 2023 05:53:20 GMT Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl In-Reply-To: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> Message-ID: On Tue, 2 May 2023 17:57:14 GMT, Kevin Walls wrote: > Removal of class, looks like it was missed in the JDK9 removal of RMIIIOP. > This class is not referenced by other classes or tests. Removing RMIIIOPServerImpl is fine, I think I agree with Daniel on leaving out the change to RMIConnector. It may use "iiop" but it just an example. ------------- Marked as reviewed by alanb (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13758#pullrequestreview-1412327269 From kevinw at openjdk.org Thu May 4 06:34:21 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 4 May 2023 06:34:21 GMT Subject: RFR: 8307362: Remove test com/sun/jdi/JdbLastErrorTest.java In-Reply-To: References: Message-ID: On Wed, 3 May 2023 14:18:00 GMT, Kevin Walls wrote: > This should be a trivial change, to remove a test which is unreliable, and to remove its problem list entry. > > The test is unreliable, but also the updated Panama situation is that we now have Linker.Option.CaptureCallState which gives us the chance to capture last error when calling a MethodHandle, and read the stored last error code in a VarHandle. This is the way to reliably capture a last error value, and is tested in test/jdk/java/foreign/capturecallstate/TestCaptureCallState.java. > > The JdbLastErrorTest should be removed. Thanks Dan. No more noise from this test! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13781#issuecomment-1534156373 From kevinw at openjdk.org Thu May 4 06:34:23 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 4 May 2023 06:34:23 GMT Subject: Integrated: 8307362: Remove test com/sun/jdi/JdbLastErrorTest.java In-Reply-To: References: Message-ID: <0tAbkOmgDSgiKgqdmpJJT8o8FIwnGv_E-DkwjK3M1B8=.e971a422-a3dc-45df-bbb7-068dce1c2834@github.com> On Wed, 3 May 2023 14:18:00 GMT, Kevin Walls wrote: > This should be a trivial change, to remove a test which is unreliable, and to remove its problem list entry. > > The test is unreliable, but also the updated Panama situation is that we now have Linker.Option.CaptureCallState which gives us the chance to capture last error when calling a MethodHandle, and read the stored last error code in a VarHandle. This is the way to reliably capture a last error value, and is tested in test/jdk/java/foreign/capturecallstate/TestCaptureCallState.java. > > The JdbLastErrorTest should be removed. This pull request has now been integrated. 
Changeset: e206d57b Author: Kevin Walls URL: https://git.openjdk.org/jdk/commit/e206d57bfc09032e17d09714fc54ab2f5e961792 Stats: 104 lines in 2 files changed: 0 ins; 104 del; 0 mod 8307362: Remove test com/sun/jdi/JdbLastErrorTest.java Reviewed-by: dcubed ------------- PR: https://git.openjdk.org/jdk/pull/13781 From kevinw at openjdk.org Thu May 4 06:37:15 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 4 May 2023 06:37:15 GMT Subject: RFR: 8307244: Remove redundant class RMIIIOPServerImpl In-Reply-To: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> References: <69iXE2clpZugZG7uCzAmZfSzAaRBTNJnnPpLnVgWb2M=.dcbc8323-489d-4a75-85b1-a7f0b5fea7ba@github.com> Message-ID: <7BBC2BD3YUjkcPpAC1M01iXNX-OoUS1B3I9AOmR1qhc=.8b1c06dd-4a5d-45b2-a269-33365eae6971@github.com> On Tue, 2 May 2023 17:57:14 GMT, Kevin Walls wrote: > Removal of class, looks like it was missed in the JDK9 removal of RMIIIOP. > This class is not referenced by other classes or tests. Thanks for all the comments and reviews. On the RMIConnector example, I had thought it was misleading to use as an example something which used to work, but which no longer works. But this change doesn't affect whether other address syntaxes are recognised. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13758#issuecomment-1534159706 From dholmes at openjdk.org Thu May 4 06:52:15 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 May 2023 06:52:15 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com> Message-ID: On Wed, 3 May 2023 12:01:10 GMT, Afshin Zafari wrote: > All of them can use the default one Exactly my point. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1184606381 From azeller at openjdk.org Thu May 4 07:38:12 2023 From: azeller at openjdk.org (Arno Zeller) Date: Thu, 4 May 2023 07:38:12 GMT Subject: RFR: 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS Message-ID: Unless this test is run as root, it needs sudo privileges. If it gets them, the test runs fine, but leaves a file with root ownership. So jtreg cannot delete it, and you see errors when "make clean" tries to delete it. It's best that we just don't run the test on macOS if sudo privileges are needed. ------------- Commit messages: - JDK-8307347: Skip on macOS in case sudo is needed Changes: https://git.openjdk.org/jdk/pull/13791/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13791&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307347 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13791.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13791/head:pull/13791 PR: https://git.openjdk.org/jdk/pull/13791 From yyang at openjdk.org Thu May 4 08:40:10 2023 From: yyang at openjdk.org (Yi Yang) Date: Thu, 4 May 2023 08:40:10 GMT Subject: RFR: JDK-8306441: Segmented heap dump [v4] In-Reply-To: <8YqPPHSW4K1s0t317Kp6UqvoGuv5v9oCbjtQ9FX8p2o=.0f6c687b-d031-401d-901d-1ec532715cdc@github.com> References: <8YqPPHSW4K1s0t317Kp6UqvoGuv5v9oCbjtQ9FX8p2o=.0f6c687b-d031-401d-901d-1ec532715cdc@github.com> Message-ID: <6fGz-XulrkMTHQSMPlSvCa-nZwpxf5eglnQXfN1HN3c=.481b0390-8d98-4736-8948-938a7dd94b58@github.com> > Hi, a heap dump pauses the application's execution (STW), which is a well-known pain point. JDK-8252842 added parallel support to heap dump in an attempt to alleviate this issue. However, all concurrent threads compete to write heap data to the same file, and more memory is required to maintain the concurrent buffer queue.
In experiments, we did not feel a significant performance improvement from that. > > The minor-pause solution, which is presented in this PR, is a two-stage segmented heap dump: > > 1. Stage One(STW): Concurrent threads directly write data to multiple heap files. > 2. Stage Two(Non-STW): Merge multiple heap files into one complete heap dump file. > > Now concurrent worker threads are not required to maintain a buffer queue, which would result in more memory overhead, nor do they need to compete for locks. It significantly reduces 73~80% application pause time. > > | memory | numOfThread | STW | Total | > | --- | --------- | -------------- | ------------ | > | 8g | 1 thread | 15.612 secs | 15.612 secs | > | 8g | 32 thread | 2.5617250 secs | 14.498 secs | > | 8g | 96 thread | 2.6790452 secs | 14.012 secs | > | 16g | 1 thread | 26.278 secs | 26.278 secs | > | 16g | 32 thread | 5.2313740 secs | 26.417 secs | > | 16g | 96 thread | 6.2445556 secs | 27.141 secs | > | 32g | 1 thread | 48.149 secs | 48.149 secs | > | 32g | 32 thread | 10.7734677 secs | 61.643 secs | > | 32g | 96 thread | 13.1522042 secs | 61.432 secs | > | 64g | 1 thread | 100.583 secs | 100.583 secs | > | 64g | 32 thread | 20.9233744 secs | 134.701 secs | > | 64g | 96 thread | 26.7374116 secs | 126.080 secs | > | 128g | 1 thread | 233.843 secs | 233.843 secs | > | 128g | 32 thread | 72.9945768 secs | 207.060 secs | > | 128g | 96 thread | 67.6815929 secs | 336.345 secs | > >> **Total** means the total heap dump including both two phases >> **STW** means the first phase only. >> For parallel dump, **Total** = **STW** + **Merge**. For serial dump, **Total** = **STW** > > ![image](https://user-images.githubusercontent.com/5010047/234534654-6f29a3af-dad5-46bc-830b-7449c80b4dec.png) > > In actual testing, two-stage solution can lead to an increase in the overall time for heapdump(See table above). However, considering the reduction of STW time, I think it is an acceptable trade-off. 
Furthermore, there is still room for optimization in the second merge stage(e.g. sendfile/splice/copy_file_range instead of read+write combination). Since number of parallel dump thread has a considerable impact on total dump time, I added a parameter that allows users to specify the number of parallel dump thread they wish to run. > > ##### Open discussion > > - Pauseless heap dump solution? > An alternative pauseless solution is to fork a child process, set the parent process heap to read-only, and dump the heap in child process. Once writing happens in parent process, child process observes them by userfaultfd and corresponding pages are prioritized for dumping. I'm also looking forward to hearing comments and discussions about this solution. > > - Client parser support for segmented heap dump > This patch provides a possibility that whether heap dump needs to be complete or not, can the VM directly generate segmented heapdump, and let the client parser complete the merge process? Looking forward to hearing comments from the Eclipse MAT community Yi Yang has updated the pull request incrementally with one additional commit since the last revision: remove useless scope ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13667/files - new: https://git.openjdk.org/jdk/pull/13667/files/9e563ca7..9ee6fe4b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13667&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13667&range=02-03 Stats: 5 lines in 1 file changed: 1 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13667.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13667/head:pull/13667 PR: https://git.openjdk.org/jdk/pull/13667 From stefank at openjdk.org Thu May 4 09:40:33 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 May 2023 09:40:33 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 21:08:39 GMT, Stefan Karlsson wrote: >> 
test/jdk/ProblemList-generational-zgc.txt line 27: >> >>> 25: # >>> 26: # List of quarantined tests for testing with Generational ZGC. >>> 27: # >> >> Are the tests in `test/jdk/sun/tools/jhsdb/` not failing? > > It seems like these tests are only run with all GCs at the end of the development cycle. I've run them manually and verified that these tests fail as well. I'm going to problem list them. > > That run also revealed that jstat doesn't like when we report the initial capacity of the old generation as zero. See the calculation in: > src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options > > column { > header "^O^" /* Old Space - Percent Used */ > data (1-((sun.gc.generation.1.space.0.capacity - sun.gc.generation.1.space.0.used)/sun.gc.generation.1.space.0.capacity)) * 100 > align right > scale raw > width 6 > format "0.00" > } > > > I can work around the test problem by faking the capacity to be non-zero, but that's not a pretty solution IMO. The jhsdb tests have been ProblemListed. The jstat test is going to be fixed with #13796. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1184781772 From stefank at openjdk.org Thu May 4 09:44:17 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 May 2023 09:44:17 GMT Subject: RFR: 8307428: jstat tests doesn't tolerate dash in the O column Message-ID: When running jstat tests like the following: test/jdk/sun/tools/jstatd/TestJstatdServer.java with Generational ZGC we get a failure because the O (old generation percentage) is reported as `-` and not a number. The reason why it is reported as `-` is that the current capacity of the old generation is zero and that leads to a divide-by-zero in this line: https://github.com/openjdk/jdk/blob/82a8e91ef7c3b397f9cce3854722cfe4bace6f2e/src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options#L1029 G1 has some workarounds for this situation where the reported capacity is slightly above 0. 
I'm a bit reluctant to add such a hack into Generational ZGC. I've talked to the jstat maintainers and they propose that we simply relax the test. Tested locally by running the jstat/jstad tests in the Generational ZGC branch. ------------- Commit messages: - 8307428: jstat tests doesn't tolerate dash in the O column Changes: https://git.openjdk.org/jdk/pull/13796/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13796&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307428 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13796.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13796/head:pull/13796 PR: https://git.openjdk.org/jdk/pull/13796 From kevinw at openjdk.org Thu May 4 09:49:17 2023 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 4 May 2023 09:49:17 GMT Subject: RFR: 8307428: jstat tests doesn't tolerate dash in the O column In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:33:49 GMT, Stefan Karlsson wrote: > When running jstat tests like the following: > test/jdk/sun/tools/jstatd/TestJstatdServer.java > > with Generational ZGC we get a failure because the O (old generation percentage) is reported as `-` and not a number. The reason why it is reported as `-` is that the current capacity of the old generation is zero and that leads to a divide-by-zero in this line: > https://github.com/openjdk/jdk/blob/82a8e91ef7c3b397f9cce3854722cfe4bace6f2e/src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options#L1029 > > G1 has some workarounds for this situation where the reported capacity is slightly above 0. I'm a bit reluctant to add such a hack into Generational ZGC. I've talked to the jstat maintainers and they propose that we simply relax the test. > > Tested locally by running the jstat/jstad tests in the Generational ZGC branch. Looks good to me. The raw size values are correctly reported as zero, e.g. 
from -gcoldcapacity, and the problem was only in a column where a percentage was calculated. Divide by zero is shown as a "-", so the O column is like what we can already see for S0 and S1. We do have warnings in the jstat man page about format changing, and permitting a dash for the O column is a very minor change which makes sense. I don't see any surprise for the user here. ------------- Marked as reviewed by kevinw (Committer). PR Review: https://git.openjdk.org/jdk/pull/13796#pullrequestreview-1412737989 From stuefe at openjdk.org Thu May 4 09:50:16 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 4 May 2023 09:50:16 GMT Subject: RFR: 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS In-Reply-To: References: Message-ID: On Thu, 4 May 2023 07:30:49 GMT, Arno Zeller wrote: > Unless this test is run as root, it needs sudo privileges. If it gets them, the test runs fine, but leaves a file with root ownership. So jtreg cannot delete it, and you see errors when "make clean" tries to delete it. > It's best that we just don't run the test on OSX if sudo privileges. Seems reasonable. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13791#pullrequestreview-1412739984 From aboldtch at openjdk.org Thu May 4 09:53:32 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 4 May 2023 09:53:32 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v3] In-Reply-To: <45EiQagy_IO6JBPslCPdMF0_Ab5tGpaPLPr-AtgmleI=.159d0eb4-f759-4d28-8872-407598dec193@github.com> References: <45EiQagy_IO6JBPslCPdMF0_Ab5tGpaPLPr-AtgmleI=.159d0eb4-f759-4d28-8872-407598dec193@github.com> Message-ID: On Wed, 3 May 2023 21:58:25 GMT, Stefan Karlsson wrote: > I'm getting build warnings on all linux platforms with gcc-11.3.0: > > ``` > src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined > by . 
For historical compatibility, it is > currently defined by as well, but we plan to > remove this soon. To use "minor", include > directly. If you did not intend to use a system-defined macro > "minor", you should undefine it after including . [-Werror] > 84 | ZDriverMinor* ZDriver::minor() { > ``` @TheRealMDoerr I cannot reproduce this with gcc but can see the issue with clangd. Can you check if this patch solves the issue you are seeing? diff --git a/src/hotspot/share/gc/z/zDriver.hpp b/src/hotspot/share/gc/z/zDriver.hpp index 640ea6575ef..7fa650b1fa1 100644 --- a/src/hotspot/share/gc/z/zDriver.hpp +++ b/src/hotspot/share/gc/z/zDriver.hpp @@ -29,6 +29,14 @@ #include "gc/z/zThread.hpp" #include "gc/z/zTracer.hpp" +#ifdef minor +#undef minor +#endif + +#ifdef major +#undef major +#endif + class VM_ZOperation; class ZDriverMinor; class ZDriverMajor; ------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1534438516 From tschatzl at openjdk.org Thu May 4 09:55:12 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 4 May 2023 09:55:12 GMT Subject: RFR: 8307428: jstat tests doesn't tolerate dash in the O column In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:33:49 GMT, Stefan Karlsson wrote: > When running jstat tests like the following: > test/jdk/sun/tools/jstatd/TestJstatdServer.java > > with Generational ZGC we get a failure because the O (old generation percentage) is reported as `-` and not a number. The reason why it is reported as `-` is that the current capacity of the old generation is zero and that leads to a divide-by-zero in this line: > https://github.com/openjdk/jdk/blob/82a8e91ef7c3b397f9cce3854722cfe4bace6f2e/src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options#L1029 > > G1 has some workarounds for this situation where the reported capacity is slightly above 0. I'm a bit reluctant to add such a hack into Generational ZGC. 
I've talked to the jstat maintainers and they propose that we simply relax the test. > > Tested locally by running the jstat/jstad tests in the Generational ZGC branch. Is it possible to remove the G1 hack in this change too? Because since now a zero value is supported in the output, there does not seem to be a reason to keep it for G1. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13796#issuecomment-1534441302 From duke at openjdk.org Thu May 4 10:18:26 2023 From: duke at openjdk.org (Alexey Pavlyutkin) Date: Thu, 4 May 2023 10:18:26 GMT Subject: Withdrawn: 8306437: jhsdb cannot resolve image/symbol paths being used for analysis of Windows coredumps In-Reply-To: <62oUnZxCgEiiGjk1I9qb_1M_Zl0Sx4UvFeCk2S7mY00=.15eb4fda-eb88-4f9c-9d73-6fba9da60692@github.com> References: <62oUnZxCgEiiGjk1I9qb_1M_Zl0Sx4UvFeCk2S7mY00=.15eb4fda-eb88-4f9c-9d73-6fba9da60692@github.com> Message-ID: On Wed, 19 Apr 2023 10:39:40 GMT, Alexey Pavlyutkin wrote: > Hi! The patch fixes image/symbol lookup by jhsdb during analysis of a Windows coredump. It uses executableName as a hint prepending image path with > > `;\server` > > and symbol path with > > `srv*https://msdl.microsoft.com/download/symbols;;\server` > > where the first bit points to Windows symbols located on remote server This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/13530 From duke at openjdk.org Thu May 4 10:18:25 2023 From: duke at openjdk.org (Alexey Pavlyutkin) Date: Thu, 4 May 2023 10:18:25 GMT Subject: RFR: 8306437: jhsdb cannot resolve image/symbol paths being used for analysis of Windows coredumps In-Reply-To: <62oUnZxCgEiiGjk1I9qb_1M_Zl0Sx4UvFeCk2S7mY00=.15eb4fda-eb88-4f9c-9d73-6fba9da60692@github.com> References: <62oUnZxCgEiiGjk1I9qb_1M_Zl0Sx4UvFeCk2S7mY00=.15eb4fda-eb88-4f9c-9d73-6fba9da60692@github.com> Message-ID: On Wed, 19 Apr 2023 10:39:40 GMT, Alexey Pavlyutkin wrote: > Hi! The patch fixes image/symbol lookup by jhsdb during analysis of a Windows coredump.
It uses executableName as a hint prepending image path with > > `;\server` > > and symbol path with > > `srv*https://msdl.microsoft.com/download/symbols;;\server` > > where the first bit points to Windows symbols located on remote server Double checked: the routine from https://github.com/openjdk/jdk/blob/master/src/jdk.hotspot.agent/doc/transported_core.html#L70 works fine if applied completely ------------- PR Comment: https://git.openjdk.org/jdk/pull/13530#issuecomment-1534479253 From sspitsyn at openjdk.org Thu May 4 10:39:32 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 4 May 2023 10:39:32 GMT Subject: RFR: 8306034: add support of virtual threads to JVMTI StopThread [v10] In-Reply-To: References: Message-ID: > This enhancement adds support of virtual threads to the JVMTI `StopThread` function. > In preview releases before this enhancement the StopThread returned the JVMTI_ERROR_UNSUPPORTED_OPERATION error code for virtual threads. > > The `StopThread` supports sending an asynchronous exception to a virtual thread only if it is current or suspended at mounted state. For instance, a virtual thread can be suspended at a JVMTI event. If the virtual thread is not suspended and is not current then the `JVMTI_ERROR_THREAD_NOT_SUSPENDED` error code is returned. If the virtual thread was suspended at unmounted state then the `JVMTI_ERROR_OPAQUE_FRAME` error code is returned. > > The `StopThread` has the following description for `JVMTI_ERROR_OPAQUE_FRAME` error code: >> The thread is a suspended virtual thread and the implementation >> was unable to throw an asynchronous exception from this frame. > > A couple of the `serviceability/jvmti/vthread` tests have been updated to adapt to the new `StopThread` behavior. > > The CSR is: https://bugs.openjdk.org/browse/JDK-8306434 > > Testing: > The mach5 tiers 1-6 are in progress. > Preliminary test runs were good in general.
> The JDB test `vmTestbase/nsk/jdb/kill/kill001/kill001.java` has been problem-listed and will be fixed by the corresponding debugger enhancement which is going to adopt JDWP/JDI specs to new behavior of the JVMTI `StopThread` related to virtual threads. > > Also, two JCK JVMTI tests are failing in the tier-6 : >> vm/jvmti/StopThread/stop001/stop00103/stop00103.html >> vm/jvmti/StopThread/stop001/stop00103/stop00103a.html > > These two tests will be excluded from the test runs by the JCK team and then adjusted to new `StopThread` behavior. Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: - Merge - StopThread spec: minor tweek in description of OPAQUE_FRAME error code - minor tweak of JVMTI_ERROR_OPAQUE_FRAME description - Merge - install_async_exception: set interrupt status for platform threads only - minor tweak in new test - 1. Address review comments 2. Clear interrupt bit in the TestTaskThread - corrections for BoundVirtualThread and test typos - addressed review comments on new test - fixed trailing spaces - ... 
and 1 more: https://git.openjdk.org/jdk/compare/59a7d7f3...925362f2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13546/files - new: https://git.openjdk.org/jdk/pull/13546/files/940cda74..925362f2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13546&range=08-09 Stats: 7820 lines in 287 files changed: 5127 ins; 1309 del; 1384 mod Patch: https://git.openjdk.org/jdk/pull/13546.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13546/head:pull/13546 PR: https://git.openjdk.org/jdk/pull/13546 From sjohanss at openjdk.org Thu May 4 11:03:25 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 4 May 2023 11:03:25 GMT Subject: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2] In-Reply-To: References: Message-ID: On Fri, 28 Apr 2023 15:57:49 GMT, Coleen Phillimore wrote: >> Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision: >> >> - Test refactor >> - Serguei review > > This looks good. Thanks for all the testing and adding the new test. Thanks again @coleenp and @sspitsyn for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13716#issuecomment-1534557035 From sjohanss at openjdk.org Thu May 4 11:03:28 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 4 May 2023 11:03:28 GMT Subject: Integrated: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared In-Reply-To: References: Message-ID: <2_3THp98E5Bbs1jC5_4HeBUCKM6qMAw-pSSMjOsL3-Q=.ab43a911-ce6d-4f07-92bf-9aa4de08b613@github.com> On Fri, 28 Apr 2023 12:48:44 GMT, Stefan Johansson wrote: > Hi all, > > Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint when there is nothing that can be cleaned up. 
> > **Summary** > When transforming/redefining classes a previous version list is linked together in the InstanceKlass. The original class is added to this list if it is still used or shared. The difference between shared and used is not currently noted. This leads to a problem when doing concurrent class unloading, because during that we postpone some potential work to a safepoint (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it is triggered by the ServiceThread if there is work to be done, for example if InstanceKlass::_has_previous_versions is true. > > Since we currently do not differentiate between shared and "in use" we always set _has_previous_versions if anything is on this list. This together with the fact that shared previous versions should never be cleaned out leads to this safepoint being triggered after every concurrent class unloading even though there is nothing that can be cleaned out. > > This can be avoided by making sure the _previous_versions list is only cleaned when there are non-shared classes on it. This change renames `_has_previous_versions` to `_clean_previous_versions` and only updates it if we have non-shared classes on the list. > > **Testing** > * A lot of manual testing verifying that we do get the safepoint when we should. > * Added new test to verify expected behavior by parsing the logs. The test uses JFR to trigger redefinition of some shared classes (when -Xshare:on). > * Mach5 run of new test and tier 1-3 This pull request has now been integrated.
Changeset: 408cec51 Author: Stefan Johansson URL: https://git.openjdk.org/jdk/commit/408cec516bb5fd82fb6dcddeee934ac0c5ecffaf Stats: 150 lines in 6 files changed: 127 ins; 3 del; 20 mod 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared Reviewed-by: coleenp, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13716 From mdoerr at openjdk.org Thu May 4 11:04:38 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 4 May 2023 11:04:38 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v3] In-Reply-To: References: <45EiQagy_IO6JBPslCPdMF0_Ab5tGpaPLPr-AtgmleI=.159d0eb4-f759-4d28-8872-407598dec193@github.com> Message-ID: On Thu, 4 May 2023 09:50:23 GMT, Axel Boldt-Christmas wrote: >>> I'm getting build warnings on all linux platforms with gcc-11.3.0: >>> >>> ``` >>> src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined >>> by . For historical compatibility, it is >>> currently defined by as well, but we plan to >>> remove this soon. To use "minor", include >>> directly. If you did not intend to use a system-defined macro >>> "minor", you should undefine it after including . [-Werror] >>> 84 | ZDriverMinor* ZDriver::minor() { >>> ``` >> >> That's unfortunate as minor and major are quite central to Generational ZGC and having to rename those functions will make the code look worse. I wonder if we should undef minor and major where needed. > >> I'm getting build warnings on all linux platforms with gcc-11.3.0: >> >> ``` >> src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined >> by . For historical compatibility, it is >> currently defined by as well, but we plan to >> remove this soon. To use "minor", include >> directly. If you did not intend to use a system-defined macro >> "minor", you should undefine it after including . 
[-Werror] >> 84 | ZDriverMinor* ZDriver::minor() { >> ``` > > @TheRealMDoerr I cannot reproduce this with gcc but can see the issue with clangd. > Can you check if this patch solves the issue you are seeing? > > diff --git a/src/hotspot/share/gc/z/zDriver.hpp b/src/hotspot/share/gc/z/zDriver.hpp > index 640ea6575ef..7fa650b1fa1 100644 > --- a/src/hotspot/share/gc/z/zDriver.hpp > +++ b/src/hotspot/share/gc/z/zDriver.hpp > @@ -29,6 +29,14 @@ > #include "gc/z/zThread.hpp" > #include "gc/z/zTracer.hpp" > > +#ifdef minor > +#undef minor > +#endif > + > +#ifdef major > +#undef major > +#endif > + > class VM_ZOperation; > class ZDriverMinor; > class ZDriverMajor; @xmas92: Thanks for your quick solution. Your patch solves the problem. If you want to integrate it, please also add a comment why this is needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1534563624 From aboldtch at openjdk.org Thu May 4 11:18:31 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 4 May 2023 11:18:31 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v3] In-Reply-To: References: <45EiQagy_IO6JBPslCPdMF0_Ab5tGpaPLPr-AtgmleI=.159d0eb4-f759-4d28-8872-407598dec193@github.com> Message-ID: On Thu, 4 May 2023 09:50:23 GMT, Axel Boldt-Christmas wrote: >>> I'm getting build warnings on all linux platforms with gcc-11.3.0: >>> >>> ``` >>> src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined >>> by . For historical compatibility, it is >>> currently defined by as well, but we plan to >>> remove this soon. To use "minor", include >>> directly. If you did not intend to use a system-defined macro >>> "minor", you should undefine it after including . [-Werror] >>> 84 | ZDriverMinor* ZDriver::minor() { >>> ``` >> >> That's unfortunate as minor and major are quite central to Generational ZGC and having to rename those functions will make the code look worse. 
I wonder if we should undef minor and major where needed. > >> I'm getting build warnings on all linux platforms with gcc-11.3.0: >> >> ``` >> src/hotspot/share/gc/z/zDriver.cpp:84:13: error: In the GNU C Library, "minor" is defined >> by . For historical compatibility, it is >> currently defined by as well, but we plan to >> remove this soon. To use "minor", include >> directly. If you did not intend to use a system-defined macro >> "minor", you should undefine it after including . [-Werror] >> 84 | ZDriverMinor* ZDriver::minor() { >> ``` > > @TheRealMDoerr I cannot reproduce this with gcc but can see the issue with clangd. > Can you check if this patch solves the issue you are seeing? > > diff --git a/src/hotspot/share/gc/z/zDriver.hpp b/src/hotspot/share/gc/z/zDriver.hpp > index 640ea6575ef..7fa650b1fa1 100644 > --- a/src/hotspot/share/gc/z/zDriver.hpp > +++ b/src/hotspot/share/gc/z/zDriver.hpp > @@ -29,6 +29,14 @@ > #include "gc/z/zThread.hpp" > #include "gc/z/zTracer.hpp" > > +#ifdef minor > +#undef minor > +#endif > + > +#ifdef major > +#undef major > +#endif > + > class VM_ZOperation; > class ZDriverMinor; > class ZDriverMajor; > @xmas92: Thanks for your quick solution. Your patch solves the problem. If you want to integrate it, please also add a comment why this is needed. Thanks for testing it. Will do. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13771#issuecomment-1534586643 From stefank at openjdk.org Thu May 4 11:43:17 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 May 2023 11:43:17 GMT Subject: RFR: 8307428: jstat tests doesn't tolerate dash in the O column In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:52:25 GMT, Thomas Schatzl wrote: > Is it possible to remove the G1 hack in this change too? Because since now a zero value is supported in the output, there does not seem to be a reason to keep it for G1. I prefer if that is done and tested as a separate PR. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13796#issuecomment-1534616901 From stefank at openjdk.org Thu May 4 11:44:14 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 May 2023 11:44:14 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: > Hi all, > > Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. > > The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. 
We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. > > Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach.
> > Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published. The patches that could be published separately are: > > * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class > * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject > * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp > * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics > * a2824734d23 UPSTREAM: lir_xchg > * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI > * 447259cea42 UPSTREAM: assembler_ppc ANDI > * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure > > Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: > > > git fetch https://github.com/openjdk/zgc zgc_master > git diff zgc_master... > > > There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. 
> > Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: undefine glibc major/minor macros ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13771/files - new: https://git.openjdk.org/jdk/pull/13771/files/d65523f5..c9f6257b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=04-05 Stats: 11 lines in 1 file changed: 11 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771 From sjohanss at openjdk.org Thu May 4 12:16:25 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 4 May 2023 12:16:25 GMT Subject: RFR: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption Message-ID: Please review this fix to avoid a tier1 test failure. **Summary** The newly added test wrongfully assumed that one of the transformed classes would be in use. This seem to be the case in most runs, but there is no guarantee. This change fixes the test to only verify that nothing is seen as shared when running with -Xshare:off. **Testing** Verified that the test still passes locally. 
------------- Commit messages: - Updated comment - 8307448: Test RedefineSharedClassJFR fail due to wrong assumption Changes: https://git.openjdk.org/jdk/pull/13801/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13801&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307448 Stats: 10 lines in 1 file changed: 1 ins; 3 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/13801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13801/head:pull/13801 PR: https://git.openjdk.org/jdk/pull/13801 From eosterlund at openjdk.org Thu May 4 12:26:14 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 4 May 2023 12:26:14 GMT Subject: RFR: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption In-Reply-To: References: Message-ID: On Thu, 4 May 2023 12:08:28 GMT, Stefan Johansson wrote: > Please review this fix to avoid a tier1 test failure. > > **Summary** > The newly added test wrongfully assumed that one of the transformed classes would be in use. This seem to be the case in most runs, but there is no guarantee. This change fixes the test to only verify that nothing is seen as shared when running with -Xshare:off. > > **Testing** > Verified that the test still passes locally. Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13801#pullrequestreview-1412973518 From coleenp at openjdk.org Thu May 4 12:39:20 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 May 2023 12:39:20 GMT Subject: RFR: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption In-Reply-To: References: Message-ID: On Thu, 4 May 2023 12:08:28 GMT, Stefan Johansson wrote: > Please review this fix to avoid a tier1 test failure. > > **Summary** > The newly added test wrongfully assumed that one of the transformed classes would be in use. This seem to be the case in most runs, but there is no guarantee. 
This change fixes the test to only verify that nothing is seen as shared when running with -Xshare:off. > > **Testing** > Verified that the test still passes locally. Thanks for fixing this so quickly. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13801#pullrequestreview-1412995733 From sjohanss at openjdk.org Thu May 4 12:52:23 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 4 May 2023 12:52:23 GMT Subject: RFR: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption In-Reply-To: References: Message-ID: On Thu, 4 May 2023 12:36:27 GMT, Coleen Phillimore wrote: >> Please review this fix to avoid a tier1 test failure. >> >> **Summary** >> The newly added test wrongfully assumed that one of the transformed classes would be in use. This seem to be the case in most runs, but there is no guarantee. This change fixes the test to only verify that nothing is seen as shared when running with -Xshare:off. >> >> **Testing** >> Verified that the test still passes locally. > > Thanks for fixing this so quickly. Thanks @coleenp and @fisk for the quick reviews. Since the test failure shows up in tier 1 I will push this without waiting 24 hours. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13801#issuecomment-1534718633 From sjohanss at openjdk.org Thu May 4 12:52:25 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 4 May 2023 12:52:25 GMT Subject: Integrated: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption In-Reply-To: References: Message-ID: On Thu, 4 May 2023 12:08:28 GMT, Stefan Johansson wrote: > Please review this fix to avoid a tier1 test failure. > > **Summary** > The newly added test wrongfully assumed that one of the transformed classes would be in use. This seem to be the case in most runs, but there is no guarantee. This change fixes the test to only verify that nothing is seen as shared when running with -Xshare:off. 
> > **Testing** > Verified that the test still passes locally. This pull request has now been integrated. Changeset: 29233e0a Author: Stefan Johansson URL: https://git.openjdk.org/jdk/commit/29233e0a001adde71a3fa5d56292ccfba8409ea5 Stats: 10 lines in 1 file changed: 1 ins; 3 del; 6 mod 8307448: Test RedefineSharedClassJFR fail due to wrong assumption Reviewed-by: eosterlund, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/13801 From sebastian.lovdahl at hibox.tv Thu May 4 06:22:07 2023 From: sebastian.lovdahl at hibox.tv (Sebastian Lövdahl) Date: Thu, 4 May 2023 09:22:07 +0300 Subject: 8226919: attach in linux hangs due to permission denied accessing /proc/pid/root Message-ID: <6f40fd00-ce52-ad42-049f-36bd701dfcdb@hibox.tv> Hi all, I would be interested in making my first JDK contribution by contributing a fix for 8226919. We stumbled upon this issue after having started migrating our Tomcat-based runtime environments from Java 8 to Java 17. A clear and simple reproducer is currently missing from 8226919. One way of reproducing it is by having a Java service that - listens on a privileged port - is run as a non-root user - by a systemd service with AmbientCapabilities=CAP_NET_BIND_SERVICE. This means that the process has elevated capabilities, and the Linux kernel seems to restrict access to /proc/pid/root because of that. If e.g. jcmd is run as the same user as the service is running as, the dynamic attach mechanism fails, because it cannot follow the /proc/pid/root symlink and find the tmp folder of the target process where the .java_pidNNNN socket is created. For the record, this worked fine with Java 8 before /proc/pid/root was used by the dynamic attach mechanism. The reason for using /proc/pid/root in the first place is sane and valid. It was done in 8179498 to make attaching work across container boundaries. It seems like 8255008 may revamp the attach mechanism specifically for containers.
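To make the failure mode above concrete, here is a minimal sketch of the two places the .java_pidNNNN socket can be looked up: through the target's /proc/pid/root symlink, and directly under /tmp. The class and method names are hypothetical and this is not the actual attach implementation; it only assumes, as described above, that the socket sits in the tmp folder of the target process.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only: hypothetical names, not code from the JDK attach module.
public class AttachSocketPathSketch {

    // Socket path as seen through the target's root symlink. Following
    // /proc/<pid>/root may be denied when the target has elevated capabilities.
    public static String viaProcRoot(long pid) {
        return "/proc/" + pid + "/root/tmp/.java_pid" + pid;
    }

    // Socket path when caller and target see the same /tmp (no container).
    public static String direct(long pid) {
        return "/tmp/.java_pid" + pid;
    }

    public static void main(String[] args) {
        // Default to our own PID so the sketch runs without arguments.
        long pid = args.length > 0
                ? Long.parseLong(args[0])
                : ProcessHandle.current().pid();
        for (String candidate : new String[] { viaProcRoot(pid), direct(pid) }) {
            // Files.exists returns false both for a missing file and for a
            // path that cannot be traversed because of permissions.
            System.out.println(candidate + " exists=" + Files.exists(Path.of(candidate)));
        }
    }
}
```

Running it with the PID of a service set up as in the reproducer should show whether the /proc/pid/root route is reachable for the current user, which is the distinction that matters here.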
It looks like src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java already tries to handle this specific case, but falls just short of doing it all the way. The createAttachFile method checks whether the target PID and the inner-most namespaced PID are equal. If they are equal, we're in a non-container environment, and we are able to create /tmp/.attach_pidNNNN directly without going through /proc/pid/root. However, the same check is missing in the findSocketFile method; it blindly assumes that /proc/pid/root will work and tries to open the socket via /proc/pid/root/tmp/.java_pidNNNN.

First of all, is there consensus that this should be fixed? If yes, are there any flaws in the analysis above?

Best regards,
Sebastian Lövdahl

From lmesnik at openjdk.org Thu May 4 15:20:00 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Thu, 4 May 2023 15:20:00 GMT
Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects
Message-ID:

8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects
caused significant regressions in some benchmarks and should be reverted.
This fix backs out the changes and updates the ProblemList entries to refer to the new issue.
Tier1 passed Running also tier5 to check other builds and more svc testing ------------- Commit messages: - Revert "8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects" Changes: https://git.openjdk.org/jdk/pull/13806/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13806&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8306326 Stats: 72 lines in 11 files changed: 5 ins; 63 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13806.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13806/head:pull/13806 PR: https://git.openjdk.org/jdk/pull/13806 From asotona at openjdk.org Thu May 4 16:15:12 2023 From: asotona at openjdk.org (Adam Sotona) Date: Thu, 4 May 2023 16:15:12 GMT Subject: RFR: 8250596: Update remaining manpage references from "OS X" to "macOS" Message-ID: Most of the manpages were updated a few years ago but some references remain. This patch renames remaining references to "macOS". Please review. Thanks, Adam ------------- Commit messages: - 8250596: Update remaining manpage references from "OS X" to "macOS" Changes: https://git.openjdk.org/jdk/pull/13807/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13807&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8250596 Stats: 16 lines in 7 files changed: 0 ins; 0 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/13807.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13807/head:pull/13807 PR: https://git.openjdk.org/jdk/pull/13807 From mullan at openjdk.org Thu May 4 17:02:12 2023 From: mullan at openjdk.org (Sean Mullan) Date: Thu, 4 May 2023 17:02:12 GMT Subject: RFR: 8250596: Update remaining manpage references from "OS X" to "macOS" In-Reply-To: References: Message-ID: On Thu, 4 May 2023 15:50:02 GMT, Adam Sotona wrote: > Most of the manpages were updated a few years ago but some references remain. > This patch renames remaining references to "macOS". > > Please review. > > Thanks, > Adam keytool and jarsigner docs look good. 
------------- Marked as reviewed by mullan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13807#pullrequestreview-1413524329 From adinn at openjdk.org Thu May 4 17:20:15 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 4 May 2023 17:20:15 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: References: Message-ID: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> On Thu, 4 May 2023 09:26:33 GMT, Andrew Dinn wrote: > This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). @coleenp @plummercj Any chance of feedback or a review for this patch? ------------- PR Comment: https://git.openjdk.org/jdk/pull/13795#issuecomment-1535131579 From heidinga at openjdk.org Thu May 4 17:40:15 2023 From: heidinga at openjdk.org (Dan Heidinga) Date: Thu, 4 May 2023 17:40:15 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> References: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> Message-ID: On Thu, 4 May 2023 17:17:19 GMT, Andrew Dinn wrote: >> This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). > > @coleenp @plummercj Any chance of feedback or a review for this patch? 
@adinn Looking at the closely related code, is the same problem present for `adjust_exception_table` & `adjust_local_var_table`? Both appear to always reach for the original value from the method, though unlike the line number table, there's no member variable cached in the Relocator for either of them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13795#issuecomment-1535160190 From coleenp at openjdk.org Thu May 4 18:00:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 May 2023 18:00:15 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: Message-ID: On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote: >> The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests Marked as reviewed by coleenp (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13420#pullrequestreview-1413619044 From coleenp at openjdk.org Thu May 4 18:00:17 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 May 2023 18:00:17 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com> Message-ID: <3u6wgn1EWik6TaKxwnVXe-Q-QvfDWM2SQo8r3IMn_x4=.0fb4c0c5-09a7-49ee-ad4d-975e0cfc5a2b@github.com> On Thu, 4 May 2023 06:49:10 GMT, David Holmes wrote: >> After I moved the `registerCleanup` to the body of a `default` method in the interface, there is no need for the implementors of the `Finalizable` interface to provide this method. All of them can use the default one. > >> All of them can use the default one > > Exactly my point. Default methods for interface classes were invented to solve a problem of compatibility if I remember correctly. Forcing subclasses to implement the interface method or have a superclass of the subclass to implement the interface method seems like it avoids the problem of silently not registering the cleanup or action that the interface method should force you to do. To solve the duplicated registerCleanup() cases, the two other classes could extend FinalizableObject then inherit its implementation of registerCleanup(). 
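As a rough illustration of the trade-off discussed above, the sketch below shows one way an interface could supply `registerCleanup()` as a default method backed by `java.lang.ref.Cleaner`. The names `Finalizable` and `registerCleanup` are taken from the thread, but the method bodies, `cleanupAction()`, `TempResource` and `State` are assumptions made for this example and do not reproduce the code in the PR.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

interface Finalizable {
    // One shared Cleaner; interface fields are implicitly public static final.
    Cleaner CLEANER = Cleaner.create();

    // The action must not capture the object itself, otherwise the object
    // never becomes phantom reachable and the action never runs.
    Runnable cleanupAction();

    // Default method: implementors inherit the registration logic, but
    // nothing forces them to actually call it.
    default Cleaner.Cleanable registerCleanup() {
        return CLEANER.register(this, cleanupAction());
    }
}

final class TempResource implements Finalizable {
    // State lives in a static nested class so that the cleanup action holds
    // no reference to the TempResource instance.
    static final class State implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);

        @Override
        public void run() {
            released.set(true);
        }
    }

    // Initialized in textual order: state first, then the registration.
    final State state = new State();
    final Cleaner.Cleanable cleanable = registerCleanup();

    @Override
    public Runnable cleanupAction() {
        return state;
    }
}
```

Calling `cleanable.clean()` runs the action immediately and at most once; otherwise the Cleaner runs it at some point after the object becomes unreachable. Moving `registerCleanup()` into a common superclass, such as the `FinalizableObject` mentioned above, is the alternative raised in the review.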
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1185337107 From cjplummer at openjdk.org Thu May 4 18:14:16 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 4 May 2023 18:14:16 GMT Subject: RFR: 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS In-Reply-To: References: Message-ID: <1IZxvcqfq2ljwq789NIFDORCm3aftTaihakfRkWXrcU=.d4b1102c-8d25-4714-a401-703031f04ded@github.com> On Thu, 4 May 2023 07:30:49 GMT, Arno Zeller wrote: > Unless this test is run as root, it needs sudo privileges. If it gets them, the test runs fine, but leaves a file with root ownership. So jtreg cannot delete it, and you see errors when "make clean" tries to delete it. > It's best that we just don't run the test on OSX if sudo privileges. Changes look good. Is there a reason why this was not noticed when [JDK-8290687](https://bugs.openjdk.org/browse/JDK-8290687) was filed and fixed last year? ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13791#pullrequestreview-1413649773 From cjplummer at openjdk.org Thu May 4 18:27:14 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 4 May 2023 18:27:14 GMT Subject: RFR: 8307428: jstat tests doesn't tolerate dash in the O column In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:33:49 GMT, Stefan Karlsson wrote: > When running jstat tests like the following: > test/jdk/sun/tools/jstatd/TestJstatdServer.java > > with Generational ZGC we get a failure because the O (old generation percentage) is reported as `-` and not a number. 
The reason why it is reported as `-` is that the current capacity of the old generation is zero and that leads to a divide-by-zero in this line: > https://github.com/openjdk/jdk/blob/82a8e91ef7c3b397f9cce3854722cfe4bace6f2e/src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options#L1029 > > G1 has some workarounds for this situation where the reported capacity is slightly above 0. I'm a bit reluctant to add such a hack into Generational ZGC. I've talked to the jstat maintainers and they propose that we simply relax the test. > > Tested locally by running the jstat/jstad tests in the Generational ZGC branch. Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13796#pullrequestreview-1413678399 From dcubed at openjdk.org Thu May 4 18:46:15 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 4 May 2023 18:46:15 GMT Subject: RFR: 8307067: remove broken EnableThreadSMRExtraValidityChecks option [v2] In-Reply-To: <89mSdHMNu940PF6SHvxp7dyvTA6rENTZUcJiPn1fv0Y=.e8c2b8e2-3978-4a74-884a-8effab7a1b55@github.com> References: <89mSdHMNu940PF6SHvxp7dyvTA6rENTZUcJiPn1fv0Y=.e8c2b8e2-3978-4a74-884a-8effab7a1b55@github.com> Message-ID: <0NPaWxSm7XUVE5rjUNBRkeG3jmNUuo67C9VXDJjk7mQ=.b5a1415c-c9c6-4640-854a-79bceb9c39f9@github.com> On Thu, 4 May 2023 05:11:53 GMT, Robbin Ehn wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes CR - change ':' to '.'. > > Looks good, thanks! @robehn - Thanks for the re-review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13704#issuecomment-1535240037 From dcubed at openjdk.org Thu May 4 19:23:02 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Thu, 4 May 2023 19:23:02 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v66] In-Reply-To: References: <-Kq6LaQmYZC8PVnmA4IH6QflBHwDB8__ovkqWOGFjeE=.451a7a23-578d-4b7f-b55d-74759c2cc446@github.com> Message-ID: On Fri, 28 Apr 2023 19:01:41 GMT, Roman Kennke wrote: >> This project is currently baselined on jdk-21+21-1701. However, that build-ID >> contains very noisy test failures in Tier[234] and probably higher. If you could >> rebase on: >> >> jiefu: [452cb8 - OpenJDK](https://orahub.oci.oraclecorp.com/jpg-mirrors/jdk-open/commit/452cb8432f4d45c3dacd4415bc9499ae73f7a17c) >> [8307103 ](http://bugs.openjdk.java.net/browse/JDK-8307103) Two TestMetaspaceAllocationMT tests fail after JDK-8306696 >> >> That would make my next Mach5 test cycle much, much happier... > >> http://bugs.openjdk.java.net/browse/JDK-8307103 > > Should be based on JDK-8307103 now. Thanks for all your testing! @rkennke - Please resolve the conversations that you we are done with. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535284406 From cjplummer at openjdk.org Thu May 4 19:24:13 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 4 May 2023 19:24:13 GMT Subject: RFR: 8250596: Update remaining manpage references from "OS X" to "macOS" In-Reply-To: References: Message-ID: On Thu, 4 May 2023 15:50:02 GMT, Adam Sotona wrote: > Most of the manpages were updated a few years ago but some references remain. > This patch renames remaining references to "macOS". > > Please review. > > Thanks, > Adam The jstatd.1 and jstat.1 files look good. Copyrights need updating on all the files. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13807#pullrequestreview-1413771872 From cjplummer at openjdk.org Thu May 4 19:27:16 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 4 May 2023 19:27:16 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> References: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> Message-ID: On Thu, 4 May 2023 17:17:19 GMT, Andrew Dinn wrote: >> This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). > > @coleenp @plummercj Any chance of feedback or a review for this patch? @adinn This is not code I'm at all familiar with. Perhaps @sspitsyn can help. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13795#issuecomment-1535288905 From dcubed at openjdk.org Thu May 4 19:40:10 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 4 May 2023 19:40:10 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v71] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 09:33:24 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. 
And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). >> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. >> >> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typcially remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in-fact, most workloads seem to never exceed 5 active locks per thread at a time). 
We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. >> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. 
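To make the lock-stack idea easier to picture, here is a toy Java model of the data structure described above. It is only a sketch: the real implementation is C++ inside HotSpot, and the class `LockStackModel`, its method names and the boolean return convention of `tryPush` are invented for illustration. Only the fixed capacity of 8, the linear identity scan for ownership, and the rule that a full stack or a recursive lock leads to the slow path come from the description.

```java
// Toy model of a per-thread lock-stack; not HotSpot code.
final class LockStackModel {
    static final int CAPACITY = 8;           // fixed size, per the description

    private final Object[] slots = new Object[CAPACITY];
    private int top = 0;                     // number of entries in use

    // "Does the current thread own me?" answered by a short linear scan.
    boolean contains(Object o) {
        for (int i = 0; i < top; i++) {
            if (slots[i] == o) {             // identity, not equals()
                return true;
            }
        }
        return false;
    }

    // Returns false when the caller would have to take the slow path and
    // inflate to a monitor: the stack is full, or the lock is already held
    // by this thread (recursive locking is not supported).
    boolean tryPush(Object o) {
        if (top == CAPACITY || contains(o)) {
            return false;
        }
        slots[top++] = o;
        return true;
    }

    // Removes o and closes the gap so the entries stay densely packed.
    void remove(Object o) {
        for (int i = 0; i < top; i++) {
            if (slots[i] == o) {
                System.arraycopy(slots, i + 1, slots, i, top - i - 1);
                slots[--top] = null;
                return;
            }
        }
    }

    int size() {
        return top;
    }
}
```

Because the array is tiny and densely packed, the ownership check is a handful of pointer comparisons, which matches the claim that the common query is answered quickly.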
>>
>> This change enables to simplify (and speed-up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>> | | x86_64 | | | | aarch64 | | |
>> | -- | -- | -- | -- | -- | -- | -- | -- |
>> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Address @dholmes-ora's review comments

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 666:

> 664:   // Invariant: tmpReg == 0. tmpReg is EAX which is the implicit cmpxchg comparand.
> 665:   lock();
> 666:   cmpxchgptr(scrReg, Address(boxReg, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));

Sigh... I had liked the fact that we took care of these old "TODO" items in this code. It's true that these changes were in violation of our "try not to change stack-lock" mantra. I did run the v66 changes thru Mach5 Tier[1-8] testing in "stack-locking is default" mode so your changes were well tested.

src/hotspot/share/runtime/lockStack.hpp line 88:

> 86:   inline void remove(oop o);
> 87:
> 88:   // Tests whether the object is on this lock-stack.

nit: s/object/oop/
For consistency with your other comments.

src/hotspot/share/runtime/lockStack.inline.hpp line 53:

> 51:   bool is_owning = &JavaThread::cast(thread)->lock_stack() == this;
> 52:   assert(is_owning == (get_thread() == thread), "is_owning sanity");
> 53:   return is_owning;

This is going to require a re-test just to make sure that we don't have a code path into here from the VMThread when it is doing some JVM/TI stuff (again...).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185436403
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185437617
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185439288

From phh at openjdk.org Thu May 4 20:04:15 2023
From: phh at openjdk.org (Paul Hohensee)
Date: Thu, 4 May 2023 20:04:15 GMT
Subject: RFR: 8304074: [JMX] Add an approximation of JVM process allocated bytes
Message-ID:

Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads.

Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead.
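For context, the closest approximation available through the existing `com.sun.management.ThreadMXBean` API is to sum the per-thread counters of the threads that are currently alive. The sketch below does that; by construction it misses threads that have already terminated, which is the gap this PR describes. The class `LiveThreadAllocation` is a name made up for this example, and the code assumes a JDK whose platform `ThreadMXBean` implements the `com.sun.management` extension.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not the method added by the PR.
final class LiveThreadAllocation {

    // Sums allocated bytes over currently live threads only.
    // Returns -1 when the measurement is not available on this JVM.
    static long liveThreadAllocatedBytes() {
        java.lang.management.ThreadMXBean platform = ManagementFactory.getThreadMXBean();
        if (!(platform instanceof com.sun.management.ThreadMXBean)) {
            return -1; // extension interface not implemented
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) platform;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long total = 0;
        for (long bytes : bean.getThreadAllocatedBytes(bean.getAllThreadIds())) {
            if (bytes > 0) {
                total += bytes; // -1 marks a thread that died in the meantime
            }
        }
        return total;
    }
}
```

A caller polling this value could see it drop when a thread exits, since that thread's contribution disappears from the sum; a counter that also covers terminated threads would not have that problem.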
------------- Commit messages: - 8304074: [JMX] Add an approximation of JVM process allocated bytes Changes: https://git.openjdk.org/jdk/pull/13814/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8304074 Stats: 222 lines in 10 files changed: 173 ins; 16 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/13814.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13814/head:pull/13814 PR: https://git.openjdk.org/jdk/pull/13814 From aturbanov at openjdk.org Thu May 4 20:24:29 2023 From: aturbanov at openjdk.org (Andrey Turbanov) Date: Thu, 4 May 2023 20:24:29 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: <_UHP565f9Io3v9rWWDf0HGRhhtNoniDhbM_XEM-2w1c=.f7cb7bae-5837-42ff-9491-284093ba4c75@github.com> On Thu, 4 May 2023 11:44:14 GMT, Stefan Karlsson wrote: >> Hi all, >> >> Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. >> >> The development of Generational ZGC started around the same time as the development of JDK 17. 
At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. >> >> Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. 
This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach.
>>
>> Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published. The patches that could be published separately are:
>>
>> * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class
>> * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject
>> * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp
>> * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics
>> * a2824734d23 UPSTREAM: lir_xchg
>> * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI
>> * 447259cea42 UPSTREAM: assembler_ppc ANDI
>> * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure
>>
>> Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against:
>>
>>
>>     git fetch https://github.com/openjdk/zgc zgc_master
>>     git diff zgc_master...
>>
>>
>> There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it.
>>
>> Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing.
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
>   undefine glibc major/minor macros

test/hotspot/jtreg/runtime/stringtable/StringTableCleaningTest.java line 117:

> 115:             return gcEndPrefix + g1Suffix;
> 116:         } else if (GC.Z.isSelected()) {
> 117:             return gcEndPrefix + "(" + zEndSuffix + ")|(" + xEndSuffix + ")";

nit
Suggestion:

    return gcEndPrefix + "(" + zEndSuffix + ")|(" + xEndSuffix + ")";

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185476249

From heidinga at redhat.com Thu May 4 20:32:14 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Thu, 4 May 2023 16:32:14 -0400
Subject: JEP draft: Integrity and Strong Encapsulation
Message-ID:

Hi Ron,

I've read this draft a number of times and each time I struggled with the framing of the problem given Java's success over the past almost 30 years. Framing the problem with statements like: "Strong encapsulation offers a solid foundation to build on. Without it, code is a castle in the sand." sets the conversation off in the wrong direction. I started to respond in terms of these kinds of statements and found the response wouldn't have been helpful to move the conversation forward.

Instead, I stepped back and looked at the larger context.
In particular, I'm reading this in light of JEP 411: Deprecate the Security Manager for Removal [0] and the eventual goal of completely removing the SecurityManager. Let me lay out how I see this and you can correct me where I've gone off the rails.

As JEP 411 states, the SecurityManager has:
* Brittle permission model
* Difficult programming model
* Poor performance

Which translates into a whole lot of cost both for maintainers of the JDK and for all users who must pay the runtime costs related to the SecurityManager (high when enabled, but non-zero always).

Although the SecurityManager has high costs, and is infrequently used at runtime in production, it provides the only way to limit certain capabilities like:
* JNI (SecurityManager::checkLink)
* Encapsulation (SecurityManager::checkPackageAccess)
* Launch new processes (SecurityManager::checkExec)
* Reflective access (accessDeclaredMembers, etc)
* and others

Some of those controls need replacements if the SecurityManager will go away. JNI, surprisingly, is a key one here for large corporations.

If I understand correctly, this new Integrity JEP draft aims, amongst other things, to replace the hard to maintain, expensive runtime checks of the SecurityManager with configuration via command line options. This allows those who previously relied on the SecurityManager to continue to control the high-order bits of functionality without imposing a cost on the rest of the ecosystem. It also makes it easier to determine which libraries are relying on the restricted features.

Overall, this provides a smoother migration path for users, makes the intention of users very clear (just read the command line vs auditing SecurityManager implementation) and improves performance by shifting these decisions to configuration time rather than paying cost of code complexity and stack walks.

I also appreciate the "nudge"
being made with this JEP by requiring explicit opt-in to disabling protections versus the previous uphill battle to enable the SecurityManager. It makes for an easier conversation to ask for e.g. JNI to be enabled for one library on the command line rather than having to deal with all the potential restrictions of the SecurityManager.

So while overall, when viewed from the lens of removing the SecurityManager, this approach makes sense, I do want to caution on betting against Java's strengths, particularly against its use of speculative optimizations.

> Neither a person reading the code nor the platform itself - as it compiles and runs it - can fully be assured that the code does what it says or that its meaning does not change over time as the program runs.
...
> In the Java runtime, certain optimizations assume that conditions that hold at the time the optimization is made hold forever.

This is the basis of all speculative optimization - the platform assumes the meaning doesn't change and compiles as though it won't. If the application is modified at runtime, the JVM applies the necessary compensations such as deoptimization and recompilation. Java has bet on dynamic features time and again (even when others have championed static approaches) and those bets - backed by speculative optimizations - have paid off time and again. So this can't be what you're arguing against.

If the concern is that the runtime behaviour may appear to be different than the intent expressed in the source code due to use of setAccessible or changes by agents, then I think the JEP should be more explicit about that concern. The current wording reads as equally applying to many of Java's existing dynamic behaviours (and belies the power of speculation coupled with deoptimization!).

And a few smaller quibbles:

> For example, every developer assumes that changing the signature of a private method, or removing a private field, does not impact the class's clients.

Right.
The private modifier defines a *contract* which states that anyone depending on the implementation details is on their own and shouldn't be surprised by changes. I understand that it can be problematic when large successful frameworks are broken by such changes, but that doesn't invalidate the contract that's in place. The risk is higher for the JDK than for other libraries or applications given the common dependency on the JDK.

> However, with deep reflection, doSensitiveOperation could be invoked from anywhere without an isAuthorized check, nullifying the intended restriction; even worse, an agent could modify the code of the isAuthorized method to always return true.

And clearly, these would be bugs. Not much different than leaking a privileged MethodHandles.Lookup object outside a Class's nest (the boundary for private access) for which there is no enhanced integrity check. We can't fully protect users from code that does the wrong thing even while undertaking efforts to minimize the attack surface. "Superpowers" are exactly that; while we support making them opt-in, we should be careful not to overstate the risk as the same principle applies to all code running in a process - it must be trusted as it has the same privileges as the process.

> A tool like jlink could remove unused strongly-encapsulated methods at link time to reduce image size and class loading time.

Most of the benefit here is not time saved by not loading the methods, it's actually due to avoiding the need to load classes during verification. The verifier needs to validate relationships between classes and every extra method potentially asserts new relationships (such as class X subclasses Throwable) and it is these extra classes that need loading that typically increase the startup time.

> The guarantee that code may not change over time even opens the door to ahead-of-time compilation (AOT).

AOT doesn't depend on the code never changing.
OpenJ9 has AOT code that is resilient in the face of changes to the underlying Java class files. I'm positive Hotspot will be able to develop similar resilient AOT code. The cost of validating the assumptions made while AOT compiling is much lower than doing the compile while still enabling Java's dynamic features.

--Dan

[0] https://openjdk.org/jeps/411

From rkennke at openjdk.org Thu May 4 20:53:08 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 20:53:08 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v71]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 19:32:23 GMT, Daniel D. Daugherty wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Address @dholmes-ora's review comments
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 666:
>
>> 664:     // Invariant: tmpReg == 0.  tmpReg is EAX which is the implicit cmpxchg comparand.
>> 665:     lock();
>> 666:     cmpxchgptr(scrReg, Address(boxReg, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
>
> Sigh... I had liked the fact that we took care of these old "TODO" items
> in this code. It's true that these changes were in violation of our "try not
> to change stack-lock" mantra. I did run the v66 changes thru Mach5
> Tier[1-8] testing in "stack-locking is default" mode so your changes
> were well tested.

Let's re-do those changes in a follow-up, ok?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185498799

From amenkov at openjdk.org Thu May 4 20:55:30 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 20:55:30 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v11]
In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

> The fix updates JVMTI FollowReferences implementation to report references from virtual threads:
> - unmounted vthreads are detected, and their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL;
> - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth;
> - common code to handle stack frames is moved into a separate class;
>
> Threads are reported as:
> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before);
> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER;
> - unmounted vthreads: not reported as heap roots.
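
The thread-reporting rules in the list above can be modelled with a small lookup. This is only a toy Java sketch, not the HotSpot C++ change under review: the `ThreadKind` enum and the `rootReferenceKind` method are hypothetical names introduced for illustration, and only the two JVMTI constant names are taken from the description.

```java
import java.util.Optional;

/** Toy model of the heap-root reporting rules quoted above; not HotSpot code. */
public class VThreadRootReporting {
    /** Hypothetical classification of a thread at the time of the heap walk. */
    public enum ThreadKind { PLATFORM, MOUNTED_VIRTUAL, UNMOUNTED_VIRTUAL }

    /**
     * Returns the name of the JVMTI reference kind under which a thread of the
     * given kind is reported as a heap root, or empty if it is not reported as a root.
     */
    public static Optional<String> rootReferenceKind(ThreadKind kind) {
        switch (kind) {
            case PLATFORM:
                // Unchanged behaviour for platform threads.
                return Optional.of("JVMTI_HEAP_REFERENCE_THREAD");
            case MOUNTED_VIRTUAL:
                // Synthetic reference: treated as a root because its carrier thread is one.
                return Optional.of("JVMTI_HEAP_REFERENCE_OTHER");
            default:
                // Unmounted virtual threads are not reported as heap roots.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        for (ThreadKind kind : ThreadKind.values()) {
            System.out.println(kind + " -> " + rootReferenceKind(kind).orElse("(not a heap root)"));
        }
    }
}
```

The sketch says nothing about how stack references of unmounted vthreads are found; it only captures which thread kinds count as roots.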
Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:

  jvmtiTagMap refactoring

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13254/files
  - new: https://git.openjdk.org/jdk/pull/13254/files/1e6ca207..930f0d0c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=09-10

  Stats: 37 lines in 1 file changed: 1 ins; 8 del; 28 mod
  Patch: https://git.openjdk.org/jdk/pull/13254.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254

PR: https://git.openjdk.org/jdk/pull/13254

From amenkov at openjdk.org Thu May 4 20:55:37 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 20:55:37 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v10]
In-Reply-To:
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

On Thu, 4 May 2023 01:53:10 GMT, Serguei Spitsyn wrote:

>> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   feedback
>
> src/hotspot/share/prims/jvmtiTagMap.cpp line 2231:
>
>> 2229:
>> 2230: // Helper class to collect/report stack roots.
>> 2231: class StackRootCollector {
>
> We privately discussed the following renamings:
>  - `StackRootCollector` => `StackRefCollector`
>  - `collect_stack_roots` => `collect_stack_refs`
>  - `collect_vthread_stack_roots` => `collect_vthread_stack_refs`

done

> src/hotspot/share/prims/jvmtiTagMap.cpp line 2284:
>
>> 2282:     for (int index = 0; index < values->size(); index++) {
>> 2283:       if (values->at(index)->type() == T_OBJECT) {
>> 2284:         oop o = values->obj_at(index)();
>
> I'd suggest getting rid of one-letter identifiers like `o` and `c`.
> These variables can be renamed to `obj` and `cont` instead.
> It'd be better to rename `slot_offset` to `offset`.

changed variable names.
I think "offset" is not a good name here; it's unclear what the offset is. slot_offset shows that the offset is for the reported slot parameter.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185498169
PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185499481

From rkennke at openjdk.org Thu May 4 20:58:09 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 20:58:09 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v71]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 20:49:22 GMT, Roman Kennke wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 666:
>>
>>> 664:     // Invariant: tmpReg == 0.  tmpReg is EAX which is the implicit cmpxchg comparand.
>>> 665:     lock();
>>> 666:     cmpxchgptr(scrReg, Address(boxReg, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
>>
>> Sigh... I had liked the fact that we took care of these old "TODO" items
>> in this code. It's true that these changes were in violation of our "try not
>> to change stack-lock" mantra. I did run the v66 changes thru Mach5
>> Tier[1-8] testing in "stack-locking is default" mode so your changes
>> were well tested.
>
> Let's re-do those changes in a follow-up, ok?

I've filed: https://bugs.openjdk.org/browse/JDK-8307493

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185503403

From amenkov at openjdk.org Thu May 4 20:58:32 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 20:58:32 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9]
In-Reply-To:
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

On Thu, 4 May 2023 01:44:36 GMT, Serguei Spitsyn wrote:

>> refactored.
>
> It'd be nice to do even more factoring + renaming.
> The lines 2326-2345 can be refactored to a function:
>
>     bool StackRootCollector::report_native_frame_refs(jmethodID method) {
>       _blk->set_context(_thread_tag, _tid, _depth, method);
>       if (_is_top_frame) {
>         // JNI locals for the top frame.
>         assert(_java_thread != nullptr, "sanity");
>         _java_thread->active_handles()->oops_do(_blk);
>         if (_blk->stopped()) {
>           return false;
>         }
>       } else {
>         if (_last_entry_frame != nullptr) {
>           // JNI locals for the entry frame
>           assert(_last_entry_frame->is_entry_frame(), "checking");
>           _last_entry_frame->entry_frame_call_wrapper()->handles()->oops_do(_blk);
>           if (_blk->stopped()) {
>             return false;
>           }
>         }
>       }
>       return true;
>     }
>
> The function `report_stack_refs` can be renamed to `report_java_frame_refs`
> to make the function names more consistent.

JNI local reporting uses this tricky _is_top_frame/_last_entry_frame stuff.
I think it would be better to have it in the main do_frame method for better readability.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185504637

From rkennke at openjdk.org Thu May 4 21:02:05 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 21:02:05 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v71]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 19:35:58 GMT, Daniel D. Daugherty wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Address @dholmes-ora's review comments
>
> src/hotspot/share/runtime/lockStack.inline.hpp line 53:
>
>> 51:   bool is_owning = &JavaThread::cast(thread)->lock_stack() == this;
>> 52:   assert(is_owning == (get_thread() == thread), "is_owning sanity");
>> 53:   return is_owning;
>
> This is going to require a re-test just to make sure that we don't have
> a code path into here from the VMThread when it is doing some
> JVM/TI stuff (again...).

I don't think so.
That code did use JavaThread::cast(thread) before which would have fired. But that means I can leave out the JavaThread::cast() now. Let me do that change.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185508726

From amenkov at openjdk.org Thu May 4 21:04:28 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 21:04:28 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v10]
In-Reply-To:
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

On Thu, 4 May 2023 01:55:28 GMT, Serguei Spitsyn wrote:

>> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   feedback
>
> src/hotspot/share/prims/jvmtiTagMap.cpp line 2893:
>
>> 2891:   HandleMark hm(current_thread);
>> 2892:
>> 2893:   StackChunkFrameStream fs(chunk);
>
> There are ways to avoid using the `StackChunkFrameStream`.
> You can find good examples in the jvmtiEnvBase.cpp.

Fixed

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185510623

From rkennke at openjdk.org Thu May 4 21:10:14 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 21:10:14 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72]
In-Reply-To:
References:
Message-ID:

> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable.
(The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet).
When that happens, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>
> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed.
Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
>  - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>  - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>  - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>  - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>  - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>
> |  | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
> #### Renaissance
>
> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
> | -- | -- | -- | -- | -- | -- | -- |
> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Address @dcubed-ojdk review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/5d5a43dd..e06c5ef1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=71
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=70-71

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From amenkov at openjdk.org Thu May 4 21:10:28 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 21:10:28 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v12]
In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

> The fix updates JVMTI FollowReferences implementation to report references from virtual threads:
> - unmounted vthreads are detected, and their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL;
> - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth;
> - common code to handle stack frames is moved into a separate class;
>
> Threads are reported as:
> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before);
> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER;
> - unmounted vthreads: not reported as heap roots.
Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:

  indent

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13254/files
  - new: https://git.openjdk.org/jdk/pull/13254/files/930f0d0c..0989d0b8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=11
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=10-11

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/13254.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254

PR: https://git.openjdk.org/jdk/pull/13254

From rkennke at openjdk.org Thu May 4 21:11:18 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 21:11:18 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v66]
In-Reply-To:
References: <-Kq6LaQmYZC8PVnmA4IH6QflBHwDB8__ovkqWOGFjeE=.451a7a23-578d-4b7f-b55d-74759c2cc446@github.com>
Message-ID:

On Fri, 28 Apr 2023 19:01:41 GMT, Roman Kennke wrote:

>> This project is currently baselined on jdk-21+21-1701. However, that build-ID
>> contains very noisy test failures in Tier[234] and probably higher. If you could
>> rebase on:
>>
>> jiefu: [452cb8 - OpenJDK](https://orahub.oci.oraclecorp.com/jpg-mirrors/jdk-open/commit/452cb8432f4d45c3dacd4415bc9499ae73f7a17c)
>> [8307103 ](http://bugs.openjdk.java.net/browse/JDK-8307103) Two TestMetaspaceAllocationMT tests fail after JDK-8306696
>>
>> That would make my next Mach5 test cycle much, much happier...
>
>> http://bugs.openjdk.java.net/browse/JDK-8307103
>
> Should be based on JDK-8307103 now. Thanks for all your testing!

> @rkennke - Please resolve the conversations that we are done with. Thanks!

I just went over the complete history of this PR and closed conversations that have been addressed - which I believe are all of them.

Are we finally approaching the finish-line? (Wow what a long-running PR. Including its predecessors this is more than a year in the making.)
------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535414579 From dcubed at openjdk.org Thu May 4 21:22:05 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 4 May 2023 21:22:05 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v71] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 20:59:58 GMT, Roman Kennke wrote: >> src/hotspot/share/runtime/lockStack.inline.hpp line 53: >> >>> 51: bool is_owning = &JavaThread::cast(thread)->lock_stack() == this; >>> 52: assert(is_owning == (get_thread() == thread), "is_owning sanity"); >>> 53: return is_owning; >> >> This is going to require a re-test just to make sure that we don't have >> a code path into here from the VMThread when it is doing some >> JVM/TI stuff (again...). > > I don't think so. That code did use JavaThread::cast(thread) before which would have fired. But that means I can leave out the JavaThread::cast() now. Let me do that change. Agreed! I read thru the diffs so fast I missed the "JavaThread::cast(thread)" part of this: > bool is_self = &JavaThread::cast(thread)->lock_stack() == this; I'll still do a round of testing on v70 just because more runs are better for shaking out anything that might be racy... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1185529131 From dcubed at openjdk.org Thu May 4 21:32:12 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 4 May 2023 21:32:12 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 21:10:14 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. 
That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). >> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. >> >> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typcially remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is fixed size, currently with 8 elements. 
According to my experiments with various workloads, this covers the vast majority of workloads (in-fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. >> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. 
IOW, uncontended use of Vector, StringBuffer, etc. as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now always be done in the fast-path, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR.
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
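The lock-stack bookkeeping described in the quoted PR text can be illustrated with a small toy model. This is only a sketch in plain Java, not HotSpot code: the names `FastLockSketch`, `Obj`, `Mark`, `LockStack`, `enter` and `exit` are hypothetical, the header is reduced to a three-state enum, and the monitor side (owner field, ANONYMOUS_OWNER fix-up, waiting) is deliberately left out. It assumes the behaviour stated in the description: a fixed capacity of 8, a linear scan to answer 'does the current thread own me?', and inflation on overflow, recursion or contention.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of the lock-stack bookkeeping. Hypothetical names; not HotSpot code. */
final class FastLockSketch {

    /** Stand-in for the low mark-word bits; the real bit encoding is not modeled. */
    enum Mark { UNLOCKED, FAST_LOCKED, INFLATED }

    /** Stand-in for an object and its header. */
    static final class Obj {
        final AtomicReference<Mark> mark = new AtomicReference<>(Mark.UNLOCKED);
    }

    /** Fixed-size, per-thread stack of the objects that thread has fast-locked. */
    static final class LockStack {
        static final int CAPACITY = 8; // "currently with 8 elements"
        private final Obj[] elems = new Obj[CAPACITY];
        private int top = 0;

        boolean isFull() { return top == CAPACITY; }

        int size() { return top; }

        /** "Does the current thread own me?": a linear scan by identity. */
        boolean contains(Obj o) {
            for (int i = 0; i < top; i++) {
                if (elems[i] == o) {
                    return true;
                }
            }
            return false;
        }

        void push(Obj o) { elems[top++] = o; }

        /** Removes o if present, keeping the remaining entries densely packed. */
        void remove(Obj o) {
            for (int i = 0; i < top; i++) {
                if (elems[i] == o) {
                    System.arraycopy(elems, i + 1, elems, i, top - i - 1);
                    elems[--top] = null;
                    return;
                }
            }
        }
    }

    /** Models monitorenter for the thread owning {@code stack}; returns the resulting header state. */
    static Mark enter(LockStack stack, Obj o) {
        // Fast path: there is room on the lock-stack and the CAS from UNLOCKED succeeds.
        if (!stack.isFull() && o.mark.compareAndSet(Mark.UNLOCKED, Mark.FAST_LOCKED)) {
            stack.push(o);
            return Mark.FAST_LOCKED;
        }
        // Slow path: overflow, recursion or contention inflates to a monitor.
        // On a recursive enter the object leaves this thread's lock-stack;
        // the monitor itself is not modeled in this sketch.
        stack.remove(o);
        o.mark.set(Mark.INFLATED);
        return Mark.INFLATED;
    }

    /** Models monitorexit for a fast-locked object; returns false if the monitor path would be needed. */
    static boolean exit(LockStack stack, Obj o) {
        if (o.mark.get() == Mark.FAST_LOCKED && stack.contains(o)) {
            stack.remove(o);
            o.mark.set(Mark.UNLOCKED);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LockStack stack = new LockStack();
        Obj a = new Obj();
        System.out.println(enter(stack, a)); // prints FAST_LOCKED
        System.out.println(enter(stack, a)); // prints INFLATED (recursive enter)
    }
}
```

The point of the sketch is the shape of the data structure: ownership lives in a small dense array per thread rather than in the upper header bits, so the header only ever needs to carry the lock state (or, once inflated, the monitor pointer).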
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>> #### Renaissance
>>
>> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
>> | -- | -- | -- | -- | -- | -- | -- |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Address @dcubed-ojdk review comments

I've done a couple of crawl thru reviews of everything except for RISC-V and I think the code is in great shape. I'm doing yet another round of Mach5 testing on v70 (with fast-locking as default and with stack-locking as default). I'll post Mach5 results in another comment.

I think we are nearing the finish line. A couple of things:

- zero builds are still failing in the Oracle CI; can you check out zero builds on your end?
- Eric Caspole has been running perf testing in Oracle perf lab; when did you last re-run your perf testing?
- I'm still checking with Oracle reviewers to make sure they have made a final pass.

I'm probably forgetting something, but if I think of anything else, I'll let you know.

-------------

Marked as reviewed by dcubed (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/10907#pullrequestreview-1413949517
PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535434846

From amenkov at openjdk.org Thu May 4 21:35:17 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 21:35:17 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v13]
In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

> The fix updates JVMTI FollowReferences implementation to report references from virtual threads:
> - unmounted vthreads are detected and their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL;
> - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth;
> - common code to handle stack frames is moved into a separate class;
>
> Threads are reported as:
> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before);
> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER;
> - unmounted vthreads: not reported as heap roots.

Alex Menkov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision:

- Merge branch 'openjdk:master' into vthread_follow_ref
- indent
- jvmtiTagMap refactoring
- feedback
- Added "no continuations" test case
- mounted VTs reported as OTHER, unmounted VTs are not reported as roots
- Fixed indent in collect_vthread_stack_roots
- removed full heap scan.
  unmounted VT are not considered roots and reported only from references
- Use atomic for synchronization
- trailing spaces
- ... and 10 more: https://git.openjdk.org/jdk/compare/463afe09...1d01ff11

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/13254/files
- new: https://git.openjdk.org/jdk/pull/13254/files/0989d0b8..1d01ff11

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=12
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=11-12

Stats: 320341 lines in 3169 files changed: 273731 ins; 26090 del; 20520 mod
Patch: https://git.openjdk.org/jdk/pull/13254.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254

PR: https://git.openjdk.org/jdk/pull/13254

From coleenp at openjdk.org Thu May 4 21:49:13 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 4 May 2023 21:49:13 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 21:10:14 GMT, Roman Kennke wrote:

> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Address @dcubed-ojdk review comments

Do you have GHA configured?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535452342

From rkennke at openjdk.org Thu May 4 21:58:05 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 21:58:05 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 21:25:50 GMT, Daniel D. Daugherty wrote:

> I think we are nearing the finish line. A couple of things:
>
> - zero builds are still failing in the Oracle CI; can you check out zero builds on your end?

I've been wondering about those too. I just built zero 64 and 32 bit locally without issues, tomorrow I will experiment some more and check if anything sticks out in Zero code.

> - Eric Caspole has been running perf testing in Oracle perf lab; when did you last re-run your perf testing?

It's been a while, last time when I switched to fixed-sized lock-Stack. I haven't re-run perf tests since then because I have not changed anything that seemed substantial.

> - I'm still checking with Oracle reviewers to make sure they have made a final pass.

Perfect, thank you so much!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535458878

From rkennke at openjdk.org Thu May 4 21:58:06 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 21:58:06 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 21:46:02 GMT, Coleen Phillimore wrote:

> Do you have GHA configured?

Yes I do. Why? (Btw, GHA does Zero builds too and they're looking ok.)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535460100

From dcubed at openjdk.org Thu May 4 22:14:00 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Thu, 4 May 2023 22:14:00 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 21:10:14 GMT, Roman Kennke wrote:

> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Address @dcubed-ojdk review comments

I have a Tier3 test failure:

https://bugs.openjdk.org/browse/JDK-8291555?focusedCommentId=14579239&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14579239

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535476128

From coleenp at openjdk.org Thu May 4 22:39:19 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 4 May 2023 22:39:19 GMT
Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741
Message-ID:

The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries.

Tested with JVMTI and JDI tests locally, and tier1-4 tests.

-------------

Commit messages:
- put back the comment for put.
- 8306843: JVMTI tag map extremely slow after JDK-8292741

Changes: https://git.openjdk.org/jdk/pull/13818/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13818&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8306843
Stats: 326 lines in 8 files changed: 242 ins; 41 del; 43 mod
Patch: https://git.openjdk.org/jdk/pull/13818.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13818/head:pull/13818

PR: https://git.openjdk.org/jdk/pull/13818

From dholmes at openjdk.org Thu May 4 22:51:12 2023
From: dholmes at openjdk.org (David Holmes)
Date: Thu, 4 May 2023 22:51:12 GMT
Subject: RFR: 8250596: Update remaining manpage references from "OS X" to "macOS"
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 15:50:02 GMT, Adam Sotona wrote:

> Most of the manpages were updated a few years ago but some references remain.
> This patch renames remaining references to "macOS".
>
> Please review.
>
> Thanks,
> Adam

Looks good.

-------------

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13807#pullrequestreview-1414016697

From amenkov at openjdk.org Thu May 4 23:20:21 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 23:20:21 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v14]
In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

> The fix updates JVMTI FollowReferences implementation to report references from virtual threads:
> - unmounted vthreads are detected and their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL;
> - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth;
> - common code to handle stack frames is moved into a separate class;
>
> Threads are reported as:
> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before);
> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER;
> - unmounted vthreads: not reported as heap roots.

Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:

  Updated test

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/13254/files
- new: https://git.openjdk.org/jdk/pull/13254/files/1d01ff11..ac38c44e

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=13
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=12-13

Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/13254.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254

PR: https://git.openjdk.org/jdk/pull/13254

From amenkov at openjdk.org Thu May 4 23:20:26 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 4 May 2023 23:20:26 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9]
In-Reply-To:
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

On Tue, 2 May 2023 09:46:30 GMT, Serguei Spitsyn wrote:

>> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Added "no continuations" test case
>
> test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/VThreadStackRefTest.java line 38:
>
>> 36:  * @test id=no-vmcontinuations
>> 37:  * @requires vm.jvmti
>> 38:  * @enablePreview
>
> We do not @enablePreview at lines 28 and 38 anymore.
fixed

> test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/VThreadStackRefTest.java line 41:
>
>> 39:  * @run main/othervm/native
>> 40:  *      -XX:+UnlockExperimentalVMOptions -XX:-VMContinuations
>> 41:  *      -Djdk.virtualThreadScheduler.parallelism=1
>
> Why do we need the line 41 in this case?

not needed. removed.

> test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/VThreadStackRefTest.java line 208:
>
>> 206:
>> 207:     private static void verifyVthreadMounted(Thread t, boolean expectedMounted) {
>> 208:         // Hucky, but simple.
>
> Nit: Hucky => Hacky ?

Fixed

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185593295
PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185593199
PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185593067

From sspitsyn at openjdk.org Fri May 5 00:08:14 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Fri, 5 May 2023 00:08:14 GMT
Subject: RFR: 8307308: Add serviceability_ttf_virtual group to exclude jvmti tests developed for virtual threads
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 15:45:18 GMT, Leonid Mesnik wrote:

> Please review the following trivial fix which adds the serviceability_ttf_virtual test group.
> There are several directories with jvmti tests developed for testing virtual threads. It doesn't make sense to run them with the virtual test thread factory. So the group serviceability_ttf_virtual is introduced to run all other svc tests in this mode.

Looks good and trivial.

Thanks,
Serguei

-------------

Marked as reviewed by sspitsyn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13782#pullrequestreview-1414060905 From sspitsyn at openjdk.org Fri May 5 00:39:17 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 00:39:17 GMT Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects In-Reply-To: References: Message-ID: <_xH4KdRJRcDHNkNtyzIFjdO_IiMqyV-DLwFwDqlX4kA=.e964e7a0-14a1-49c7-bc29-128c0f87d419@github.com> On Thu, 4 May 2023 15:12:43 GMT, Leonid Mesnik wrote: > 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects > > caused significant regressions in some benchmarks and should be reverted. > > This fix backout changes and update problemlist bugs to new issue. > Tier1 passed > Running also tier5 to check other builds and more svc testing src/hotspot/share/opto/runtime.hpp line 219: > 217: static address register_finalizer_Java() { return _register_finalizer_Java; } > 218: #if INCLUDE_JVMTI > 219: static address notify_jvmti_object_alloc() { return _notify_jvmti_object_alloc; } This line has to be also removed: `312 static const TypeFunc* notify_jvmti_object_alloc_Type();` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13806#discussion_r1185622347 From sspitsyn at openjdk.org Fri May 5 00:43:15 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 00:43:15 GMT Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects In-Reply-To: References: Message-ID: <3-xnvJQ9SgsTQAMEko8IEp42n7bMnLXQ-xIuv2aGD_c=.11041bc7-d7ae-4d42-be31-09a9c55b6876@github.com> On Thu, 4 May 2023 15:12:43 GMT, Leonid Mesnik wrote: > 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects > > caused significant regressions in some benchmarks and should be reverted. > > This fix backout changes and update problemlist bugs to new issue. 
> Tier1 passed > Running also tier5 to check other builds and more svc testing The `notify_jvmti_object_alloc_Type` declaration needs to be also removed from the runtime.hpp file. Other than that the BACKOUT looks clean. Thanks,. Serguei ------------- PR Review: https://git.openjdk.org/jdk/pull/13806#pullrequestreview-1414075226 From lmesnik at openjdk.org Fri May 5 00:47:26 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 5 May 2023 00:47:26 GMT Subject: Integrated: 8307308: Add serviceability_ttf_virtual group to exclude jvmti tests developed for virtual threads In-Reply-To: References: Message-ID: <_f686zZc3V2JkjsIlXwctYgdozcWbzwEbnnX9TbYAYo=.11ee4cfc-0600-48d5-8b72-6dfab8ba2506@github.com> On Wed, 3 May 2023 15:45:18 GMT, Leonid Mesnik wrote: > Please review following trivial fix which add serviceability_ttf_virtual test group. > There are several directories with jvmti tests developed for testing virtual threads. It does't make sense to run them with virtual test thread factory. So the group serviceability_ttf_virtual is introduced to run all other svc test in this mode. This pull request has now been integrated. Changeset: a44e8908 Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/a44e8908a1007365f7c016df65ce7722556c180a Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod 8307308: Add serviceability_ttf_virtual group to exclude jvmti tests developed for virtual threads Reviewed-by: sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13782 From lmesnik at openjdk.org Fri May 5 00:58:12 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 5 May 2023 00:58:12 GMT Subject: RFR: 8307370: Add tier1 testing with thread factory in CI Message-ID: This fix just excludes a few hotspot/jdk tests which are not compatible with test thread factory. So `make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all JTREG_TEST_THREAD_FACTORY=Virtual TEST=:tier1` could be executed clearly. 
------------- Commit messages: - 8307370: Add tier1 testing with thread factory in CI Changes: https://git.openjdk.org/jdk/pull/13820/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13820&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307370 Stats: 36 lines in 2 files changed: 36 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13820.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13820/head:pull/13820 PR: https://git.openjdk.org/jdk/pull/13820 From lmesnik at openjdk.org Fri May 5 01:06:09 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 5 May 2023 01:06:09 GMT Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects [v2] In-Reply-To: References: Message-ID: > 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects > > caused significant regressions in some benchmarks and should be reverted. > > This fix backout changes and update problemlist bugs to new issue. 
> Tier1 passed > Running also tier5 to check other builds and more svc testing Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: removed notify_jvmti_object_alloc_Type line ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13806/files - new: https://git.openjdk.org/jdk/pull/13806/files/72e42170..fed4d98a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13806&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13806&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13806.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13806/head:pull/13806 PR: https://git.openjdk.org/jdk/pull/13806 From sspitsyn at openjdk.org Fri May 5 01:25:21 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 01:25:21 GMT Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects [v2] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 01:06:09 GMT, Leonid Mesnik wrote: >> 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects >> >> caused significant regressions in some benchmarks and should be reverted. >> >> This fix backout changes and update problemlist bugs to new issue. >> Tier1 passed >> Running also tier5 to check other builds and more svc testing > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > removed notify_jvmti_object_alloc_Type line Marked as reviewed by sspitsyn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13806#pullrequestreview-1414092133 From fyang at openjdk.org Fri May 5 02:00:29 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 5 May 2023 02:00:29 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 11:44:14 GMT, Stefan Karlsson wrote: >> Hi all, >> >> Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. >> >> The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. 
We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. >> >> Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach.
>> >> Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published. The patches that could be published separately are: >> >> * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class >> * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject >> * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp >> * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics >> * a2824734d23 UPSTREAM: lir_xchg >> * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI >> * 447259cea42 UPSTREAM: assembler_ppc ANDI >> * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure >> >> Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: >> >> >> git fetch https://github.com/openjdk/zgc zgc_master >> git diff zgc_master... >> >> >> There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. 
>> >> Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > undefine glibc major/minor macros test/hotspot/gtest/gc/z/test_zForwarding.cpp line 68: > 66: > 67: bool reserved = os::attempt_reserve_memory_at((char*)ZAddressHeapBase, ZGranuleSize, false /* executable */); > 68: ASSERT_TRUE(reserved); Hi, Thanks for the great work! I have performed some tests on linux-riscv64 Hifive Unmatched board. So far, I only witnessed one gtest failure: $ make test TEST=gtest:ZForwardingTest Building target 'test' in configuration 'linux-riscv64-server-release' Test selection 'gtest:ZForwardingTest', will run: * gtest:ZForwardingTest/server Running test 'gtest:ZForwardingTest/server' Note: Google Test filter = ZForwardingTest* [==========] Running 4 tests from 1 test suite. [----------] Global test environment set-up. [----------] 4 tests from ZForwardingTest [ RUN ] ZForwardingTest.setup_vm test/hotspot/gtest/gc/z/test_zForwarding.cpp:68: Failure Value of: reserved Actual: false Expected: true [ FAILED ] ZForwardingTest.setup_vm (0 ms) [ RUN ] ZForwardingTest.find_empty_vm [ OK ] ZForwardingTest.find_empty_vm (1 ms) [ RUN ] ZForwardingTest.find_full_vm [ OK ] ZForwardingTest.find_full_vm (8 ms) [ RUN ] ZForwardingTest.find_every_other_vm [ OK ] ZForwardingTest.find_every_other_vm (0 ms) [----------] 4 tests from ZForwardingTest (761 ms total) [----------] Global test environment tear-down ERROR: RUN_ALL_TESTS() failed. Error 1 [==========] 4 tests from 1 test suite ran. (762 ms total) [ PASSED ] 3 tests. 
[ FAILED ] 1 test, listed below: [ FAILED ] ZForwardingTest.setup_vm 1 FAILED TEST Finished running test 'gtest:ZForwardingTest/server' Test report is stored in build/linux-riscv64-server-release/test-results/gtest_ZForwardingTest_server ============================== Test summary ============================== TEST TOTAL PASS FAIL ERROR >> gtest:ZForwardingTest/server 4 3 1 0 << ============================== TEST FAILURE The gtest failed this assertion where 'reserved' return by function os::attempt_reserve_memory_at is false. I find the reason is that the mmap call at the bottom returns a different address instead of the requested one (ZAddressHeapBase). I think that is possible since we are not sure if the requested address is available before the mmap call, right? So I guess we might need some changes here for this gtest. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185645071 From sspitsyn at openjdk.org Fri May 5 02:16:15 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 02:16:15 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 In-Reply-To: References: Message-ID: On Thu, 4 May 2023 22:32:36 GMT, Coleen Phillimore wrote: > The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which is also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries. > > Tested with JVMTI and JDI tests locally, and tier1-4 tests. 
src/hotspot/share/utilities/resizeableResourceHash.hpp line 91: > 89: // Calculate next "good" hashtable size based on requested count > 90: int calculate_resize(bool use_large_table_sizes) const { > 91: const int resize_factor = 2.0; // by how much we will resize using current number of entries Nit: extra spaces before the '=' sign. Q: Why is a FP constant assigned to the integer variable? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1185650312 From sspitsyn at openjdk.org Fri May 5 02:23:14 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 02:23:14 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 In-Reply-To: References: Message-ID: On Thu, 4 May 2023 22:32:36 GMT, Coleen Phillimore wrote: > The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which is also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries. > > Tested with JVMTI and JDI tests locally, and tier1-4 tests. Thank you for taking care of this performance issue! I've posted a couple of comments but am still looking at it. It is hard to make sure the changes are fully correct. src/hotspot/share/utilities/resourceHash.hpp line 234: > 232: if (node != nullptr) { > 233: *ptr = node->_next; > 234: bool cont = function(node->_key, node->_value); Q: The local `cont` is not used. Just wanted to check if anything is missed here. Also, what does this name mean? Should it be named `cond` instead?
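For readers following the thread, here is a minimal sketch of the ideas in the quoted description: a chained hash table with a checking put_if_absent, a put_fast that prepends to the bucket without scanning it, and a resize step. Everything in it is hypothetical (the class name TagTable, the load-factor threshold of 2, doubling the bucket count, relinking nodes instead of copying entries); it is not the HotSpot ResourceHashtable code and may differ from what the patch actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical illustration only; names and policy are NOT taken from HotSpot.
class TagTable {
  struct Node {
    std::uint64_t key;
    long value;
    Node* next;
  };

  std::vector<Node*> buckets_;
  std::size_t count_ = 0;

  static std::size_t index_for(std::uint64_t key, std::size_t nbuckets) {
    return std::hash<std::uint64_t>{}(key) % nbuckets;
  }

 public:
  explicit TagTable(std::size_t initial_buckets = 8)
      : buckets_(initial_buckets, nullptr) {
    assert(initial_buckets > 0);
  }
  TagTable(const TagTable&) = delete;
  TagTable& operator=(const TagTable&) = delete;

  ~TagTable() {
    for (Node* head : buckets_) {
      while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
      }
    }
  }

  std::size_t size() const { return count_; }
  std::size_t bucket_count() const { return buckets_.size(); }

  // Walks one bucket chain; returns nullptr when the key is absent.
  const long* get(std::uint64_t key) const {
    for (Node* n = buckets_[index_for(key, buckets_.size())]; n != nullptr; n = n->next) {
      if (n->key == key) return &n->value;
    }
    return nullptr;
  }

  // Scans the chain first, so the cost grows with the chain length.
  bool put_if_absent(std::uint64_t key, long value) {
    if (get(key) != nullptr) return false;
    put_fast(key, value);
    return true;
  }

  // Prepends without scanning. The caller must know the key is absent.
  void put_fast(std::uint64_t key, long value) {
    Node*& head = buckets_[index_for(key, buckets_.size())];
    head = new Node{key, value, head};
    ++count_;
  }

  // Doubles the bucket count once the average chain length exceeds 2
  // (an assumed threshold) and relinks the existing nodes.
  bool maybe_resize() {
    if (count_ <= 2 * buckets_.size()) return false;
    std::vector<Node*> grown(buckets_.size() * 2, nullptr);
    for (Node* head : buckets_) {
      while (head != nullptr) {
        Node* next = head->next;
        Node*& slot = grown[index_for(head->key, grown.size())];
        head->next = slot;
        slot = head;
        head = next;
      }
    }
    buckets_.swap(grown);
    return true;
  }
};
```

Without the resize step the chains grow linearly with the number of tags, so every checking insert or lookup gets slower as the table fills, which is consistent with the slowdown the bug title describes.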
------------- PR Review: https://git.openjdk.org/jdk/pull/13818#pullrequestreview-1414110945 PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1185651139 From sspitsyn at openjdk.org Fri May 5 03:36:16 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 03:36:16 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:26:33 GMT, Andrew Dinn wrote: > This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). Looks good. Thank you for taking care about it! Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13795#pullrequestreview-1414138368 From sspitsyn at openjdk.org Fri May 5 03:42:15 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 03:42:15 GMT Subject: RFR: 8307370: Add tier1 testing with thread factory in CI In-Reply-To: References: Message-ID: On Fri, 5 May 2023 00:47:58 GMT, Leonid Mesnik wrote: > This fix just excludes a few hotspot/jdk tests which are not compatible with test thread factory. So > `make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all JTREG_TEST_THREAD_FACTORY=Virtual TEST=:tier1` > could be executed clearly. test/jdk/ProblemList-Virtual.txt line 54: > 52: > 53: ########## > 54: ## Tests incompatible with with virtual test thread factory A typo: "with with". Dot is missed at the end. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13820#discussion_r1185672766 From sspitsyn at openjdk.org Fri May 5 03:49:14 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 03:49:14 GMT Subject: RFR: 8307370: Add tier1 testing with thread factory in CI In-Reply-To: References: Message-ID: On Fri, 5 May 2023 00:47:58 GMT, Leonid Mesnik wrote: > This fix just excludes a few hotspot/jdk tests which are not compatible with test thread factory. So > `make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all JTREG_TEST_THREAD_FACTORY=Virtual TEST=:tier1` > could be executed clearly. The update looks okay but I've posted some nits. Thanks, Serguei test/hotspot/jtreg/ProblemList-Virtual.txt line 135: > 133: ## Tests incompatible with with virtual test thread factory > 134: ## There is no goal to run all test with virtual test thread factory > 135: ## So any test migth be added as incompatible, the A typo at line 133: "with with". A dot is missing at the end. It seems both statements at 134-135 are incomplete. The first is missing a dot at the end; the second ends with "the" and has no dot either. test/jdk/ProblemList-Virtual.txt line 56: > 54: ## Tests incompatible with with virtual test thread factory > 55: ## There is no goal to run all test with virtual test thread factory > 56: ## So any test migth be added as incompatible, the A typo at line 54: "with with". A dot is missing at the end. It seems both statements at 55-56 are incomplete. The first is missing a dot at the end; the second ends with "the" and has no dot either. ------------- Marked as reviewed by sspitsyn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13820#pullrequestreview-1414143094 PR Review Comment: https://git.openjdk.org/jdk/pull/13820#discussion_r1185674577 PR Review Comment: https://git.openjdk.org/jdk/pull/13820#discussion_r1185673789 From stefank at openjdk.org Fri May 5 05:13:04 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 May 2023 05:13:04 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: <_UHP565f9Io3v9rWWDf0HGRhhtNoniDhbM_XEM-2w1c=.f7cb7bae-5837-42ff-9491-284093ba4c75@github.com> References: <_UHP565f9Io3v9rWWDf0HGRhhtNoniDhbM_XEM-2w1c=.f7cb7bae-5837-42ff-9491-284093ba4c75@github.com> Message-ID: On Thu, 4 May 2023 20:21:12 GMT, Andrey Turbanov wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> undefine glibc major/minor macros > > test/hotspot/jtreg/runtime/stringtable/StringTableCleaningTest.java line 117: > >> 115: return gcEndPrefix + g1Suffix; >> 116: } else if (GC.Z.isSelected()) { >> 117: return gcEndPrefix + "(" + zEndSuffix + ")|(" + xEndSuffix + ")"; > > nit > Suggestion: > > return gcEndPrefix + "(" + zEndSuffix + ")|(" + xEndSuffix + ")"; Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185701989 From stefank at openjdk.org Fri May 5 05:12:59 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 May 2023 05:12:59 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v7] In-Reply-To: References: Message-ID: > Hi all, > > Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. 
The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. > > The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. > > Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before.
When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach. > > Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published.
The patches that could be published separately are: > > * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class > * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject > * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp > * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics > * a2824734d23 UPSTREAM: lir_xchg > * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI > * 447259cea42 UPSTREAM: assembler_ppc ANDI > * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure > > Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: > > > git fetch https://github.com/openjdk/zgc zgc_master > git diff zgc_master... > > > There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. > > Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. 
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Whitespace nit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13771/files - new: https://git.openjdk.org/jdk/pull/13771/files/c9f6257b..c4217280 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771 From stefank at openjdk.org Fri May 5 05:20:30 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 May 2023 05:20:30 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 01:54:48 GMT, Fei Yang wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> undefine glibc major/minor macros > > test/hotspot/gtest/gc/z/test_zForwarding.cpp line 68: > >> 66: >> 67: bool reserved = os::attempt_reserve_memory_at((char*)ZAddressHeapBase, ZGranuleSize, false /* executable */); >> 68: ASSERT_TRUE(reserved); > > Hi, > Thanks for the great work! > I have performed some tests on linux-riscv64 Hifive Unmatched board. So far, I only witnessed one gtest failure: > > > $ make test TEST=gtest:ZForwardingTest > Building target 'test' in configuration 'linux-riscv64-server-release' > Test selection 'gtest:ZForwardingTest', will run: > * gtest:ZForwardingTest/server > > Running test 'gtest:ZForwardingTest/server' > Note: Google Test filter = ZForwardingTest* > [==========] Running 4 tests from 1 test suite. > [----------] Global test environment set-up. 
> [----------] 4 tests from ZForwardingTest > [ RUN ] ZForwardingTest.setup_vm > test/hotspot/gtest/gc/z/test_zForwarding.cpp:68: Failure > Value of: reserved > Actual: false > Expected: true > [ FAILED ] ZForwardingTest.setup_vm (0 ms) > [ RUN ] ZForwardingTest.find_empty_vm > [ OK ] ZForwardingTest.find_empty_vm (1 ms) > [ RUN ] ZForwardingTest.find_full_vm > [ OK ] ZForwardingTest.find_full_vm (8 ms) > [ RUN ] ZForwardingTest.find_every_other_vm > [ OK ] ZForwardingTest.find_every_other_vm (0 ms) > [----------] 4 tests from ZForwardingTest (761 ms total) > > [----------] Global test environment tear-down > ERROR: RUN_ALL_TESTS() failed. Error 1 > [==========] 4 tests from 1 test suite ran. (762 ms total) > [ PASSED ] 3 tests. > [ FAILED ] 1 test, listed below: > [ FAILED ] ZForwardingTest.setup_vm > > 1 FAILED TEST > Finished running test 'gtest:ZForwardingTest/server' > Test report is stored in build/linux-riscv64-server-release/test-results/gtest_ZForwardingTest_server > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR >>> gtest:ZForwardingTest/server 4 3 1 0 << > ============================== > TEST FAILURE > > > The gtest failed this assertion where 'reserved' return by function os::attempt_reserve_memory_at is false. > I find the reason is that the mmap call at the bottom returns a different address instead of the requested one (ZAddressHeapBase). I think that is possible since we are not sure if the requested address is available before the mmap call, right? So I guess we might need some changes here for this gtest. Thanks for reporting. It would be interesting to see what address you get and compare it to the range [ZAddressHeapBase, ZAddressHeapBase+ZAddressOffsetMax). 
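A minimal sketch of the failure mode discussed above, assuming (as Fei describes) that the reservation ends up in an mmap call that treats the requested address as a hint rather than a fixed mapping. The helpers reserve_with_hint, try_reserve_exactly_at and in_range are hypothetical names for illustration, not HotSpot functions; the flags are the usual POSIX/Linux ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: reserve address space, passing `hint` to the kernel.
// Without MAP_FIXED the kernel is free to return a different address.
static void* reserve_with_hint(void* hint, std::size_t size) {
  void* p = mmap(hint, size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: succeed only if the mapping landed exactly at
// `requested`. On a mismatch the mapping is released again; `actual` (if
// given) still reports where the kernel placed it, for diagnostics only.
static bool try_reserve_exactly_at(void* requested, std::size_t size,
                                   void** actual = nullptr) {
  void* p = reserve_with_hint(requested, size);
  if (actual != nullptr) *actual = p;
  if (p == nullptr) return false;
  if (p == requested) return true;   // caller now owns [requested, requested + size)
  munmap(p, size);
  return false;
}

// True if addr lies in the half-open range [base, base + size).
static bool in_range(const void* addr, const void* base, std::size_t size) {
  const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(addr);
  const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(base);
  return a >= b && a - b < size;
}
```

Logging the value reported through `actual` on a failed attempt and checking it with in_range against the expected heap range is one way to collect the data asked for above.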
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185707639 From rkennke at openjdk.org Fri May 5 05:54:29 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 05:54:29 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v73] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overloading of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). > > What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. > > This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?'
is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. > > The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. > > In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread. > > As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions.
All of them use old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc. as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change makes it possible to simplify (and speed up!) a lot of code: > > - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. > - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly; for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR. > > Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
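To make the lock-stack description above more concrete, here is a minimal sketch of a per-thread, fixed-capacity stack of object references. It is an illustration under stated assumptions, not the code in this PR: `LockStackSketch`, `LockResult` and `Oop` are hypothetical names, the CAS on the object header is left out entirely, and the only behaviour modelled is what the text states (8 slots, a linear scan for ownership, and falling back to a monitor when the stack is full or the lock is recursive).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Oop = const void*;  // stand-in for a real object reference (assumption)

enum class LockResult { FastLocked, NeedsMonitor };

// Hypothetical per-thread lock stack; the header CAS is deliberately omitted.
class LockStackSketch {
 public:
  static constexpr std::size_t kCapacity = 8;  // "currently with 8 elements"

  // 'Does the current thread own me?' answered by a quick linear scan.
  bool owns(Oop obj) const {
    for (std::size_t i = 0; i < top_; ++i) {
      if (slots_[i] == obj) return true;
    }
    return false;
  }

  // Fast path: record the object unless the stack is full or the lock is
  // recursive; in both of those cases the caller would inflate to a monitor.
  LockResult try_fast_lock(Oop obj) {
    if (top_ == kCapacity || owns(obj)) return LockResult::NeedsMonitor;
    slots_[top_++] = obj;
    return LockResult::FastLocked;
  }

  // Balanced unlock: pop the most recently fast-locked object.
  Oop fast_unlock() {
    assert(top_ > 0);
    return slots_[--top_];
  }

  std::size_t depth() const { return top_; }

 private:
  std::array<Oop, kCapacity> slots_{};
  std::size_t top_ = 0;
};
```

Because the array is small and densely packed, the ownership check stays a short scan, and the whole structure could be copied in one step, which is the property the Loom remark above relies on.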
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
> #### Renaissance
>
> | | x86_64 | | | | aarch64 | | |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Relax zapped-entry test when calling thread is not owning thread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/10907/files - new: https://git.openjdk.org/jdk/pull/10907/files/e06c5ef1..43cdbb53 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=72 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=71-72 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/10907.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907 PR: https://git.openjdk.org/jdk/pull/10907
From rkennke at openjdk.org Fri May 5 05:56:53 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 05:56:53 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 22:11:56 GMT, Daniel D. Daugherty wrote: > I have a Tier3 test failure: https://bugs.openjdk.org/browse/JDK-8291555?focusedCommentId=14579239&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14579239 *sigh* This looks relatively harmless, though.
https://github.com/rkennke/jdk/commit/e5afb43cbcc1 added zapping entries and extra verification. This test is (again) coming from the single path that inspects the lock-stack of a foreign thread concurrently. When doing that, we cannot be sure to not observe zapped entries, because the foreign thread may zap as we go. It's actually surprising that we haven't seen this earlier, the change is more than a month old. Fix is to relax the test for this case. I pushed that fix, let's see if we're good now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535732736 From sspitsyn at openjdk.org Fri May 5 05:58:43 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 05:58:43 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Thu, 4 May 2023 20:55:46 GMT, Alex Menkov wrote: >> It'd be nice to do even more factoring + renaming. >> The lines 2326-2345 can be refactored to a function: >> >> bool StackRootCollector::report_native_frame_refs(jmethodID method) { >> _blk->set_context(_thread_tag, _tid, _depth, method); >> if (_is_top_frame) { >> // JNI locals for the top frame. >> assert(_java_thread != nullptr, "sanity"); >> _java_thread->active_handles()->oops_do(_blk); >> if (_blk->stopped()) { >> return false; >> } >> } else { >> if (_last_entry_frame != nullptr) { >> // JNI locals for the entry frame >> assert(_last_entry_frame->is_entry_frame(), "checking"); >> _last_entry_frame->entry_frame_call_wrapper()->handles()->oops_do(_blk); >> if (_blk->stopped()) { >> return false; >> } >> } >> } >> return true; >> } >> >> >> The function `report_stack_refs` can be renamed to `report_java_frame_refs` >> to make function name more consistent. 
> > JNI local reporting uses this tricky _is_top_frame/_last_entry_frame stuff > I think it would be better to have it in the main do_frame method for better readability Sorry, I do not see how this improves readability. Big functions with many layered conditions do not improve readability. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185718941 From sspitsyn at openjdk.org Fri May 5 05:58:49 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 05:58:49 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v14] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: <78w1j8Lxez-jVsUv8nB-StinrbBYPbkvEn5lK5ORvnk=.3e97d792-145d-4b52-a42a-d78c9a1d21a2@github.com> On Thu, 4 May 2023 23:20:21 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. 
> > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Updated test test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 106: > 104: extern "C" JNIEXPORT jint JNICALL > 105: Agent_OnLoad(JavaVM *vm, char *options, void *reserved) { > 106: if (vm->GetEnv(reinterpret_cast(&jvmti), JVMTI_VERSION) != JNI_OK || jvmti == nullptr) { Nit: This line is long and non readable. There are many examples in tests how it is normally done. test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 113: > 111: memset(&capabilities, 0, sizeof(capabilities)); > 112: capabilities.can_tag_objects = 1; > 113: //capabilities.can_support_virtual_threads = 1; The line 113 can be removed. test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 130: > 128: Java_VThreadStackRefTest_test(JNIEnv* env, jclass clazz, jobjectArray classes) { > 129: jsize classesCount = env->GetArrayLength(classes); > 130: for (int i=0; i 152: } > 153: > 154: static void printtCreatedClass(JNIEnv* env, jclass cls) { Why is printt with 'tt' ? test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 167: > 165: > 166: extern "C" JNIEXPORT void JNICALL > 167: Java_VThreadStackRefTest_createObjAndCallback(JNIEnv* env, jclass clazz, jclass cls, jobject callback) { Some comment would be helpful about what this function does. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185720838 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185720066 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185721404 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185722065 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185722636 From sspitsyn at openjdk.org Fri May 5 06:05:25 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 06:05:25 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v14] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Thu, 4 May 2023 23:20:21 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Updated test test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 39: > 37: jint testClassCount; > 38: jint *count; > 39: jlong *threadId; Camel case is the Java naming convention for identifiers. 
Tests normally use camel case only for native methods which are called from Java. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1185723959 From dholmes at openjdk.org Fri May 5 06:25:12 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 May 2023 06:25:12 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v73] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 05:54:29 GMT, Roman Kennke wrote: > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Relax zapped-entry test when calling thread is not owning thread Updates look good to me. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/10907#pullrequestreview-1414228134 From rkennke at openjdk.org Fri May 5 06:27:40 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 06:27:40 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v73] In-Reply-To: References: Message-ID: <0KzemgLK9ws6zT_TXHgHfLhiOgEq65LRTdmRhAcn7bI=.8a0efee2-c8fa-45d2-9b94-930857512d77@github.com> On Fri, 5 May 2023 06:21:18 GMT, David Holmes wrote: > Updates look good to me. Thanks. Nice, thank you! The PR has 4 approvals now. Are we good to go, or should I wait for others to approve? (And if so, who?)
------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535771621 From yadongwang at openjdk.org Fri May 5 06:31:45 2023 From: yadongwang at openjdk.org (Yadong Wang) Date: Fri, 5 May 2023 06:31:45 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 05:17:44 GMT, Stefan Karlsson wrote: >> test/hotspot/gtest/gc/z/test_zForwarding.cpp line 68: >> >>> 66: >>> 67: bool reserved = os::attempt_reserve_memory_at((char*)ZAddressHeapBase, ZGranuleSize, false /* executable */); >>> 68: ASSERT_TRUE(reserved); > > Thanks for reporting. It would be interesting to see what address you get and compare it to the range [ZAddressHeapBase, ZAddressHeapBase+ZAddressOffsetMax). We emailed to erik to discuss this issue two months ago, and maybe he missed it. ZForwardingTest does not guarantee a successful invoke of os::commit_memory for ZAddressHeapBase, and we saw some conflicts between ZAddressHeapBase and the metadata address space on the RISC-V hardware of 39-bits virtual address. There is no failure in the normal initialization phase of JVM, because the commit order of them is guaranteed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185738633 From dholmes at openjdk.org Fri May 5 06:41:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 May 2023 06:41:25 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 22:11:56 GMT, Daniel D.
Daugherty wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Address @dcubed-ojdk review comments > > I have a Tier3 test failure: > https://bugs.openjdk.org/browse/JDK-8291555?focusedCommentId=14579239&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14579239 It would be good to get @dcubed-ojdk 's final thumbs-up on testing first. And perhaps not a good idea to integrate at the end of the week just in case anything goes wrong. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535781897 From dholmes at openjdk.org Fri May 5 06:48:19 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 May 2023 06:48:19 GMT Subject: RFR: 8304074: [JMX] Add an approximation of JVM process allocated bytes In-Reply-To: References: Message-ID: On Thu, 4 May 2023 19:54:57 GMT, Paul Hohensee wrote: > Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads. > > Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead. src/hotspot/share/services/management.cpp line 2102: > 2100: JVM_ENTRY(jlong, jmm_GetAllThreadAllocatedMemory(JNIEnv *env)) > 2101: // There is a race between threads that exit during the loop and calling > 2102: // exited_allocated_bytes. If the result is initialized with exited_allocated_bytes, If you want a stable and accurate value did you consider holding the Threads_lock while you iterate the threads? Or do it as a safepoint VMop? src/hotspot/share/services/management.cpp line 2106: > 2104: // the loop gets to it and thus not be counted. 
If, on the other hand and done > 2105: // here, exited_allocated_bytes is added after the loop, the final result might be > 2106: // "too large" because a thread might be counted twice, once in the loop and agsin typo agsin ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1185748115 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1185746920 From thartmann at openjdk.org Fri May 5 06:48:16 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 May 2023 06:48:16 GMT Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects [v2] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 01:06:09 GMT, Leonid Mesnik wrote: >> 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects >> >> caused significant regressions in some benchmarks and should be reverted. >> >> This fix backout changes and update problemlist bugs to new issue. >> Tier1 passed >> Running also tier5 to check other builds and more svc testing > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > removed notify_jvmti_object_alloc_Type line Looks good. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13806#pullrequestreview-1414249585 From stefank at openjdk.org Fri May 5 06:53:27 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 May 2023 06:53:27 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 06:28:59 GMT, Yadong Wang wrote: >> Thanks for reporting. It would be interesting to see what address you get and compare it to the range [ZAddressHeapBase, ZAddressHeapBase+ZAddressOffsetMax). > > We emailed to erik to discuss this issue two months ago, and maybe he missed it. 
> ZForwardingTest does not guarantee a successful invoke of os::commit_memory for ZAddressHeapBase, and we saw some conflicts between ZAddressHeapBase and the metadata address space on the RISC-V hardware of 39-bits virtual address. There is no failure in the normal initialization phase of JVM, because the commit order of them is guaranteed. Could you provide the values for `reserved`, `ZAddressHeapBase`, and `ZAddressOffsetMax` when this test is failing? I'd like to know if we can make a workaround for you, or if we have to turn off the test for riscv. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185751935 From sspitsyn at openjdk.org Fri May 5 07:22:22 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 07:22:22 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: Message-ID: On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote: >> The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 44: > 42: > 43: /* > 44: * This method will register a cleanup method and creates an instance of Finalizer Nit: `creates` => `create` test/hotspot/jtreg/vmTestbase/nsk/share/LocalProcess.java line 167: > 165: > 166: /** > 167: * This method is called at finalization and calls kill(). Nit: Extra space after `and calls`. test/hotspot/jtreg/vmTestbase/nsk/share/jpda/BindServer.java line 99: > 97: private int busyRequests = 0; > 98: > 99: Nit: Unneeded extra line.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1185769380 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1185770457 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1185772007 From sspitsyn at openjdk.org Fri May 5 07:25:19 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 07:25:19 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: Message-ID: On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote: >> The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests Looks okay to me but posted some nits. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13420#pullrequestreview-1414290300 From sspitsyn at openjdk.org Fri May 5 07:30:14 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 5 May 2023 07:30:14 GMT Subject: RFR: 8250596: Update remaining manpage references from "OS X" to "macOS" In-Reply-To: References: Message-ID: On Thu, 4 May 2023 15:50:02 GMT, Adam Sotona wrote: > Most of the manpages were updated a few years ago but some references remain. > This patch renames remaining references to "macOS". > > Please review. > > Thanks, > Adam Marked as reviewed by sspitsyn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13807#pullrequestreview-1414295424 From stefank at openjdk.org Fri May 5 07:43:17 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 May 2023 07:43:17 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v8] In-Reply-To: References: Message-ID: > Hi all, > > Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. > > The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. 
We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. > > Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach.
> > Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published. The patches that could be published separately are: > > * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class > * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject > * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp > * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics > * a2824734d23 UPSTREAM: lir_xchg > * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI > * 447259cea42 UPSTREAM: assembler_ppc ANDI > * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure > > Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: > > > git fetch https://github.com/openjdk/zgc zgc_master > git diff zgc_master... > > > There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. 
> > Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 917 commits: - ZGC: Generational Co-authored-by: Stefan Karlsson Co-authored-by: Per Liden Co-authored-by: Albert Mingkun Yang Co-authored-by: Erik Österlund Co-authored-by: Axel Boldt-Christmas Co-authored-by: Stefan Johansson - UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class - UPSTREAM: RISCV tmp reg cleanup resolve_jobject - CLEANUP: barrierSetNMethod_aarch64.cpp - UPSTREAM: Add relaxed add&fetch for aarch64 atomics - UPSTREAM: assembler_ppc CMPLI Co-authored-by: TheRealMDoerr - UPSTREAM: assembler_ppc ANDI Co-authored-by: TheRealMDoerr - UPSTREAM: Add VMErrorCallback infrastructure - Merge branch 'zgc_generational' into zgc_generational_rebase_target - Whitespace nit - ... and 907 more: https://git.openjdk.org/jdk/compare/705ad7d8...349cf9ae ------------- Changes: https://git.openjdk.org/jdk/pull/13771/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=07 Stats: 67399 lines in 685 files changed: 58223 ins; 4254 del; 4922 mod Patch: https://git.openjdk.org/jdk/pull/13771.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771 PR: https://git.openjdk.org/jdk/pull/13771 From duke at openjdk.org Fri May 5 07:45:18 2023 From: duke at openjdk.org (Afshin Zafari) Date: Fri, 5 May 2023 07:45:18 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: <3u6wgn1EWik6TaKxwnVXe-Q-QvfDWM2SQo8r3IMn_x4=.0fb4c0c5-09a7-49ee-ad4d-975e0cfc5a2b@github.com> References: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com> <3u6wgn1EWik6TaKxwnVXe-Q-QvfDWM2SQo8r3IMn_x4=.0fb4c0c5-09a7-49ee-ad4d-975e0cfc5a2b@github.com> Message-ID: On Thu, 4 May
2023 17:56:59 GMT, Coleen Phillimore wrote: > To solve the duplicated registerCleanup() cases, the two other classes could extend FinalizableObject then inherit its implementation of registerCleanup(). The other two cases already extend the `Log.Logger` and cannot extend `FinalizableObject`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1185789488 From adinn at openjdk.org Fri May 5 07:53:24 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 5 May 2023 07:53:24 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> References: <0gaGNt2LtzqFieSBotgwCNewyms2JoSmYSMZ6bvvByk=.a6fdbfdd-2e74-49a5-bdf0-fff5db82eb7a@github.com> Message-ID: <5qDVIKPF4EJshtRuBAdbTYEmVuRoTmEpMiGUex9cPFA=.cb393ea8-a29a-41d4-8d90-f4e7d33c0248@github.com> On Thu, 4 May 2023 17:17:19 GMT, Andrew Dinn wrote: >> This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). > > @coleenp @plummercj Any chance of feedback or a review for this patch? > @adinn Looking at the closely related code, is the same problem present for `adjust_exception_table` & `adjust_local_var_table`? Both appear to always reach for the original value from the method, though unlike the line number table, there's no member variable cached in the Relocator for either of them. @DanHeidinga I also thought that at first -- but it turns out the answer is no. Those other cases are different because the relocator directly updates the relevant offsets in the data located at the end of the ConstMethod. 
The line number table needs to be uncompressed and recompressed at each update, possibly ending up with a different size compressed array. So, the changes need to accumulate in a succession of compressed arrays held in the relocator the last of which gets recombined with the new version of the method when the method clone eventually happens. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13795#issuecomment-1535867925 From adinn at openjdk.org Fri May 5 07:53:25 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 5 May 2023 07:53:25 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: References: Message-ID: On Fri, 5 May 2023 03:33:51 GMT, Serguei Spitsyn wrote: >> This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). > > Looks good. > Thank you for taking care about it! > Thanks, > Serguei @sspitsyn Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13795#issuecomment-1535868649 From adinn at openjdk.org Fri May 5 07:53:27 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 5 May 2023 07:53:27 GMT Subject: Integrated: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: References: Message-ID: <9L-Cu_NLcLon68Calx_-XlMd0KYPFVhah1Eg9jF8_ik=.a91d7090-67f1-4de5-a3ef-054557337400@github.com> On Thu, 4 May 2023 09:26:33 GMT, Andrew Dinn wrote: > This small change ensures that repeated bytecode rewrites necessitated by class pool index updates are applied cumulatively when updating the method line number table. The current code applies each change to the original table which means only the last one is applied (and even then with the wrong adjustment). 
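The difference between re-applying each rewrite to the original table and accumulating the rewrites can be sketched as follows. This is only an illustration in Java, not the HotSpot C++ relocator: it uses plain {bci, line} pairs rather than the compressed table, and the class name `LineTableSketch`, its methods, and the rule "shift entries located after the insertion point" are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical sketch; names and the shifting rule are assumptions.
final class LineTableSketch {
    /** Returns a copy of the table with every entry after insertBci shifted by delta. */
    static int[][] adjust(int[][] table, int insertBci, int delta) {
        int[][] out = new int[table.length][];
        for (int i = 0; i < table.length; i++) {
            int bci = table[i][0];
            int line = table[i][1];
            out[i] = new int[] {bci > insertBci ? bci + delta : bci, line};
        }
        return out;
    }

    /** The described problem: every rewrite restarts from the original table. */
    static int[][] applyFromOriginal(int[][] original, int[][] rewrites) {
        int[][] result = original;
        for (int[] r : rewrites) {
            result = adjust(original, r[0], r[1]); // earlier adjustments are lost
        }
        return result;
    }

    /** The described fix: each rewrite is applied to the previous result. */
    static int[][] applyCumulatively(int[][] original, int[][] rewrites) {
        int[][] current = original;
        for (int[] r : rewrites) {
            current = adjust(current, r[0], r[1]);
        }
        return current;
    }

    public static void main(String[] args) {
        int[][] table = {{0, 10}, {4, 11}, {8, 12}};
        // Each rewrite is {insertBci, delta}; the second one is expressed in the
        // coordinates of the already rewritten code.
        int[][] rewrites = {{2, 1}, {6, 2}};
        System.out.println(Arrays.deepToString(applyFromOriginal(table, rewrites)));
        // prints [[0, 10], [4, 11], [10, 12]]
        System.out.println(Arrays.deepToString(applyCumulatively(table, rewrites)));
        // prints [[0, 10], [5, 11], [11, 12]]
    }
}
```

In the first result only the last rewrite is visible, and the middle entry is not shifted at all, which matches the "only the last one is applied (and even then with the wrong adjustment)" description.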
This pull request has now been integrated. Changeset: f94f9577 Author: Andrew Dinn URL: https://git.openjdk.org/jdk/commit/f94f957734355fe112e861d1f2f0b49df20f6b66 Stats: 18 lines in 1 file changed: 17 ins; 0 del; 1 mod 8307331: Correctly update line maps when class redefine rewrites bytecodes Reviewed-by: sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13795 From asotona at openjdk.org Fri May 5 08:40:27 2023 From: asotona at openjdk.org (Adam Sotona) Date: Fri, 5 May 2023 08:40:27 GMT Subject: RFR: 8250596: Update remaining manpage references from "OS X" to "macOS" [v2] In-Reply-To: References: Message-ID: > Most of the manpages were updated a few years ago but some references remain. > This patch renames remaining references to "macOS". > > Please review. > > Thanks, > Adam Adam Sotona has updated the pull request incrementally with one additional commit since the last revision: updated copyright headers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13807/files - new: https://git.openjdk.org/jdk/pull/13807/files/88c2d42c..00e12f4b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13807&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13807&range=00-01 Stats: 7 lines in 7 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/13807.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13807/head:pull/13807 PR: https://git.openjdk.org/jdk/pull/13807 From shade at openjdk.org Fri May 5 08:48:08 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 08:48:08 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 21:54:11 GMT, Roman Kennke wrote: > * zero builds are still failing in the Oracle CI; can you check out zero builds on your end? Can you tell which Zero builds exactly? GHA Zero sanity checks look fine. 
My local Zero builds are fine with `make hotspot`: macosx-aarch64-zero-fastdebug macosx-aarch64-zero-release linux-x86_64-zero-fastdebug linux-x86_64-zero-release ------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1535928669 From asotona at openjdk.org Fri May 5 08:58:54 2023 From: asotona at openjdk.org (Adam Sotona) Date: Fri, 5 May 2023 08:58:54 GMT Subject: Integrated: 8250596: Update remaining manpage references from "OS X" to "macOS" In-Reply-To: References: Message-ID: On Thu, 4 May 2023 15:50:02 GMT, Adam Sotona wrote: > Most of the manpages were updated a few years ago but some references remain. > This patch renames remaining references to "macOS". > > Please review. > > Thanks, > Adam This pull request has now been integrated. Changeset: 3b430b9f Author: Adam Sotona URL: https://git.openjdk.org/jdk/commit/3b430b9f732bc94674bf598c28162e2f5e62bae6 Stats: 23 lines in 7 files changed: 0 ins; 0 del; 23 mod 8250596: Update remaining manpage references from "OS X" to "macOS" Reviewed-by: mullan, cjplummer, dholmes, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13807 From shade at openjdk.org Fri May 5 10:00:07 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 10:00:07 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72] In-Reply-To: References: Message-ID: <8EBEUUaROn5MN8UQHT2ZSlgxd5BurpBiqlSUVDhrwxY=.6b29611b-fa5c-4d70-843b-fd0f05e3a78c@github.com> On Fri, 5 May 2023 08:44:12 GMT, Aleksey Shipilev wrote: > > ``` > > * zero builds are still failing in the Oracle CI; can you check out zero builds on your end? > > ``` > > Can you tell which Zero builds exactly? GHA Zero sanity checks look fine. > > My local Zero builds are fine with `make hotspot`: macosx-aarch64-zero-fastdebug macosx-aarch64-zero-release linux-x86_64-zero-fastdebug linux-x86_64-zero-release Full `make images` for `macosx-aarch64-zero-fastdebug` requires #13827. 
After that, it survives the build with all two `LockingModes`, but not with LockingMode = LM_LIGHTWEIGHT: * For target jdk__optimize_image_exec: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/Users/shipilev/Work/shipilev-jdk/src/hotspot/share/runtime/objectMonitor.cpp:1388), pid=3884, tid=5379 # assert(cur != anon_owner_ptr()) failed: no anon owner here # # JRE version: (21.0) (fastdebug build ) # Java VM: OpenJDK 64-Bit Zero VM (fastdebug 21-internal-adhoc.shipilev.shipilev-jdk, interpreted mode, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64) # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # An error report file with more information is saved as: # /Users/shipilev/Work/shipilev-jdk/make/hs_err_pid3884.log [thread 22019 also had an error] ------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536016350 From azeller at openjdk.org Fri May 5 10:06:16 2023 From: azeller at openjdk.org (Arno Zeller) Date: Fri, 5 May 2023 10:06:16 GMT Subject: RFR: 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS In-Reply-To: <1IZxvcqfq2ljwq789NIFDORCm3aftTaihakfRkWXrcU=.d4b1102c-8d25-4714-a401-703031f04ded@github.com> References: <1IZxvcqfq2ljwq789NIFDORCm3aftTaihakfRkWXrcU=.d4b1102c-8d25-4714-a401-703031f04ded@github.com> Message-ID: On Thu, 4 May 2023 18:11:56 GMT, Chris Plummer wrote: >> Unless this test is run as root, it needs sudo privileges. If it gets them, the test runs fine, but leaves a file with root ownership. So jtreg cannot delete it, and you see errors when "make clean" tries to delete it. >> It's best that we just don't run the test on OSX if sudo privileges. > > Changes look good. Is there a reason why this was not noticed when [JDK-8290687](https://bugs.openjdk.org/browse/JDK-8290687) was filed and fixed last year? 
@plummercj : I tried to find out why we did not see it when JDK-8290687 was fixed but I am unable to find a reason :-(. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13791#issuecomment-1536024939 From coleenp at openjdk.org Fri May 5 12:07:20 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 May 2023 12:07:20 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v2] In-Reply-To: References: Message-ID: <6Jv6JVqGXRI3L_PKDEccnT6fqD5s4VXzD9LOkwt7RWs=.95505a79-eaaf-4ae9-95fa-d0f433f6fdba@github.com> > The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries. > > Tested with JVMTI and JDI tests locally, and tier1-4 tests. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove return variable from remove lambda, fix formatting.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/13818/files - new: https://git.openjdk.org/jdk/pull/13818/files/e5e04907..60463042 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13818&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13818&range=00-01 Stats: 6 lines in 3 files changed: 0 ins; 3 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13818.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13818/head:pull/13818 PR: https://git.openjdk.org/jdk/pull/13818 From coleenp at openjdk.org Fri May 5 12:07:21 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 May 2023 12:07:21 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 In-Reply-To: References: Message-ID: On Thu, 4 May 2023 22:32:36 GMT, Coleen Phillimore wrote: > The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries. > > Tested with JVMTI and JDI tests locally, and tier1-4 tests. Serguei, thank you for doing a first pass.
------------- PR Comment: https://git.openjdk.org/jdk/pull/13818#issuecomment-1536157769 From coleenp at openjdk.org Fri May 5 12:07:24 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 May 2023 12:07:24 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v2] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 02:13:32 GMT, Serguei Spitsyn wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove return variable from remove lambda, fix formatting. > > src/hotspot/share/utilities/resizeableResourceHash.hpp line 91: > >> 89: // Calculate next "good" hashtable size based on requested count >> 90: int calculate_resize(bool use_large_table_sizes) const { >> 91: const int resize_factor = 2.0; // by how much we will resize using current number of entries > > Nit: extra spaces brefore the '=' sign. > Q: Why is a FP constant assigned to the integer variable? The 2.0 constant and spaces were left over from the old implementation. I just fixed them. > src/hotspot/share/utilities/resourceHash.hpp line 234: > >> 232: if (node != nullptr) { >> 233: *ptr = node->_next; >> 234: bool cont = function(node->_key, node->_value); > > Q: The local `cont` is not used. Just wanted to check if anything is missed here. > Also, what does this name mean? Should it be named `cond` instead? The 'cont' variable was because I cut/pasted the lambda from iterate and in that case means to continue. That's also not needed for 'remove' so I removed the return variable for the lambda function. 
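The two ideas in the PR description, a put that prepends to the bucket list without a lookup and a table that grows as entries are added, can be sketched roughly as below. This is a Java illustration only, not the HotSpot `ResourceHashtable` code: the class `TagTableSketch`, its method names, the initial size of 8, and the "more than two entries per bucket, then double" growth rule are all assumptions, and this sketch relinks nodes on resize instead of copying entries.

```java
// Hypothetical sketch; names, sizes and the growth rule are assumptions.
final class TagTableSketch {
    private static final class Node {
        final long key;
        final long value;
        Node next;

        Node(long key, long value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node[] buckets = new Node[8];
    private int size;

    private static int indexFor(long key, int length) {
        return Math.floorMod(Long.hashCode(key), length);
    }

    /** Walks the bucket first, so the cost grows with the chain length. */
    boolean putIfAbsent(long key, long value) {
        for (Node n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (n.key == key) {
                return false;
            }
        }
        putFast(key, value);
        return true;
    }

    /** Prepends without a lookup; the caller must know the key is absent. */
    void putFast(long key, long value) {
        int i = indexFor(key, buckets.length);
        buckets[i] = new Node(key, value, buckets[i]);
        size++;
        if (size > 2 * buckets.length) {
            resize(2 * buckets.length);
        }
    }

    /** Moves every node into a larger bucket array so chains stay short. */
    private void resize(int newLength) {
        Node[] fresh = new Node[newLength];
        for (Node head : buckets) {
            while (head != null) {
                Node next = head.next;
                int i = indexFor(head.key, newLength);
                head.next = fresh[i];
                fresh[i] = head;
                head = next;
            }
        }
        buckets = fresh;
    }

    long get(long key, long missing) {
        for (Node n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (n.key == key) {
                return n.value;
            }
        }
        return missing;
    }

    int size() {
        return size;
    }

    int bucketCount() {
        return buckets.length;
    }

    public static void main(String[] args) {
        TagTableSketch table = new TagTableSketch();
        for (long k = 0; k < 1000; k++) {
            table.putFast(k, k * 2); // keys are known to be new
        }
        System.out.println(table.size() + " entries in " + table.bucketCount() + " buckets");
    }
}
```

Without the resize step the bucket count would stay fixed, so every `putIfAbsent` and `get` would walk ever longer chains as the number of tags grows.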
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1186010915 PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1186011484 From ron.pressler at oracle.com Fri May 5 12:10:18 2023 From: ron.pressler at oracle.com (Ron Pressler) Date: Fri, 5 May 2023 12:10:18 +0000 Subject: [External] : Re: JEP draft: Integrity and Strong Encapsulation In-Reply-To: References: Message-ID: <4D285258-D117-4978-9174-34573BC05F70@oracle.com> On 4 May 2023, at 21:32, Dan Heidinga wrote: I've read this draft a number of times and each time I struggled with the framing of the problem given Java's success over the past almost 30 years. The old regime worked when: 1. Almost all the runtime was written in C++ (so the fact Java code couldn't really establish invariants didn't matter as much), 2. The JDK evolved at a slow pace, and 3. Java applications were deployed in a particular way. That lasted for a very long time, but all of these are now changing: 1. More and more of the runtime is being written (or rewritten) in Java, 2. The JDK is evolving faster, and 3. New deployment kinds are desired. In light of this new situation, problems are arising due to the old regime, which isn't working so well anymore. As JEP 411 states, the SecurityManager has: * Brittle permission model * Difficult programming model * Poor performance Which translates into a whole lot of cost both for maintainers of the JDK and for all users who must pay the runtime costs related to the SecurityManager (high when enabled, but non-zero always).
Although the SecurityManager has high costs, and is infrequently used at runtime in production, it provides the only way to limit certain capabilities like: * JNI (SecurityManager::checkLink) * Encapsulation (SecurityManager::checkPackageAccess) * Launch new processes (SecurityManager::checkExec) * Reflective access (accessDeclaredMembers, etc) * and others Some of those controls need replacements if the SecurityManager will go away. JNI, surprisingly, is a key one here for large corporations. If I understand correctly, this new Integrity JEP draft aims, amongst other things, to replace the hard to maintain, expensive runtime checks of the SecurityManager with configuration via command line options. This allows those who previously relied on the SecurityManager to continue to control the high-order bits of functionality without imposing a cost on the rest of the ecosystem. It also makes it easier to determine which libraries are relying on the restricted features. Overall, this provides a smoother migration path for users, makes the intention of users very clear (just read the command line vs auditing SecurityManager implementation) and improves performance by shifting these decisions to configuration time rather than paying cost of code complexity and stack walks. I also appreciate the "nudge" being made with this JEP by requiring explicit opt-in to disabling protections versus the previous uphill battle to enable the SecurityManager. It makes for an easier conversation to ask for, e.g., JNI to be enabled for one library on the command line rather than having to deal with all the potential restrictions of the SecurityManager. The relationship between security and integrity is as follows: integrity is a prerequisite to robust security (i.e. security that doesn't require full-program analysis). That's because security depends on maintaining security invariants - e.g. a sensitive method is only ever called after an access check -
and there can be no robust invariants, aka integrity invariants, *of any kind* without integrity. SecurityManager was a security mechanism, and because robust security requires integrity, SecurityManager *also* had to offer integrity. But strong encapsulation isn't a security mechanism. It is an integrity mechanism. As such, it makes it *possible* to build robust security mechanisms, such as an authorisation mechanism, at any layer: the JDK, frameworks/libraries, the application. Without integrity, it would be impossible to build such security mechanisms at any layer. In a way, SecurityManager served as an excuse of sorts: if you really needed integrity you could have hypothetically achieved it using SM (though in practice it was hard). You are right that strong encapsulation's "permissions" are, by design, more coarsely grained than SM's security permissions, but that's not the only difference, or even the main one. A bigger difference is that it is quite normal for an application to give some component/user access to some file. On the other hand, it is abnormal and relatively rare for an application to grant *any* strong-encapsulation-breaking permissions (those that override the permissions in modules' module-info, that is) with the possible exception of --enable-native-access to allow JNI/FFM. Few programs should have *any* of --add-exports/add-opens/patch-module in production (although it's normal in whitebox testing); these are all red flags. Unlike a "reasonable" security policy, which is quite complex, the only reasonable integrity configuration is the empty one, again, with the exception of --enable-native-access; a *minority* of programs may also have -javaagent. So it's not just fine-grained vs. coarse-grained, opt-in vs. opt out, but also: the "right" configuration is the default one or one that's very close to it.
So while overall, when viewed from the lens of removing the SecurityManager, this approach makes sense, I do want to caution on betting against Java's strengths, particularly against its use of speculative optimizations. > Neither a person reading the code nor the platform itself - as it compiles and runs it - can fully be assured that the code does what it says or that its meaning does not change over time as the program runs. ... > In the Java runtime, certain optimizations assume that conditions that hold at the time the optimization is made hold forever. This is the basis of all speculative optimization - the platform assumes the meaning doesn't change and compiles as though it won't. If the application is modified at runtime, the JVM applies the necessary compensations such as deoptimization and recompilation. Java has bet on dynamic features time and again (even when others have championed static approaches) and those bets - backed by speculative optimizations - have paid off time and again. So this can't be what you're arguing against. If the concern is that the runtime behaviour may appear to be different than the intent expressed in the source code due to use of setAccessible or changes by agents, then I think the JEP should be more explicit about that concern. The current wording reads as equally applying to many of Java's existing dynamic behaviours (and belies the power of speculation coupled with deoptimization!). I'm certainly not arguing against the power of speculative optimisation. It has certainly worked time and again for Java... except when it doesn't. For example, Valhalla realised that value objects cannot be *just* a speculative optimisation, and a different user-facing model, with stricter integrity invariants, is needed. In this JEP, however, I'm mostly hinting at link-time (or, in any event, pre-production-runtime) optimisations that may come in Project Leyden.
It's not so much the difference between the source code and what ends up running that matters, but what some form of analysis (either static or dynamic during a trial-run) sees vs. what the application may later do. In some cases, speculation that falls back on deopt may do, but for other, "tighter" link-time/pre-run optimisations, it may prove insufficient. The platform would need to know that the meaning of the program does not change between the time the optimisations are performed and the time the program is run.

As for dynamic features, we need to separate regular reflection - which isn't affected at all - from deep reflection. The two primary uses for deep reflection in production are dependency injection and serialization. But dependency injection requires only a very controlled form of deep reflection - one that is nicely served by Lookups, and the use of deep reflection in serialization is considered a mistake that can and should be fixed (https://openjdk.org/projects/amber/design-notes/towards-better-serialization). Until then, the JDK offers special provisions for serialization libraries that wish to serialize JDK objects (https://github.com/openjdk/jdk/blob/master/src/jdk.unsupported/share/classes/sun/reflect/ReflectionFactory.java). There is no reason --add-opens shouldn't be rare.

> For example, every developer assumes that changing the signature of a private method, or removing a private field, does not impact the class's clients.

Right. The private modifier defines a *contract* which states anyone depending on the implementation details is on their own and shouldn't be surprised by changes. I understand that it can be problematic when large successful frameworks are broken by such changes, but that doesn't invalidate the contract that's in place. The risk is higher for the JDK than for other libraries or applications given the common dependency on the JDK.
True, which is why we're not forcing libraries to be modularised (although they may have to be modularised to enjoy some of the features that Project Leyden may end up delivering). But I'll also say this. What we know *now* that the designers of Java 1.0 didn't know is that that contract - at least as far as the JDK goes - wasn't respected, which ended up giving users a bad upgrade experience especially since the rate of the platform's evolution started rising. We can advise library authors not to do something time and again, but they care about their own users, as they should, and so justify doing what they do. Even though everyone is justified in pursuing their interests, the end result has been a tragedy of the commons. As the maintainers of the platform, our user base is the entire Java ecosystem as a whole and, as it turned out, some regulatory intervention is needed to stop this tragedy of the commons.

> However, with deep reflection, doSensitiveOperation could be invoked from anywhere without an isAuthorized check, nullifying the intended restriction; even worse, an agent could modify the code of the isAuthorized method to always return true.

And clearly, these would be bugs. Not much different than leaking a privileged MethodHandles.Lookup object outside a Class's nest (the boundary for private access) for which there is no enhanced integrity check. We can't fully protect users from code that does the wrong thing even while undertaking efforts to minimize the attack surface. "Superpowers" are exactly that; while we support making them opt-in, we should be careful not to overstate the risk as the same principle applies to all code running in a process - it must be trusted as it has the same privileges as the process.

When it comes to security, such bugs are known as vulnerabilities (though not necessarily exploits), and we must differentiate between them depending on which side of the encapsulation boundary these vulnerabilities lie.
If a security-sensitive class has a bug that causes it to leak a capabilities object that's one thing, but if a bug in a serialization library that uses a super-powered deep-reflection library could have its inputs manipulated so that a security-sensitive class is compromised, that's a whole other story. Strong encapsulation builds bulkheads that allow a sensitive module to be analysed *in isolation*, given its well-defined surface area, and robustly protected from vulnerabilities in *other* modules. That's precisely why integrity is required for robust security. Obviously, no security mechanism is perfect, but strong encapsulation gives the authors of security mechanisms a very valuable tool.

While talking about this subject it's worth mentioning that the Java Platform should provide the necessary integrity, but it can't provide all the sufficient integrity. Some integrity guarantees must also be provided by OS mechanisms (say, filesystem and process isolation) and even hardware mechanisms (timing/rowhammer etc.). To be as secure as possible, a security mechanism must rely on the integrity of all layers below it.

> A tool like jlink could remove unused strongly-encapsulated methods at link time to reduce image size and class loading time.

Most of the benefit here is not time saved by not loading the methods, it's actually due to avoiding the need to load classes during verification. The verifier needs to validate relationships between classes and every extra method potentially asserts new relationships (such as class X subclasses Throwable) and it is these extra classes that need loading that typically increase the startup time.

Right. I count that as class loading, or startup time.

> The guarantee that code may not change over time even opens the door to ahead-of-time compilation (AOT).

AOT doesn't depend on the code never changing. OpenJ9 has AOT code that is resilient in the face of changes to the underlying Java class files.
I'm positive Hotspot will be able to develop similar resilient AOT code. The cost of validating the assumptions made while AOT compiling is much lower than doing the compile while still enabling Java's dynamic features.

There are different kinds of AOT compilation, and Leyden may allow multiple modes. Some may support deoptimisation, and others may not (or may even not have class files available to them at all). Given an application configuration, we want to know which modes are possible and what link-time transformation is needed or possible.

- Ron

From shade at openjdk.org Fri May 5 12:48:25 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 5 May 2023 12:48:25 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v72]
In-Reply-To: <8EBEUUaROn5MN8UQHT2ZSlgxd5BurpBiqlSUVDhrwxY=.6b29611b-fa5c-4d70-843b-fd0f05e3a78c@github.com>
References: <8EBEUUaROn5MN8UQHT2ZSlgxd5BurpBiqlSUVDhrwxY=.6b29611b-fa5c-4d70-843b-fd0f05e3a78c@github.com>
Message-ID:

On Fri, 5 May 2023 09:56:44 GMT, Aleksey Shipilev wrote:

> Full `make images` for `macosx-aarch64-zero-fastdebug` requires #13827. After that, it survives the build with all two `LockingModes`, but not with LockingMode = LM_LIGHTWEIGHT:

This requires significantly more time to implement for Zero.
To unblock the rest of the Lilliput work, I suggest we protect Zero with this hunk:

diff --git a/src/hotspot/cpu/zero/vm_version_zero.cpp b/src/hotspot/cpu/zero/vm_version_zero.cpp
index 4c5e343dbbf..3d17e159a61 100644
--- a/src/hotspot/cpu/zero/vm_version_zero.cpp
+++ b/src/hotspot/cpu/zero/vm_version_zero.cpp
@@ -116,6 +116,11 @@ void VM_Version::initialize() {
     FLAG_SET_DEFAULT(UseVectorizedMismatchIntrinsic, false);
   }
 
+  if ((LockingMode != LM_LEGACY) && (LockingMode != LM_MONITOR)) {
+    warning("Unsupported locking mode for this CPU.");
+    FLAG_SET_DEFAULT(LockingMode, LM_LEGACY);
+  }
+
   // Enable error context decoding on known platforms
 #if defined(IA32) || defined(AMD64) || defined(ARM) || \
     defined(AARCH64) || defined(PPC) || defined(RISCV) || \

...and then deal with the rest in https://bugs.openjdk.org/browse/JDK-8307532.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536208347

From dholmes at openjdk.org Fri May 5 13:01:19 2023
From: dholmes at openjdk.org (David Holmes)
Date: Fri, 5 May 2023 13:01:19 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References:
Message-ID:

On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote:

>> The `finalize()` method is removed from base classes/interfaces and replaced by a Cleaner callback.
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
> 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests

> Default methods for interface classes were invented to solve a problem of compatibility if I remember correctly.
Yes, they were added as a way to extend existing interfaces, but the point is that a default method provides an implementation that will work "good enough" for any implementing class in general - and in this case the default implementation is all that is needed (it is like adding a method to a common base class).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13420#issuecomment-1536224855

From coleenp at openjdk.org Fri May 5 13:08:19 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 5 May 2023 13:08:19 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References:
Message-ID:

On Sat, 29 Apr 2023 15:54:23 GMT, Afshin Zafari wrote:

>> The `finalize()` method is removed from base classes/interfaces and replaced by a Cleaner callback.
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
> 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests

You're going to end up with things not overriding this method that should.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13420#issuecomment-1536234894

From dholmes at openjdk.org Fri May 5 13:18:16 2023
From: dholmes at openjdk.org (David Holmes)
Date: Fri, 5 May 2023 13:18:16 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 13:05:39 GMT, Coleen Phillimore wrote:

> You're going to end up with things not overriding this method that should.

??? We are controlling all the classes - we know if anything would need to have a different implementation of this method.
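
As a rough illustration of the pattern under discussion (a default interface method that registers a `java.lang.ref.Cleaner` callback in place of `finalize()`), a minimal sketch might look like the following. The names `CleanupCapable`, `ScratchResource`, `cleanupAction` and `CleanupDemo` are hypothetical and are not taken from the PR; only `registerCleanup()` echoes a name mentioned in the thread, and its signature here is an assumption.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical interface: one default method serves every implementor.
interface CleanupCapable {
    // Interface fields are implicitly public static final; one Cleaner is shared.
    Cleaner SHARED_CLEANER = Cleaner.create();

    /** The action to run on cleanup; it must not capture the object being cleaned. */
    Runnable cleanupAction();

    /** "Good enough" for any implementor, so nothing needs to override it. */
    default Cleaner.Cleanable registerCleanup() {
        return SHARED_CLEANER.register(this, cleanupAction());
    }
}

// Hypothetical implementor that may already extend some other base class.
final class ScratchResource implements CleanupCapable {
    private final AtomicBoolean released;

    ScratchResource(AtomicBoolean released) {
        this.released = released;
    }

    @Override
    public Runnable cleanupAction() {
        AtomicBoolean flag = released; // copy to a local so the lambda does not capture `this`
        return () -> flag.set(true);
    }
}

final class CleanupDemo {
    public static void main(String[] args) {
        AtomicBoolean released = new AtomicBoolean(false);
        Cleaner.Cleanable handle = new ScratchResource(released).registerCleanup();
        handle.clean(); // runs the action at most once, without waiting for the GC
        System.out.println("released = " + released.get());
    }
}
```

The design point is the one made above: because the default body works for every implementor, classes that already extend another base class still get the registration logic without duplicating it.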
------------- PR Comment: https://git.openjdk.org/jdk/pull/13420#issuecomment-1536247089 From duke at openjdk.org Fri May 5 13:25:27 2023 From: duke at openjdk.org (Afshin Zafari) Date: Fri, 5 May 2023 13:25:27 GMT Subject: Integrated: 8305084: Remove the removal warnings for finalize() from test/hotspot/jtreg/serviceability/dcmd/gc/FinalizerInfoTest.java and RunFinalizationTest.java In-Reply-To: References: Message-ID: <2BC8PpYr2gMMN9wwzN_nX3Kz8OtbQiDIh7PdAgoxs8g=.48c409e6-40d6-44d8-9f91-46fe50dc6fbf@github.com> On Tue, 11 Apr 2023 10:20:25 GMT, Afshin Zafari wrote: > The `removal` warnings are suppressed out. > Test: > `FinalizerInfoTest` and `RunFinalizationTest` are executed locally. This pull request has now been integrated. Changeset: f143bf7c Author: Afshin Zafari Committer: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/f143bf7c4554a689f17c373ea5d99b68dd518b2f Stats: 4 lines in 2 files changed: 1 ins; 0 del; 3 mod 8305084: Remove the removal warnings for finalize() from test/hotspot/jtreg/serviceability/dcmd/gc/FinalizerInfoTest.java and RunFinalizationTest.java Reviewed-by: dholmes, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/13423 From rkennke at openjdk.org Fri May 5 13:35:12 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 13:35:12 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v74] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. 
(The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet).
When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. > > As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change enables to simplify (and speed-up!) a lot of code: > > - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. > - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. 
Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
> #### Renaissance
>
> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
> | -- | -- | -- | -- | -- | -- | -- |
> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Disable new lightweight locking in Zero

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/43cdbb53..82b8b702

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=73
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=72-73

  Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From rkennke at openjdk.org Fri May 5 13:38:58 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 5 May 2023 13:38:58 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v75]
In-Reply-To:
References:
Message-ID:

> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor.
However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. > > As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change enables to simplify (and speed-up!) a lot of code: > > - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. > - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. 
This would be implemented in a separate PR
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
> #### Renaissance
>
> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
> | -- | -- | -- | -- | -- | -- | -- |
> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 172 commits:

 - Merge branch 'master' into JDK-8291555-v2
 - Disable new lightweight locking in Zero
 - Relax zapped-entry test when calling thread is not owning thread
 - Address @dcubed-ojdk review comments
 - Address @dholmes-ora's review comments
 - Add missing new file
 - Fix copyright on new files
 - Address @coleenp's review
 - Merge commit '452cb8432f4d45c3dacd4415bc9499ae73f7a17c' into JDK-8291555-v2
 - Fix arm and ppcle builds
 - ...
and 162 more: https://git.openjdk.org/jdk/compare/f143bf7c...a65b3aeb

-------------

Changes: https://git.openjdk.org/jdk/pull/10907/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=74
  Stats: 2580 lines in 70 files changed: 1772 ins; 97 del; 711 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From coleenp at openjdk.org Fri May 5 14:01:17 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 5 May 2023 14:01:17 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2]
In-Reply-To:
References: <5-6PbFhQpnQN5rnNaISUf-UvXGoP869WUo2pE6QsuxA=.15ea7edc-2b4e-4fa7-8729-e9a5aee5e63c@github.com> <3u6wgn1EWik6TaKxwnVXe-Q-QvfDWM2SQo8r3IMn_x4=.0fb4c0c5-09a7-49ee-ad4d-975e0cfc5a2b@github.com>
Message-ID:

On Fri, 5 May 2023 07:42:17 GMT, Afshin Zafari wrote:

>> Default methods for interface classes were invented to solve a problem of compatibility if I remember correctly. Forcing subclasses to implement the interface method or have a superclass of the subclass to implement the interface method seems like it avoids the problem of silently not registering the cleanup or action that the interface method should force you to do. To solve the duplicated registerCleanup() cases, the two other classes could extend FinalizableObject then inherit its implementation of registerCleanup().
>
>> To solve the duplicated registerCleanup() cases, the two other classes could extend FinalizableObject then inherit its implementation of registerCleanup().
>
> The other two cases already extend the `Log.Logger` and cannot extend `FinalizableObject`.

I thought default methods were bad design just like default parameters. Maybe not. I didn't think yours would compile either but I guess it does.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186130453

From dcubed at openjdk.org Fri May 5 14:44:19 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 5 May 2023 14:44:19 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v73]
In-Reply-To:
References:
Message-ID: <6xE2oaDa83ABBZX0RTLsG14_XlXKxP8U3RFcKizsa-s=.3d47cfcb-cd9f-4143-8763-5d4e313f885d@github.com>

On Fri, 5 May 2023 05:54:29 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>>
>> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>>
>> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements).
Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in-fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. >> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. 
In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. >> >> This change enables to simplify (and speed-up!) a lot of code: >> >> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. >> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR >> >> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support. 
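The lock-stack described in the quoted PR text can be pictured with a small standalone C++ sketch. This is only an illustration under stated assumptions, not the HotSpot implementation: `LockStackSketch`, `Oop`, `try_push`, `try_pop` and `contains` are hypothetical names, the object header CAS is left out entirely, and unlocking is assumed to happen in LIFO order. It shows the three properties the description relies on: a fixed capacity of 8, a linear scan to answer "does the current thread own me?", and a refusal on overflow or recursion, where the real code would take the slow path and inflate to a monitor.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a Java object reference (an "oop" in HotSpot).
using Oop = const void*;

// Illustrative per-thread lock stack: a small fixed-size array of references.
class LockStackSketch {
 public:
  static constexpr std::size_t kCapacity = 8;  // size quoted in the description

  // True when no slot is left; a caller would then take the slow path.
  bool is_full() const { return _top == kCapacity; }

  // "Does the current thread own me?" answered by a linear scan.
  bool contains(Oop o) const {
    for (std::size_t i = 0; i < _top; i++) {
      if (_base[i] == o) return true;
    }
    return false;
  }

  // Fast-path lock. Returns false on overflow or on a recursive attempt;
  // the description says both cases inflate to a full monitor instead.
  bool try_push(Oop o) {
    if (is_full() || contains(o)) return false;
    _base[_top++] = o;
    return true;
  }

  // Fast-path unlock, assuming balanced (LIFO) locking: only the most
  // recently pushed entry can be popped here.
  bool try_pop(Oop o) {
    if (_top == 0 || _base[_top - 1] != o) return false;
    _base[--_top] = nullptr;  // clear the vacated slot
    return true;
  }

  std::size_t size() const { return _top; }

 private:
  std::array<Oop, kCapacity> _base{};  // all slots start as nullptr
  std::size_t _top = 0;                // number of live entries
};
```

Because the array is tiny, the scan in `contains` stays cheap, which matches the claim that the common ownership query is answered quickly, while "which thread owns X?" would need a walk over every thread's stack.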
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>> #### Renaissance
>>
>> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
>> | -- | -- | -- | -- | -- | -- | -- |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Relax zapped-entry test when calling thread is not owning thread

src/hotspot/share/runtime/lockStack.cpp line 70:

> 68:     assert(_base[i] != nullptr || !is_owning_thread(), "no zapped before top");
> 69:     for (int j = i + 1; j < top; j++) {
> 70:       assert(_base[i] != _base[j], "entries must be unique: %s", msg);

Okay so you tweaked the assert to allow a `nullptr` value when the caller is not the owning thread. Got it.

Is it possible for `_base[i]` and `_base[j]` to both be `nullptr` when the caller is not the owning thread? If so, then that assert will also fire...
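The scenario raised in this comment can be shown with a small standalone C++ sketch. It is an illustration only, assuming a plain array of pointers in place of the real lock-stack; `relaxed_check`, `guarded_check` and `Oop` are hypothetical names, and the functions return `bool` rather than asserting so that both outcomes can be observed. `relaxed_check` mirrors the two quoted asserts: a `nullptr` entry is tolerated for a non-owning caller, but uniqueness is still enforced, so two `nullptr` entries fail. `guarded_check` follows the approach described later in this thread, where the loops run only when called from the owning thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a Java object reference (an "oop" in HotSpot).
using Oop = const void*;

// Mirrors the quoted asserts: nullptr is tolerated for a non-owning caller,
// but the uniqueness test still runs for every pair of entries.
bool relaxed_check(const Oop* base, std::size_t top, bool is_owning_thread) {
  for (std::size_t i = 0; i < top; i++) {
    if (base[i] == nullptr && is_owning_thread) return false;  // "no zapped before top"
    for (std::size_t j = i + 1; j < top; j++) {
      if (base[i] == base[j]) return false;  // "entries must be unique"
    }
  }
  return true;
}

// Alternative: skip the consistency loops entirely unless the caller owns
// the stack, since another thread's entries may change while being read.
bool guarded_check(const Oop* base, std::size_t top, bool is_owning_thread) {
  if (!is_owning_thread) return true;  // nothing can be verified safely
  return relaxed_check(base, top, /*is_owning_thread=*/true);
}
```

With two zapped entries and a non-owning caller, `relaxed_check` reports a failure even though nothing is wrong with the owner's view, while `guarded_check` does not.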
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1186180260 From duke at openjdk.org Fri May 5 14:44:37 2023 From: duke at openjdk.org (Afshin Zafari) Date: Fri, 5 May 2023 14:44:37 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v3] In-Reply-To: References: Message-ID: > The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: 8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13420/files - new: https://git.openjdk.org/jdk/pull/13420/files/fa8e4537..ed854e8e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13420&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13420&range=01-02 Stats: 97 lines in 6 files changed: 18 ins; 71 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/13420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13420/head:pull/13420 PR: https://git.openjdk.org/jdk/pull/13420 From duke at openjdk.org Fri May 5 14:50:30 2023 From: duke at openjdk.org (Afshin Zafari) Date: Fri, 5 May 2023 14:50:30 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v4] In-Reply-To: References: Message-ID: <0Temd9Xn4_R--EJRJWavqC3zOlcJ2eUX1Ff-PdrNuxU=.585c2304-dd30-482b-9c7e-57918abce1e4@github.com> > The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains four commits: - Merge branch 'master' into _8305083 - 8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests ------------- Changes: https://git.openjdk.org/jdk/pull/13420/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13420&range=03 Stats: 140 lines in 11 files changed: 52 ins; 43 del; 45 mod Patch: https://git.openjdk.org/jdk/pull/13420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13420/head:pull/13420 PR: https://git.openjdk.org/jdk/pull/13420 From duke at openjdk.org Fri May 5 14:50:38 2023 From: duke at openjdk.org (Afshin Zafari) Date: Fri, 5 May 2023 14:50:38 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v2] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 07:16:12 GMT, Serguei Spitsyn wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests > > test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 44: > >> 42: >> 43: /* >> 44: * This method will register a cleanup method and creates an instance of Finalizer > > Nit: `creates` => `create` Done. > test/hotspot/jtreg/vmTestbase/nsk/share/LocalProcess.java line 167: > >> 165: >> 166: /** >> 167: * This method is called at finalization and calls kill(). > > Nit: Extra space after `and calls`. Done. 
> test/hotspot/jtreg/vmTestbase/nsk/share/jpda/BindServer.java line 99: > >> 97: private int busyRequests = 0; >> 98: >> 99: > > Nit: Unneeded extra line. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186184823 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186184587 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186184337 From rkennke at openjdk.org Fri May 5 14:53:20 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 14:53:20 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v76] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). > > What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. > > This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. 
This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. > > The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in-fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. > > In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER.
When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. > > As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change enables to simplify (and speed-up!) a lot of code: > > - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. > - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. 
This would be implemented in a separate PR
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
> #### Renaissance
>
> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
> | -- | -- | -- | -- | -- | -- | -- |
> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Only do lock-stack consistency checks when called from owning thread

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/a65b3aeb..171aced8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=75
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=74-75

  Stats: 13 lines in 1 file changed: 5 ins; 3 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From dcubed at openjdk.org Fri May 5 14:53:26 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 5 May 2023 14:53:26 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v75]
In-Reply-To:
References:
Message-ID: <4i4LvLuxof6igQtBFit9qq4eKTUmAXHzPy5FrqCsYoI=.afadab14-be72-4d51-9ec9-523f7f39d19e@github.com>

On Fri, 5 May 2023 13:38:58 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. >> >> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in-fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor.
It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. >> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. >> >> This change enables to simplify (and speed-up!) a lot of code: >> >> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. >> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. 
Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>> #### Renaissance
>>
>> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
>> | -- | -- | -- | -- | -- | -- | -- |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 172 commits: > > - Merge branch 'master' into JDK-8291555-v2 > - Disable new lightweight locking in Zero > - Relax zapped-entry test when calling thread is not owning thread > - Address @dcubed-ojdk review comments > - Address @dholmes-ora's review comments > - Add missing new file > - Fix copyright on new files > - Address @coleenp's review > - Merge commit '452cb8432f4d45c3dacd4415bc9499ae73f7a17c' into JDK-8291555-v2 > - Fix arm and ppcle builds > - ... and 162 more: https://git.openjdk.org/jdk/compare/f143bf7c...a65b3aeb src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp line 2562: > 2560: Register lock = op->lock_opr()->as_register(); > 2561: if (LockingMode == LM_MONITOR) { > 2562: if (op->info() != null) { Hmmm... other places in the same file compare `op->info()` with `nullptr` and not `null`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1186189104 From rkennke at openjdk.org Fri May 5 14:53:29 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 14:53:29 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v73] In-Reply-To: <6xE2oaDa83ABBZX0RTLsG14_XlXKxP8U3RFcKizsa-s=.3d47cfcb-cd9f-4143-8763-5d4e313f885d@github.com> References: <6xE2oaDa83ABBZX0RTLsG14_XlXKxP8U3RFcKizsa-s=.3d47cfcb-cd9f-4143-8763-5d4e313f885d@github.com> Message-ID: <-P557jGwTtzyMVnWQ6ZkVF06iEcH6FM0PXxrG1UdvLE=.40e66469-7469-458f-9be4-affdbea083a6@github.com> On Fri, 5 May 2023 14:40:52 GMT, Daniel D. 
Daugherty wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Relax zapped-entry test when calling thread is not owning thread > > src/hotspot/share/runtime/lockStack.cpp line 70: > >> 68: assert(_base[i] != nullptr || !is_owning_thread(), "no zapped before top"); >> 69: for (int j = i + 1; j < top; j++) { >> 70: assert(_base[i] != _base[j], "entries must be unique: %s", msg); > > Okay so you tweaked the assert to allow a `nullptr` value when the caller > is not the owning thread. Got it. > > Is it possible for `_base[i]` and `_base[j]` to both be `nullptr` when the > caller is not the owning thread? If so, then that assert will also fire... Aww right. The whole block is not safe to verify when not called from the owning thread, because the owning thread may modify everything under our feet. I've changed it so that the whole loops are only done when called from owning thread. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1186187814 From rkennke at openjdk.org Fri May 5 14:59:36 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 14:59:36 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v77] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). 
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When recursive locking happens, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>
> As an alternative, I considered removing stack-locking altogether and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc. as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed.
Fast-locked headers can be used directly; for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR.
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
>
> #### Renaissance
>
> | | x86_64 | | | | aarch64 | | |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Fix null -> nullptr typo

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/171aced8..0da2b84b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=76
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=75-76

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From rkennke at openjdk.org Fri May 5 15:01:09 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 5 May 2023 15:01:09 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v75]
In-Reply-To:
References: <4i4LvLuxof6igQtBFit9qq4eKTUmAXHzPy5FrqCsYoI=.afadab14-be72-4d51-9ec9-523f7f39d19e@github.com>
Message-ID:

On Fri, 5 May 2023 14:53:32 GMT, Daniel D. Daugherty wrote:

>> src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp line 2562:
>>
>>> 2560: Register lock = op->lock_opr()->as_register();
>>> 2561: if (LockingMode == LM_MONITOR) {
>>> 2562: if (op->info() != null) {
>>
>> Hmmm... other places in the same file compare `op->info()` with `nullptr` and not `null`.
>
> I have absolutely no idea why the above diff showed up when I went to view the
> changes for the zero fix. It's not present in the zero fix webrev, but it was in the
> "Review new changes" link... sigh... this GitHub thing mystifies me...

It's also interesting that it compiled :-) What is 'null' anyway? In any case, I am scanning the whole patch, looking for any possible re-introduction of NULL or even null.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1186199522

From dcubed at openjdk.org Fri May 5 14:59:39 2023
From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Fri, 5 May 2023 14:59:39 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v75] In-Reply-To: <4i4LvLuxof6igQtBFit9qq4eKTUmAXHzPy5FrqCsYoI=.afadab14-be72-4d51-9ec9-523f7f39d19e@github.com> References: <4i4LvLuxof6igQtBFit9qq4eKTUmAXHzPy5FrqCsYoI=.afadab14-be72-4d51-9ec9-523f7f39d19e@github.com> Message-ID: On Fri, 5 May 2023 14:48:35 GMT, Daniel D. Daugherty wrote: >> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 172 commits: >> >> - Merge branch 'master' into JDK-8291555-v2 >> - Disable new lightweight locking in Zero >> - Relax zapped-entry test when calling thread is not owning thread >> - Address @dcubed-ojdk review comments >> - Address @dholmes-ora's review comments >> - Add missing new file >> - Fix copyright on new files >> - Address @coleenp's review >> - Merge commit '452cb8432f4d45c3dacd4415bc9499ae73f7a17c' into JDK-8291555-v2 >> - Fix arm and ppcle builds >> - ... and 162 more: https://git.openjdk.org/jdk/compare/f143bf7c...a65b3aeb > > src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp line 2562: > >> 2560: Register lock = op->lock_opr()->as_register(); >> 2561: if (LockingMode == LM_MONITOR) { >> 2562: if (op->info() != null) { > > Hmmm... other places in the same file compare `op->info()` with `nullptr` and not `null`. I have absolutely no idea why the above diff showed up when I went to view the changes for the zero fix. It's not present in the zero fix webrev, but it was in the "Review new changes" link... sigh... this GitHub thing mystifies me... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1186194465 From dcubed at openjdk.org Fri May 5 15:08:02 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty)
Date: Fri, 5 May 2023 15:08:02 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v77]
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 14:59:36 GMT, Roman Kennke wrote:

> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix null -> nullptr typo

This project is now baselined on jdk-21+22-1814 .

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536396432

From dcubed at openjdk.org Fri May 5 15:24:22 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 5 May 2023 15:24:22 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v77]
In-Reply-To:
References:
Message-ID: <9KF3QmTZfM7p0FEJjzapT8rcCn4gOVK5vff7h8pi6UU=.fce8e1ee-09de-4ac1-8d10-4af25cc27759@github.com>

On Fri, 5 May 2023 14:59:36 GMT, Roman Kennke wrote:

> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix null -> nullptr typo

I've started a new round of Mach5 testing using v76.
I'll be doing a round of v76 with default stack locking and v76 with forced-fast-locking. If there are still zero build issues in Tier4, then I'll post more details about what I see. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536416954 From dcubed at openjdk.org Fri May 5 15:41:12 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 5 May 2023 15:41:12 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v77] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 14:59:36 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). >> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. >> >> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. 
Experience shows that this array typcially remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in-fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. 
>> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. >> >> This change enables to simplify (and speed-up!) a lot of code: >> >> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. >> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR >> >> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support. 
>>
>> Testing:
>>  - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>>  - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>>  - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>>  - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>>  - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>>  - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>>  - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>>  - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>>  - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>> | | x86_64 | | | | aarch64 | | |
>> | -- | -- | -- | -- | -- | -- | -- | -- |
>> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix null -> nullptr typo

Sigh... v76 builds with forced-fast-locking are failing:

[2023-05-05T15:33:30,486Z] Optimizing the exploded image
[2023-05-05T15:33:31,371Z] #
[2023-05-05T15:33:31,371Z] # A fatal error has been detected by the Java Runtime Environment:
[2023-05-05T15:33:31,371Z] #
[2023-05-05T15:33:31,371Z] # Internal Error (/opt/mach5/mesos/work_dir/slaves/741e9afd-8c02-45c3-b2e2-9db1450d0832-S91047/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/628c1872-3930-44ad-97b9-7a1205cf1cd7/runs/b8e9c130-d37c-4c62-9d47-75615a406475/workspace/open/src/hotspot/share/runtime/javaThread.hpp:983), pid=2428657, tid=2428786
[2023-05-05T15:33:31,371Z] # assert(t->is_Java_thread()) failed: incorrect cast to JavaThread
[2023-05-05T15:33:31,371Z] #
[2023-05-05T15:33:31,371Z] # JRE version: Java(TM) SE Runtime Environment (21.0) (fastdebug build 21-internal-LTS-2023-05-05-1518319.daniel.daugherty.8291555forjdk21.git)
[2023-05-05T15:33:31,371Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
21-internal-LTS-2023-05-05-1518319.daniel.daugherty.8291555forjdk21.git, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
[2023-05-05T15:33:31,371Z] # Problematic frame:
[2023-05-05T15:33:31,371Z] # V [libjvm.so+0x10c36d0] LockStack::verify(char const*) const+0x4cc

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536436758

From kvn at openjdk.org Fri May 5 15:47:26 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Fri, 5 May 2023 15:47:26 GMT
Subject: RFR: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects [v2]
In-Reply-To:
References:
Message-ID: <6tYsgjIL9o6s6POMCWYayzjkjAmgCUo5wiF1G8nGUj0=.2f9b129c-5813-4a88-9afe-470927f08f94@github.com>

On Fri, 5 May 2023 01:06:09 GMT, Leonid Mesnik wrote:

>> 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects
>>
>> caused significant regressions in some benchmarks and should be reverted.
>>
>> This fix backs out the changes and updates the problem-listed bugs to the new issue.
>> Tier1 passed
>> Running also tier5 to check other builds and more svc testing
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>
>   removed notify_jvmti_object_alloc_Type line

Agree.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13806#pullrequestreview-1415035282

From lmesnik at openjdk.org Fri May 5 15:48:32 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Fri, 5 May 2023 15:48:32 GMT
Subject: RFR: 8307370: Add tier1 testing with thread factory in CI [v2]
In-Reply-To:
References:
Message-ID:

> This fix just excludes a few hotspot/jdk tests which are not compatible with test thread factory. So
> `make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all JTREG_TEST_THREAD_FACTORY=Virtual TEST=:tier1`
> could be executed cleanly.
Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:

  comments are fixed.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13820/files
  - new: https://git.openjdk.org/jdk/pull/13820/files/c179ba16..6bfb6b15

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13820&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13820&range=00-01

  Stats: 6 lines in 2 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/13820.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13820/head:pull/13820

PR: https://git.openjdk.org/jdk/pull/13820

From dcubed at openjdk.org Fri May 5 16:17:20 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 5 May 2023 16:17:20 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v77]
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 14:59:36 GMT, Roman Kennke wrote:

> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix null -> nullptr typo

I reproduced the fastdebug build crash on my MBP13.
Here's the stack trace:

--------------- T H R E A D ---------------

Current thread (0x00007f81ee675d90): WorkerThread "GC Thread#1" [id=24579, stack(0x0000700008b15000,0x0000700008c15000) (1024K)]

Stack: [0x0000700008b15000,0x0000700008c15000], sp=0x0000700008c14810, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.dylib+0x1406ce9] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x739 (javaThread.hpp:983)
V [libjvm.dylib+0x14073eb] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x3b
V [libjvm.dylib+0x7ab2e5] report_vm_error(char const*, int, char const*, char const*, ...)+0xc5
V [libjvm.dylib+0xe5f1ae] LockStack::verify(char const*) const+0x2ce
V [libjvm.dylib+0xb03885] JavaThread::oops_do_no_frames(OopClosure*, CodeBlobClosure*)+0x275
V [libjvm.dylib+0x13411c4] Thread::oops_do(OopClosure*, CodeBlobClosure*)+0xb4
V [libjvm.dylib+0x13537aa] Threads::possibly_parallel_threads_do(bool, ThreadClosure*)+0x14a
V [libjvm.dylib+0x1356e84] Threads::possibly_parallel_oops_do(bool, OopClosure*, CodeBlobClosure*)+0x24
V [libjvm.dylib+0x9b85d6] G1RootProcessor::process_java_roots(G1RootClosures*, G1GCPhaseTimes*, unsigned int)+0x66
V [libjvm.dylib+0x9b84be] G1RootProcessor::evacuate_roots(G1ParScanThreadState*, unsigned int)+0x5e
V [libjvm.dylib+0x9c472f] G1EvacuateRegionsTask::scan_roots(G1ParScanThreadState*, unsigned int)+0x1f
V [libjvm.dylib+0x9c452b] G1EvacuateRegionsBaseTask::work(unsigned int)+0x14b
V [libjvm.dylib+0x147280c] WorkerThread::run()+0x7c
V [libjvm.dylib+0x13407df] Thread::call_run()+0x17f
V [libjvm.dylib+0x1080bcf] thread_native_entry(Thread*)+0x14f
C [libsystem_pthread.dylib+0x68fc] _pthread_start+0xe0
C [libsystem_pthread.dylib+0x2443] thread_start+0xf
JavaThread 0x00007f81f2013010 (nid = 43267) was being processed
Java frames: (J=compiled Java code,
j=interpreted, Vv=VM code)
j java.lang.ref.Reference.waitForReferencePendingList()V+0 java.base
j java.lang.ref.Reference.processPendingReferences()V+0 java.base
j java.lang.ref.Reference$ReferenceHandler.run()V+8 java.base
v ~StubRoutines::call_stub 0x0000000121e82d21

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536482543

From rkennke at openjdk.org Fri May 5 16:49:38 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 5 May 2023 16:49:38 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78]
In-Reply-To:
References:
Message-ID:

> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>
> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc. as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR.
>
> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>
> Testing:
>  - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>  - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>  - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>  - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>  - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>  - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Simple Microbenchmark
>
> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>
> | | x86_64 | aarch64 |
> | -- | -- | -- |
> | -UseFastLocking | 20.651 | 20.764 |
> | +UseFastLocking | 18.896 | 18.908 |
>
>
> #### Renaissance
>
> | | x86_64 | | | | aarch64 | | |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> | | stack-locking | fast-locking | | | stack-locking | fast-locking | |
> | AkkaUct | 841.884 | 836.948 | 0.59% | | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11041.427 | 11181.451 | -1.25% | | 11381.751 | 11521.318 | -1.21% |
> | Als | 1367.183 | 1359.358 | 0.58% | | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% | | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% | | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% | | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% | | 7569.219 | 7646.828 | -1.01% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% | | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% | | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% | | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% | | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% | | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% | | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% | | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% | | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Only allow lock-stack verification for owning Java threads or at safepoints

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/0da2b84b..66a87a04

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=77
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=76-77

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/10907.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From rkennke at openjdk.org Fri May 5 16:49:39 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 5 May 2023 16:49:39 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v77]
In-Reply-To:
References:
Message-ID: <6NGMgPAoYN8QRzehor9x8k6loctLBwvR6FvXoQrxOno=.0701920c-a6ce-4072-98d3-8c1c8e665805@github.com>

On Fri, 5 May 2023 14:59:36 GMT, Roman Kennke wrote:

> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix null -> nullptr typo

Goddamnit. This is caused by VM or GC threads coming in via oops_do(). I've now strengthened the check to only allow the owning *Java* thread in, or when we are at a safepoint. I think that should make it all green again. Sorry for causing the noise.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536514388

From shade at openjdk.org Fri May 5 16:56:24 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 5 May 2023 16:56:24 GMT
Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 19:54:57 GMT, Paul Hohensee wrote:

> Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads.
>
> Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead.

Some comments follow.

src/hotspot/share/include/jmm.h line 55:

> 53: JMM_VERSION_2 = 0x20020000, // JDK 10
> 54: JMM_VERSION_3 = 0x20030000, // JDK 14
> 55: JMM_VERSION_3_0 = 0x20030000,

What's `JMM_VERSION_3_0`?

src/hotspot/share/services/management.cpp line 2115:

> 2113: result += size;
> 2114: }
> 2115: return result + ThreadService::exited_allocated_bytes();;

Double `;;`.

src/hotspot/share/services/threadService.hpp line 111:

> 109: static jlong exited_allocated_bytes() { return _exited_allocated_bytes; }
> 110: static void incr_exited_allocated_bytes(jlong size) {
> 111: Atomic::add(&_exited_allocated_bytes, size);

`Atomic::add(&_exited_allocated_bytes, size, memory_order_relaxed);`, please. No need for overly strict memory effects for this counter.

src/java.management/share/classes/sun/management/ThreadImpl.java line 535:

> 533: private static native long getThreadAllocatedMemory0(long id);
> 534: private static native void getThreadAllocatedMemory1(long[] ids, long[] result);
> 535: private static native long getThreadAllocatedMemory2();

We can call this one `getAllThreadAllocatedMemory`, which obviates the need for `2` as the suffix.

src/jdk.management/share/classes/com/sun/management/ThreadMXBean.java line 159:

> 157: *
> 158: * @return an approximation of the total memory allocated, in bytes, in
> 159: * heap memory for the current thread,

I am not sure whether typo fixes in the public API require a CSR (albeit a trivial one). Maybe skip these updates?

test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java line 221:

> 219: // baseline should be positive
> 220: Thread curThread = Thread.currentThread();
> 221: long cumulative_size = mbean.getAllThreadAllocatedBytes();

Java style for variables is camel case: `cumulativeSize`.

test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java line 286:

> 284: }
> 285:
> 286: private static long checkResult(Thread curThread,

There is another `checkResult` below? Should they be replaced by a single method?

test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java line 377:

> 375: throw new RuntimeException(getName() +
> 376: " ThreadAllocatedBytes before = " + size1 +
> 377: " > ThreadAllocatedBytes after = " + size2);

Is this replaceable with `checkResult(...)`?

test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemoryArray.java line 120:

> 118: long[] sizes1 = mbean.getThreadAllocatedBytes(ids);
> 119: for (int i = 0; i < NUM_THREADS; i++) {
> 120: checkResult(threads[i], sizes[i], sizes1[i]);

Since we are cleaning up the test anyway, can we / should we rename `sizes` -> `before`, `sizes1` -> `after`?

test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemoryArray.java line 164:

> 162:
> 163: private static void checkResult(Thread curThread,
> 164: long prev_size, long curr_size) {

camelCase arguments.

-------------

PR Review: https://git.openjdk.org/jdk/pull/13814#pullrequestreview-1415057714
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186299709
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186309343
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186260976
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186261881
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186263383
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186265606
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186277116
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186275693
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186275048
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186275416

From shade at openjdk.org Fri May 5 16:56:26 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 5 May 2023 16:56:26 GMT
Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 06:45:10 GMT, David Holmes wrote:

>> Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads.
>>
>> Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead.
>
> src/hotspot/share/services/management.cpp line 2102:
>
>> 2100: JVM_ENTRY(jlong, jmm_GetAllThreadAllocatedMemory(JNIEnv *env))
>> 2101: // There is a race between threads that exit during the loop and calling
>> 2102: // exited_allocated_bytes. If the result is initialized with exited_allocated_bytes,
>
> If you want a stable and accurate value did you consider holding the Threads_lock while you iterate the threads? Or do it as a safepoint VMop?

I agree we should strive to get the value as accurate as possible. I think for operational use at scale, we need to avoid doing safepoints. Holding the `Threads_lock` might also penalize other code that (ab)uses threading (we frequently see thousands of threads coming and going, don't ask).

But I have a fundamental question here: since SMR/TLH gives us a snapshot of currently live threads, and it also protects us from seeing an exiting thread in a bad state (ultimately, a `delete`-d one), why can't we just trust its `cooked_allocated_bytes`, and avoid adding allocated bytes on the exit path? If we cannot trust that, can we make it trustable while the thread is protected by SMR/TLH?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186299142

From simonis at openjdk.org Fri May 5 17:16:16 2023
From: simonis at openjdk.org (Volker Simonis)
Date: Fri, 5 May 2023 17:16:16 GMT
Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM
In-Reply-To:
References:
Message-ID: <-G4uTOCHpkgf1qjgTS7NYtbUIbWVTOdOgrhN5XU9kT0=.bce10c8b-f69c-4f10-80ef-8573017fa15f@github.com>

On Thu, 4 May 2023 19:54:57 GMT, Paul Hohensee wrote:

> Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads.
>
> Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead.

Looks good in general. Please find my comments inline.

src/hotspot/share/include/jmm.h line 55:

> 53: JMM_VERSION_2 = 0x20020000, // JDK 10
> 54: JMM_VERSION_3 = 0x20030000, // JDK 14
> 55: JMM_VERSION_3_0 = 0x20030000,

Why do we need `JMM_VERSION_3_0`? We haven't defined `JMM_VERSION_2_0` either.

src/hotspot/share/include/jmm.h line 321:

> 319: jstring flag_name,
> 320: jvalue new_value);
> 321: jlong (JNICALL *GetAllThreadAllocatedMemory)

I'm not sure here, but I think there's no need to "overwrite" a *reserved* slot if you add this functionality to a new major release as you do. You also haven't done it when you've added `GetOneThreadAllocatedMemory()` with [JDK-8231209](https://bugs.openjdk.org/browse/JDK-8231209). I think we should keep these *reserved* slots for the case when we eventually have to downport new functionality from a later release.

src/hotspot/share/services/management.cpp line 2282:

> 2280: jmm_FindDeadlockedThreads,
> 2281: jmm_SetVMGlobal,
> 2282: jmm_GetAllThreadAllocatedMemory,

See comment on overwriting the `reserved6` slot above.

-------------

PR Review: https://git.openjdk.org/jdk/pull/13814#pullrequestreview-1414984240
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186213343
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186224124
PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186301611

From simonis at openjdk.org Fri May 5 17:16:19 2023
From: simonis at openjdk.org (Volker Simonis)
Date: Fri, 5 May 2023 17:16:19 GMT
Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 06:45:10 GMT, David Holmes wrote:

>> Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads.
>>
>> Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead.
>
> src/hotspot/share/services/management.cpp line 2102:
>
>> 2100: JVM_ENTRY(jlong, jmm_GetAllThreadAllocatedMemory(JNIEnv *env))
>> 2101: // There is a race between threads that exit during the loop and calling
>> 2102: // exited_allocated_bytes. If the result is initialized with exited_allocated_bytes,
>
> If you want a stable and accurate value did you consider holding the Threads_lock while you iterate the threads? Or do it as a safepoint VMop?

The API specification clearly states that this method returns "*an approximation of the total amount of memory allocated in heap*", so in my opinion it is OK to keep it simple here and not start messing with locks and safepoints.

But can't we make this a little more accurate by only adding a thread's allocated bytes if it is not `thread->is_terminated()`? Wouldn't that prevent double counting most of the time?
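
The double-counting concern can be illustrated with a small Java model. This is only a sketch under stated assumptions, not the HotSpot code: `ModelThread`, `AllocationSampler`, and the `skipTerminated` switch are hypothetical names, and the model assumes an exiting thread first adds its bytes to the shared exited counter and only then marks itself terminated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of one thread's allocation bookkeeping; not HotSpot code. */
final class ModelThread {
    private final AtomicLong allocatedBytes = new AtomicLong();
    private volatile boolean terminated;

    void allocate(long bytes) { allocatedBytes.addAndGet(bytes); }
    long allocatedBytes() { return allocatedBytes.get(); }
    boolean isTerminated() { return terminated; }

    /** Assumed exit order: fold bytes into the shared counter, then mark terminated. */
    void exit(AtomicLong exitedAllocatedBytes) {
        exitedAllocatedBytes.addAndGet(allocatedBytes.get());
        terminated = true;
    }
}

final class AllocationSampler {
    final AtomicLong exitedAllocatedBytes = new AtomicLong();

    /** Sums a snapshot of threads, then adds the exited-thread counter. */
    long total(List<ModelThread> snapshot, boolean skipTerminated) {
        long result = 0;
        for (ModelThread t : snapshot) {
            if (skipTerminated && t.isTerminated()) {
                continue; // its bytes are already in exitedAllocatedBytes
            }
            result += t.allocatedBytes();
        }
        return result + exitedAllocatedBytes.get();
    }

    /** Deterministic demo: thread b exits after the snapshot is taken, before sampling. */
    static long[] demo() {
        AllocationSampler sampler = new AllocationSampler();
        ModelThread a = new ModelThread();
        ModelThread b = new ModelThread();
        a.allocate(100);
        b.allocate(50);
        List<ModelThread> snapshot = new ArrayList<>(List.of(a, b));
        b.exit(sampler.exitedAllocatedBytes);          // b is still in the snapshot
        long naive = sampler.total(snapshot, false);   // b counted twice
        long filtered = sampler.total(snapshot, true); // b counted once
        return new long[] { naive, filtered };
    }
}
```

In this model the filter removes the double count only for threads that are already terminated when they are visited; a thread that exits after being visited but before the final addition is still counted twice, which matches the "most of the time" qualifier above.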

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186246289

From dcubed at openjdk.org Fri May 5 17:23:09 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 5 May 2023 17:23:09 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78]
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 16:49:38 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overloading of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>>
>> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>>
>> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>>
>> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>>
>> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking.
>>
>> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>>
>> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc. as such is OK, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can have a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now always be done in the fast path, if the hashcode has been installed. Fast-locked headers can be used directly; for monitor-locked objects we can easily reach through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR.
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>> #### Renaissance
>>
>> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
>> | -- | -- | -- | -- | -- | -- | -- |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Only allow lock-stack verification for owning Java threads or at safepoints

Slowdebug had a better stack trace:

--------------- T H R E A D ---------------

Current thread (0x00007fe4ad0062d0): WorkerThread "GC Thread#0" [id=19715, stack(0x0000700004416000,0x0000700004516000) (1024K)]

Stack: [0x0000700004416000,0x0000700004516000], sp=0x00007000045152b0, free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.dylib+0x133a8a6] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x906 (javaThread.hpp:983)
V [libjvm.dylib+0x133af59] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x89
V [libjvm.dylib+0x6d4e5c] report_vm_error(char const*, int, char const*, char const*, ...)+0x1ac
V [libjvm.dylib+0xd33f] JavaThread::cast(Thread*)+0x4f
V [libjvm.dylib+0x111911] JavaThread::current()+0x11
V [libjvm.dylib+0xdeffe9] LockStack::is_owning_thread() const+0x19
V [libjvm.dylib+0xdefe14] LockStack::verify(char const*) const+0x134
V [libjvm.dylib+0xac9367] LockStack::oops_do(OopClosure*)+0x27
V [libjvm.dylib+0xac92da] JavaThread::oops_do_no_frames(OopClosure*, CodeBlobClosure*)+0x2da
V [libjvm.dylib+0x1287210] Thread::oops_do(OopClosure*, CodeBlobClosure*)+0x40
V [libjvm.dylib+0x129f395] ParallelOopsDoThreadClosure::do_thread(Thread*)+0x25
V [libjvm.dylib+0x129b6cc] Threads::possibly_parallel_threads_do(bool, ThreadClosure*)+0xfc
V [libjvm.dylib+0x129dfad] Threads::possibly_parallel_oops_do(bool, OopClosure*, CodeBlobClosure*)+0x3d
V [libjvm.dylib+0x99b3b6] G1RootProcessor::process_java_roots(G1RootClosures*, G1GCPhaseTimes*, unsigned int)+0xc6
V [libjvm.dylib+0x99b217] G1RootProcessor::evacuate_roots(G1ParScanThreadState*, unsigned int)+0x77
V [libjvm.dylib+0x9ac68e] G1EvacuateRegionsTask::scan_roots(G1ParScanThreadState*, unsigned int)+0x2e
V [libjvm.dylib+0x9ac568] G1EvacuateRegionsBaseTask::work(unsigned int)+0x78
V [libjvm.dylib+0x13f75b4] WorkerTaskDispatcher::worker_run_task()+0x74
V [libjvm.dylib+0x13f7c14] WorkerThread::run()+0x34
V [libjvm.dylib+0x12868ee] Thread::call_run()+0x15e
V [libjvm.dylib+0xfeafa7] thread_native_entry(Thread*)+0x117
C [libsystem_pthread.dylib+0x68fc] _pthread_start+0xe0
C [libsystem_pthread.dylib+0x2443] thread_start+0xf

JavaThread 0x00007fe4af015610 (nid = 22019) was being processed
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j java.lang.ref.Reference.waitForReferencePendingList()V+0 java.base
j java.lang.ref.Reference.processPendingReferences()V+0 java.base
j java.lang.ref.Reference$ReferenceHandler.run()V+8 java.base
v ~StubRoutines::call_stub 0x000000011fd08d21

Does that still match up with your theory?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536554672

From phh at openjdk.org Fri May 5 17:32:26 2023
From: phh at openjdk.org (Paul Hohensee)
Date: Fri, 5 May 2023 17:32:26 GMT
Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 16:42:29 GMT, Aleksey Shipilev wrote:

>> The API specification clearly states that this method returns "*an approximation of the total amount of memory allocated in heap*", so in my opinion it is OK to keep it simple here and not start messing with locks and safepoints.
>>
>> But can't we make this a little more accurate by only adding a thread's allocated bytes if it is not `thread->is_terminated()`? Wouldn't that prevent double counting most of the time?
>
> I agree we should strive to get the value as accurate as possible. I think for operational use at scale, we need to avoid doing safepoints. Holding the `Threads_lock` might also penalize other code that (ab)uses threading (we frequently see thousands of threads coming and going, don't ask).
>
> But I have a fundamental question here: since SMR/TLH gives us a snapshot of currently live threads, and it also protects us from seeing an exiting thread in a bad state (ultimately, a `delete`-d one), why can't we just trust its `cooked_allocated_bytes`, and avoid adding allocated bytes on the exit path? If we cannot trust that, can we make it trustable while the thread is protected by SMR/TLH?

I thought about doing those, but both would slow down the app/JVM and very likely introduce p99.9/p100 latency outliers that we'd rather not see just because we're sampling. Also,

1. The existing thread allocated bytes implementation isn't particularly accurate either, in the sense that reality quickly gets away from the method return values. That was a conscious decision to go with speed and efficiency over stability/accuracy, so I went the same way for this implementation.
2. Cloud services sample at fairly long intervals in order to avoid overhead: 1 minute is common and 5 minutes not unheard of. Stability/accuracy within short time frames is unneeded with such long sampling intervals.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186341500

From rkennke at openjdk.org Fri May 5 17:35:32 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 5 May 2023 17:35:32 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78]
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 17:19:11 GMT, Daniel D. Daugherty wrote:

> Slowdebug had a better stack trace:
>
> [...]
>
> Does that still match up with your theory?

Yes, definitely. Thanks for trying with slowdebug for confirmation!
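
The lock-stack that the PR description talks about, and the ownership guard discussed in this exchange, can be modelled in a few lines of Java. This is a rough sketch, not the HotSpot `LockStack`: the class name `LockStackModel`, its methods, and the `atSafepoint` parameter are hypothetical, and the pop operation assumes balanced, last-in-first-out locking. Only the capacity of 8, the linear ownership scan, and the fallback on overflow or recursion come from the description.

```java
/**
 * Hypothetical model of a per-thread lock-stack; names and API are
 * illustrative and not taken from the HotSpot sources.
 */
final class LockStackModel {
    static final int CAPACITY = 8; // fixed size mentioned in the PR description

    private final Object[] entries = new Object[CAPACITY];
    private int top; // number of entries in use
    private final Thread owner;

    LockStackModel(Thread owner) { this.owner = owner; }

    /** "Does the current thread own me?" answered by a linear identity scan. */
    boolean contains(Object o) {
        for (int i = 0; i < top; i++) {
            if (entries[i] == o) return true;
        }
        return false;
    }

    /**
     * Tries to fast-lock o. Returns false when the caller would have to take
     * the slow path (inflate to a monitor): stack full, or recursive lock.
     */
    boolean tryPush(Object o) {
        if (top == CAPACITY || contains(o)) return false;
        entries[top++] = o;
        return true;
    }

    /** Fast-unlock; assumes o is the most recently pushed entry (LIFO locking). */
    boolean tryPop(Object o) {
        if (top == 0 || entries[top - 1] != o) return false;
        entries[--top] = null;
        return true;
    }

    /** Guard: only the owning thread, or anyone at a (modelled) safepoint, may inspect. */
    boolean mayInspect(boolean atSafepoint) {
        return atSafepoint || Thread.currentThread() == owner;
    }

    int size() { return top; }
}
```

A caller that gets `false` from `tryPush` would fall back to a full monitor in the scheme described above; the model leaves that path out.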

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1536567480

From amenkov at openjdk.org Fri May 5 18:43:58 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Fri, 5 May 2023 18:43:58 GMT
Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v15]
In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com>
Message-ID:

> The fix updates the JVMTI FollowReferences implementation to report references from virtual threads:
> - unmounted vthreads are detected, and their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL;
> - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), and references are reported with the correct thread id/class tag/object tags/frame depth;
> - common code to handle stack frames is moved into a separate class;
>
> Threads are reported as:
> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before);
> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER;
> - unmounted vthreads: not reported as heap roots.

Alex Menkov has updated the pull request incrementally with one additional commit since the last revision:

  disabled VTMS transitions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13254/files
  - new: https://git.openjdk.org/jdk/pull/13254/files/ac38c44e..bb87bdb0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=14
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=13-14

  Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/13254.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254

PR: https://git.openjdk.org/jdk/pull/13254

From lmesnik at openjdk.org Fri May 5 19:02:26 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Fri, 5 May 2023 19:02:26 GMT
Subject: Integrated: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 15:12:43 GMT, Leonid Mesnik wrote:

> 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects
>
> caused significant regressions in some benchmarks and should be reverted.
>
> This fix backs out the changes and updates the ProblemList bugs to the new issue.
> Tier1 passed
> Also running tier5 to check other builds and more svc testing

This pull request has now been integrated.

Changeset: e2b1013f
Author: Leonid Mesnik
URL: https://git.openjdk.org/jdk/commit/e2b1013f11fc605501c3bf77976facb9b870d28e
Stats: 73 lines in 11 files changed: 5 ins; 64 del; 4 mod

8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects

Reviewed-by: sspitsyn, thartmann, kvn

-------------

PR: https://git.openjdk.org/jdk/pull/13806

From coleenp at openjdk.org Fri May 5 19:11:19 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 5 May 2023 19:11:19 GMT
Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v3]
In-Reply-To:
References:
Message-ID:

On Fri, 5 May 2023 14:44:37 GMT, Afshin Zafari wrote:

>> The `finalize()` method is removed from base classes/interfaces and is replaced by a Cleaner callback.
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
>   8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests

test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 56:

> 54: * This method will register a cleanup method and create an instance of Finalizer
> 55: * to register the object for finalization at VM exit.
> 56: * It is implemented in FinalizableObject.

This sentence can go since it's implemented here now.
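
For readers unfamiliar with the pattern under review, replacing `finalize()` with a `java.lang.ref.Cleaner` callback generally looks like the sketch below. It is a generic illustration, not the code from the nsk test library: `TrackedResource`, its nested `State` class, and the `CLEANUPS` counter are hypothetical names introduced only for this example.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical resource that registers a Cleaner action instead of overriding finalize(). */
final class TrackedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    /** Counts how often the cleanup action ran; present only so the example is observable. */
    static final AtomicInteger CLEANUPS = new AtomicInteger();

    /**
     * The cleanup action is a static nested class so that it holds no reference
     * to the TrackedResource; otherwise the object could never become unreachable.
     */
    private static final class State implements Runnable {
        @Override
        public void run() {
            CLEANUPS.incrementAndGet(); // release the underlying resource here
        }
    }

    private final Cleaner.Cleanable cleanable;

    TrackedResource() {
        this.cleanable = CLEANER.register(this, new State());
    }

    /** Explicit cleanup; the Cleaner runs the action at most once, however often this is called. */
    @Override
    public void close() {
        cleanable.clean();
    }
}
```

If `close()` is never called, the same action runs after the object becomes phantom reachable, which is the role `finalize()` used to play.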
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186420381 From coleenp at openjdk.org Fri May 5 19:15:37 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 May 2023 19:15:37 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v4] In-Reply-To: <0Temd9Xn4_R--EJRJWavqC3zOlcJ2eUX1Ff-PdrNuxU=.585c2304-dd30-482b-9c7e-57918abce1e4@github.com> References: <0Temd9Xn4_R--EJRJWavqC3zOlcJ2eUX1Ff-PdrNuxU=.585c2304-dd30-482b-9c7e-57918abce1e4@github.com> Message-ID: <8jDMGWBkqLnQAKWu5g7qLon1dpNLhrWSto-Dq1P9oHk=.15deb34f-2c32-45c2-aecc-02c2d1076996@github.com> On Fri, 5 May 2023 14:50:30 GMT, Afshin Zafari wrote: >> The `finalize()` method is removed from base classes/interfaces and is replaced by a Cleaner callback. > > Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge branch 'master' into _8305083 > - 8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests > - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests > - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests test/hotspot/jtreg/vmTestbase/nsk/share/MainWrapper.java line 50: > 48: > 49: // Some tests use this property to understand if virtual threads are used > 50: System.setProperty("main.wrapper", wrapperName); Should this line have been deleted?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186423314 From phh at openjdk.org Fri May 5 20:46:12 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 20:46:12 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v2] In-Reply-To: References: Message-ID: > Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads. > > Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead. Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13814/files - new: https://git.openjdk.org/jdk/pull/13814/files/d78ec8fa..460c00e4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=00-01 Stats: 55 lines in 8 files changed: 6 ins; 14 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/13814.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13814/head:pull/13814 PR: https://git.openjdk.org/jdk/pull/13814 From phh at openjdk.org Fri May 5 20:56:41 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 20:56:41 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v3] In-Reply-To: References: Message-ID: > Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads. > > Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead. 
Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13814/files - new: https://git.openjdk.org/jdk/pull/13814/files/460c00e4..1001b667 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13814.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13814/head:pull/13814 PR: https://git.openjdk.org/jdk/pull/13814 From phh at openjdk.org Fri May 5 20:56:41 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 20:56:41 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v3] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 16:16:37 GMT, Aleksey Shipilev wrote: >> Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: >> >> 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM > > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java line 286: > >> 284: } >> 285: >> 286: private static long checkResult(Thread curThread, > > There is another `checkResult` below? Should they be replaced by a single method? No, there's only one checkResult method. I changed the result type from void to long. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186491906 From phh at openjdk.org Fri May 5 21:04:32 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 21:04:32 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v4] In-Reply-To: References: Message-ID: > Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads. > > Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead. Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13814/files - new: https://git.openjdk.org/jdk/pull/13814/files/1001b667..2e2adc0b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=02-03 Stats: 6 lines in 1 file changed: 3 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13814.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13814/head:pull/13814 PR: https://git.openjdk.org/jdk/pull/13814 From phh at openjdk.org Fri May 5 21:04:49 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 21:04:49 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v4] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 06:43:20 GMT, David Holmes wrote: >> Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: >> >> 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM > > src/hotspot/share/services/management.cpp line 2106: > >> 2104: // the loop 
gets to it and thus not be counted. If, on the other hand and done >> 2105: // here, exited_allocated_bytes is added after the loop, the final result might be >> 2106: // "too large" because a thread might be counted twice, once in the loop and agsin > > typo agsin Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186498560 From phh at openjdk.org Fri May 5 21:24:15 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 21:24:15 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v4] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 17:29:30 GMT, Paul Hohensee wrote: >> I agree we should strive to get the value as accurate as possible. I think for operational use at scale, we need to avoid doing safepoints. Holding a `ThreadLock` might also penalize other code that (ab)uses threading (we frequently see thousands of threads coming and going, don't ask). >> >> But I have a fundamental question here: since SMR/TLH gives us a snapshot of currently live threads, and it also protects us from seeing an exiting thread in bad state (ultimately, a `delete`-d one), why can't we just trust its `cooked_allocated_bytes`, and avoid adding allocated bytes on exit path? If we cannot trust that, can we make it trustable while thread is protected by SMR/TLH? > > I thought about doing those, but both would slow down the app/JVM and very likely introduce p99.9/p100 latency outliers that we'd rather not see just because we're sampling. Also, > > 1. The existing thread allocated bytes implementation isn't particularly accurate either, in the sense that reality quickly gets away from the method return values. That was a conscious decision to go with speed and efficiency over stability/accuracy, so I went the same way for this implementation. > 2. Cloud services sample at fairly long intervals in order to avoid overhead: 1 minute is common and 5 minutes not unheard of. 
Stability/accuracy within short time frames is unneeded with such long sampling intervals. Afaiu, SMR/TLH keeps a terminated thread's TLS accessible, but doesn't stop the termination process. If we initialize result with exited_allocated_bytes, the "too small" possibility is still there. We still have a window between that initialization and the creation of the threads iterator where threads may terminate and thus not be included in the iteration. And new threads may become active during the iteration and not be included. As Aleksey observes, an advantage is that threads that terminate during the iteration will be counted only once and we don't have to check for terminated threads. If we stick with adding exited_allocated_bytes to result after the iteration, Volker is correct that we shouldn't include terminated threads in the sum because their allocated bytes values will have been added to exited_allocated_bytes. We have the same undercount possibility that initializing result with exited_allocated_bytes has, so this (fixed) approach can result in "too small" also, but "too large" goes away. Looks like both approaches are equivalent, so let's go with initializing result with exited_allocated_bytes because it avoids a comparison in the loop. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186508768 From phh at openjdk.org Fri May 5 21:38:47 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 21:38:47 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v5] In-Reply-To: References: Message-ID: > Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads. > > Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead. 
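As a point of comparison for the description above, here is a minimal sketch of what can be approximated with the already existing `com.sun.management.ThreadMXBean` methods: summing `getThreadAllocatedBytes(long[])` over the currently live threads. It necessarily misses bytes allocated by threads that have already terminated, which is the gap the proposed method is meant to close. The names `LiveThreadAllocSketch` and `sumLiveThreadAllocatedBytes` are made up for illustration and are not part of the PR; the cast assumes a HotSpot-based JDK.

```java
import java.lang.management.ManagementFactory;

public class LiveThreadAllocSketch {

    // Returns the sum of allocated bytes over live threads, or -1 if the
    // measurement is unsupported or disabled in this JVM.
    static long sumLiveThreadAllocatedBytes() {
        // Assumption: the platform ThreadMXBean implements the
        // com.sun.management extension, as it does on HotSpot.
        com.sun.management.ThreadMXBean mbean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

        if (!mbean.isThreadAllocatedMemorySupported()
                || !mbean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }

        long[] ids = mbean.getAllThreadIds();
        long[] sizes = mbean.getThreadAllocatedBytes(ids);

        long total = 0;
        for (long size : sizes) {
            // A thread that died between the two calls reports -1; skip it.
            if (size > 0) {
                total += size;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("live-thread allocated bytes ~ "
            + sumLiveThreadAllocatedBytes());
    }
}
```

Like the values discussed in the review, the result is only an approximation: threads can start or exit while the ids are being sampled.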
Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13814/files - new: https://git.openjdk.org/jdk/pull/13814/files/2e2adc0b..7b922263 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13814&range=03-04 Stats: 11 lines in 1 file changed: 0 ins; 4 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/13814.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13814/head:pull/13814 PR: https://git.openjdk.org/jdk/pull/13814 From phh at openjdk.org Fri May 5 21:38:48 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 21:38:48 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v5] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 16:43:10 GMT, Aleksey Shipilev wrote: >> Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: >> >> 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM > > src/hotspot/share/include/jmm.h line 55: > >> 53: JMM_VERSION_2 = 0x20020000, // JDK 10 >> 54: JMM_VERSION_3 = 0x20030000, // JDK 14 >> 55: JMM_VERSION_3_0 = 0x20030000, > > What's `JMM_VERSION_3_0`? Removed. > src/hotspot/share/services/management.cpp line 2115: > >> 2113: result += size; >> 2114: } >> 2115: return result + ThreadService::exited_allocated_bytes();; > > Double `;;`. Fixed. > src/hotspot/share/services/threadService.hpp line 111: > >> 109: static jlong exited_allocated_bytes() { return _exited_allocated_bytes; } >> 110: static void incr_exited_allocated_bytes(jlong size) { >> 111: Atomic::add(&_exited_allocated_bytes, size); > > `Atomic::add(&_exited_allocated_bytes, size, memory_order_relaxed);`, please. 
No need for overly-strict memory effects for this counter. Fixed. > src/java.management/share/classes/sun/management/ThreadImpl.java line 535: > >> 533: private static native long getThreadAllocatedMemory0(long id); >> 534: private static native void getThreadAllocatedMemory1(long[] ids, long[] result); >> 535: private static native long getThreadAllocatedMemory2(); > > We can call this one `getAllThreadAllocatedMemory`, which obviates the need for `2` as the suffix. Fixed. > src/jdk.management/share/classes/com/sun/management/ThreadMXBean.java line 159: > >> 157: * >> 158: * @return an approximation of the total memory allocated, in bytes, in >> 159: * heap memory for the current thread, > > I am not sure if typo changes in the public API require a CSR (albeit a trivial one). Maybe skip these updates? Fixed. > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java line 221: > >> 219: // baseline should be positive >> 220: Thread curThread = Thread.currentThread(); >> 221: long cumulative_size = mbean.getAllThreadAllocatedBytes(); > > Java style for variables is camel-case, `cumulativeSize`. Fixed. > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java line 377: > >> 375: throw new RuntimeException(getName() + >> 376: " ThreadAllocatedBytes before = " + size1 + >> 377: " > ThreadAllocatedBytes after = " + size2); > > Is this replaceable with `checkResult(...)`? Yes. Done. > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemoryArray.java line 120: > >> 118: long[] sizes1 = mbean.getThreadAllocatedBytes(ids); >> 119: for (int i = 0; i < NUM_THREADS; i++) { >> 120: checkResult(threads[i], sizes[i], sizes1[i]); > > Since we are cleaning up the test anyway, can we / should we rename `sizes` -> `before`, `size1` -> `after`? I kept "sizes" since its use before isn't really "before". I renamed sizes1 to afterSizes.
> test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemoryArray.java line 164: > >> 162: >> 163: private static void checkResult(Thread curThread, >> 164: long prev_size, long curr_size) { > > camelCase arguments. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186514932 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186515039 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186513534 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186513639 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186513735 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186513794 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186514728 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186514463 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186514571 From phh at openjdk.org Fri May 5 21:38:48 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 5 May 2023 21:38:48 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v5] In-Reply-To: <-G4uTOCHpkgf1qjgTS7NYtbUIbWVTOdOgrhN5XU9kT0=.bce10c8b-f69c-4f10-80ef-8573017fa15f@github.com> References: <-G4uTOCHpkgf1qjgTS7NYtbUIbWVTOdOgrhN5XU9kT0=.bce10c8b-f69c-4f10-80ef-8573017fa15f@github.com> Message-ID: On Fri, 5 May 2023 15:11:26 GMT, Volker Simonis wrote: >> Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: >> >> 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM > > src/hotspot/share/include/jmm.h line 55: > >> 53: JMM_VERSION_2 = 0x20020000, // JDK 10 >> 54: JMM_VERSION_3 = 0x20030000, // JDK 14 >> 55: JMM_VERSION_3_0 = 0x20030000, > > Why do we need `JMM_VERSION_3_0`? We haven't defined `JMM_VERSION_2_0` either. Removed. 
> src/hotspot/share/include/jmm.h line 321: > >> 319: jstring flag_name, >> 320: jvalue new_value); >> 321: jlong (JNICALL *GetAllThreadAllocatedMemory) > > I'm not sure here, but I think there's no need to "overwrite" a *reserved* slot if you add this functionality to a new major release as you do. You also haven't done it when you've added `GetOneThreadAllocatedMemory()` with [JDK-8231209](https://bugs.openjdk.org/browse/JDK-8231209). > > I think we should keep these *reserved* slots for the case when we eventually have to downport new functionality from a later release. Done. > src/hotspot/share/services/management.cpp line 2282: > >> 2280: jmm_FindDeadlockedThreads, >> 2281: jmm_SetVMGlobal, >> 2282: jmm_GetAllThreadAllocatedMemory, > > See comment on overwriting the `reserved6` slot above. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186515133 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186515257 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186515422 From amenkov at openjdk.org Fri May 5 22:36:21 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 5 May 2023 22:36:21 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Fri, 5 May 2023 05:48:04 GMT, Serguei Spitsyn wrote: >> JNI local reporting uses this tricky _is_top_frame/_last_entry_frame stuff >> I think it would be better to have it in the main do_frame method for better readability > > Sorry, I do not see how this improves readability. > Big functions with many layered conditions do not improve readability. 
I mean the pieces of the code that set and use _is_top_frame/_last_entry_frame are close together, so it's easier to see the logic. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186539125 From amenkov at openjdk.org Fri May 5 23:03:38 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 5 May 2023 23:03:38 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v16] In-Reply-To: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: > The fix updates the JVMTI FollowReferences implementation to report references from virtual threads: > - unmounted vthreads are detected, and their stack references are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; > - stacks of mounted vthreads are split into 2 parts (virtual thread stack and carrier thread stack), and references are reported with the correct thread id/class tag/object tags/frame depth; > - common code to handle stack frames is moved into a separate class; > > Threads are reported as: > - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); > - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; > - unmounted vthreads: not reported as heap roots.
Alex Menkov has updated the pull request incrementally with three additional commits since the last revision: - cosmetic changes in libVThreadStackRefTest.cpp - collect VT stack references if initial_object is null - moved transition disabler to correct functions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13254/files - new: https://git.openjdk.org/jdk/pull/13254/files/bb87bdb0..ae2085ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13254&range=14-15 Stats: 42 lines in 2 files changed: 17 ins; 7 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/13254.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13254/head:pull/13254 PR: https://git.openjdk.org/jdk/pull/13254 From amenkov at openjdk.org Fri May 5 23:03:39 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 5 May 2023 23:03:39 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v14] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: <-Kq0OXDGxG72-4XA9v8HgeQsZ-kkAeT-yguDlNyRW1w=.4be6b2c8-313c-4502-b83b-1f14fb0632ae@github.com> On Fri, 5 May 2023 05:59:49 GMT, Serguei Spitsyn wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> Updated test > > test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 39: > >> 37: jint testClassCount; >> 38: jint *count; >> 39: jlong *threadId; > > Camel case is the Java naming convention for identifiers. > Tests normally use camel case only for native methods which are called from Java. 
fixed > test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 106: > >> 104: extern "C" JNIEXPORT jint JNICALL >> 105: Agent_OnLoad(JavaVM *vm, char *options, void *reserved) { >> 106: if (vm->GetEnv(reinterpret_cast(&jvmti), JVMTI_VERSION) != JNI_OK || jvmti == nullptr) { > > Nit: This line is long and not readable. There are many examples in tests of how it is normally done. done > test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 113: > >> 111: memset(&capabilities, 0, sizeof(capabilities)); >> 112: capabilities.can_tag_objects = 1; >> 113: //capabilities.can_support_virtual_threads = 1; > > The line 113 can be removed now. done > test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 130: > >> 128: Java_VThreadStackRefTest_test(JNIEnv* env, jclass clazz, jobjectArray classes) { >> 129: jsize classesCount = env->GetArrayLength(classes); >> 130: for (int i=0; i > Spaces are missing around '=' and '<' signs. fixed > test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 154: > >> 152: } >> 153: >> 154: static void printtCreatedClass(JNIEnv* env, jclass cls) { > > Why is printt with 'tt' ? ttypo :) fixed > test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 167: > >> 165: >> 166: extern "C" JNIEXPORT void JNICALL >> 167: Java_VThreadStackRefTest_createObjAndCallback(JNIEnv* env, jclass clazz, jclass cls, jobject callback) { > > Some comment would be helpful about what this function does.
added ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186547290 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186547055 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186547020 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186547091 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186547193 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186547244 From amenkov at openjdk.org Fri May 5 23:32:33 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 5 May 2023 23:32:33 GMT Subject: RFR: 8306027: Clarify JVMTI heap functions spec about virtual thread stack. [v2] In-Reply-To: References: Message-ID: > The fix updates the JVMTI spec's description of heap functions to support virtual threads. > Virtual threads are not heap roots by design, so the FollowReferences/IterateOverReachableObjects specs are updated to mention only platform threads. > References from thread stacks (including virtual threads) are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL, so the description of the values is relaxed.
Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: updated spec to follow CSR ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13661/files - new: https://git.openjdk.org/jdk/pull/13661/files/8d9e284e..6fd16ef9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13661&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13661&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13661.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13661/head:pull/13661 PR: https://git.openjdk.org/jdk/pull/13661 From cjplummer at openjdk.org Sat May 6 01:22:30 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Sat, 6 May 2023 01:22:30 GMT Subject: RFR: 8307480: Improve SA "transported core" documentation for windows Message-ID: The SA document `transported_core.html` contains some tips on getting core files to work when debugging it on a machine other than the one that produced it. There are a few improvements that can be made based on information provided in [JDK-8306437](https://bugs.openjdk.org/browse/JDK-8306437) and in the #13836 review (which was eventually pulled as not necessary). Updates to the document include the use of `sun.jvm.hotspot.debugger.windbg.imagePath` and `sun.jvm.hotspot.debugger.windbg.symbolPath` properties, and adding "`srv*https://msdl.microsoft.com/download/symbols`" to symbolPath. 
The rendered html file with these changes can be found here: https://htmlpreview.github.io/?https://raw.githubusercontent.com/openjdk/jdk/baa6b36cd5b9c5b953fd9f5b8f4461bd4107ee05/src/jdk.hotspot.agent/doc/transported_core.html ------------- Commit messages: - improve windows info Changes: https://git.openjdk.org/jdk/pull/13849/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13849&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307480 Stats: 21 lines in 1 file changed: 19 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13849.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13849/head:pull/13849 PR: https://git.openjdk.org/jdk/pull/13849 From cjplummer at openjdk.org Sat May 6 01:24:25 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Sat, 6 May 2023 01:24:25 GMT Subject: RFR: 8306758: com/sun/jdi/ConnectedVMs.java fails with "Non-zero debuggee exitValue: 143" Message-ID: This test was very rarely failing with an exitValue 143 from the debuggee. It only happened when the machine was under a lot of stress. After some investigation it was realized that on Unix OSes it should *always* expect exitValue 143, but for some reason was normally getting exitValue 0. The reason 143 should be expected is because `Process.destroy()` is used on the debuggee, which results in a SIGTERM, which should produce exitValue 143. The reason we were not normally seeing this is because the `Process.destroy()` was done while the debuggee was suspended at a breakpoint. Nothing can be done with the SIGTERM while all threads are suspended, but once the debugger does the `vm.resume()` the SIGTERM can be handled. But by that time it is a race between some thread handling SIGTERM and doing the exit(143), and the main debuggee thread resuming and exiting cleanly (producing exitValue 0). In almost all cases the clean exit was winning. By adding a 5 second sleep before exiting, I made it so the SIGTERM exit always wins.
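The exit value described above can be reproduced outside the test with a small sketch. It assumes a Unix-like system with a `sleep` command on the PATH; the class name `SigtermExitSketch` and its helper method are hypothetical and this is not the test's code. `Process.destroy()` requests normal termination, which on Unix-like systems is delivered as SIGTERM (signal 15), and a child killed by a signal is reported as 128 plus the signal number.

```java
import java.util.concurrent.TimeUnit;

public class SigtermExitSketch {

    // Starts a long-running child, destroys it, and returns its exit value.
    static int destroyAndGetExitValue() throws Exception {
        // Assumption: a `sleep` binary is available (Unix-like systems).
        Process p = new ProcessBuilder("sleep", "30").start();

        // Unlike the suspended debuggee in the test, nothing delays the
        // signal here, so there is no race with a clean exit.
        p.destroy();

        if (!p.waitFor(10, TimeUnit.SECONDS)) {
            p.destroyForcibly();
            throw new IllegalStateException("child did not exit in time");
        }
        return p.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // On Unix-like systems this is expected to be 143 (128 + SIGTERM).
        System.out.println("exitValue = " + destroyAndGetExitValue());
    }
}
```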
Once this was in place, I had to make changes so the test would pass with exitCode 143. This was done by adding a `TestScaffold.allowExitValue()` method, which the test can override. Note I'll have more uses for this in the future, as I plan to no longer allow exitValue 1 (exit with an uncaught exception) by default and to require tests to override this method if needed. That will be done by [JDK-8307559](https://bugs.openjdk.org/browse/JDK-8307559). Tested by running all of test/jdk/com/sun/jdi with and without virtual threads, 10 times on each platform. ------------- Commit messages: - Allow 143 exitcode. Changes: https://git.openjdk.org/jdk/pull/13848/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13848&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8306758 Stats: 46 lines in 3 files changed: 36 ins; 2 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/13848.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13848/head:pull/13848 PR: https://git.openjdk.org/jdk/pull/13848 From cjplummer at openjdk.org Sat May 6 01:24:27 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Sat, 6 May 2023 01:24:27 GMT Subject: RFR: 8306758: com/sun/jdi/ConnectedVMs.java fails with "Non-zero debuggee exitValue: 143" In-Reply-To: References: Message-ID: On Fri, 5 May 2023 22:21:36 GMT, Chris Plummer wrote: > This test was very rarely failing with an exitValue 143 from the debuggee. It only happened when the machine was under a lot of stress. After some investigation it was realized that on Unix OSes it should *always* expect exitValue 143, but for some reason was normally getting exitValue 0. The reason 143 should be expected is because `Process.destroy()` is used on the debuggee, which results in a SIGTERM, which should produce exitValue 143. The reason we were not normally seeing this is because the `Process.destroy()` was done while the debuggee was suspended at a breakpoint.
Nothing can be done with the SIGTERM while all threads are suspended, but once the debugger does the `vm.resume()` the SIGTERM can be handled. But by that time it is a race between some thread handling SIGTERM and doing the exit(143), and the main debuggee thread resuming and exiting cleanly (producing exitValue 0). In almost all cases the clean exit was winning. By adding a 5 second sleep before exiting, I made it so the SIGTERM exit always wins. Once this was in place, I had to make changes so the test would pass with exitCode 143. This was done by adding a `TestScaffold.allowExitValue()` method, which the test can override. Note I'll have more uses for this in the future, as I plan to no longer allow exitValue 1 (exit with an uncaught exception) by default and to require tests to override this method if needed. That will be done by [JDK-8307559](https://bugs.openjdk.org/browse/JDK-8307559). > > Tested by running all of test/jdk/com/sun/jdi with and without virtual threads, 10 times on each platform. test/jdk/com/sun/jdi/ConnectedVMs.java line 101: > 99: BreakpointEvent bp = startToMain("InstTarg"); > 100: waitForVMStart(); > 101: StepEvent stepEvent = stepIntoLine(bp.thread()); These changes were needed for virtual thread support. `startToMain("InstTarg")` causes the debuggee to run until it is suspended at a breakpoint in `InstTarg.main()`. `waitForVMStart()` will return right away since the VM has already started, and will return the main thread of the debuggee, but this is the thread running `TestScaffold.main()`, which started up `InstTarg.main()` in a virtual thread. If we single step in the main thread in this case, the single step is not in `InstTarg.main()` as it should be, but is instead in the main thread, which is blocked in the `join()` call waiting for the virtual thread to complete. The single step resumes all threads, but can't complete until the virtual thread exits.
So before the test ever gets to do the `Process.destroy()`, `InstTarg.main()` has already exited. Fortunately it was easy to find the proper thread to single step in, since the virtual thread is the `BreakpointEvent` thread. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13848#discussion_r1186539805 From alanb at openjdk.org Sat May 6 05:21:16 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 6 May 2023 05:21:16 GMT Subject: RFR: 8306027: Clarify JVMTI heap functions spec about virtual thread stack. [v2] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 23:32:33 GMT, Alex Menkov wrote: >> The fix updates JVMTI spec updates description of heap functions to support virtual threads. >> Virtual threads are not heap roots by design, so FollowReference/IterateOverReachableObjects specs are updated to note only platform threads. >> References from thread stacks (including virtual threads) are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL, so description of the values is relaxed. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > updated spec to follow CSR Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13661#pullrequestreview-1415698008 From qamai at openjdk.org Sat May 6 05:35:32 2023 From: qamai at openjdk.org (Quan Anh Mai) Date: Sat, 6 May 2023 05:35:32 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v8] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 07:43:17 GMT, Stefan Karlsson wrote: >> Hi all, >> >> Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository.
It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to. The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439. >> >> The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. >> >> Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. 
The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before. When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach. >> >> Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published.
The patches that could be published separately are: >> >> * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class >> * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject >> * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp >> * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics >> * a2824734d23 UPSTREAM: lir_xchg >> * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI >> * 447259cea42 UPSTREAM: assembler_ppc ANDI >> * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure >> >> Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional to what you are used to see in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against: >> >> >> git fetch https://github.com/openjdk/zgc zgc_master >> git diff zgc_master... >> >> >> There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it. >> >> Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing. > > Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 917 commits: > > - ZGC: Generational > > Co-authored-by: Stefan Karlsson > Co-authored-by: Per Liden > Co-authored-by: Albert Mingkun Yang > Co-authored-by: Erik Österlund > Co-authored-by: Axel Boldt-Christmas > Co-authored-by: Stefan Johansson > - UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class > - UPSTREAM: RISCV tmp reg cleanup resolve_jobject > - CLEANUP: barrierSetNMethod_aarch64.cpp > - UPSTREAM: Add relaxed add&fetch for aarch64 atomics > - UPSTREAM: assembler_ppc CMPLI > > Co-authored-by: TheRealMDoerr > - UPSTREAM: assembler_ppc ANDI > > Co-authored-by: TheRealMDoerr > - UPSTREAM: Add VMErrorCallback infrastructure > - Merge branch 'zgc_generational' into zgc_generational_rebase_target > - Whitespace nit > - ... and 907 more: https://git.openjdk.org/jdk/compare/705ad7d8...349cf9ae src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp line 310: > 308: // A not relocatable object could have spurious raw null pointers in its fields after > 309: // getting promoted to the old generation. > 310: __ cmpw(ref_addr, barrier_Relocation::unpatched); `cmpw` with immediates stalls the predecoder, it may be better to `movzwl` to a spare register and `cmpl` there. src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp line 483: > 481: > 482: __ lock(); > 483: __ cmpxchgq(rbx, Address(rcx, 0)); `ref_addr` is not necessarily materialised here?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1186614250 PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1186640115 From qamai at openjdk.org Sat May 6 08:17:31 2023 From: qamai at openjdk.org (Quan Anh Mai) Date: Sat, 6 May 2023 08:17:31 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v8] In-Reply-To: References: Message-ID: <6SAAbnqbNXzGj7LtOU1fhkg9y87ZR2dKYeRM2RyxO1E=.12002ace-4616-4b73-9306-25da93948b2d@github.com> On Sat, 6 May 2023 04:08:42 GMT, Quan Anh Mai wrote: >> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 917 commits: >> >> - ZGC: Generational >> >> Co-authored-by: Stefan Karlsson >> Co-authored-by: Per Liden >> Co-authored-by: Albert Mingkun Yang >> Co-authored-by: Erik Österlund >> Co-authored-by: Axel Boldt-Christmas >> Co-authored-by: Stefan Johansson >> - UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class >> - UPSTREAM: RISCV tmp reg cleanup resolve_jobject >> - CLEANUP: barrierSetNMethod_aarch64.cpp >> - UPSTREAM: Add relaxed add&fetch for aarch64 atomics >> - UPSTREAM: assembler_ppc CMPLI >> >> Co-authored-by: TheRealMDoerr >> - UPSTREAM: assembler_ppc ANDI >> >> Co-authored-by: TheRealMDoerr >> - UPSTREAM: Add VMErrorCallback infrastructure >> - Merge branch 'zgc_generational' into zgc_generational_rebase_target >> - Whitespace nit >> - ... and 907 more: https://git.openjdk.org/jdk/compare/705ad7d8...349cf9ae > > src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp line 310: > >> 308: // A not relocatable object could have spurious raw null pointers in its fields after >> 309: // getting promoted to the old generation. >> 310: __ cmpw(ref_addr, barrier_Relocation::unpatched); > > `cmpw` with immediates stalls the predecoder, it may be better to `movzwl` to a spare register and `cmpl` there. I think we use the flag `UseStoreImmI16` for these kinds of situations.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1186662246 From sspitsyn at openjdk.org Sat May 6 09:16:21 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 6 May 2023 09:16:21 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v16] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Fri, 5 May 2023 23:03:38 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. > > Alex Menkov has updated the pull request incrementally with three additional commits since the last revision: > > - cosmetic changes in libVThreadStackRefTest.cpp > - collect VT stack references if initial_object is null > - moved transition disabler to correct functions test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 140: > 138: LOG("JVMTI FollowReferences error: %d\n", err); > 139: env->FatalError("FollowReferences failed"); > 140: } Nit: `classesCount` and `heapCallBacks` need c-style names. 
test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 153: > 151: } > 152: > 153: static void printCreatedClass(JNIEnv* env, jclass cls) { Nit: This function should have a c-style name. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186668539 PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186668689 From sspitsyn at openjdk.org Sat May 6 09:19:21 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 6 May 2023 09:19:21 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v16] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Fri, 5 May 2023 23:03:38 GMT, Alex Menkov wrote: >> The fix updates JVMTI FollowReferences implementation to report references from virtual threads: >> - unmounted vthreads are detected, their stack references for JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL; >> - stacks of mounted vthreads are splitted into 2 parts (virtual thread stack and carrier thread stack), references are reported with correct thread id/class tag/object tags/frame depth; >> - common code to handle stack frames are moved into separate class; >> >> Threads are reported as: >> - platform threads: JVMTI_HEAP_REFERENCE_THREAD (as before); >> - mounted vthreads (synthetic references, consider them as heap roots because carrier threads are roots): JVMTI_HEAP_REFERENCE_OTHER; >> - unmounted vthreads: not reported as heap roots. 
> > Alex Menkov has updated the pull request incrementally with three additional commits since the last revision: > > - cosmetic changes in libVThreadStackRefTest.cpp > - collect VT stack references if initial_object is null > - moved transition disabler to correct functions test/hotspot/jtreg/serviceability/jvmti/vthread/FollowReferences/libVThreadStackRefTest.cpp line 181: > 179: } > 180: > 181: static std::atomic timeToExit(false); Nit: This variable should have c-style name. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186669013 From sspitsyn at openjdk.org Sat May 6 09:39:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 6 May 2023 09:39:18 GMT Subject: RFR: 8299414: JVMTI FollowReferences should support references from VirtualThread stack [v9] In-Reply-To: References: <6oQOD_egcB3HyuagMWGSPLjKSE3JkaI2K2WOsDK1Cww=.c568223b-5100-4425-a4b7-defbd812a9ff@github.com> Message-ID: On Fri, 5 May 2023 22:32:59 GMT, Alex Menkov wrote: >> Sorry, I do not see how this improves readability. >> Big functions with many layered conditions do not improve readability. > > I mean the pieces of the code that set and use _is_top_frame/_last_entry_frame are close so it's easier to see the logic I'd say that it will be even better to find out what are manipulations with these instance fields. They are defined in class scope anyway. Also, you can place the definition of function `report_native_frame_refs()` right after `do_frame()` definition, so they occurrences will be still close. I think, it is more important to see the whole logics of the `do_frame()` with less cascading levels. You can give it a try and see the advantage. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13254#discussion_r1186671240 From sspitsyn at openjdk.org Sat May 6 09:42:14 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 6 May 2023 09:42:14 GMT Subject: RFR: 8306027: Clarify JVMTI heap functions spec about virtual thread stack. [v2] In-Reply-To: References: Message-ID: <5ktYT7-Ui1dNBPcBIRLWiLru_nmxftZKZSM3Lu5DckA=.993ce4f5-d4d8-4a48-bc3f-f0915edb9bf9@github.com> On Fri, 5 May 2023 23:32:33 GMT, Alex Menkov wrote: >> The fix updates JVMTI spec updates description of heap functions to support virtual threads. >> Virtual threads are not heap roots by design, so FollowReference/IterateOverReachableObjects specs are updated to note only platform threads. >> References from thread stacks (including virtual threads) are reported as JVMTI_HEAP_REFERENCE_STACK_LOCAL/JVMTI_HEAP_REFERENCE_JNI_LOCAL, so description of the values is relaxed. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > updated spec to follow CSR Looks good. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13661#pullrequestreview-1415749689 From sspitsyn at openjdk.org Sat May 6 09:50:16 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 6 May 2023 09:50:16 GMT Subject: RFR: 8307480: Improve SA "transported core" documentation for windows In-Reply-To: References: Message-ID: On Fri, 5 May 2023 23:53:47 GMT, Chris Plummer wrote: > The SA document `transported_core.html` contains some tips on getting core files to work when debugging it on a machine other than the one that produced it. There are a few improvements that can be made based on information provided in [JDK-8306437](https://bugs.openjdk.org/browse/JDK-8306437) and in the #13836 review (which was eventually pulled as not necessary). 
> > Updates to the document include the use of `sun.jvm.hotspot.debugger.windbg.imagePath` and `sun.jvm.hotspot.debugger.windbg.symbolPath` properties, and adding "`srv*https://msdl.microsoft.com/download/symbols`" to symbolPath. > > The rendered html file with these changes can be found here: > https://htmlpreview.github.io/?https://raw.githubusercontent.com/openjdk/jdk/baa6b36cd5b9c5b953fd9f5b8f4461bd4107ee05/src/jdk.hotspot.agent/doc/transported_core.html Looks good. Posted one comment about a potential typo. Thanks, Serguei src/jdk.hotspot.agent/doc/transported_core.html line 94: > 92: > 93:

> 94: How you set these properties will depend on the SA tool being used. The following in an example of when launching the clhsdb tool: Typo?: `The following in an example of when launching...` => `The following is an example of launching...`. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13849#pullrequestreview-1415750632 PR Review Comment: https://git.openjdk.org/jdk/pull/13849#discussion_r1186672347 From dholmes at openjdk.org Sun May 7 22:20:43 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 7 May 2023 22:20:43 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v4] In-Reply-To: <0Temd9Xn4_R--EJRJWavqC3zOlcJ2eUX1Ff-PdrNuxU=.585c2304-dd30-482b-9c7e-57918abce1e4@github.com> References: <0Temd9Xn4_R--EJRJWavqC3zOlcJ2eUX1Ff-PdrNuxU=.585c2304-dd30-482b-9c7e-57918abce1e4@github.com> Message-ID: On Fri, 5 May 2023 14:50:30 GMT, Afshin Zafari wrote: >> The `finalize()` method is removed from base classes/interfaces and are replaced by a Cleaner callback.. > > Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge branch 'master' into _8305083 > - 8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests > - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests > - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests This is a tricky issue to solve cleanly and minimally. 
If every class extended FinalizableObject then all the implementation could go there, but we have to try and split things across FinalizableObject and Finalizable because some classes only implement the interface - this leads to some unfortunate design choices. I think a better design for the classes that can't extend FinalizableObject would be for them to contain a FinalizableObject, the cleanup action for which would cleanup the host object. That could allow the removal of the Finalizable interface and simplify the general usage patterns. But that may be going too far for this particular PR ... ? test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 30: > 28: * Finalizable interface allows Finalizer to perform finalization of an object. > 29: * Each object that requires finalization at VM shutdown time should implement this > 30: * interface and call the registerClenup to activate a Finalizer hook. Typo: Clenup test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 41: > 39: * @see Finalizer > 40: */ > 41: public void cleanup(); I see now that implementing `registerCleanup` as a default method forces `cleanup` to become a public interface member. That is unfortunate and not what we would generally do - but for this testing framework it may be okay. test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 61: > 59: */ > 60: default public void registerCleanup() { > 61: // install finalizer to print errors summary at exit This was moved from Log but isn't appropriate for the interface. No need to say anything here. test/hotspot/jtreg/vmTestbase/nsk/share/Finalizable.java line 65: > 63: finalizer.activate(); > 64: > 65: // register the cleanup method to be called when this Log instance becomes unreachable. Remove "Log" - actually again this is not needed. You can see what the method does and it is already documented in the doc comment. 
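The containment design suggested above (a class that cannot extend FinalizableObject holds a helper whose cleanup action releases the host's resources) could look roughly like the sketch below. It uses only `java.lang.ref.Cleaner`; the names `CleanerSketch`, `State`, `Host` and `CLEANED` are hypothetical and are not taken from the nsk test framework. One assumption worth stating: the Cleaner API documentation requires that the cleanup action not refer to the registered object, since otherwise that object never becomes phantom reachable, so the cleanup state is kept in a static nested class.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerSketch {
    // One shared Cleaner: each Cleaner creates its own daemon thread,
    // so creating one per object would be wasteful.
    private static final Cleaner CLEANER = Cleaner.create();

    // Counts completed cleanups so the behaviour can be observed.
    static final AtomicInteger CLEANED = new AtomicInteger();

    // Hypothetical cleanup state. It is a static nested class, so it
    // holds no implicit reference to the Host that registered it.
    static final class State implements Runnable {
        private final String resourceName;

        State(String resourceName) {
            this.resourceName = resourceName;
        }

        @Override
        public void run() {
            System.out.println("released " + resourceName);
            CLEANED.incrementAndGet();
        }
    }

    // Hypothetical host class that cannot extend a common base class.
    // It contains the Cleanable instead of implementing cleanup itself.
    static final class Host implements AutoCloseable {
        private final Cleaner.Cleanable cleanable;

        Host(String resourceName) {
            this.cleanable = CLEANER.register(this, new State(resourceName));
        }

        @Override
        public void close() {
            // Runs the action now; the Cleaner guarantees at most one run.
            cleanable.clean();
        }
    }

    public static void main(String[] args) {
        try (Host host = new Host("demo-resource")) {
            System.out.println("using host");
        }
    }
}
```

If `close()` is never called, the same action runs some time after the `Host` becomes unreachable. A lambda such as `() -> cleanup()` written inside an instance method captures `this`, so it would not satisfy the requirement stated above.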
test/hotspot/jtreg/vmTestbase/nsk/share/FinalizableObject.java line 37: > 35: /** > 36: * All instances of this class, should implement their own cleanup method > 37: * to clean appropriately the objects they used. "instances" don't implement methods. Suggestion: "Subclasses should override this method to provide the specific cleanup actions that they need." test/hotspot/jtreg/vmTestbase/nsk/share/MainWrapper.java line 50: > 48: finalizableObject.registerCleanup(); > 49: > 50: Extra blank line not needed test/hotspot/jtreg/vmTestbase/nsk/share/jpda/BindServer.java line 27: > 25: > 26: import java.io.*; > 27: import java.lang.ref.Cleaner; Seems unnecessary test/hotspot/jtreg/vmTestbase/nsk/share/jpda/BindServer.java line 410: > 408: * > 409: * This is replacement of the deprecated finalize() and is called > 410: * when this instance becomes unreachable. I don't think this is needed. test/hotspot/jtreg/vmTestbase/nsk/share/jpda/DebugeeBinder.java line 555: > 553: * > 554: * This is replacement of the finalize() method and is called when this > 555: * instance becomes unreachable. Again not needed. test/hotspot/jtreg/vmTestbase/nsk/share/jpda/DebugeeProcess.java line 89: > 87: this.log = binder.getLog(); > 88: > 89: // As the alternative to finalize(), register the cleanup() method No need to say "As an alternative to finalize()". test/hotspot/jtreg/vmTestbase/nsk/share/jpda/DebugeeProcess.java line 91: > 89: // As the alternative to finalize(), register the cleanup() method > 90: // to be called when this instance becomes unreachable. > 91: Cleaner.create().register(this, () -> cleanup()); Why do we need to do this explicitly here? Why not call `registerCleaner`? test/hotspot/jtreg/vmTestbase/nsk/share/jpda/SocketIOPipe.java line 26: > 24: > 25: import java.io.IOException; > 26: import java.lang.ref.Cleaner; Not needed. ------------- Changes requested by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13420#pullrequestreview-1415965638 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186918144 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186918976 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186918246 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186918278 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186918472 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186918574 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186919232 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186919622 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186919786 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186919870 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186920211 PR Review Comment: https://git.openjdk.org/jdk/pull/13420#discussion_r1186920560 From dholmes at openjdk.org Mon May 8 01:17:18 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 May 2023 01:17:18 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 16:49:38 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. 
(The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). >> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. >> >> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is fixed size, currently with 8 elements. According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet).
When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. >> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. >> >> This change enables to simplify (and speed-up!) a lot of code: >> >> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. >> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. 
Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR >> >> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>> #### Renaissance
>>
>> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
>> | -- | -- | -- | -- | -- | -- | -- |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Only allow lock-stack verification for owning Java threads or at safepoints
updates seem fine. ------------- Marked as reviewed by dholmes (Reviewer).
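The per-thread lock-stack described in the quoted PR text (a small fixed-size array of object references, scanned to answer "does the current thread own this lock?", with overflow and recursive locking falling back to an inflated monitor) can be modelled in a few lines. This is a toy Java model for illustration only and is not HotSpot code: the class `LockStackSketch` and its methods are hypothetical, the capacity of 8 is the figure quoted in the description, and the CAS on the object header is left out entirely.

```java
public class LockStackSketch {
    // Fixed capacity, matching the "currently with 8 elements" figure
    // from the quoted description.
    static final int CAPACITY = 8;

    private final Object[] slots = new Object[CAPACITY];
    private int top = 0;

    /** Answers "does this thread own the lock on o?" by a linear identity scan. */
    public boolean contains(Object o) {
        for (int i = 0; i < top; i++) {
            if (slots[i] == o) {
                return true;
            }
        }
        return false;
    }

    /**
     * Models the fast path of monitorenter. Returns false when the caller
     * would have to take the slow path and inflate to a full monitor:
     * either the stack is full, or the lock is already held (recursion).
     */
    public boolean tryPush(Object o) {
        if (top == CAPACITY || contains(o)) {
            return false;
        }
        slots[top++] = o;
        return true;
    }

    /** Models monitorexit: removes o and closes the gap. Returns false if o was not held. */
    public boolean remove(Object o) {
        for (int i = top - 1; i >= 0; i--) {
            if (slots[i] == o) {
                System.arraycopy(slots, i + 1, slots, i, top - i - 1);
                slots[--top] = null; // drop the stale reference
                return true;
            }
        }
        return false;
    }

    public int size() {
        return top;
    }
}
```

Because the array is tiny, the linear scan in `contains` is the whole ownership check, which is the property the description relies on for the common case.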
PR Review: https://git.openjdk.org/jdk/pull/10907#pullrequestreview-1416009244 From yyang at openjdk.org Mon May 8 01:55:23 2023 From: yyang at openjdk.org (Yi Yang) Date: Mon, 8 May 2023 01:55:23 GMT Subject: RFR: JDK-8306441: Segmented heap dump [v4] In-Reply-To: <6fGz-XulrkMTHQSMPlSvCa-nZwpxf5eglnQXfN1HN3c=.481b0390-8d98-4736-8948-938a7dd94b58@github.com> References: <8YqPPHSW4K1s0t317Kp6UqvoGuv5v9oCbjtQ9FX8p2o=.0f6c687b-d031-401d-901d-1ec532715cdc@github.com> <6fGz-XulrkMTHQSMPlSvCa-nZwpxf5eglnQXfN1HN3c=.481b0390-8d98-4736-8948-938a7dd94b58@github.com> Message-ID: <272QQBFeS-PNisjBRriR79zV7KfHdrJOYUOk2FP421E=.baa179ec-71e1-449e-9155-e709823b7fa7@github.com> On Thu, 4 May 2023 08:40:10 GMT, Yi Yang wrote: >> Hi, heap dump brings about pauses for application's execution(STW), this is a well-known pain. JDK-8252842 have added parallel support to heapdump in an attempt to alleviate this issue. However, all concurrent threads competitively write heap data to the same file, and more memory is required to maintain the concurrent buffer queue. In experiments, we did not feel a significant performance improvement from that. >> >> The minor-pause solution, which is presented in this PR, is a two-stage segmented heap dump: >> >> 1. Stage One(STW): Concurrent threads directly write data to multiple heap files. >> 2. Stage Two(Non-STW): Merge multiple heap files into one complete heap dump file. >> >> Now concurrent worker threads are not required to maintain a buffer queue, which would result in more memory overhead, nor do they need to compete for locks. It significantly reduces 73~80% application pause time. 
>> >> | memory | numOfThread | STW | Total | >> | --- | --------- | -------------- | ------------ | >> | 8g | 1 thread | 15.612 secs | 15.612 secs | >> | 8g | 32 thread | 2.5617250 secs | 14.498 secs | >> | 8g | 96 thread | 2.6790452 secs | 14.012 secs | >> | 16g | 1 thread | 26.278 secs | 26.278 secs | >> | 16g | 32 thread | 5.2313740 secs | 26.417 secs | >> | 16g | 96 thread | 6.2445556 secs | 27.141 secs | >> | 32g | 1 thread | 48.149 secs | 48.149 secs | >> | 32g | 32 thread | 10.7734677 secs | 61.643 secs | >> | 32g | 96 thread | 13.1522042 secs | 61.432 secs | >> | 64g | 1 thread | 100.583 secs | 100.583 secs | >> | 64g | 32 thread | 20.9233744 secs | 134.701 secs | >> | 64g | 96 thread | 26.7374116 secs | 126.080 secs | >> | 128g | 1 thread | 233.843 secs | 233.843 secs | >> | 128g | 32 thread | 72.9945768 secs | 207.060 secs | >> | 128g | 96 thread | 67.6815929 secs | 336.345 secs | >> >>> **Total** means the total heap dump including both two phases >>> **STW** means the first phase only. >>> For parallel dump, **Total** = **STW** + **Merge**. For serial dump, **Total** = **STW** >> >> ![image](https://user-images.githubusercontent.com/5010047/234534654-6f29a3af-dad5-46bc-830b-7449c80b4dec.png) >> >> In actual testing, two-stage solution can lead to an increase in the overall time for heapdump(See table above). However, considering the reduction of STW time, I think it is an acceptable trade-off. Furthermore, there is still room for optimization in the second merge stage(e.g. sendfile/splice/copy_file_range instead of read+write combination). Since number of parallel dump thread has a considerable impact on total dump time, I added a parameter that allows users to specify the number of parallel dump thread they wish to run. >> >> ##### Open discussion >> >> - Pauseless heap dump solution? >> An alternative pauseless solution is to fork a child process, set the parent process heap to read-only, and dump the heap in child process. 
Once writing happens in parent process, child process observes them by userfaultfd and corresponding pages are prioritized for dumping. I'm also looking forward to hearing comments and discussions about this solution. >> >> - Client parser support for segmented heap dump >> This patch provides a possibility that whether heap dump needs to be complete or not, can the VM directly generate segmented heapdump, and let the client parser complete the merge process? Looking forward to hearing comments from the Eclipse MAT community > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > remove useless scope Hi, can I have a review for this? It significantly reduces heapdump STW time. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1537626780 From dholmes at openjdk.org Mon May 8 02:10:23 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 May 2023 02:10:23 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v5] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 21:21:28 GMT, Paul Hohensee wrote: > Afaiu, SMR/TLH keeps a terminated thread's TLS accessible, but doesn't stop the termination process. Incorrect. A thread cannot complete the termination process if it is contained by a TLH - see ` ThreadsSMRSupport::smr_delete` and the call to `wait_until_not_protected`. But not sure that helps with the zero-else-double accounting problem. Any read of the "total accumulated bytes written to date" value is racing with terminating threads.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186966520 From dholmes at openjdk.org Mon May 8 02:24:26 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 May 2023 02:24:26 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v5] In-Reply-To: References: Message-ID: <1-dU7miE3DgmLHf1LWgwDrqppK2l2Y5OwYkms3QMCIs=.398dcfe2-12b6-42e7-ac44-1cec2f729432@github.com> On Fri, 5 May 2023 21:38:47 GMT, Paul Hohensee wrote: >> Please review this addition to com.sun.management.ThreadMXBean that returns the total number of bytes allocated on the Java heap since JVM launch by both terminated and live threads. >> >> Because this PR adds a new interface method, I've updated the JMM_VERSION to 4, but would be happy to update it to 3_1 instead. > > Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: > > 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM src/hotspot/share/services/management.cpp line 2107: > 2105: // when result is initialized. > 2106: jlong result = ThreadService::exited_allocated_bytes(); > 2107: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *thread = jtiwh.next();) { If you call `exited_allocated_bytes` whilst you have an active `ThreadsListHandle` then you at least ensure you don't miss accounting for threads that are just about to terminate. src/hotspot/share/services/threadService.cpp line 173: > 171: // was not called, e.g., JavaThread::cleanup_failed_attach_current_thread(). > 172: decrement_thread_counts(thread, daemon); > 173: ThreadService::incr_exited_allocated_bytes(thread->cooked_allocated_bytes()); By doing this here you increase the likelihood of double-accounting for this thread. If you do this after the thread is no longer on any threads-list you may miss its contribution entirely, but you won't double-count it. 
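
The two orderings discussed in the comments above can be replayed with a small single-threaded model. This is only a sketch under stated assumptions: `FakeThread`, `Model`, `double_count_scenario` and `missed_scenario` are hypothetical names and not HotSpot code, and the interleavings are written out by hand rather than produced by real threads.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a thread's allocation counter.
struct FakeThread {
  int64_t allocated_bytes;
};

// Hypothetical model of the two pieces of state being summed.
struct Model {
  std::vector<const FakeThread*> live;  // stand-in for the threads list
  int64_t exited_bytes = 0;             // stand-in for the exited-thread total

  // Reader: exited total plus every thread in a previously taken snapshot.
  int64_t total(const std::vector<const FakeThread*>& snapshot) const {
    int64_t sum = exited_bytes;
    for (const FakeThread* t : snapshot) {
      sum += t->allocated_bytes;
    }
    return sum;
  }
};

// Exit path publishes its bytes while the thread is still on the list:
// a reader whose snapshot contains the thread counts it twice.
int64_t double_count_scenario() {
  FakeThread t{100};
  Model m;
  m.live.push_back(&t);
  std::vector<const FakeThread*> snapshot = m.live;  // reader takes snapshot
  m.exited_bytes += t.allocated_bytes;               // exit path publishes
  m.live.clear();                                    // then leaves the list
  return m.total(snapshot);                          // exited + snapshot
}

// Exit path leaves the list first and publishes afterwards:
// a reader that runs in between misses the thread entirely.
int64_t missed_scenario() {
  FakeThread t{100};
  Model m;
  m.live.push_back(&t);
  m.live.clear();                                    // thread leaves the list
  std::vector<const FakeThread*> snapshot = m.live;  // reader takes snapshot
  int64_t seen = m.total(snapshot);                  // nothing visible yet
  m.exited_bytes += t.allocated_bytes;               // published too late
  return seen;
}
```

With a single thread that allocated 100 bytes, the first scenario reports 200 and the second reports 0, which is the double-count versus miss trade-off described in the review.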
src/jdk.management/share/classes/com/sun/management/ThreadMXBean.java line 111: > 109: * Returns an approximation of the total amount of memory, in bytes, > 110: * allocated in heap memory since the Java virtual machine was launched, > 111: * including the amount allocated by terminated threads. This "including ..." part seems redundant - it is the value allocated since JVM launch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186967565 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186968365 PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1186968949 From iklam at openjdk.org Mon May 8 04:25:26 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 8 May 2023 04:25:26 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v2] In-Reply-To: <6Jv6JVqGXRI3L_PKDEccnT6fqD5s4VXzD9LOkwt7RWs=.95505a79-eaaf-4ae9-95fa-d0f433f6fdba@github.com> References: <6Jv6JVqGXRI3L_PKDEccnT6fqD5s4VXzD9LOkwt7RWs=.95505a79-eaaf-4ae9-95fa-d0f433f6fdba@github.com> Message-ID: On Fri, 5 May 2023 12:07:20 GMT, Coleen Phillimore wrote: >> The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which is also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries. >> >> Tested with JVMTI and JDI tests locally, and tier1-4 tests. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove return variable from remove lambda, fix formatting. 
I can't comment on the JVMTI changes, but the changes in the hashtable code seem OK to me. src/hotspot/share/classfile/stringTable.cpp line 638: > 636: public: > 637: size_t _errors; > 638: VerifyCompStrings() : _table(unsigned(_items_count / 8) + 1, 0 /* do not resize */), _errors(0) {} Shouldn't this use a regular ResourceHashtable instead? src/hotspot/share/utilities/resizeableResourceHash.hpp line 91: > 89: // Calculate next "good" hashtable size based on requested count > 90: int calculate_resize(bool use_large_table_sizes) const { > 91: const int resize_factor = 2; // by how much we will resize using current number of entries Does this function depend on the template parameters? If not, I think it can be made a static function -- you may need to pass `BASE::number_of_entries()` in as a parameter. src/hotspot/share/utilities/resourceHash.hpp line 147: > 145: */ > 146: bool put_fast(K const& key, V const& value) { > 147: unsigned hv = HASH(key); I think `put_fast` is not clear enough. Maybe `put_must_be_absent()` or something more concise.
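
The naming question above comes down to a difference in contract, which a small stand-alone table can illustrate. This is a sketch only: `TinyTable` and its members are hypothetical and deliberately do not mirror the `ResourceHashtable` API; the name `put_must_be_absent` is borrowed from the suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal chained hash table; not the ResourceHashtable API.
// bucket_count must be greater than zero.
template <typename K, typename V>
class TinyTable {
  std::vector<std::forward_list<std::pair<K, V>>> _buckets;
  std::size_t _entries = 0;

  std::forward_list<std::pair<K, V>>& bucket_for(const K& key) {
    return _buckets[std::hash<K>{}(key) % _buckets.size()];
  }

 public:
  explicit TinyTable(std::size_t bucket_count) : _buckets(bucket_count) {}

  // Walks the bucket first and inserts only if the key is not present.
  // Returns true if a new entry was added.
  bool put_if_absent(const K& key, const V& value) {
    auto& bucket = bucket_for(key);
    for (const auto& entry : bucket) {
      if (entry.first == key) return false;
    }
    bucket.emplace_front(key, value);
    ++_entries;
    return true;
  }

  // Prepends without searching. The caller must guarantee the key is absent,
  // otherwise the table ends up with duplicate entries for that key.
  void put_must_be_absent(const K& key, const V& value) {
    bucket_for(key).emplace_front(key, value);
    ++_entries;
  }

  // Returns a pointer to the stored value, or nullptr if the key is missing.
  const V* get(const K& key) {
    for (const auto& entry : bucket_for(key)) {
      if (entry.first == key) return &entry.second;
    }
    return nullptr;
  }

  std::size_t number_of_entries() const { return _entries; }
};
```

The second method skips the bucket walk, which is where the speed comes from, but it shifts the uniqueness check onto the caller; a name that states that precondition makes the trade visible at the call site.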
------------- PR Review: https://git.openjdk.org/jdk/pull/13818#pullrequestreview-1416091781 PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187009635 PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187005281 PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187009805 From dholmes at openjdk.org Mon May 8 05:28:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 May 2023 05:28:14 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v2] In-Reply-To: References: <6Jv6JVqGXRI3L_PKDEccnT6fqD5s4VXzD9LOkwt7RWs=.95505a79-eaaf-4ae9-95fa-d0f433f6fdba@github.com> Message-ID: <5kwuq2NrEkzznbU4n9tJ4nMDZ2WFZQCobSb04v5srNk=.de876e59-9ea0-4dd5-93f6-fa6cb260bbb5@github.com> On Mon, 8 May 2023 04:21:01 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove return variable from remove lambda, fix formatting. > > src/hotspot/share/utilities/resourceHash.hpp line 147: > >> 145: */ >> 146: bool put_fast(K const& key, V const& value) { >> 147: unsigned hv = HASH(key); > > I think `put_fast` is not clear enough. Maybe `put_must_be_absent()` or something more concise. I would suggest `put_when_absent` to complement `put_if_absent` - with suitable descriptive comments of course. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187035245 From azeller at openjdk.org Mon May 8 06:24:22 2023 From: azeller at openjdk.org (Arno Zeller) Date: Mon, 8 May 2023 06:24:22 GMT Subject: RFR: 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:47:55 GMT, Thomas Stuefe wrote: >> Unless this test is run as root, it needs sudo privileges. If it gets them, the test runs fine, but leaves a file with root ownership. 
So jtreg cannot delete it, and you see errors when "make clean" tries to delete it. >> It's best that we just don't run the test on OSX if sudo privileges. > > Seems reasonable. @tstuefe and @plummercj Thanks for the reviews! As I am no Committer I will need a sponsor - would one of you be so kind to sponsor me? ------------- PR Comment: https://git.openjdk.org/jdk/pull/13791#issuecomment-1537819200 From dholmes at openjdk.org Mon May 8 07:00:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 May 2023 07:00:33 GMT Subject: RFR: JDK-8307331: Correctly update line maps when class redefine rewrites bytecodes In-Reply-To: References: Message-ID: On Fri, 5 May 2023 07:49:10 GMT, Andrew Dinn wrote: >> Looks good. >> Thank you for taking care about it! >> Thanks, >> Serguei > > @sspitsyn Thanks for the review. @adinn Please wait for two reviews for hotspot changes unless designated as trivial. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13795#issuecomment-1537849162 From rkennke at openjdk.org Mon May 8 07:45:24 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 07:45:24 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 01:13:36 GMT, David Holmes wrote: > updates seem fine. Thanks! @dcubed-ojdk are you good with testing? If you could approve this PR again, I would integrate it later today? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1537901860 From stefank at openjdk.org Mon May 8 07:52:20 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 May 2023 07:52:20 GMT Subject: RFR: 8307428: jstat tests doesn't tolerate dash in the O column In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:33:49 GMT, Stefan Karlsson wrote: > When running jstat tests like the following: > test/jdk/sun/tools/jstatd/TestJstatdServer.java > > with Generational ZGC we get a failure because the O (old generation percentage) is reported as `-` and not a number. The reason why it is reported as `-` is that the current capacity of the old generation is zero and that leads to a divide-by-zero in this line: > https://github.com/openjdk/jdk/blob/82a8e91ef7c3b397f9cce3854722cfe4bace6f2e/src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options#L1029 > > G1 has some workarounds for this situation where the reported capacity is slightly above 0. I'm a bit reluctant to add such a hack into Generational ZGC. I've talked to the jstat maintainers and they propose that we simply relax the test. > > Tested locally by running the jstat/jstatd tests in the Generational ZGC branch. Thanks for reviewing! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13796#issuecomment-1537910374 From stefank at openjdk.org Mon May 8 07:55:25 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 May 2023 07:55:25 GMT Subject: Integrated: 8307428: jstat tests doesn't tolerate dash in the O column In-Reply-To: References: Message-ID: On Thu, 4 May 2023 09:33:49 GMT, Stefan Karlsson wrote: > When running jstat tests like the following: > test/jdk/sun/tools/jstatd/TestJstatdServer.java > > with Generational ZGC we get a failure because the O (old generation percentage) is reported as `-` and not a number.
The reason why it is reported as `-` is that the current capacity of the old generation is zero and that leads to a divide-by-zero in this line: > https://github.com/openjdk/jdk/blob/82a8e91ef7c3b397f9cce3854722cfe4bace6f2e/src/jdk.jcmd/share/classes/sun/tools/jstat/resources/jstat_options#L1029 > > G1 has some workarounds for this situation where the reported capacity is slightly above 0. I'm a bit reluctant to add such a hack into Generational ZGC. I've talked to the jstat maintainers and they propose that we simply relax the test. > > Tested locally by running the jstat/jstatd tests in the Generational ZGC branch. This pull request has now been integrated. Changeset: 68f385c1 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/68f385c1ca5f5bef7edfb66d9ec8ebee44cf4860 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8307428: jstat tests doesn't tolerate dash in the O column Reviewed-by: kevinw, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/13796 From jsjolen at openjdk.org Mon May 8 08:36:55 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 May 2023 08:36:55 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v5] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
> > An example of this: > > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr. > > Thanks! Johan Sjölen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Merge remote-tracking branch 'origin/master' into JDK-8300245 - Fix outdated copyright - Manual fix - Fix two faulty NULL_STRING misses - More manual fixes - Merge remote-tracking branch 'origin/master' into JDK-8300245 - Manual fixes - Replace NULL with nullptr in share/jfr/ ------------- Changes: https://git.openjdk.org/jdk/pull/12034/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12034&range=04 Stats: 2065 lines in 125 files changed: 0 ins; 0 del; 2065 mod Patch: https://git.openjdk.org/jdk/pull/12034.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12034/head:pull/12034 PR: https://git.openjdk.org/jdk/pull/12034 From azeller at openjdk.org Mon May 8 08:40:16 2023 From: azeller at openjdk.org (Arno Zeller) Date: Mon, 8 May 2023 08:40:16 GMT Subject: Integrated: 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS In-Reply-To: References: Message-ID: On Thu, 4 May 2023 07:30:49 GMT, Arno Zeller wrote: > Unless this test is run as root, it needs sudo privileges. If it gets them, the test runs fine, but leaves a file with root ownership. So jtreg cannot delete it, and you see errors when "make clean" tries to delete it. > It's best that we just don't run the test on OSX if sudo privileges. This pull request has now been integrated.
Changeset: 5c7ede94 Author: Arno Zeller Committer: Christoph Langer URL: https://git.openjdk.org/jdk/commit/5c7ede94ae59b46c12d40a38bf5b7e15319cc7e2 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod 8307347: serviceability/sa/ClhsdbDumpclass.java could leave files owned by root on macOS Reviewed-by: stuefe, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/13791 From eosterlund at openjdk.org Mon May 8 09:04:39 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 8 May 2023 09:04:39 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v8] In-Reply-To: <6SAAbnqbNXzGj7LtOU1fhkg9y87ZR2dKYeRM2RyxO1E=.12002ace-4616-4b73-9306-25da93948b2d@github.com> References: <6SAAbnqbNXzGj7LtOU1fhkg9y87ZR2dKYeRM2RyxO1E=.12002ace-4616-4b73-9306-25da93948b2d@github.com> Message-ID: On Sat, 6 May 2023 08:14:24 GMT, Quan Anh Mai wrote: >> src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp line 310: >> >>> 308: // A not relocatable object could have spurious raw null pointers in its fields after >>> 309: // getting promoted to the old generation. >>> 310: __ cmpw(ref_addr, barrier_Relocation::unpatched); >> >> `cmpw` with immediates stalls the predecoder, it may be better to `movzwl` to a spare register and `cmpl` there. > > I think we use the flag `UseStoreImmI16` for these kinds of situations. We did indeed run into the predecoder issue when we used testw for normal store barriers, so I changed to testl. However, this cmpw is only taken when we use atomics. I felt less motivated to optimize every bit in this path as the ratio of atomic accesses compared to normal stores/loads is typically really small, when I have profiled it. That's why I haven't optimized this path further. However, we can fix it too. It will however require some changes to the assembler, as it currently tries to be too smart about encoding cmpl with register + immediate operands with varying sizes. 
I'd like to postpone that until after we integrate, as it seems mostly like a micro optimization. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1187207769 From eosterlund at openjdk.org Mon May 8 09:13:39 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 8 May 2023 09:13:39 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v8] In-Reply-To: References: Message-ID: <3biazHwRxoAOqw2VA_W48jB5IUe_asslAOFbTyIpCIg=.fa235ecf-6139-44e4-bb6c-d98ae7188841@github.com> On Sat, 6 May 2023 05:22:48 GMT, Quan Anh Mai wrote: >> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 917 commits: >> >> - ZGC: Generational >> >> Co-authored-by: Stefan Karlsson >> Co-authored-by: Per Liden >> Co-authored-by: Albert Mingkun Yang >> Co-authored-by: Erik Österlund >> Co-authored-by: Axel Boldt-Christmas >> Co-authored-by: Stefan Johansson >> - UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class >> - UPSTREAM: RISCV tmp reg cleanup resolve_jobject >> - CLEANUP: barrierSetNMethod_aarch64.cpp >> - UPSTREAM: Add relaxed add&fetch for aarch64 atomics >> - UPSTREAM: assembler_ppc CMPLI >> >> Co-authored-by: TheRealMDoerr >> - UPSTREAM: assembler_ppc ANDI >> >> Co-authored-by: TheRealMDoerr >> - UPSTREAM: Add VMErrorCallback infrastructure >> - Merge branch 'zgc_generational' into zgc_generational_rebase_target >> - Whitespace nit >> - ... and 907 more: https://git.openjdk.org/jdk/compare/705ad7d8...349cf9ae > > src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp line 483: > >> 481: >> 482: __ lock(); >> 483: __ cmpxchgq(rbx, Address(rcx, 0)); > > `ref_addr` is not necessarily materialised here? I think it is, yes. But we want to ensure it's in a register that isn't rbx or rax. So I figured I'd just force materialize it in rcx and call it a day.
It might be possible to micro optimize this further and even use the live information we have gathered to eliminate some of the spilling, but I'd like to hold off on that until we integrate. It's again only for atomics, and also happens at most once per field. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1187216401 From jsjolen at openjdk.org Mon May 8 09:27:42 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 May 2023 09:27:42 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v6] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr. > > Thanks! Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: It's impossible for an array to be nullptr, remove asserts.
Fails build on Clang systems ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12034/files - new: https://git.openjdk.org/jdk/pull/12034/files/9fc99f4a..2da97d57 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12034&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12034&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12034.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12034/head:pull/12034 PR: https://git.openjdk.org/jdk/pull/12034 From jsjolen at openjdk.org Mon May 8 10:00:45 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 May 2023 10:00:45 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v7] In-Reply-To: References: Message-ID: <3WDz3tQU0xVjKHdIgUWUADDFPCvdgiZVAQLMiPtWyvQ=.0154e5b8-df85-43e5-9914-479692be0a09@github.com> > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr. > > Thanks! 
Johan Sjölen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Missing fixes - Merge branch 'master' into JDK-8300245 - It's impossible for an array to be nullptr, remove asserts. Fails build on Clang systems - Merge remote-tracking branch 'origin/master' into JDK-8300245 - Fix outdated copyright - Manual fix - Fix two faulty NULL_STRING misses - More manual fixes - Merge remote-tracking branch 'origin/master' into JDK-8300245 - Manual fixes - ... and 1 more: https://git.openjdk.org/jdk/compare/5c7ede94...cb705720 ------------- Changes: https://git.openjdk.org/jdk/pull/12034/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12034&range=06 Stats: 2070 lines in 125 files changed: 1 ins; 4 del; 2065 mod Patch: https://git.openjdk.org/jdk/pull/12034.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12034/head:pull/12034 PR: https://git.openjdk.org/jdk/pull/12034 From jsjolen at openjdk.org Mon May 8 10:09:29 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 May 2023 10:09:29 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v8] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
> > An example of this: > > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr. > > Thanks! Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Dead assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12034/files - new: https://git.openjdk.org/jdk/pull/12034/files/cb705720..6018ab38 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12034&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12034&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12034.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12034/head:pull/12034 PR: https://git.openjdk.org/jdk/pull/12034 From jsjolen at openjdk.org Mon May 8 10:09:31 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 May 2023 10:09:31 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v7] In-Reply-To: <3WDz3tQU0xVjKHdIgUWUADDFPCvdgiZVAQLMiPtWyvQ=.0154e5b8-df85-43e5-9914-479692be0a09@github.com> References: <3WDz3tQU0xVjKHdIgUWUADDFPCvdgiZVAQLMiPtWyvQ=.0154e5b8-df85-43e5-9914-479692be0a09@github.com> Message-ID: On Mon, 8 May 2023 10:00:45 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> Macros having their NULL changed to nullptr, these are added to the script when I find them.
They should be NULL. >> nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr. >> >> Thanks! > > Johan Sj?len has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Missing fixes > - Merge branch 'master' into JDK-8300245 > - It's impossible for an array to be nullptr, remove asserts. > > Fails build on Clang systems > - Merge remote-tracking branch 'origin/master' into JDK-8300245 > - Fix outdated copyright > - Manual fix > - Fix two faulty NULL_STRING misses > - More manual fixes > - Merge remote-tracking branch 'origin/master' into JDK-8300245 > - Manual fixes > - ... and 1 more: https://git.openjdk.org/jdk/compare/5c7ede94...cb705720 Alright, @mgronlun, @egahlin, looks like the JFR null conversion is done. I'm currently running the tier1 tests. I removed a few asserts that didn't do anything (Clang complains about unnecessary comparisons), see here: https://github.com/openjdk/jdk/pull/12034/commits/2da97d578cf008170a0881d44844516370a6f456 ------------- PR Comment: https://git.openjdk.org/jdk/pull/12034#issuecomment-1538107444 From fyang at openjdk.org Mon May 8 10:22:53 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 8 May 2023 10:22:53 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 06:50:55 GMT, Stefan Karlsson wrote: >> We emailed to erik to discuss this issue two months ago, and maybe he missed it. 
>> ZForwardingTest does not guarantee a successful invoke of os::commit_memory for ZAddressHeapBase, and we saw some conflicts between ZAddressHeapBase and the metadata address space on the RISC-V hardware of 39-bits virtual address. There is no failure in the normal initialization phase of JVM, because the commit order of them is guaranteed. > > Could you provide the values for `reserved`, `ZAddressHeapBase`, and `ZAddressOffsetMax`, when this test is failing. I'd like to know if we can make a workaround for you, or if we have to turn off the test for riscv. @stefank : I ran this gtest for 5 times and here is what I got. ZAddressHeapBase : 0x800000000 ZAddressOffsetMax: 0x800000000 ZGranuleSize : 0x200000 In os::pd_attempt_reserve_memory_at() which is called by os::attempt_reserve_memory_at(), return value by anon_mmap() [1] is one of: ```0x3f8d5ff000, 0x3f649fe000, 0x3f5d3ff000, 0x3f68077000 and 0x3f555ff000``` So seems that those values are not in the range [ZAddressHeapBase, ZAddressHeapBase+ZAddressOffsetMax). [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/linux/os_linux.cpp#L3334 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1187278971 From qamai at openjdk.org Mon May 8 10:52:38 2023 From: qamai at openjdk.org (Quan Anh Mai) Date: Mon, 8 May 2023 10:52:38 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v8] In-Reply-To: References: <6SAAbnqbNXzGj7LtOU1fhkg9y87ZR2dKYeRM2RyxO1E=.12002ace-4616-4b73-9306-25da93948b2d@github.com> Message-ID: On Mon, 8 May 2023 09:01:07 GMT, Erik Österlund wrote: >> I think we use the flag `UseStoreImmI16` for these kinds of situations. > > We did indeed run into the predecoder issue when we used testw for normal store barriers, so I changed to testl. However, this cmpw is only taken when we use atomics.
I felt less motivated to optimize every bit in this path as the ratio of atomic accesses compared to normal stores/loads is typically really small, when I have profiled it. That's why I haven't optimized this path further. However, we can fix it too. It will however require some changes to the assembler, as it currently tries to be too smart about encoding cmpl with register + immediate operands with varying sizes. I'd like to postpone that until after we integrate, as it seems mostly like a micro optimization. @fisk Thanks a lot for your explanations. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1187303107 From jsjolen at openjdk.org Mon May 8 11:04:29 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 8 May 2023 11:04:29 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v8] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 10:09:29 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr.
>> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Dead assert Passes tier1. ------------- PR Comment: https://git.openjdk.org/jdk/pull/12034#issuecomment-1538178989 From duke at openjdk.org Mon May 8 11:33:05 2023 From: duke at openjdk.org (Afshin Zafari) Date: Mon, 8 May 2023 11:33:05 GMT Subject: RFR: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests [v5] In-Reply-To: References: Message-ID: > The `finalize()` method is removed from base classes/interfaces and is replaced by a Cleaner callback. Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: - Merge master - Merge branch 'master' into _8305083 - Merge master - 8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests - 8305083: 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests - 8305083: Remove finalize() from test/hotspot/jtreg/vmTestbase/nsk/share/ and /jpda that are used in serviceability/dcmd/framework tests ------------- Changes: https://git.openjdk.org/jdk/pull/13420/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13420&range=04 Stats: 338690 lines in 3359 files changed: 25593 ins; 279624 del; 33473 mod Patch: https://git.openjdk.org/jdk/pull/13420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13420/head:pull/13420 PR: https://git.openjdk.org/jdk/pull/13420 From mgronlun at openjdk.org Mon May 8 12:17:30 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 8 May 2023 12:17:30 GMT Subject: RFR:
JDK-8300245: Replace NULL with nullptr in share/jfr/ [v8] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 10:09:29 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/jfr/. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> Note how nullptr participates in a code expression here, we really are talking about the specific value nullptr. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Dead assert Marked as reviewed by mgronlun (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/12034#pullrequestreview-1416686566 From mgronlun at openjdk.org Mon May 8 12:17:31 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 8 May 2023 12:17:31 GMT Subject: RFR: JDK-8300245: Replace NULL with nullptr in share/jfr/ [v8] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 11:01:56 GMT, Johan Sjölen wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Dead assert > > Passes tier1. Thanks @jdksjolen , I am rubberstamping this.
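For readers following the NULL-to-nullptr conversion reviewed in this thread, here is a minimal sketch of why `nullptr` is the safer spelling. It is not taken from the JFR sources: `describe` is a hypothetical overload set invented for illustration, while `ret_null` and `is_nullptr` only flesh out the two declarations quoted in the PR description. `nullptr` has its own type (`std::nullptr_t`), so it can only select a pointer overload, whereas a literal `0` selects the integer one. The comments follow the convention described in the PR: lower-case "null" in prose, `nullptr` only inside a code expression.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical overload set, for illustration only.
inline std::string describe(int)         { return "int"; }
inline std::string describe(const char*) { return "pointer"; }

// This function returns null.
inline void* ret_null() { return nullptr; }

// This function returns true if *x == nullptr.
// The caller must pass a non-null x.
inline bool is_nullptr(void** x) {
  assert(x != nullptr);
  return *x == nullptr;
}
```

Calling `describe(nullptr)` picks the pointer overload, and `describe(0)` picks the `int` overload.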
------------- PR Comment: https://git.openjdk.org/jdk/pull/12034#issuecomment-1538260329 From stefank at openjdk.org Mon May 8 12:51:13 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 May 2023 12:51:13 GMT Subject: RFR: 8307058: Implementation of Generational ZGC [v6] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 10:19:44 GMT, Fei Yang wrote: >> Could you provide the values for `reserved`, `ZAddressHeapBase`, and `ZAddressOffsetMax`, when this test is failing. I'd like to know if we can make a workaround for you, or if we have to turn off the test for riscv. > > @stefank : I ran this gtest for 5 times on linux-riscv64 board and here is what I got. > > ZAddressHeapBase : 0x800000000 > ZAddressOffsetMax: 0x800000000 > ZGranuleSize : 0x200000 > > In os::pd_attempt_reserve_memory_at() which is called by os::attempt_reserve_memory_at(), return value by anon_mmap() [1] is one of: ```0x3f8d5ff000, 0x3f649fe000, 0x3f5d3ff000, 0x3f68077000 and 0x3f555ff000``` > > So seems that those values are not in the range [ZAddressHeapBase, ZAddressHeapBase+ZAddressOffsetMax). > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/linux/os_linux.cpp#L3334 That's unfortunate. Could you try this patch, which probes the address range to see if it can reserve the memory somewhere else within `[ZAddressHeapBase, ZAddressHeapBase+ZAddressOffsetMax)`: https://github.com/stefank/jdk/tree/zgc_generational_review_test_zforwarding ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1187406599 From heidinga at redhat.com Mon May 8 13:41:14 2023 From: heidinga at redhat.com (Dan Heidinga) Date: Mon, 8 May 2023 09:41:14 -0400 Subject: [External] : Re: JEP draft: Integrity and Strong Encapsulation In-Reply-To: <4D285258-D117-4978-9174-34573BC05F70@oracle.com> References: <4D285258-D117-4978-9174-34573BC05F70@oracle.com> Message-ID: Thanks for the response, Ron. My comments are in line. 
On Fri, May 5, 2023 at 8:10 AM Ron Pressler wrote: > > > On 4 May 2023, at 21:32, Dan Heidinga wrote: > > > I've read this draft a number of times and each time I struggled with the > framing of the problem given Java's success over the past almost 30 years. > > > The old regime worked when: 1. Almost all the runtime was written in C++ > (so the fact Java code couldn't really establish invariants didn't matter > as much), 2. The JDK evolved at a slow pace, and 3. Java applications were > deployed in a particular way. That lasted for a very long time, but all of > these are now changing: 1. More and more of the runtime is being written > (or rewritten) in Java, 2. The JDK is evolving faster, and 3. New > deployment kinds are desired. > I agree the old regime worked. It worked well and enabled Java to flourish as a stable base for applications built on top of the runtime. And many of those applications have chosen to "violate integrity" to achieve business goals. Enforcing more constraints on the ecosystem to make JDK development / maintenance easier isn't necessarily a winning strategy for the applications built on top of the runtime. Especially given we have existing tools - such as marking specific classes as "unmodifiable" [0] - that would allow the VM to enforce invariants on critical implementation classes that are ported from C++ to Java and could be extended to protect the runtime further. Can you speak further to the "new deployments" and why integrity constraints are critical to them? [0] https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#IsModifiableClass > > In light of this new situation, problems are arising due to the old > regime, which isn't working so well anymore.
> > As JEP 411 states, the SecurityManager has: > * Brittle permission model > * Difficult programming model > * Poor performance > Which translates into a whole lot of cost both for maintainers of the JDK > and for all users who must pay the runtime costs related to the > SecurityManager (high when enabled, but non-zero always). > > Although the SecurityManager has high costs, and is infrequently used at > runtime in production, it provides the only way to limit certain > capabilities like: > * JNI (SecurityManager::checkLink) > * Encapsulation (SecurityManager::checkPackageAccess) > * Launch new processes (SecurityManager::checkExec) > * Reflective access (accessDeclaredMembers, etc) > * and others > > Some of those controls need replacements if the SecurityManager will go > away. JNI, surprisingly, is a key one here for large corporations. > > If I understand correctly, this new Integrity JEP draft aims, amongst > other things, to replace the hard to maintain, expensive runtime checks of > the SecurityManager with configuration via command line options. This > allows those who previously relied on the SecurityManager to continue to > control the high-order bits of functionality without imposing a cost on the > rest of the ecosystem. It also makes it easier to determine which > libraries are relying on the restricted features. > > Overall, this provides a smoother migration path for users, makes the > intention of users very clear (just read the command line vs auditing > SecurityManager implementation) and improves performance by shifting these > decisions to configuration time rather than paying cost of code complexity > and stack walks. > > I also appreciate the "nudge" being made with this JEP by requiring > explicit opt-in to disabling protections versus the previous uphill battle > to enable the SecurityManager. It makes for an easier conversation to ask > for i.e.
JNI to be enabled for one library on the command line rather than > having to deal with all the potential restrictions of the SecurityManager. > > > The relationship between security and integrity is as follows: integrity > is a prerequisite to robust security (i.e. security that doesn't require > full-program analysis). That's because security depends on maintaining > security invariants - e.g. a sensitive method is only ever called after an > access check - and there can be no robust invariants, aka integrity > invariants, *of any kind* without integrity. > > SecurityManager was a security mechanism, and because robust security > requires integrity, SecurityManager *also* had to offer integrity. But > strong encapsulation isn't a security mechanism. It is an integrity > mechanism. As such, it makes it *possible* to build robust security > mechanisms, such as an authorisation mechanism, at any layer: the JDK, > frameworks/libraries, the application. Without integrity, it would be > impossible to build such security mechanisms at any layer. In a way, > SecurityManager served as an excuse of sorts: if you really needed > integrity you could have hypothetically achieved it using SM (though in > practice it was hard). > That's a fair characterization. I see this JEP draft as a necessary foundational step towards the removal of the SecurityManager. Without the limitations being proposed by this JEP, there is nothing the runtime offers to fill the gap produced by removing the SecurityManager. I think it's worth calling out that this JEP draft is an enabling step towards the complete removal of the deprecated SecurityManager. > > You are right that strong encapsulation's "permissions" are, by design, > more coarsely grained than SM's security permissions, but that's not the > only difference, or even the main one. A bigger difference is that it is > quite normal for an application to give some component/user access to some > file.
On the other hand, it is abnormal and relatively rare for an > application to grant *any* strong-encapsulation-breaking permissions (those > that override the permissions in modules' module-info, that is) with the > possible exception of --enable-native-access to allow JNI/FFM. Few programs > should have *any* of --add-exports/add-opens/patch-module in production > (although it's normal in whitebox testing); these are all red flags. Unlike > a "reasonable" security policy, which is quite complex, the only reasonable > integrity configuration is the empty one, again, with the exception of > --enable-native-access; a *minority* of programs may also have -javaagent. > It's a great vision statement but the unfortunate reality is much messier. Most programs - especially given the current adoption of modules - will need --add-exports/add-opens until their dependencies are all fully modularized and even then, if today's setAccessible use is any indication, will continue to use those options. Additionally, -javaagent is a key enabler of Observability tooling. I'd be surprised if only a minority of programs were deployed with monitoring agents... in fact, I expect that given the increasing emphasis on Observability, usage will increase, especially with these tools needing to switch away from dynamic attach. > > So it's not just fine-grained vs. coarse-grained, opt-in vs. opt out, but > also: the "right" configuration is the default one or one that's very close > to it. > > > So while overall, when viewed from the lens of removing the > SecurityManager, this approach makes sense, I do want to caution on betting > against Java's strengths, particularly against its use of speculative > optimizations. > > > Neither a person reading the code nor the platform itself - as it > compiles and runs it - can fully be assured that the code does what it says > or that its meaning does not change over time as the program runs. > …..
> > In the Java runtime, certain optimizations assume that conditions that > hold at the time the optimization is made hold forever. > > This is the basis of all speculative optimization - the platform assumes > the meaning doesn't change and compiles as though it won't. If the > application is modified at runtime, the JVM applies the necessary > compensations such as deoptimization and recompilation. > > Java has bet on dynamic features time and again (even when others have > championed static approaches) and those bets - backed by speculative > optimizations - have paid off time and again. So this can't be what you're > arguing against. > > If the concern is that the runtime behaviour may appear to be different > than the intent expressed in the source code due to use of setAccessible or > changes by agents, then I think the JEP should be more explicit about that > concern. The current wording reads as equally applying to many of Java's > existing dynamic behaviours (and belies the power of speculation coupled > with deoptimization!). > > > I'm certainly not arguing against the power of speculative optimisation. > It has certainly worked time and again for Java... except when it doesn't. > For example, Valhalla realised that value objects cannot be *just* a > speculative optimisation, and a different user-facing model, with stricter > integrity invariants are needed. > As a member of the Valhalla EG, I can confidently state that many of the Valhalla requirements come out of the underlying "vm physics" and need to reflect those tradeoffs in a way that makes sense to developers who aren't familiar with the ins-and-outs of the core runtime. Valhalla still bets hard on speculation - preferring to assume "this won't be null" for most values rather than hard code that into the underlying runtime (see recent discussions on removing the "Q" descriptor).
> In this JEP, however, I'm mostly hinting at link-time (or, in any event, > pre-production-runtime) optimisations that may come in Project Leyden. It's > not so much the difference between the source code and what ends up running > that matters, but what some form of analysis (either static or dynamic > during a trial-run) sees vs. what the application may later do. In some > cases, speculation that falls back on deopt may do, but for other, > "tighter" link-time/pre-run optimisations, it may prove insufficient. The > platform would need to know that the meaning of the program does not change > between the time the optimisations are performed and the time the program > is run. > Or it needs to be able to cheaply and quickly validate that the assumptions made based on the training runs / static analysis continue to hold in this new run. Some of those assumptions will hold by fiat while others will need to be checked. Enforcing additional integrity checks may make that analysis easier but is unlikely to remove the need to validate the assumptions. > > As for dynamic features, we need to separate regular reflection - which > isn't affected at all - from deep reflection. The two primary uses for deep > reflection in production are dependency injection and serialization. But > dependency injection requires only a very controlled form of deep > reflection - one that is nicely served by Lookups, and the use of deep > reflection in serialization is considered a mistake that can and should be > fixed ( > https://openjdk.org/projects/amber/design-notes/towards-better-serialization). > Until then, the JDK offers special provisions for serialization libraries > that wish to serialize JDK objects ( > https://github.com/openjdk/jdk/blob/master/src/jdk.unsupported/share/classes/sun/reflect/ReflectionFactory.java). > There is no reason --add-opens shouldn't be rare. > Has there been any analysis on how common --add-opens actually is?
Or has the use of setAccessible (as a proxy for --add-opens) been analyzed to validate the assumptions here? If that analysis could be shared it would help to validate the assumptions being stated here. I know we've examined common corpuses as part of other JSRs to validate ie how widespread "_" was used as variable name before restricting it. Can the same be done here (if it hasn't already)? > > > > For example, every developer assumes that changing the signature of a > private method, or removing a private field, does not impact the class's > clients. > > Right. The private modifier defines a *contract* which states anyone > depending on the implementation details are on their own and shouldn't be > surprised by changes. I understand that it can be problematic when large > successful frameworks are broken by such changes, but that doesn't > invalidate the contract that's in place. The risk is higher for the JDK > than for other libraries or applications given the common dependency on the > JDK. > > > True, which is why we're not forcing libraries to be modularised (although > they may have to be modularised to enjoy some of the features that Project > Leyden may end up delivering). > > But I'll also say this. What we know *now* that the designers of Java 1.0 > didn't know is that that contract - at least as far as the JDK goes - > wasn't respected, which ended up giving users a bad upgrade experience > especially since the rate of the platform's evolution started rising. We > can advise library authors not to do something time and again, but they > care about their own users, as they should, and so justify doing what they > do. Even though everyone is justified in pursuing their interests, the end > result has been a tragedy of the commons. As the maintainers of the > platform, our user base is the entire Java ecosystem as a whole and, as it > turned out, some regulatory intervention is needed to stop this tragedy of > the commons.
> For applications that made the jump to a version > 9, the upgrade from release to release has been (to my knowledge) fairly smooth apart from dealing with --illegal-access=deny becoming mandatory. > > > > However, with deep reflection, doSensitiveOperation could be invoked > from anywhere without an isAuthorized check, nullifying the intended > restriction; even worse, an agent could modify the code of the isAuthorized > method to always return true. > > And clearly, these would be bugs. Not much different than leaking a > privileged MethodHandles.Lookup object outside a Class's nest (the boundary > for private access) for which there is no enhanced integrity check. > > We can't fully protect users from code that does the wrong thing even > while undertaking efforts to minimize the attack surface. "Superpowers" > are exactly that, while we support making them opt-in, we should be careful > not to overstate the risk as the same principle applies to all code running > in a process - it must be trusted as it has the same privileges as the > process. > > > When it comes to security, such bugs are known as vulnerabilities (though > not necessarily exploits), and we must differentiate between them depending > on which side of the encapsulation boundary these vulnerabilities lie. If a > security-sensitive class has a bug that causes it to leak a capabilities > object that's one thing, but if a bug in a serialization library that uses > a super-powered deep-reflection library could have its inputs manipulated > so that a security-sensitive class is compromised, that's a whole other > story. > > Strong encapsulation builds bulkheads that allow a sensitive module to be > analysed *in isolation*, given its well-defined surface area, and robustly > protected from vulnerabilities in *other* modules. That's precisely why > integrity is required for robust security.
Obviously, no security > mechanism is perfect, but strong encapsulation gives the authors of > security mechanisms a very valuable tool. > > While talking about this subject it's worth mentioning that the Java > Platform should provide the necessary integrity, but it can't provide all > the sufficient integrity. Some integrity guarantees must also be provided > by OS mechanisms (say, filesystem and process isolation) and even hardware > mechanism (timing/rowhammer etc.). To be as secure as possible, a security > mechanism must rely on the integrity of all layers below it. > I think we're in the same ballpark here - there's a balancing act between what the runtime can provide and the risk of running any code on a system. > > > > A tool like jlink could remove unused strongly-encapsulated methods at > link time to reduce image size and class loading time. > > Most of the benefit here is not time saved by not loading the methods, > it's actually due to avoiding the need to load classes during > verification. The verifier needs to validate relationships between classes > and every extra method potentially asserts new relationships (such as class > X subclasses Throwable) and it is these extra classes that need loading > that typically increases the startup time. > > > Right. I count that as class loading, or startup time. > > > > The guarantee that code may not change over time even opens the door to > ahead-of-time compilation (AOT). > > AOT doesn't depend on the code never changing. OpenJ9 has AOT code that > is resilient in the face of changes to the underlying Java class files. > I'm positive Hotspot will be able to develop similar resilient AOT code. > The cost of validating the assumptions made while AOT compiling is much > lower than doing the compile while still enabling Java's dynamic features. > > > There are different kinds of AOT compilation, and Leyden may allow > multiple modes.
Some may support deoptimisation, and others may not (or may > even not have class files available to them at all). Given an application > configuration, we want to know which modes are possible and what link-time > transformation is needed or possible. > > - Ron > Thanks, --Dan From coleenp at openjdk.org Mon May 8 14:02:30 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 May 2023 14:02:30 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v2] In-Reply-To: References: <6Jv6JVqGXRI3L_PKDEccnT6fqD5s4VXzD9LOkwt7RWs=.95505a79-eaaf-4ae9-95fa-d0f433f6fdba@github.com> Message-ID: On Mon, 8 May 2023 04:20:21 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove return variable from remove lambda, fix formatting. > > src/hotspot/share/classfile/stringTable.cpp line 638: > >> 636: public: >> 637: size_t _errors; >> 638: VerifyCompStrings() : _table(unsigned(_items_count / 8) + 1, 0 /* do not resize */), _errors(0) {} > > Shouldn't this use a regular ResourceHashtable instead? It didn't trivially compile and I didn't want to change the code for this unrelated table to fix this bug. I will file a new RFE to fix this. > src/hotspot/share/utilities/resizeableResourceHash.hpp line 91: > >> 89: // Calculate next "good" hashtable size based on requested count >> 90: int calculate_resize(bool use_large_table_sizes) const { >> 91: const int resize_factor = 2; // by how much we will resize using current number of entries > > Does this function depend on the template parameters? If not, I think it can be made a static function -- you may need to pass `BASE::number_of_entries()` in as a parameter. I don't see the reason to do that. It makes the caller noisier.
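To make the trade-off in this exchange concrete, here is a sketch of what a version of the resize calculation that does not depend on the template parameters could look like, with the current entry count passed in by the caller. This is an illustration only: the function name mirrors the one under review, but the signature and the candidate size table are assumptions and are not the code in resizeableResourceHash.hpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-alone variant: returns the first candidate size that is
// at least resize_factor times the current number of entries.
// The candidate sizes are illustrative values, not HotSpot's actual list.
inline int calculate_resize(int number_of_entries) {
  assert(number_of_entries >= 0);
  const int resize_factor = 2;
  static const int table_sizes[] = { 107, 1009, 2017, 4049, 10103, 20201, 40423 };
  const std::size_t count = sizeof(table_sizes) / sizeof(table_sizes[0]);

  const int requested = resize_factor * number_of_entries;
  for (std::size_t i = 0; i < count; i++) {
    if (table_sizes[i] >= requested) {
      return table_sizes[i];
    }
  }
  // The request exceeds every candidate: fall back to the largest size.
  return table_sizes[count - 1];
}
```

The cost the reply points at is visible here: every caller inside the table class would have to write something like `calculate_resize(number_of_entries())` instead of a bare `calculate_resize()`.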
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187480076 PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187483036 From coleenp at openjdk.org Mon May 8 14:02:33 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 May 2023 14:02:33 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v2] In-Reply-To: <5kwuq2NrEkzznbU4n9tJ4nMDZ2WFZQCobSb04v5srNk=.de876e59-9ea0-4dd5-93f6-fa6cb260bbb5@github.com> References: <6Jv6JVqGXRI3L_PKDEccnT6fqD5s4VXzD9LOkwt7RWs=.95505a79-eaaf-4ae9-95fa-d0f433f6fdba@github.com> <5kwuq2NrEkzznbU4n9tJ4nMDZ2WFZQCobSb04v5srNk=.de876e59-9ea0-4dd5-93f6-fa6cb260bbb5@github.com> Message-ID: <8aXM8ad_I0zShBomKKFWOZJKzC6y7OWRXsysCtBDryI=.d576926e-dc1b-4659-9b7c-a78dd3f074b0@github.com> On Mon, 8 May 2023 05:25:04 GMT, David Holmes wrote: >> src/hotspot/share/utilities/resourceHash.hpp line 147: >> >>> 145: */ >>> 146: bool put_fast(K const& key, V const& value) { >>> 147: unsigned hv = HASH(key); >> >> I think `put_fast` is not clear enough. Maybe `put_must_be_absent()` or something more concise. > > I would suggest `put_when_absent` to complement `put_if_absent` - with suitable descriptive comments of course. This is a good name. Updated. 
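The naming distinction settled on here can be sketched with a toy chained hash table. This is not HotSpot's ResourceHashtable: `TinyTable` and its members are hypothetical names, and the sketch assumes `std::hash` and a fixed bucket count. `put_if_absent` walks the bucket looking for the key before inserting, while `put_when_absent` relies on the caller's guarantee that the key is not present and simply prepends to the bucket list.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <functional>
#include <utility>
#include <vector>

// Toy chained hash table; hypothetical, not the HotSpot implementation.
template <typename K, typename V>
class TinyTable {
  std::vector<std::forward_list<std::pair<K, V>>> _buckets;
  std::size_t _entries = 0;

  std::forward_list<std::pair<K, V>>& bucket_for(const K& key) {
    return _buckets[std::hash<K>{}(key) % _buckets.size()];
  }

 public:
  explicit TinyTable(std::size_t bucket_count) : _buckets(bucket_count) {
    assert(bucket_count > 0);
  }

  // Returns a pointer to the value stored for key, or nullptr if absent.
  V* get(const K& key) {
    for (auto& entry : bucket_for(key)) {
      if (entry.first == key) return &entry.second;
    }
    return nullptr;
  }

  // Searches the bucket first and inserts only if the key is absent.
  // Returns true if a new entry was added.
  bool put_if_absent(const K& key, const V& value) {
    if (get(key) != nullptr) return false;
    bucket_for(key).emplace_front(key, value);
    ++_entries;
    return true;
  }

  // The caller guarantees the key is absent, so no search is done in
  // release builds: the entry is prepended to the bucket in constant time.
  // The assert re-checks that guarantee in debug builds only.
  void put_when_absent(const K& key, const V& value) {
    assert(get(key) == nullptr && "key must not already be present");
    bucket_for(key).emplace_front(key, value);
    ++_entries;
  }

  std::size_t number_of_entries() const { return _entries; }
};
```

With long bucket chains, skipping the search is what makes bulk insertion of known-new keys cheap, at the price of a precondition the caller has to uphold.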
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13818#discussion_r1187483386 From simonis at openjdk.org Mon May 8 14:06:28 2023 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 8 May 2023 14:06:28 GMT Subject: RFR: 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM [v5] In-Reply-To: <1-dU7miE3DgmLHf1LWgwDrqppK2l2Y5OwYkms3QMCIs=.398dcfe2-12b6-42e7-ac44-1cec2f729432@github.com> References: <1-dU7miE3DgmLHf1LWgwDrqppK2l2Y5OwYkms3QMCIs=.398dcfe2-12b6-42e7-ac44-1cec2f729432@github.com> Message-ID: On Mon, 8 May 2023 02:10:48 GMT, David Holmes wrote: >> Paul Hohensee has updated the pull request incrementally with one additional commit since the last revision: >> >> 8304074: [JMX] Add an approximation of total bytes allocated on the Java heap by the JVM > > src/hotspot/share/services/management.cpp line 2107: > >> 2105: // when result is initialized. >> 2106: jlong result = ThreadService::exited_allocated_bytes(); >> 2107: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *thread = jtiwh.next();) { > > If you call `exited_allocated_bytes` whilst you have an active `ThreadsListHandle` then you at least ensure you don't miss accounting for threads that are just about to terminate. Do you mean something like: JavaThreadIteratorWithHandle jtiwh; jlong result = ThreadService::exited_allocated_bytes(); while (JavaThread *thread = jtiwh.next()) { ... That would be fine for me. Otherwise I agree with the current compromise between accuracy and speed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13814#discussion_r1187488276 From coleenp at openjdk.org Mon May 8 14:15:18 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 May 2023 14:15:18 GMT Subject: RFR: 8306843: JVMTI tag map extremely slow after JDK-8292741 [v3] In-Reply-To: References: Message-ID: > The ResourceHashtable conversion for JDK-8292741 didn't add the resizing code. 
The old hashtable code was tuned for resizing in anticipation of large hashtables for JVMTI tags. This patch ports over the old hashtable resizing code. It also adds a ResourceHashtable::put_fast() function that prepends to the bucket list, which also reclaims the performance of the old hashtable for this test with 10M tags. The ResourceHashtable put function is really a put_if_absent. This can be cleaned up in a future change. Also, the remove function needed a lambda to destroy the WeakHandle, since resizing requires copying entries.
>
> Tested with JVMTI and JDI tests locally, and tier1-4 tests.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Rename and comment put_when_absent.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13818/files
  - new: https://git.openjdk.org/jdk/pull/13818/files/60463042..e9b5af0e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13818&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13818&range=01-02

  Stats: 8 lines in 2 files changed: 4 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/13818.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13818/head:pull/13818

PR: https://git.openjdk.org/jdk/pull/13818

From stefank at openjdk.org Mon May 8 14:29:47 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 8 May 2023 14:29:47 GMT
Subject: RFR: 8307058: Implementation of Generational ZGC [v9]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> Please review the implementation of Generational ZGC, which can be turned on by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational ZGC is a major rewrite of the non-generational ZGC version that exists in the openjdk/jdk repository. It splits the heap into two generations; the young generation where newly allocated objects are born, and the old generation where long-lived objects get promoted to.
The motivation for introducing generations is to allow ZGC to reclaim memory faster by not having to walk the entire object graph every time a garbage collection is run. This should make Generational ZGC suitable for more workloads. In particular workloads that previously hit allocation stalls because of high allocation rates, large live sets, or limited spare machine resources, have the potential to work better with Generational ZGC. For an in-depth description of Generational ZGC, see https://openjdk.org/jeps/439.
>
> The development of Generational ZGC started around the same time as the development of JDK 17. At that point we forked off the Generational ZGC development into its own branch and let non-generational live unaffected in openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational ZGC to move unhindered, without the shackles of having to fit into another GC implementation's design and quirks. Since then, almost all of the ZGC files have been changed. Moving forward to today, when it's ready for us to upstream Generational ZGC, we now need to deliver Generational ZGC without disrupting our current user-base. We have therefore opted to initially include both versions of ZGC in the code base, but with the intention to deprecate non-generational ZGC in a future release. Existing users running with only -XX:+UseZGC will get the non-generational ZGC, and users that want the new Generational ZGC need to run with -XX:+ZGenerational in addition to -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation.
>
> Including both the new evolution of a GC and its legacy predecessor poses a few challenges for us GC developers. The first reaction could be to try to mash the two implementations together and sprinkle the GC code with conditional statements or dynamic dispatches. We have done similar experiments before.
When ZGC was first born, we started an experiment where we converted G1 into getting the same features as the evolving ZGC. It was quite clear to us how time consuming and complex things end up being when we tried to keep both the original G1 working, and at the same time implemented the ZGC-alike G1. Given this experience, we don't see that as a viable solution to deliver a maintainable and evolving Generational ZGC. Our pragmatic suggestion to these challenges is to let Generational ZGC live under the current gc/z directories and let the legacy, non-generational ZGC be completely separated in its own directories. This way we can continue to move quickly with the continued development of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach.
>
> Most of the stand-alone cleanups and enhancements outside of the ZGC code have already been upstreamed to openjdk/jdk. There are still a few patches that could/should be pushed separately, but they will be easier to understand by also looking at the Generational ZGC code, so they will be sent out after this PR has been published.
The patches that could be published separately are:
>
> * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class
> * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject
> * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp
> * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics
> * a2824734d23 UPSTREAM: lir_xchg
> * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI
> * 447259cea42 UPSTREAM: assembler_ppc ANDI
> * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure
>
> Regarding all the changesets you see in this PR, they form the history of the development of Generational ZGC. It might look a bit unconventional compared to what you are used to seeing in openjdk development. What we have done is to use merges with the 'ours' strategy to ignore the previous Generational ZGC patches, and then rebased and flattened the changes on top of the merge. This effectively gives us the upsides of having a rebased repository and the upsides of retaining the history in the repository. The downside could be that GitHub now lists all those changesets in the PR. Given that this patch is so big, and that you likely only want to see a part of it, I suggest that you pull down the PR branch and then compare it to the openjdk/jdk changeset this PR is based against:
>
>
> git fetch https://github.com/openjdk/zgc zgc_master
> git diff zgc_master...
>
>
> There have been many contributors of this patch over the years. I'll do my best to poke Skara into listing you all, but if you see that I've missed your name please reach out to me and I'll fix it.
>
> Testing: we have been continuously running Generational ZGC through Oracle's tier1-8 testing.

Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 923 commits:

 - ZGC: Generational

   Co-authored-by: Stefan Karlsson
   Co-authored-by: Per Liden
   Co-authored-by: Albert Mingkun Yang
   Co-authored-by: Erik Österlund
   Co-authored-by: Axel Boldt-Christmas
   Co-authored-by: Stefan Johansson
 - UPSTREAM: RISCV tmp reg cleanup resolve_jobject
 - CLEANUP: barrierSetNMethod_aarch64.cpp
 - UPSTREAM: assembler_ppc CMPLI

   Co-authored-by: TheRealMDoerr
 - UPSTREAM: assembler_ppc ANDI

   Co-authored-by: TheRealMDoerr
 - Merge branch 'zgc_generational' into zgc_generational_rebase_target
 - ZGC: Generational

   Co-authored-by: Stefan Karlsson
   Co-authored-by: Per Liden
   Co-authored-by: Albert Mingkun Yang
   Co-authored-by: Erik Österlund
   Co-authored-by: Axel Boldt-Christmas
   Co-authored-by: Stefan Johansson
 - UPSTREAM: Introduce check_oop infrastructure to check oops in the oop class
 - UPSTREAM: RISCV tmp reg cleanup resolve_jobject
 - CLEANUP: barrierSetNMethod_aarch64.cpp
 - ... and 913 more: https://git.openjdk.org/jdk/compare/5c7ede94...34312e0c

-------------

Changes: https://git.openjdk.org/jdk/pull/13771/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13771&range=08
  Stats: 67315 lines in 682 files changed: 58157 ins; 4252 del; 4906 mod
  Patch: https://git.openjdk.org/jdk/pull/13771.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13771/head:pull/13771

PR: https://git.openjdk.org/jdk/pull/13771

From dcubed at openjdk.org Mon May 8 15:35:24 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Mon, 8 May 2023 15:35:24 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78]
In-Reply-To:
References:
Message-ID: <1YUrEokX3KxXMqDF5nM4Na5tcpxnAt_69ZHR2tQ7k38=.6f183571-1afe-451a-a6f4-38b2577daa90@github.com>

On Fri, 5 May 2023 16:49:38 GMT, Roman Kennke wrote:

>> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation.
It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>>
>> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>>
>> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>>
>> The lock-stack is fixed size, currently with 8 elements.
According to my experiments with various workloads, this covers the vast majority of workloads (in fact, most workloads seem to never exceed 5 active locks per thread at a time). We check for overflow in the fast-paths and when the lock-stack is full, we take the slow-path, which would inflate the lock to a monitor. That case should be very rare.
>>
>> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>>
>> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread.
>>
>> As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks.
IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR.
>>
>> Also, and I might be mistaken here, this new lightweight locking would make synchronized work better with Loom: Because the lock-records are no longer scattered across the stack, but instead are densely packed into the lock-stack, it should be easy for a vthread to save its lock-stack upon unmounting and restore it when re-mounting. However, I am not sure about this, and this PR does not attempt to implement that support.
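The lock-stack scheme described in the PR text above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the HotSpot implementation: `Obj`, `LockStack`, `try_fast_lock` and `try_fast_unlock` are hypothetical names, the header is reduced to a single atomic word, and the unlocked bit pattern (01) is an assumption made for this sketch. Only the fast-locked pattern (00), the fixed capacity of 8, the overflow check and the lack of recursive locking come from the description. A `false` return stands for "take the slow path", which in the real change would inflate the lock to a monitor.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uintptr_t kLockMask   = 0x3;  // lowest two header bits
constexpr std::uintptr_t kFastLocked = 0x0;  // 00 = fast-locked (from the PR text)
constexpr std::uintptr_t kUnlocked   = 0x1;  // 01 = unlocked (assumption for this sketch)

// Hypothetical stand-in for a Java object with a header word.
struct Obj {
  std::atomic<std::uintptr_t> header{kUnlocked};
};

// Per-thread, fixed-size stack of the objects this thread has fast-locked.
class LockStack {
 public:
  static constexpr std::size_t kCapacity = 8;  // "currently with 8 elements"

  bool is_full() const { return _top == kCapacity; }

  // "Does the current thread own me?" is a short linear scan.
  bool contains(const Obj* o) const {
    for (std::size_t i = 0; i < _top; ++i) {
      if (_slots[i] == o) return true;
    }
    return false;
  }

  void push(const Obj* o) {
    assert(!is_full());
    _slots[_top++] = o;
  }

  // Removes one entry and closes the gap, keeping the rest in order.
  void remove(const Obj* o) {
    for (std::size_t i = 0; i < _top; ++i) {
      if (_slots[i] == o) {
        for (std::size_t j = i + 1; j < _top; ++j) _slots[j - 1] = _slots[j];
        --_top;
        return;
      }
    }
  }

 private:
  std::array<const Obj*, kCapacity> _slots{};
  std::size_t _top = 0;
};

// Returns false when the caller would have to take the slow path:
// lock-stack overflow, recursive locking, or contention.
bool try_fast_lock(Obj& obj, LockStack& ls) {
  if (ls.is_full()) return false;
  std::uintptr_t old_hdr = obj.header.load(std::memory_order_relaxed);
  if ((old_hdr & kLockMask) != kUnlocked) return false;
  std::uintptr_t new_hdr = (old_hdr & ~kLockMask) | kFastLocked;
  if (!obj.header.compare_exchange_strong(old_hdr, new_hdr,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed)) {
    return false;
  }
  ls.push(&obj);  // upper header bits stay untouched; ownership lives here
  return true;
}

// Returns false if this thread does not hold a fast-lock on obj.
bool try_fast_unlock(Obj& obj, LockStack& ls) {
  if (!ls.contains(&obj)) return false;
  std::uintptr_t old_hdr = obj.header.load(std::memory_order_relaxed);
  if ((old_hdr & kLockMask) != kFastLocked) return false;
  std::uintptr_t new_hdr = (old_hdr & ~kLockMask) | kUnlocked;
  if (!obj.header.compare_exchange_strong(old_hdr, new_hdr,
                                          std::memory_order_release,
                                          std::memory_order_relaxed)) {
    return false;
  }
  ls.remove(&obj);
  return true;
}
```

Because only the two low bits change, the rest of the header word stays readable while the object is fast-locked, which is the property the description relies on for Lilliput. The sketch leaves out inflation and the ANONYMOUS_OWNER handover entirely.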
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/ops.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>> | Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff |
>> | -- | -- | -- | -- | -- | -- | -- |
>> | AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69% |
>> | Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21% |
>> | Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59% |
>> | ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15% |
>> | GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11% |
>> | LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97% |
>> | MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01% |
>> | NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54% |
>> | PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11% |
>> | FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40% |
>> | FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12% |
>> | ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34% |
>> | Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40% |
>> | RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74% |
>> | Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24% |
>> | ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58% |
>> | ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53% |
>> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37% |
>> | Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97% |
>> | FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39% |
>> | FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04% |
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Only allow lock-stack verification for owning Java threads or at safepoints

Mach5 Tier[1-8] of v77 with forced-fast-locking results look good.
Mach5 Tier[1-8] of v77 with default-stack-locking results also look good.

I do still have to check in with Eric Caspole about the performance testing of the baseline versus the default-stack-locking configuration. We did that testing with a baseline of jdk-21+21-1704 and the v66 version of the patch in default-stack-locking configuration. Eric also did testing of the v66 version of the patch with forced-fast-locking, but those results are not a gate for determining whether this patch gets integrated.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10907#issuecomment-1538581877

From dcubed at openjdk.org Mon May 8 16:03:17 2023
From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Mon, 8 May 2023 16:03:17 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v78] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 16:49:38 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be ve