<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>
<blockquote type="cite">One advantage of the current design is
that it makes the intent of the developer clear</blockquote>
</p>
<p>I am also not in favor of the initial proposal, but I share the
general concern. I see the pain point. Regarding the initial
proposal:<br>
</p>
<pre><blockquote type="cite">for(int index, String s : strings) {</blockquote></pre>
<p>I think this solution would be too inflexible. Extending the
syntax of the language for this one very specific scenario
does not seem justified to me. I think the focus should rather be
on re-introducing and simplifying pattern matching in enhanced <font face="monospace">for</font>-loops. Let's consider my previous
example of what was possible in Java 20 (as a preview feature):</p>
<blockquote>
<pre>for (ListIndex(int index, String element) : enumerate(strings)) {
</pre>
</blockquote>
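<p>For readers who have not seen the earlier example, a helper like
<font face="monospace">enumerate(...)</font> could be sketched roughly
as follows. This is only an assumption about its shape, not
necessarily the implementation used in the benchmark below; the class
name <font face="monospace">EnumerateSketch</font> and the method
<font face="monospace">describe</font> are made up for illustration.
The loop uses plain accessor calls instead of a record pattern, so it
should compile on any release with records.</p>

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; names other than ListIndex and enumerate are invented.
class EnumerateSketch {

    /** Pairs an element with its position in the iteration. */
    record ListIndex<T>(int index, T element) {}

    /** Lazily wraps an Iterable so that every element comes with its index. */
    static <T> Iterable<ListIndex<T>> enumerate(Iterable<T> source) {
        return () -> new Iterator<ListIndex<T>>() {
            private final Iterator<T> delegate = source.iterator();
            private int nextIndex = 0;

            @Override
            public boolean hasNext() {
                return delegate.hasNext();
            }

            @Override
            public ListIndex<T> next() {
                T element = delegate.next();
                return new ListIndex<>(nextIndex++, element);
            }
        };
    }

    /** Example usage: formats every element together with its index. */
    static List<String> describe(List<String> strings) {
        List<String> lines = new ArrayList<>();
        for (ListIndex<String> entry : enumerate(strings)) {
            lines.add(entry.index() + ": " + entry.element());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(List.of("a", "b", "c")).forEach(System.out::println);
    }
}
```

<p>Note that this version allocates one small record per element,
which is the overhead the benchmarks further down try to measure.</p>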
<p>The compiler could be updated to infer the pattern and support
the following expression:</p>
<blockquote>
<pre>for ((int index, String element) : enumerate(strings)) {
</pre>
</blockquote>
<p>Or when using <font face="monospace">var</font>:<br>
</p>
<blockquote>
<pre>for ((var index, var element) : enumerate(strings)) {
</pre>
</blockquote>
<p>By keeping the method call on the right, we greatly improve the
flexibility. For example, consider the following functions
shipped with Python. They could all benefit from this syntax if
they were ported to Java.</p>
<ul>
<li><a href="https://docs.python.org/3/library/functions.html#enumerate"><font face="monospace">enumerate(iterable, start=0)</font></a></li>
<li><a href="https://docs.python.org/3/library/functions.html#zip"><font face="monospace">zip(*iterables, strict=False)</font></a></li>
<li><a href="https://docs.python.org/3/library/itertools.html#itertools.pairwise"><font face="monospace">pairwise(iterable)</font></a></li>
<li><a href="https://docs.python.org/3/library/itertools.html#itertools.groupby"><font face="monospace">groupby(iterable, key=None)</font></a></li>
<li><a href="https://docs.python.org/3/library/itertools.html#itertools.product"><font face="monospace">product(*iterables, repeat=1)</font></a><br>
</li>
<li><a href="https://docs.python.org/3/library/itertools.html#itertools.combinations"><font face="monospace">combinations(iterable, r)</font></a></li>
<li><a href="https://docs.python.org/3/library/itertools.html#itertools.permutations"><font face="monospace">permutations(iterable, r=None)</font></a><br>
</li>
</ul>
<p>Besides, I think inferring the pattern is not only useful in
loops, but also in <font face="monospace">switch</font>
expressions:</p>
<blockquote>
<pre>return switch (pair(state, isWaiting)) {
    case (INITIALIZATION, false) -> "Initializing task";
    case (INITIALIZATION, true ) -> "Waiting for an external process before continuing with the initialization";
    case (IN_PROGRESS   , false) -> "Task in progress";
    case (IN_PROGRESS   , true ) -> "Waiting for an external process";
    case (FINISHED      , _    ) -> "Task finished";
    case (CANCELED      , _    ) -> "Task canceled";
};
</pre>
</blockquote>
I have to admit that adding <font face="monospace">`Pair`</font>
after <font face="monospace">`case`</font> might not be that big of
a deal in this case, but note that in some cases, the name of the
type might be much longer, significantly increasing the noise.<br>
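<p>For comparison, here is roughly what the explicit form looks like
when it has to compile on Java 21 as released. As far as I know,
constants cannot be nested inside a record pattern there, and
<font face="monospace">_</font> is still a preview feature in that
release, so this sketch falls back to guards. The
<font face="monospace">Pair</font> record, the
<font face="monospace">State</font> enum and the class name are
hypothetical stand-ins for the types in the example above.</p>

```java
// Hypothetical sketch targeting Java 21 or later (record patterns in switch).
class PairSwitchSketch {

    enum State { INITIALIZATION, IN_PROGRESS, FINISHED, CANCELED }

    record Pair<A, B>(A first, B second) {}

    static String describe(State state, boolean isWaiting) {
        return switch (new Pair<>(state, isWaiting)) {
            case Pair(State s, Boolean waiting) when s == State.INITIALIZATION && !waiting ->
                    "Initializing task";
            case Pair(State s, Boolean waiting) when s == State.INITIALIZATION ->
                    "Waiting for an external process before continuing with the initialization";
            case Pair(State s, Boolean waiting) when s == State.IN_PROGRESS && !waiting ->
                    "Task in progress";
            case Pair(State s, Boolean waiting) when s == State.IN_PROGRESS ->
                    "Waiting for an external process";
            case Pair(State s, Boolean waiting) when s == State.FINISHED ->
                    "Task finished";
            // Unguarded last case keeps the switch exhaustive; it also
            // catches a null state, which a real implementation might reject.
            case Pair(State s, Boolean waiting) ->
                    "Task canceled";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(State.IN_PROGRESS, true));
    }
}
```

<p>The repeated type name and the guards add noticeably more noise
than the inferred form sketched above.</p>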
<p>
<blockquote type="cite">I think i would prefer to have to have an
indexed stream more than indexed loop</blockquote>
</p>
<p>Note that checked exceptions and streams do not work well
together. At least not in the current state of Java. For the time
being, I would therefore favor the enhanced <font face="monospace">for</font> loop. (It might be possible to fix
the interoperability of checked exceptions and streams with union
types or varargs in type parameters, but neither is planned as far
as I know.)</p>
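<p>To illustrate the friction, here is a minimal sketch with made-up
method names. In the enhanced <font face="monospace">for</font> loop
the checked exception simply propagates to the caller, while the
lambda passed to <font face="monospace">forEach</font> has to catch
and wrap it, because <font face="monospace">Consumer.accept</font>
does not declare any checked exceptions.</p>

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical sketch; class and method names are invented for illustration.
class CheckedExceptionSketch {

    // Enhanced for loop: the IOException propagates through the method signature.
    static void writeWithLoop(List<String> lines, Writer out) throws IOException {
        for (String line : lines) {
            out.write(line);
            out.write('\n');
        }
    }

    // Stream: the lambda must handle the IOException itself, here by
    // wrapping it in an unchecked exception.
    static void writeWithStream(List<String> lines, Writer out) {
        lines.stream().forEach(line -> {
            try {
                out.write(line);
                out.write('\n');
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeWithLoop(List.of("first", "second"), out);
        writeWithStream(List.of("third"), out);
        System.out.print(out);
    }
}
```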
<p>
<blockquote type="cite">the good news is that it seems something
we can do using the gatherer API [2] and Valhalla (to avoid the
cost of creating a a pair (index, element) for each element).</blockquote>
</p>
<p>I was wondering if the JIT would already optimize the overhead
away. I ran some benchmarks using JMH on the <font face="monospace">enumerate(...)</font> method I introduced
earlier. Since you are the second person to mention Valhalla out of
performance concerns, I thought I would share my results.<br>
</p>
<blockquote>
<pre>fori         (OpenJDK 17) -> enhanced_for (OpenJDK 17) ≈ + 7 %
fori         (OpenJDK 21) -> enhanced_for (OpenJDK 21) ≈ -34 %
enhanced_for (OpenJDK 17) -> enhanced_for (OpenJDK 21) ≈ -29 %
fori         (OpenJDK 21) -> enhanced_for (OpenJDK 17) ≈ - 8 %
</pre>
</blockquote>
<p>With OpenJDK 17, my high-level <font face="monospace">enumerate(...)</font>
method was actually about 7 % faster than a low-level old-style <font face="monospace">for</font>-loop. However, in later versions of
OpenJDK, the high-level code got much slower.<br>
</p>
<p>You can find the <a href="https://github.com/JojOatXGME/benchmarks-java/blob/d12c441e16e04a6e7971365ee9672056687ad89b/src/jmh/java/benchmark/EnhancedForHelper.java" moz-do-not-send="true">benchmark implementation on GitHub</a>.
The benchmark ran in WSL2 with Ubuntu 20.04 on an
i7-3770 from 2012.</p>
<blockquote>
<pre># VM version: JDK 17.0.7, OpenJDK 64-Bit Server VM, 17.0.7+7-nixos
Benchmark                        Mode  Cnt       Score      Error  Units
EnhancedForHelper.enhanced_for  thrpt   10  588852.311 ± 3783.862  ops/s
EnhancedForHelper.fori          thrpt   10  551406.193 ± 1172.687  ops/s

# VM version: JDK 21, OpenJDK 64-Bit Server VM, 21+35-nixos
Benchmark                        Mode  Cnt       Score      Error  Units
EnhancedForHelper.enhanced_for  thrpt   10  419723.971 ± 8903.577  ops/s
EnhancedForHelper.fori          thrpt   10  640767.173 ± 2829.187  ops/s

# VM version: JDK 20, OpenJDK 64-Bit Server VM, 20+36-nixos
Benchmark                                              Mode  Cnt       Score       Error  Units
EnhancedForHelper.enhanced_for                        thrpt   10  430022.265 ±  3050.285  ops/s
EnhancedForHelper.enhanced_for_with_pattern_matching  thrpt   10  325179.547 ±  5206.194  ops/s
EnhancedForHelper.fori                                thrpt   10  631755.837 ± 20495.694  ops/s
</pre>
</blockquote>
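<p>The percentages in the summary further up are plain relative
changes between two throughput scores, computed as
<font face="monospace">(to - from) / from</font>. A small sketch with
the OpenJDK 17 and 21 scores from the output above (the class and
method names are made up):</p>

```java
// Hypothetical helper that reproduces the summary percentages from the JMH scores.
class RelativeChangeSketch {

    /** Relative change in percent when going from one score to another. */
    static double relativeChangePercent(double from, double to) {
        return (to - from) / from * 100.0;
    }

    public static void main(String[] args) {
        // Throughput scores in ops/s, copied from the JMH output.
        double fori17 = 551406.193;
        double enhanced17 = 588852.311;
        double fori21 = 640767.173;
        double enhanced21 = 419723.971;

        System.out.printf("fori 17 -> enhanced_for 17: %+.1f %%%n",
                relativeChangePercent(fori17, enhanced17));
        System.out.printf("fori 21 -> enhanced_for 21: %+.1f %%%n",
                relativeChangePercent(fori21, enhanced21));
        System.out.printf("enhanced_for 17 -> enhanced_for 21: %+.1f %%%n",
                relativeChangePercent(enhanced17, enhanced21));
        System.out.printf("fori 21 -> enhanced_for 17: %+.1f %%%n",
                relativeChangePercent(fori21, enhanced17));
    }
}
```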
I also ran the benchmark with Azul Zing for Java 21, which uses LLVM
for the JIT optimizations. It was about 51 % faster than the fastest
run I have seen with OpenJDK. However, the warmup time was
noticeably longer. There was no big difference between the two loops.
<blockquote>
<pre># VM version: JDK 21.0.1, Zing 64-Bit Tiered VM, 21.0.1-zing_23.10.0.0-b3-product-linux-X86_64
# *** WARNING: JMH support for this VM is experimental. Be extra careful with the produced data.
Benchmark                        Mode  Cnt       Score       Error  Units
EnhancedForHelper.enhanced_for  thrpt   10  978782.093 ±  4838.520  ops/s
EnhancedForHelper.fori          thrpt   10  965482.460 ± 17837.251  ops/s
</pre>
</blockquote>
<p>I have also seen some results with GraalVM for Java 21, but I
don't have the exact numbers on hand. In general, Native Image was
very slow on Windows, but competitive with OpenJDK on Linux. The
GraalVM JDK (no Native Image) was about 40 % faster than OpenJDK 21
and there was no measurable difference between <font face="monospace">fori</font> and <font face="monospace">enhanced_for</font>
on Linux.<br>
</p>
<p>Disclaimer: This is just a micro-benchmark. We don't know how all
of this translates to real-world applications. I still find it
interesting how different the optimizations are. I am also a bit
concerned that OpenJDK 21 got noticeably slower with the
high-level code compared to OpenJDK 17. I am eager to find out if
we see a noticeable difference in our end-to-end benchmarks when
we move forward to OpenJDK 21 at my workplace.<br>
</p>
</body>
</html>