Raytracing Experience Report
Cay Horstmann
cay.horstmann at gmail.com
Sat Nov 1 16:58:12 UTC 2025
You are on the right track replacing the array list with an array.
But if you want the VM to flatten the array, you need to use the non-public API for now. You also need to tell the VM that the values are never null and, if they are larger than 64 bits, that you don't care about tearing.
https://horstmann.com/presentations/2025/jfn-valhalla/#(15)
https://horstmann.com/presentations/2025/jfn-valhalla/#(17)
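For readers following the thread, here is a minimal sketch of the "array instead of array list, nullable instead of Optional" change discussed below. It is plain Java and deliberately does not use the non-public flattening API, so on its own it does not make the VM flatten anything. The class name HotPathSketch and the simplified Vec3, Ray, HitRecord and Sphere records are hypothetical stand-ins, not the code from the repository; in a Valhalla build they would be declared as value records.

```java
import java.util.List;
import java.util.Optional;

public class HotPathSketch {
    // Hypothetical, simplified stand-ins for the ray tracer's types.
    public record Vec3(double x, double y, double z) {
        public Vec3 minus(Vec3 o) { return new Vec3(x - o.x, y - o.y, z - o.z); }
        public double dot(Vec3 o) { return x * o.x + y * o.y + z * o.z; }
    }

    public record Ray(Vec3 origin, Vec3 direction) {}

    public record HitRecord(double t) {}

    public record Sphere(Vec3 center, double radius) {
        // Returns the nearest intersection with t in (tMin, tMax), or null on a miss.
        public HitRecord hit(Ray r, double tMin, double tMax) {
            Vec3 oc = center.minus(r.origin());
            double a = r.direction().dot(r.direction());
            double h = r.direction().dot(oc);
            double c = oc.dot(oc) - radius * radius;
            double disc = h * h - a * c;
            if (disc < 0) return null;
            double sq = Math.sqrt(disc);
            double root = (h - sq) / a;
            if (root <= tMin || root >= tMax) {
                root = (h + sq) / a;
                if (root <= tMin || root >= tMax) return null;
            }
            return new HitRecord(root);
        }
    }

    // Generic version: List<Sphere> and Optional<HitRecord> are erased,
    // so elements and results are handled as references.
    public static Optional<HitRecord> hitGeneric(List<Sphere> world, Ray r,
                                                 double tMin, double tMax) {
        Optional<HitRecord> closest = Optional.empty();
        double closestSoFar = tMax;
        for (Sphere s : world) {
            HitRecord rec = s.hit(r, tMin, closestSoFar);
            if (rec != null) {
                closest = Optional.of(rec);
                closestSoFar = rec.t();
            }
        }
        return closest;
    }

    // Same loop with a plain Sphere[] and a nullable HitRecord:
    // no generic container on the hot path.
    public static HitRecord hitArray(Sphere[] world, Ray r, double tMin, double tMax) {
        HitRecord closest = null;
        double closestSoFar = tMax;
        for (Sphere s : world) {
            HitRecord rec = s.hit(r, tMin, closestSoFar);
            if (rec != null) {
                closest = rec;
                closestSoFar = rec.t();
            }
        }
        return closest;
    }

    public static void main(String[] args) {
        Sphere[] world = {
            new Sphere(new Vec3(0, 0, -3), 0.5),
            new Sphere(new Vec3(0, 0, -1), 0.5)
        };
        Ray ray = new Ray(new Vec3(0, 0, 0), new Vec3(0, 0, -1));
        System.out.println(hitArray(world, ray, 0.001, Double.POSITIVE_INFINITY));
        System.out.println(hitGeneric(List.of(world), ray, 0.001, Double.POSITIVE_INFINITY));
    }
}
```

Both methods compute the same result; the second only removes the generic containers from the loop. Whether that pays off depends on the JIT and on whether the array is actually flattened, which is the part that still needs the non-public API mentioned above.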
Cheers,
Cay
On 30/10/2025 19:08, Ethan McCue wrote:
> I did try that, replacing the ArrayList<Material> with a Sphere[], and there was a modest speedup. But the C++ code itself uses an abstract hittable class and has a std::vector<shared_ptr<hittable>>. So if I made that change in the Java version to get better performance, I would feel the need to do the same in the C++, or else it would not be a fair comparison.
>
> The only other thing I could think of, replacing Optional<HitRecord> with a nullable HitRecord, didn't move the needle. VisualVM doesn't support the EA build, so I don't have experience with how to dig deeper. It is possible that System.out.println is the bottleneck now, but I somewhat doubt it.
>
>
>
> On Thu, Oct 30, 2025 at 1:56 PM Piotr Tarsa <piotr.tarsa at hotmail.com> wrote:
>
> Hi Ethan,
>
> IIRC Valhalla still has neither reified nor specialized generics, so anything generic, like List<Whatever> or Optional<Whatever>, is erased to its non-generic form. The layout of generic classes is not specialized to the type parameter; instead, the instances of 'Whatever' are unconditionally boxed. I think the first step toward getting performance closer to C++ with the current state of Valhalla would be to avoid all generics in hot execution paths and then redo the experiments.
>
> Regards,
> Piotr
> ________________________________________
> *From:* valhalla-dev <valhalla-dev-retn at openjdk.org> on behalf of Ethan McCue <ethan at mccue.dev>
> *Sent:* Thursday, October 30, 2025 18:21
> *To:* Sergey Kuksenko <sergey.kuksenko at oracle.com>
> *Cc:* valhalla-dev at openjdk.org <valhalla-dev at openjdk.org>
> *Subject:* Re: Raytracing Experience Report
> Continuing from this, I ran it against the reference C++ implementation and got these numbers.
>
> # Reference C++ implementation (-O3)
>
> ```
> real 6m35.702s
> user 6m33.780s
> sys 0m1.454s
> ```
>
> # Java With Value Classes
>
> ```
> real 11m50.122s
> user 11m36.536s
> sys 0m13.281s
> ```
>
> # Java Without Value Classes
>
> ```
> real 17m1.038s
> user 16m40.993s
> sys 0m29.400s
> ```
>
> I am wondering if using an AOT cache could help catch up to the C++, but I get a class file version error when running with -XX:AOTCache=value.aot:
>
> Error: LinkageError occurred while loading main class Main
> java.lang.UnsupportedClassVersionError: Main has been compiled by a more recent version of the Java Runtime (class file version 70.0), this version of the Java Runtime only recognizes class file versions up to 69.0
>
> On Wed, Oct 29, 2025 at 3:44 PM Sergey Kuksenko <sergey.kuksenko at oracle.com> wrote:
>
> Hi Ethan,
>
> Thank you for the information. Your example and the code are pretty straightforward, and I was able to repeat and diagnose the issue.
>
> The fact is, the performance issue is not directly related to value classes. The problem is that the HittableList::hit method (invoked from Camera::rayColor) was inlined by the JIT in the non-value version and wasn't inlined in the value classes version.
> When you inline that invocation manually, you should get the same performance for both versions.
> HittableList::hit was not inlined in the value classes version because value classes resulted in a different code size, which changed the outcome of the inlining heuristics. It's a mainline issue; you'll encounter it quite rarely. The current inlining heuristics work well in 99% of cases, and you would have to be very lucky (or unlucky) to hit it in real life.
>
> Best regards,
> Sergey Kuksenko
>
>
>
> ________________________________________
> From: valhalla-dev <valhalla-dev-retn at openjdk.org> on behalf of Ethan McCue <ethan at mccue.dev>
> Sent: Monday, October 27, 2025 5:08 PM
> To: valhalla-dev at openjdk.org
> Subject: Raytracing Experience Report
>
> Hi all,
>
> I have been following along in the "Ray Tracing in One Weekend" book and trying to make as many classes as possible value classes. (Vec3, Ray, etc.)
>
> https://github.com/bowbahdoe/raytracer
>
> https://raytracing.github.io/books/RayTracingInOneWeekend.html
>
> (without value classes)
>
> time java --enable-preview --class-path build/classes Main > image.ppm
>
> real 4m33.190s
> user 4m28.984s
> sys 0m5.511s
>
> (with value classes)
>
> time java --enable-preview --class-path build/classes Main > image.ppm
>
> real 3m54.623s
> user 3m52.205s
> sys 0m2.064s
>
> So by the end the version using value classes beats the version without them by ~14% using unscientific measurements.
>
> But that is at the end, running the ray tracer on a relatively large scene with all the features turned on. Before that point there were some checkpoints where using value classes performed noticeably worse than the equivalent code sans the value modifier.
>
> https://github.com/bowbahdoe/raytracer/tree/no-value-faster
>
> real 1m22.172s
> user 1m9.871s
> sys 0m12.951s
>
> https://github.com/bowbahdoe/raytracer/tree/with-value-slower
>
> real 3m34.440s
> user 3m19.656s
> sys 0m14.870s
>
> So for some reason, just adding value to the records/classes makes the program run more than 2x slower.
>
> https://github.com/bowbahdoe/raytracer/compare/no-value-faster...with-value-slower
>
> Is there some intuition that explains this? I am on a stock M1 Arm Mac.
>
--
Cay S. Horstmann | https://horstmann.com