Raytracing Experience Report

Remi Forax forax at univ-mlv.fr
Sat Nov 1 18:32:28 UTC 2025


----- Original Message -----
> From: "cay horstmann" <cay.horstmann at gmail.com>
> To: "valhalla-dev" <valhalla-dev at openjdk.org>
> Sent: Saturday, November 1, 2025 5:58:12 PM
> Subject: Re: Raytracing Experience Report

> You are on the right track replacing the array list with an array.
> 
> But if you want the VM to flatten the array, you need to use the non-public API
> for now. And you need to tell the VM that the values are never null, and if
> they are > 64 bits, that you don't care about tearing.
> 
> https://horstmann.com/presentations/2025/jfn-valhalla/#(15)
> https://horstmann.com/presentations/2025/jfn-valhalla/#(17)
> 
> Cheers,
> 
> Cay

Or you can use a List implementation that uses specialization:

https://github.com/forax/weather-alert/blob/master/src/main/java/util/FlatListFactory.java#L433
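
To illustrate the general idea (this is not the code behind the link above), here is a minimal sketch of a List whose backing array is created with the element's exact component type instead of Object[]. The names TypedArrayList, Sphere and componentType are hypothetical and exist only for this example. Whether the VM actually flattens such an array depends on the Valhalla build and, as Cay notes, currently needs the non-public API for null-restricted arrays. The sketch only shows the typed backing array and runs on a regular JDK with a plain record.

```java
import java.lang.reflect.Array;
import java.util.AbstractList;
import java.util.Arrays;
import java.util.Objects;
import java.util.RandomAccess;

// Hypothetical element type; in the ray tracer this could be a value class.
record Sphere(double x, double y, double z, double radius) {}

// A List whose backing array has the element's exact component type
// (for example Sphere[] rather than Object[]), chosen when the list is created.
final class TypedArrayList<E> extends AbstractList<E> implements RandomAccess {
    private E[] elements;
    private int size;

    @SuppressWarnings("unchecked")
    TypedArrayList(Class<E> elementType, int initialCapacity) {
        // Array.newInstance creates an array with the given runtime component type.
        this.elements = (E[]) Array.newInstance(elementType, Math.max(1, initialCapacity));
    }

    @Override
    public boolean add(E element) {
        // Assumption for this sketch: the list never stores null.
        Objects.requireNonNull(element, "null elements are not supported");
        if (size == elements.length) {
            // Arrays.copyOf keeps the runtime component type of the source array.
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = element;
        modCount++;
        return true;
    }

    @Override
    public E get(int index) {
        Objects.checkIndex(index, size);
        return elements[index];
    }

    @Override
    public int size() {
        return size;
    }

    // Exposed only so the demo can show the backing array's component type.
    Class<?> componentType() {
        return elements.getClass().getComponentType();
    }
}

final class TypedArrayListDemo {
    public static void main(String[] args) {
        var world = new TypedArrayList<>(Sphere.class, 2);
        world.add(new Sphere(0, 0, -1, 0.5));
        world.add(new Sphere(0, -100.5, -1, 100));
        world.add(new Sphere(1, 0, -1, 0.5));
        System.out.println(world.size() + " spheres, backed by an array of " + world.componentType());
    }
}
```

Callers still see a java.util.List, so code such as HittableList would not need to change its signature. Only the storage behind the list differs.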

Rémi

> 
> On 30/10/2025 19:08, Ethan McCue wrote:
>> I did try that - by replacing the ArrayList<Material> with a Sphere[] - there
>> was a modest speedup. But the C++ code itself uses an abstract hittable class
>> and has a std::vector<shared_ptr<hittable>>. So if I were to make that change
>> in the Java version to get better performance I would feel the need to do the
>> same in the C++ or else it would not be a fair comparison.
>> 
>> The only other thing I could think of - replacing Optional<HitRecord> with a
>> nullable HitRecord - didn't move the needle. VisualVM doesn't support the EA so
>> I'm not experienced in how I would need to dig down. It is possible
>> System.out.println might be the bottleneck now, but I somewhat doubt it.
>> 
>> 
>> 
>> On Thu, Oct 30, 2025 at 1:56 PM Piotr Tarsa <piotr.tarsa at hotmail.com
>> <mailto:piotr.tarsa at hotmail.com>> wrote:
>> 
>>     Hi Ethan,
>> 
>>     IIRC Valhalla still doesn't have reified or specialized generics in any way, so
>>     anything generic, like List<Whatever> or Optional<Whatever>, is erased to
>>     non-generic form. The layout of generic classes is not specialized to the
>>     generic parameter, but instead the instances of 'Whatever' are unconditionally
>>     boxed. I think that the first step to get performance closer to C++ with
>>     current Valhalla state would be to avoid all generics in hot execution paths
>>     and then redo the experiments.
>> 
>>     Regards,
>>     Piotr
>>     ________________________________________
>>     *From:* valhalla-dev <valhalla-dev-retn at openjdk.org
>>     <mailto:valhalla-dev-retn at openjdk.org>> on behalf of Ethan McCue
>>     <ethan at mccue.dev <mailto:ethan at mccue.dev>>
>>     *Sent:* Thursday, 30 October 2025 18:21
>>     *To:* Sergey Kuksenko <sergey.kuksenko at oracle.com
>>     <mailto:sergey.kuksenko at oracle.com>>
>>     *Cc:* valhalla-dev at openjdk.org <mailto:valhalla-dev at openjdk.org>
>>     <valhalla-dev at openjdk.org <mailto:valhalla-dev at openjdk.org>>
>>     *Subject:* Re: Raytracing Experience Report
>>     Continuing from this, I ran it against the reference C++ implementation and got
>>     these numbers.
>> 
>>     # Reference C++ implementation (-O3)
>> 
>>     ```
>>     real 6m35.702s
>>     user 6m33.780s
>>     sys 0m1.454s
>>     ```
>> 
>>     # Java With Value Classes
>> 
>>     ```
>>     real 11m50.122s
>>     user 11m36.536s
>>     sys 0m13.281s
>>     ```
>> 
>>     # Java Without Value Classes
>> 
>>     ```
>>     real 17m1.038s
>>     user 16m40.993s
>>     sys 0m29.400s
>>     ```
>> 
>>     I am wondering if using an AOT cache could help catch up to the C++, but I get a
>>     class file version error running with -XX:AOTCache=value.aot
>> 
>>     Error: LinkageError occurred while loading main class Main
>>              java.lang.UnsupportedClassVersionError: Main has been compiled by a more recent
>>              version of the Java Runtime (class file version 70.0), this version of the Java
>>              Runtime only recognizes class file versions up to 69.0
>> 
>>     On Wed, Oct 29, 2025 at 3:44 PM Sergey Kuksenko <sergey.kuksenko at oracle.com
>>     <mailto:sergey.kuksenko at oracle.com>> wrote:
>> 
>>         Hi Ethan,
>> 
>>         Thank you for the information. Your example and the code are pretty
>>         straightforward, and I was able to repeat and diagnose the issue.
>> 
>>         The fact is, the performance issue is not directly related to value classes. The
>>         problem is that HittableList::hit method (invoked at Camera::rayColor) was
>>         inlined by JIT in the non-value version and wasn't inlined in the value classes
>>         version.
>>         When you inline that invocation manually, you should get the same performance
>>         for both versions.
>>         HittableList::hit was not inlined in the value classes version because value
>>         classes resulted in a different code size and changed the inline heuristics.
>>         It's a mainline issue; you'll encounter it quite rarely. Current inline
>>         heuristics work well in 99% of cases, and you should be very lucky (or unlucky)
>>         to get it in real life.
>> 
>>         Best regards,
>>         Sergey Kuksenko
>> 
>> 
>> 
>>         ________________________________________
>>         From: valhalla-dev <valhalla-dev-retn at openjdk.org
>>         <mailto:valhalla-dev-retn at openjdk.org>> on behalf of Ethan McCue
>>         <ethan at mccue.dev <mailto:ethan at mccue.dev>>
>>         Sent: Monday, October 27, 2025 5:08 PM
>>         To: valhalla-dev at openjdk.org <mailto:valhalla-dev at openjdk.org>
>>         Subject: Raytracing Experience Report
>> 
>>         Hi all,
>> 
>>         I have been following along in the "Ray Tracing in a Weekend" book and trying to
>>         make as many classes as possible value classes. (Vec3, Ray, etc.)
>> 
>>         https://github.com/bowbahdoe/raytracer
>> 
>>         https://raytracing.github.io/books/RayTracingInOneWeekend.html
>> 
>>         (without value classes)
>> 
>>         time java --enable-preview --class-path build/classes Main > image.ppm
>> 
>>         real 4m33.190s
>>         user 4m28.984s
>>         sys 0m5.511s
>> 
>>         (with value classes)
>> 
>>         time java --enable-preview --class-path build/classes Main > image.ppm
>> 
>>         real 3m54.623s
>>         user 3m52.205s
>>         sys 0m2.064s
>> 
>>         So by the end the version using value classes beats the version without them by
>>         ~14% using unscientific measurements.
>> 
>>         But that is at the end, running the ray tracer on a relatively large scene with
>>         all the features turned on. Before that point there were some checkpoints where
>>         using value classes performed noticeably worse than the equivalent code sans
>>         the value modifier
>> 
>>         https://github.com/bowbahdoe/raytracer/tree/no-value-faster
>> 
>>         real 1m22.172s
>>         user 1m9.871s
>>         sys 0m12.951s
>> 
>>         https://github.com/bowbahdoe/raytracer/tree/with-value-slower
>> 
>>         real 3m34.440s
>>         user 3m19.656s
>>         sys 0m14.870s
>> 
>>         So for some reason just adding value to the records/classes makes the program
>>         run over 2x as slow.
>> 
>>         https://github.com/bowbahdoe/raytracer/compare/no-value-faster...with-value-slower
>> 
>>         Is there some intuition that explains this? I am on a stock M1 Arm Mac.
>> 
> 
> --
> 
> Cay S. Horstmann | https://horstmann.com

