RFR: 8351491: Add info from release file to hserr file [v3]
Thomas Stuefe
stuefe at openjdk.org
Fri Apr 4 06:03:53 UTC 2025
On Thu, 3 Apr 2025 16:12:38 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
>>
>> use one time PeriodicTask
>
> Coming to this discussion late.
>
> IMHO this is overengineered for just a printout to the hs-err file during error dumping. We already read from proc fs. Proc can be worse (depending on what you read, a lot) than reading sequentially from a flat file.
>
> Remember that we already run the JVM binaries from the same file system. We read debug information from those binaries during error dumping, and that causes a ton of IO; a sequential read of a tiny file is a drop in the bucket.
>
> Also remember that we have safety fuses: Step timeouts and Step signal handling - so if this read ever turns out to be a problem, e.g by hanging, the Step would be cancelled and error reporting would continue with the next step.
>
> I would, however, attempt to avoid malloc. Not super important, but if its easy to do I would do it. Best by using a small fixed-sized stack-allocated buffer, and just printing the file line by line.
>
> Just my 5 cent.
> @tstuefe perhap this is over engineered but there are a number of issues to consider:
>
> * reading from the physical filesystem is not the same as reading from /proc and IMO is far more likely to be problematic if done in a signal handling context during error reporting - hence we need to read the file ahead-of-time
But we already do exactly that. We read the Elf- and Dwarf-files to print out the symbol and stack information. From the same filesystem. And these are way larger than the release file, and we don't read sequentially, but seek around. This generates a ton of IO. Reading a small file sequentially is fine in comparison.
I also don't follow the arguments: the argument for this is that the file system could be slow or IO could be broken in some form.
If that is true, it is likely true for the the whole file system. In that case, VM startup would take either a very long time or the VM would not even come up, and I argue that is a configuration error.
If only reading the release file is slow while reading binaries is fine (and why would that ever happen?), then doing this for every startup of the JVM is a bad idea since it blocks the task queue, concurrent execution or not. I think if we expect problems reading that file, we should rather read it when we need it, not unconditionally at every VM startup. Plus, in error handling we have safeguards against hanging reads - we cancel hanging error reporting steps. We don't have those safeguards in the normal JVM.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24244#issuecomment-2777632094
More information about the hotspot-dev
mailing list