RFR: 8330611: AES-CTR vector intrinsic may read out of bounds (x86_64, AVX-512)
Martin Balao
mbalao at openjdk.org
Fri Apr 19 17:44:56 UTC 2024
On Fri, 19 Apr 2024 11:45:31 GMT, Charles Connell <duke at openjdk.org> wrote:
>> We would like to propose a fix for 8330611.
>>
>> To avoid an out of bounds memory read when the input's size is not multiple of the block size, we read the plaintext/ciphertext tail in 8, 4, 2 and 1 byte batches depending on what it is guaranteed to be available by 'len_reg'. This behavior replaces the read of 16 bytes of input upfront and later discard of spurious data.
>>
>> While we add 3 extra instructions + 3 extra memory reads in the worst case —to the same cache line probably—, the performance impact of this fix should be low because it only occurs at the end of the input and when its length is not multiple of the block size.
>>
>> A reliable test case for this bug is hard to develop because we would need accurate heap allocation. The fact that spuriously read data is silently discarded most of the time makes this bug harder to observe. No regressions have been observed in the compiler/codegen/aes jtreg category. Additionally, we verified the fix manually with the debugger.
>>
>> This work is in collaboration with @franferrax .
>
> Thank you for this fix! (I discovered the bug) I agree it would be very difficult to verify the lack of out-of-bounds memory access. However, after some tinkering myself, I also noticed that even obvious correctness problems in this "tail" area (like simply not copying bytes into the dest array) do not fail any existing unit tests. If it's not that hard, maybe that would a good test to add.
Thanks @charlesconnell for reporting this bug and @theRealAph for your review.
As part of my initial verification, I compared values obtained from the intrinsic with values obtained from the interpreter and they were equal in all cases. These cases were for input whose sizes were multiple of block size - 1 (so all bits of the tail had to be processed). To make sure that the testing was correct, I did error seeding (replaced a `movw` with a `movb` in the tail processing) and it failed as expected.
I have now extended the error seeding strategy and verified how 3/4 tests under compiler/codegen/aes failed. I think that we have reasonable coverage. If there are no objections, I'm planning to integrate.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18849#issuecomment-2067026317
More information about the hotspot-compiler-dev
mailing list