Request for review JDK-8185591 guarantee(_byte_map[_guard_index] == last_card) failed: card table guard has been modified

Thomas Schatzl thomas.schatzl at oracle.com
Wed Nov 22 08:28:33 UTC 2017


On Tue, 2017-11-21 at 18:24 -0500, Kim Barrett wrote:
> > On Nov 21, 2017, at 5:18 PM, Alexander Harlap
> > <alexander.harlap at oracle.com> wrote:
> > 
> > Please review change for JDK-8185591
> > <https://bugs.openjdk.java.net/browse/JDK-8185591> -
> > guarantee(_byte_map[_guard_index] == last_card) failed: card table
> > guard has been modified
> > 
> > Change is located at
> > http://cr.openjdk.java.net/~aharlap/8185591/webrev.01/
> > 
> > The problem was mishandling of a zero count in the code generated by
> > gen_write_ref_array_post_barrier().
> > 
> > The code is machine specific. The suggested fix covers arm, sparc and
> > x86_64.
> > 
> > No changes are required for x86_32: the zero-count case is already
> > handled properly there (same as for s390 and ppc).
> > 
> > The aarch64 code has the same problem as arm, sparc and x86_64, but I
> > did not include this platform in the suggested changeset.
> > 
> > I attached a possible change for aarch64:
> > stubGenerator_aarch64.cpp.diff
> > 
> > Testing was done with JPRT.
> > 
> > Thank you,
> > 
> > Alex
> 
> Thanks for tracking this down.

Thanks! Looks good to me.

> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
> 1270           __ testl(count, count);
> 1271           __ jcc(Assembler::zero, L_done); // zero count - nothing to do
> 
> Instead of that, might it be better to add something like this?
> 
>   1277           __ subptr(end, start); // end --> cards count
> + 1278           __ jcc(Assembler::less, L_done); // negative inclusive count - nothing to do
> 
> Similar question for all of the affected platforms.
> 
> I don't currently have a strong preference either way, but wonder if
> there's a good reason to choose one over the other.

Basically, the test instruction, particularly in combination with jcc,
has some fast paths in processors. On current ones the difference is
pretty small: after looking at the optimization manual, current Intel
manuals only indicate that the test instruction (macro-)fuses with a
following jcc on all flags, while the sub does not fuse with all of
them [0]. On earlier processors (probably including AMD ones) only test
fuses with the jcc.

I.e. some nano-optimization.
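
For reference, here is a rough C++ sketch of the two guard placements
discussed above. It is a toy model, not the HotSpot stub code: the card
size, element size, function names and the Guard enum are all
assumptions made up for illustration. It only mimics the shape of the
loop (inclusive end, subtract, then a do-while style store loop) to show
what a zero count does when neither check is present.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: every name and constant here is hypothetical, not HotSpot's.
constexpr int          kCardShift = 9;     // assumed 512-byte cards
constexpr std::int64_t kOopSize   = 8;     // assumed element size in bytes
constexpr std::uint8_t kDirty     = 0;
constexpr std::uint8_t kClean     = 0xff;

enum class Guard { None, TestCount, CheckDifference };

// Dirty the cards covering `count` elements starting at byte offset `start`.
// Precondition: start >= kOopSize and the range lies inside the table.
void post_barrier(std::vector<std::uint8_t>& cards, std::int64_t start,
                  std::int64_t count, Guard guard) {
    assert(start >= kOopSize);

    // Variant A: test the count up front (the shape used in the webrev).
    if (guard == Guard::TestCount && count == 0) return;

    // Card index of the last element (inclusive end) and of the first one.
    std::int64_t last  = (start + count * kOopSize - kOopSize) >> kCardShift;
    std::int64_t first = start >> kCardShift;
    // Number of cards minus one; negative when the range is empty and
    // `start` sits exactly on a card boundary.
    std::int64_t diff  = last - first;

    // Variant B: skip when the inclusive difference is negative.
    if (guard == Guard::CheckDifference && diff < 0) return;

    // Do-while shape: the store happens once before the condition is checked.
    do {
        cards[static_cast<std::size_t>(first + diff)] = kDirty;
        --diff;
    } while (diff >= 0);
}

// Helper: number of dirty cards in the table.
int dirty_count(const std::vector<std::uint8_t>& cards) {
    int n = 0;
    for (std::uint8_t c : cards) n += (c == kDirty);
    return n;
}
```

In this model, a zero count with a card-aligned start and no guard
dirties the card just before the start, and either guard avoids that.
For a start inside a card the difference is 0 rather than negative, so
only the count test skips the loop there. Whether that matters for the
real stubs is not something this sketch can establish.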

Thanks,
  Thomas

[0] https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf , 3-13



More information about the hotspot-dev mailing list