[aarch64-port-dev ] RFD: scratch registers cleanup [Long post]
Nick Gasson
nick.gasson at arm.com
Thu Oct 31 08:04:23 UTC 2019
Hi Andrew,
In general I think this is an improvement. I've put some comments below.
>
> However, things haven't quite worked out that way. In practice people
> "know" that the macro WhizzyMacro they're using only clobbers
> rscratch1 and so rscratch2 will be "safe" and its contents will be
> preserved across WhizzyMacro. The danger of this is obvious: a
> maintenance programmer less familiar with the code base will come
> along and use some other registers. This is particularly problematic
> when a macro is used in many places, and if there are several versions
> of a macro, as happens in the case of GC barriers.
Do you think this partly stems from having *two* scratch registers,
where one or the other is often not clobbered? The 32-bit Arm ports gets
away with just a single Rtmp, and perhaps in that case programmers are
more likely to think it's always clobbered.
>
> There is another risk: aliasing, the extreme sport of referring to the
> same register in a macro by more than one name. Again, the risk is
> obvious: the poor programmer won't know that the registers "tmp" and
> "rscratch2" are in fact the same register.
This can be mitigated somewhat if the macros name all the registers they
use up-front and then do an assert_different_registers(...).
>
> Bear in mind that it is sometimes convenient to avoid declaring
> scratch registers in macros altogether. You can do this by passing all
> of the temporary registers a macro needs explicitly as arguments, and
> this can make the code smaller and simpler.
Yes, it makes the macro much easier to understand in isolation.
As someone new to the codebase I would be a bit surprised that seemingly
primitive operation like cmpxchg, increment, compare_eq, etc. clobber
scratch registers. I think readability would improve if we added
explicit temporary arguments to these macros. So that the majority of
MacroAssembler::* then doesn't use the implicit scratch registers at
all. Maybe we can try this as an exercise and see what's left over? I
guess it wouldn't increase verbosity all that much.
>
> {
> ScratchRegister rscratch1(__ as(), r8);
> __ ldrb(rscratch1, gc_state);
> __ tbz(rscratch1, ShenandoahHeap::MARKING_BITPOS, done);
> }
>
>
> [NB: it is possible to give the scratch register a name other than
> rscratch1, but please don't!]
It's also possible to do `ScratchRegister rscratch1(__ as(), r9)' which
is perhaps more confusing.
>
> Let's say you need to call InnerMacro from WhizzyMacro, and both
> InnerMacro and WhizzyMacro need scratch registers. You can do this by
> explicitly freeing the scratch registers, like so:
>
> WhizzyMacro() {
> ...
> ScratchRegister rscratch1(__ as(), r8);
> // Code using rscratch1 ...
> {
> FreeScratchRegs dummy(__ as());
> __ InnerMacro();
> // Try to use rscratch1 here and you'll get an assertion
> // because you don't own it.
> }
> // You can use rscratch1 again here.
> ...
> }
>
> In this case it is clear to the reader that InnerMacro() may clobber
> both scratch registers. At the end of the scope of FreeScratchRegs,
> rscratch1 is again owned by WhizzyMacro.
But you could equally write it with two separate ScratchRegister scopes
and avoid FreeScratchRegs altogether:
WhizzyMacro() {
...
{
ScratchRegister rscratch1(__ as(), r8);
// Code using rscratch1 ...
}
__ InnerMacro();
// Try to use rscratch1 here and you'll get an assertion
// because you don't own it.
{
ScratchRegister rscratch1(__ as(), r8);
// You can use rscratch1 again here.
...
}
}
Visually this makes it clear to the reader that the two "rscratch1"s are
unrelated and not preserved across InnerMacro. Looking at the patch I
can see why FreeScratchRegs helps minimise changes to the existing code,
but it still feels like a bit of a kludge.
>
> Webrev at http://cr.openjdk.java.net/~aph/scratch-regs/webrev/
>
I couldn't apply that cleanly on top of jdk/jdk. Maybe needs rebasing?
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp
+class ScratchRegister {
+ Register _r;
+ MacroAssembler *_masm;
I think this class can inherit from StackObj to disallow new/delete,
like the other RAII types in Hotspot do.
+ ScratchRegister(const ScratchRegister &r) {
+ _r = r._r;
+ _masm = NULL;
+ }
Do we really need to support copying? It seems to complicate the model
quite a lot and has some issues e.g. if the copy outlives the original.
I can only see it being used in the new cmpw overload - do we actually
call that?
src/hotspot/cpu/aarch64/interp_masm_aarch64.hpp
- set_last_Java_frame(esp, rfp, (address) pc(), rscratch1);
+ set_last_Java_frame(esp, rfp, (address) pc(), r8);
This is a bit icky. Can we move the body of _call_Unimplemented into the
.cpp file where there is already a global constant for rscratch1?
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
+void tassert(bool shouldBeTrue, const char *msg) {
+ if (! shouldBeTrue) {
+ fprintf(stderr, "%s\n", msg);
+ }
+}
This isn't used?
> Finally, I'm not wedded to the current syntax for declarations: I can
> think of some nicer ways to do it so that you could say
>
> use_scratch(rscratch1);
>
> rather than
>
> ScratchRegister rscratch1(__ as(), r8);
I think it's OK as-is, and preferable to anything that uses
pre-processor magic. Maybe something like the following using a move
constructor for ScratchRegister would be prettier:
ScratchRegister rscratch1(__ scratch_register(r8));
Thanks,
Nick
More information about the aarch64-port-dev
mailing list