code movement from slow path into fast path

Tom Rodriguez tom.rodriguez at oracle.com
Thu Mar 27 17:58:02 UTC 2014


Constant placement is currently somewhat suboptimal.  We try to share constants when we can, so initially each one is placed at a point that dominates its usages.  There are probably later uses, though we probably don’t end up sharing them because of spilling.  I’ve seen some much worse cases and I’ve looked at it a bit, but I think we need to revisit how we handle constants both during LIR generation and in the register allocator.  The placement is definitely wrong for your example.
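
To make "a point that dominates its usages" concrete, here is a rough sketch of the idea.  It is not Graal's code: the Block class, commonDominator, placementFor and the block names are all made up for illustration, and it only models the initial placement, not spilling or anything the register allocator does.  It shows that a constant used only on one path can stay on that path, while a constant shared across both paths gets pulled above the branch.

```java
import java.util.List;

// Illustrative only: a toy dominator tree, not Graal's actual data structures.
final class ConstantPlacementSketch {

    /** A basic block that knows its immediate dominator (null for the entry block). */
    static final class Block {
        final String name;
        final Block idom;
        final int depth;

        Block(String name, Block idom) {
            this.name = name;
            this.idom = idom;
            this.depth = (idom == null) ? 0 : idom.depth + 1;
        }
    }

    /** Nearest block that dominates both a and b (assumes a single entry block). */
    static Block commonDominator(Block a, Block b) {
        while (a != b) {
            // Walk the deeper block (or a, on a tie) one step up the dominator tree.
            if (a.depth >= b.depth) {
                a = a.idom;
            } else {
                b = b.idom;
            }
        }
        return a;
    }

    /** Deepest block that dominates every use of a constant. */
    static Block placementFor(List<Block> useBlocks) {
        Block result = useBlocks.get(0);
        for (Block use : useBlocks) {
            result = commonDominator(result, use);
        }
        return result;
    }

    public static void main(String[] args) {
        Block entry = new Block("entry", null);      // holds the compare and branch
        Block fast = new Block("fastPath", entry);
        Block slow = new Block("slowPath", entry);
        Block slowRetry = new Block("slowPathRetry", slow);

        // Constant used only on the slow path: placement stays on the slow path.
        System.out.println(placementFor(List.of(slow, slowRetry)).name);
        // Constant shared by both paths: placement moves above the branch.
        System.out.println(placementFor(List.of(fast, slow)).name);
    }
}
```

With a placement like this, a slow-path-only constant would not need to sit before the compare, which is why the moves in your output look wrong to me.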

tom

On Mar 27, 2014, at 6:17 AM, Deneau, Tom <tom.deneau at amd.com> wrote:

> question about code movement and fast_path_probability:
> 
> My snippet looks like this...
> 
>        Word thread = thread();
>        Word top = atomicGetAndAddTlabTop(thread, size);
>        Word end = readTlabEnd(thread);
>        Word newTop = top.add(size);
>        if (useTLAB() && probability(FAST_PATH_PROBABILITY, newTop.belowOrEqual(end))) {
>            // writeTlabTop(thread, newTop) was done by the atomicGetAndAdd
>            result = formatObject(hub, size, top, prototypeMarkWord, fillContents);
>        } else {
>            // slow path requiring eden access, etc.
>        }
> 
> The generated HSAIL is shown below.  Why would the moves to the $d8 and $d9 registers, which are used only on the slow path, be placed before the compare instruction?
> 
> 	atomic_add_global_u64   $d4, [$d20 + 136], 24;     // $d20 = thread register
> 	ld_global_s64 $d5, [$d20 + 152];                   // readTlabEnd  
> 	add_s64 $d6, $d4, 0x18;                            // newTop = top + size 
> 	mov_b64 $d7, 0x100102d58;                          // class info for the class being allocated
> 	mov_b64 $d8, 0x7f001c0223b0;                       // eden-related pointer used only on the slow path
> 	mov_b64 $d9, 0x7f001c022388;                       // ditto
> 	cmp_lt_b1_u64 $c0, $d5, $d6;                       // newTop.belowOrEqual(end)
> 	cbr $c0, @L10;                                     // @L10 = slow path
> @L26:
> 	ld_global_s64 $d5, [$d7 + 176];                    // fast path object format, etc.
> 	st_global_s64 $d5, [$d4 + 0];
> 	st_global_s32 537003435, [$d4 + 8];
> 	st_global_s32 0, [$d4 + 12];
> 	st_global_s64 0, [$d4 + 16];
>       [...]
> 
> -- Tom
> 


