review (M) for 4809552: Optimize Arrays.fill(...)
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Aug 20 10:49:32 PDT 2010
Assembler part review.
In stubGenerator_sparc.cpp
Move next lines above 64 bit value construction:
+ __ cmp(count, 2<<shift); // Short arrays (< 8 bytes) copy by element
+ __ brx(Assembler::lessUnsigned, false, Assembler::pn, L_copy_4_bytes); // use unsigned cmp
+ __ delayed()->nop();
In assembler_x86.cpp
You don't need next:
+ jmpb(L_copy_4_bytes); // all dwords were copied
Use next movdqa since you aligned address:
+ movdqa(Address(to, 0), xtmp);
+ movdqa(Address(to, 16), xtmp);
instead of
+ movq(Address(to, 0), xtmp);
+ movq(Address(to, 8), xtmp);
+ movq(Address(to, 16), xtmp);
+ movq(Address(to, 24), xtmp);
Vladimir
Vladimir Kozlov wrote:
> Tom,
>
> First, I would not call these changes Medium. They are Large at least.
>
> Should we allow OptimizeFill only when UseLoopPredicate is true?
>
> loopTransform.cpp:
>
> In match_fill_loop() should we exclude StoreCMNode also?
>
> RAW store check is hidden in as_AddP()->unpack_offsets(). Should we do
> it explicitly?
>
> store and store_value is not set for "copy candidate":
>
> + if (value->is_Load() && lpt->_body.contains(value)) {
> + // tty->print_cr("possible copy candidate");
> + } else {
> + msg = "variant store value";
> + }
>
> Why you assume that on 'else' it is mem_phi?:
>
> + if (n == head->phi()) {
> + // ok
> + } else {
> + // mem_phi
> + }
>
> Should we also skip proj node (ifFalse) or it is not part of loop body?
>
> + } else if (n->is_CountedLoopEnd()) {
> + // ok so skip it.
>
> + msg = "node used outside loop";
> ^ is
>
> How you translate next assert message?:
> + assert(store_value->is_Load(), "shouldn't only happen for this
> case");
>
> the next dump should be under flag and 'msg' should reflect "possible
> copy" or set msg_node:
>
> + #ifdef ASSERT
> + tty->print_cr("possible copy");
> + store_value->dump();
> + store->dump();
> + #endif
> + msg = "variant store in loop";
>
> For Op_LShiftX there is no check (n->in(1) == head->phi()):
>
> + } else if (n->Opcode() == Op_LShiftX) {
> + shift = n;
> + assert(type2aelembytes(store->as_Mem()->memory_type(), true)
> == 1 << shift->in(2)->get_int(), "scale should match");
>
>
> s_offs already includes base_offset, see GraphKit::array_element_address():
> + aligned = ((arrayOopDesc::base_offset_in_bytes(t) + s_offs *
> element_size) % HeapWordSize == 0);
> Also the above expression is wrong if initial index != 0.
> And actually you don't need to calculate it in match_fill_loop() since
> it is used only in call to StubRoutines::select_fill_function() to verify
> that element type is supported.
>
>
> In intrinsify_fill() initial index value is taking into account for aligned
> but base_offset_in_bytes could be already part of offset and you need
> to multiply by element_size only initial index:
>
> + if (offset != NULL && head->init_trip()->is_Con()) {
> + intptr_t offs = offset->find_intptr_t_type()->get_con() +
> head->init_trip()->get_int();
> + int element_size = type2aelembytes(t);
> + aligned = ((arrayOopDesc::base_offset_in_bytes(t) + offs *
> element_size) % HeapWordSize == 0);
> + }
>
> stubRoutines.cpp:
> why you have specialized copies for testing _jint_fill and _jbyte_fill.
> Is not it covered by TEST_FILL already?
>
> stubGenerator_sparc.cpp:
> + // Generate stub for disjoint short fill. If "aligned" is true, the
> ^ Generate stub for array fill.
>
> + // from: O0
> ^ to
> + // to: O1
> ^ value
>
> O5 is not used and not input argument:
> + const Register offset = O5; // offset from start of arrays
>
> stubs are generated only for byte,short and int, so allowing bollean,
> char and float is wrong:
> + switch (t) {
> + case T_BOOLEAN:
> + case T_BYTE:
> + shift = 2;
> + break;
> + case T_CHAR:
> + case T_SHORT:
> + shift = 1;
> + break;
> + case T_FLOAT:
> + case T_INT:
> + shift = 0;
> + break;
> + default: ShouldNotReachHere();
> + }
>
> The same in assembler_x86.cpp
>
> In stubGenerator_x86_64.cpp
> new fill_32_bytes_forward() is not used.
>
> Remove commented code for T_LONG in both stubGenerator_x86_??.cpp
>
> I did not look on assembler. May be tomorrow.
>
> Thanks,
> Vladimir
>
>
> Tom Rodriguez wrote:
>> 4809552: Optimize Arrays.fill(...)
>> Reviewed-by:
>>
>> This adds new logic to recognize fill idioms and convert them into a
>> call to an optimized fill routine. Loop predication creates easily
>> matched loops that are simply replaced with calls to the new assembly
>> stubs. Currently only 1,2 and 4 byte primitive types are supported.
>> Objects and longs/double will be supported in a later putback. Tested
>> with runthese, nsk and ctw plus jbb2005.
>>
>> http://cr.openjdk.java.net/~never/4809552
More information about the hotspot-compiler-dev
mailing list