VectorBox enabling

Fri Nov 17 22:11:14 UTC 2017

Hey Vladimir,

Thank you for looking over the patch. :)

As part of the latest patch, the design of how VectorBox works has been modified and works as follows:
- All intrinsifications now call "wrapWithVectorBox" in order to keep track of mapping between vector value and object.
- As part of VectorBox creation, the following nodes are created unconditionally: Allocate (for Vector object), AllocateArray (for array holding values), StoreN (to store array into object's field), and StoreVector (to store vector value into array).
- VectorBox is treated as a multinode and thus the following projections are created: memory, control, and object. The "result" of the intrinsification is set to be the Projection node corresponding to object.

So basically, anyone using the object will end up using the projection from the VectorBox.

During the "early" macro expansion phase (expand_vbox_nodes() happens before EA), it takes a look at the VectorBox out projection. This projection was built during intrinsification phase. If this projection has been removed from graph or if it has outcnt == 0 (every time I have seen it, it is actually removed from graph), the VectorBox node is marked for complete elimination. This is because there is noone actually using the object. As part of eliminating VectorBox, we also have to eliminate all nodes that were created specifically to allocate object and also the stores.

So basically, allocations are safe to remove because we know noone is using object and also we know that object has not been scalarized because this happens before EA. Once we are sure allocations are safe to remove, we call "eliminate_allocate_node". This method has logic to be able to remove allocate nodes and I used it in order to not duplicate code. However, this method does no work unless the flags "_is_non_escaping" and "_is_scalar_replaceable" are set to true. Thus, we manually force them to true because it is safe to do so - we know the object is not being used.

Regarding the issue of the stores being removed before the allocations, I believe it is ok! That is because in general, the removal is expected to be the reverse of the addition of nodes which happened in this order: allocate, allocatearray, storen, storevector. However, to be completely fair, it seems I did not adhere to this reverse ordering strictly since I do removal in this order: storen, storevector, allocate, allocatearray. However, as far as I can see, everything is appropriately removed.

Now with all this explained, do you still see an issue?

Thanks,
Razvan

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] 
Sent: Thursday, November 16, 2017 10:38 AM
To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; 'panama-dev at openjdk.java.net' <panama-dev at openjdk.java.net>
Subject: Re: VectorBox enabling

Nice work, Razvan!

> VectorBox enabling is now mostly complete and appearing to be functional. The VectorBox supports being able to generate objects for all supported Vector objects that have some intrinsic method. This includes GenericMask (subject to some limitation noted below). Additionally, VectorBox nodes can be removed along with their allocations in cases when the objects do not need created. I have tested BLAS (saxpy, sdot) and Sepia demo used in JavaOne and performance has not regressed.
> 
> Please see attached patch and if there are no concerns, I will merge tomorrow.
> http://cr.openjdk.java.net/~rlupusoru/panama/webrev_vectorbox_04/

src/share/vm/opto/macro.cpp:

+      vectorbox_remove_store(_igvn, fstore);
+      vectorbox_remove_store(_igvn, vstore);
+
+      vectorbox_remove_allocate(obj_alloc);
+      vectorbox_remove_allocate(arr_alloc);

+void PhaseMacroExpand::vectorbox_remove_allocate(Node* allocate) {
+  if (allocate != NULL && allocate->is_Allocate()) {
+    allocate->as_Allocate()->_is_scalar_replaceable = true;
+    allocate->as_Allocate()->_is_non_escaping = true;

Here's the problem I see with that: you try to persuade PhaseMacroExpand::eliminate_allocate_node that it's safe to scalarize the object (and remove the allocation). There are additional checks in
PhaseMacroExpand::can_eliminate_allocation() & PhaseMacroExpand::scalar_replacement(), but even if they all pass, C2 can't correctly scalarize the object anymore, because field stores are already removed.

For example, if a scalarizable object is referenced by a safepoint, C2 replaces it with SafePointScalarObjectNode and attaches up-to-date field values to it, so runtime can re-materialize the instance if needed.

It won't work as intended anymore if you remove stores first.

And I'm curious how profitable _is_scalar_replaceable = true trick becomes when you change the order.

Best regards,
Vladimir Ivanov

> Note that in patch you will find some "FIXME" related to masks (namely mask shape and type recovery is not possible at times during intrinsification). After this patch, I will look into solving this problem by potentially having specialized masks for each type and shape combination (as is done for species).
> 
> The main limitations remaining with VectorBox are as follows:
> 
> -          If VectorBox is used by any non-intrinsified calls, stores to heap, or runtime calls via deopt, it will generate an object at the original call site. The plan is to move this to slow path when Vector API object identities can be ignored.
> 
> -          VectorBox for GenericMask does not set the Species field. This will either be fixed in a follow-up patch or the approach for specialized masks will be employed instead.

> 
> Thanks so much!
> 
> --Razvan
>