Follow-up results for SwissTable with Vector API

Paul Sandoz paul.sandoz at oracle.com
Thu Jan 5 16:29:29 UTC 2023


Hi,

I saw you sent another email prior to this, but for some reason it got lost in the moderation system. (Since you are not a member of the list, your emails need to be moderated and approved.)
 

> On Jan 5, 2023, at 8:09 AM, Zhu, Yifan <yzhu104 at UR.Rochester.edu> wrote:
> 
> This is a follow-up message to https://mail.openjdk.org/pipermail/jdk-dev/2023-January/007288.html.
> 
> > You do:
> >  converted.intoMemorySegment(MemorySegment.ofArray(control), offset, ByteOrder.nativeOrder());
> >
> > Can you just do: 
> >
> >  converted.intoArray(control, offset);
> 
> 
> I did so because I found that Vector<Byte> actually does not have that method.

Ah, yes. There could be a perf issue with memory segment access, and since you had to wrap the array in a segment there will be some cost to that as well. It’s likely that if you wrapped the control array in a segment once and stored it in a field, it would work better.


> After your suggestion, I switched to using ByteVector instead of Vector<Byte>. Surprisingly, this time the hashmap delivers better performance. It is 2~3 times faster during insertion.

Good!


> However, there is still a performance gap behind the standard hashmap during the find procedure.
> 
> For the ease of discussion, I attach the relevant code here:
> 
> private int findWithHash(long hash, K key) {
>     byte h2 = Util.h2(hash); //highest 7 bits
>     int position = Util.h1(hash) & bucketMask; // h1 is just long to int
>     int stride = 0;
>     while (true) {
>         var mask = matchByte(position, h2).toLong(); // matchByte loads a vector of bytes and compares each lane with h2 for equality
>         while (MaskIterator.hasNext(mask)) { 
>             var bit = MaskIterator.getNext(mask);
>             mask = MaskIterator.moveNext(mask);
>             var index = (position + bit) & bucketMask;
>             if (key.equals(keys[index])) return index;
>         }
> 
>         if (matchEmpty(position).anyTrue()) {
>             return -1;
>         }
> 
>         stride += VECTOR_LENGTH;
>         position = (position + stride) & bucketMask;
>     }
> }
> 
> From IntelliJ IDEA's profiler, it seems that a large portion of the time is spent building the VectorMask. I see an underlying bTest operation that converts the results to a boolean array and then produces the mask. Will the JVM optimize this internally into a single movemask operation?
> 
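
As a side note on the probe step in the quoted loop (stride grows by VECTOR_LENGTH each round, position advances by stride and wraps with bucketMask): the plain-Java sketch below records the group start positions that step produces and checks that they cover every bucket. It assumes the capacity and the group width are both powers of two, which the quoted code does not state. The class and method names are made up for illustration.

```java
import java.util.Arrays;

final class ProbeSketch {
    // Group start positions visited by the quoted probe step.
    // Assumption: capacity and groupWidth are powers of two, groupWidth <= capacity.
    public static int[] probeStarts(int start, int capacity, int groupWidth) {
        int bucketMask = capacity - 1;
        int groups = capacity / groupWidth;
        int[] starts = new int[groups];
        int position = start & bucketMask;
        int stride = 0;
        for (int i = 0; i < groups; i++) {
            starts[i] = position;
            stride += groupWidth;                        // stride += VECTOR_LENGTH
            position = (position + stride) & bucketMask; // same wrap as the quoted code
        }
        return starts;
    }

    // True if the groups starting at those positions touch every bucket.
    public static boolean coversAllBuckets(int start, int capacity, int groupWidth) {
        boolean[] seen = new boolean[capacity];
        for (int s : probeStarts(start, capacity, groupWidth)) {
            for (int i = 0; i < groupWidth; i++) {
                seen[(s + i) & (capacity - 1)] = true;
            }
        }
        for (boolean b : seen) {
            if (!b) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probeStarts(5, 64, 16))); // [5, 21, 53, 37]
        System.out.println(coversAllBuckets(5, 64, 16));             // true
    }
}
```

Under those assumptions the loop reaches a group containing an empty slot whenever the table has one, so the `return -1` branch is eventually taken for a missing key.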

Can you get an inline/compilation trace like you did for insert?

The VectorMask.toLong method is an intrinsic method.
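
For reference, here is a scalar sketch of the value the quoted loop consumes: bit i of the long is set when lane i matched. It uses no Vector API calls, so it can serve as a baseline to check the vector path against. The MaskIterator helpers are not shown in the thread; the three methods below are only a guess at their behavior from their names, and the class name and the 0x80 filler byte are made up for illustration.

```java
import java.util.Arrays;

final class MaskSketch {
    // Scalar stand-in for matchByte(position, h2).toLong():
    // bit i is set when control[position + i] == h2.
    // Assumption: position + lanes <= control.length and lanes <= 64.
    public static long matchByteScalar(byte[] control, int position, byte h2, int lanes) {
        long mask = 0L;
        for (int i = 0; i < lanes; i++) {
            if (control[position + i] == h2) {
                mask |= 1L << i;
            }
        }
        return mask;
    }

    // Hypothetical equivalents of the MaskIterator helpers in the quoted code.
    public static boolean hasNext(long mask) { return mask != 0L; }
    public static int getNext(long mask) { return Long.numberOfTrailingZeros(mask); }
    public static long moveNext(long mask) { return mask & (mask - 1); } // clear lowest set bit

    public static void main(String[] args) {
        byte[] control = new byte[16];
        Arrays.fill(control, (byte) 0x80); // arbitrary filler for non-matching slots
        control[3] = 0x2A;
        control[9] = 0x2A;
        long mask = matchByteScalar(control, 0, (byte) 0x2A, 16);
        while (hasNext(mask)) {
            System.out.println(getNext(mask)); // prints 3, then 9
            mask = moveNext(mask);
        }
    }
}
```

If the intrinsic is applied, the vector version should produce the same long for the same control bytes without materializing a boolean array.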

Try:

  var vmask = matchByte(position, h2);
  var mask = vmask.toLong();

It probably will not make any difference, but if findInsertSlot performed OK operating on the mask returned from matchEmptyOrDelete, that points to an issue with VectorMask.toLong.

Paul.
 
> 
> Schrodinger ZHU Yifan, Ph.D. Student
> Computer Science Department, University of Rochester
> 
> Personal Email: i at zhuyi.fan
> Work Email: yifanzhu at rochester.edu
> Website: https://www.cs.rochester.edu/~yzhu104/Main.html
> Github: SchrodingerZhu
> GPG Fingerprint: BA02CBEB8CB5D8181E9368304D2CC545A78DBCC3



More information about the panama-dev mailing list