<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<blockquote type="cite" cite="mid:CAL4QsguYkBshtgiD4v2MsnirbOsm+Lbaw6KSAc7pryopN6=z+w@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote gmail_quote_container">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>The fast paths in StringSupport call an out-of-line
stub that does a vectorized copy. At least in theory
C2's auto-vectorizer should be able to do the exact same
thing for a manual loop using charAt, but inline, so
it might even be faster, especially for small strings.
That's why it would be good to try that approach and see
how it compares.<br>
</p>
</div>
</blockquote>
<div>I can take a closer look at this. To check my
understanding, would you expect it to be competitive for
UTF-16, or also UTF-8?</div>
</div>
</div>
</blockquote>
Either should work in principle, though the UTF-16 code for expanding
to a char is more complex, so the vectorizer's pattern matching might
fail there. The code for UTF-8 (well, really latin1) is much simpler
(just a plain array load), so of the two, that one is more likely to
vectorize.
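<p>A minimal sketch of the kind of manual charAt loop discussed here,
assuming the string content is latin1 so that narrowing each char to a
byte is lossless. The class and method names are hypothetical, and
whether C2 actually vectorizes this loop is exactly what would need to
be measured.</p>

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not JDK code.
final class CharAtCopySketch {

    // Copies src[from, to) into dst starting at dstOffset, one byte per char.
    // Assumption: every char in the range is <= 0xFF (latin1), so the
    // narrowing cast does not lose information.
    static void copyLatin1(String src, int from, int to,
                           MemorySegment dst, long dstOffset) {
        for (int i = from; i < to; i++) {
            // Straight-line body: one charAt load, one byte store.
            dst.set(ValueLayout.JAVA_BYTE, dstOffset + (i - from),
                    (byte) src.charAt(i));
        }
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(5);
            copyLatin1("hello, world", 7, 12, seg, 0);
            byte[] out = seg.toArray(ValueLayout.JAVA_BYTE);
            System.out.println(new String(out, StandardCharsets.ISO_8859_1));
        }
    }
}
```

<p>The loop body has no branches of its own, which is the shape most
likely to be recognized. Comparing it against the out-of-line stub in a
benchmark would show whether inlining pays off for small strings.</p>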
<blockquote type="cite" cite="mid:CAL4QsguYkBshtgiD4v2MsnirbOsm+Lbaw6KSAc7pryopN6=z+w@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote gmail_quote_container">
<div> For the UTF-8 case, would you expect something like what
proto is currently doing here [1] to get vectorized?</div>
<div><br>
</div>
<div>[1] <a href="https://github.com/protocolbuffers/protobuf/blob/0a727cfc6e0a6dbeb46716f2f6142b99b6a604e0/java/core/src/main/java/com/google/protobuf/Utf8.java#L939-L990" moz-do-not-send="true">https://github.com/protocolbuffers/protobuf/blob/0a727cfc6e0a6dbeb46716f2f6142b99b6a604e0/java/core/src/main/java/com/google/protobuf/Utf8.java#L939-L990</a></div>
</div>
</div>
</blockquote>
<p>This doesn't look like something that would vectorize. Typically,
any non-loop-invariant control flow you have in a loop body will
inhibit vectorization.</p>
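<p>To illustrate the difference in loop shape, here is a small sketch
with hypothetical names. It is not the protobuf code. The first method
has a branch that depends on each loaded char. The second keeps the
body branch-free and checks for non-ASCII input after the loop. Whether
C2 vectorizes the second form is an assumption that would need to be
verified.</p>

```java
// Hypothetical sketch contrasting two loop shapes.
final class LoopShapeSketch {

    // Data-dependent control flow: the loop exit depends on the value
    // loaded in each iteration. Returns the number of chars copied.
    static int copyWhileAscii(String s, byte[] dst) {
        int n = Math.min(s.length(), dst.length);
        int i = 0;
        for (; i < n; i++) {
            char c = s.charAt(i);
            if (c >= 0x80) {
                break; // branch on loaded data
            }
            dst[i] = (byte) c;
        }
        return i;
    }

    // Branch-free body: always copy, and OR all chars together so the
    // ASCII check happens once, after the loop. Returns true if every
    // copied char was ASCII (dst is only meaningful in that case).
    static boolean copyThenCheck(String s, byte[] dst) {
        int n = Math.min(s.length(), dst.length);
        int acc = 0;
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            acc |= c;
            dst[i] = (byte) c;
        }
        return acc < 0x80;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        System.out.println(copyWhileAscii("ab\u00e9d", buf));
        System.out.println(copyThenCheck("abc", new byte[3]));
    }
}
```

<p>The second form does extra work on non-ASCII input, since it copies
before it knows the result is usable, so it only makes sense as a fast
path with a fallback.</p>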
<blockquote type="cite" cite="mid:CAL4QsguYkBshtgiD4v2MsnirbOsm+Lbaw6KSAc7pryopN6=z+w@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote gmail_quote_container">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p> I was thinking primarily along the lines of adding a
MemorySegment::copy overload that accepts Strings as a
source (as opposed to e.g. an array), for copying from a
string to a memory segment only. We should probably also
add an overload to SegmentAllocator::allocateFrom that
accepts an offset and a length (we already have two for
full strings). These two overloads could fully support
the substring use case without looking too out of
place. </p>
</div>
</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>For reading a String, I think your proposal to augment
MemorySegment::getString looks good, but I think we
should leave setString alone in favor of adding a
MS::copy overload (there's the asymmetry I was talking
about before). </p>
</div>
</blockquote>
<div>Thanks, I think I understand better now. Using copy for
this seems a lot nicer than setStringWithoutNullTerminator.</div>
<div> </div>
For the allocateFrom part, do you think it would make sense to
pass the offset/length all the way through
bytesCompatible/copyToSegmentRaw? That could be decided with
benchmarks, and also potentially done later with the same
allocateFrom API shape if it ended up being worthwhile.</div>
</div>
</blockquote>
I think it should work similarly to the overload we have with
MemorySegment as a source: i.e. just call allocateNoInit, and then
delegate to MemorySegment::copy.
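<p>A rough sketch of that shape, written against the public API only.
Both methods are hypothetical stand-ins: the String-accepting copy
overload does not exist yet, and since allocateNoInit is not public API
as far as I know, the sketch uses a plain allocate instead. It also
assumes latin1 content and writes no null terminator.</p>

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the proposed JDK implementation.
final class AllocateFromSketch {

    // Stand-in for a MemorySegment::copy overload taking a String source.
    // Assumption: the selected chars are latin1. This version goes through
    // an intermediate byte[], which a real implementation would avoid.
    static void copy(String src, int srcIndex,
                     MemorySegment dst, long dstOffset, int length) {
        byte[] bytes = src.substring(srcIndex, srcIndex + length)
                          .getBytes(StandardCharsets.ISO_8859_1);
        MemorySegment.copy(bytes, 0, dst, ValueLayout.JAVA_BYTE,
                           dstOffset, length);
    }

    // Stand-in for an allocateFrom overload with offset and length:
    // allocate the destination, then delegate to copy.
    static MemorySegment allocateFrom(SegmentAllocator allocator,
                                      String src, int srcIndex, int length) {
        MemorySegment dst = allocator.allocate(length);
        copy(src, srcIndex, dst, 0, length);
        return dst;
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = allocateFrom(arena, "hello, world", 7, 5);
            byte[] out = seg.toArray(ValueLayout.JAVA_BYTE);
            System.out.println(new String(out, StandardCharsets.ISO_8859_1));
        }
    }
}
```

<p>Keeping allocateFrom as a thin wrapper means any fast path only has
to live in copy, and the API shape stays the same if offset and length
are later threaded further down.</p>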
<blockquote type="cite" cite="mid:CAL4QsguYkBshtgiD4v2MsnirbOsm+Lbaw6KSAc7pryopN6=z+w@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote gmail_quote_container">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>For completeness, I think we should also just add the
MemorySegment::ofString(String, Charset) overload which
tries to return a read-only view of the string, to match
the existing ofArray methods. This seems generally just
a good primitive to have.</p>
</div>
</blockquote>
<div>That sounds good to me.</div>
<div><br>
</div>
<div>Do you have thoughts on the best way to proceed here? Do
you think it makes sense to do this incrementally, or would you
prefer to see all of these related changes happen together
under a single issue?<br>
<br>
</div>
</div>
</div>
</blockquote>
<p>I don't have a preference. Since you've already started a PR for
enhancing getString, maybe you can focus on that for now, and
we'll file follow-up issues for the others. Splitting things up
might be nice since there's probably some benchmarking work
involved for each. I think the copy and allocateFrom overloads can
be done in one patch though.</p>
<p>Jorn</p>
</body>
</html>