<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Hi Charles,</p>
    I am analyzing the performance drop seen with the new <i>VectorizedStringEncoder
    </i>on "<i>citm_catalog.json</i>". I documented the steps to
    create a working setup in [1].<br>
    To collect profiles with custom warmup and measurement
    iterations, I created the following Ruby microbenchmark.
    <p><b>File: encoder_benchmark.rb</b></p>
    require "benchmark"<br>
    require 'json'<br>
    <br>
    puts "Ruby Engine: #{RUBY_ENGINE}"<br>
    puts "JSON::Parser: #{JSON::Parser}"<br>
    <br>
    benchmark_name="citm_catalog.json"<br>
    ruby_obj =
JSON.load_file("/mnt/c/Github/workloads/VectorAPI/json/benchmark/data/citm_catalog.json")<br>
    puts "== Encoding #{benchmark_name}"<br>
    <br>
    hash_accum = 0<br>
    def benchmark_coder(benchmark_name, ruby_obj)<br>
      coder = JSON::Coder.new<br>
      json_str = coder.dump(ruby_obj)<br>
      hash_accum = json_str.hash<br>
    end<br>
    <br>
    warmup_execution_time = Benchmark.measure do<br>
      10000.times { benchmark_coder("citm_catalog.json", ruby_obj) }<br>
    end<br>
    puts "Warmup execution time: #{warmup_execution_time.real}"<br>
    <br>
    execution_time = Benchmark.measure do<br>
      20000.times { benchmark_coder("citm_catalog.json", ruby_obj) }<br>
    end<br>
    puts "Execution time: #{execution_time.real}"<br>
    <br>
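    <p>One detail of the harness above: <code>hash_accum</code> inside
      <code>benchmark_coder</code> is a method-local variable, so the
      top-level accumulator is never updated. The following is only a
      minimal sketch of a variant that keeps the result live across
      iterations. It assumes <code>JSON.generate</code> in place of
      <code>JSON::Coder</code>, a tiny hypothetical <code>SAMPLE_OBJ</code>
      in place of the parsed citm_catalog.json, and much smaller iteration
      counts; <code>run_phase</code> is an illustrative name, not part of
      the json library.</p>
<pre>
```ruby
require "benchmark"
require "json"

# Hypothetical stand-in for the parsed citm_catalog.json document.
SAMPLE_OBJ = {
  "events" => { "1" => { "name" => "Concert \"A\"", "ids" => [1, 2, 3] } }
}.freeze

# Encodes obj `iterations` times.
# Returns [elapsed_seconds, accumulated_hash]; the accumulated hash is
# returned so that the encoded strings are actually consumed.
def run_phase(obj, iterations)
  accum = 0
  elapsed = Benchmark.realtime do
    iterations.times { accum += JSON.generate(obj).hash }
  end
  [elapsed, accum]
end

warmup_time, warmup_hash = run_phase(SAMPLE_OBJ, 1_000)
measured_time, measured_hash = run_phase(SAMPLE_OBJ, 2_000)

puts "Warmup execution time: #{warmup_time}"
puts "Execution time: #{measured_time}"
puts "Checksum: #{warmup_hash + measured_hash}"
```
</pre>
    <p>Printing the checksum at the end is one simple way to make sure the
      JIT cannot treat the encoded output as unused.</p>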
    My profiles and JIT code samples are available at [2].<br>
    Some initial analysis:<br>
         -  With the vectorized optimization, around 27.9% of the time is spent in
    StringEncoder.generate, of which 11.6% is spent in
    VectorizedStringEncoder.encode, which<br>
             in turn spends around 8% in StringEncoder.append.<br>
    <p>     -  In baseline JSON, 17.67% of the cycles are spent in
      StringEncoder.generate. </p>
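    <p>To put the figures above in relative terms, here is a small
      arithmetic sketch. It assumes that every percentage quoted is a share
      of total cycles in its own profile (the message does not state this
      explicitly), and the constant and method names are illustrative
      only.</p>
<pre>
```ruby
# Percentages quoted above; all assumed to be shares of total cycles.
PROFILE = {
  vectorized_generate: 27.9,  # StringEncoder.generate, vectorized run
  vectorized_encode: 11.6,    # VectorizedStringEncoder.encode
  append: 8.0,                # StringEncoder.append, reached from encode
  baseline_generate: 17.67    # StringEncoder.generate, baseline run
}.freeze

# Returns `part` as a percentage of `whole`, rounded to one decimal place.
def share(part, whole)
  (100.0 * part / whole).round(1)
end

encode_share = share(PROFILE[:vectorized_encode], PROFILE[:vectorized_generate])
append_share = share(PROFILE[:append], PROFILE[:vectorized_encode])
generate_delta = (PROFILE[:vectorized_generate] - PROFILE[:baseline_generate]).round(2)

puts "encode as a share of generate: #{encode_share}%"
puts "append as a share of encode: #{append_share}%"
puts "generate, vectorized minus baseline: #{generate_delta} percentage points"
```
</pre>
    <p>Under that assumption, StringEncoder.append would account for
      roughly two thirds of the time attributed to
      VectorizedStringEncoder.encode. The two profiles come from separate
      runs, so the difference in percentage points is only indicative.</p>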
    The performance drop is not caused by the Vector API's internal
    implementation, since all the API calls are intrinsified without any
    boxing penalties; it is due<br>
    to the current algorithm. As a next step, I plan to spend time
    optimizing the existing implementation and to develop standalone
    JMH microbenchmarks comparing<br>
    just the scalar and the vectorized string encoding, without the
    glue logic, for a true apples-to-apples comparison.<br>
    <br>
    Best Regards,<br>
    Jatin
    <p>PS: As per Paul's suggestion, I am also working on optimizing <i>simdjson-java</i>
      using a constant-index slice [3] and two lookup tables [4].</p>
    [1]
    <a class="moz-txt-link-freetext"
href="https://github.com/jatin-bhateja/external_staging/blob/main/VectorizedAlgos/JRuby-json-data/jruby_vector_api_setup_steps.txt">https://github.com/jatin-bhateja/external_staging/blob/main/VectorizedAlgos/JRuby-json-data/jruby_vector_api_setup_steps.txt</a><br>
    [2]
    <a class="moz-txt-link-freetext"
href="https://github.com/jatin-bhateja/external_staging/tree/main/VectorizedAlgos/JRuby-json-data">https://github.com/jatin-bhateja/external_staging/tree/main/VectorizedAlgos/JRuby-json-data</a>
    <div class="moz-cite-prefix">[3] <a class="moz-txt-link-freetext"
        href="https://github.com/simdjson/simdjson-java/pull/68">https://github.com/simdjson/simdjson-java/pull/68</a></div>
    <div class="moz-cite-prefix">[4] <a class="moz-txt-link-freetext"
        href="https://github.com/simdjson/simdjson-java/issues/69">https://github.com/simdjson/simdjson-java/issues/69</a></div>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 7/31/2025 8:33 PM, Charles Oliver
      Nutter wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAE-f1xT=AuBJroY=E4JhgunXhu1Z-ixSCbvwmUmcq1GE_CfrEg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="auto">The developer experimenting with vectors has been
        running 21, so I did suggest to him recently to try newer
        releases or dev builds. I'm out of office right now but hoping
        to spend some time in the next week running this through a
        profiler to see if other missed optimizations are interfering
        with the vectorized version of the code.
        <div dir="auto"><br>
        </div>
        <div dir="auto">I also pointed out the other vector-based json
          project to him that was suggested by Daniel. I'm hopeful we
          can get more out of this than we have seen so far once I can
          help profile and dig into optimized results a little bit more.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">There are many other places in JRuby where we
          could use this, such as for handling text transcoding. There
          may even be some Ruby language constructs that could be
          vectorized by JRuby's compiler. I wish I had more hours in the
          day to experiment with this!</div>
      </div>
      <br>
      <div class="gmail_quote gmail_quote_container">
        <div dir="ltr" class="gmail_attr">On Mon, Jul 28, 2025, 22:40
          Paul Sandoz &lt;<a href="mailto:paul.sandoz@oracle.com"
            moz-do-not-send="true" class="moz-txt-link-freetext">paul.sandoz@oracle.com</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div style="line-break:after-white-space">
            Hi Daniel,
            <div><br>
            </div>
            <div>Thanks for sharing. We have made progress optimizing
              the rearrange/selectFrom operations for UTF-8 related use
              cases. The improvements were integrated into JDK release
              24 [0].</div>
            <div>
              <div>Further optimizations are in flight for slice
                operations with constant inputs [1], which I believe can
                simplify the referenced code and may further boost
                performance, but we need to verify.</div>
            </div>
            <div><br>
            </div>
            <div>Charlie, what version of the JDK are you using?</div>
            <div><br>
            </div>
            <div>Paul.</div>
            <div><br>
            </div>
            <div>[0] <a href="https://openjdk.org/jeps/489"
                target="_blank" rel="noreferrer" moz-do-not-send="true"
                class="moz-txt-link-freetext">https://openjdk.org/jeps/489</a><br>
              <div>[1] <a
                  href="https://github.com/openjdk/jdk/pull/24104"
                  target="_blank" rel="noreferrer"
                  moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/openjdk/jdk/pull/24104</a></div>
              <div><br>
                <blockquote type="cite">
                  <div>On Jul 16, 2025, at 10:46 AM, Daniel Lemire &lt;<a
                      href="mailto:daniel@lemire.me" target="_blank"
                      rel="noreferrer" moz-do-not-send="true"
                      class="moz-txt-link-freetext">daniel@lemire.me</a>&gt;
                    wrote:</div>
                  <br>
                  <div>
                    <div>
                      <div>Good day Charles,</div>
                      <div><br>
                      </div>
                      <div>The following link might be relevant :</div>
                      <div><br>
                      </div>
                      <div><a
href="https://github.com/simdjson/simdjson-java" target="_blank"
                          rel="noreferrer" moz-do-not-send="true"
                          class="moz-txt-link-freetext">https://github.com/simdjson/simdjson-java</a><br>
                      </div>
                      <div><br>
                      </div>
                      <div>- Daniel</div>
                      <div><br>
                      </div>
                      <blockquote type="cite"
                        id="m_5949170455508475308qt">
                        <div dir="ltr">
                          <div>After seeing similar work done for the C
                            version of the Ruby json standard library, I
                            suggested to the author that we could do the
                            same for JRuby using the Vector API. So he
                            went and did it!</div>
                          <div><br>
                          </div>
                          <div><a
href="https://github.com/ruby/json/pull/824" target="_blank"
                              rel="noreferrer" moz-do-not-send="true"
                              class="moz-txt-link-freetext">https://github.com/ruby/json/pull/824</a></div>
                          <div><br>
                          </div>
                          <div>The results are somewhat mixed;
                            performance of some cases is faster and
                            other cases is slower. We would love to get
                            input from anyone on this list interested in
                            seeing another real-world use case for the
                            Vector API.</div>
                          <div><br>
                          </div>
                          <div>I'm hopeful we can pump up these numbers
                            with some additional tweaking in JRuby and
                            json.</div>
                          <div><br>
                          </div>
                          <div>
                            <div dir="ltr">
                              <div dir="ltr">
                                <div><b>Charles Oliver Nutter</b></div>
                                <div><i>Architect and Technologist</i></div>
                                <div>Headius Enterprises</div>
                                <div><a href="https://www.headius.com/"
                                    target="_blank" rel="noreferrer"
                                    moz-do-not-send="true"
                                    class="moz-txt-link-freetext">https://www.headius.com</a></div>
                                <div>
                                  <div><a
                                      href="mailto:headius@headius.com"
                                      target="_blank" rel="noreferrer"
                                      moz-do-not-send="true"
                                      class="moz-txt-link-freetext">headius@headius.com</a></div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                      <div><br>
                      </div>
                    </div>
                  </div>
                </blockquote>
              </div>
              <br>
            </div>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>