<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    <div class="moz-cite-prefix">On 2014-01-13 22:39, Jungwoo Ha wrote:<br>
    </div>
    <blockquote
cite="mid:CA+n_jhir_sFMHqTT=de=piJo86TuLbER0MTAiL8SnK7av+fRmA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> <br>
                In CMSCollector there is still this code to change the
                value for ConcGCThreads based on AdjustGCThreadsToCores.
                <div class="im"><br>
                  <br>
                   639       if (AdjustGCThreadsToCores) {<br>
                   640         FLAG_SET_DEFAULT(ConcGCThreads,
                  ParallelGCThreads / 2);<br>
                   641       } else {<br>
                   642         FLAG_SET_DEFAULT(ConcGCThreads, (3 +
                  ParallelGCThreads) / 4);<br>
                   643       }<br>
                  <br>
                </div>
                Do you think that is needed or can we use the same logic
                in both cases given that ParallelGCThreads has a
                different value if AdjustGCThreadsToCores is enabled.<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>I am happy to just use FLAG_SET_DEFAULT(ConcGCThreads,
              ParallelGCThreads / 2);<br>
              The original hotspot code used
              FLAG_SET_DEFAULT(ConcGCThreads, (3 + ParallelGCThreads) /
              4); which I think is somewhat arbitrary.</div>
            <div>Now that ParallelGCThreads will reduce on some
              configuration, dividing it into 4 seems to make the
              ConcGCThreads too small.</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Hm. Changing to FLAG_SET_DEFAULT(ConcGCThreads, ParallelGCThreads /
    2) might be the way to go, but I think that should probably done as
    a separate change. That way we can performance test it more
    thoroughly.<br>
    <br>
    <blockquote
cite="mid:CA+n_jhir_sFMHqTT=de=piJo86TuLbER0MTAiL8SnK7av+fRmA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> <br>
                Also, I don't fully understand the name
                AdjustGCThreadsToCores. In
                VM_Version::calc_parallel_worker_threads() for x86 we
                simply active_core_count with 2 if this flag is enabled.
                So, the flag does not really adjust to the cores. It
                seems like it is reduces the number of GC threads. How
                about calling the flag ReduceGCThreads or something like
                that?<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>The flag can be named better. However, ReduceGCThreads
              doesn't seem to reflect what this flag does.</div>
            <div>I am pretty bad at naming, so let me summarize what
              this flag is actually doing.</div>
            <div><br>
            </div>
            <div>The flag adjusts the GC threads to the number of
              "available" physical cores reported by /proc filesystem
              and the CPU mask set by sched_setaffinity.</div>
            <div>For example, ParallelGCThreads will remain the same
              regardless of whether hyperthreading is turned on/off.</div>
            <div>Current hotspot code will have twice more GC threads if
              hyperthreading is on.</div>
            <div>Usually, GC causes huge number of cache misses, thus
              having two GC threads competing for the same physical core
              hurts the GC throughput.</div>
            <div>Current hotspot code doesn't consider CPU mask at all.</div>
            <div>For example, even though the machine has 64 cores, if
              CPU mask is set for 2 cores, current hotspot calculates
              the number of GC threads based on 64.</div>
            <div>Thus, this flag is actually evaluating the number of GC
              threads to the number of physical cores available for the
              JVM process.</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Right. In VM_Version::calc_parallel_worker_threads() we take the
    value of os::active_core_count() and divide it by 2. I guess this is
    to reduce the cache issues. But if the flag is called
    AdjustGCThreadsToCores I would have expected that we set the number
    of GC threads to be equal to the core count. That's why I suggested
    "Reduce" in the name.<br>
    <br>
    Naming is hard and I am not particularly fond of the name
    ReduceGCThreads either. But maybe we can try to come up with
    something else?<br>
    <br>
    <blockquote
cite="mid:CA+n_jhir_sFMHqTT=de=piJo86TuLbER0MTAiL8SnK7av+fRmA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> I think I pointed
                this out earlier, but I don't feel comfortable reviewing
                the changes in os_linux_x86.cpp. I hope someone from the
                Runtime team can review that.</div>
            </blockquote>
            <div><br>
            </div>
            <div>Can you clarify what you meant? /proc & cpu mask is
              dependent on Linux & x86, and I only tested on that
              platform.</div>
            <div>The assumptions I used here is based on the x86 cache
              architecture.</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    What I was trying to say was that I don't know enough about Linux to
    be confident that your implementation of os::active_core_count() is
    the simplest and most stable way to retrieve that information. I'm
    sure it is good, I am just not the right person to review this piece
    of the code. That's why I think it would be good if someone from the
    Runtime team looked at this.<br>
    <br>
    Thanks,<br>
    Bengt<br>
    <br>
    <br>
    <blockquote
cite="mid:CA+n_jhir_sFMHqTT=de=piJo86TuLbER0MTAiL8SnK7av+fRmA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div>Jungwoo</div>
            <div><br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>