New allocation profiler

Vladimir Sitnikov sitnikov.vladimir at gmail.com
Wed Apr 15 10:03:41 UTC 2015


>having no way to be notified about thread creation and destruction

I see at least two approaches there:
1) Java agent + hotswap Thread.init / Thread.exit.
2) lurk on java.lang.Thread#nextThreadID to detect and warn user on
non-captured threads (e.g. separate output counter with "number of
non-captured threads")

> I wonder if we should fold this new profiler into -prof gc, that reports
> churn rates. Users will then have a more complete picture of what is
> going on:

I wonder if profilers can have arguments.
Personally, I do not like that -prof gc prints lots of excessive
information. Well, it is useful when you really want to know that, but
I do not always want to know all the details.
Throwing all kind of metaspaces to an unprepared user is not that good either.

Something like "-prof gc" to print just basic allocation profile and
"-prof gc=details" to include space breakdown would be nice.

I use separate "profiler label" to make output compact.

> a,b,c

Those a fair points, however I do not think you always want double check that.
You do it once, then you run your benchmark with different parameters
and you know that allocation profile is measured "properly".

That relates to "why don't we always use all the profilers?" question.
Well, do you?

> profile is not something prudent to do. Don't use that mode for tiny
> benchmarks, and then allocation profile skew would be minimal.

Ok.

> Code comments:
>
>  * I would think counters should be "gc.alloc.rate" and
> "gc.alloc.rate.norm". It would send the message that "norm" is the
> derivative, not measured directly. It also aligns with what -prof gc does.

So you put the hardest problem first, don't you?
"allocation" is arguably a GC thing.
You can allocate without a single GC.
However, "gc.alloc.rate" is understandable enough and having proper
sorting is good, so I've updated that.

>  * Do we actually need static initializers? That will init the profiler
> even when we don't use it. E.g. it will be inited on any
> ProfilerFactory.getAvailableProfilers() call.

I just thought it would be nice to avoid duplication of getMethod stuff.
Do you think NoClass...Errors would appear in corner cases?
I've changed to catch(Throwable) in the clinit so the the error does
not block the whole thing.

>  * estimateAllocatedBytes() seems to skip current thread from
> estimation. Why bother doing the profiler code garbage-free then?

That's fair. Initially I thought the API might be useful outside
somehow, but it looks like we do not need that.

Am I right "current thread" means "jmh infrastructure only"?
The idea is to include allocations made by user-provided code only.

>  * "// To avoid allocation while returning two values from
> getTotalAllocatedMemory" comment references an old method name :) It
> took me a while to realize you are using two fields to avoid
> constructing a tuple-like wrapper class -- should the comment call that
> out better?

The code is removed in favour of tuple.

>  * Returning -1 as "absent" result. Double.NaN is universally used to
> denote the cases like that.

That's reasonable.
ClassloaderProfiler and CompilerProfiler obviously use -1.

>  * "threadMxBean" and "getThreadAllocatedBytes" should be shorter and/or
> capitalized?

Indeed. Not sure how to make GET_THREAD_ALLOCATED_BYTES shorter though.

Vladimir
-------------- next part --------------
# HG changeset patch
# User Vladimir Sitnikov <sitnikov.vladimir at gmail.com>
# Date 1428845687 -10800
#      Sun Apr 12 16:34:47 2015 +0300
# Node ID e46b44c314d5eb4ea4f96ff2efa1613a2b0cac69
# Parent  1898f77c754c84e33da6b74b4e913f009cd27841
Add profiler for memory allocation. The implementation is based on ThreadMXBean.getThreadAllocatedBytes.
The implementation can miss allocations if new thread is created and destroyed in between profiler snapshots.

diff -r 1898f77c754c -r e46b44c314d5 jmh-core-it/src/test/java/org/openjdk/jmh/it/profilers/HotspotAllocationProfilerTest.java
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/jmh-core-it/src/test/java/org/openjdk/jmh/it/profilers/HotspotAllocationProfilerTest.java	Sun Apr 12 16:34:47 2015 +0300
@@ -0,0 +1,61 @@
+package org.openjdk.jmh.it.profilers;
+
+import org.junit.Test;
+import org.openjdk.jmh.annotations.*;
+import org.openjdk.jmh.it.Fixtures;
+import org.openjdk.jmh.profile.HotspotAllocationProfiler;
+import org.openjdk.jmh.runner.Runner;
+import org.openjdk.jmh.runner.RunnerException;
+import org.openjdk.jmh.runner.options.Options;
+import org.openjdk.jmh.runner.options.OptionsBuilder;
+
+import java.util.ArrayList;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Tests allocation profiler.
+ */
+ at OutputTimeUnit(TimeUnit.NANOSECONDS)
+public class HotspotAllocationProfilerTest {
+    @Benchmark
+    @Warmup(iterations = 2)
+    @Measurement(iterations = 1, time = 1, timeUnit = TimeUnit.MILLISECONDS)
+    @BenchmarkMode(Mode.AverageTime)
+    @Fork(1)
+    public Object allocateObject() {
+        return new Object();
+    }
+
+    @Benchmark
+    @Warmup(iterations = 2)
+    @Measurement(iterations = 5)
+    @BenchmarkMode(Mode.SingleShotTime)
+    @Fork(1)
+    public Object allocateObjectSingleShot() {
+        return new Object();
+    }
+
+    @Benchmark
+    @Warmup(iterations = 2)
+    @Measurement(iterations = 1)
+    @BenchmarkMode(Mode.SampleTime)
+    @Fork(1)
+    public Object allocateObjectSampleTime() {
+        return new Object();
+    }
+
+    @Test
+    public void testAllocationProfiler() throws RunnerException {
+        HotspotAllocationProfiler profiler = new HotspotAllocationProfiler();
+        ArrayList<String> errors = new ArrayList<String>();
+        if (!profiler.checkSupport(errors)) {
+            System.out.println("Allocation profiler is not available: " + errors.toString());
+            return;
+        }
+        Options opts = new OptionsBuilder()
+                .include(Fixtures.getTestMask(this.getClass()))
+                .addProfiler(HotspotAllocationProfiler.class)
+                .build();
+        new Runner(opts).run();
+    }
+}
diff -r 1898f77c754c -r e46b44c314d5 jmh-core/src/main/java/org/openjdk/jmh/profile/HotspotAllocationProfiler.java
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/jmh-core/src/main/java/org/openjdk/jmh/profile/HotspotAllocationProfiler.java	Sun Apr 12 16:34:47 2015 +0300
@@ -0,0 +1,196 @@
+/*
+ * Copyright (c) 2014, 2015, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.  Oracle designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Oracle in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+package org.openjdk.jmh.profile;
+
+import org.openjdk.jmh.infra.BenchmarkParams;
+import org.openjdk.jmh.infra.IterationParams;
+import org.openjdk.jmh.results.AggregationPolicy;
+import org.openjdk.jmh.results.IterationResult;
+import org.openjdk.jmh.results.Result;
+
+import java.lang.management.ManagementFactory;
+import java.lang.management.ThreadMXBean;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.*;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Measures thread allocated memory in bytes via {@code ThreadMXBean.getThreadAllocatedBytes}.
+ * Profiler measures allocation of all the threads but current one.
+ */
+public class HotspotAllocationProfiler implements InternalProfiler {
+    private static final Method GET_THREAD_ALLOCATED_BYTES;
+    private static final Collection<String> FAIL_MSGS;
+
+    private AllocationSnapshot beforeAllocated;
+    private long beforeTime;
+
+    static {
+        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
+        Collection<String> failMessages;
+        Method getBytes;
+        try {
+            getBytes = threadMXBean.getClass()
+                    .getMethod("getThreadAllocatedBytes", long[].class);
+            getBytes.setAccessible(true);
+            failMessages = Collections.emptyList();
+        } catch (Throwable e) { // To avoid jmh failure in case of incompatible JDK and/or inaccessible method
+            getBytes = null;
+            failMessages = Collections.singletonList(
+                    "Method getThreadAllocatedBytes is not found in ThreadMXBean: " + e
+                            .getMessage() + ", are you running HotSpot VM 1.6u26+?");
+        }
+        GET_THREAD_ALLOCATED_BYTES = getBytes;
+        FAIL_MSGS = failMessages;
+    }
+
+    @Override
+    public boolean checkSupport(List<String> msgs) {
+        msgs.addAll(FAIL_MSGS);
+        return FAIL_MSGS.isEmpty();
+    }
+
+    @Override
+    public String label() {
+        return "hs_alloc";
+    }
+
+    @Override
+    public String getDescription() {
+        return "Memory allocation profiler";
+    }
+
+    @Override
+    public void beforeIteration(BenchmarkParams benchmarkParams,
+                                IterationParams iterationParams) {
+        beforeAllocated = AllocationSnapshot.create();
+        beforeTime = System.nanoTime();
+    }
+
+    @Override
+    public Collection<? extends Result> afterIteration(
+            BenchmarkParams benchmarkParams, IterationParams iterationParams,
+            IterationResult result) {
+        long afterTime = System.nanoTime();
+        AllocationSnapshot newSnapshot = AllocationSnapshot.create();
+        if (beforeAllocated == AllocationSnapshot.EMPTY) {
+            // When allocation profiling fails, make sure it is distinguishable in report
+            // The assumption is the profiler always fails or it always works, so
+            // it won't be required to combine NaN and regular results.
+            return Collections.singleton(
+                    new ProfilerResult(Defaults.PREFIX + "gc.alloc.rate",
+                            Double.NaN,
+                            "MB/sec", AggregationPolicy.AVG));
+        }
+
+        long allocated = newSnapshot.subtract(beforeAllocated);
+        // When no allocations measured, we just ignore zero result lines to improve readability
+        if (allocated == 0) {
+            return Collections.emptyList();
+        }
+        ArrayList<Result> results = new ArrayList<Result>(2);
+        if (afterTime - beforeTime != 0) {
+            results.add(
+                    new ProfilerResult(Defaults.PREFIX + "gc.alloc.rate",
+                            1.0 * allocated / 1024 / 1024 * TimeUnit.SECONDS.toNanos(1) / (afterTime - beforeTime),
+                            "MB/sec", AggregationPolicy.AVG));
+        }
+        if (result.getMetadata().getAllOps() != 0) {
+            results.add(
+                    new ProfilerResult(Defaults.PREFIX + "gc.alloc.rate.norm",
+                            1.0 * allocated / result.getMetadata().getAllOps(),
+                            "B/op", AggregationPolicy.AVG));
+        }
+        return results;
+    }
+
+    private static class AllocationSnapshot {
+        private final static AllocationSnapshot EMPTY = new AllocationSnapshot(new long[0], new long[0]);
+        private final long[] threadIds;
+        private final long[] allocatedBytes;
+
+        private AllocationSnapshot(long[] threadIds, long[] allocatedBytes) {
+            this.threadIds = threadIds;
+            this.allocatedBytes = allocatedBytes;
+        }
+
+        /**
+         * Takes a snapshot of thread allocation counters.
+         * The method might allocate, however it is assumed that allocations made by "current thread" will
+         * be excluded from the result while performing {@link org.openjdk.jmh.profile.HotspotAllocationProfiler.AllocationSnapshot#subtract(AllocationSnapshot)}
+         *
+         * @return snapshot of thread allocation counters
+         */
+        public static AllocationSnapshot create() {
+            if (GET_THREAD_ALLOCATED_BYTES == null) {
+                return AllocationSnapshot.EMPTY;
+            }
+            try {
+                ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
+                long[] threadIds = threadMXBean.getAllThreadIds();
+                long[] allocatedBytes = (long[]) GET_THREAD_ALLOCATED_BYTES.invoke(threadMXBean, (Object) threadIds);
+                return new AllocationSnapshot(threadIds, allocatedBytes);
+            } catch (IllegalAccessException e) {
+                /* intentionally left blank */
+            } catch (InvocationTargetException e) {
+                /* intentionally left blank */
+            }
+            // In exceptional cases, assume information is not available
+            return AllocationSnapshot.EMPTY;
+        }
+
+        /**
+         * Estimates allocated bytes based on two snapshots.
+         * The problem is threads can come and go while performing the benchmark,
+         * thus we would miss allocations made in a thread that was created and died between the snapshots.
+         *
+         * <p>Current thread is intentionally excluded since it believed to execute jmh infrastructure code only.
+         *
+         * @return estimated number of allocated bytes between profiler calls
+         */
+        public long subtract(AllocationSnapshot other) {
+            HashMap<Long, Integer> prevIndex = new HashMap<Long, Integer>();
+            for (int i = 0; i < other.threadIds.length; i++) {
+                long id = other.threadIds[i];
+                prevIndex.put(id, i);
+            }
+            long currentThreadId = Thread.currentThread().getId();
+            long allocated = 0;
+            for (int i = 0; i < threadIds.length; i++) {
+                long id = threadIds[i];
+                if (id == currentThreadId) {
+                    continue;
+                }
+                allocated += allocatedBytes[i];
+                Integer prev = prevIndex.get(id);
+                if (prev != null) {
+                    allocated -= other.allocatedBytes[prev];
+                }
+            }
+            return allocated;
+        }
+    }
+}
diff -r 1898f77c754c -r e46b44c314d5 jmh-core/src/main/java/org/openjdk/jmh/profile/ProfilerFactory.java
--- a/jmh-core/src/main/java/org/openjdk/jmh/profile/ProfilerFactory.java	Tue Apr 07 13:13:18 2015 +0300
+++ b/jmh-core/src/main/java/org/openjdk/jmh/profile/ProfilerFactory.java	Sun Apr 12 16:34:47 2015 +0300
@@ -40,6 +40,7 @@
         profs.add(ClassloaderProfiler.class);
         profs.add(CompilerProfiler.class);
         profs.add(GCProfiler.class);
+        profs.add(HotspotAllocationProfiler.class);
         profs.add(HotspotClassloadingProfiler.class);
         profs.add(HotspotCompilationProfiler.class);
         profs.add(HotspotMemoryProfiler.class);


More information about the jmh-dev mailing list