Preliminary review: Adding tracing of I/O calls

Wed Nov 7 13:53:57 UTC 2012

Staffan,

When you say you removed all implementation from fileBeginRead, do you mean
you just return null instead of doing the ENABLED check? Does making
ENABLED private yield zero-cost? May give JIT more confidence that this
field isn't modified via reflection from outside.

The other option is to have two separate implementations of the callback
mechanism hidden inside IOTrace calls.  At static init time, you pick one
based on ENABLED.  If not enabled you use an empty method that just returns
null.  The JIT should handle this case well.

Sent from my phone
On Nov 7, 2012 7:56 AM, "Staffan Larsen" <staffan.larsen at oracle.com> wrote:

> An update on performance. I have written microbenchmarks for file and
> socket i/o and compared the results before my suggested changes and after.
>
> FileChannelRead       -4.06%
> FileChannelWrite      -1.06%
> FileInputStream        0.92%
> FileOutputStream       1.32%
> RandomAccessFileRead   1.66%
> RandomAccessFileWrite  0.76%
> SocketChannelReadWrite 0.75%
> SocketReadWrite        0.89%
>
> Negative values means that my changes added a regression. I think most of
> these values are within the margin of error in the measurements. The one
> exception is FileChannelRead. I've rerun this many times and it looks
> fairly consistent around a 4% regression. Why there is only a regression
> when reading from a FileChannel, I don't know.
>
> The 4% number is too high I think and as a result I'm looking at
> alternative implementations. As a first experiment I tried changing
> IoTrace.fileReadBegin/End into something like this:
>
> public static final boolean ENABLED =
> Boolean.getProperty("iotrace.enabled");
>
> public static IoTraceContext fileReadBegin(String path) {
>     if (ENABLED) {
>         ...
>     }
>     return null;
> }
>
> This got the regression down to 2%. Still not good.
>
> Removing all implementation from fileReadBegin/End gets me on par with the
> non-instrumented version.
>
> It looks like some form of dynamic class redefinition is need here. We
> could start out with empty implementations of the methods in IoTrace, and
> redefine the class to one that has implementations when tracing is enabled.
>
>
> /Staffan
>
> On 4 nov 2012, at 13:25, Aleksey Shipilev <aleksey.shipilev at oracle.com>
> wrote:
>
> > On 11/03/2012 01:15 AM, Mandy Chung wrote:
> >> On 11/2/2012 1:47 PM, Staffan Larsen wrote:
> >>> On 2 nov 2012, at 21:12, Mandy Chung<mandy.chung at oracle.com>  wrote:
> >>>
> >>>> The *Begin() methods return a "handle" that will be passed to the
> >>>> *End() methods.  Have you considered to define a type for it rather
> >>>> than Object?
> >>> Something like an empty interface, just to signal the intent?
> >>
> >> Yes I think so.  A marker interface would suffice.
> >>
> >>>> Do you have any performance measurement result that you can share?
> >>> I don't yet have any specific numbers - I'll try to get some. The
> >>> testing I have done indicates that the overhead is negligible. But it
> >>> would be good to confirm this.
> >>
> >> refworkload is easy to run. You probably have talked with the
> >> performance team. You can find out from them which benchmarks, if they
> >> have any, are applicable for IO instrumentation.
> >
> > Performance team here.
> >
> > There are virtually no benchmarks against I/O per se. Looking at the
> > patch, I would think anything doing intensive network I/O would help to
> > quantify the change, which boils down to SPECjbb2012, SPECjEnterprise,
> > and Volano. First two would be hard to run, and the third has terrible
> > run-to-run variance, so you will probably have to quantify the changes
> > with microbenchmarks. It should be easy enough to construct with our
> > micro harness (still not available in OpenJDK). Contact me internally if
> > you want to get that route.
> >
> > General patch review: I do have the preoccupation against interferenced
> > tracing code, and while appreciating the intent for tracing patch, we
> > need to look for performance penalties. The rule of thumb is that
> > HotSpot will optimize away the code guarded by static final flag (or, as
> > Remi points out with jsr292 magic). Doing the calls hoping for HotSpot
> > to inline and figure out the absence of useful work there is not working
> > reliably either. Inline conditionals will cost something if the tracing
> > method is not inlined.
> >
> > Hence, I would rather recommend to switch the uses like this:
> >
> >    public int read() throws IOException {
> >        Object traceHandle = IoTrace.fileReadBegin(path);
> >        int b = read0();
> >        IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1);
> >        return b;
> >    }
> >
> > ...to something more like:
> >
> >    public int read() throws IOException {
> >        if (IoTrace.ENABLED) {
> >            Object traceHandle = IoTrace.fileReadBegin(path);
> >            int b = read0();
> >            IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1);
> >            return b;
> >        } else {
> >            return read0();
> >        }
> >    }
> >
> > where
> >
> >  class IoTrace {
> >     public static final boolean ENABLED =
> > System.getProperty("java.io.trace");
> >  }
> >
> > ...which will demote the flexibility of setListener(), but have much
> > lower runtime overhead. This should be confirmed by microbenchmarking
> > anyway.
> >
> > -Aleksey.
>
>