Raw data output

Thu Jul 14 10:13:28 UTC 2016

Hi Petr,

On 07/13/2016 05:13 PM, Petr Stefan wrote:
> - Print sample time mode differently than other modes. It's a little
> inconsistent, but probably not too much. Sample time mode will be
> printed in a compact way, others will be printed in the current format.
> For this, we'd need to split the "rawData" key into two new keys like
> "rawDataSimple" and "rawDataCombined" and say that the actual data are
> in one of these keys, but not both (which one depends on the
> benchmarking mode).

I like this one. "rawData" and "rawDataHistogram" would seem to explain
the intent better. Users would definitely need extra steps to process
raw data, but raw data is not supposed to be convenient; it is supposed
to be complete.

> Another question - is it useful to have rawData values divided into
> multiple BenchmarkResult section? I'm talking about
> 
> "rawData" : [
>     [
>         942.0,
>         384.0
>     ],
>     [
>         351.0,
>         781.0
>     ],
>  ...
> It would be more compact and save disk space. Maybe it is useful for
> something, but I don't see it now. However, preserving this formatting
> is quite easy.

I think we introduced this nesting to give raw data finer granularity:

  [
    [
      run1-iteration1-data,
      run1-iteration2-data
    ],
    [
      run2-iteration1-data,
      run2-iteration2-data
    ]
  ]

This might help to analyze data progression from iteration to iteration
and from run to run. We should keep this intact.
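
To illustrate the kind of analysis the nesting allows, here is a rough
sketch that computes per-run and per-iteration means. It assumes each
iteration contributes a single number, as in the quoted sample, and
that every run has the same number of iterations. The class and method
names are made up for this example and are not part of JMH.

```java
import java.util.List;

public class RawDataProgression {

    // Mean of each run: one result per inner list.
    static double[] runMeans(List<List<Double>> rawData) {
        double[] means = new double[rawData.size()];
        for (int r = 0; r < rawData.size(); r++) {
            double sum = 0.0;
            for (double v : rawData.get(r)) {
                sum += v;
            }
            means[r] = sum / rawData.get(r).size();
        }
        return means;
    }

    // Mean of each iteration index across all runs.
    // Assumes every run has as many iterations as the first one.
    static double[] iterationMeans(List<List<Double>> rawData) {
        int iterations = rawData.get(0).size();
        double[] means = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            double sum = 0.0;
            for (List<Double> run : rawData) {
                sum += run.get(i);
            }
            means[i] = sum / rawData.size();
        }
        return means;
    }

    public static void main(String[] args) {
        // The two runs from the quoted sample output.
        List<List<Double>> rawData = List.of(
                List.of(942.0, 384.0),
                List.of(351.0, 781.0));

        System.out.println(java.util.Arrays.toString(runMeans(rawData)));
        System.out.println(java.util.Arrays.toString(iterationMeans(rawData)));
    }
}
```

Flattening the arrays into a single list would make the second
computation impossible, since the iteration index would be lost.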

>>   *) We would need to be able to trim/disable raw_data printing in JSON
>> output. The common way in JMH is to use system properties. Something
>> like boolean "jmh.json.rawData" would be fine.
> Default will be off?

Default will be on. This is a safeguard switch in case JMH accidentally
generates gigabytes of data and chokes the output.
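
A minimal sketch of such a switch, assuming the property name
"jmh.json.rawData" proposed above. The class is hypothetical and only
shows the default-on behavior; it is not the actual JMH code.

```java
public class RawDataSwitch {

    // Property name as proposed in this thread; not yet a real JMH option.
    static final String PROP = "jmh.json.rawData";

    // Returns true unless the property is explicitly set to something
    // other than "true" (case-insensitive), so the default is on.
    static boolean rawDataEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROP, "true"));
    }

    public static void main(String[] args) {
        System.out.println("rawData enabled: " + rawDataEnabled());
    }
}
```

Running with -Djmh.json.rawData=false would then turn raw data
printing off. Note that Boolean.getBoolean(PROP) would not work here,
because it treats a missing property as false.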

Thanks,
-Aleksey