RFR(M): 8007294: ReduceFieldZeroing doesn't check for dependent load and can lead to incorrect execution
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Feb 15 15:22:38 PST 2013
Sorry, I missed this.
In memnode.cpp, could you explain how old_mem->outcnt() == 0 can be true
in the new check when it still has a user (this node): Node *mem =
in(MemNode::Memory)?
In the new code in can_capture_store():
Move the "(n == st)" check before the control edge check, which is more
expensive.
Don't check (n->as_MergeMem()->memory_at(alias_idx) == m), because m
could be the IdxBot slice and a later load could still reference the
alias_idx slice through it. Just push the uses of all MergeMem nodes.
I think we should be more conservative with StrIntrinsic: you can't
rely on the address type, because these intrinsics have 2 inputs and the
intrinsic may read before the store, since one of the inputs has a path
to the store's memory.
Your alias_idx comparison for memory nodes will not work for (EA)
instance types, since their alias idx will be different from the general
alias idx.
You are also making the decision (failed = false) in the parser phase,
when the load has not been generated yet, so you may miss the case.
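
To make the can_capture_store() comments above concrete, here is a
minimal, self-contained sketch of the walk I have in mind. It is not
HotSpot code: ToyNode, Kind and no_load_before_store() are hypothetical
names invented for illustration, and a plain integer stands in for the
alias index. It only shows the shape of the check: test (n == st) first
because it is cheapest, push the uses of every MergeMem without
comparing slices, and give up on anything the walk does not understand.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for C2 IR nodes (NOT HotSpot's Node).
enum class Kind { Init, Store, Load, MergeMem, Other };

struct ToyNode {
    Kind kind;
    int alias_idx;                    // toy alias class of the accessed slice
    std::vector<ToyNode*> mem_uses;   // nodes consuming this node's memory state
};

// Returns true if no load of `alias_idx` is reachable from `init_mem`
// through memory edges other than via the candidate store `st`.
bool no_load_before_store(ToyNode* init_mem, ToyNode* st, int alias_idx) {
    std::vector<ToyNode*> worklist{init_mem};
    std::unordered_set<ToyNode*> visited{init_mem};
    while (!worklist.empty()) {
        ToyNode* m = worklist.back();
        worklist.pop_back();
        for (ToyNode* n : m->mem_uses) {
            if (n == st) continue;    // cheapest test first: the store itself
            switch (n->kind) {
            case Kind::Load:
                // A load of the same slice may observe the zeroed field.
                if (n->alias_idx == alias_idx) return false;
                break;
            case Kind::MergeMem:      // conservative: follow every MergeMem,
            case Kind::Store:         // whatever slice it carries
                if (visited.insert(n).second) worklist.push_back(n);
                break;
            default:
                // Unknown memory user (an intrinsic, say): assume it reads.
                return false;
            }
        }
    }
    return true;
}
```

In this toy model a load hidden behind a MergeMem is still found, which
is the case the memory_at(alias_idx) comparison could miss. The toy does
not model EA instance types at all, so the plain alias_idx comparison
here has the same weakness noted above.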
In phaseX.cpp the new check "} else if (ReduceFieldZeroing" could be
skipped by the preceding code if there is only one additional user (Store):
} else if (in->outcnt() == 1 &&
in->has_special_unique_user()) {
_worklist.push(in->unique_out());
Use an explicit check (in->in(0) != NULL).
Also, I think you need to add a check (i == MemNode::Memory) so that this
is done only for loads which depend on this Initialize node and memory
projection.
Regards,
Vladimir
On 2/12/13 1:36 PM, Roland Westrelin wrote:
> InitializeNode::can_capture_store() must check that the captured store doesn't overwrite a memory location that is loaded before the store.
>
> http://cr.openjdk.java.net/~roland/8007294/webrev.00/
>
> And I used the following code to check that all stores captured by the existing code are still captured with the new code (with CTW, the regression tests, vm.jit.testlist, vm.regression.testlist, nsk.regression.testlist, nsk.stress.testlist, nsk.monitoring.testlist and reference_server):
> http://cr.openjdk.java.net/~roland/8007294/webrev.test/
>
> Below are reference_server perf results.
>
> Roland.
>
> ============================================================================
> tserver-optim-2: reference_server
> Benchmark         Samples       Mean  Stdev
> jetstream              10      84.09   0.02
> Copy                   10     991.60   0.08
> Parse                  10      41.60   0.03
> Read                   10      15.70   0.03
> Write                  10     310.40   0.02
> scimark                10     457.48   0.01
> Sparse                 10     217.88   0.03
> LU                     10     702.49   0.01
> SOR                    10     761.56   0.00
> FFT                    10      32.38   0.00
> Monte                  10     573.08   0.02
> specjbb2000            10  391197.66   0.01
> First_Warehouse        10   60667.47   0.01
> Last_Warehouse         10  391197.65   0.01
> specjbb2005            10  210427.78   0.01
> peak                   10  219131.96   0.02
> peak_warehouse         10       4.40   0.29
> last                   10  210427.79   0.01
> interval_average       10   12378.00   0.01
> first                  10   31738.81   0.01
> overall_average        10  174274.66   0.01
> last_warehouse         10       8.00   0.00
> specjvm98              10     593.08   0.01
> javac                  10     425.92   0.01
> db                     10     423.57   0.03
> jess                   10     684.69   0.10
> jack                   10     493.38   0.01
> compress               10     500.68   0.01
> mtrt                   10    1005.75   0.16
> mpegaudio              10     854.65   0.01
> volano25               10  160669.20   0.02
> connections            10     400.00   0.00
> time                   10       4.98   0.02
> --------------------------------------------------------------------------
> Weighted Geomean            16531.66
> ============================================================================
> tserver-optim-3: reference_server
> Benchmark         Samples       Mean  Stdev   %Diff      P  Significant
> jetstream              10      84.85   0.02    0.90  0.383  *
> Copy                   10     947.10   0.07    4.49  0.192  *
> Parse                  10      41.80   0.02   -0.48  0.641  *
> Read                   10      15.90   0.04   -1.27  0.408  *
> Write                  10     307.70   0.01    0.87  0.281  *
> scimark                10     460.32   0.00    0.62  0.048  *
> Sparse                 10     220.68   0.02    1.28  0.216  *
> LU                     10     703.39   0.01    0.13  0.711  *
> SOR                    10     761.82   0.00    0.03  0.682  *
> FFT                    10      32.41   0.00    0.09  0.369  *
> Monte                  10     583.32   0.01    1.79  0.036  *
> specjbb2000            10  392140.38   0.01    0.24  0.582  *
> First_Warehouse        10   60657.19   0.01   -0.02  0.950  *
> Last_Warehouse         10  392140.36   0.01    0.24  0.582  *
> specjbb2005            10  209739.73   0.01   -0.33  0.549  *
> peak                   10  220593.43   0.02    0.67  0.374  *
> peak_warehouse         10       4.10   0.08    6.82  0.483  *
> last                   10  209739.73   0.01   -0.33  0.549  *
> interval_average       10   12337.50   0.01   -0.33  0.549  *
> first                  10   31930.08   0.01    0.60  0.098  *
> overall_average        10  175507.96   0.01    0.71  0.040  *
> last_warehouse         10       8.00   0.00   -0.00  0.000  *
> specjvm98              10     588.83   0.01   -0.72  0.084  *
> javac                  10     423.75   0.01   -0.51  0.312  *
> db                     10     419.99   0.03   -0.84  0.561  *
> jess                   10     740.19   0.01    8.11  0.030  *
> jack                   10     497.94   0.01    0.92  0.038  *
> compress               10     499.48   0.00   -0.24  0.378  *
> mtrt                   10     877.78   0.01  -12.72  0.034  *
> mpegaudio              10     853.79   0.01   -0.10  0.676  *
> volano25               10  159542.09   0.02   -0.70  0.403  *
> connections            10     400.00   0.00    0.00  0.000  *
> time                   10       5.02   0.02   -0.70  0.413  *
> --------------------------------------------------------------------------
> Weighted Geomean            16513.23            -0.11
> ============================================================================
>