[9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Jan 27 16:05:19 UTC 2015
Thanks for the feedback, John!
Updated webrev:
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot
Changes:
- renamed MHI::profileBranch to MHI::profileBoolean, and
ProfileBranchNode to ProfileBooleanNode;
- restructured profile layout ([0] => false_cnt, [1] => true_cnt)
- factored out profile injection in a separate function
(has_injected_profile() in parse2.cpp)
- ProfileBooleanNode stores true/false counts instead of
taken/not_taken counts
- matching from value counts to taken/not_taken happens in
has_injected_profile();
- added BoolTest::ne support
- sharpened test for AndI case: now it checks AndI (ProfileBoolean)
(ConI 1) shape
Best regards,
Vladimir Ivanov
On 1/27/15 3:04 AM, John Rose wrote:
> On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov
> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>>
>> What do you think about the following version?
>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02
>>
>> As you suggested, I reified MHI::profileBranch on LambdaForm level and
>> removed @LambdaForm.Shared. My main concern about removing @Sharen was
>> that profile pollution can affect the code before profileBranch call
>> (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is
>> sensitive to that change (there's a 10% difference in peak performance
>> between @Shared and has_injected_profile()).
>>
>> I can leave @Shared as is for now or remove it and work on the fix to
>> the deoptimization counts pollution. What do you prefer?
>
> Generic advice here: It's better to leave it out, if in doubt. If it
> has a real benefit, and we don't have time to make it clean, put it in
> and file a tracking bug to clean it up.
>
> I re-read the change. It's simpler and more coherent now.
>
> I see one more issue which we should fix now, while we can. It's the
> sort of thing which is hard to clean up later.
>
> The two fields of the profileBranch array have obscure and inconsistent
> labelings. It took me some hard thought and the inspection of three
> files to decide what "taken" and "not taken" mean in the C2 code that
> injects the profile. The problem is that, when you look at
> profileBranch, all you see is an integer (boolean) argument and an
> array, and no clear indication about which array element corresponds to
> which argument value. It's made worse by the fact that "taken" and "not
> taken" are not mentioned at all in the JDK code, which instead wires
> together the branches of selectAlternative without much comment.
>
> My preferred formulation, for making things clearer: Decouple the idea
> of branching from the idea of profile injection. Name the intrinsic
> (yes, one more bikeshed color) "profileBoolean" (or even
> "injectBooleanProfile"), and use the natural indexing of the array: 0
> (Java false) is a[0], and 1 (Java true) is a[1]. We might later extend
> this to work with "booleans" (more generally, small-integer flags), of
> more than two possible values, klasses, etc.
>
> This line then goes away, and 'result' is used directly as the profile
> index:
> + int idx = result ? 0 : 1;
>
> The ProfileBooleanNode should have an embedded (or simply indirect)
> array of ints which is a simple copy of the profile array, so there's no
> doubt about which count is which.
>
> The parsing of the predicate that contains "profileBoolean" should
> probably be more robust, at least allowing for 'eq' and 'ne' versions of
> the test. (C2 freely flips comparison senses, in various places.) The
> check for Op_AndI must be more precise; make sure n->in(2) is a constant
> of the expected value (1). The most robust way to handle it (but try
> this another time, I think) would be to make two temp copies of the
> predicate, substituting the occurrence of ProfileBoolean with '0' and
> '1', respectively; if they both fold to '0' and '1' or '1' and '0', then
> you take the indicated action.
>
> I suggest putting the new code in Parse::dynamic_branch_prediction,
> which pattern-matches for injected profiles, into its own subroutine.
> Maybe:
> bool use_mdo = true;
> if (has_injected_profile(btest, test, &taken, ¬_taken)) {
> use_mdo = false;
> }
> if (use_mdo) { ... old code
>
> I see why you used the opposite order in the existing code: It mirrors
> the order of the second and third arguments to selectAlternative. But
> the JVM knows nothing about selectAlternative, so it's just confusing
> when reading the VM code to know which profile array element means what.
>
> — John
>
> P.S. Long experience with byte-order bugs in HotSpot convinces me that
> if you are not scrupulously clear in your terms, when working with equal
> and opposite configuration pairs, you will have a long bug tail,
> especially if you have to maintain agreement about the configurations
> through many layers of software. This is one of those cases. The best
> chance to fix such bugs is not to allow them in the first place. In the
> case of byte-order, we have "first" vs. "second", "MSB" vs. "LSB", and
> "high" vs. "low" parts of values, for values in memory and in registers,
> and all possible misunderstandings about them and their relation have
> probably happened and caused bugs.
More information about the mlvm-dev
mailing list