RFR: 8333382: [s390x] Enhance popcnt Instruction to use Z15 facilities [v4]
Amit Kumar
amitkumar at openjdk.org
Mon Jun 10 15:47:13 UTC 2024
On Mon, 10 Jun 2024 15:33:32 GMT, Andrew Haley <aph at openjdk.org> wrote:
>>> > anywhere else must provide a register to use or have some sort of conditionals at the point pop_count_int is used.
>>>
>>> So for machines `<=Z14` we will never go in the `if` block; And the moment we enter in `else` block there is check, which confirms `r_tmp` shouldn't be `noreg` which will force author to pass a temporary register. Isn't it ?
>>
>> The point is to avoid the failure where testing is done on Z15 and up, but some customers have earlier machines. A maintenance programmer uses popcount and all the testing on Z15 passes. The first person to see the failure is the customer. We're setting a trap for maintainers.
>>
>> This is not a complaint about the ad file use case. It's about anywhere else that popcount is used in hand-written assembly code. I'm telling you this from painful experience on AArch64.
>
> It'd be safer to break `pop_count_long` into two parts, for new and old versions, then use the two parts explicitly in s390x.ad patterns. Say, like this:
>
>
> instruct popCountI_Ext3(iRegI dst, iRegI src, flagsReg cr) %{
> match(Set dst (PopCountI src));
> effect(TEMP_DEF dst, KILL cr);
> predicate(UsePopCountInstruction &&
> VM_Version::has_PopCount() &&
> VM_Version::has_MiscInstrExt3());
> ins_cost(DEFAULT_COST);
> size(8); // popcnt + llgfr
> format %{ "POPCNT $dst,$src\t # pop count int" %}
> ins_encode %{
> Register Rdst = $dst$$Register;
> Register Rsrc = $src$$Register;
>
> __ pop_count_int_post_z15(Rdst, Rsrc);
>
> %}
> ins_pipe(pipe_class_dummy);
> %}
>
>
>
>
> void MacroAssembler::pop_count_int(Register r_dst, Register r_src, Register r_tmp) {
> assert(r_tmp != noreg, "temp register required for popcnt, for machines < z15");
> assert_different_registers(r_dst, r_tmp); // if r_src is same as r_tmp, it should be fine
>
> if (VM_Version::has_MiscInstrExt3()) {
> pop_count_int_post_z15(r_dst, r_src);
> } else {
> pop_count_int_pre_z15(r_dst, r_src, r_tmp);
> }
> }
Okay I'm fine with your suggestion. But I only hope that nobody directly uses `z_popcnt` instruction. Otherwise this solution is not gonna prevent the bug you described. Maybe I should put a comment in `assembler_s390.hpp` file as well, to have a look at the helper method present in `macroAssembler_s390.hpp` file instead of using it vanilla.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19509#discussion_r1633471193
More information about the hotspot-dev
mailing list