RFR: 8263718: unused-result warning happens at os_linux.cpp
David Holmes
david.holmes at oracle.com
Sun Mar 21 13:25:38 UTC 2021
Hi Thomas,
On 21/03/2021 3:57 pm, Thomas Stüfe wrote:
> On Sun, Mar 21, 2021 at 6:08 AM David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> On 20/03/2021 10:37 pm, Yasumasa Suenaga wrote:
> > On Sat, 20 Mar 2021 10:20:50 GMT, Magnus Ihse Bursie
> <ihse at openjdk.org <mailto:ihse at openjdk.org>> wrote:
> >
> >>>> I debug'ed fastdebug JDK on Visual Studio, it seems to work
> `alloca()`:
> >>>
> >>> I used cl.exe from Visual Studio 2019 (16.9.1). The generated
> code might be different by compiler version of course.
> >>
> >> @YaSuenag The code generated by debug builds is often
> significantly different from release builds. You might want to check
> the latter instead to figure out if this has any effect on what we
> actually ship.
> >
> > @magicus I checked assembly code of release build, it is similar
> to fastdebug build. The stack (RSP) is expanded.
>
> Thanks for confirming - I was going to make the same point about
> needing
> to check product build.
>
> Now we need someone who can check clang output.
>
> The rough benchmarking I did did not show any benefit to using the
> alloca on Linux. So I think we can safely say it can go on Linux.
>
> The benchmarking on Windows, with alloca removed, did not show any
> degradation in performance. So I think we can asafely say it is okay to
> remove on Windows too.
>
> The benchmarking on macOS show no degradation either, but until I know
> whether the alloca was being elided by clang that may not mean
> anything.
> If it is being elided we need to restore it and then check performance
> again. But I need to know the trick that worked on gcc will also work
> for clang.
>
>
> Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure
> we measure the right thing? I wish there were regression tests telling
> us when to re-apply this optimization.
No, no, no, and me too. It is really frustrating to have an optimization
like this put in place and absolutely zero information as to how the
purported benefit of it was measured and under what conditions.
> I dislike that this leaves us at the mercy of the underlying libc for
> something which is reasonably cheap and simple to do (just one alloca).
Not sure what you mean. We have no idea if the alloca is helpful or
harmful nor whether that depends on whether libc does, or does not, do
something "clever" itself, nor whether the OS (eg. via ASLR) does, or
does not do, something "clever" itself.
And even if we do want the alloca we have to fight the compiler to
ensure it gets used as intended!
Cheers,
David
> ..Thomas
>
> (Please leave the alloca in the AIX implementation; we currently don't
> have the cycles to run regression tests for this)
>
> -----
>
> > (I confirmed it with Visual Studio 16.9.2 because I received
> update notification before your reply...)
> > // Try to randomize the cache line index of hot stack frames.
> > // This helps when threads of the same stack traces evict each
> other's
> > // cache lines. The threads can be either from the same JVM
> instance, or
> > // from different JVM instances. The benefit is especially
> true for
> > // processors with hyperthreading technology.
> > static int counter = 0;
> > int pid = os::current_process_id();
> > 00007FF80F4E10ED mov eax,dword ptr [_initial_pid
> (07FF80F9D8164h)]
> > 00007FF80F4E10F3 test eax,eax
> > 00007FF80F4E10F5 jne thread_native_entry+3Dh
> (07FF80F4E10FDh)
> > 00007FF80F4E10F7 call qword ptr [__imp__getpid
> (07FF80F6EF748h)]
> > _alloca(((pid ^ counter++) & 7) * 128);
> > 00007FF80F4E10FD mov ecx,dword ptr [counter
> (07FF80F9D8300h)]
> > 00007FF80F4E1103 mov edx,ecx
> > 00007FF80F4E1105 xor edx,eax
> > 00007FF80F4E1107 inc ecx
> > 00007FF80F4E1109 mov dword ptr [counter
> (07FF80F9D8300h)],ecx
> > 00007FF80F4E110F and edx,7
> > 00007FF80F4E1112 shl edx,7
> > 00007FF80F4E1115 mov eax,edx
> > 00007FF80F4E1117 lea rcx,[rdx+0Fh]
> > 00007FF80F4E111B cmp rcx,rax
> > 00007FF80F4E111E ja thread_native_entry+6Ah
> (07FF80F4E112Ah)
> > 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h
> > 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h
> > 00007FF80F4E112E mov rax,rcx
> > 00007FF80F4E1131 call __chkstk (07FF80F6ED370h)
> > 00007FF80F4E1136 sub rsp,rcx
> >
> > -------------
> >
> > PR: https://git.openjdk.java.net/jdk/pull/3042
> <https://git.openjdk.java.net/jdk/pull/3042>
> >
>
More information about the hotspot-dev
mailing list