RFR(L): 8031581: PPC64: Addons and fixes for AIX to pass the jdk regression tests
Volker Simonis
volker.simonis at gmail.com
Mon Jan 20 09:59:13 UTC 2014
On Fri, Jan 17, 2014 at 10:15 PM, Volker Simonis
<volker.simonis at gmail.com> wrote:
> On Tue, Jan 14, 2014 at 10:19 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>> On 14/01/2014 08:40, Volker Simonis wrote:
>>>
>>> Hi,
>>>
>>> could you please review the following changes for the ppc-aix-port
>>> stage/stage-9 repositories (the changes are planned for integration into
>>> ppc-aix-port/stage-9 and subsequent backporting to ppc-aix-port/stage):
>>
>> I'd like to review this but I won't have time until later in the week. From
>> an initial look, there are a few things that are not pretty (the changes to
>> fix the AIX problems with I/O cancellation in particular) and I suspect that
>> some refactoring is going to be required to handle some of this cleanly. A
>> minor comment is that the bug synopsis doesn't really communicate what these
>> changes are about.
>>
>> -Alan.
>
> Just forwarded the following message from another thread here where it belongs:
>
> On 17/01/2014 16:57, Alan Bateman wrote:
>
> I've finally got to this one. As the event translation issue is now a
> separate issue then I've ignored that part.
>
> I'm not comfortable with the changes to FileDispatcherImpl.c as I
> don't think we should be calling into IO_* or NET_* functions here.
> I think I get the issue that you have on AIX (and assume it's the
> preClose/dup2 that blocks rather than close) but need a bit of time to
> suggest alternatives. It may be that it will require an AIX specific
> SocketDispatcher. Do you happen to know which tests fail due to this
> part?
>
> The other changes look okay. There is a typo in the change to
> zip_util.c, s/legel/legal/.
>
> In DatagramChannelImpl.c you handle connect failing with
> EAFNOSUPPORT. I would be tempted to replace the comment to say that
> EAFNOSUPPORT can be ignored on AIX. A minor comment, but the
> indentation for rv = errno can be fixed (I see the BSD code has it
> wrong too).
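
To illustrate the EAFNOSUPPORT point above, here is a minimal sketch. It assumes the connect call in question is the one that dissolves a datagram socket's association by connecting to an AF_UNSPEC address; the function names are hypothetical and this is not the code from the webrev.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the JDK function): dissolve the association of
 * a datagram socket by connecting it to an AF_UNSPEC address.
 * Returns 0 on success, otherwise the errno value of the failed connect.
 */
int disconnect_datagram_socket(int fd) {
    struct sockaddr sa;
    int rv;

    memset(&sa, 0, sizeof(sa));
    sa.sa_family = AF_UNSPEC;

    rv = connect(fd, &sa, sizeof(sa));
    /* Assumption of this sketch: on some platforms the call reports
     * EAFNOSUPPORT although the association was dissolved, so that
     * particular error is ignored. */
    if (rv < 0 && errno == EAFNOSUPPORT) {
        rv = 0;
    }
    return (rv < 0) ? errno : 0;
}

/* Small driver: create a UDP socket, disconnect it, report the result. */
int demo_disconnect(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int rv;

    if (fd < 0) {
        return errno;
    }
    rv = disconnect_datagram_socket(fd);
    close(fd);
    return rv;
}
```

Capturing errno immediately after the failing connect keeps the reported error from being overwritten by later calls such as close.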
> On 17/01/2014 21:23, Volker Simonis wrote:
>
> > You're right, one race is with preClose/dup2 but also with other calls
> > like read/fcntl/...
> >
> > There were several tests that failed and once I fixed it they all
> > succeeded. But I can recreate some of the failures for you. The
> > symptoms are always the same: the VM is locked. If you trigger a stack
> > trace you can see that at least one thread is blocked in an I/O
> > operation on a file descriptor like fcntl (e.g. for file locking),
> > read, etc. while another thread is trying to close that socket.
> >
>
> As it happens, we have some carry over issues from the Mac port,
> one of which is that async close of FileChannels will block
> indefinitely in dup2 when there is another thread blocked (on
> fcntl or reading from a pipe ...). I haven't had time to work on
> it but this discussion has reminded me that we need to sort it
> out. I've put a preliminary webrev with the changes here:
>
> http://cr.openjdk.java.net/~alanb/7133499/webrev/
>
> The important part is that it's using signal consistently on
> Linux/Solaris/OSX so that any blocked threads are interrupted. My
> guess is that if NativeThread.c is updated to define a signal on
> AIX then this should resolve some of the issues on AIX.
>
> I would like to see the list of tests failing. If there is an
> issue with dup2 with sockets (and OS X doesn't seem to have that
> issue) then it will require further work but I would at least
> like to start by understanding if this patch will help with the
> FileChannel issues.
Hi Alan,

yes, that's interesting. It sounds like a very similar problem to the
one on Mac.

I would suggest the following:

I'll cut the "Async Close AIX FIX" stuff out of this change (i.e.
"8031581: PPC64: Addons and fixes for AIX to pass the jdk regression
tests") and send out a new webrev for the remaining part. I think that
the remaining part was more or less reviewed and we can then push it
faster.

In the meantime, I'll recheck exactly which tests fail without my
"Async Close AIX FIX" stuff and which of these tests will be
fixed by your 7133499 webrev. Maybe we can really get through with it,
or with it and a few enhancements. I'll let you know my results later
today. By the way, my webrev already contained an AixNativeThread.c
implementation in src/aix/native/sun/nio/ch.

The only remaining problem I see with this approach is that we would
need to backport your 7133499 change to 8u-dev in the 8u20 time frame
to make our AIX port work. Would this be OK for you?

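
For reference, the signal-based wakeup that the quoted 7133499 discussion relies on can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either webrev: the choice of SIGUSR1 and all function and struct names are hypothetical. The idea is that a handler installed without SA_RESTART makes a blocked read() return -1 with EINTR when the thread is signalled.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical signal choice for this sketch only. */
#define WAKEUP_SIGNAL SIGUSR1

/* The handler does nothing; its only purpose is to interrupt the syscall. */
static void wakeup_handler(int sig) { (void)sig; }

static int install_wakeup_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART, so blocked calls fail with EINTR */
    return sigaction(WAKEUP_SIGNAL, &sa, NULL);
}

struct reader_state {
    int fd;
    ssize_t result;
    int saved_errno;
    atomic_int done;
};

/* Blocks in read() on an empty pipe until it is interrupted. */
static void *blocked_reader(void *arg) {
    struct reader_state *st = arg;
    char buf[16];
    st->result = read(st->fd, buf, sizeof(buf));
    st->saved_errno = errno;
    atomic_store(&st->done, 1);
    return NULL;
}

/* Returns the errno seen by the reader, 0 if read() did not fail,
 * or -1 if the setup itself failed. */
int demo_interrupt_blocked_read(void) {
    int fds[2];
    pthread_t reader;
    struct reader_state st;
    struct timespec delay = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    if (install_wakeup_handler() != 0 || pipe(fds) != 0) {
        return -1;
    }
    st.fd = fds[0];
    st.result = 0;
    st.saved_errno = 0;
    atomic_init(&st.done, 0);
    if (pthread_create(&reader, NULL, blocked_reader, &st) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* Signal repeatedly: a signal delivered before the reader has
     * entered read() would otherwise be lost. */
    while (!atomic_load(&st.done)) {
        pthread_kill(reader, WAKEUP_SIGNAL);
        nanosleep(&delay, NULL);
    }
    pthread_join(reader, NULL);
    close(fds[0]);
    close(fds[1]);
    return (st.result == -1) ? st.saved_errno : 0;
}
```

The sketch signals in a loop because a single pthread_kill can arrive before the target thread is actually blocked. A closing thread would presumably signal the blocked threads first and only then call dup2/close on the descriptor.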
Regards,
Volker
More information about the core-libs-dev
mailing list