Code Review Request 7142596: RMI JPRT tests are failing

Tue Jul 17 21:18:37 UTC 2012

Pushed:

http://hg.openjdk.java.net/jdk8/tl/jdk/rev/b6f78869c66d

Darryl, please update the bug report.

Thanks,

s'marks

On 7/17/12 9:08 AM, Stuart Marks wrote:
> OK! Looks good!
>
> I guess I have to push this for you. Can you commit the changeset in your repo,
> with proper commit comments etc., and send me the exported changeset? (Directly
> to me is fine, it doesn't need to be on the open list.) I'll push it later today.
>
> Thanks.
>
> s'marks
>
>
> On 7/16/12 3:54 PM, Darryl Mocek wrote:
>> Here's webrev.05 (http://cr.openjdk.java.net/~dmocek/7142596/webrev.05).
>> Changes:
>>
>> - Simplify createRegistryOnUnusedPort()
>> - Simplify getUnusedRandomPort()
>> - Add isReservedPort() method to determine if a port is a reserved port # (not
>> just one of the FIXED_PORT's, but also any port which shouldn't be used).
>>
>> Passes all tests.
>>
>> Darryl
>>
>> On 07/16/2012 08:42 AM, Stuart Marks wrote:
>>> Hi Darryl,
>>>
>>> Sorry, I'm going to have to ask you to update this again. The issues are in
>>> TestLibrary.java and are all related to catching exceptions and throwing new
>>> exception instances, discarding the original exception (and probably relevant
>>> diagnostic information) in the process.
>>>
>>> Let's get together today and work on it. I expect it will take only half an
>>> hour or so to knock things out.
>>>
>>> s'marks
>>>
>>>
>>>
>>>
>>> On 7/13/12 1:40 PM, Darryl Mocek wrote:
>>>>
>>>> On 07/12/2012 05:13 PM, Stuart Marks wrote:
>>>>> OK, I took a look at the revised webrev. Most stuff is good. A couple changes
>>>>> are necessary though.
>>>>>
>>>>>
>>>>> *** MultipleRegistries.java
>>>>>
>>>>>
>>>>> Not a big deal, but the comment at lines 65-67 is no longer necessary.
>>>> Comment removed.
>>>>>
>>>>>
>>>>> *** TestLibrary.java
>>>>>
>>>>>
>>>>> I think the reserved port range stuff still needs to be fixed up.
>>>>>
>>>>> The comment at lines 81-82 talks about a range 64000-64100 which isn't really
>>>>> present anywhere. The comment should instead say that PORT_MIN and PORT_MAX
>>>>> should be kept adjusted so that they specify the range of ports defined by
>>>>> the constants following. Make sure to do this adjustment; PORT_MAX is 64002
>>>>> here.
>>>> I reserved ports 64000-64100 (even though only 64001-64005 are currently being
>>>> used) for fixed port tests (for tests which require fixed ports) so any future
>>>> tests which require fixed ports can be assured there are ports available
>>>> within
>>>> a range for their use. In fixing the tests, I saw that those who wrote tests
>>>> and used fixed ports grabbed random port numbers all over the place. Providing
>>>> a reserved range allows them to just grab the next number and to keep the
>>>> fixed
>>>> port numbers in a group, even when new tests are added. If the ports are
>>>> defined in a central place, like TestLibrary, then hopefully developers will
>>>> come to TestLibrary to reserve the ports rather then define them in the tests
>>>> themselves, like was done previously. I chose 100 ports for
>>>> "future-proofing"...I don't think 100 is too many, but I've reduced it to 10
>>>> and lowered the max to 64010. I was also thinking of looking for tests which
>>>> used fixed ports outside of RMI tests and standardizing those ports here, but
>>>> that's possible future work and I'm not sure it's necessary.
>>>>>
>>>>> The previous webrev had PORT_MIN and PORT_MAX specify a range of 1024-64000
>>>>> as the *allowed* range for ports returned by getUnusedRandomPort(). But as I
>>>>> described previously, this is difficult to enforce given the varying port
>>>>> allocation schemes on different OSes.
>>>>>
>>>>> Instead I think we need to invert the sense of the range PORT_MIN..PORT_MAX
>>>>> and make it be the range of ports reserved by tests that require fixed ports
>>>>> (64001 through 64005). Then, getUnusedRandomPort() needs to be changed to
>>>>> *disallow* ports in this range, that is, to retry if the port is within this
>>>>> range. Thus the condition on the while-loop has to be inverted, something
>>>>> like
>>>>>
>>>>> while (... unusedRandomPort >= PORT_MIN && unusedRandomPort <= PORT_MAX)
>>>> Yes, this is correct and it has been changed.
>>>>>
>>>>> So, how could getUnusedRandomPort() have worked in its present state? Well I
>>>>> think it ends up retrying 10 times, then at the end of the loop
>>>>> unusedRandomPort is actually some legal port number -- albeit possibly one
>>>>> within the disallowed range -- and almost certainly not -1, and so the method
>>>>> returns it and the caller uses it. So, the success-testing logic here will
>>>>> also have to change so that it retests whether the port is in the disallowed
>>>>> range.
>>>> Correct...done.
>>>>>
>>>>> The comment on getUnusedRandomPort() also needs to be updated to reflect its
>>>>> new policy on the reserved range, as well as throwing an exception on
>>>>> failure. Also (nitpick) it should say "less than" not "less then".
>>>> Yes, I always do this. :(
>>>>>
>>>>> Looking at getUnusedRandomPort() again more closely [sorry] I think the catch
>>>>> of Exception that does nothing is suspicious. I'm not sure why getting a
>>>>> ServerSocket would fail, but if it does (maybe the system is out of ports??)
>>>>> we should just throw out to the caller. Perhaps ideally we'd just have this
>>>>> method throw IOException, but if that requires all callers to be updated, it
>>>>> might be easier just to catch IOException and throw a RuntimeException that
>>>>> wraps the caught IOException.
>>>> I see no harm in trying again if it fails. If we have run out of ports, then
>>>> possibly one or more used ports will be closed and freed and a subsequent try
>>>> will succeed. I don't know all of the cases where a failure might occur.
>>>> Anyway, if this is the prevailing way of handling this, I'll change it.
>>>>>
>>>>> Similar comments apply to the catch(Exception)/do-nothing code in the other
>>>>> utility methods.
>>>> Now throwing RuntimeException in getRegistryPort.
>>>>>
>>>>> Certainly getRegistryPort() should just throw (or possibly wrap and rethrow)
>>>>> any exception caught.
>>>>>
>>>>> For createRegistryOnUnusedPort(), the catching of ExportException is handled
>>>>> properly. The second catch clause of the outer try-block, and the catch of
>>>>> the inner try-block, both ignore exceptions. The code will then end up
>>>>> retrying. Is it reasonable that retrying in these cases will result in a
>>>>> different outcome? My guess is that it's more likely that something is
>>>>> seriously wrong that will cause all the retries to fail, in which case this
>>>>> method will discard the exceptions it has just caught and throw a *new*
>>>>> RemoteException instance.
>>>> This was a holdover from webrev.00 which should have been removed. I agree
>>>> with
>>>> Alan and you here that multiple tries of LocateRegistry.createRegistry(0) is
>>>> unnecessary. I've modified this to make a single attempt at using
>>>> LocateRegistry.createRegistry(0) and if it fails, make a single attempt at
>>>> getting an unused random port and creating a registry on this port, then
>>>> failing with a RuntimeException.
>>>>>
>>>>> I'm particularly sensitive to this; as you might recall a couple weeks ago I
>>>>> was having a wrestling match with the jstatd tests (I finished the match, but
>>>>> I'm not sure I won). The primary problem was that several layers of code
>>>>> would catch and discard exceptions, which made diagnosing the problem
>>>>> incredibly difficult. So, I'd recommend removing the "do nothing" catch
>>>>> clauses.
>>>>>
>>>>> Thinking further about createRegistryOnUnusedPort(), I'm not sure that
>>>>> retrying 10 (or some other number of times) actually makes sense. It does for
>>>>> getUnusedRandomPort(), which has to retry in order to get a port outside the
>>>>> disallowed range. But for creating a registry, createRegistry(0) will usually
>>>>> work the first time. If it throws ExportException, it does so for a specific
>>>>> reason, so we should retry once on an unused random port. But if this still
>>>>> fails, don't think retrying repeatedly makes sense.
>>>> See above.
>>>>
>>>> Changes pass all tests. Updated webrev at
>>>> http://cr.openjdk.java.net/~dmocek/7142596/webrev.04
>>>>
>>>> Darryl
>>>>>
>>>>> s'marks
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 7/10/12 2:14 PM, Darryl Mocek wrote:
>>>>>>
>>>>>> On 07/09/2012 04:41 PM, Stuart Marks wrote:
>>>>>>> OK, here's the review for the second half of the files in the webrev. I saw
>>>>>>> your reply to the first half (to which I'll reply separately), and I don't
>>>>>>> think there's anything here that's affected by them.
>>>>>>>
>>>>>>>
>>>>>>> *** AppleUserImpl.java
>>>>>>> *** ApplicationServer.java
>>>>>>>
>>>>>>>
>>>>>>> REGISTRY_PORT should be a local variable; also rename to use mixed case.
>>>>>> Changed to a private registryPort (see next issue).
>>>>>>>
>>>>>>> Eh, whoops, after looking at ApplicationServer.java I see that it accesses
>>>>>>> the REGISTRY_PORT field directly. This is why direct field access is a bad
>>>>>>> idea. :-) Now the question is, has REGISTRY_PORT been initialized before
>>>>>>> ApplicationServer needs it? It turns out that it has been -- but only in
>>>>>>> some
>>>>>>> cases.
>>>>>>>
>>>>>>> It seems like the test is trying to support two modes, one that runs in two
>>>>>>> threads in the same JVM, and the other that runs in two separate JVMs. If
>>>>>>> they are in separate JVMs, things will no longer work because in the JVM
>>>>>>> that
>>>>>>> runs ApplicationServer.main(), AppleUserImpl.REGISTRY_PORT will be -1. I
>>>>>>> suspect that our test environment doesn't support the separate JVM mode,
>>>>>>> but
>>>>>>> it seems unwise to break it.
>>>>>>>
>>>>>>> I'd suggest that in two-JVM mode the classes fall back to using a
>>>>>>> "well-known" default registry port number, which in this case seems like
>>>>>>> 2006.
>>>>>>>
>>>>>>> In single-JVM mode, AppleUserImpl creates an instance of ApplicationServer,
>>>>>>> so I'd suggest adding a method to ApplicationServer that allows
>>>>>>> AppleUserImpl
>>>>>>> to store the randomly-assigned registry port number into it, overriding the
>>>>>>> default value.
>>>>>>>
>>>>>>> This seems like this is the simplest way to preserve the two modes of
>>>>>>> operation but to support the random port selection model we're trying to
>>>>>>> achieve.
>>>>>> Rather then going the "fixed port" route, which is what we're trying to get
>>>>>> away from, I've changed the implementation of both AppletUserImpl's and
>>>>>> ApplicationServer so ApplicationServer requires a port and AppleUserImpl
>>>>>> supplies the port on construction of ApplicationServer. I thought of
>>>>>> modifying
>>>>>> ApplicationServer's constructor to create a port using
>>>>>> TestLibrary.getUnusedRandomPort, but decided requiring a port is better as
>>>>>> ApplicationServer's job is to look for already exported AppleUser objects.
>>>>>>>
>>>>>>>
>>>>>>> *** activatable/EchoImpl.java
>>>>>>>
>>>>>>>
>>>>>>> int registryPort = new Integer(System.getProperty("rmi.registry.port"));
>>>>>>>
>>>>>>> I'd suggest using Integer.parseInt() instead of new Integer(). Not a huge
>>>>>>> deal, but it's probably more conventional to use parseInt() and it avoids
>>>>>>> boxing.
>>>>>>>
>>>>>>> One could probably do Integer.getInteger("rmi.registry.port") but this is
>>>>>>> seems pretty obscure to me even though it's more succinct.
>>>>>>>
>>>>>>> The same also applies to the following:
>>>>>>> - HelloImpl.java
>>>>>>> - unicast/EchoImpl.java
>>>>>>> - ShutdownImpl.java
>>>>>>> - SelfTerminator.java
>>>>>>> - CheckFQDNClient.java
>>>>>>> - LeaseLeakClient.java
>>>>>>> - dgcDeadLock/TestImpl.java
>>>>>>>
>>>>>> Integer.parseInt returns a primitive (which is what the return is assigned
>>>>>> to)
>>>>>> and it appears Integer.parseInt is "faster" then creating a new Integer.
>>>>>> Changed to Integer.parseInt in all places referenced.
>>>>>>>
>>>>>>> *** FiniteGCLatency.java
>>>>>>>
>>>>>>>
>>>>>>> The pattern here is a bit odd, as the test creates the registry, throws
>>>>>>> away
>>>>>>> the returned reference, and then calls getRegistry() to get another
>>>>>>> Registry
>>>>>>> reference. It *seems* like they're identical references, but in fact the
>>>>>>> first is apparently a reference to the actual Registry implementation,
>>>>>>> whereas the second is a remote stub.
>>>>>>>
>>>>>>> The tests seem to do all the actual work using the remote stub, which seems
>>>>>>> proper.
>>>>>>>
>>>>>>> This is confusing, though, as it looks like there's a redundant Registry
>>>>>>> reference now. This might lead someone in the future to "simplify" the test
>>>>>>> by not getting the remote stub, which in turn might invalidate some tests.
>>>>>>> (In fact I was going to suggest this but I decided to investigate further
>>>>>>> first.)
>>>>>>>
>>>>>>> At the very least, I'd suggest renaming the variable that holds the newly
>>>>>>> created Registry to something like "registryImpl" to make it clear that
>>>>>>> it's
>>>>>>> different from the thing returned by getRegistry(), even though they have a
>>>>>>> the same time.
>>>>>>>
>>>>>>> Another possibility is to rearrange the TestLibrary API so that there is a
>>>>>>> single utility method that combines createRegistryOnUnusedPort() and
>>>>>>> getRegistryPort(). That is, it creates a new registry and simply returns
>>>>>>> the
>>>>>>> port on which it was created, not a reference to the registry
>>>>>>> implementation.
>>>>>>> I don't think the registry implementation is actually ever used by the
>>>>>>> tests,
>>>>>>> and it might simplify things a bit as well.
>>>>>>>
>>>>>>> Possibly similar issues with:
>>>>>>> - UnreferencedContext.java
>>>>>>> - NoConsoleOutput.java
>>>>>>>
>>>>>>>
>>>>>>> *** HttpSocketTest.java
>>>>>>>
>>>>>>>
>>>>>>> Unnecessary call to TestLibrary.getUnusedRandomPort()?
>>>>>> Looks like extra code left over from the change from using
>>>>>> TestLibrary.getUnusedRandomPort/LocateRegistry.createRegistry(randomPort) to
>>>>>> TestLibrary.createRegistryOnUnusedPort...removed.
>>>>>>>
>>>>>>>
>>>>>>> *** TestLibrary.java
>>>>>>>
>>>>>>> Mostly pretty straightforward, but I do have some concerns about the random
>>>>>>> port selection and a potential clash with the "reserved port range" as
>>>>>>> defined in this test library.
>>>>>>>
>>>>>>> The getUnusedRandomPort() method attempts to get a socket within the range
>>>>>>> (1024,64000) and will retry 10 times if it can't. Unfortunately, MacOS
>>>>>>> allocates ports more-or-less sequentially in the range [49152, 65536) which
>>>>>>> means that when the kernel's internal counter gets to 64000,
>>>>>>> getUnusedRandomPort()'s retries will fail, causing tests to fail until the
>>>>>>> counter wraps around.
>>>>>>>
>>>>>>> Other systems behave differently; Linux seems to allocate them randomly in
>>>>>>> the range [32768,65536) and Windows XP SP3 allocates them sequentially in
>>>>>>> the
>>>>>>> range (1024,5000]. So it's probably not a problem for them.
>>>>>>>
>>>>>>> I think the thing to do is to check only for "reserved ports" that are
>>>>>>> actually used by tests here. These are in the range [64001,64005]. In
>>>>>>> getUnusedRandomPort(), it should only need to retry if the returned port is
>>>>>>> within this narrow, reserved range. If it's anything else it should be OK.
>>>>>> I'll try setting the range this narrow, but I don't know how many sequential
>>>>>> tests will be run at a time and I'm concerned 5 is too few. The -concurrency
>>>>>> option on jtreg allows you to specify how many concurrent tests will be
>>>>>> run. We
>>>>>> should have enough test ports reserved to satisfy any concurrency request.
>>>>>> I've
>>>>>> run the tests with -concurrency=8 (I have a dual-core system showing 4
>>>>>> CPU's).
>>>>>> I tried reducing the port range to 64001/64002 and concurrency=4 and all
>>>>>> passed
>>>>>> fine, so maybe we're OK with just 5.
>>>>>>>
>>>>>>> On another topic, the three utility methods here:
>>>>>>> - createRegistryOnUnusedPort
>>>>>>> - getRegistryPort
>>>>>>> - getUnusedRandomPort
>>>>>>>
>>>>>>> all catch exceptions and then return illegal values (null or -1), sometimes
>>>>>>> after printing some diagnostic information. The problem is that the caller
>>>>>>> will attempt to soldier on with the illegal return value and will stumble
>>>>>>> over something later, such as NullPointerException or
>>>>>>> IllegalArgumentException. This will probably be obvious but it's equally
>>>>>>> likely to be confusing.
>>>>>>>
>>>>>>> Since these utilities are all called from test code, and the tests are
>>>>>>> relying on them to return valid results, I'd suggest just throwing
>>>>>>> exceptions
>>>>>>> from the utility methods if they fail. This will (should) cause the test to
>>>>>>> error out, but that's OK, as it never could have succeeded anyway if the
>>>>>>> utility call had failed.
>>>>>> I already modified createRegistryOnUnusedPort to throw an exception as
>>>>>> part of
>>>>>> the MultipleRegistries change. I'm now throwing a RuntimeException for
>>>>>> getRegistryPort and getUnusedRandomPort if they fail.
>>>>>>
>>>>>> See updated webrev: http://cr.openjdk.java.net/~dmocek/7142596/webrev.03
>>>>>>
>>>>>> Darryl
>>>>>>>
>>>>>>> s'marks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 7/5/12 2:22 PM, Darryl Mocek wrote:
>>>>>>>> Hello core-libs. Please review this webrev to fix Bugs #7142596 and
>>>>>>>> 7161503.
>>>>>>>> Webrev can be found here:
>>>>>>>> http://cr.openjdk.java.net/~dmocek/7142596/webrev.02.
>>>>>>>> This commit fixes concurrency issues with the RMI tests.
>>>>>>>>
>>>>>>>> - Added TestLibrary.createRegistryOnUnusedPort method. This creates an
>>>>>>>> RMIRegistry on an unused port. It will try up to 10 times before giving
>>>>>>>> up.
>>>>>>>> - Added a TestLibrary.getRegistryPort(Registry) method to get the port
>>>>>>>> number
>>>>>>>> of the registry.
>>>>>>>> - Changed almost all tests from using hard port numbers to using random
>>>>>>>> port
>>>>>>>> numbers for running the RMI Registry and RMID.
>>>>>>>> - Removed othervm from those tests which don't need it.
>>>>>>>> - Added parameters for tests which spawn a separate VM to pass RMI
>>>>>>>> Registry and
>>>>>>>> RMID ports in cases where needed.
>>>>>>>> - Added PropertyPermission to security policy files where needed.
>>>>>>>> - Removed java/rmi and sun/rmi from tests which cannot be run
>>>>>>>> concurrently.
>>>>>>>> - Added java/rmi/Naming to list of tests which cannot be run concurrently.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Darryl
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>