Code Review Request 7142596: RMI JPRT tests are failing

Tue Jul 17 16:08:40 UTC 2012

OK! Looks good!

I guess I have to push this for you. Can you commit the changeset in your repo, 
with proper commit comments etc., and send me the exported changeset? (Directly 
to me is fine, it doesn't need to be on the open list.) I'll push it later today.

Thanks.

s'marks

On 7/16/12 3:54 PM, Darryl Mocek wrote:
> Here's webrev.05 (http://cr.openjdk.java.net/~dmocek/7142596/webrev.05). Changes:
>
> - Simplify createRegistryOnUnusedPort()
> - Simplify getUnusedRandomPort()
> - Add isReservedPort() method to determine if a port is a reserved port # (not
> just one of the FIXED_PORT's, but also any port which shouldn't be used).
>
> Passes all tests.
>
> Darryl
>
> On 07/16/2012 08:42 AM, Stuart Marks wrote:
>> Hi Darryl,
>>
>> Sorry, I'm going to have to ask you to update this again. The issues are in
>> TestLibrary.java and are all related to catching exceptions and throwing new
>> exception instances, discarding the original exception (and probably relevant
>> diagnostic information) in the process.
>>
>> Let's get together today and work on it. I expect it will take only half an
>> hour or so to knock things out.
>>
>> s'marks
>>
>>
>>
>>
>> On 7/13/12 1:40 PM, Darryl Mocek wrote:
>>>
>>> On 07/12/2012 05:13 PM, Stuart Marks wrote:
>>>> OK, I took a look at the revised webrev. Most stuff is good. A couple changes
>>>> are necessary though.
>>>>
>>>>
>>>> *** MultipleRegistries.java
>>>>
>>>>
>>>> Not a big deal, but the comment at lines 65-67 is no longer necessary.
>>> Comment removed.
>>>>
>>>>
>>>> *** TestLibrary.java
>>>>
>>>>
>>>> I think the reserved port range stuff still needs to be fixed up.
>>>>
>>>> The comment at lines 81-82 talks about a range 64000-64100 which isn't really
>>>> present anywhere. The comment should instead say that PORT_MIN and PORT_MAX
>>>> should be kept adjusted so that they specify the range of ports defined by
>>>> the constants following. Make sure to do this adjustment; PORT_MAX is 64002
>>>> here.
>>> I reserved ports 64000-64100 (even though only 64001-64005 are currently being
>>> used) for fixed port tests (for tests which require fixed ports) so any future
>>> tests which require fixed ports can be assured there are ports available within
>>> a range for their use. In fixing the tests, I saw that those who wrote tests
>>> and used fixed ports grabbed random port numbers all over the place. Providing
>>> a reserved range allows them to just grab the next number and to keep the fixed
>>> port numbers in a group, even when new tests are added. If the ports are
>>> defined in a central place, like TestLibrary, then hopefully developers will
>>> come to TestLibrary to reserve the ports rather then define them in the tests
>>> themselves, like was done previously. I chose 100 ports for
>>> "future-proofing"...I don't think 100 is too many, but I've reduced it to 10
>>> and lowered the max to 64010. I was also thinking of looking for tests which
>>> used fixed ports outside of RMI tests and standardizing those ports here, but
>>> that's possible future work and I'm not sure it's necessary.
>>>>
>>>> The previous webrev had PORT_MIN and PORT_MAX specify a range of 1024-64000
>>>> as the *allowed* range for ports returned by getUnusedRandomPort(). But as I
>>>> described previously, this is difficult to enforce given the varying port
>>>> allocation schemes on different OSes.
>>>>
>>>> Instead I think we need to invert the sense of the range PORT_MIN..PORT_MAX
>>>> and make it be the range of ports reserved by tests that require fixed ports
>>>> (64001 through 64005). Then, getUnusedRandomPort() needs to be changed to
>>>> *disallow* ports in this range, that is, to retry if the port is within this
>>>> range. Thus the condition on the while-loop has to be inverted, something like
>>>>
>>>> while (... unusedRandomPort >= PORT_MIN && unusedRandomPort <= PORT_MAX)
>>> Yes, this is correct and it has been changed.
>>>>
>>>> So, how could getUnusedRandomPort() have worked in its present state? Well I
>>>> think it ends up retrying 10 times, then at the end of the loop
>>>> unusedRandomPort is actually some legal port number -- albeit possibly one
>>>> within the disallowed range -- and almost certainly not -1, and so the method
>>>> returns it and the caller uses it. So, the success-testing logic here will
>>>> also have to change so that it retests whether the port is in the disallowed
>>>> range.
>>> Correct...done.
>>>>
>>>> The comment on getUnusedRandomPort() also needs to be updated to reflect its
>>>> new policy on the reserved range, as well as throwing an exception on
>>>> failure. Also (nitpick) it should say "less than" not "less then".
>>> Yes, I always do this. :(
>>>>
>>>> Looking at getUnusedRandomPort() again more closely [sorry] I think the catch
>>>> of Exception that does nothing is suspicious. I'm not sure why getting a
>>>> ServerSocket would fail, but if it does (maybe the system is out of ports??)
>>>> we should just throw out to the caller. Perhaps ideally we'd just have this
>>>> method throw IOException, but if that requires all callers to be updated, it
>>>> might be easier just to catch IOException and throw a RuntimeException that
>>>> wraps the caught IOException.
>>> I see no harm in trying again if it fails. If we have run out of ports, then
>>> possibly one or more used ports will be closed and freed and a subsequent try
>>> will succeed. I don't know all of the cases where a failure might occur.
>>> Anyway, if this is the prevailing way of handling this, I'll change it.
>>>>
>>>> Similar comments apply to the catch(Exception)/do-nothing code in the other
>>>> utility methods.
>>> Now throwing RuntimeException in getRegistryPort.
>>>>
>>>> Certainly getRegistryPort() should just throw (or possibly wrap and rethrow)
>>>> any exception caught.
>>>>
>>>> For createRegistryOnUnusedPort(), the catching of ExportException is handled
>>>> properly. The second catch clause of the outer try-block, and the catch of
>>>> the inner try-block, both ignore exceptions. The code will then end up
>>>> retrying. Is it reasonable that retrying in these cases will result in a
>>>> different outcome? My guess is that it's more likely that something is
>>>> seriously wrong that will cause all the retries to fail, in which case this
>>>> method will discard the exceptions it has just caught and throw a *new*
>>>> RemoteException instance.
>>> This was a holdover from webrev.00 which should have been removed. I agree with
>>> Alan and you here that multiple tries of LocateRegistry.createRegistry(0) is
>>> unnecessary. I've modified this to make a single attempt at using
>>> LocateRegistry.createRegistry(0) and if it fails, make a single attempt at
>>> getting an unused random port and creating a registry on this port, then
>>> failing with a RuntimeException.
>>>>
>>>> I'm particularly sensitive to this; as you might recall a couple weeks ago I
>>>> was having a wrestling match with the jstatd tests (I finished the match, but
>>>> I'm not sure I won). The primary problem was that several layers of code
>>>> would catch and discard exceptions, which made diagnosing the problem
>>>> incredibly difficult. So, I'd recommend removing the "do nothing" catch
>>>> clauses.
>>>>
>>>> Thinking further about createRegistryOnUnusedPort(), I'm not sure that
>>>> retrying 10 (or some other number of times) actually makes sense. It does for
>>>> getUnusedRandomPort(), which has to retry in order to get a port outside the
>>>> disallowed range. But for creating a registry, createRegistry(0) will usually
>>>> work the first time. If it throws ExportException, it does so for a specific
>>>> reason, so we should retry once on an unused random port. But if this still
>>>> fails, don't think retrying repeatedly makes sense.
>>> See above.
>>>
>>> Changes pass all tests. Updated webrev at
>>> http://cr.openjdk.java.net/~dmocek/7142596/webrev.04
>>>
>>> Darryl
>>>>
>>>> s'marks
>>>>
>>>>
>>>>
>>>>
>>>> On 7/10/12 2:14 PM, Darryl Mocek wrote:
>>>>>
>>>>> On 07/09/2012 04:41 PM, Stuart Marks wrote:
>>>>>> OK, here's the review for the second half of the files in the webrev. I saw
>>>>>> your reply to the first half (to which I'll reply separately), and I don't
>>>>>> think there's anything here that's affected by them.
>>>>>>
>>>>>>
>>>>>> *** AppleUserImpl.java
>>>>>> *** ApplicationServer.java
>>>>>>
>>>>>>
>>>>>> REGISTRY_PORT should be a local variable; also rename to use mixed case.
>>>>> Changed to a private registryPort (see next issue).
>>>>>>
>>>>>> Eh, whoops, after looking at ApplicationServer.java I see that it accesses
>>>>>> the REGISTRY_PORT field directly. This is why direct field access is a bad
>>>>>> idea. :-) Now the question is, has REGISTRY_PORT been initialized before
>>>>>> ApplicationServer needs it? It turns out that it has been -- but only in
>>>>>> some
>>>>>> cases.
>>>>>>
>>>>>> It seems like the test is trying to support two modes, one that runs in two
>>>>>> threads in the same JVM, and the other that runs in two separate JVMs. If
>>>>>> they are in separate JVMs, things will no longer work because in the JVM
>>>>>> that
>>>>>> runs ApplicationServer.main(), AppleUserImpl.REGISTRY_PORT will be -1. I
>>>>>> suspect that our test environment doesn't support the separate JVM mode, but
>>>>>> it seems unwise to break it.
>>>>>>
>>>>>> I'd suggest that in two-JVM mode the classes fall back to using a
>>>>>> "well-known" default registry port number, which in this case seems like
>>>>>> 2006.
>>>>>>
>>>>>> In single-JVM mode, AppleUserImpl creates an instance of ApplicationServer,
>>>>>> so I'd suggest adding a method to ApplicationServer that allows
>>>>>> AppleUserImpl
>>>>>> to store the randomly-assigned registry port number into it, overriding the
>>>>>> default value.
>>>>>>
>>>>>> This seems like this is the simplest way to preserve the two modes of
>>>>>> operation but to support the random port selection model we're trying to
>>>>>> achieve.
>>>>> Rather then going the "fixed port" route, which is what we're trying to get
>>>>> away from, I've changed the implementation of both AppletUserImpl's and
>>>>> ApplicationServer so ApplicationServer requires a port and AppleUserImpl
>>>>> supplies the port on construction of ApplicationServer. I thought of
>>>>> modifying
>>>>> ApplicationServer's constructor to create a port using
>>>>> TestLibrary.getUnusedRandomPort, but decided requiring a port is better as
>>>>> ApplicationServer's job is to look for already exported AppleUser objects.
>>>>>>
>>>>>>
>>>>>> *** activatable/EchoImpl.java
>>>>>>
>>>>>>
>>>>>> int registryPort = new Integer(System.getProperty("rmi.registry.port"));
>>>>>>
>>>>>> I'd suggest using Integer.parseInt() instead of new Integer(). Not a huge
>>>>>> deal, but it's probably more conventional to use parseInt() and it avoids
>>>>>> boxing.
>>>>>>
>>>>>> One could probably do Integer.getInteger("rmi.registry.port") but this is
>>>>>> seems pretty obscure to me even though it's more succinct.
>>>>>>
>>>>>> The same also applies to the following:
>>>>>> - HelloImpl.java
>>>>>> - unicast/EchoImpl.java
>>>>>> - ShutdownImpl.java
>>>>>> - SelfTerminator.java
>>>>>> - CheckFQDNClient.java
>>>>>> - LeaseLeakClient.java
>>>>>> - dgcDeadLock/TestImpl.java
>>>>>>
>>>>> Integer.parseInt returns a primitive (which is what the return is assigned
>>>>> to)
>>>>> and it appears Integer.parseInt is "faster" then creating a new Integer.
>>>>> Changed to Integer.parseInt in all places referenced.
>>>>>>
>>>>>> *** FiniteGCLatency.java
>>>>>>
>>>>>>
>>>>>> The pattern here is a bit odd, as the test creates the registry, throws away
>>>>>> the returned reference, and then calls getRegistry() to get another Registry
>>>>>> reference. It *seems* like they're identical references, but in fact the
>>>>>> first is apparently a reference to the actual Registry implementation,
>>>>>> whereas the second is a remote stub.
>>>>>>
>>>>>> The tests seem to do all the actual work using the remote stub, which seems
>>>>>> proper.
>>>>>>
>>>>>> This is confusing, though, as it looks like there's a redundant Registry
>>>>>> reference now. This might lead someone in the future to "simplify" the test
>>>>>> by not getting the remote stub, which in turn might invalidate some tests.
>>>>>> (In fact I was going to suggest this but I decided to investigate further
>>>>>> first.)
>>>>>>
>>>>>> At the very least, I'd suggest renaming the variable that holds the newly
>>>>>> created Registry to something like "registryImpl" to make it clear that it's
>>>>>> different from the thing returned by getRegistry(), even though they have a
>>>>>> the same time.
>>>>>>
>>>>>> Another possibility is to rearrange the TestLibrary API so that there is a
>>>>>> single utility method that combines createRegistryOnUnusedPort() and
>>>>>> getRegistryPort(). That is, it creates a new registry and simply returns the
>>>>>> port on which it was created, not a reference to the registry
>>>>>> implementation.
>>>>>> I don't think the registry implementation is actually ever used by the
>>>>>> tests,
>>>>>> and it might simplify things a bit as well.
>>>>>>
>>>>>> Possibly similar issues with:
>>>>>> - UnreferencedContext.java
>>>>>> - NoConsoleOutput.java
>>>>>>
>>>>>>
>>>>>> *** HttpSocketTest.java
>>>>>>
>>>>>>
>>>>>> Unnecessary call to TestLibrary.getUnusedRandomPort()?
>>>>> Looks like extra code left over from the change from using
>>>>> TestLibrary.getUnusedRandomPort/LocateRegistry.createRegistry(randomPort) to
>>>>> TestLibrary.createRegistryOnUnusedPort...removed.
>>>>>>
>>>>>>
>>>>>> *** TestLibrary.java
>>>>>>
>>>>>> Mostly pretty straightforward, but I do have some concerns about the random
>>>>>> port selection and a potential clash with the "reserved port range" as
>>>>>> defined in this test library.
>>>>>>
>>>>>> The getUnusedRandomPort() method attempts to get a socket within the range
>>>>>> (1024,64000) and will retry 10 times if it can't. Unfortunately, MacOS
>>>>>> allocates ports more-or-less sequentially in the range [49152, 65536) which
>>>>>> means that when the kernel's internal counter gets to 64000,
>>>>>> getUnusedRandomPort()'s retries will fail, causing tests to fail until the
>>>>>> counter wraps around.
>>>>>>
>>>>>> Other systems behave differently; Linux seems to allocate them randomly in
>>>>>> the range [32768,65536) and Windows XP SP3 allocates them sequentially in
>>>>>> the
>>>>>> range (1024,5000]. So it's probably not a problem for them.
>>>>>>
>>>>>> I think the thing to do is to check only for "reserved ports" that are
>>>>>> actually used by tests here. These are in the range [64001,64005]. In
>>>>>> getUnusedRandomPort(), it should only need to retry if the returned port is
>>>>>> within this narrow, reserved range. If it's anything else it should be OK.
>>>>> I'll try setting the range this narrow, but I don't know how many sequential
>>>>> tests will be run at a time and I'm concerned 5 is too few. The -concurrency
>>>>> option on jtreg allows you to specify how many concurrent tests will be
>>>>> run. We
>>>>> should have enough test ports reserved to satisfy any concurrency request.
>>>>> I've
>>>>> run the tests with -concurrency=8 (I have a dual-core system showing 4
>>>>> CPU's).
>>>>> I tried reducing the port range to 64001/64002 and concurrency=4 and all
>>>>> passed
>>>>> fine, so maybe we're OK with just 5.
>>>>>>
>>>>>> On another topic, the three utility methods here:
>>>>>> - createRegistryOnUnusedPort
>>>>>> - getRegistryPort
>>>>>> - getUnusedRandomPort
>>>>>>
>>>>>> all catch exceptions and then return illegal values (null or -1), sometimes
>>>>>> after printing some diagnostic information. The problem is that the caller
>>>>>> will attempt to soldier on with the illegal return value and will stumble
>>>>>> over something later, such as NullPointerException or
>>>>>> IllegalArgumentException. This will probably be obvious but it's equally
>>>>>> likely to be confusing.
>>>>>>
>>>>>> Since these utilities are all called from test code, and the tests are
>>>>>> relying on them to return valid results, I'd suggest just throwing
>>>>>> exceptions
>>>>>> from the utility methods if they fail. This will (should) cause the test to
>>>>>> error out, but that's OK, as it never could have succeeded anyway if the
>>>>>> utility call had failed.
>>>>> I already modified createRegistryOnUnusedPort to throw an exception as
>>>>> part of
>>>>> the MultipleRegistries change. I'm now throwing a RuntimeException for
>>>>> getRegistryPort and getUnusedRandomPort if they fail.
>>>>>
>>>>> See updated webrev: http://cr.openjdk.java.net/~dmocek/7142596/webrev.03
>>>>>
>>>>> Darryl
>>>>>>
>>>>>> s'marks
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 7/5/12 2:22 PM, Darryl Mocek wrote:
>>>>>>> Hello core-libs. Please review this webrev to fix Bugs #7142596 and
>>>>>>> 7161503.
>>>>>>> Webrev can be found here:
>>>>>>> http://cr.openjdk.java.net/~dmocek/7142596/webrev.02.
>>>>>>> This commit fixes concurrency issues with the RMI tests.
>>>>>>>
>>>>>>> - Added TestLibrary.createRegistryOnUnusedPort method. This creates an
>>>>>>> RMIRegistry on an unused port. It will try up to 10 times before giving up.
>>>>>>> - Added a TestLibrary.getRegistryPort(Registry) method to get the port
>>>>>>> number
>>>>>>> of the registry.
>>>>>>> - Changed almost all tests from using hard port numbers to using random
>>>>>>> port
>>>>>>> numbers for running the RMI Registry and RMID.
>>>>>>> - Removed othervm from those tests which don't need it.
>>>>>>> - Added parameters for tests which spawn a separate VM to pass RMI
>>>>>>> Registry and
>>>>>>> RMID ports in cases where needed.
>>>>>>> - Added PropertyPermission to security policy files where needed.
>>>>>>> - Removed java/rmi and sun/rmi from tests which cannot be run concurrently.
>>>>>>> - Added java/rmi/Naming to list of tests which cannot be run concurrently.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Darryl
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>