JDK 9 RFR of 8026236: Add PrimeTest for BigInteger [TEST-ONLY]
Hello, Issue: https://bugs.openjdk.java.net/browse/JDK-8026236 Patch: http://cr.openjdk.java.net/~bpb/8026236/webrev.00/ This test provides a rudimentary verification of isProbablePrime() by: 1 - checking that the first 100000 primes are correctly identified as probably prime 2 - checking that a random sample of positive integers does not reveal non-prime numbers which test as prime The test allows for specification of an alternate source file of prime numbers if one wants to run it by hand. The file of primes included in the patch was limited to 100000 values in the interest of keeping the file size small. I think that the range of random numbers used for the non-prime portion of the test (currently [0,100000)) needs to be reexamined, but I wanted to throw this out there for discussion as-is. I’ve extended the use of the j.u.stream API to the entire test as recommended by Paul in http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-October/021859.htm.... Thanks, Brian
On Apr 23, 2014, at 1:48 AM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
Hello,
Issue: https://bugs.openjdk.java.net/browse/JDK-8026236 Patch: http://cr.openjdk.java.net/~bpb/8026236/webrev.00/
This test provides a rudimentary verification of isProbablePrime() by:
1 - checking that the first 100000 primes are correctly identified as probably prime 2 - checking that a random sample of positive integers does not reveal non-prime numbers which test as prime
Note that the Stream obtained from br.lines will not close the underlying file, since it did not create the resource. You need to place the BufferedReader in the try.
The test allows for specification of an alternate source file of prime numbers if one wants to run it by hand. The file of primes included in the patch was limited to 100000 values in the interest of keeping the file size small.
250k seems reasonable, but i don't know if there will be objections even to that size. Ideally it would be nice to download on demand to some separate repository and cached, but we really don't have the infrastructure in place for that (it would the same repository that could be used to publish JDK builds/images to be accessed by test machines).
I think that the range of random numbers used for the non-prime portion of the test (currently [0,100000)) needs to be reexamined, but I wanted to throw this out there for discussion as-is.
Why not range from [2, upperBound) ? long falsePositives = IntStream.range(0, upperBound) .mapToObj(BigInteger::valueOf) .filter(bi -> !primes.contains(bi)) .filter(bi -> bi.isProbablePrime(certainty)) .count(); Otherwise for better random sampling i recommend using SplittableRandom. Generally i don't like adding non-determinism to tests as it makes it very tricky to track down errors. Paul.
I’ve extended the use of the j.u.stream API to the entire test as recommended by Paul in http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-October/021859.htm....
Thanks,
Brian
Hi Brian, There seems to be a confusion between "upperBound" and the 1st "n" primer numbers. You pass "upperBound" as the parameter "n" of the parsePrimes() method which returns 1st "n" primes from the file (it can return less primes if the file is smaller). I suggest doing the following: - make parsePrimes() take "n" as it does and return 1st "n" primes from the file, but collect them into a NavigableSet (TreeSet) like that: private static *NavigableSet<BigInteger>* parsePrimes(File f, long n) throws IOException { GZIPInputStream gis = new GZIPInputStream(new FileInputStream(f)); BufferedReader br = new BufferedReader(new InputStreamReader(gis)); try (Stream<String> lines = br.lines()) { return lines.limit(n).map(BigInteger::new).collect(*toCollection(TreeSet::new)*); } } This would serve two purposes: - you could obtain the largest prime number from the set easily (NavigableSet.last()) and use it as the "upperBound" for the range of integers you test. - your test for containment would be faster - O(log2(n)) instead of O(n) Regards, Peter On 04/23/2014 01:48 AM, Brian Burkhalter wrote:
Hello,
Issue: https://bugs.openjdk.java.net/browse/JDK-8026236 Patch: http://cr.openjdk.java.net/~bpb/8026236/webrev.00/
This test provides a rudimentary verification of isProbablePrime() by:
1 - checking that the first 100000 primes are correctly identified as probably prime 2 - checking that a random sample of positive integers does not reveal non-prime numbers which test as prime
The test allows for specification of an alternate source file of prime numbers if one wants to run it by hand. The file of primes included in the patch was limited to 100000 values in the interest of keeping the file size small.
I think that the range of random numbers used for the non-prime portion of the test (currently [0,100000)) needs to be reexamined, but I wanted to throw this out there for discussion as-is.
I’ve extended the use of the j.u.stream API to the entire test as recommended by Paul in http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-October/021859.htm....
Thanks,
Brian
Hi Peter, The limitation below is precisely what I referred to in my post in particular with respect to the upper bound of non-primes tested. Thank you for your observations: I will look into the ideas and repost a revised patch later. Regards, Brian On Apr 23, 2014, at 7:28 AM, Peter Levart <peter.levart@gmail.com> wrote:
- you could obtain the largest prime number from the set easily (NavigableSet.last()) and use it as the "upperBound" for the range of integers you test.
Hello, I have posted an updated patch here: http://cr.openjdk.java.net/~bpb/8026236/webrev.01/ Thanks for your comments. Brian On Apr 23, 2014, at 7:28 AM, Peter Levart <peter.levart@gmail.com> wrote:
Hi Brian,
There seems to be a confusion between "upperBound" and the 1st "n" primer numbers. You pass "upperBound" as the parameter "n" of the parsePrimes() method which returns 1st "n" primes from the file (it can return less primes if the file is smaller).
I suggest doing the following: - make parsePrimes() take "n" as it does and return 1st "n" primes from the file, but collect them into a NavigableSet (TreeSet) like that:
private static NavigableSet<BigInteger> parsePrimes(File f, long n) throws IOException { GZIPInputStream gis = new GZIPInputStream(new FileInputStream(f)); BufferedReader br = new BufferedReader(new InputStreamReader(gis)); try (Stream<String> lines = br.lines()) { return lines.limit(n).map(BigInteger::new).collect(toCollection(TreeSet::new)); } }
This would serve two purposes: - you could obtain the largest prime number from the set easily (NavigableSet.last()) and use it as the "upperBound" for the range of integers you test. - your test for containment would be faster - O(log2(n)) instead of O(n)
Regards, Peter
On 04/23/2014 01:48 AM, Brian Burkhalter wrote:
Hello,
Issue: https://bugs.openjdk.java.net/browse/JDK-8026236 Patch: http://cr.openjdk.java.net/~bpb/8026236/webrev.00/
This test provides a rudimentary verification of isProbablePrime() by:
1 - checking that the first 100000 primes are correctly identified as probably prime 2 - checking that a random sample of positive integers does not reveal non-prime numbers which test as prime
The test allows for specification of an alternate source file of prime numbers if one wants to run it by hand. The file of primes included in the patch was limited to 100000 values in the interest of keeping the file size small.
I think that the range of random numbers used for the non-prime portion of the test (currently [0,100000)) needs to be reexamined, but I wanted to throw this out there for discussion as-is.
I’ve extended the use of the j.u.stream API to the entire test as recommended by Paul in http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-October/021859.htm....
Thanks,
Brian
Here is an updated patch http://cr.openjdk.java.net/~bpb/8026236/webrev.02/ which has been revised to obviate the need for a file source of prime numbers. Thanks, Brian On Apr 24, 2014, at 5:12 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
I have posted an updated patch here:
http://cr.openjdk.java.net/~bpb/8026236/webrev.01/
Thanks for your comments.
Prod! On Apr 28, 2014, at 4:02 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
Here is an updated patch
http://cr.openjdk.java.net/~bpb/8026236/webrev.02/
which has been revised to obviate the need for a file source of prime numbers.
Thanks,
Brian
On Apr 24, 2014, at 5:12 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
I have posted an updated patch here:
http://cr.openjdk.java.net/~bpb/8026236/webrev.01/
Thanks for your comments.
Hi Brian, I think the parsePrimes method would be better with a different name since no parsing is occurring anymore. I think someone in this test the fact that Integer.MAX_VALUE is a prime should be mentioned in taken advantage of :-) How about adding another test method which tests some Mersenne primes for primality? [1] I'd prefer to see some comments on the primes method briefly explaining it methodology. Would the running time be unacceptable (or memory usage too large) if the limit were set to Integer.MAX_VALUE? Thanks, -Joe [1] http://en.wikipedia.org/wiki/Mersenne_prime On 5/2/2014 2:22 PM, Brian Burkhalter wrote:
Prod!
On Apr 28, 2014, at 4:02 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
Here is an updated patch
http://cr.openjdk.java.net/~bpb/8026236/webrev.02/
which has been revised to obviate the need for a file source of prime numbers.
Thanks,
Brian
On Apr 24, 2014, at 5:12 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
I have posted an updated patch here:
http://cr.openjdk.java.net/~bpb/8026236/webrev.01/
Thanks for your comments.
On May 3, 2014, at 6:55 PM, Joe Darcy <joe.darcy@oracle.com> wrote:
Hi Brian,
I think the parsePrimes method would be better with a different name since no parsing is occurring anymore.
Yes, and there is no need for the try block. (The use of BitSet.stream() also reminds me we can implement a spliterator with size estimates based on the words in use, currently the stream() is created from an iterator.)
I think someone in this test the fact that Integer.MAX_VALUE is a prime should be mentioned in taken advantage of :-) How about adding another test method which tests some Mersenne primes for primality? [1]
I'd prefer to see some comments on the primes method briefly explaining it methodology. Would the running time be unacceptable (or memory usage too large) if the limit were set to Integer.MAX_VALUE?
+1 to both of those, although perhaps the former could be done as a second iteration to get this code in? Also, probably better to create the "NavigableSet<BigInteger> primes" just once in "main", rather than creating it twice for both tests. Still skeptical about the use of a PRNG in the tests, see my comments in a previous email: http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-April/026579.html Paul.
Joe / Paul, Thanks for the comments. This an aggregate response to your two messages. An updated version of the patch is here: http://cr.openjdk.java.net/~bpb/8026236/webrev.03/ I believe that it addresses all the points of concern aside from testing Mersenne primes. Please see more comments in line. Thanks, Brian On May 3, 2014, at 9:55 AM, Joe Darcy <joe.darcy@oracle.com> wrote:
Hi Brian,
I think the parsePrimes method would be better with a different name since no parsing is occurring anymore.
Renamed methods to createPrimes() and getPrimes().
I think someone in this test the fact that Integer.MAX_VALUE is a prime should be mentioned in taken advantage of :-)
Done. Simply appended this one value to the list. ;-)
How about adding another test method which tests some Mersenne primes for primality? [1]
No done.
I'd prefer to see some comments on the primes method briefly explaining it methodology.
Done via a link to Wikipedia.
Would the running time be unacceptable (or memory usage too large) if the limit were set to Integer.MAX_VALUE?
Yes, I tried it. Still running after at least ten minutes when I killed it.
Thanks,
-Joe
On May 5, 2014, at 2:11 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
On May 3, 2014, at 6:55 PM, Joe Darcy <joe.darcy@oracle.com> wrote:
Hi Brian,
I think the parsePrimes method would be better with a different name since no parsing is occurring anymore.
Yes, and there is no need for the try block.
Removed.
(The use of BitSet.stream() also reminds me we can implement a spliterator with size estimates based on the words in use, currently the stream() is created from an iterator.)
Made no change based on this comment (which I interpreted as an aside).
I think someone in this test the fact that Integer.MAX_VALUE is a prime should be mentioned in taken advantage of :-) How about adding another test method which tests some Mersenne primes for primality? [1]
I'd prefer to see some comments on the primes method briefly explaining it methodology. Would the running time be unacceptable (or memory usage too large) if the limit were set to Integer.MAX_VALUE?
+1 to both of those, although perhaps the former could be done as a second iteration to get this code in?
This is what I did: +1 on the both but not the Mersennes.
Also, probably better to create the "NavigableSet<BigInteger> primes" just once in "main", rather than creating it twice for both tests.
Done.
Still skeptical about the use of a PRNG in the tests, see my comments in a previous email:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-April/026579.html
Changed to SplittableRandom but collected the intermediate stream into a list which is inspected upon failure to find the value which exposed the misbehavior.
On May 5, 2014, at 11:17 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
Joe / Paul,
Thanks for the comments. This an aggregate response to your two messages. An updated version of the patch is here:
http://cr.openjdk.java.net/~bpb/8026236/webrev.03/
I believe that it addresses all the points of concern aside from testing Mersenne primes.
+1 Pau.
On 05/05/2014 11:17 PM, Brian Burkhalter wrote:
Joe / Paul,
Thanks for the comments. This an aggregate response to your two messages. An updated version of the patch is here:
The command line parsing is off by one, I think, it should check for >= instead of >: 54 // Prepare arguments 55 int upperBound = args.length > 1 ? Integer.valueOf(args[0]) : DEFAULT_UPPER_BOUND; 56 int certainty = args.length > 2 ? Integer.valueOf(args[1]) : DEFAULT_CERTAINTY; 57 boolean parallel = args.length > 3 ? Boolean.valueOf(args[2]) : true; I think this line should go, it doesn't match the shifted indexing: 98 //bs.set(0, 2, true); There are some very long lines: 120 NavigableSet<BigInteger> primes = bs.stream().mapToObj(p -> BigInteger.valueOf(p + 2)).collect(toCollection(TreeSet::new)); 180 List<BigInteger> bigInts = (new SplittableRandom()).ints(NUM_NON_PRIMES, 2, maxPrime).mapToObj(BigInteger::valueOf).filter(b -> !b.isProbablePrime(certainty)).collect(toList()); The arithmetic in checkPrime() is unnecessary because isProbablePrime() will never report a prime as a non-prime (the Monte-Carlo vs Las Vegas thing), so the count will always be exact because you feed it only primes. It is impossible to check for the expected number of failures (non-primes reported as primes) in checkNonPrimes() because this would result in a test that fails sporadically. -- Florian Weimer / Red Hat Product Security Team
Paul / Florian, This is a combined response. I am making the corrections pointed out by Florian. Unless I hear otherwise, I am going to assume that Paul’s approval is still valid with these corrections included and will push the path on Wednesday. Please see the updated patch: http://cr.openjdk.java.net/~bpb/8026236/webrev.04/ Thanks, Brian On May 6, 2014, at 1:25 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
On May 5, 2014, at 11:17 PM, Brian Burkhalter <brian.burkhalter@oracle.com> wrote:
Joe / Paul,
Thanks for the comments. This an aggregate response to your two messages. An updated version of the patch is here:
http://cr.openjdk.java.net/~bpb/8026236/webrev.03/
I believe that it addresses all the points of concern aside from testing Mersenne primes.
+1
Paul
On May 6, 2014, at 1:59 AM, Florian Weimer <fweimer@redhat.com> wrote:
On 05/05/2014 11:17 PM, Brian Burkhalter wrote:
Joe / Paul,
Thanks for the comments. This an aggregate response to your two messages. An updated version of the patch is here:
The command line parsing is off by one, I think, it should check for >= instead of >:
54 // Prepare arguments 55 int upperBound = args.length > 1 ? Integer.valueOf(args[0]) : DEFAULT_UPPER_BOUND; 56 int certainty = args.length > 2 ? Integer.valueOf(args[1]) : DEFAULT_CERTAINTY; 57 boolean parallel = args.length > 3 ? Boolean.valueOf(args[2]) : true;
I think this happened when I replaced the primes file input with programmatic generation of primes.
I think this line should go, it doesn't match the shifted indexing:
98 //bs.set(0, 2, true);
I forgot to delete it when I changed to shifted indexing.
There are some very long lines:
120 NavigableSet<BigInteger> primes = bs.stream().mapToObj(p -> BigInteger.valueOf(p + 2)).collect(toCollection(TreeSet::new));
180 List<BigInteger> bigInts = (new SplittableRandom()).ints(NUM_NON_PRIMES, 2, maxPrime).mapToObj(BigInteger::valueOf).filter(b -> !b.isProbablePrime(certainty)).collect(toList());
Lines split into several at worst not much longer than 80 characters.
The arithmetic in checkPrime() is unnecessary because isProbablePrime() will never report a prime as a non-prime (the Monte-Carlo vs Las Vegas thing), so the count will always be exact because you feed it only primes.
Good point: check removed.
It is impossible to check for the expected number of failures (non-primes reported as primes) in checkNonPrimes() because this would result in a test that fails sporadically.
There is no change as a result of this comment.
On 05/06/2014 06:38 PM, Brian Burkhalter wrote:
Paul / Florian,
This is a combined response. I am making the corrections pointed out by Florian. Unless I hear otherwise, I am going to assume that Paul’s approval is still valid with these corrections included and will push the path on Wednesday. Please see the updated patch:
Looks good to me now (but I'm not an official reviewer). -- Florian Weimer / Red Hat Product Security Team
On May 9, 2014, at 1:48 AM, Florian Weimer <fweimer@redhat.com> wrote:
On 05/06/2014 06:38 PM, Brian Burkhalter wrote:
Paul / Florian,
This is a combined response. I am making the corrections pointed out by Florian. Unless I hear otherwise, I am going to assume that Paul’s approval is still valid with these corrections included and will push the path on Wednesday. Please see the updated patch:
Looks good to me now (but I'm not an official reviewer).
The approval is informative and appreciated nonetheless. Thanks, Brian
-- Florian Weimer / Red Hat Product Security Team
Hi Brian, I've filed a follow-up bug: JDK-8042478: Include Mersenne primes in BigInteger primality testing Cheers, -Joe On 5/5/2014 2:17 PM, Brian Burkhalter wrote:
Joe / Paul,
Thanks for the comments. This an aggregate response to your two messages. An updated version of the patch is here:
http://cr.openjdk.java.net/~bpb/8026236/webrev.03/ <http://cr.openjdk.java.net/%7Ebpb/8026236/webrev.03/>
I believe that it addresses all the points of concern aside from testing Mersenne primes.
Please see more comments in line.
Thanks,
Brian
On May 3, 2014, at 9:55 AM, Joe Darcy <joe.darcy@oracle.com <mailto:joe.darcy@oracle.com>> wrote:
Hi Brian,
I think the parsePrimes method would be better with a different name since no parsing is occurring anymore.
Renamed methods to createPrimes() and getPrimes().
I think someone in this test the fact that Integer.MAX_VALUE is a prime should be mentioned in taken advantage of :-)
Done. Simply appended this one value to the list. ;-)
How about adding another test method which tests some Mersenne primes for primality? [1]
No done.
I'd prefer to see some comments on the primes method briefly explaining it methodology.
Done via a link to Wikipedia.
Would the running time be unacceptable (or memory usage too large) if the limit were set to Integer.MAX_VALUE?
Yes, I tried it. Still running after at least ten minutes when I killed it.
Thanks,
-Joe
On May 5, 2014, at 2:11 AM, Paul Sandoz <paul.sandoz@oracle.com <mailto:paul.sandoz@oracle.com>> wrote:
On May 3, 2014, at 6:55 PM, Joe Darcy <joe.darcy@oracle.com <mailto:joe.darcy@oracle.com>> wrote:
Hi Brian,
I think the parsePrimes method would be better with a different name since no parsing is occurring anymore.
Yes, and there is no need for the try block.
Removed.
(The use of BitSet.stream() also reminds me we can implement a spliterator with size estimates based on the words in use, currently the stream() is created from an iterator.)
Made no change based on this comment (which I interpreted as an aside).
I think someone in this test the fact that Integer.MAX_VALUE is a prime should be mentioned in taken advantage of :-) How about adding another test method which tests some Mersenne primes for primality? [1]
I'd prefer to see some comments on the primes method briefly explaining it methodology. Would the running time be unacceptable (or memory usage too large) if the limit were set to Integer.MAX_VALUE?
+1 to both of those, although perhaps the former could be done as a second iteration to get this code in?
This is what I did: +1 on the both but not the Mersennes.
Also, probably better to create the "NavigableSet<BigInteger> primes" just once in "main", rather than creating it twice for both tests.
Done.
Still skeptical about the use of a PRNG in the tests, see my comments in a previous email:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-April/026579.html
Changed to SplittableRandom but collected the intermediate stream into a list which is inspected upon failure to find the value which exposed the misbehavior.
participants (5)
-
Brian Burkhalter
-
Florian Weimer
-
Joe Darcy
-
Paul Sandoz
-
Peter Levart