RFR 4641897: Faster string conversion of large integers
Alan Eliasen
eliasen at mindspring.com
Fri Jun 21 16:56:42 UTC 2013
On 06/21/2013 09:04 AM, Brian Burkhalter wrote:
>
> On Jun 21, 2013, at 2:49 AM, Alan Bateman wrote:
>
>> One part that might need attention is getRadixConversionCache as I
>> could imagine cases where the synchronization could lead to
>> contention. Have you (or Alan) considered alternatives that would
>> limit the synchronization to only when the powers cache needs to be
>> expanded?
>
> I have not but will do so. I cannot speak for Alan E.
Yes, as noted in the code comments and in my comments on this list,
it would be possible to do something fancier, perhaps using Future.
This code was intended to be as simple and understandable as possible,
to rapidly deliver an asymptotically much faster algorithm with minimal
review impact but a significant performance improvement. Another design
goal was to avoid duplicating work when two threads attempt to convert
the same large number and calculate the same cache value at the same
time (this used to be a big problem before the pow() function got much
faster; some of those cache expansions took hours, and you didn't want
to do them twice).
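   To sketch what the alternative Alan B. asks about might look like
(purely as an illustration, not the actual webrev; the class and field
names below are placeholders), readers could take a lock-free fast path
over a volatile, copy-on-write table of radix powers and synchronize
only when a line of the table has to be extended:

    import java.math.BigInteger;
    import java.util.Arrays;

    // Sketch only; not the actual webrev.  Names are placeholders.
    final class RadixPowerCacheSketch {
        // powerCache[radix][i] holds radix^(2^i).  Readers always see a
        // fully built table because it is published by a volatile write.
        private static volatile BigInteger[][] powerCache =
            new BigInteger[Character.MAX_RADIX + 1][];

        // Returns radix^(2^exponent), expanding the cache only if needed.
        static BigInteger getRadixConversionCache(int radix, int exponent) {
            BigInteger[] line = powerCache[radix];     // volatile read
            if (line != null && exponent < line.length) {
                return line[exponent];                 // fast path, no lock
            }
            synchronized (RadixPowerCacheSketch.class) {
                BigInteger[][] cache = powerCache;
                line = cache[radix];
                if (line != null && exponent < line.length) {
                    return line[exponent];             // another thread expanded it
                }
                if (line == null) {
                    line = new BigInteger[] { BigInteger.valueOf(radix) };
                }
                if (exponent >= line.length) {
                    BigInteger[] bigger = Arrays.copyOf(line, exponent + 1);
                    for (int i = line.length; i <= exponent; i++) {
                        bigger[i] = bigger[i - 1].multiply(bigger[i - 1]);
                    }
                    line = bigger;
                }
                BigInteger[][] updated = cache.clone();
                updated[radix] = line;
                powerCache = updated;                  // publish the new table
                return line[exponent];
            }
        }
    }

Readers that hit the fast path never contend; two threads that race to
extend the same line simply serialize on the lock, and the second one
finds the entry already present and returns it, so the expensive
squarings are not duplicated.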
There was also potential room for argument about the need for a cache
at all. However, it became clear that keeping a cache significantly
improved performance at the cost of holding on to some memory, on the
assumption that if you print one big number, you're likely to print
another big number at some point during the life of the VM.
It was also a goal at one point in the 11-year history of this bug to
produce code that could be backported to previous Java versions, thus no
use of Future, etc. That may no longer be necessary. I don't know
about other Java profiles that might use this code. I was asked for
small, simple patches that could be reviewed in stages, so that's
somewhat reflected in the simplicity of this code. The caches are
extended rarely, so although that method may block, it should not block
for long. I'm willing to review any rewrites that people might
suggest. Frankly, when I started looking at rewriting the cache using
Future, the code complexity and data structures got quite unwieldy very
quickly, especially when trying to eliminate duplicated effort.
If profiling shows that this is a bottleneck, we can revise it, but I
didn't see evidence of that. It was suggested to make some of these
algorithms multi-threaded, but that was soundly rejected on this list
for this phase. Perhaps in a later phase when the algorithms become
multi-threaded, the cache can be revised.
The Schoenhage recursive base conversion code is not triggered unless
numbers are already quite large, so it won't affect the performance of
smaller numbers, but the algorithm's much-improved efficiency will make
conversion faster for the larger numbers that do trigger it.
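   For anyone who hasn't looked at the algorithm, here is a rough
sketch of the divide-and-conquer idea using only the public BigInteger
API (the threshold and names are illustrative, not the actual patch):
pick a power radix^(2^e) near the square root of the number, split with
divideAndRemainder, convert the two halves recursively, and zero-pad the
low half.

    import java.math.BigInteger;

    // Sketch only; the threshold and names are illustrative.
    final class RecursiveToStringSketch {
        private static final double LOG_TWO = Math.log(2.0);
        // Below this size, fall back to the classical conversion.
        private static final int RECURSIVE_THRESHOLD_BITS = 4096;

        static String toString(BigInteger n, int radix) {
            if (n.signum() < 0) {
                return "-" + toString(n.negate(), radix);
            }
            StringBuilder sb = new StringBuilder();
            recurse(n, radix, sb, 0);      // top level needs no padding
            return sb.toString();
        }

        // Appends n in the given radix, left-padded with zeros to
        // `digits` places so interior zeros are preserved.
        private static void recurse(BigInteger n, int radix,
                                    StringBuilder sb, int digits) {
            if (n.bitLength() < RECURSIVE_THRESHOLD_BITS) {
                String s = n.toString(radix);        // classical conversion
                for (int i = s.length(); i < digits; i++) {
                    sb.append('0');
                }
                sb.append(s);
                return;
            }
            // Lower bound on the digit count of n in this radix.
            int minDigits = (int) ((n.bitLength() - 1) * LOG_TWO
                                   / Math.log(radix));
            // Pick e so that radix^(2^e) splits n roughly in half; in
            // the real patch this power comes from the radix cache.
            int e = 31 - Integer.numberOfLeadingZeros(
                             Math.max(1, minDigits / 2));
            int half = 1 << e;
            BigInteger power = BigInteger.valueOf(radix).pow(half);
            BigInteger[] qr = n.divideAndRemainder(power);
            recurse(qr[0], radix, sb, digits - half);  // high-order digits
            recurse(qr[1], radix, sb, half);           // low-order, zero-padded
        }
    }

Because each split roughly halves the size of the operands, the work is
dominated by the divisions, which is where the faster multiplication and
division algorithms pay off.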
--
Alan Eliasen
eliasen at mindspring.com
http://futureboy.us/