RFR: 8281631: HashMap.putAll can cause redundant space waste [v3]

Andrew Haley aph-open at littlepinkcloud.com
Sun Feb 20 16:10:22 UTC 2022


On 2/11/22 19:25, XenoAmess wrote:
> On Fri, 11 Feb 2022 18:24:49 GMT, Andrew Haley <aph at openjdk.org> wrote:
> 
>> Just multiply by 0.75.
>>
>> On a modern design, floating-point multiply is 4 clocks latency, 4 ops/clock throughput. FP max is 2 clocks latency, conversions int-float and vice versa 3 clocks latency, 4 ops/clock throughput. Long division is 7-9 clocks, 2 ops/clock throughput. Shift and add 2 clocks, 2/3 ops/clock throughput. Compare is 1 clock, 3 ops/clock throughput, conditional move is 1 clock, 3 ops/clock throughput.
>>
>> Seems like it's a wash.
> 
> @theRealAph
> 
> It is not a multiply but a divide.

Well yes, but that doesn't look at all hard to change.

> besides, did you count the cost for Math.ceil? it is the heaviest part.

Yes. 3 clocks latency, 4 ops/clock throughput. Your hardware may vary.
And that instruction does both the ceil() and the float-int conversion.

(Having said that, I don't know if we currently generate optimal code
for this operation. But of course that can be fixed.)
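
For illustration, the two forms being compared might look like the
sketch below. It assumes the calculation in question is the initial
capacity for an expected number of mappings at a load factor of 0.75;
the class and method names are hypothetical and not taken from the patch.

```java
// Hypothetical sketch: names are illustrative, not from the JDK or the patch.
final class CapacitySketch {

    // Floating-point form: divide by the load factor and round up.
    // The (int) cast saturates at Integer.MAX_VALUE if the result is too large.
    static int ceilCapacity(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Hand-written integer form: ceil(4n/3) computed as (4n + 2) / 3
    // in long arithmetic, then narrowed to int (which wraps on overflow).
    static int integerCapacity(int expectedSize) {
        return (int) ((expectedSize * 4L + 2L) / 3L);
    }
}
```

For small non-negative sizes both methods return the same value (12
mappings give a capacity of 16 either way). They diverge once 4n/3 no
longer fits in an int: the floating-point cast saturates, while the
integer form wraps.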

I don't think this is terribly important, but I don't like to see
attempts at hand optimization in the standard library.

