RFR: 8264543: Cross modify fence optimization for x86 [v2]

Xubo Zhang github.com+58006833+xbzhang99 at openjdk.java.net
Tue Jul 13 21:43:24 UTC 2021


On Thu, 27 May 2021 17:36:24 GMT, Xubo Zhang <github.com+58006833+xbzhang99 at openjdk.org> wrote:

>> Intel introduced a new instruction “serialize” which ensures that all modifications to flags, registers, and memory by previous instructions are completed and all buffered writes are drained to memory before the next instruction is fetched and executed. It is a serializing instruction and can be used to implement cross modify fence (OrderAccess::cross_modify_fence_impl) more efficiently than using “cpuid” on supported 32-bit and 64-bit x86 platforms.
>> 
>> The availability of the SERIALIZE instruction is indicated by the presence of the CPUID feature flag SERIALIZE, bit 14 of the EDX register in sub-leaf CPUID:7H.0H.
>> 
>> https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>
> Xubo Zhang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
> 
>   8264543: Using Intel serialize instruction to replace cpuid in Cross modify fence, on supported platforms
>   rebase with master

I profiled VM exits for a simple C/asm program that executes the cpuid and serialize instructions inside a virtual machine. The results show that every cpuid caused a VM exit, while serialize did not (excluding fixed overhead):

12000000 asm cpuid:
             VM-EXIT    Samples  Samples%
               CPUID   12000347    99.88%

12000000 asm serialize:
             VM-EXIT    Samples  Samples%
               CPUID        331     6.25%

This shows that replacing cpuid with serialize greatly reduces the number of VM exits, which benefits Java programs running in virtualized environments.
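
For readers who want to try this themselves, here is a rough sketch of the feature check described in the quoted text (CPUID leaf 7, sub-leaf 0, EDX bit 14) together with a fence that prefers serialize and falls back to cpuid. It is not the code from the PR. The names edx_has_serialize, cpu_has_serialize and cross_modify_fence_sketch are made up for illustration, and the sketch assumes GCC or Clang (for <cpuid.h> and inline asm) and emits serialize as raw bytes in case the assembler does not know the mnemonic.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header: __get_cpuid_count, __cpuid */
#endif

/* CPUID.(EAX=7,ECX=0):EDX bit 14 reports SERIALIZE support. */
#define SERIALIZE_EDX_BIT 14

/* Hypothetical helper: decode the SERIALIZE bit from a raw EDX value. */
static int edx_has_serialize(uint32_t edx) {
    return (int)((edx >> SERIALIZE_EDX_BIT) & 1u);
}

/* Hypothetical helper: query the running CPU.
 * Returns 0 on non-x86 builds or when leaf 7 is not available. */
static int cpu_has_serialize(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return edx_has_serialize(edx);
#else
    return 0;
#endif
}

/* Hypothetical fence: use serialize when available, otherwise cpuid.
 * A real implementation would cache the feature check instead of
 * repeating it on every call. Does nothing on non-x86 builds. */
static void cross_modify_fence_sketch(void) {
#if defined(__x86_64__) || defined(__i386__)
    if (cpu_has_serialize()) {
        /* SERIALIZE, encoded as raw bytes 0F 01 E8. */
        __asm__ volatile(".byte 0x0f, 0x01, 0xe8" ::: "memory");
    } else {
        unsigned int eax, ebx, ecx, edx;
        __cpuid(0, eax, ebx, ecx, edx);   /* serializing, but traps in a VM */
        (void)eax; (void)ebx; (void)ecx; (void)edx;
    }
#endif
}
```

Calling cross_modify_fence_sketch() in a loop inside a guest, once with the serialize path and once with the cpuid path forced, should give a comparison similar in spirit to the one above, although the exact counts will depend on the hypervisor and the profiling tool.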

-------------

PR: https://git.openjdk.java.net/jdk/pull/3334

