100% CPU usage in "VM Thread" for Hotspot 10/11 on x64 platform within data processing application

David Holmes - Sun Microsystems David.Holmes at Sun.COM
Thu Feb 19 00:42:18 PST 2009


David,

Those stacks seem weird to me too. The JVM_FindSignal calls don't make 
any sense like that - it's not a recursive function and shouldn't be 
called from msvcrt.dll!free_dbg+0x147. My guess is that the JVM 
part of the stacks is not reliable.
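
To illustrate why frames such as JVM_FindSignal+0x2167d3 are probably not real calls into that function: a stack walker without full symbols typically labels each address with the nearest named symbol at or below it, plus an offset. The sketch below assumes that behaviour; the class name, the symbol table and its addresses are hypothetical and are not taken from any real jvm.dll.

```java
import java.util.Map;
import java.util.TreeMap;

public class NearestSymbol {
    // Label an address as "symbol+0xoffset" using the closest known
    // symbol at or below it; fall back to the raw address otherwise.
    static String resolve(TreeMap<Long, String> symbols, long address) {
        Map.Entry<Long, String> e = symbols.floorEntry(address);
        if (e == null) {
            return "0x" + Long.toHexString(address);
        }
        return e.getValue() + "+0x" + Long.toHexString(address - e.getKey());
    }

    public static void main(String[] args) {
        // Hypothetical export table: only a few named entry points are visible.
        TreeMap<Long, String> symbols = new TreeMap<>();
        symbols.put(0x1000L, "JVM_EnqueueOperation");
        symbols.put(0x5000L, "JVM_FindSignal");

        // An address far past the last named symbol is still attributed to it.
        System.out.println(resolve(symbols, 0x5000L + 0x2167d3L));
    }
}
```

An offset of several megabytes from a small function is a strong hint that the address belongs to some other, unnamed code.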

Sorry, I know that's not really any help.

If I had to take a wild guess I'd say an exception is occurring during 
thread startup ... the VM signal handlers are probably already in place 
but the thread doesn't have enough VM context to be processed by the 
error handling code ... which is eerily similar to something I've seen 
on Linux on rare occasions ...

David Holmes



David Sitsky said the following on 02/19/09 14:42:
> Hi Daniel,
> 
> Thanks for your reply.  FWIW, a number of stacks I have don't mention 
> AsyncGetCallTrace, so perhaps procexp has done a bit of mis-reporting 
> there.  However, all stacks seem to have this kind of pattern:
> 
> ntoskrnl.exe!ExpInterlockedFlushSList+0x126f
> ntoskrnl.exe!KeWaitForMultipleObjects+0xcca
> ntoskrnl.exe!KeWaitForMutexObject+0x2da
> ntoskrnl.exe!_misaligned_access+0x35
> ntoskrnl.exe!MmUnlockPages+0x1160
> ntoskrnl.exe!IoAcquireCancelSpinLock+0x163
> 
> [a bunch of jvm.dll calls]
> 
> msvcrt.dll!free_dbg+0x147
> msvcrt.dll!beginthreadex+0x131
> kernel32.dll!BaseThreadInitThunk+0xd
> ntdll.dll!RtlUserThreadStart+0x21
> 
> or a slight variant:
> 
> ntoskrnl.exe!ExpInterlockedFlushSList+0x126f
> ntoskrnl.exe!RtlNumberOfClearBits+0x5cc
> ntoskrnl.exe!ExpInterlockedFlushSList+0x134d
> ntoskrnl.exe!IoAcquireCancelSpinLock+0x163
> 
> [a bunch of jvm.dll calls]
> 
> msvcrt.dll!free_dbg+0x147
> msvcrt.dll!beginthreadex+0x131
> kernel32.dll!BaseThreadInitThunk+0xd
> ntdll.dll!RtlUserThreadStart+0x21
> 
> Any ideas anyone?
> 
> The call to IoAcquireCancelSpinLock always seems to be the first call 
> outside of jvm.dll in all these stack traces.
> 
> Cheers,
> David
> 
> Daniel D. Daugherty wrote:
>> David,
>>
>> I'm not sure about the rest of your problem, but the
>> mentions of "AsyncGetCallTrace" in your stacks don't
>> mean anything on Windows. It happens to be the nearest
>> named function that the stack trace stuff found.
>>
>> The AsyncGetCallTrace() API isn't supported on Windows
>> and there isn't any code in Sun's JDK that calls
>> AsyncGetCallTrace() on Windows.
>>
>> Hopefully someone else will have more information on
>> what is really happening here.
>>
>> Dan
>>
>>
>> David Sitsky wrote:
>>> I apologise in advance if this is a "breach of protocol".  I have 
>>> submitted a bug through the usual channels, but my experience with 
>>> this approach unfortunately has usually been a dead-end.
>>>
>>> I have a very intensive data application (I/O + CPU + memory) on the 
>>> Windows platform that reliably causes a 100% CPU lockup, but only for 
>>> the x64 distribution (jdk 1.6.0_07, 1.6.0_10 and 1.6.0_12).  When 
>>> using the 32-bit (x86) distribution on the same machines it works fine.  It is 
>>> not machine-specific - I have seen this across 7 different machines, 
>>> and it seems to occur after a few to several hours of processing.  
>>> The JVM is still responsive, but extremely slow.
>>>
>>> Using process explorer, I was able to find the thread in the process 
>>> consuming all the CPU.  The stack traces from procexp have the same 
>>> thread ID as the "VM Thread" from jstack.  The stacks are usually 
>>> something like the following:
>>>
>>> ntoskrnl.exe!ExpInterlockedFlushSList+0x126f
>>> ntoskrnl.exe!RtlNumberOfClearBits+0x5cc
>>> ntoskrnl.exe!ExpInterlockedFlushSList+0x134d
>>> ntoskrnl.exe!IoAcquireCancelSpinLock+0x163
>>> jvm.dll!AsyncGetCallTrace+0x3ac7f
>>> jvm.dll!JVM_FindSignal+0xe4533
>>> jvm.dll!JVM_FindSignal+0x10fa2e
>>> jvm.dll!JVM_FindSignal+0xe3eef
>>> jvm.dll!JVM_FindSignal+0x14d61d
>>> jvm.dll!JVM_FindSignal+0x14dabc
>>> jvm.dll!JVM_FindSignal+0x1593e0
>>> jvm.dll!JVM_FindSignal+0x12a374
>>> jvm.dll!JVM_FindSignal+0x2167d3
>>> jvm.dll!JVM_FindSignal+0x2193e8
>>> jvm.dll!JVM_FindSignal+0x218274
>>> jvm.dll!JVM_FindSignal+0x2186ca
>>> jvm.dll!JVM_FindSignal+0x218cd2
>>> jvm.dll!JVM_FindSignal+0x11d7f9
>>> msvcrt.dll!free_dbg+0x147
>>> msvcrt.dll!beginthreadex+0x131
>>> kernel32.dll!BaseThreadInitThunk+0xd
>>> ntdll.dll!RtlUserThreadStart+0x21
>>>
>>> or
>>>
>>> ntoskrnl.exe!ExpInterlockedFlushSList+0x126f
>>> ntoskrnl.exe!KeWaitForMultipleObjects+0xcca
>>> ntoskrnl.exe!KeWaitForMutexObject+0x2da
>>> ntoskrnl.exe!_misaligned_access+0x35
>>> ntoskrnl.exe!MmUnlockPages+0x1160
>>> ntoskrnl.exe!IoAcquireCancelSpinLock+0x163
>>> jvm.dll!AsyncGetCallTrace+0x3ac7f
>>> jvm.dll!JVM_FindSignal+0xe3eef
>>> jvm.dll!JVM_FindSignal+0x14d61d
>>> jvm.dll!JVM_FindSignal+0x14dabc
>>> jvm.dll!JVM_FindSignal+0x1593e0
>>> jvm.dll!JVM_FindSignal+0x12a374
>>> jvm.dll!JVM_FindSignal+0x2167d3
>>> jvm.dll!JVM_FindSignal+0x2193e8
>>> jvm.dll!JVM_FindSignal+0x218274
>>> jvm.dll!JVM_FindSignal+0x2186ca
>>> jvm.dll!JVM_FindSignal+0x218cd2
>>> jvm.dll!JVM_FindSignal+0x11d7f9
>>> msvcrt.dll!free_dbg+0x147
>>> msvcrt.dll!beginthreadex+0x131
>>> kernel32.dll!BaseThreadInitThunk+0xd
>>> ntdll.dll!RtlUserThreadStart+0x21
>>>
>>> or (without AsyncGetCallTrace):
>>>
>>> ntoskrnl.exe!ExpInterlockedFlushSList+0x14a0
>>> ntoskrnl.exe!IoAcquireCancelSpinLock+0x163
>>> jvm.dll!JVM_EnqueueOperation+0xb19f4
>>> jvm.dll!JVM_FindSignal+0x1b7fcc
>>> jvm.dll!JVM_FindSignal+0x14dcca
>>> jvm.dll!JVM_FindSignal+0x14e17c
>>> jvm.dll!JVM_FindSignal+0x159b80
>>> jvm.dll!JVM_FindSignal+0x12a5e4
>>> jvm.dll!JVM_FindSignal+0x216e53
>>> jvm.dll!JVM_FindSignal+0x219a38
>>> jvm.dll!JVM_FindSignal+0x2188d4
>>> jvm.dll!JVM_FindSignal+0x218d2a
>>> jvm.dll!JVM_FindSignal+0x219332
>>> jvm.dll!JVM_FindSignal+0x11da99
>>> msvcrt.dll!free_dbg+0x147
>>> msvcrt.dll!beginthreadex+0x131
>>> kernel32.dll!BaseThreadInitThunk+0xd
>>> ntdll.dll!RtlUserThreadStart+0x21
>>>
>>> I am more than happy to run test/debug versions in order to assist in 
>>> tracking this down.  I wish I could say, here is a unit test, but it's 
>>> a very complex application with complex data processing.  The only 
>>> good news is it seems to be quite reproducible on our systems.
>>>
>>> Apologies again in advance if this was an inappropriate place to 
>>> post. But given the severity of this issue, I am hoping somebody here 
>>> will be interested in it.
>>>
> 
> 
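
The original report matches the busy thread in Process Explorer to the "VM Thread" in jstack by thread ID. A minimal sketch of that comparison follows, assuming jstack prints the native thread ID as a hexadecimal nid=0x... field and Process Explorer lists thread IDs in decimal. The class name and the sample jstack line are hypothetical, not output from the reported system.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NidMatcher {
    private static final Pattern NID = Pattern.compile("nid=0x([0-9a-fA-F]+)");

    // Extract the native thread ID from a jstack thread header line,
    // or return -1 if the line has no nid field.
    static long nidOf(String jstackLine) {
        Matcher m = NID.matcher(jstackLine);
        return m.find() ? Long.parseLong(m.group(1), 16) : -1L;
    }

    // True if the jstack line describes the thread with the given decimal TID.
    static boolean sameThread(String jstackLine, long decimalTid) {
        long nid = nidOf(jstackLine);
        return nid >= 0 && nid == decimalTid;
    }

    public static void main(String[] args) {
        // Hypothetical jstack header line for illustration only.
        String line = "\"VM Thread\" prio=10 tid=0x0000000005a3c000 nid=0x1a2c runnable";
        System.out.println(sameThread(line, 6700L)); // 0x1a2c == 6700
    }
}
```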


