RFR(M): 8166317: InterpreterCodeSize should be computed
Christian Thalinger
cthalinger at twitter.com
Mon Oct 9 19:45:58 UTC 2017
> On Oct 9, 2017, at 9:24 AM, Volker Simonis <volker.simonis at gmail.com> wrote:
>
> Hi Vladimir,
>
> I've analyzed the crash. The problem is Sparc specific because on
> Sparc we do not call the SharedRuntime for G1 pre/post barriers (i.e.
> SharedRuntime::g1_wb_pre() / SharedRuntime::g1_wb_post()) like on
> other architectures. Instead we lazily create assembler stubs on the
> fly (generate_satb_log_enqueue_if_necessary() /
> generate_dirty_card_log_enqueue_if_necessary()) when they are needed.
Why are we using these stubs on SPARC? Can we get rid of them and just call into the runtime instead?
> This happens during the generation of the interpreter and allocates
> more memory in the code cache such that we can't shrink the memory
> which was initially allocated for the interpreter any more.
>
> Unfortunately we can't easily generate these stubs during
> 'stubRoutines_init1()' because
> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map
> base address which is only initialized in
> 'CardTableModRefBS::initialize()' during 'univers_init()' which
> happens after 'stubRoutines_init1()'.
>
> I'm still thinking about a good way to fix this without too many
> platfrom-specific ifdefs.
>
> Regards,
> Volker
>
>
> On Tue, Oct 3, 2017 at 9:46 PM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>> I rebased it. But there is problem with changes. VM hit guarantee() in this
>> code when run on SPARC in both, fastdebug and product, builds.
>> Crash happens during build. We can't push this - problem should be
>> investigated and fixed first.
>>
>> Thanks,
>> Vladimir
>>
>> make/Main.gmk:443: recipe for target 'generate-link-opt-data' failed
>> /usr/ccs/bin/bash: line 4: 9349 Abort (core dumped)
>> /s/build/solaris-sparcv9-debug/support/interim-image/bin/java
>> -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist
>> -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -cp
>> /s/build/solaris-sparcv9-debug/support/classlist.jar
>> build.tools.classlist.HelloClasslist 2>&1 >
>> /s/build/solaris-sparcv9-debug/support/link_opt/default_jli_trace.txt
>> make[3]: *** [/s/build/solaris-sparcv9-debug/support/link_opt/classlist]
>> Error 134
>> make[2]: *** [generate-link-opt-data] Error 1
>>
>>
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (/s/open/src/hotspot/share/memory/heap.cpp:233), pid=9349,
>> tid=2
>> # guarantee(b == block_at(_next_segment - actual_number_of_segments))
>> failed: Intermediate allocation!
>> #
>> # JRE version: (10.0) (fastdebug build )
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
>> 10-internal+0-2017-09-30-014154.8166317, mixed mode, tiered, compressed
>> oops, g1 gc, solaris-sparc)
>> # Core dump will be written. Default location: /s/open/make/core or
>> core.9349
>> #
>> # If you would like to submit a bug report, please visit:
>> # http://bugreport.java.com/bugreport/crash.jsp
>> #
>>
>> --------------- S U M M A R Y ------------
>>
>> Command Line:
>> -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist
>> -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true
>> build.tools.classlist.HelloClasslist
>>
>> Host: sca00dbv, Sparcv9 64 bit 3600 MHz, 16 cores, 32G, Oracle Solaris 11.2
>> SPARC
>> Time: Sat Sep 30 03:29:46 2017 UTC elapsed time: 0 seconds (0d 0h 0m 0s)
>>
>> --------------- T H R E A D ---------------
>>
>> Current thread (0x000000010012f000): JavaThread "Unknown thread"
>> [_thread_in_vm, id=2, stack(0x0007fffef9700000,0x0007fffef9800000)]
>>
>> Stack: [0x0007fffef9700000,0x0007fffef9800000], sp=0x0007fffef97ff020,
>> free space=1020k
>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>> j=interpreted, Vv=VM code, C=native code)
>> V [libjvm.so+0x1f94508] void VMError::report_and_die(int,const char*,const
>> char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned
>> long)+0xa58
>> V [libjvm.so+0x1f93a3c] void VMError::report_and_die(Thread*,const
>> char*,int,const char*,const char*,void*)+0x3c
>> V [libjvm.so+0xd02f38] void report_vm_error(const char*,int,const
>> char*,const char*,...)+0x78
>> V [libjvm.so+0xfc219c] void CodeHeap::deallocate_tail(void*,unsigned
>> long)+0xec
>> V [libjvm.so+0xbf4f14] void CodeCache::free_unused_tail(CodeBlob*,unsigned
>> long)+0xe4
>> V [libjvm.so+0x1e0ae70] void StubQueue::deallocate_unused_tail()+0x40
>> V [libjvm.so+0x1e7452c] void TemplateInterpreter::initialize()+0x19c
>> V [libjvm.so+0x1051220] void interpreter_init()+0x20
>> V [libjvm.so+0x10116e0] int init_globals()+0xf0
>> V [libjvm.so+0x1ed8548] int
>> Threads::create_vm(JavaVMInitArgs*,bool*)+0x4a8
>> V [libjvm.so+0x11c7b58] int
>> JNI_CreateJavaVM_inner(JavaVM_**,void**,void*)+0x108
>> C [libjli.so+0x7950] InitializeJVM+0x100
>>
>>
>> On 10/2/17 7:55 AM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> I can sponsor this for you once you rebase, and fix these compilation
>>> errors.
>>> Thanks,
>>> Coleen
>>>
>>> On 9/30/17 12:28 AM, Volker Simonis wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>> thanks a lot for remembering these changes!
>>>>
>>>> Regards,
>>>> Volker
>>>>
>>>>
>>>> Vladimir Kozlov <vladimir.kozlov at oracle.com
>>>> <mailto:vladimir.kozlov at oracle.com>> schrieb am Fr. 29. Sep. 2017 um 15:47:
>>>>
>>>> I hit build failure when tried to push changes:
>>>>
>>>> src/hotspot/share/code/codeBlob.hpp(162) : warning C4267: '=' :
>>>> conversion from 'size_t' to 'int', possible loss of data
>>>> src/hotspot/share/code/codeBlob.hpp(163) : warning C4267: '=' :
>>>> conversion from 'size_t' to 'int', possible loss of data
>>>>
>>>> I am going to fix it by casting (int):
>>>>
>>>> + void adjust_size(size_t used) {
>>>> + _size = (int)used;
>>>> + _data_offset = (int)used;
>>>> + _code_end = (address)this + used;
>>>> + _data_end = (address)this + used;
>>>> + }
>>>>
>>>> Note, CodeCache size can't more than 2Gb (max_int) so such casting is
>>>> fine.
>>>>
>>>> Vladimir
>>>>
>>>> On 9/6/17 6:20 AM, Volker Simonis wrote:
>>>>> On Tue, Sep 5, 2017 at 9:36 PM, <coleen.phillimore at oracle.com
>>>> <mailto:coleen.phillimore at oracle.com>> wrote:
>>>>>>
>>>>>> I was going to make the same comment about the friend declaration
>>>> in v1, so
>>>>>> v2 looks better to me. Looks good. Thank you for finding a
>>>> solution to
>>>>>> this problem that we've had for a long time. I will sponsor this
>>>> (remind me
>>>>>> if I forget after the 18th).
>>>>>>
>>>>>
>>>>> Thanks Coleen! I've updated
>>>>>
>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/
>>>> <http://cr.openjdk.java.net/%7Esimonis/webrevs/2017/8166317.v2/>
>>>>>
>>>>> in-place and added you as a second reviewer.
>>>>>
>>>>> Regards,
>>>>> Volker
>>>>>
>>>>>
>>>>>> thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 9/5/17 1:17 PM, Vladimir Kozlov wrote:
>>>>>>>
>>>>>>> On 9/5/17 9:49 AM, Volker Simonis wrote:
>>>>>>>>
>>>>>>>> On Fri, Sep 1, 2017 at 6:16 PM, Vladimir Kozlov
>>>>>>>> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>>
>>>> wrote:
>>>>>>>>>
>>>>>>>>> May be add new CodeBlob's method to adjust sizes instead of
>>>> directly
>>>>>>>>> setting
>>>>>>>>> them in CodeCache::free_unused_tail(). Then you would not need
>>>> friend
>>>>>>>>> class
>>>>>>>>> CodeCache in CodeBlob.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Changed as suggested (I didn't liked the friend declaration as
>>>> well :)
>>>>>>>>
>>>>>>>>> Also I think adjustment to header_size should be done in
>>>>>>>>> CodeCache::free_unused_tail() to limit scope of code who knows
>>>> about
>>>>>>>>> blob
>>>>>>>>> layout.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, that's much cleaner. Please find the updated webrev here:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/
>>>> <http://cr.openjdk.java.net/%7Esimonis/webrevs/2017/8166317.v2/>
>>>>
>>>>>>>
>>>>>>>
>>>>>>> Good.
>>>>>>>
>>>>>>>>
>>>>>>>> I've also found another "day 1" problem in StubQueue::next():
>>>>>>>>
>>>>>>>> Stub* next(Stub* s) const { int i =
>>>>>>>> index_of(s) + stub_size(s);
>>>>>>>> - if (i ==
>>>>>>>> _buffer_limit) i = 0;
>>>>>>>> + // Only wrap
>>>>>>>> around in the non-contiguous case (see stubss.cpp)
>>>>>>>> + if (i ==
>>>>>>>> _buffer_limit && _queue_end < _buffer_limit) i = 0;
>>>>>>>> return (i ==
>>>>>>>> _queue_end) ? NULL : stub_at(i);
>>>>>>>> }
>>>>>>>>
>>>>>>>> The problem was that the method was not prepared to handle the
>>>> case
>>>>>>>> where _buffer_limit == _queue_end == _buffer_size which lead to
>>>> an
>>>>>>>> infinite recursion when iterating over a StubQueue with
>>>>>>>> StubQueue::next() until next() returns NULL (as this was for
>>>> example
>>>>>>>> done with -XX:+PrintInterpreter). But with the new, trimmed
>>>> CodeBlob
>>>>>>>> we run into exactly this situation.
>>>>>>>
>>>>>>>
>>>>>>> Okay.
>>>>>>>
>>>>>>>>
>>>>>>>> While doing this last fix I also noticed that
>>>> "StubQueue::stubs_do()",
>>>>>>>> "StubQueue::queues_do()" and "StubQueue::register_queue()" don't
>>>> seem
>>>>>>>> to be used anywhere in the open code base (please correct me if
>>>> I'm
>>>>>>>> wrong). What do you think, maybe we should remove this code in a
>>>>>>>> follow up change if it is really not needed?
>>>>>>>
>>>>>>>
>>>>>>> register_queue() is used in constructor. Other 2 you can remove.
>>>>>>> stub_code_begin() and stub_code_end() are not used too -remove.
>>>>>>> I thought we run on linux with flag which warn about unused code.
>>>>>>>
>>>>>>>>
>>>>>>>> Finally, could you please run the new version through JPRT and
>>>> sponsor
>>>>>>>> it once jdk10/hs will be opened again?
>>>>>>>
>>>>>>>
>>>>>>> Will do when jdk10 "consolidation" is finished. Please, remind me
>>>> later if
>>>>>>> I forget.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Volker
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/1/17 8:46 AM, Volker Simonis wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I've decided to split the fix for the
>>>> 'CodeHeap::contains_blob()'
>>>>>>>>>> problem into its own issue "8187091: ReturnBlobToWrongHeapTest
>>>> fails
>>>>>>>>>> because of problems in CodeHeap::contains_blob()"
>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-8187091) and started
>>>> a new
>>>>>>>>>> review thread for discussing it at:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028206.html
>>>>>>>>>>
>>>>>>>>>> So please lets keep this thread for discussing the interpreter
>>>> code
>>>>>>>>>> size issue only. I've prepared a new version of the webrev
>>>> which is
>>>>>>>>>> the same as the first one with the only difference that the
>>>> change to
>>>>>>>>>> 'CodeHeap::contains_blob()' has been removed:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v1/
>>>> <http://cr.openjdk.java.net/%7Esimonis/webrevs/2017/8166317.v1/>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Volker
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 31, 2017 at 6:35 PM, Volker Simonis
>>>>>>>>>> <volker.simonis at gmail.com <mailto:volker.simonis at gmail.com>>
>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 31, 2017 at 6:05 PM, Vladimir Kozlov
>>>>>>>>>>> <vladimir.kozlov at oracle.com
>>>> <mailto:vladimir.kozlov at oracle.com>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Very good change. Thank you, Volker.
>>>>>>>>>>>>
>>>>>>>>>>>> About contains_blob(). The problem is that AOTCompiledMethod
>>>>>>>>>>>> allocated
>>>>>>>>>>>> in
>>>>>>>>>>>> CHeap and not in aot code section (which is RO):
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124
>>>>>>>>>>>>
>>>>>>>>>>>> It is allocated in CHeap after AOT library is loaded. Its
>>>>>>>>>>>> code_begin()
>>>>>>>>>>>> points to AOT code section but AOTCompiledMethod* points
>>>> outside it
>>>>>>>>>>>> (to
>>>>>>>>>>>> normal malloced space) so you can't use (char*)blob address.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the explanation - now I got it.
>>>>>>>>>>>
>>>>>>>>>>>> There are 2 ways to fix it, I think.
>>>>>>>>>>>> One is to add new field to CodeBlobLayout and set it to
>>>> blob* address
>>>>>>>>>>>> for
>>>>>>>>>>>> normal CodeCache blobs and to code_begin for AOT code.
>>>>>>>>>>>> Second is to use contains(blob->code_end() - 1) assuming
>>>> that AOT
>>>>>>>>>>>> code
>>>>>>>>>>>> is
>>>>>>>>>>>> never zero.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'll give it a try tomorrow and will send out a new webrev.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Volker
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/31/17 5:43 AM, Volker Simonis wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 12:14 PM, Claes Redestad
>>>>>>>>>>>>> <claes.redestad at oracle.com
>>>> <mailto:claes.redestad at oracle.com>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2017-08-31 08:54, Volker Simonis wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> While working on this, I found another problem which is
>>>> related to
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> fix of JDK-8183573 and leads to crashes when executing
>>>> the JTreg
>>>>>>>>>>>>>>> test
>>>>>>>>>>>>>>> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem is that JDK-8183573 replaced
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> virtual bool contains_blob(const CodeBlob* blob)
>>>> const {
>>>>>>>>>>>>>>> return
>>>>>>>>>>>>>>> low_boundary() <= (char*) blob && (char*) blob < high();
>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> by:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> bool contains_blob(const CodeBlob* blob) const {
>>>> return
>>>>>>>>>>>>>>> contains(blob->code_begin()); }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But that my be wrong in the corner case where the size of
>>>> the
>>>>>>>>>>>>>>> CodeBlob's payload is zero (i.e. the CodeBlob consists
>>>> only of the
>>>>>>>>>>>>>>> 'header' - i.e. the C++ object itself) because in that
>>>> case
>>>>>>>>>>>>>>> CodeBlob::code_begin() points right behind the CodeBlob's
>>>> header
>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>> is a memory location which doesn't belong to the CodeBlob
>>>> anymore.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I recall this change was somehow necessary to allow
>>>> merging
>>>>>>>>>>>>>> AOTCodeHeap::contains_blob and CodeHead::contains_blob
>>>> into
>>>>>>>>>>>>>> one devirtualized method, so you need to ensure all AOT
>>>> tests
>>>>>>>>>>>>>> pass with this change (on linux-x64).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> All of hotspot/test/aot and hotspot/test/jvmci executed and
>>>> passed
>>>>>>>>>>>>> successful. Are there any other tests I should check?
>>>>>>>>>>>>>
>>>>>>>>>>>>> That said, it is a little hard to follow the stages of your
>>>> change.
>>>>>>>>>>>>> It
>>>>>>>>>>>>> seems like
>>>>>>>>>>>>>
>>>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.00/
>>>> <http://cr.openjdk.java.net/%7Eredestad/scratch/codeheap_contains.00/>
>>>>>>>>>>>>> was reviewed [1] but then finally the slightly changed
>>>> version from
>>>>>>>>>>>>>
>>>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.01/
>>>> <http://cr.openjdk.java.net/%7Eredestad/scratch/codeheap_contains.01/>
>>>>
>>>>>>>>>>>>> was
>>>>>>>>>>>>> checked in and linked to the bug report.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The first, reviewed version of the change still had a
>>>> correct
>>>>>>>>>>>>> version
>>>>>>>>>>>>> of 'CodeHeap::contains_blob(const CodeBlob* blob)' while
>>>> the second,
>>>>>>>>>>>>> checked in version has the faulty version of that method.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't know why you finally did that change to
>>>> 'contains_blob()'
>>>>>>>>>>>>> but
>>>>>>>>>>>>> I don't see any reason why we shouldn't be able to directly
>>>> use the
>>>>>>>>>>>>> blob's address for inclusion checking. From what I
>>>> understand, it
>>>>>>>>>>>>> should ALWAYS be contained in the corresponding CodeHeap so
>>>> no
>>>>>>>>>>>>> reason
>>>>>>>>>>>>> to mess with 'CodeBlob::code_begin()'.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know if I'm missing something.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-July/026624.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can't help to wonder if we'd not be better served by
>>>> disallowing
>>>>>>>>>>>>>> zero-sized payloads. Is this something that can ever
>>>> actually
>>>>>>>>>>>>>> happen except by abuse of the white box API?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The corresponding test (ReturnBlobToWrongHeapTest.java)
>>>> specifically
>>>>>>>>>>>>> wants to allocate "segment sized" blocks which is most
>>>> easily
>>>>>>>>>>>>> achieved
>>>>>>>>>>>>> by allocation zero-sized CodeBlobs. And I think there's
>>>> nothing
>>>>>>>>>>>>> wrong
>>>>>>>>>>>>> about it if we handle the inclusion tests correctly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you and best regards,
>>>>>>>>>>>>> Volker
>>>>>>>>>>>>>
>>>>>>>>>>>>>> /Claes
>>>>>>
>>>>>>
>>>>
>>>
>>
More information about the hotspot-dev
mailing list