RFR(S): 8019929: PPC64 (part 107): Extend ELF-decoder to support PPC64 function descriptor tables
Vladimir Kozlov
vladimir.kozlov at oracle.com
Thu Dec 5 10:49:35 PST 2013
Looks good to me. I will wait for others to look at the new changes and push
after that.
Thanks,
Vladimir
On 12/5/13 10:37 AM, Volker Simonis wrote:
> Hi,
>
> so here it comes, the hopefully final webrev of this change:)
>
> http://cr.openjdk.java.net/~simonis/webrevs/8019929.v3/
>
> I've:
> - fixed the comments for the ifdefs as requested by Vladimir
> - fixed the "else if" indentation as requested by Vladimir
> - fixed the out-of-memory situation detected by Vitaly
>
> I've also added a detailed description of why we need all this on PPC64 to
> elfFuncDescTable.hpp and fixed a small problem for old-style PPC64 objects
> with additional 'dot'-symbols (see description below). The correct handling
> of these old-style files also required another small '#ifdef PPC64' section
> in decoder_linux.cpp (for a detailed description why this is necessary see
> below). I hope that's OK.
>
> Thank you and best regards,
> Volker
>
> Detailed change description:
>
> On PowerPC-64 (and other architectures like for example IA64) a pointer to
> a function is not just a plain code address, but instead a pointer to a so
> called function descriptor (which is simply a structure containing 3
> pointers). This fact is also reflected in the ELF ABI for PowerPC-64.
>
> On architectures like x86 or SPARC, the ELF symbol table contains the start
> address and size of an object. So for example for a function object (i.e.
> type 'STT_FUNC') the symbol table's 'st_value' and 'st_size' fields
> directly represent the starting address and size of that function. On PPC64
> however, the symbol table's 'st_value' field only contains an index into
> another, PPC64 specific '.opd' (official procedure descriptors) section,
> while the 'st_size' field still holds the size of the corresponding
> function. In order to get the actual start address of a function, it is
> necessary to read the corresponding function descriptor entry in the '.opd'
> section at the corresponding index and extract the start address from
> there.
>
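As a rough illustration of the indirection described above, the following sketch resolves a symbol's 'st_value' through an in-memory copy of the '.opd' section. The names FuncDescriptor and resolve_entry are hypothetical and are not taken from the webrev. The sketch also assumes that 'st_value' is an address inside '.opd' (so the section's start address is subtracted first) and that the section bytes are in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout of one '.opd' entry: three 64-bit pointers.
struct FuncDescriptor {
  uint64_t entry;  // address of the function's first instruction
  uint64_t toc;    // TOC base used by the function
  uint64_t env;    // environment pointer
};

// Resolves a symbol's st_value to the real code address.
// opd_data: raw bytes of the '.opd' section
// opd_addr: start address of the section (sh_addr)
// opd_size: size of the section in bytes (sh_size)
// Returns 0 if st_value does not name a complete descriptor in the section.
uint64_t resolve_entry(const unsigned char* opd_data, uint64_t opd_addr,
                       uint64_t opd_size, uint64_t st_value) {
  if (st_value < opd_addr) return 0;
  uint64_t offset = st_value - opd_addr;
  if (offset > opd_size || opd_size - offset < sizeof(FuncDescriptor)) return 0;
  FuncDescriptor fd;
  std::memcpy(&fd, opd_data + offset, sizeof(fd));  // avoids unaligned reads
  return fd.entry;
}
```

With the numbers from the 'readelf' listing quoted later in this message, 'foo' has the value 0x114c0 and '.opd' starts at 0x113b8, so the descriptor would be read at offset 0x108 of the section.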
> That's exactly what this 'ElfFuncDescTable' class is used for. If
> HotSpot runs on a PPC64 machine and the corresponding ELF file contains
> an '.opd' section (which is actually mandatory on PPC64), it will be read
> into an object of type 'ElfFuncDescTable' just like the string and symbol
> table sections. Later on, during symbol lookup in
> 'ElfSymbolTable::lookup()' this function descriptor table will be used if
> available to find the real function address.
>
> All this is how things work today (2013) on contemporary Linux
> distributions (e.g. SLES 10) and newer versions of GCC (i.e. > 4.0). However
> there is a history, and it goes like this:
>
> In SLES 9 times (some time before GCC 3.4) gcc/ld on PPC64 generated two
> entries in the symbol table for every function. The value of the symbol
> with the name of the function was the address of the function descriptor
> while the dot '.' prefixed name was reserved to hold the actual address of
> that function (
> http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html#FUNC-DES).
>
>
> For a C-function 'foo' this resulted in two symbol table entries like this
> (extracted from the output of 'readelf -a '):
>
> Section Headers:
>   [ 9] .text  PROGBITS  0000000000000a20  00000a20
>        00000000000005a0  0000000000000000  AX  0  0  16
>   [21] .opd   PROGBITS  00000000000113b8  000013b8
>        0000000000000138  0000000000000000  WA  0  0  8
>
> Symbol table '.symtab' contains 86 entries:
>    Num:    Value          Size Type    Bind   Vis      Ndx Name
>     76: 00000000000114c0    24 FUNC    GLOBAL DEFAULT   21 foo
>     78: 0000000000000bb0    76 FUNC    GLOBAL DEFAULT    9 .foo
>
> You can see now that the '.foo' entry actually points into the '.text'
> section ('Ndx'=9) and its value and size fields represent the function's
> actual address and size. On the other hand, the entry for plain 'foo'
> points into the '.opd' section ('Ndx'=21) and its value and size fields are
> the index into the '.opd' section and the size of the corresponding '.opd'
> section entry (3 pointers on PPC64).
>
> These so called 'dot symbols' were dropped around gcc 3.4 from GCC and
> BINUTILS, see http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00557.html. But
> nevertheless it may still be necessary to support both formats, either
> because we run on an old system or because it is possible at any time that
> functions appear in the stack trace which come from old-style libraries.
>
> Therefore we not only have to check for the presence of the function
> descriptor table during symbol lookup in 'ElfSymbolTable::lookup()'. We
> additionally have to check that the symbol table entry references the
> '.opd' section. Only in that case can we resolve the actual function
> address from there. Otherwise we use the plain 'st_value' field from the
> symbol table as function address. This way we can also look up the symbols
> in old-style ELF libraries (although we get the 'dotted' versions in that
> case). However, if present, the 'dot' will be conditionally removed on
> PPC64 from the symbol in 'ElfDecoder::demangle()' in decoder_linux.cpp.
>
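The decision described above can be sketched roughly as follows. SymEntry, needs_descriptor_lookup and strip_dot are hypothetical names chosen for illustration and do not reproduce the code in the webrev. The sketch assumes that the section index of '.opd' is known from the section headers.

```cpp
#include <cstdint>
#include <string>

// Simplified stand-in for the symbol table fields used here (hypothetical).
struct SymEntry {
  uint64_t st_value;  // address, or location of the descriptor in '.opd'
  uint64_t st_size;   // size of the function or of the '.opd' entry
  uint16_t st_shndx;  // index of the section the symbol refers to
};

// True when st_value has to be resolved through the '.opd' section.
// Old-style dot symbols point into '.text' and are used as they are.
bool needs_descriptor_lookup(const SymEntry& sym, bool has_opd,
                             uint16_t opd_shndx) {
  return has_opd && sym.st_shndx == opd_shndx;
}

// Drops one leading '.' from an old-style dot symbol such as ".foo".
std::string strip_dot(const std::string& name) {
  if (name.size() > 1 && name[0] == '.') return name.substr(1);
  return name;
}
```

For the two entries in the listing above, 'foo' ('Ndx'=21) would go through the descriptor table, while '.foo' ('Ndx'=9) would be used directly and reported as 'foo'.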
> Notice that we can not reliably get the function address from old-style
> libraries because the 'st_size' field of the symbol table entries which
> point into the '.opd' section denotes the size of the corresponding '.opd'
> entry and not that of the corresponding function. This has changed for the
> symbol table entries in new-style libraries as described at the beginning
> of this documentation.
>
> This change also slightly improves the implementation of
> ElfSymbolTable::lookup(). Before, the method always iterated over all
> symbols in the symbol table and returned the one with the highest address
> below the requested addr argument. This not only could take a significant
> amount of time for big libraries, it could also return bogus symbols for
> addresses which were not really covered by that symbol table at all. The
> new version additionally uses the symbol table's st_size field to verify
> that the requested addr argument is indeed within the range covered by the
> corresponding symbol table entry. If so, the search is stopped and the
> symbol is returned immediately.
>
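A minimal sketch of the range check described above, under the assumption that the symbols are available as a plain array. RangeSym and lookup_by_range are hypothetical names, and the real ElfSymbolTable::lookup() reads its entries from the ELF file instead.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified symbol entry (hypothetical stand-in for the ELF symbol type).
struct RangeSym {
  uint64_t st_value;  // start address of the function
  uint64_t st_size;   // size of the function in bytes
  const char* name;
};

// Returns the first symbol whose range [st_value, st_value + st_size)
// contains addr, or nullptr if no symbol covers addr. The search stops at
// the first hit instead of scanning the whole table.
const RangeSym* lookup_by_range(const RangeSym* syms, std::size_t count,
                                uint64_t addr) {
  for (std::size_t i = 0; i < count; ++i) {
    const RangeSym& s = syms[i];
    // Written as a subtraction so that st_value + st_size cannot overflow.
    if (s.st_value <= addr && addr - s.st_value < s.st_size) {
      return &s;
    }
  }
  return nullptr;
}
```

An address that lies outside every symbol's range yields nullptr here, where a "highest address below addr" search would have returned an unrelated symbol.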
>
>
>
> On Thu, Dec 5, 2013 at 9:22 AM, Volker Simonis <volker.simonis at gmail.com> wrote:
>
>> Hi Vitaly,
>>
>> you're right - I'll fix it.
>>
>> Thanks,
>> Volker
>>
>>
>> On Thu, Dec 5, 2013 at 2:15 AM, Vitaly Davidovich <vitalyd at gmail.com>
>> wrote:
>>> Ok.
>>>
>>>   if (string_table->string_at(shdr.sh_name, buf, sizeof(buf)) &&
>>>       !strncmp(".opd", buf, 4)) {
>>>     m_funcDesc_table = new (std::nothrow) ElfFuncDescTable(m_file, shdr);
>>>     break;
>>>
>>> So if that alloc fails, I see that code handles a null m_funcDesc_table
>>> where it's used. But is that what you want for PPC64? Won't you get wrong
>>> symbol info? Code reading other tables does this for OOM cases:
>>>
>>> m_status = NullDecoder::out_of_memory;
>>> return false;
>>>
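The pattern referred to here could look roughly like the sketch below. DescTable, ElfFileSketch and DecoderStatus are hypothetical stand-ins for the HotSpot classes, and the simulate_oom switch exists only to make the failure path reachable in a small example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical status codes, loosely modelled on NullDecoder::out_of_memory.
enum DecoderStatus { no_error, out_of_memory };

// Stand-in for the table class with a switchable allocation failure.
struct DescTable {
  static bool simulate_oom;
  static void* operator new(std::size_t size, const std::nothrow_t&) noexcept {
    return simulate_oom ? nullptr : std::malloc(size);
  }
  static void operator delete(void* p) noexcept { std::free(p); }
  static void operator delete(void* p, const std::nothrow_t&) noexcept {
    std::free(p);
  }
};
bool DescTable::simulate_oom = false;

struct ElfFileSketch {
  DescTable* table = nullptr;
  DecoderStatus status = no_error;
  ~ElfFileSketch() { delete table; }

  // Records the failure instead of silently continuing without the table.
  bool load_table() {
    table = new (std::nothrow) DescTable();
    if (table == nullptr) {
      status = out_of_memory;
      return false;
    }
    return true;
  }
};
```

Reporting the failure through the status field lets a caller distinguish "no '.opd' section present" from "section present but could not be loaded".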
>>> Sent from my phone
>>>
>>> On Dec 4, 2013 2:17 PM, "Volker Simonis" <volker.simonis at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> thanks for the comments.
>>>>
>>>> I think the function descriptor table logically belongs to the
>>>> ELF-file itself and not to symbol table. An ELF file can have several
>>>> symbol tables but just one function descriptor table. Also, the
>>>> function descriptor table is read in when the ELF file is opened (i.e.
>>>> in the ElfFile constructor).
>>>>
>>>> So after Vladimir's suggestion to remove most of the "#ifdef PPC64"
>>>> there was no reason to keep the ElfFuncDescTable class in
>>>> elfSymbolTable.{hpp,cpp} so I created two new files
>>>> elfFuncDescTable.{hpp, cpp} for it. Now the only remaining "#ifdef
>>>> PPC64" is in ElfFile::load_tables() when the function descriptor table
>>>> is loaded (as requested by Vladimir).
>>>>
>>>> But actually, the corresponding '.opd' section is only available on
>>>> PPC64 (see
>>>> http://refspecs.linuxfoundation.org/LSB_3.1.1/LSB-Core-PPC64/LSB-Core-PPC64/specialsections.html)
>>>> and I don't think the code would do any harm if it were executed on
>>>> a non-PPC64 system: the '.opd' section would just not be found. I
>>>> also think the corresponding performance impact would be minimal
>>>> compared to the loading of the symbol and string tables. So I tend to
>>>> remove the last "#ifdef PPC64" as well. What do you think? I'm OK
>>>> with both solutions.
>>>>
>>>> Below is a webrev with the described changes (and still with the last
>>>> "#ifdef PPC64" in ElfFile::load_tables()):
>>>>
>>>> http://cr.openjdk.java.net/~simonis/webrevs/8019929.v2/
>>>>
>>>> If you agree with it, I would appreciate it if you could push it through
>>>> JPRT.
>>>>
>>>> Thank you and best regards,
>>>> Volker
>>>>
>>>> PS: the little change in make/aix/makefiles/vm.make was necessary to
>>>> exclude the new file from the AIX build because AIX uses XCOFF instead
>>>> of ELF.
>>>>
>>>>
>>>> On Wed, Dec 4, 2013 at 3:39 AM, Vitaly Davidovich <vitalyd at gmail.com>
>>>> wrote:
>>>>> Hi Volker,
>>>>>
>>>>> Would it be cleaner if you were to extend ElfSymbolTable for PPC and embed
>>>>> the funcDesc in there, keeping the lookup() signature the same? It seems
>>>>> like the funcDesc should be a hidden indirection as part of lookup() rather
>>>>> than a parameter.
>>>>>
>>>>> Just a thought ...
>>>>>
>>>>> Sent from my phone
>>>>>
>>>>> On Dec 3, 2013 5:43 PM, "Volker Simonis" <volker.simonis at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> thanks for looking at the change. I initially did it this way to
>>>>>> keep changes to the existing platforms as small as possible but I'll be
>>>>>> happy to change it in the way you suggested if nobody objects.
>>>>>>
>>>>>> Regards,
>>>>>> Volker
>>>>>>
>>>>>> On Tuesday, December 3, 2013, Vladimir Kozlov wrote:
>>>>>>
>>>>>>> Volker,
>>>>>>>
>>>>>>> It looks fine to me except #ifdef pollution.
>>>>>>>
>>>>>>> I think ElfSymbolTable::lookup() should always take an ElfFuncDescTable
>>>>>>> argument, and ElfFuncDescTable needs to be always defined.
>>>>>>> In ElfSymbolTable::lookup() you can check funcDescTable for null instead
>>>>>>> of using ifdefs.
>>>>>>> The only place where we can keep the #ifdef is the m_funcDesc_table
>>>>>>> setting in ElfFile::load_tables().
>>>>>>> The methods of the ElfFuncDescTable class are not big enough to be worth
>>>>>>> ifdef-ing.
>>>>>>>
>>>>>>> But others may have a different opinion.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 12/3/13 9:39 AM, Volker Simonis wrote:
>>>>>>>
>>>>>>>> On PowerPC-64 (and other architectures like for example IA64) a
>>>>>>>> pointer to a function is not just a plain code address, but a pointer
>>>>>>>> to a so called function descriptor (see
>>>>>>>> http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html#FUNC-DES).
>>>>>>>> This fact is also reflected in the ELF ABI for PowerPC-64. This small
>>>>>>>> change adds support for ELF function descriptor tables to the current
>>>>>>>> ELF decoder:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/8019929/
>>>>>>>>
>>>>>>>> On architectures like x86 or SPARC, the ELF symbol table contains the
>>>>>>>> start address and size of an object. So for example for a function
>>>>>>>> object (i.e. type FUNC) the symbol table's value field directly
>>>>>>>> represents the starting address and the size field the size of a
>>>>>>>> function. On PPC64 however, the symbol table's value field only
>>>>>>>> contains an index into a PPC64 specific .opd (official procedure
>>>>>>>> descriptors) section, while the size field still holds the size of the
>>>>>>>> corresponding function. In order to get the actual start address of a
>>>>>>>> function, it is necessary to read the corresponding function
>>>>>>>> descriptor entry in the .opd section at the corresponding index and
>>>>>>>> extract the start address from there.
>>>>>>>>
>>>>>>>> This change extends the current HotSpot ELF utilities to support the
>>>>>>>> .opd (official procedure descriptors) section on PPC64 platforms. It
>>>>>>>> does this by adding a new field m_funcDesc_table of type
>>>>>>>> ElfFuncDescTable to the ElfFile class. The m_funcDesc_table is
>>>>>>>> initialized in ElfFile::load_tables() in the same way as the
>>>>>>>> symbol table members by parsing the corresponding .opd section if it
>>>>>>>> is available.
>>>>>>>>
>>>>>>>> The ElfSymbolTable::lookup() method is changed on PPC64 to take an
>>>>>>>> extra ElfFuncDescTable argument. If running on PPC64, this argument is
>>>>>>>> used to do the extra level of indirection through the function
>>>>>>>> descriptor table to get the real start address associated with a
>>>>>>>> symbol.
>>>>>>>>
>>>>>>>> This change also slightly improves the implementation of
>>>>>>>> ElfSymbolTable::lookup(). Before, the method always iterated over all
>>>>>>>> symbols in the symbol table and returned the one with the highest
>>>>>>>> address below the requested addr argument. This not only could take a
>>>>>>>> significant amount of time for big libraries, it could also return
>>>>>>>> bogus symbols for addresses which were not really covered by that
>>>>>>>> symbol table at all. The new version additionally uses the symbol
>>>>>>>> table's st_size field to verify that the requested addr argument is
>>>>>>>> indeed within the range covered by the corresponding symbol table
>>>>>>>> entry. If so, the search is stopped and the symbol is returned
>>>>>>>> immediately.
>>>>>>>>
>>>>>>>> Thank you and best regards,
>>>>>>>> Volker
>>>>>>>>
>>>>>>>>
>>