SA and disassemblers

Fri Sep 11 05:59:58 PDT 2009

Hi Tom,

I like the extension of the decode_instructions interface, especially
because this is something we are already using anyway. We have
extended the disassembler in the VM to decode instructions which had
been dumped by the VM earlier when no disassembler library was
available. We realized this with an 'AbstractDisassaembler' which in
our case is the base class of 'Disassembler'. If the native
disassembler can not be loaded, we automatically fall back to the
abstract disassembler which prints a kind of hex dump, but with all
the usual annotations. Later on, we can take such an output produced
by the abstract disassembler (for example from an hs_err file) and
decode it to real assembly code. Perhaps this is an extension you also
want to consider? I could contribute some code if you'd like.

Besides this I have some remarks/suggestions regarding hsdis:

- I don't really think it is necessary for hsdis.c to export two
functions: decode_instructions() and decode_instructions_virtual(). I
think the more general one decode_instructions_virtual() would be
enough together with a small change in disassembler.{cpp,hpp}:

address decode_env::decode_instructions(address start, address end,
address virtual_start /* = 0*/) {
  if (virtual_start == 0) virtual_start = start;

This would unify the hsdis interface used by the VM and the SA and
allow one to easily implement the decoding of "abstract disassembly"
as described above.

- in hsdis.c in hsdis_read_memory_func you use a C++ style comment
which I think should be better changed into a C style comment and
'membar' should probably read 'mamaddr':

...
+   // convert the virtual address memaddr into an address within memory buffer
+   uintptr_t memaddr_p = ((uintptr_t) memaddr) - app_data->start_va +
app_data->start;
...

- The Readme file should contain the hint:

"The binutils should be configured with the '--disable-nls' flag to
disable Native Language Support, otherwise you might get an "undefined
reference to `libintl_gettext'" if you try to load hsdis.so on systems
which don't have NLS by default."

- There's a typo in the README file of the hsdis plugin:
line 70: 'hsdid-$LIBARCH.so' should read 'hsdis-$LIBARCH.so'

- On IA64 and PPC, function pointers are pointers to function
descriptors, so hsdis-demo.c should be changed accordingly:

static const char* lookup(void* addr) {
#if defined(__ia64) || defined(__powerpc__)
  /* On IA64 and PPC function pointers are pointers to function descriptors */
  #define CHECK_NAME(fn) \
    if (addr == *((void**) &fn))  return #fn;
#else
  #define CHECK_NAME(fn) \
    if (addr == (void*) &fn)  return #fn;
#endif

...

void disassemble(void* from, void* to) {
#if defined(__ia64) || defined(__powerpc__)
  /* On IA64 and PPC function pointers are pointers to function descriptors */
  from = *((void**)from);
  to = *((void**)to);
#endif

- In the Makefile I need to link hsdis-demo against 'ld'. So I suggest
adding (at least for Linux):

LDFLAGS		+= -ldl

- I don't understand why 'decode_instructions' in hsdis.c needs to
print a newline after every instruction. I think it would be more
consistent if this newline would be printed in the appropriate
callback in the disassembler class. I would therefore suggest to
delete the following lines from decode_instructions in hsdis.c:

     /* follow each complete insn by a nice newline */
     /*(*printf_callback)(printf_stream, "\n");*/

- The previous change would also make it easier to use
decode_instructions() to decode a single instruction. But it would be
additionally necessary to change the line:

    while (p < end && !app_data.losing) {

to:

    while (p <= end && !app_data.losing) {

to make that possible. Additionally, "out of bound check" in
hsdis_read_memory_func() will have to be removed, if we want to decode
single instructions:

if (memaddr_p + length > app_data->end) {
  /* read is out of bounds */
  return EIO;
} else {

I don't think that this does any harm, but on the other side it will
make the hsdis plugin more generally usable.

Finally I would like to suggest moving the printing of nmethods and
code blobs into the corresponding print methods of the CodeBlob and
nmethod classes.This could be easily achieved with the new
functionality which allows the decoding of a single assembler
instruction. I don't see a reason why the disassembler class needs to
know the internals of nmethod and CodeBlob. In my opinion it would be
a cleaner design if the disassembler class would export a clear
interface for decoding assembler instructions and leave all the rest
to the class like nmethod or CodeBlob which use it. Again this would
simplify the implementation of an "AbstractDisassembler" class as
described above because the code which prints all the various
annotations around the instructions would not have to be duplicated in
the "AbstractDisassembler" class.

Thank you and best regards,
Volker

On 8/27/09, Tom Rodriguez <Thomas.Rodriguez at sun.com> wrote:
> One of the limitations of the current SA is that it doesn't have a
> disassembler for amd64 which makes looking at 64 bit cores somewhat
> challenging.  Instead of trying to implement a full disassembler for x64 in
> Java I think we should switch the SA to using hsdis-arch for the decoding.
> It requires a minor extension to the decode_instructions interface so that
> we can disassemble a buffer of code that's not physically located where the
> code came from.  I've got a working version of this that also fixes the
> printing to match up better with what the JVM itself prints.  I was planning
> on blowing away the other disassemblers completely because they don't really
> fit into the new model.  I thought I'd float this before finishing it to see
> if there are any concerns.  The initial webrev is at
> http://cr.openjdk.java.net/~never/sadis.  This includes
> building the SA with source 1.5 to allow use of printf which will probably
> be pushed separately.
>
>  tom
>