Very poor performance of JNI AttachCurrentThread on Linux

Dmitry Samersoff dmitry.samersoff at oracle.com
Tue Feb 26 12:38:44 PST 2013


Andrew,

I'm repeating below the answer I sent to Stephan Bergmann a few days ago.
Is it the same issue as with Libre Office?

-------- Original Message --------
Subject: Re: get_stack_bounds using read(2) syscalls to read
/proc/self/maps byte-by-byte
Date: Thu, 21 Feb 2013 19:05:13 +0400
From: Dmitry Samersoff <dmitry.samersoff at oracle.com>
To: Stephan Bergmann <sbergman at redhat.com>
CC: hotspot-dev at openjdk.java.net

Stephan,

There were several reasons for me to move away from getline:

1. The behaviour of getline is not properly documented, so it's not clear
to me whether we should free the buffer on all platforms and for all
versions of libc when getline returns an error.

see http://sourceware.org/bugzilla/show_bug.cgi?id=5666

2. The fix reduces the number of heap memory allocations/deallocations
within the VM.

A huge Java app with long paths to its DSOs could have a huge maps file.
The getline version reads every line in full, but we are interested only
in the first 128 bytes of each line.

3. The kernel doesn't report the size of /proc files and doesn't notify
the process when /proc/* files change
(see e.g. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/454722)

So buffered reading of /proc/* files could lead to subtle errors or
crashes if the file changes during the read, e.g. after suspend/resume.
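
To illustrate the concern in point 1, here is a minimal sketch of the
usual defensive pattern: start with a NULL buffer and call free() exactly
once after the loop, whether getline succeeded or failed. It relies only
on free(NULL) being a no-op. count_lines is a hypothetical helper for
illustration, not code from the VM, and the sketch assumes a libc that
leaves the pointer either NULL or valid after an error.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: counts the lines of a file with getline.
 * Returns -1 if the file cannot be opened. */
static long count_lines(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char *line = NULL;   /* getline allocates the buffer when this is NULL */
    size_t cap = 0;
    long n = 0;

    while (getline(&line, &cap, f) != -1)
        n++;

    /* Assumption: one free() covers both the success and the error path.
     * If getline never allocated, line is still NULL and this is a no-op. */
    free(line);
    fclose(f);
    return n;
}
```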


It might be possible to reduce the number of read syscalls by implementing
some internal buffering, but I would definitely prefer not to go back to
getline. Are there performance problems with other applications?
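
A rough sketch of what such internal buffering could look like, not a
proposed patch: read(2) fills a fixed 4096-byte stack buffer, and only the
start of each line (at most 127 characters) is kept, so there is no heap
allocation. The names get_stack_bounds_chunked and line_is_my_stack and the
buffer sizes are assumptions made for illustration. Like the getline
version, it matches only the "[stack]" mapping, and it does nothing about
the file changing between reads (point 3).

```c
#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

enum { LINE_KEPT = 128, CHUNK = 4096 };   /* illustrative sizes */

/* Hypothetical helper: true if the line is the "[stack]" mapping
 * and the address sp lies inside its range. */
static bool line_is_my_stack(const char *line, size_t len, uintptr_t sp,
                             uintptr_t *bottom, uintptr_t *top)
{
    static const char tag[] = "[stack]";
    size_t tag_len = sizeof(tag) - 1;

    if (len <= tag_len || strcmp(line + len - tag_len, tag) != 0)
        return false;
    if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR, bottom, top) != 2)
        return false;
    return sp >= *bottom && sp <= *top;
}

static bool get_stack_bounds_chunked(uintptr_t *bottom, uintptr_t *top)
{
    assert(bottom != NULL && top != NULL);

    int fd = open("/proc/self/maps", O_RDONLY);
    if (fd < 0)
        return false;

    uintptr_t sp = (uintptr_t)&fd;   /* some address on the current stack */
    char chunk[CHUNK];
    char line[LINE_KEPT];
    size_t len = 0;
    bool truncated = false;
    bool found = false;
    ssize_t n;

    /* One syscall per chunk instead of one per byte.
     * A read error (including EINTR) simply ends the scan. */
    while (!found && (n = read(fd, chunk, sizeof chunk)) > 0) {
        for (ssize_t i = 0; i < n && !found; i++) {
            if (chunk[i] != '\n') {
                if (len < sizeof line - 1)
                    line[len++] = chunk[i];
                else
                    truncated = true;   /* drop the tail of long lines */
                continue;
            }
            line[len] = '\0';
            if (!truncated)
                found = line_is_my_stack(line, len, sp, bottom, top);
            len = 0;
            truncated = false;
        }
    }
    close(fd);
    return found;
}
```

Called from the main thread, this fills *bottom and *top with the range of
the "[stack]" mapping; on other threads it returns false, as the getline
version does.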


-Dmitry


On 2013-02-26 21:58, Andrew Haley wrote:
> get_stack_bounds() was rewritten because of a small memory leak.
> Instead of simply free()ing the memory to prevent the leak, it was
> rewritten to use a byte-by-byte loop around read() :
> 
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2011-February/001864.html
> 
> Unfortunately, the performance impact of this change is tragic.  As
> you can imagine, tens of thousands of system calls are made whenever
> get_stack_bounds() is called.
> 
> Before rewrite: typically 100 microseconds
> After rewrite: typically 1500 microseconds
> 
> It's impossible for me to tell from the discussion on the mailing list
> why such a change was made.  There is a rather elliptical comment
> 
>> I'm strictly against reading /proc entry using stdio functions,
>> as (a) /proc file could be changed while we are reading it (b) it's not
>> fancy as we are buffering kmem.
> 
> but this doesn't make any sense.  If the contents of "/proc/self/maps"
> really did change while it was being read, reading a byte at a time
> wouldn't help at all.  I don't know what the second sentence means.
> 
> It would be possible to read "/proc/self/maps" in chunks and scan that,
> but my measurements show that it would not be significantly faster
> than the original version of get_stack_bounds().
> 
> This severe regression is impacting a current large Java deployment.
> See https://bugzilla.redhat.com/show_bug.cgi?id=902004
> 
> Is there any reason why I should not simply submit a webrev that reverts
> to the original code, with suitable use of free() ?  I have attached the
> code that I am testing.
> 
> Andrew.
> 
> 
> 
> 
> static bool
> get_stack_bounds(uintptr_t *bottom, uintptr_t *top)
> {
>   FILE *f = fopen("/proc/self/maps", "r");
>   if (f == NULL)
>     return false;
>   char *str = NULL;
>   while (!feof(f)) {
>     size_t dummy;
>     ssize_t len = getline(&str, &dummy, f);
>     if (len == -1) {
>       free(str);
>       fclose(f);
>       return false;
>     }
> 
>     if (len > 0 && str[len-1] == '\n') {
>       str[len-1] = 0;
>       len--;
>     }
> 
>     static const char *stack_str = "[stack]";
>     if (len > (ssize_t)strlen(stack_str)
>         && (strcmp(str + len - strlen(stack_str), stack_str) == 0)) {
>       if (sscanf(str, "%" SCNxPTR "-%" SCNxPTR, bottom, top) == 2) {
>         uintptr_t sp = (uintptr_t)__builtin_frame_address(0);
>         if (sp >= *bottom && sp <= *top) {
>           free(str);
>           fclose(f);
>           return true;
>         }
>       }
>     }
>   }
>   free(str);
>   fclose(f);
>   return false;
> }
> 
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* Give Rabbit time, and he'll always get the answer

