Implementation of IO with Panama

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon Apr 19 09:33:16 UTC 2021


I see,
you need both allocation and deallocation; an arena works well when you 
need to allocate lots of resources at the same time and want everything 
to be freed at once - sometimes native apps will need to do that.

It seems like you just need a better malloc/free. If that's the case we 
have a more general allocator we're working on which at some point we'll 
integrate under the SegmentAllocator umbrella. From what I can read from 
some of the javadoc in the allocator you use, what we're working on is 
exactly what you need here.

Regarding your implementation, I see that clients need to obtain an 
allocator based on a given scope. This acquires your allocator's scope 
(so that memory cannot go away) and returns a new segment allocator 
which works using the client's scope, but which returns segments backed 
by your allocator impl. This is clever, and it allows you to avoid 
creating a ResourceScope for every returned segment - do you think that 
resource scope creation would have killed performance?

Then, on each allocator request you use `getSegmentForScope`, which 
internally registers a callback (on the client scope) for 
`putSegmentEntry`, which returns the segment to the pool. It all seems 
very consistent. With this design, the client can use confined or 
shared segments and get the same performance, as clients are not 
expected to call "close" on each segment - instead you work at scope 
granularity (which the API encourages).
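
To make the pattern above concrete, here is a minimal sketch in plain 
Java. It is not the code from the PR and it does not use the Panama API: 
`Scope` is a hypothetical stand-in for a client ResourceScope, 
`BufferPool` and `allocatorFor` are made-up names, and direct ByteBuffers 
stand in for memory segments. It only illustrates the idea that each 
allocation registers a close action on the client scope which hands the 
buffer back to the pool. The sketch is single-threaded; a real pool 
shared across threads would need synchronization.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

public class PooledAllocatorSketch {

    /** Hypothetical stand-in for a client scope: runs registered actions on close. */
    static final class Scope implements AutoCloseable {
        private final List<Runnable> closeActions = new ArrayList<>();

        void addCloseAction(Runnable action) {
            closeActions.add(action);
        }

        @Override
        public void close() {
            for (Runnable action : closeActions) {
                action.run();
            }
            closeActions.clear();
        }
    }

    /** Hypothetical pool of fixed-size buffers (not thread-safe). */
    static final class BufferPool {
        private final Deque<ByteBuffer> free = new ArrayDeque<>();
        private final int bufferSize;

        BufferPool(int bufferSize) {
            this.bufferSize = bufferSize;
        }

        int freeCount() {
            return free.size();
        }

        /** Returns an allocator whose buffers go back to the pool when the scope closes. */
        Supplier<ByteBuffer> allocatorFor(Scope scope) {
            return () -> {
                ByteBuffer pooled = free.poll();
                final ByteBuffer buffer =
                        (pooled != null) ? pooled : ByteBuffer.allocateDirect(bufferSize);
                buffer.clear();
                // One callback per allocation, registered on the client's scope.
                scope.addCloseAction(() -> free.push(buffer));
                return buffer;
            };
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(16);
        try (Scope scope = new Scope()) {
            Supplier<ByteBuffer> allocator = pool.allocatorFor(scope);
            allocator.get();
            allocator.get();
            System.out.println("free while scope is open: " + pool.freeCount());
        }
        System.out.println("free after scope close: " + pool.freeCount());
    }
}
```

The client never closes individual buffers; closing the scope returns 
everything it allocated to the pool in one go, and a later scope can 
reuse those buffers.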

So, if a client uses the SegmentAllocator API, everything should be safe, 
right? But you also expose a lower level API to, presumably, avoid the 
segment allocator. How much is gained by that?

Thanks
Maurizio




On 19/04/2021 10:08, Radosław Smogura wrote:
> Hi Maurizio,
>
> I hope you have a good day.
>
> I’ve sent the allocator as a PR (mostly for review purposes).
>
> I think even this one still needs a lot of improvements. Maybe it would be a good idea to add a pool allocator bound to a scope, which can create sub-allocators bound to new scopes?
>
> As for benchmarks, the socket benchmarks are a bit fragile, as results can differ slightly depending on the socket buffer.
>
> I did a test with a file, and I get a consistent gain of around 15%, especially for small reads (16b) - the wiki at the link has the benchmark results.
>
> The ArenaAllocator was very slow - it gave 50% of the JNI I/O performance, and I think it can’t be used, because if I wrap an InputStream I can’t require the caller to open a scope.
>
>
> Kind regards,
> Rado
>
>> On Apr 17, 2021, at 12:12 AM, Radosław Smogura <mail at smogura.eu> wrote:
>>
>> Hi Maurizio,
>>
>> I think I know what's happening there - however I would like to find this in some documentation.
>>
>> The link you shared shows that there's a call to __errno_location and then a move to/from the address in rax, i.e. the result of this call.
>>
>> But I did an objdump, and on my Linux errno is a symbol in the .tbss section (the thread-local one),
>>
>> [ /usr/include ]
>> radek at radek-ubuntu # objdump -T /usr/lib/x86_64-linux-gnu/libc-2.32.so |grep errno
>> 0000000000000010 g    D  .tbss  0000000000000004  GLIBC_PRIVATE errno
>> 0000000000029030 g    DF .text  0000000000000015  GLIBC_2.2.5 __errno_location
>> 00000000001498d0 g    DF .text  000000000000005f (GLIBC_2.2.5) clnt_sperrno
>> 0000000000149930 g    DF .text  0000000000000084 (GLIBC_2.2.5) clnt_perrno
>> 000000000000006c g    D  .tbss  0000000000000004  GLIBC_PRIVATE __h_errno
>> 0000000000129b60 g    DF .text  0000000000000015  GLIBC_2.2.5 __h_errno_location
>>
>> D - dynamic or debugging symbol
>> F - function
>> g - global
>>
>> I don't know the internals of thread-locals and dynamic symbols (I would only guess that the first is implemented by the kernel mapping a separate page for every thread), and I know linking can sometimes be more complicated (e.g. gcc has multiversioning - something like bootstrap methods).
>>
>> However I've found this
>> https://urldefense.com/v3/__https://lwn.net/Articles/5851/__;!!GqivPVa7Brio!OGI60qGQqVMJBJUIK1dpoOkAZ3oRCyD4Wm4UhDNmPDD0jgVF3C9_ZXclz96XO4aKaaP0Rkg$  (probably outdated)
>>
>> I think that only manipulation of the page map could make this work.
>>
>> It's definitely worth checking this on BSD and OSX, as well as checking what kind of approach should be used and how it can be determined at runtime, but it looks like __errno_location is what should be used.
>>
>> On the other hand, I'm working on improving the allocator, and I already see some good results; I'll send updates later.
>>
>> Kind regards,
>> Rado
>>
>>
>>
>> ________________________________
>> From: Maurizio Cimadamore <maurizio.cimadamore at oracle.com>
>> Sent: Friday, 16 April 2021 16:30
>> To: Radosław Smogura <mail at smogura.eu>
>> Cc: panama-dev at openjdk.java.net <panama-dev at openjdk.java.net>
>> Subject: Re: Implementation of IO with Panama
>>
>>
>>> On 16/04/2021 15:10, Radosław Smogura wrote:
>>> I could grab address for it just looking for symbol from CLinker.
>>>
>>> But, AFAIK, errno is a thread-local variable, so maybe I should double check how this should be handled.
>>
>> Interesting reading from man page:
>>
>> ```
>> errno is defined by the ISO C standard to be a modifiable lvalue of
>> type int, and must not be explicitly declared; errno may be a macro.
>> errno is thread-local; setting it in one thread does not affect its
>> value in any other thread.
>> ```
>>
>> this is subtle - it's an lvalue (e.g. it can be used for assignment) but
>> it must not be explicitly declared - meaning it's not a variable in the
>> proper sense.
>>
>> At this point I'm not sure that linking to it gives the expected results.
>>
>> In fact, firing up Godbolt shows that gcc uses __errno_location under
>> the hood:
>>
>> https://urldefense.com/v3/__https://godbolt.org/z/rcMPTKxva__;!!GqivPVa7Brio!OGI60qGQqVMJBJUIK1dpoOkAZ3oRCyD4Wm4UhDNmPDD0jgVF3C9_ZXclz96XO4aKvdRCIgg$
>>
>> Maurizio
>>
>>

