Memory usage of EPollArraySelector
Vitaly Davidovich
vitalyd at gmail.com
Tue Oct 27 20:34:32 UTC 2015
It's very common for epoll implementations to use a flat uint32 array to
map fds. This makes lookups fast for the common case where there's either one
or very few epoll loops. If you try to "address" this in the JDK, you'll
need to make sure the common-case lookups do not slow down.
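
A minimal, self-contained sketch of that idea (illustration only, not JDK
code; MAX_FD and all names here are invented): a flat array indexed directly
by fd next to the boxed-map alternative, showing why the flat lookup stays
cheap.

    import java.util.HashMap;
    import java.util.Map;

    class FdEventTableSketch {
        static final int MAX_FD = 1 << 16;                // assumed fd limit, illustration only

        final int[] flatEvents = new int[MAX_FD];         // one slot per possible fd
        final Map<Integer, Integer> boxedEvents = new HashMap<>();

        int lookupFlat(int fd) {
            return flatEvents[fd];                        // one array load, no allocation
        }

        int lookupBoxed(int fd) {
            Integer e = boxedEvents.get(fd);              // hashing, boxing, pointer chasing
            return (e == null) ? 0 : e.intValue();
        }
    }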
On Tue, Oct 27, 2015 at 4:31 PM, Martin Buchholz <martinrb at google.com>
wrote:
> I agree that the amount of memory retained in the JDK (linear in the file
> descriptor limit) is a problem with the JDK that should be addressed
> somehow. Is there an open bug?
>
> On Fri, Oct 23, 2015 at 10:23 AM, Patrick Bergamin <
> patrick.bergamin at wedgenetworks.com> wrote:
>
>> Hi Vitaly,
>>
>> Yeah, I understand what you're saying. Right now we want to move forward
>> with a newer version of Java. This application is holding us back.
>> Decoupling the Selectors from the socket connections is not a small task.
>> All your suggestions are good ideas and match some of the ideas I have had
>> on reworking the application.
>>
>> thanks for the help.
>>
>>
>> On 15-10-23 10:45 AM, Vitaly Davidovich wrote:
>>
>> 80k worker threads?? That doesn't sound right either :). Roughly
>> speaking, I suggest the following:
>>
>> 1) N compute bound worker threads where N = # of cpus
>> 2) 1-2 i/o threads that monitor fd's for write/read readiness, and
>> perform the read/write operations (workers in #1 hand off data to these)
>> 3) Some threadpool for IO/blocking operations where you don't have async
>> options (e.g. filesystem/disk) which you can size depending on latency of
>> the i/o operations
>>
>> Do some research on modern i/o threading models (e.g. nginx, netty,
>> etc). It may be a larger effort, but you'll be happier in the long run.
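
A rough sketch of the I/O-thread model described in points 1-3 above (not the
poster's code; all class and method names are invented, and the actual
read/write and worker hand-off logic is left as comments):

    import java.io.IOException;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    class IoLoopSketch implements Runnable {
        private final Selector selector;
        private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();

        IoLoopSketch() throws IOException {
            selector = Selector.open();            // one Selector serves many connections
        }

        // Any thread may hand a freshly accepted connection to this loop.
        void adopt(SocketChannel ch) {
            pending.add(ch);
            selector.wakeup();                     // break out of select() to register it
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    // Register channels queued by other threads on the I/O thread itself.
                    for (SocketChannel ch; (ch = pending.poll()) != null; ) {
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ);
                    }
                    selector.select();
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isValid() && key.isReadable()) {
                            // read here and hand the bytes to a compute worker (not shown)
                        }
                    }
                }
            } catch (IOException e) {
                // a real loop would log and recover or shut down
            }
        }
    }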
>>
>>
>>
>> On Fri, Oct 23, 2015 at 12:35 PM, Patrick Bergamin <
>> patrick.bergamin at wedgenetworks.com>
>> wrote:
>>
>>> I thought about using one Selector per thread. I wasn't sure if that
>>> was going to reduce memory usage enough, though, as the application can allow
>>> upwards of 80000 worker threads. I should try this out as it isn't a large
>>> change to the application.
>>>
>>> thanks.
>>>
>>>
>>>
>>> On 15-10-23 10:23 AM, Vitaly Davidovich wrote:
>>>
>>> The entire problem is you have a Selector per connection rather than
>>> Selector per worker thread, as I mentioned in my previous reply. I don't
>>> think Selector was designed for such a thing, and so using a BitSet makes
>>> sense for the intended usage. HashSet will reduce your memory footprint
>>> because you'll have just 1 entry in there, but the common/intended case
>>> will make it worse as it has terrible memory locality properties. Your
>>> real fix is to redesign the Selector usage within your application.
>>>
>>> On Fri, Oct 23, 2015 at 12:06 PM, Patrick Bergamin <
>>> patrick.bergamin at wedgenetworks.com>
>>> wrote:
>>>
>>>> On 15-10-22 03:39 PM, Vitaly Davidovich wrote:
>>>>
>>>> Patrick,
>>>>
>>>>
>>>> Yes Selectors are reused. One Selector is used per connection. When a
>>>> socket connection is closed the Selectors are cached along with a
>>>> connection handler object so they can be reused when new socket connections
>>>> are accepted.
>>>>
>>>> I'm confused - how many selectors do you have? You mentioned a selector
>>>> is created for each accepted connection but then state that selectors are
>>>> reused for newly accepted connections. Do you mean a selector is reused
>>>> after its sole connection is closed?
>>>>
>>>>
>>>> I'm not sure what you mean by memory chasing. I changed the BitSet to
>>>> a HashSet because the file descriptor numbers can easily get into the
>>>> millions. BitSet can allocate a good chunk of memory even if only a couple
>>>> of file descriptors are registered with it (if the file descriptor numbers
>>>> are large). When you have hundreds of thousands of open Selectors the
>>>> memory usage adds up. This is one reason why simply closing Selectors
>>>> after a connection is closed does not solve the memory usage problem.
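
For a rough sense of the numbers (my arithmetic, not from the thread): a
default BitSet grows its backing long[] to cover the highest bit index that
is set, so a single registered fd in the millions already costs hundreds of
kilobytes per Selector. A small sketch:

    import java.util.BitSet;

    public class BitSetGrowthSketch {
        public static void main(String[] args) {
            BitSet registered = new BitSet();
            registered.set(5_000_000);             // one fd near the limit mentioned in this thread
            // BitSet backs its bits with a long[] sized to the highest set bit:
            // ~5_000_000 / 64 = 78_125 longs, i.e. roughly 625 KB for a single entry.
            System.out.println(registered.size() / 8 + " bytes of backing storage");
        }
    }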
>>>>
>>>> From my perspective it is the EPollArrayWrapper that is wasting memory
>>>> resources. It can allocate memory that is not used and never releases any
>>>> that is allocated. Likely this was done for performance reasons. I
>>>> haven't looked back through the revision history.
>>>>
>>>> I'm not proposing the suggested diff to be committed but rather I'm
>>>> presenting it to show where the memory is accumulating in the application.
>>>> After sun.nio.ch.maxUpdateArraySize was set to a smaller value it was the
>>>> 'registered' object and 'eventsHigh' object within EPollArrayWrapper that
>>>> were accumulating memory. I'm asking if there is something that can be
>>>> done to reduce the memory usage of EPollArrayWrapper, either by adding more
>>>> system properties to change its behaviour or by creating a second implementation?
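
For reference, the property mentioned above is normally passed on the java
command line; the value 64 and the jar/class names below are placeholders
for illustration, not recommendations:

    java -Dsun.nio.ch.maxUpdateArraySize=64 -cp proxy.jar com.example.ProxyMain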
>>>>
>>>> thanks,
>>>> Patrick Bergamin
>>>>
>>>>
>>>> You should bite the bullet and reimplement your application. If I'm
>>>> reading right, you're wasting lots of resources all around, and your
>>>> proposal is just a hack (with adverse performance due to additional memory
>>>> chasing via HashSet) that will likely catch up to you anyway.
>>>>
>>>> sent from my phone
>>>> On Oct 22, 2015 4:51 PM, "Patrick Bergamin" <
>>>> patrick.bergamin at wedgenetworks.com> wrote:
>>>>
>>>>> I'm having problems with memory usage of the current implementation
>>>>> of EPollArraySelector in 1.8.0_60 for an existing proxy application.
>>>>> We've been on java version 1.7.0_05 for a while now because the new
>>>>> implementation of EPollArraySelector does not work well with the
>>>>> design of this proxy application.
>>>>>
>>>>> I did find the sun.nio.ch.maxUpdateArraySize property helped to
>>>>> reduce memory usage a bit. But as the proxy application runs
>>>>> all the EPollArraySelector objects will slowly accumulate memory.
>>>>>
>>>>> The current design of the proxy application is to have one Selector
>>>>> handle the listen socket. Once a connection is accepted, management
>>>>> of the connection is handed off to another thread on another Selector.
>>>>> Basically there is one Selector allocated per socket connection.
>>>>> The Selectors are never closed; they are reused when a new socket
>>>>> connection is accepted.
>>>>>
>>>>> The application was designed before this latest implementation of
>>>>> EPollArraySelector. Redesigning the application to decouple the
>>>>> Selector from the socket connection would be a fair amount of work.
>>>>>
>>>>> We have machines that are capable of running this proxy application
>>>>> with around 350,000 open connections. Since it is a proxy there
>>>>> are actually 700,000 open connections. The file descriptor limit
>>>>> is set high (5,000,000) to be able to handle all these open socket
>>>>> connections.
>>>>>
>>>>> Below I've included a patch to show what kinds of things I need
>>>>> to do to bring the memory usage of EPollArraySelector down for
>>>>> this proxy application.
>>>>>
>>>>> Is there any interest in including a second epoll implementation
>>>>> in OpenJDK that uses less memory, or perhaps adding more properties
>>>>> to control the memory usage of the existing EPollArraySelector?
>>>>>
>>>>> thanks,
>>>>> Patrick Bergamin
>>>>>
>>>>> --- EPollArrayWrapper.java 2015-07-30 06:27:02.000000000 -0600
>>>>> +++ EPollArrayWrapper2.java 2015-09-28 15:31:41.712607415 -0600
>>>>> @@ -29,6 +29,7 @@
>>>>> import java.security.AccessController;
>>>>> import java.util.BitSet;
>>>>> import java.util.HashMap;
>>>>> +import java.util.HashSet;
>>>>> import java.util.Map;
>>>>> import sun.security.action.GetIntegerAction;
>>>>>
>>>>> @@ -122,7 +123,7 @@
>>>>>
>>>>> // Used by release and updateRegistrations to track whether a file
>>>>> // descriptor is registered with epoll.
>>>>> - private final BitSet registered = new BitSet();
>>>>> + private final HashSet<Integer> registered = new HashSet<Integer>();
>>>>>
>>>>>
>>>>> EPollArrayWrapper() throws IOException {
>>>>> @@ -187,7 +188,10 @@
>>>>> }
>>>>> } else {
>>>>> Integer key = Integer.valueOf(fd);
>>>>> - if (!isEventsHighKilled(key) || force) {
>>>>> + if (events == KILLED) {
>>>>> + eventsHigh.remove(key);
>>>>> + }
>>>>> + else if (!isEventsHighKilled(key) || force) {
>>>>> eventsHigh.put(key, Byte.valueOf(events));
>>>>> }
>>>>> }
>>>>> @@ -201,6 +205,9 @@
>>>>> return eventsLow[fd];
>>>>> } else {
>>>>> Byte result = eventsHigh.get(Integer.valueOf(fd));
>>>>> + if (result == null) {
>>>>> + return KILLED;
>>>>> + }
>>>>> // result should never be null
>>>>> return result.byteValue();
>>>>> }
>>>>> @@ -235,7 +242,7 @@
>>>>> // force the initial update events to 0 as it may be KILLED by a
>>>>> // previous registration.
>>>>> synchronized (updateLock) {
>>>>> - assert !registered.get(fd);
>>>>> + assert !registered.contains(fd);
>>>>> setUpdateEvents(fd, (byte)0, true);
>>>>> }
>>>>> }
>>>>> @@ -249,9 +256,9 @@
>>>>> setUpdateEvents(fd, KILLED, false);
>>>>>
>>>>> // remove from epoll
>>>>> - if (registered.get(fd)) {
>>>>> + if (registered.contains(fd)) {
>>>>> epollCtl(epfd, EPOLL_CTL_DEL, fd, 0);
>>>>> - registered.clear(fd);
>>>>> + registered.remove(fd);
>>>>> }
>>>>> }
>>>>> }
>>>>> @@ -286,7 +293,7 @@
>>>>> while (j < updateCount) {
>>>>> int fd = updateDescriptors[j];
>>>>> short events = getUpdateEvents(fd);
>>>>> - boolean isRegistered = registered.get(fd);
>>>>> + boolean isRegistered = registered.contains(fd);
>>>>> int opcode = 0;
>>>>>
>>>>> if (events != KILLED) {
>>>>> @@ -298,9 +305,9 @@
>>>>> if (opcode != 0) {
>>>>> epollCtl(epfd, opcode, fd, events);
>>>>> if (opcode == EPOLL_CTL_ADD) {
>>>>> - registered.set(fd);
>>>>> + registered.add(fd);
>>>>> } else if (opcode == EPOLL_CTL_DEL) {
>>>>> - registered.clear(fd);
>>>>> + registered.remove(fd);
>>>>> }
>>>>> }
>>>>> }
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>