Linux edge-triggered epoll: avoiding multiple recv calls to detect close


I'm trying to understand whether it's possible to use edge-triggered epoll without having to call recv() multiple times for every single epoll READ event.

Take this scenario:

  • Server sends client say 64 bytes and then closes the socket.
  • The client's edge-triggered epoll_wait reports a read event. Now let's say the close made it into the same event (I have seen this race, where the close is folded into the READ event).
  • The client reads into a buffer of, say, 4k. To be optimal, I would hope that if recv returns fewer than 4k bytes (the buffer size), you know there is no more data and can go back to epoll_wait. I think this works in general, EXCEPT for the close() case. Since a closed socket is signaled by recv returning 0 bytes on the client, it appears that you HAVE to call recv again to make sure you don't get a 0 back (in the general case you would get -1 with EWOULDBLOCK and continue on your merry way to the next epoll_wait call).
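To make the cost in this scenario concrete, here is a minimal sketch of the conventional edge-triggered loop that keeps calling recv until it sees 0 or EAGAIN. The helper name `recv_until_eof_or_eagain` and its out-parameters are hypothetical, chosen only for illustration, and `MSG_DONTWAIT` is used here instead of assuming the socket was already set non-blocking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: the "read until 0 or EAGAIN" loop.
 * Returns the number of recv() calls made. *total receives the byte count,
 * *saw_eof is set to 1 only if recv() returned 0 (peer closed). */
static int recv_until_eof_or_eagain(int fd, char *buf, size_t cap,
                                    size_t *total, int *saw_eof)
{
    int calls = 0;

    assert(buf != NULL && cap > 0 && total != NULL && saw_eof != NULL);
    *total = 0;
    *saw_eof = 0;

    for (;;) {
        ssize_t n = recv(fd, buf, cap, MSG_DONTWAIT);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry without counting */
        calls++;
        if (n > 0) {
            *total += (size_t)n;    /* a real program would consume buf here */
            continue;               /* even after a short read, ask again */
        }
        if (n == 0)
            *saw_eof = 1;           /* orderly shutdown by the peer */
        return calls;               /* n == -1: EAGAIN/EWOULDBLOCK or an error */
    }
}
```

With 64 bytes pending, this loop makes two recv calls whether or not the peer has closed: the second call returns 0 in the close case and -1 with EAGAIN otherwise, which is exactly the extra call the question is asking about.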

Given this, it seems one would always have to call recv at least twice per read event when using edge-triggered epoll. Am I missing something here? It seems grossly inefficient.


There are 2 best solutions below

Best answer

OK, I've found the answer; I'm posting it here to help anyone else who hits this down the line. To account for this scenario in edge-triggered read processing, you need to add EPOLLRDHUP to the events in your epoll interest list. With this set, on the last "collapsed" data + close event, epoll_wait will return EPOLLIN | EPOLLRDHUP together. The application should read the pending data and then treat the EPOLLRDHUP as the close event.
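A minimal sketch of that approach follows. The names `watch_socket`, `drain_event`, and `struct read_result` are hypothetical, not part of any API. The sketch assumes a stream socket, where a short recv means the receive queue is empty, and it also treats EPOLLHUP as a close, which goes slightly beyond what the answer states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result type for one edge-triggered read event. */
struct read_result {
    size_t total;       /* bytes read during this event */
    int    recv_calls;  /* number of recv() calls issued */
    bool   peer_closed; /* peer closed or shut down its writing half */
    int    err;         /* errno of a real failure, 0 otherwise */
};

/* Hypothetical helper: register fd for edge-triggered reads plus EPOLLRDHUP. */
static int watch_socket(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLRDHUP | EPOLLET;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: read until a short read, EOF, or EAGAIN, and take the
 * close from the event mask instead of from an extra recv() returning 0. */
static struct read_result drain_event(int fd, uint32_t events,
                                      char *buf, size_t cap)
{
    struct read_result r = {0, 0, false, 0};
    bool hup = (events & (EPOLLRDHUP | EPOLLHUP)) != 0;

    assert(buf != NULL && cap > 0);
    for (;;) {
        ssize_t n = recv(fd, buf, cap, MSG_DONTWAIT);
        r.recv_calls++;
        if (n > 0) {
            r.total += (size_t)n;     /* a real program would consume buf here */
            if ((size_t)n < cap) {    /* short read: queue drained */
                r.peer_closed = hup;  /* the close was already reported */
                break;
            }
            continue;                 /* buffer was filled, more may be pending */
        }
        if (n == 0) {                 /* EOF seen directly */
            r.peer_closed = true;
            break;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            r.peer_closed = hup;      /* drained exactly on a buffer boundary */
        else
            r.err = errno;            /* real error, left to the caller */
        break;
    }
    return r;
}
```

In the 64-bytes-then-close scenario with a 4k buffer, this should need only one recv call: the short read says the data is drained, and EPOLLRDHUP in the event mask says the peer is gone, so the caller can close the socket without probing for the 0.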

Other answer

https://man7.org/linux/man-pages/man7/epoll.7.html

For stream-oriented files (e.g., pipe, FIFO, stream socket), the condition that the read/write I/O space is exhausted can also be detected by checking the amount of data read from / written to the target file descriptor. For example, if you call read(2) by asking to read a certain amount of data and read(2) returns a lower number of bytes, you can be sure of having exhausted the read I/O space for the file descriptor. The same is true when writing using write(2). (Avoid this latter technique if you cannot guarantee that the monitored file descriptor always refers to a stream-oriented file.)
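The short-read check described in that passage can be sketched in isolation as follows. The helper name `read_until_short` is hypothetical, and the sketch assumes `fd` is a non-blocking, stream-oriented descriptor, as the man page's caveat requires.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read until read(2) returns fewer bytes than requested.
 * Assumes fd is a non-blocking, stream-oriented descriptor.
 * Returns the number of read() calls made; *total receives the byte count. */
static int read_until_short(int fd, char *buf, size_t cap, size_t *total)
{
    int calls = 0;

    assert(buf != NULL && cap > 0 && total != NULL);
    *total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, cap);
        if (n < 0 && errno == EINTR)
            continue;            /* interrupted, retry without counting */
        calls++;
        if (n <= 0)
            return calls;        /* EOF (0) or EAGAIN/error (-1): stop */
        *total += (size_t)n;     /* a real program would consume buf here */
        if ((size_t)n < cap)
            return calls;        /* short read: read space exhausted */
    }
}
```

Note that the short read only says the data is exhausted. It does not report that the peer has closed, which is the gap the question is about.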