I/O Multiplexing

There will be the case where asynchronous I/O is desired in order to serve different I/O events at the same time. For example, a Program# might have to serve two read requests from different Processes. However, since serving one read request will always block another, it is unwise to listen just one of the socket and wait for it since the program can’t know which one will arrive first. It is the same concern for asynchronous Socket Programming where the server might need to serve two sockets at the same time while being ignorant about the order of their arrival. I/O Multiplexing using three Unix System Call# : select(), pselect(), and poll().

Select

select() system call, receiving either three file descriptor sets (described below) including read, write and exceptional file descriptor sets, allows the user process to instruct the Kernel to wait for either reading, writing, or exceptional I/O events to happen and to wake up the process only when one of these events occur. The first argument for select() should be the highest-numbered file descriptor (numfds) to be tested in any of passed three sets plus 1. With this system call, we can monitor several sockets (up until the limit numfds) via their File Descriptor# at the same time to see which one is ready for reading, which one is ready for writing etc. in a non-blocking manner. Meaning, the process will only be notified when the data is readily available for reading from either socket. An example usage of select() is shown below:

fd_set read_fds;
struct timeval tv;
tv.tv_sec = 2;        // in second
tv.tv_usec = 500000;  // in microsecond

// Process can read from standard input and return after 2.5 seconds
select(STDIN + 1, &read_fds, (fd_set *) 0, (fd_set *) 0, &tv);

As you can see, we initialised a timeval struct to indicate the timeout for the system call. If the value for tv_sec and tv_usec is 0, it indicates for a polling, that is to return select() immediately after checking the descriptors instead of waiting for the data to arrive. If there are values for two of these fields, this will be the timeout (the sum of tv_sec and tv_usec) for the select() to wait for available data. If timeval is a NULL, then it is essentially the same with synchronous I/O where the program will be blocked until there is available data to be read in the file descriptor.

We can further manipulate the file descriptor set (variable type of fd_set) with the function FD_ZERO(), FD_SET(), FD_CLR(), and FD_ISSET(). The usage is shown below:

fd_set set;
FD_ZERO(&set);      // initialise set as an empty set
FD_SET(1, &set);    // include fd 1, which is STDOUT, to the set
FD_CLR(3, &set);    // remove fd 3 from the set
FD_ISSET(10, &set); // return value indicates whether fd 10 is in the set

Note: In Socket Programming, we need to check whether the file descriptor that we are checking is the server socket file descriptor. Different behaviours are desired for server socket and client sockets as the former is for connection purpose and the latter is data-oriented.

Pselect

pselect() is quite similar to select() functionality wise, with one additional argument that point to a signal mask# and a change to the timeout structure. It accepts timespec struct instead of timeval, where the second field changes to tv_nsec in nanosecond unit. The signal mask will determine which signal should be blocked for the file descriptor set.

Poll

poll() is a System V system call that functionality wise identical to of select(). However, instead of organising the file descriptors by their type of operations, it organises them into an array where each of them could have different events to be handled individually and whether they have occurred. The following example shows its usage:

struct pollfd {
  int fd;         // file descriptor
  short events;   // requested events
  short revents;  // returned events
};

struct pollfd client[15];
client[0].fd = listenfd;
client[0].events = POLLRDNORM;

// waits indefinitely for the file decriptors in client to become ready to
// perform I/O, 16 defined the sizes of the pollfd to be checked.
// Upon successfull polling, it will return number of file descriptors that are
// ready
nready = poll(client, 16, INFTIM);

if (client[0].revents & POLLRDNORM) { // check if the event have occurred
  ...
}

The timeout value could be -1 (or INFTIM), 0 or more than 0. 0 indicates non-blocking behaviour, where the system call will return immediately if there is no data available. If the timeout value is more than 0, then it will be treated as the waiting time for the event to occurred measured in milliseconds.

The possible events could be handled in this system call are shown in the following table:

EventsMeaning
POLLINread other than high priority data without blocking
POLLRDNORMread normal data without blocking
POLLRDBANDread priority data without blocking
POLLPRIread high-priority data without blocking
POLLOUTwrite normal data without blocking
POLLWRNORMsame as POLLOUT
POLLERRerror occurred on the descriptor
POLLHUPdevice has been disconnected
POLLNVALfile descriptor invalid
#operating-system #io #async