VFS: process char driver replies from main thread

Previously, processing of some replies coming from character drivers
could block on locks, and therefore, such processing was done from
threads that were associated to the character driver process. The
hidden consequence of this was that if all threads were in use, VFS
could drop replies coming from the driver. This patch returns VFS to
a situation where the replies from character drivers are processed
instantly from the main thread, by removing the situations that may
cause VFS to block while handling those replies.

- change the locking model for select, so that it will never block
  on any processing that happens after the select call has been set
  up, in particular processing of character driver select replies;
- clearly mark all select routines that may never block;
- protect against race conditions in do_select as result of the
  locking that still does happen there (as is required for pipes);
- also handle select timers from the main thread;
- move processing of character driver replies into device.c.

Change-Id: I4dc8e69f265cbd178de0fbf321d35f58f067cc57
This commit is contained in:
David van Moolenbroek 2013-08-30 13:42:51 +02:00 committed by Lionel Sambuc
parent 2e9f4d0198
commit f707937192
8 changed files with 214 additions and 176 deletions

View file

@ -143,21 +143,21 @@ thread is quite complicated. Consider Table 1.
--------------------------------------------------------- ---------------------------------------------------------
| File Server reply | resume | | | | File Server reply | resume | | |
+----------------------+----------+----------+----------+ +----------------------+----------+----------+----------+
| Sync. driver reply | resume | | | | Block driver reply | resume | | |
+----------------------+----------+----------+----------+ +----------------------+----------+----------+----------+
| Async. driver reply | resume/X | X | | | Char. driver reply | resume/X | | |
--------------------------------------------------------- ---------------------------------------------------------
}}} }}}
Table 1: VFS' message fetching main loop. X means 'start thread'. Table 1: VFS' message fetching main loop. X means 'start thread'.
The reason why asynchronous driver replies get their own thread is for the Driver replies are processed directly from the main thread. As a consequence,
following. In some cases, a reply has a thread blocked waiting for it which these processing routines may not block their calling thread. In some cases,
can be resumed (e.g., open). In another case there's a lot of work to be these routines may resume a thread that is blocked waiting for the reply. This
done which involves sending new messages (e.g., select replies). Finally, is always the case for block driver replies, and may or may not be the case for
DEV_REVIVE replies unblock suspended processes which in turn generate new jobs character driver replies. The character driver reply processing routines may
to be handled by the main loop (e.g., suspended reads and writes). So depending also unblock suspended processes which in turn generate new jobs to be handled
on the reply a new thread has to be started. Having all this logic in the main by the main loop (e.g., suspended reads and writes on pipes). So depending
loop is messy, so we start a thread regardless of the actual reply contents. on the reply a new thread may have to be started.
When there are no worker threads available and there is no need to invoke When there are no worker threads available and there is no need to invoke
the deadlock resolver (i.e., normal system calls), the request is queued in the deadlock resolver (i.e., normal system calls), the request is queued in
the fproc table. This works because a process can send only one system call the fproc table. This works because a process can send only one system call
@ -171,18 +171,17 @@ of pending requests. Moreover, this queueing mechanism is also the reason
why notifications from the kernel are handled by the system worker; the why notifications from the kernel are handled by the system worker; the
kernel has no corresponding fproc table entry (so we can't store it there) kernel has no corresponding fproc table entry (so we can't store it there)
and the linked list has no dependencies on that table. and the linked list has no dependencies on that table.
Communication with drivers is asynchronous even when the driver uses the Communication with block drivers is asynchronous, but at this time, access to
synchronous driver protocol. However, to guarantee identical behavior, these drivers is serialized on a per-driver basis. File Servers are treated
access to synchronous drivers is serialized. File Servers are treated differently. VFS was designed to be able to send requests concurrently to File
differently. VFS was designed to be able to send requests concurrently to Servers, although at the time of writing there are no File Servers that can
File Servers, although at the time of writing there are no File Servers that actually make use of that functionality. To identify which reply from an FS
can actually make use of that functionality. To identify which reply from an belongs to which worker thread, all requests have an embedded transaction
FS belongs to which worker thread, all requests have an embedded transaction identification number (a magic number + thread id encoded in the mtype field of
identification number (a magic number + thread id encoded in the mtype field a message) which the FS has to echo upon reply. Because the range of valid
of a message) which the FS has to echo upon reply. Because the range of valid transaction IDs is isolated from valid system call numbers, VFS can use that ID
transaction IDs is isolated from valid system call numbers, VFS can use that to differentiate between replies from File Servers and actual new system calls
ID to differentiate between replies from File Servers and actual new system from FSes. Using this mechanism VFS is able to support FUSE and ProcFS.
calls from FSes. Using this mechanism VFS is able to support FUSE and ProcFS.
== Locking == == Locking ==
## 4 Locking ## 4 Locking

View file

@ -5,8 +5,10 @@
* dev_open: open a character device * dev_open: open a character device
* dev_reopen: reopen a character device after a driver crash * dev_reopen: reopen a character device after a driver crash
* dev_close: close a character device * dev_close: close a character device
* cdev_reply: process the result of a character driver request
* bdev_open: open a block device * bdev_open: open a block device
* bdev_close: close a block device * bdev_close: close a block device
* bdev_reply: process the result of a block driver request
* dev_io: FS does a read or write on a device * dev_io: FS does a read or write on a device
* gen_opcl: generic call to a character driver to perform an open/close * gen_opcl: generic call to a character driver to perform an open/close
* gen_io: generic call to a character driver to initiate I/O * gen_io: generic call to a character driver to initiate I/O
@ -17,7 +19,6 @@
* ctty_io: perform controlling-tty-specific processing for I/O * ctty_io: perform controlling-tty-specific processing for I/O
* pm_setsid: perform VFS's side of setsid system call * pm_setsid: perform VFS's side of setsid system call
* do_ioctl: perform the IOCTL system call * do_ioctl: perform the IOCTL system call
* task_reply: process the result of a character driver I/O request
* dev_select: initiate a select call on a device * dev_select: initiate a select call on a device
* dev_cancel: cancel an I/O request, blocking until it has been cancelled * dev_cancel: cancel an I/O request, blocking until it has been cancelled
*/ */
@ -45,6 +46,7 @@ static int block_io(endpoint_t driver_e, message *mess_ptr);
static cp_grant_id_t make_grant(endpoint_t driver_e, endpoint_t user_e, int op, static cp_grant_id_t make_grant(endpoint_t driver_e, endpoint_t user_e, int op,
void *buf, size_t size); void *buf, size_t size);
static void restart_reopen(int major); static void restart_reopen(int major);
static void reopen_reply(message *m_ptr);
static int dummyproc; static int dummyproc;
@ -943,14 +945,17 @@ void cdev_up(int maj)
/*===========================================================================* /*===========================================================================*
* open_reply * * open_reply *
*===========================================================================*/ *===========================================================================*/
void open_reply(void) static void open_reply(message *m_ptr)
{ {
/* A character driver has replied to an open request. This function MUST NOT
* block its calling thread.
*/
struct fproc *rfp; struct fproc *rfp;
struct worker_thread *wp; struct worker_thread *wp;
endpoint_t proc_e; endpoint_t proc_e;
int slot; int slot;
proc_e = job_m_in.REP_ENDPT; proc_e = m_ptr->REP_ENDPT;
if (isokendpt(proc_e, &slot) != OK) return; if (isokendpt(proc_e, &slot) != OK) return;
rfp = &fproc[slot]; rfp = &fproc[slot];
wp = worker_get(rfp->fp_wtid); wp = worker_get(rfp->fp_wtid);
@ -958,7 +963,7 @@ void open_reply(void)
printf("VFS: no worker thread waiting for a reply from %d\n", who_e); printf("VFS: no worker thread waiting for a reply from %d\n", who_e);
return; return;
} }
*wp->w_drv_sendrec = job_m_in; *wp->w_drv_sendrec = *m_ptr;
worker_signal(wp); /* Continue open */ worker_signal(wp); /* Continue open */
} }
@ -966,25 +971,29 @@ void open_reply(void)
/*===========================================================================* /*===========================================================================*
* task_reply * * task_reply *
*===========================================================================*/ *===========================================================================*/
void task_reply(void) static void task_reply(message *m_ptr)
{ {
/* A character driver has results for a read, write, or ioctl call. */ /* A character driver has results for a read, write, or ioctl call. There may
* be a thread waiting for these results as part of an ongoing dev_cancel call.
* If so, wake up that thread; if not, send a reply to the requesting process.
* This function MUST NOT block its calling thread.
*/
struct fproc *rfp; struct fproc *rfp;
struct worker_thread *wp; struct worker_thread *wp;
endpoint_t proc_e; endpoint_t proc_e;
int slot; int slot;
proc_e = job_m_in.REP_ENDPT; proc_e = m_ptr->REP_ENDPT;
if (proc_e == VFS_PROC_NR) if (proc_e == VFS_PROC_NR)
proc_e = find_suspended_ep(job_m_in.m_source, job_m_in.REP_IO_GRANT); proc_e = find_suspended_ep(m_ptr->m_source, m_ptr->REP_IO_GRANT);
else else
printf("VFS: endpoint %u from %u is not VFS\n", printf("VFS: endpoint %u from %u is not VFS\n",
proc_e, job_m_in.m_source); proc_e, m_ptr->m_source);
if (proc_e == NONE) { if (proc_e == NONE) {
printf("VFS: proc with grant %d from %d not found\n", printf("VFS: proc with grant %d from %d not found\n",
job_m_in.REP_IO_GRANT, job_m_in.m_source); m_ptr->REP_IO_GRANT, m_ptr->m_source);
} else if (job_m_in.REP_STATUS == SUSPEND) { } else if (m_ptr->REP_STATUS == SUSPEND) {
printf("VFS: got SUSPEND on DEV_REVIVE: not reviving proc\n"); printf("VFS: got SUSPEND on DEV_REVIVE: not reviving proc\n");
} else { } else {
/* If there is a thread active for this process, we assume that this /* If there is a thread active for this process, we assume that this
@ -997,20 +1006,61 @@ void task_reply(void)
wp = worker_get(rfp->fp_wtid); wp = worker_get(rfp->fp_wtid);
if (wp != NULL && wp->w_task == who_e) { if (wp != NULL && wp->w_task == who_e) {
assert(!fp_is_blocked(rfp)); assert(!fp_is_blocked(rfp));
*wp->w_drv_sendrec = job_m_in; *wp->w_drv_sendrec = *m_ptr;
worker_signal(wp); /* Continue cancel */ worker_signal(wp); /* Continue cancel */
} else { } else {
revive(proc_e, job_m_in.REP_STATUS); revive(proc_e, m_ptr->REP_STATUS);
} }
} }
} }
/*===========================================================================* /*===========================================================================*
* dev_reply * * close_reply *
*===========================================================================*/ *===========================================================================*/
void dev_reply(struct dmap *dp) static void close_reply(message *m_ptr __unused)
{ {
/* A character driver replied to a close request. There is no need to do
* anything here, because we cheat: we assume that the close operation will
* succeed anyway, so we don't wait for the reply.
*/
/* Nothing. */
}
/*===========================================================================*
* cdev_reply *
*===========================================================================*/
void cdev_reply(void)
{
/* A character driver has results for us. */
switch (call_nr) {
case DEV_OPEN_REPL: open_reply(&m_in); break;
case DEV_REOPEN_REPL: reopen_reply(&m_in); break;
case DEV_CLOSE_REPL: close_reply(&m_in); break;
case DEV_REVIVE: task_reply(&m_in); break;
case DEV_SEL_REPL1:
select_reply1(m_in.m_source, m_in.DEV_MINOR, m_in.DEV_SEL_OPS);
break;
case DEV_SEL_REPL2:
select_reply2(m_in.m_source, m_in.DEV_MINOR, m_in.DEV_SEL_OPS);
break;
default:
printf("VFS: char driver %u sent unknown reply %x\n", who_e, call_nr);
}
}
/*===========================================================================*
* bdev_reply *
*===========================================================================*/
void bdev_reply(struct dmap *dp)
{
/* A block driver has results for a call. There must be a thread waiting for
* these results - wake it up. This function MUST NOT block its calling thread.
*/
struct worker_thread *wp; struct worker_thread *wp;
assert(dp != NULL); assert(dp != NULL);
@ -1098,7 +1148,7 @@ int maj;
/*===========================================================================* /*===========================================================================*
* reopen_reply * * reopen_reply *
*===========================================================================*/ *===========================================================================*/
void reopen_reply() static void reopen_reply(message *m_ptr)
{ {
endpoint_t driver_e; endpoint_t driver_e;
int filp_no, status, maj; int filp_no, status, maj;
@ -1106,9 +1156,9 @@ void reopen_reply()
struct vnode *vp; struct vnode *vp;
struct dmap *dp; struct dmap *dp;
driver_e = job_m_in.m_source; driver_e = m_ptr->m_source;
filp_no = job_m_in.REP_ENDPT; filp_no = m_ptr->REP_ENDPT;
status = job_m_in.REP_STATUS; status = m_ptr->REP_STATUS;
if (filp_no < 0 || filp_no >= NR_FILPS) { if (filp_no < 0 || filp_no >= NR_FILPS) {
printf("VFS: reopen_reply: bad filp number %d from driver %d\n", printf("VFS: reopen_reply: bad filp number %d from driver %d\n",

View file

@ -19,6 +19,7 @@ EXTERN struct filp {
/* the following fields are for select() and are owned by the generic /* the following fields are for select() and are owned by the generic
* select() code (i.e., fd-type-specific select() code can't touch these). * select() code (i.e., fd-type-specific select() code can't touch these).
* These fields may be changed without holding the filp lock.
*/ */
int filp_selectors; /* select()ing processes blocking on this fd */ int filp_selectors; /* select()ing processes blocking on this fd */
int filp_select_ops; /* interested in these SEL_* operations */ int filp_select_ops; /* interested in these SEL_* operations */

View file

@ -40,8 +40,6 @@ EXTERN unsigned long calls_stats[NCALLS];
#endif #endif
/* Thread related prototypes */ /* Thread related prototypes */
static void *do_char_dev_result(void *arg);
static void *do_control_msgs(void *arg);
static void *do_fs_reply(struct job *job); static void *do_fs_reply(struct job *job);
static void *do_work(void *arg); static void *do_work(void *arg);
static void *do_pm(void *arg); static void *do_pm(void *arg);
@ -109,12 +107,20 @@ int main(void)
continue; continue;
} else if (is_notify(call_nr)) { } else if (is_notify(call_nr)) {
/* A task notify()ed us */ /* A task notify()ed us */
if (who_e == DS_PROC_NR) switch (who_e) {
case DS_PROC_NR:
handle_work(ds_event); handle_work(ds_event);
else if (who_e == KERNEL) break;
case KERNEL:
mthread_stacktraces(); mthread_stacktraces();
else break;
sys_worker_start(do_control_msgs); case CLOCK:
/* Timer expired. Used only for select(). Check it. */
expire_timers(m_in.NOTIFY_TIMESTAMP);
break;
default:
printf("VFS: ignoring notification from %d\n", who_e);
}
continue; continue;
} else if (who_p < 0) { /* i.e., message comes from a task */ } else if (who_p < 0) { /* i.e., message comes from a task */
/* We're going to ignore this message. Tasks should /* We're going to ignore this message. Tasks should
@ -138,14 +144,14 @@ int main(void)
dp = get_dmap(who_e); dp = get_dmap(who_e);
if (dp != NULL) { if (dp != NULL) {
if (!IS_BDEV_RS(call_nr)) { if (!IS_BDEV_RS(call_nr)) {
handle_work(do_char_dev_result); cdev_reply();
} else { } else {
if (dp->dmap_servicing == NONE) { if (dp->dmap_servicing == NONE) {
printf("Got spurious dev reply from %d", printf("Got spurious dev reply from %d",
who_e); who_e);
} else { } else {
dev_reply(dp); bdev_reply(dp);
} }
} }
continue; continue;
@ -209,51 +215,6 @@ static void handle_work(void *(*func)(void *arg))
worker_start(func); worker_start(func);
} }
/*===========================================================================*
* do_char_dev_result *
*===========================================================================*/
static void *do_char_dev_result(void *arg)
{
struct job my_job;
my_job = *((struct job *) arg);
fp = my_job.j_fp;
/* A character driver has results for us. */
if (job_call_nr == DEV_REVIVE) task_reply();
else if (job_call_nr == DEV_OPEN_REPL) open_reply();
else if (job_call_nr == DEV_REOPEN_REPL) reopen_reply();
else if (job_call_nr == DEV_CLOSE_REPL) close_reply();
else if (job_call_nr == DEV_SEL_REPL1)
select_reply1(job_m_in.m_source, job_m_in.DEV_MINOR,
job_m_in.DEV_SEL_OPS);
else if (job_call_nr == DEV_SEL_REPL2)
select_reply2(job_m_in.m_source, job_m_in.DEV_MINOR,
job_m_in.DEV_SEL_OPS);
thread_cleanup(fp);
return(NULL);
}
/*===========================================================================*
* do_control_msgs *
*===========================================================================*/
static void *do_control_msgs(void *arg)
{
struct job my_job;
my_job = *((struct job *) arg);
fp = my_job.j_fp;
/* Check for special control messages. */
if (job_m_in.m_source == CLOCK) {
/* Alarm timer expired. Used only for select(). Check it. */
expire_timers(job_m_in.NOTIFY_TIMESTAMP);
}
thread_cleanup(NULL);
return(NULL);
}
/*===========================================================================* /*===========================================================================*
* do_fs_reply * * do_fs_reply *

View file

@ -707,11 +707,3 @@ int fd_nr;
return(OK); return(OK);
} }
/*===========================================================================*
* close_reply *
*===========================================================================*/
void close_reply()
{
/* No need to do anything */
}

View file

@ -468,7 +468,8 @@ int count; /* max number of processes to release */
void revive(endpoint_t proc_e, int returned) void revive(endpoint_t proc_e, int returned)
{ {
/* Revive a previously blocked process. When a process hangs on tty, this /* Revive a previously blocked process. When a process hangs on tty, this
* is the way it is eventually released. * is the way it is eventually released. For processes blocked on _SELECT and
* _OTHER, this function MUST NOT block its calling thread.
*/ */
struct fproc *rfp; struct fproc *rfp;
int blocked_on; int blocked_on;

View file

@ -31,10 +31,11 @@ void send_work(void);
/* device.c */ /* device.c */
int dev_open(dev_t dev, endpoint_t proc_e, int flags); int dev_open(dev_t dev, endpoint_t proc_e, int flags);
int dev_reopen(dev_t dev, int filp_no, int flags); int dev_reopen(dev_t dev, int filp_no, int flags);
void dev_reply(struct dmap *dp);
int dev_close(dev_t dev, int filp_no); int dev_close(dev_t dev, int filp_no);
void cdev_reply(void);
int bdev_open(dev_t dev, int access); int bdev_open(dev_t dev, int access);
int bdev_close(dev_t dev); int bdev_close(dev_t dev);
void bdev_reply(struct dmap *dp);
int dev_io(int op, dev_t dev, endpoint_t proc_e, void *buf, off_t pos, int dev_io(int op, dev_t dev, endpoint_t proc_e, void *buf, off_t pos,
size_t bytes, int flags, int suspend_reopen); size_t bytes, int flags, int suspend_reopen);
int gen_opcl(int op, dev_t dev, endpoint_t task_nr, int flags); int gen_opcl(int op, dev_t dev, endpoint_t task_nr, int flags);
@ -51,9 +52,6 @@ int dev_cancel(dev_t dev);
void pm_setsid(endpoint_t proc_e); void pm_setsid(endpoint_t proc_e);
void bdev_up(int major); void bdev_up(int major);
void cdev_up(int major); void cdev_up(int major);
void reopen_reply(void);
void open_reply(void);
void task_reply(void);
/* dmap.c */ /* dmap.c */
void lock_dmap(struct dmap *dp); void lock_dmap(struct dmap *dp);
@ -165,7 +163,6 @@ void unmount_all(int force);
/* open.c */ /* open.c */
int do_close(message *m_out); int do_close(message *m_out);
int close_fd(struct fproc *rfp, int fd_nr); int close_fd(struct fproc *rfp, int fd_nr);
void close_reply(void);
int common_open(char path[PATH_MAX], int oflags, mode_t omode); int common_open(char path[PATH_MAX], int oflags, mode_t omode);
int do_creat(void); int do_creat(void);
int do_lseek(message *m_out); int do_lseek(message *m_out);

View file

@ -4,6 +4,14 @@
* do_select: perform the SELECT system call * do_select: perform the SELECT system call
* select_callback: notify select system of possible fd operation * select_callback: notify select system of possible fd operation
* select_unsuspend_by_endpt: cancel a blocking select on exiting driver * select_unsuspend_by_endpt: cancel a blocking select on exiting driver
*
* The select code uses minimal locking, so that the replies from character
* drivers can be processed without blocking. Filps are locked only for pipes.
* We make the assumption that any other structures and fields are safe to
* check (and possibly change) as long as we know that a process is blocked on
* a select(2) call, meaning that all involved filps are guaranteed to stay
* open until either we finish the select call, it the process gets interrupted
* by a signal.
*/ */
#include "fs.h" #include "fs.h"
@ -37,13 +45,12 @@ static struct selectentry {
int nfds, nreadyfds; int nfds, nreadyfds;
int error; int error;
char block; char block;
char starting;
clock_t expiry; clock_t expiry;
timer_t timer; /* if expiry > 0 */ timer_t timer; /* if expiry > 0 */
} selecttab[MAXSELECTS]; } selecttab[MAXSELECTS];
static int copy_fdsets(struct selectentry *se, int nfds, int static int copy_fdsets(struct selectentry *se, int nfds, int direction);
direction);
static int do_select_request(struct selectentry *se, int fd, int *ops);
static void filp_status(struct filp *fp, int status); static void filp_status(struct filp *fp, int status);
static int is_deferred(struct selectentry *se); static int is_deferred(struct selectentry *se);
static void restart_proc(struct selectentry *se); static void restart_proc(struct selectentry *se);
@ -86,13 +93,15 @@ int do_select(message *UNUSED(m_out))
{ {
/* Implement the select(nfds, readfds, writefds, errorfds, timeout) system /* Implement the select(nfds, readfds, writefds, errorfds, timeout) system
* call. First we copy the arguments and verify their sanity. Then we check * call. First we copy the arguments and verify their sanity. Then we check
* whether there are file descriptors that satisfy the select call right of the * whether there are file descriptors that satisfy the select call right off
* bat. If so, or if there are no ready file descriptors but the process * the bat. If so, or if there are no ready file descriptors but the process
* requested to return immediately, we return the result. Otherwise we set a * requested to return immediately, we return the result. Otherwise we set a
* timeout and wait for either the file descriptors to become ready or the * timeout and wait for either the file descriptors to become ready or the
* timer to go off. If no timeout value was provided, we wait indefinitely. */ * timer to go off. If no timeout value was provided, we wait indefinitely.
*/
int r, nfds, do_timeout = 0, fd, s; int r, nfds, do_timeout = 0, fd, s;
struct filp *f;
unsigned int type, ops;
struct timeval timeout; struct timeval timeout;
struct selectentry *se; struct selectentry *se;
vir_bytes vtimeout; vir_bytes vtimeout;
@ -151,11 +160,14 @@ int do_select(message *UNUSED(m_out))
se->block = 0; se->block = 0;
se->expiry = 0; /* no timer set (yet) */ se->expiry = 0; /* no timer set (yet) */
/* We are going to lock filps, and that means that while locking a second
* filp, we might already get the results for the first one. In that case,
* the incoming results must not cause the select call to finish prematurely.
*/
se->starting = TRUE;
/* Verify that file descriptors are okay to select on */ /* Verify that file descriptors are okay to select on */
for (fd = 0; fd < nfds; fd++) { for (fd = 0; fd < nfds; fd++) {
struct filp *f;
unsigned int type, ops;
/* Because the select() interface implicitly includes file descriptors /* Because the select() interface implicitly includes file descriptors
* you might not want to select on, we have to figure out whether we're * you might not want to select on, we have to figure out whether we're
* interested in them. Typically, these file descriptors include fd's * interested in them. Typically, these file descriptors include fd's
@ -208,9 +220,6 @@ int do_select(message *UNUSED(m_out))
/* Check all file descriptors in the set whether one is 'ready' now */ /* Check all file descriptors in the set whether one is 'ready' now */
for (fd = 0; fd < nfds; fd++) { for (fd = 0; fd < nfds; fd++) {
int ops, r;
struct filp *f;
/* Again, check for involuntarily selected fd's */ /* Again, check for involuntarily selected fd's */
if (!(ops = tab2ops(fd, se))) if (!(ops = tab2ops(fd, se)))
continue; /* No operations set; nothing to do for this fd */ continue; /* No operations set; nothing to do for this fd */
@ -222,9 +231,15 @@ int do_select(message *UNUSED(m_out))
int wantops; int wantops;
wantops = (f->filp_select_ops |= ops); wantops = (f->filp_select_ops |= ops);
r = do_select_request(se, fd, &wantops); type = se->type[fd];
if (r != OK && r != SUSPEND) select_lock_filp(f, wantops);
r = fdtypes[type].select_request(f, &wantops, se->block);
unlock_filp(f);
if (r != OK && r != SUSPEND) {
se->error = r;
se->block = 0; /* Stop blocking to return asap */
break; /* Error or bogus return code; abort */ break; /* Error or bogus return code; abort */
}
/* The select request above might have turned on/off some /* The select request above might have turned on/off some
* operations because they were 'ready' or not meaningful. * operations because they were 'ready' or not meaningful.
@ -234,6 +249,9 @@ int do_select(message *UNUSED(m_out))
} }
} }
/* At this point there won't be any blocking calls anymore. */
se->starting = FALSE;
if ((se->nreadyfds > 0 || !se->block) && !is_deferred(se)) { if ((se->nreadyfds > 0 || !se->block) && !is_deferred(se)) {
/* fd's were found that were ready to go right away, and/or /* fd's were found that were ready to go right away, and/or
* we were instructed not to block at all. Must return * we were instructed not to block at all. Must return
@ -289,6 +307,9 @@ static int is_deferred(struct selectentry *se)
int fd; int fd;
struct filp *f; struct filp *f;
/* The select call must have finished its initialization at all. */
if (se->starting) return(TRUE);
for (fd = 0; fd < se->nfds; fd++) { for (fd = 0; fd < se->nfds; fd++) {
if ((f = se->filps[fd]) == NULL) continue; if ((f = se->filps[fd]) == NULL) continue;
if (f->filp_select_flags & (FSF_UPDATE|FSF_BUSY)) return(TRUE); if (f->filp_select_flags & (FSF_UPDATE|FSF_BUSY)) return(TRUE);
@ -320,7 +341,10 @@ static int is_pipe(struct filp *f)
*===========================================================================*/ *===========================================================================*/
static int is_supported_major(struct filp *f) static int is_supported_major(struct filp *f)
{ {
/* See if this filp is a handle on a device on which we support select() */ /* See if this filp is a handle on a character device on which we support
* select(). This function MUST NOT block its calling thread. The given filp
* may or may not be locked.
*/
unsigned int m; unsigned int m;
if (!(f && f->filp_vno)) return(FALSE); if (!(f && f->filp_vno)) return(FALSE);
@ -338,6 +362,11 @@ static int is_supported_major(struct filp *f)
*===========================================================================*/ *===========================================================================*/
static int select_request_major(struct filp *f, int *ops, int block) static int select_request_major(struct filp *f, int *ops, int block)
{ {
/* Check readiness status on a supported character device. Unless suitable
* results are available right now, this will only initiate the polling
* process, causing result processing to be deferred. This function MUST NOT
* block its calling thread. The given filp may or may not be locked.
*/
int r, rops, major; int r, rops, major;
struct dmap *dp; struct dmap *dp;
@ -407,6 +436,9 @@ static int select_request_file(struct filp *UNUSED(f), int *UNUSED(ops),
*===========================================================================*/ *===========================================================================*/
static int select_request_pipe(struct filp *f, int *ops, int block) static int select_request_pipe(struct filp *f, int *ops, int block)
{ {
/* Check readiness status on a pipe. The given filp is locked. This function
* may block its calling thread if necessary.
*/
int orig_ops, r = 0, err; int orig_ops, r = 0, err;
orig_ops = *ops; orig_ops = *ops;
@ -498,6 +530,9 @@ static void ops2tab(int ops, int fd, struct selectentry *e)
*===========================================================================*/ *===========================================================================*/
static int copy_fdsets(struct selectentry *se, int nfds, int direction) static int copy_fdsets(struct selectentry *se, int nfds, int direction)
{ {
/* Copy FD sets from or to the user process calling select(2). This function
* MUST NOT block the calling thread.
*/
int r; int r;
size_t fd_setsize; size_t fd_setsize;
endpoint_t src_e, dst_e; endpoint_t src_e, dst_e;
@ -549,7 +584,9 @@ static int copy_fdsets(struct selectentry *se, int nfds, int direction)
*===========================================================================*/ *===========================================================================*/
static void select_cancel_all(struct selectentry *se) static void select_cancel_all(struct selectentry *se)
{ {
/* Cancel select. Decrease select usage and cancel timer */ /* Cancel select, possibly on success. Decrease select usage and cancel timer.
* This function MUST NOT block its calling thread.
*/
int fd; int fd;
struct filp *f; struct filp *f;
@ -573,15 +610,15 @@ static void select_cancel_all(struct selectentry *se)
*===========================================================================*/ *===========================================================================*/
static void select_cancel_filp(struct filp *f) static void select_cancel_filp(struct filp *f)
{ {
/* Reduce number of select users of this filp */ /* Reduce the number of select users of this filp. This function MUST NOT block
* its calling thread.
*/
dev_t major; dev_t major;
assert(f); assert(f);
assert(f->filp_selectors > 0); assert(f->filp_selectors > 0);
assert(f->filp_count > 0); assert(f->filp_count > 0);
select_lock_filp(f, f->filp_select_ops);
f->filp_selectors--; f->filp_selectors--;
if (f->filp_selectors == 0) { if (f->filp_selectors == 0) {
/* No one selecting on this filp anymore, forget about select state */ /* No one selecting on this filp anymore, forget about select state */
@ -600,8 +637,6 @@ static void select_cancel_filp(struct filp *f)
dmap[major].dmap_sel_filp = NULL; /* leave _busy set */ dmap[major].dmap_sel_filp = NULL; /* leave _busy set */
} }
} }
unlock_filp(f);
} }
/*===========================================================================* /*===========================================================================*
@ -609,6 +644,9 @@ static void select_cancel_filp(struct filp *f)
*===========================================================================*/ *===========================================================================*/
static void select_return(struct selectentry *se) static void select_return(struct selectentry *se)
{ {
/* Return the results of a select call to the user process and revive the
* process. This function MUST NOT block its calling thread.
*/
int r, r1; int r, r1;
assert(!is_deferred(se)); /* Not done yet, first wait for async reply */ assert(!is_deferred(se)); /* Not done yet, first wait for async reply */
@ -632,6 +670,11 @@ static void select_return(struct selectentry *se)
*===========================================================================*/ *===========================================================================*/
void select_callback(struct filp *f, int status) void select_callback(struct filp *f, int status)
{ {
/* The status of a filp has changed, with the given ready operations or error.
* This function is currently called only for pipes, and holds the lock to
* the filp.
*/
filp_status(f, status); filp_status(f, status);
} }
@ -653,8 +696,9 @@ void init_select(void)
void select_forget(void) void select_forget(void)
{ {
/* The calling thread's associated process is expected to be unpaused, due to /* The calling thread's associated process is expected to be unpaused, due to
* a signal that is supposed to interrupt the current system call. * a signal that is supposed to interrupt the current system call. Totally
* Totally forget about the select(). * forget about the select(). This function may block its calling thread if
* necessary (but it doesn't).
*/ */
int slot; int slot;
struct selectentry *se; struct selectentry *se;
@ -667,6 +711,8 @@ void select_forget(void)
if (slot >= MAXSELECTS) return; /* Entry not found */ if (slot >= MAXSELECTS) return; /* Entry not found */
assert(se->starting == FALSE);
/* Do NOT test on is_deferred here. We can safely cancel ongoing queries. */ /* Do NOT test on is_deferred here. We can safely cancel ongoing queries. */
select_cancel_all(se); select_cancel_all(se);
} }
@ -677,6 +723,9 @@ void select_forget(void)
*===========================================================================*/ *===========================================================================*/
void select_timeout_check(timer_t *timer) void select_timeout_check(timer_t *timer)
{ {
/* An alarm has gone off for one of the select queries. This function MUST NOT
* block its calling thread.
*/
int s; int s;
struct selectentry *se; struct selectentry *se;
@ -740,8 +789,9 @@ endpoint_t driver_e;
int minor; int minor;
int status; int status;
{ {
/* Handle reply to DEV_SELECT request */ /* Handle the initial reply to DEV_SELECT request. This function MUST NOT
* block its calling thread.
*/
int major; int major;
dev_t dev; dev_t dev;
struct filp *f; struct filp *f;
@ -789,7 +839,6 @@ int status;
assert(f->filp_count >= 1); assert(f->filp_count >= 1);
assert(f->filp_select_flags & FSF_BUSY); assert(f->filp_select_flags & FSF_BUSY);
select_lock_filp(f, f->filp_select_ops);
f->filp_select_flags &= ~FSF_BUSY; f->filp_select_flags &= ~FSF_BUSY;
/* The select call is done now, except when /* The select call is done now, except when
@ -823,7 +872,6 @@ int status;
} }
} }
unlock_filp(f);
filp_status(f, status); /* Tell filp owners about the results */ filp_status(f, status); /* Tell filp owners about the results */
} }
@ -840,7 +888,9 @@ int minor;
int status; int status;
{ {
/* Handle secondary reply to DEV_SELECT request. A secondary reply occurs when /* Handle secondary reply to DEV_SELECT request. A secondary reply occurs when
* the select request is 'blocking' until an operation becomes ready. */ * the select request is 'blocking' until an operation becomes ready. This
* function MUST NOT block its calling thread.
*/
int major, slot, fd; int major, slot, fd;
dev_t dev; dev_t dev;
struct filp *f; struct filp *f;
@ -874,7 +924,6 @@ int status;
if (!S_ISCHR(vp->v_mode)) continue; if (!S_ISCHR(vp->v_mode)) continue;
if (vp->v_sdev != dev) continue; if (vp->v_sdev != dev) continue;
select_lock_filp(f, f->filp_select_ops);
if (status > 0) { /* Operations ready */ if (status > 0) { /* Operations ready */
/* Clear the replied bits from the request /* Clear the replied bits from the request
* mask unless FSF_UPDATE is set. * mask unless FSF_UPDATE is set.
@ -893,7 +942,6 @@ int status;
f->filp_select_flags &= ~FSF_BLOCKED; f->filp_select_flags &= ~FSF_BLOCKED;
ops2tab(SEL_RD|SEL_WR|SEL_ERR, fd, se); ops2tab(SEL_RD|SEL_WR|SEL_ERR, fd, se);
} }
unlock_filp(f);
if (se->nreadyfds > 0) restart_proc(se); if (se->nreadyfds > 0) restart_proc(se);
} }
} }
@ -904,11 +952,14 @@ int status;
/*===========================================================================* /*===========================================================================*
* select_restart_filps * * select_restart_filps *
*===========================================================================*/ *===========================================================================*/
static void select_restart_filps() static void select_restart_filps(void)
{ {
/* We got a result from a character driver, and now we need to check if we can
* restart deferred polling operations. This function MUST NOT block its
* calling thread.
*/
int fd, slot; int fd, slot;
struct filp *f; struct filp *f;
struct vnode *vp;
struct selectentry *se; struct selectentry *se;
/* Locate filps that can be restarted */ /* Locate filps that can be restarted */
@ -932,44 +983,25 @@ static void select_restart_filps()
if (!(f->filp_select_flags & FSF_UPDATE)) /* Must be in */ if (!(f->filp_select_flags & FSF_UPDATE)) /* Must be in */
continue; /* 'update' state */ continue; /* 'update' state */
/* This function is suitable only for character devices. In
* particular, checking pipes the same way would introduce a
* serious locking problem.
*/
assert(is_supported_major(f));
wantops = ops = f->filp_select_ops; wantops = ops = f->filp_select_ops;
vp = f->filp_vno; r = select_request_major(f, &wantops, se->block);
assert(S_ISCHR(vp->v_mode)); if (r != OK && r != SUSPEND) {
r = do_select_request(se, fd, &wantops); se->error = r;
if (r != OK && r != SUSPEND) se->block = 0; /* Stop blocking to return asap */
restart_proc(se);
break; /* Error or bogus return code; abort */ break; /* Error or bogus return code; abort */
}
if (wantops & ops) ops2tab(wantops, fd, se); if (wantops & ops) ops2tab(wantops, fd, se);
} }
} }
} }
/*===========================================================================*
* do_select_request *
*===========================================================================*/
static int do_select_request(se, fd, ops)
struct selectentry *se;
int fd;
int *ops;
{
/* Perform actual select request for file descriptor fd */
int r, type;
struct filp *f;
type = se->type[fd];
f = se->filps[fd];
select_lock_filp(f, *ops);
r = fdtypes[type].select_request(f, ops, se->block);
unlock_filp(f);
if (r != OK && r != SUSPEND) {
se->error = EINTR;
se->block = 0; /* Stop blocking to return asap */
if (!is_deferred(se)) select_cancel_all(se);
}
return(r);
}
/*===========================================================================* /*===========================================================================*
* filp_status * * filp_status *
*===========================================================================*/ *===========================================================================*/
@ -977,7 +1009,9 @@ static void filp_status(f, status)
struct filp *f; struct filp *f;
int status; int status;
{ {
/* Tell processes that need to know about the status of this filp */ /* Tell processes that need to know about the status of this filp. This
* function MUST NOT block its calling thread.
*/
int fd, slot; int fd, slot;
struct selectentry *se; struct selectentry *se;
@ -1003,7 +1037,8 @@ static void restart_proc(se)
struct selectentry *se; struct selectentry *se;
{ {
/* Tell process about select results (if any) unless there are still results /* Tell process about select results (if any) unless there are still results
* pending. */ * pending. This function MUST NOT block its calling thread.
*/
if ((se->nreadyfds > 0 || !se->block) && !is_deferred(se)) if ((se->nreadyfds > 0 || !se->block) && !is_deferred(se))
select_return(se); select_return(se);
@ -1033,8 +1068,10 @@ static void wipe_select(struct selectentry *se)
*===========================================================================*/ *===========================================================================*/
static void select_lock_filp(struct filp *f, int ops) static void select_lock_filp(struct filp *f, int ops)
{ {
/* Lock a filp and vnode based on which operations are requested */ /* Lock a filp and vnode based on which operations are requested. This function
tll_access_t locktype;; * may block its calling thread, obviously.
*/
tll_access_t locktype;
locktype = VNODE_READ; /* By default */ locktype = VNODE_READ; /* By default */