Previously, VFS would reopen a character device after a driver crash
if the associated file descriptor was opened with the O_REOPEN flag.
This patch removes support for this feature. The code was complex,
full of uncovered corner cases, and hard to test. Moreover, it did not
actually hide the crash from user applications: they would get an
error code to indicate that something went wrong, and have to decide
based on the nature of the underlying device how to continue.
- remove support for O_REOPEN, and make playwave(1) reopen its device;
- remove support for the DEV_REOPEN protocol message;
- remove all code in VFS related to reopening character devices;
- no longer change VFS filp reference count and FD bitmap upon filp
invalidation; instead, make get_filp* fail all calls on invalidated
FDs except when obtained with the locktype VNODE_OPCL which is used
by close_fd only;
- remove the VFS fproc file descriptor bitmap entirely, returning to
the situation that a FD is in use if its slot points to a filp; use
FILP_CLOSED as single means of marking a filp as invalidated.
Change-Id: I34f6bc69a036b3a8fc667c1f80435ff3af56558f
POSIX states that when interrupted, partially successful pipe
operations should return the partial result rather than EINTR. VFS
previously wouldn't look at the partial result, and not clear it
either, which would result in a panic upon the next pipe operation.
Change-Id: Ia1eb72b4b77394051444e63a1390d49bb315eb04
The main purpose of this patch is to fix handling of unpause calls
from PM while another call is ongoing. The solution to this problem
sparked a full revision of the threading model, consisting of a large
number of related changes:
- all active worker threads are now always associated with a process,
and every process has at most one active thread working for it;
- the process lock is always held by a process's worker thread;
- a process can now have both normal work and postponed PM work
associated to it;
- timer expiry and non-postponed PM work is done from the main thread;
- filp garbage collection is done from a thread associated with VFS;
- reboot calls from PM are now done from a thread associated with PM;
- the DS events handler is protected from starting multiple threads;
- support for a system worker thread has been removed;
- the deadlock recovery thread has been replaced by a parameter to the
worker_start() function; the number of worker threads has
consequently been increased by one;
- saving and restoring of global but per-thread variables is now
centralized in worker_suspend() and worker_resume(); err_code is now
saved and restored in all cases;
- the concept of jobs has been removed, and job_m_in now points to a
message stored in the worker thread structure instead;
- the PM lock has been removed;
- the separate exec lock has been replaced by a lock on the VM
process, which was already being locked for exec calls anyway;
- PM_UNPAUSE is now processed as a postponed PM request, from a thread
associated with the target process;
- the FP_DROP_WORK flag has been removed, since it is no longer more
than just an optimization and only applied to processes operating on
a pipe when getting killed;
- assignment to "fp" now takes place only when obtaining new work in
the main thread or a worker thread, when resuming execution of a
thread, and in the special case of exiting processes during reboot;
- there are no longer special cases where the yield() call is used to
force a thread to run.
Change-Id: I7a97b9b95c2450454a9b5318dfa0e6150d4e6858
Previously, processing of some replies coming from character drivers
could block on locks, and therefore, such processing was done from
threads that were associated to the character driver process. The
hidden consequence of this was that if all threads were in use, VFS
could drop replies coming from the driver. This patch returns VFS to
a situation where the replies from character drivers are processed
instantly from the main thread, by removing the situations that may
cause VFS to block while handling those replies.
- change the locking model for select, so that it will never block
on any processing that happens after the select call has been set
up, in particular processing of character driver select replies;
- clearly mark all select routines that may never block;
- protect against race conditions in do_select as result of the
locking that still does happen there (as is required for pipes);
- also handle select timers from the main thread;
- move processing of character driver replies into device.c.
Change-Id: I4dc8e69f265cbd178de0fbf321d35f58f067cc57
These days, DEV_OPEN calls to character drivers block the calling
thread until completion or failure, and thus never return SUSPEND to
the caller. The same already applied to BDEV_OPEN calls to block
drivers. It has thus become impossible for a process to enter a state
of being blocked on a device open call.
There is currently no support for restarting device open calls to
restarted character drivers. This support was present in the _DOPEN
logic, but was already no longer triggering. In the future, this case
should be handled by the thread performing the open request.
Change-Id: I6cc1e7b4c9ed116c6ce160b315e6e060124dce00
- change all sync char drivers into async drivers;
- retire support for the sync protocol in libchardev;
- remove async dev style, as this is now the default;
- remove dev_status from VFS;
- clean up now-unused protocol messages.
Change-Id: I6aacff712292f6b29f2ccd51bc1e7d7003723e87
. unpause() and revive() can race - revive() can run during
a device i/o unblock, causing two sendnb()s to occur, and the
2nd one to fail
. this can easily happen when a process is blocking on tty and
is then killed by a signal - tty cancels the i/o and then
kills the process by a signal
Change-Id: Ia319acaedfa336b78c030a2c4af7246959bdcf87
. libc: add vfs_mmap, a way for vfs to initiate mmap()s.
This is a good special case to have as vfs is a slightly
different client from regular user processes. It doesn't do it
for itself, and has the dev & inode info already so the callback
to VFS for the lookup isn't necessary. So it has different info
to have to give to VM.
. libc: also add minix_mmap64() that accepts a 64-bit offset, even
though our off_t is still 32 bit now.
. On exec() time, try to mmap() in the executable if available.
(It is not yet available in this commit.)
. To support mmap(), add do_vm_call that allows VM to lookup
(to ino+dev), do i/o from and close FD's on behalf of other
processes.
Change-Id: I831551e45a6781c74313c450eb9c967a68505932
m_out is shared between threads as the reply message, and it can happen
results get overwritten by another thread before the reply is sent. This
change
. makes m_out local to the message handling function,
declared on the stack of the caller
. forces callers of reply() to give it a message, or
declare the reply message has no significant fields except
for the return code by calling replycode()
Change-Id: Id06300083a63c72c00f34f86a5c7d96e4bbdf9f6
Remove old versions of system calls and system calls that don't have
a libc api interface anymore (dup, dup2, creat).
VFS still contains support for old system call numbers for the new stat
system calls (i.e., 65, 66, 67) to keep supporting old binaries built for
MINIX 3.2.1 (prior to the release).
Change-Id: I721779b58a50c7eeae20669de24658d55d69b25b
Because pipes have no file position. VFS maintained (file) offsets into a
buffer internal to PFS and stored them in vnodes for simplicity, mixing
the responsibilities of filp and vnode objects.
With this patch PFS ignores the position field in REQ_READ and REQ_WRITE
requests making VFS' job a lot simpler.
.sync and fsync used unnecessarily restrictive locking type
.fsync violated locking order by obtaining a vmnt lock after a filp lock
.fsync contained a TOCTOU bug
.new_node violated locking rules (didn't upgrade lock upon file creation)
.do_pipe used unnecessarily restrictive locking type
.always lock pipes exclusively; even a read operation might require to do
a write on a vnode object (update pipe size)
.when opening a file with O_TRUNC, upgrade vnode lock when truncating
.utime used unnecessarily restrictive locking type
.path parsing:
.always acquire VMNT_WRITE or VMNT_EXCL on vmnt and downgrade to
VMNT_READ if that was what was actually requested. This prevents the
following deadlock scenario:
thread A:
lock_vmnt(vmp, TLL_READSER);
lock_vnode(vp, TLL_READSER);
upgrade_vmnt_lock(vmp, TLL_WRITE);
thread B:
lock_vmnt(vmp, TLL_READ);
lock_vnode(vp, TLL_READSER);
thread A will be stuck in upgrade_vmnt_lock and thread B is stuck in
lock_vnode. This happens when, for example, thread A tries create a
new node (open.c:new_node) and thread B tries to do eat_path to
change dir (stadir.c:do_chdir). When the path is being resolved, a
vnode is always locked with VNODE_OPCL (TLL_READSER) and then
downgraded to VNODE_READ if read-only is actually requested. Thread
A locks the vmnt with VMNT_WRITE (TLL_READSER) which still allows
VMNT_READ locks. Thread B can't acquire a lock on the vnode because
thread A has it; Thread A can't upgrade its vmnt lock to VMNT_WRITE
(TLL_WRITE) because thread B has a VMNT_READ lock on it.
By serializing vmnt locks during path parsing, thread B can only
acquire a lock on vmp when thread A has completely finished its
operation.
By making m_in job local (i.e., each job has its own copy of m_in instead
of refering to the global m_in) we don't have to store and restore m_in
on every thread yield. This reduces overhead. Moreover, remove the
assumption that m_in is preserved. Do_XXX functions have to copy the
system call parameters as soon as possible and only pass those copies to
other functions.
Furthermore, this patch cleans up some code and uses better types in a lot
of places.
- Fix locking bug when unable to send DEV_SELECT request. Upon failure
VFS tried to cancel the select operation, but this failed due to trying
to lock a filp that was already locked to send the request in the first
place. Do_select_request now handles locking of filps itself instead of
relying on the caller to do it. This fixes a crash when killing INET.
- Fix failure to revive a process after a non-blocking select operation
yielded no ready select operations when replying DEV_SEL_REPL1.
- Improve readability by using OK, SUSPEND, and standard error values as
results instead of having separate macros in select.
- Don't print not having a driver for a major device; after killing a driver
select will trigger this printf.
- Remove redundant code.
- Always wait for the initial reply from an asynchronous select request,
even if the select has been satisfied on another file descriptor or
was canceled due to a serious error.
- Restart asynchronous selects if upon reply from the driver turns out
that there are deferred operations (and do not forget we're still
interested in the results of the deferred operations).
- Do not hang a non-blocking select when another blocking select on
the same filp is still blocking.
- Split blocking operations in read, write, and exceptions (i.e.,
blocking on read does not imply the write will block as well).
- Some loops would iterate over OPEN_MAX file descriptors instead of
the "highest" file descriptor.
- Use proper internal error return values.
- A secondary reply from a synchronous driver is essentially the same
as from an asynchronous driver (the only difference being how the
answer is received). Merge.
- Return proper error code after a driver failure.
- Auto-detect whether a driver is synchronous or asynchronous.
- Remove some code duplication.
- Clean up code (coding style, add missing comments, put all select
related code together).
Before safecopies, the IO_ENDPT and DL_ENDPT message fields were needed
to know which actual process to copy data from/to, as that process may
not always be the caller. Now that we have full safecopy support, these
fields have become useless for that purpose: the owner of the grant is
*always* the caller. Allowing the caller to supply another endpoint is
in fact dangerous, because the callee may then end up using a grant
from a third party. One could call this a variant of the confused
deputy problem.
From now on, safecopy calls should always use the caller's endpoint as
grant owner. This fully obsoletes the DL_ENDPT field in the
inet/ethernet protocol. IO_ENDPT has other uses besides identifying the
grant owner though. This patch renames IO_ENDPT to USER_ENDPT, not only
because that is a more fitting name (it should never be used for I/O
after all), but also in order to intentionally break any old system
source code outside the base system. If this patch breaks your code,
fixing it is fairly simple:
- DL_ENDPT should be replaced with m_source;
- IO_ENDPT should be replaced with m_source when used for safecopies;
- IO_ENDPT should be replaced with USER_ENDPT for any other use, e.g.
when setting REP_ENDPT, matching requests in CANCEL calls, getting
DEV_SELECT flags, and retrieving of the real user process's endpoint
in DEV_OPEN.
The changes in this patch are binary backward compatible.
this change
- makes panic() variadic, doing full printf() formatting -
no more NO_NUM, and no more separate printf() statements
needed to print extra info (or something in hex) before panicing
- unifies panic() - same panic() name and usage for everyone -
vm, kernel and rest have different names/syntax currently
in order to implement their own luxuries, but no longer
- throws out the 1st argument, to make source less noisy.
the panic() in syslib retrieves the server name from the kernel
so it should be clear enough who is panicing; e.g.
panic("sigaction failed: %d", errno);
looks like:
at_wini(73130): panic: sigaction failed: 0
syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a
- throws out report() - printf() is more convenient and powerful
- harmonizes/fixes the use of panic() - there were a few places
that used printf-style formatting (didn't work) and newlines
(messes up the formatting) in panic()
- throws out a few per-server panic() functions
- cleans up a tie-in of tty with panic()
merging printf() and panic() statements to be done incrementally.
- Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly.
- Clean up MFS by removing old, dead code (backwards compatibility is broken by
the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all
functions have proper banners and prototypes.
- VFS should always provide a (syntactically) valid path to the FS; no need for
the FS to do sanity checks when leaving/entering mount points.
- Fix several bugs in MFS:
- Several path lookup bugs in MFS.
- A link can be too big for the path buffer.
- A mountpoint can become inaccessible when the creation of a new inode
fails, because the inode already exists and is a mountpoint.
- Introduce support for supplemental groups.
- Add test 46 to test supplemental group functionality (and removed obsolete
suppl. tests from test 2).
- Clean up VFS (not everything is done yet).
- ISOFS now opens device read-only. This makes the -r flag in the mount command
unnecessary (but will still report to be mounted read-write).
- Introduce PipeFS. PipeFS is a new FS that handles all anonymous and
named pipes. However, named pipes still reside on the (M)FS, as they are part
of the file system on disk. To make this work VFS now has a concept of
'mapped' inodes, which causes read, write, truncate and stat requests to be
redirected to the mapped FS, and all other requests to the original FS.
- all macros in consts.h that depend on NR_TASKS replaced by a FP_BLOCKED_ON_*
- fp_suspended removed and replaced by fp_blocked_on. Testing whether a process
is supended is qeual to testing whether fp_blocked_on is FP_BLOCKED_ON_NONE or
not
- fp_task is valid only if fp_blocked_on == FP_BLOCKED_ON_OTHER
- no need of special values that do not colide with valid and special endpoints
since they are not used as endpoints anymore
- suspend only takes FP_BLOCKED_ON_* values not endpoints anymore
- suspend(task) replaced by wait_for(task) which sets fp_task so we remember who
are we waiting for and suspend sets fp_blocked_on to FP_BLOCKED_ON_OTHER to
signal that we are waiting for some other process
- some functions should take endpoint_t instead of int, fixed
- When one does a select on a file descriptor that is meaningless for that particular file type, select shall indicate that the file descriptor is ready for that particular operation and that the file descriptor has no exceptional condition pending.
if the process was REVIVING. (susp_count doesn't count those
processes.) this together with dev_io SELECT suspend side effect
for asynch. character devices solves the hanging pipe bug. or
at last vastly improves it.
added sanity checks, turned off by default.
made the {NOT_,}{SUSPENDING,REVIVING} constants weirder to
help sanity checking.