Commit graph

56 commits

Author SHA1 Message Date
Thomas Veerman 473547c777 VFS: implement pipe2
Change-Id: Iedc8042dd73a903456b25ba665d12577f5589ca2
2013-02-28 10:08:53 +00:00
Thomas Veerman fa78dc389f socket: implement SOCK_CLOEXEC and SOCK_NONBLOCK
Change-Id: I3fa36fa999c82a192d402cb4d913bd397e106e53
2013-02-28 10:08:53 +00:00
Thomas Veerman 3de8d1cf6e VFS/PFS: remove notion of position in pipes
Because pipes have no file position. VFS maintained (file) offsets into a
buffer internal to PFS and stored them in vnodes for simplicity, mixing
the responsibilities of filp and vnode objects.

With this patch PFS ignores the position field in REQ_READ and REQ_WRITE
requests making VFS' job a lot simpler.
2013-01-11 09:18:35 +00:00
Thomas Veerman 7c8b3ddfed VFS: fix locking bugs
.sync and fsync used unnecessarily restrictive locking type
.fsync violated locking order by obtaining a vmnt lock after a filp lock
.fsync contained a TOCTOU bug
.new_node violated locking rules (didn't upgrade lock upon file creation)
.do_pipe used unnecessarily restrictive locking type
.always lock pipes exclusively; even a read operation might require to do
 a write on a vnode object (update pipe size)
.when opening a file with O_TRUNC, upgrade vnode lock when truncating
.utime used unnecessarily restrictive locking type
.path parsing:
  .always acquire VMNT_WRITE or VMNT_EXCL on vmnt and downgrade to
   VMNT_READ if that was what was actually requested. This prevents the
   following deadlock scenario:
   thread A:
     lock_vmnt(vmp, TLL_READSER);
     lock_vnode(vp, TLL_READSER);
     upgrade_vmnt_lock(vmp, TLL_WRITE);

   thread B:
     lock_vmnt(vmp, TLL_READ);
     lock_vnode(vp, TLL_READSER);

   thread A will be stuck in upgrade_vmnt_lock and thread B is stuck in
   lock_vnode. This happens when, for example, thread A tries create a
   new node (open.c:new_node) and thread B tries to do eat_path to
   change dir (stadir.c:do_chdir). When the path is being resolved, a
   vnode is always locked with VNODE_OPCL (TLL_READSER) and then
   downgraded to VNODE_READ if read-only is actually requested. Thread
   A locks the vmnt with VMNT_WRITE (TLL_READSER) which still allows
   VMNT_READ locks. Thread B can't acquire a lock on the vnode because
   thread A has it; Thread A can't upgrade its vmnt lock to VMNT_WRITE
   (TLL_WRITE) because thread B has a VMNT_READ lock on it.

   By serializing vmnt locks during path parsing, thread B can only
   acquire a lock on vmp when thread A has completely finished its
   operation.
2013-01-11 09:18:35 +00:00
Thomas Veerman 179261a9b6 mtab: support moving mount points
Also fix canonical_path function; it fails to parse some paths
2012-11-29 10:50:51 +00:00
Thomas Veerman d9f4f71916 Implement dynamic mtab support
With this patch /etc/mtab becomes obsolete.
2012-11-26 15:20:18 +00:00
Thomas Veerman ed23a7a7d2 VFS: fix reboot panic with mounted FUSE FS
Upon reboot VFS semi-exits all processes and unmounts the file system.
However, upon unmount, exiting FUSE file systems might need service from
the file system (due to libc). As the FUSE process is halfway the exit
procedure, it doesn't have a valid root directory and working directory.
Trying to do system calls then triggers a sanity check in VFS.

This fix first exits normal processes which should then allow for
unmounting FUSE file systems. Then VFS exits all processes including
File Servers and unmounts the rest of the file system.
2012-11-14 13:18:16 +00:00
Arne Welzel e35c4f78d2 VFS: fix check_bsf() locking
The check_bsf() macro uses assert(mutex_trylock(&bsf_lock)) and
assumes bsf_lock is locked afterwards. This breaks when compiling
with NOASSERTS="yes". Also: macro to function transition.
2012-09-28 14:57:34 +02:00
Thomas Veerman 992799b91f VFS: make all IPC asynchronous
By decoupling synchronous drivers from VFS, we are a big step closer to
supporting driver crashes under all circumstances. That is, VFS can't
become stuck on IPC with a synchronous driver (e.g., INET) and can
recover from crashing block drivers during open/close/ioctl or during
communication with an FS.

In order to maintain serialized communication with a synchronous driver,
the communication is wrapped by a mutex on a per driver basis (not major
numbers as there can be multiple majors with identical endpoints). Majors
that share a driver endpoint point to a single mutex object.

In order to support crashes from block drivers, the file reopen tactic
had to be changed; first reopen files associated with the crashed
driver, then send the new driver endpoint to FSes. This solves a
deadlock between the FS and the block driver;
  - VFS would send REQ_NEW_DRIVER to an FS, but he FS only receives it
    after retrying the current request to the newly started driver.
  - The block driver would refuse the retried request until all files
    had been reopened.
  - VFS would reopen files only after getting a reply from the initial
    REQ_NEW_DRIVER.

When a character special driver crashes, all associated files have to
be marked invalid and closed (or reopened if flagged as such). However,
they can only be closed if a thread holds exclusive access to it. To
obtain exclusive access, the worker thread (which handles the new driver
endpoint event from DS) schedules a new job to garbage collect invalid
files. This way, we can signal the worker thread that was talking to the
crashed driver and will release exclusive access to a file associated
with the crashed driver and prevent the garbage collecting worker thread
from dead locking on that file.

Also, when a character special driver crashes, RS will unmap the driver
and remap it upon restart. During unmapping, associated files are marked
invalid instead of waiting for an endpoint up event from DS, as that
event might come later than new read/write/select requests and thus
cause confusion in the freshly started driver.

When locking a filp, the usage counters are no longer checked. The usage
counter can legally go down to zero during filp invalidation while there
are locks pending.

DS events are handled by a separate worker thread instead of the main
thread as reopening files could lead to another crash and a stuck thread.
An additional worker thread is then necessary to unlock it.

Finally, with everything asynchronous a race condition in do_select
surfaced. A select entry was only marked in use after succesfully sending
initial select requests to drivers and having to wait. When multiple
select() calls were handled there was opportunity that these entries
were overwritten. This had as effect that some select results were
ignored (and select() remained blocking instead if returning) or do_select
tried to access filps that were not present (because thrown away by
secondary select()). This bug manifested itself with sendrecs, but was
very hard to reproduce. However, it became awfully easy to trigger with
asynsends only.
2012-09-17 11:01:45 +00:00
Thomas Veerman 77dbd766c1 VFS: Use safe string copy functions 2012-07-16 10:57:43 +00:00
Thomas Veerman 7b81254069 VFS: simplify stat for pipes
According to POSIX the st_size field of struct stat is undefined for
fifos and anonymous pipes. Thus we can do anything we want. We save a
copy by not being accurate on pipe sizes.
2012-04-27 08:50:49 +00:00
Ben Gras 53002f6f6c recognize and execute dynamically linked executables
. generalize libexec slightly to get some more necessary information
	  from ELF files, e.g. the interpreter
	. execute dynamically linked executables when exec()ed by VFS
	. switch to netbsd variant of elf32.h exclusively, solves some
	  conflicting headers
2012-04-16 00:41:42 +00:00
Thomas Veerman 91a38b6d4e VFS: fix dead lock
When running out of worker threads to handle device replies a dead
lock resolver thread is used. However, it was only used for FS
endpoints; it is now used for "system processes" (drivers and FS
endpoints). Also, drivers were marked as system process when they
were not "forced" to map (i.e., mapping was done before endpoint was
alive).
2012-04-13 13:19:10 +00:00
Thomas Veerman b956493367 VFS: fix new signed/unsigned comparisons 2012-04-13 13:00:11 +00:00
Thomas Veerman 8f55767619 VFS: make m_in job local
By making m_in job local (i.e., each job has its own copy of m_in instead
of refering to the global m_in) we don't have to store and restore m_in
on every thread yield. This reduces overhead. Moreover, remove the
assumption that m_in is preserved. Do_XXX functions have to copy the
system call parameters as soon as possible and only pass those copies to
other functions.

Furthermore, this patch cleans up some code and uses better types in a lot
of places.
2012-04-13 12:50:38 +00:00
Ben Gras 6a73e85ad1 retire _PROTOTYPE
. only good for obsolete K&R support
	. also remove a stray ansi.h and the proto cmd
2012-03-25 16:17:10 +02:00
Thomas Veerman 1efb51b1de VFS: improve crashed FS resource cleanup
When VFS detects that an FS has crashed and tries to clean up
resources, it marks fairly late in the process that a vmnt is not
to be used again (to send requests to). This allows a thread to
become blocked on a vmnt after all blocked threads were stopped, but
before it finds out it shouldn't try to send to that vmnt.
2012-02-22 13:54:35 +00:00
Thomas Veerman 80c4685324 VFS: replace VFS with AVFS 2012-02-13 16:53:21 +00:00
Thomas Veerman 1fc399a5c1 Add permission test for bind and socket
Also, apply forbidden patch to VFS from AVFS (fixes hanging test56 if
it has the permission test).
2012-01-30 15:16:20 +00:00
David van Moolenbroek db087efac4 VFS/FS: REQ_NEW_DRIVER now provides a label 2011-11-30 19:05:26 +01:00
David van Moolenbroek b4d909d415 Split block/character protocols and libdriver
This patch separates the character and block driver communication
protocols. The old character protocol remains the same, but a new
block protocol is introduced. The libdriver library is replaced by
two new libraries: libchardriver and libblockdriver. Their exposed
API, and drivers that use them, have been updated accordingly.
Together, libbdev and libblockdriver now completely abstract away
the message format used by the block protocol. As the memory driver
is both a character and a block device driver, it now implements its
own message loop.

The most important semantic change made to the block protocol is that
it is no longer possible to return both partial results and an error
for a single transfer. This simplifies the interaction between the
caller and the driver, as the I/O vector no longer needs to be copied
back. Also, drivers are now no longer supposed to decide based on the
layout of the I/O vector when a transfer should be cut short. Put
simply, transfers are now supposed to either succeed completely, or
result in an error.

After this patch, the state of the various pieces is as follows:
- block protocol: stable
- libbdev API: stable for synchronous communication
- libblockdriver API: needs slight revision (the drvlib/partition API
  in particular; the threading API will also change shortly)
- character protocol: needs cleanup
- libchardriver API: needs cleanup accordingly
- driver restarts: largely unsupported until endpoint changes are
  reintroduced

As a side effect, this patch eliminates several bugs, hacks, and gcc
-Wall and -W warnings all over the place. It probably introduces a
few new ones, too.

Update warning: this patch changes the protocol between MFS and disk
drivers, so in order to use old/new images, the MFS from the ramdisk
must be used to mount all file systems.
2011-11-23 14:06:37 +01:00
Adriana Szekeres eaa29370f4 ELF core files 2011-11-22 22:07:40 +01:00
Arun Thomas 86b061078b Build gcov code only if MKCOVERAGE is yes 2011-08-09 10:39:33 +02:00
Evgeniy Ivanov ef0a265086 New stat structure.
* VFS and installed MFSes must be in sync before and after this change *

Use struct stat from NetBSD. It requires adding new STAT, FSTAT and LSTAT
syscalls. Libc modification is both backward and forward compatible.

Also new struct stat uses modern field sizes to avoid ABI
incompatibility, when we update uid_t, gid_t and company.
Exceptions are ino_t and off_t in old libc (though paddings added).
2011-07-12 16:39:55 +02:00
Thomas Veerman aba392e630 Clean up and fix multiple bugs in select:
- Remove redundant code.
 - Always wait for the initial reply from an asynchronous select request,
   even if the select has been satisfied on another file descriptor or
   was canceled due to a serious error.
 - Restart asynchronous selects if upon reply from the driver turns out
   that there are deferred operations (and do not forget we're still
   interested in the results of the deferred operations).
 - Do not hang a non-blocking select when another blocking select on
   the same filp is still blocking.
 - Split blocking operations in read, write, and exceptions (i.e.,
   blocking on read does not imply the write will block as well).
 - Some loops would iterate over OPEN_MAX file descriptors instead of
   the "highest" file descriptor.
 - Use proper internal error return values.
 - A secondary reply from a synchronous driver is essentially the same
   as from an asynchronous driver (the only difference being how the 
   answer is received). Merge.
 - Return proper error code after a driver failure.
 - Auto-detect whether a driver is synchronous or asynchronous.
 - Remove some code duplication.
 - Clean up code (coding style, add missing comments, put all select
   related code together).
2011-04-13 13:25:34 +00:00
Thomas Veerman 13ef7f1f38 Prepare VFS to support back calls from PFS. For security reasons and to support
file descriptor passing, PFS does some back calls to VFS. For example, to
verify the validity of a path provided by a process and to tell VFS it must
copy file descriptors from one process to another.
2010-08-30 13:44:07 +00:00
Ben Gras 5d6c2aae0a gcov support, based on work contributed by Anton Kuijsten. 2010-08-25 13:06:43 +00:00
David van Moolenbroek 895850b8cf move timers code to libsys 2010-07-09 12:58:18 +00:00
Ben Gras fc01683584 include, vfs: statvfs, fstatvfs calls, contributed by Buccapatnam Tirumala, Gautam. 2010-06-23 23:53:50 +00:00
Arun Thomas 1bf6d23f34 Make exec() use entry point in a.out header 2010-06-10 14:59:10 +00:00
Kees van Reeuwijk 826b9590f2 More endpoint_t correctness.
More const correctness.
Other code cleanup.
2010-06-08 14:09:18 +00:00
Kees van Reeuwijk bc314bda91 Remove the types Dev_t, _mnx_Gui, _mnx_Uid, and similar.
Use ANSI-style function declarations where necessary.
2010-04-13 10:58:41 +00:00
Cristiano Giuffrida 65ef539739 Driver mapping refactory.
VFS CHANGES:
- dmap table no longer statically initialized in VFS
- Dropped FSSIGNON svrctl call no longer used by INET

INET CHANGES:
- INET announces its presence to VFS just like any other driver

RS CHANGES:
- The boot image dev table contains all the data to initialize VFS' dmap table
- RS interface supports asynchronous up and update operations now
- RS interface extended to support driver style and flags
2010-04-09 21:56:44 +00:00
Cristiano Giuffrida 48c6bb79f4 Driver refactory for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing drivers instances at startup.
- A newtwork driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock eventual callers sendrecing to the driver.
2010-04-08 13:41:35 +00:00
Ben Gras 35a108b911 panic() cleanup.
this change
   - makes panic() variadic, doing full printf() formatting -
     no more NO_NUM, and no more separate printf() statements
     needed to print extra info (or something in hex) before panicing
   - unifies panic() - same panic() name and usage for everyone -
     vm, kernel and rest have different names/syntax currently
     in order to implement their own luxuries, but no longer
   - throws out the 1st argument, to make source less noisy.
     the panic() in syslib retrieves the server name from the kernel
     so it should be clear enough who is panicing; e.g.
         panic("sigaction failed: %d", errno);
     looks like:
         at_wini(73130): panic: sigaction failed: 0
         syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a
   - throws out report() - printf() is more convenient and powerful
   - harmonizes/fixes the use of panic() - there were a few places
     that used printf-style formatting (didn't work) and newlines
     (messes up the formatting) in panic()
   - throws out a few per-server panic() functions
   - cleans up a tie-in of tty with panic()

merging printf() and panic() statements to be done incrementally.
2010-03-05 15:05:11 +00:00
Ben Gras 35b471ad94 removal of unused vm<->vfs code. 2010-02-03 13:35:17 +00:00
David van Moolenbroek b31119abf5 Mount updates:
- allow mounting with "none" block device
- allow unmounting by mountpoint
- make VFS aware of file system process labels
- allow m3_ca1 to use the full available message size
- use *printf in u/mount(1), as mount(2) uses it already
- fix reference leaks for some mount error cases in VFS
2010-01-12 23:08:50 +00:00
David van Moolenbroek ac9ab099c8 General cleanup:
- clean up kernel section of minix/com.h somewhat
- remove ALLOCMEM and VM_ALLOCMEM calls
- remove non-safecopy and minix-vmd support from Inet
- remove SYS_VIRVCOPY and SYS_PHYSVCOPY calls
- remove obsolete segment encoding in SYS_SAFECOPY*
- remove DEVCTL call, svrctl(FSDEVUNMAP), map_driverX
- remove declarations of unimplemented svrctl requests
- remove everything related to swapping to disk
- remove floppysetup.sh
- remove traces of rescue device
- update DESCRIBE.sh with new devices
- some other small changes
2010-01-05 19:39:27 +00:00
Thomas Veerman 958b25be50 - Introduce support for sticky bit.
- Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly.
- Clean up MFS by removing old, dead code (backwards compatibility is broken by
  the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all
  functions have proper banners and prototypes.
- VFS should always provide a (syntactically) valid path to the FS; no need for
  the FS to do sanity checks when leaving/entering mount points.
- Fix several bugs in MFS:
  - Several path lookup bugs in MFS.
  - A link can be too big for the path buffer.
  - A mountpoint can become inaccessible when the creation of a new inode
    fails, because the inode already exists and is a mountpoint.
- Introduce support for supplemental groups.
- Add test 46 to test supplemental group functionality (and removed obsolete
  suppl. tests from test 2).
- Clean up VFS (not everything is done yet).
- ISOFS now opens device read-only. This makes the -r flag in the mount command
  unnecessary (but will still report to be mounted read-write).
- Introduce PipeFS. PipeFS is a new FS that handles all anonymous and
  named pipes. However, named pipes still reside on the (M)FS, as they are part
  of the file system on disk. To make this work VFS now has a concept of
  'mapped' inodes, which causes read, write, truncate and stat requests to be
  redirected to the mapped FS, and all other requests to the original FS.
2009-12-20 20:27:14 +00:00
David van Moolenbroek f89388c241 Kernel, servers: remove unused proto.h definitions 2009-10-31 14:11:50 +00:00
Ben Gras c373473f24 add prototype for wait_for() to fix compiler warning. 2009-10-05 16:43:02 +00:00
Tomas Hruby 8590ac260d Removed dependency of vfs on NR_TASKS macro
- all macros in consts.h that depend on NR_TASKS replaced by a FP_BLOCKED_ON_*

- fp_suspended removed and replaced by fp_blocked_on. Testing whether a process
  is supended is qeual to testing whether fp_blocked_on is FP_BLOCKED_ON_NONE or
  not

- fp_task is valid only if fp_blocked_on == FP_BLOCKED_ON_OTHER

- no need of special values that do not colide with valid and special endpoints
  since they are not used as endpoints anymore

- suspend only takes FP_BLOCKED_ON_* values not endpoints anymore

- suspend(task) replaced by wait_for(task) which sets fp_task so we remember who
  are we waiting for and suspend sets fp_blocked_on to FP_BLOCKED_ON_OTHER to
  signal that we are waiting for some other process

- some functions should take endpoint_t instead of int, fixed
2009-09-22 21:48:26 +00:00
David van Moolenbroek 0ac1aaccca Limited support for nested FS->VFS requests during VFS->FS call.
- Changed VFS-FS protocol to only store OK or negative error code in
  m_type field of reply messages.
- Changed VFS to treat nonzero positive replies from FS as requests.
- Added backwards compatibility to VFS and MFS.
No protection of global data structures is provided in VFS, so many
VFS calls cannot be made safely by FS servers during many FS calls.
Use with caution (or, preferably, not at all).
2009-05-11 10:02:28 +00:00
Ben Gras dc1238b7b9 make unpause() decrease susp_count, as it shouldn't be decreased
if the process was REVIVING. (susp_count doesn't count those
 processes.) this together with dev_io SELECT suspend side effect
 for asynch. character devices solves the hanging pipe bug. or
 at last vastly improves it.

 added sanity checks, turned off by default.

 made the {NOT_,}{SUSPENDING,REVIVING} constants weirder to
 help sanity checking.
2009-05-08 13:56:41 +00:00
Ben Gras 3cc092ff06 . new kernel call sysctl for generic unprivileged system operations;
now used for printing diagnostic messages through the kernel message
   buffer. this lets processes print diagnostics without sending messages
   to tty and log directly, simplifying the message protocol a lot and
   reducing difficulties with deadlocks and other situations in which
   diagnostics are blackholed (e.g. grants don't work). this makes
   DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty
   and log still accept the codes for 'old' binaries. This also simplifies
   diagnostics in several servers and drivers - only tty needs its own
   kputc() now.
 . simplifications in vfs, and some effort to get the vnode references
   right (consistent) even during shutdown. m_mounted_on is now NULL
   for root filesystems (!) (the original and new root), a less awkward
   special case than 'm_mounted_on == m_root_node'. root now has exactly
   one reference, to root, if no files are open, just like all other
   filesystems. m_driver_e is unused.
2009-01-26 17:43:59 +00:00
Ben Gras 45ec30f6af mostly harmless sanity checks. 2009-01-20 13:43:00 +00:00
Ben Gras c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Philip Homburg 1d7d5aa629 dev_close needs the filp number for asynch I/O, dev_io gets suspend_reopen
flag to suspend a process until the filedescriptor is re-opened. Added 
dev_reopen, asyn_io, suspended_ep, reopen_reply, asynsend, diag_repl, 
close_filp, close_reply, unpause, select_reply1, select_reply2.
2008-02-22 14:03:14 +00:00
Philip Homburg ab3062c8c0 REQ_FSTATFS now operates on the root inode (the inode parameter has been
removed)
2007-08-17 11:20:59 +00:00
Philip Homburg f46319037b New VFS interface 2007-08-07 12:52:47 +00:00