Commit graph

56 commits

Author SHA1 Message Date
Thomas Veerman
49ad4e8888 Spring cleanup
Remove old versions of system calls and system calls that don't have
a libc api interface anymore (dup, dup2, creat).

VFS still contains support for old system call numbers for the new stat
system calls (i.e., 65, 66, 67) to keep supporting old binaries built for
MINIX 3.2.1 (prior to the release).

Change-Id: I721779b58a50c7eeae20669de24658d55d69b25b
2013-03-06 09:56:08 +00:00
Thomas Veerman
fa78dc389f socket: implement SOCK_CLOEXEC and SOCK_NONBLOCK
Change-Id: I3fa36fa999c82a192d402cb4d913bd397e106e53
2013-02-28 10:08:53 +00:00
Thomas Veerman
7c8b3ddfed VFS: fix locking bugs
.sync and fsync used unnecessarily restrictive locking type
.fsync violated locking order by obtaining a vmnt lock after a filp lock
.fsync contained a TOCTOU bug
.new_node violated locking rules (didn't upgrade lock upon file creation)
.do_pipe used unnecessarily restrictive locking type
.always lock pipes exclusively; even a read operation might require to do
 a write on a vnode object (update pipe size)
.when opening a file with O_TRUNC, upgrade vnode lock when truncating
.utime used unnecessarily restrictive locking type
.path parsing:
  .always acquire VMNT_WRITE or VMNT_EXCL on vmnt and downgrade to
   VMNT_READ if that was what was actually requested. This prevents the
   following deadlock scenario:
   thread A:
     lock_vmnt(vmp, TLL_READSER);
     lock_vnode(vp, TLL_READSER);
     upgrade_vmnt_lock(vmp, TLL_WRITE);

   thread B:
     lock_vmnt(vmp, TLL_READ);
     lock_vnode(vp, TLL_READSER);

   thread A will be stuck in upgrade_vmnt_lock and thread B is stuck in
   lock_vnode. This happens when, for example, thread A tries create a
   new node (open.c:new_node) and thread B tries to do eat_path to
   change dir (stadir.c:do_chdir). When the path is being resolved, a
   vnode is always locked with VNODE_OPCL (TLL_READSER) and then
   downgraded to VNODE_READ if read-only is actually requested. Thread
   A locks the vmnt with VMNT_WRITE (TLL_READSER) which still allows
   VMNT_READ locks. Thread B can't acquire a lock on the vnode because
   thread A has it; Thread A can't upgrade its vmnt lock to VMNT_WRITE
   (TLL_WRITE) because thread B has a VMNT_READ lock on it.

   By serializing vmnt locks during path parsing, thread B can only
   acquire a lock on vmp when thread A has completely finished its
   operation.
2013-01-11 09:18:35 +00:00
Ben Gras
8aeac26999 vfs: fix clobbering fd_nr
dumpcore: fd_nr can be in use as blocking fd but will then be clobbered
by common_open, causing disaster for exiting unpause().
2012-12-11 12:00:57 +01:00
Thomas Veerman
179261a9b6 mtab: support moving mount points
Also fix canonical_path function; it fails to parse some paths
2012-11-29 10:50:51 +00:00
Thomas Veerman
d9f4f71916 Implement dynamic mtab support
With this patch /etc/mtab becomes obsolete.
2012-11-26 15:20:18 +00:00
Thomas Veerman
14e470be81 VFS: fix TOCTOU bug in sync 2012-11-14 13:24:53 +00:00
Thomas Veerman
ed23a7a7d2 VFS: fix reboot panic with mounted FUSE FS
Upon reboot VFS semi-exits all processes and unmounts the file system.
However, upon unmount, exiting FUSE file systems might need service from
the file system (due to libc). As the FUSE process is halfway the exit
procedure, it doesn't have a valid root directory and working directory.
Trying to do system calls then triggers a sanity check in VFS.

This fix first exits normal processes which should then allow for
unmounting FUSE file systems. Then VFS exits all processes including
File Servers and unmounts the rest of the file system.
2012-11-14 13:18:16 +00:00
Ben Gras
60014efb3e vfs: pm_dumpcore: always clean up process
. whenever this function is called, pm will expect
	  the process to be cleaned up
	. so don't abort the process entirely on error
	. fixes a later 'forking on top of in-use child' vfs panic
2012-09-19 17:13:17 +02:00
Thomas Veerman
992799b91f VFS: make all IPC asynchronous
By decoupling synchronous drivers from VFS, we are a big step closer to
supporting driver crashes under all circumstances. That is, VFS can't
become stuck on IPC with a synchronous driver (e.g., INET) and can
recover from crashing block drivers during open/close/ioctl or during
communication with an FS.

In order to maintain serialized communication with a synchronous driver,
the communication is wrapped by a mutex on a per driver basis (not major
numbers as there can be multiple majors with identical endpoints). Majors
that share a driver endpoint point to a single mutex object.

In order to support crashes from block drivers, the file reopen tactic
had to be changed; first reopen files associated with the crashed
driver, then send the new driver endpoint to FSes. This solves a
deadlock between the FS and the block driver;
  - VFS would send REQ_NEW_DRIVER to an FS, but he FS only receives it
    after retrying the current request to the newly started driver.
  - The block driver would refuse the retried request until all files
    had been reopened.
  - VFS would reopen files only after getting a reply from the initial
    REQ_NEW_DRIVER.

When a character special driver crashes, all associated files have to
be marked invalid and closed (or reopened if flagged as such). However,
they can only be closed if a thread holds exclusive access to it. To
obtain exclusive access, the worker thread (which handles the new driver
endpoint event from DS) schedules a new job to garbage collect invalid
files. This way, we can signal the worker thread that was talking to the
crashed driver and will release exclusive access to a file associated
with the crashed driver and prevent the garbage collecting worker thread
from dead locking on that file.

Also, when a character special driver crashes, RS will unmap the driver
and remap it upon restart. During unmapping, associated files are marked
invalid instead of waiting for an endpoint up event from DS, as that
event might come later than new read/write/select requests and thus
cause confusion in the freshly started driver.

When locking a filp, the usage counters are no longer checked. The usage
counter can legally go down to zero during filp invalidation while there
are locks pending.

DS events are handled by a separate worker thread instead of the main
thread as reopening files could lead to another crash and a stuck thread.
An additional worker thread is then necessary to unlock it.

Finally, with everything asynchronous a race condition in do_select
surfaced. A select entry was only marked in use after succesfully sending
initial select requests to drivers and having to wait. When multiple
select() calls were handled there was opportunity that these entries
were overwritten. This had as effect that some select results were
ignored (and select() remained blocking instead if returning) or do_select
tried to access filps that were not present (because thrown away by
secondary select()). This bug manifested itself with sendrecs, but was
very hard to reproduce. However, it became awfully easy to trigger with
asynsends only.
2012-09-17 11:01:45 +00:00
Thomas Veerman
77dbd766c1 VFS: Use safe string copy functions 2012-07-16 10:57:43 +00:00
Ben Gras
85ff5a947e dumpcore: use ptrace function to trigger a coredump
. dumpcore currently relies on minix segments
	. also ptrace dumpcore fix
2012-06-15 12:13:50 +02:00
David van Moolenbroek
1817f7fc07 VFS: fix "process already free" panic on reboot
Reported by Claudiu Dan Gheorghe, debugged by Thomas and myself
2012-05-02 17:42:50 +02:00
Thomas Veerman
db8198d99d VFS: use S_IS* macros 2012-04-27 08:49:38 +00:00
Thomas Veerman
933120b0b1 VFS: add getting active threads control msg 2012-04-13 13:21:01 +00:00
Thomas Veerman
0d63d9e125 VFS: enable sending control messages 2012-04-13 12:54:55 +00:00
Thomas Veerman
8f55767619 VFS: make m_in job local
By making m_in job local (i.e., each job has its own copy of m_in instead
of refering to the global m_in) we don't have to store and restore m_in
on every thread yield. This reduces overhead. Moreover, remove the
assumption that m_in is preserved. Do_XXX functions have to copy the
system call parameters as soon as possible and only pass those copies to
other functions.

Furthermore, this patch cleans up some code and uses better types in a lot
of places.
2012-04-13 12:50:38 +00:00
Ben Gras
7336a67dfe retire PUBLIC, PRIVATE and FORWARD 2012-03-25 21:58:14 +02:00
Ben Gras
6a73e85ad1 retire _PROTOTYPE
. only good for obsolete K&R support
	. also remove a stray ansi.h and the proto cmd
2012-03-25 16:17:10 +02:00
Thomas Veerman
80c4685324 VFS: replace VFS with AVFS 2012-02-13 16:53:21 +00:00
David van Moolenbroek
6f374faca5 Add "expected size" parameter to getsysinfo()
This patch provides basic protection against damage resulting from
differently compiled servers blindly copying tables to one another.
In every getsysinfo() call, the caller is provided with the expected
size of the requested data structure. The callee fails the call if
the expected size does not match the data structure's actual size.
2011-12-11 22:34:14 +01:00
David van Moolenbroek
b4d909d415 Split block/character protocols and libdriver
This patch separates the character and block driver communication
protocols. The old character protocol remains the same, but a new
block protocol is introduced. The libdriver library is replaced by
two new libraries: libchardriver and libblockdriver. Their exposed
API, and drivers that use them, have been updated accordingly.
Together, libbdev and libblockdriver now completely abstract away
the message format used by the block protocol. As the memory driver
is both a character and a block device driver, it now implements its
own message loop.

The most important semantic change made to the block protocol is that
it is no longer possible to return both partial results and an error
for a single transfer. This simplifies the interaction between the
caller and the driver, as the I/O vector no longer needs to be copied
back. Also, drivers are now no longer supposed to decide based on the
layout of the I/O vector when a transfer should be cut short. Put
simply, transfers are now supposed to either succeed completely, or
result in an error.

After this patch, the state of the various pieces is as follows:
- block protocol: stable
- libbdev API: stable for synchronous communication
- libblockdriver API: needs slight revision (the drvlib/partition API
  in particular; the threading API will also change shortly)
- character protocol: needs cleanup
- libchardriver API: needs cleanup accordingly
- driver restarts: largely unsupported until endpoint changes are
  reintroduced

As a side effect, this patch eliminates several bugs, hacks, and gcc
-Wall and -W warnings all over the place. It probably introduces a
few new ones, too.

Update warning: this patch changes the protocol between MFS and disk
drivers, so in order to use old/new images, the MFS from the ramdisk
must be used to mount all file systems.
2011-11-23 14:06:37 +01:00
Adriana Szekeres
c30f014a89 gcore command to coredump a process 2011-11-22 22:07:41 +01:00
Adriana Szekeres
eaa29370f4 ELF core files 2011-11-22 22:07:40 +01:00
David van Moolenbroek
b02c260ecb Miscellaneous legacy cleanup 2011-11-07 22:20:55 +01:00
Thomas Veerman
d4b72e81b2 Cleanup servers to make GCC/Clang a little happier 2011-09-08 13:57:03 +00:00
Dirk Vogt
9ed280d1ec decouple file system server start/termination from mount/umount 2010-11-23 19:34:56 +00:00
David van Moolenbroek
354da24f5b make getsysinfo() a system-land call 2010-09-14 21:50:05 +00:00
Erik van der Kouwe
739f2d7536 Fix comment 2010-07-15 14:47:08 +00:00
Thomas Veerman
34a2864e27 Fix a few compile time warnings 2010-07-02 12:41:19 +00:00
Tomas Hruby
6e25ad8b0a Use of all NIL_* defines converted to NULL 2010-05-10 13:26:00 +00:00
Ben Gras
94edf4fa12 vfs: start at vmnt[0] to sync mounted filesystems, not vmnt[1]. 2010-04-26 17:12:34 +00:00
Cristiano Giuffrida
65ef539739 Driver mapping refactory.
VFS CHANGES:
- dmap table no longer statically initialized in VFS
- Dropped FSSIGNON svrctl call no longer used by INET

INET CHANGES:
- INET announces its presence to VFS just like any other driver

RS CHANGES:
- The boot image dev table contains all the data to initialize VFS' dmap table
- RS interface supports asynchronous up and update operations now
- RS interface extended to support driver style and flags
2010-04-09 21:56:44 +00:00
Cristiano Giuffrida
48c6bb79f4 Driver refactory for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing drivers instances at startup.
- A newtwork driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock eventual callers sendrecing to the driver.
2010-04-08 13:41:35 +00:00
Kees van Reeuwijk
fc7dced1fa Fix printfs with too few or too many parms, remove unused vars, fix incorrect flag tests, other code cleanup. 2010-04-01 13:25:05 +00:00
Ben Gras
35a108b911 panic() cleanup.
this change
   - makes panic() variadic, doing full printf() formatting -
     no more NO_NUM, and no more separate printf() statements
     needed to print extra info (or something in hex) before panicing
   - unifies panic() - same panic() name and usage for everyone -
     vm, kernel and rest have different names/syntax currently
     in order to implement their own luxuries, but no longer
   - throws out the 1st argument, to make source less noisy.
     the panic() in syslib retrieves the server name from the kernel
     so it should be clear enough who is panicing; e.g.
         panic("sigaction failed: %d", errno);
     looks like:
         at_wini(73130): panic: sigaction failed: 0
         syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a
   - throws out report() - printf() is more convenient and powerful
   - harmonizes/fixes the use of panic() - there were a few places
     that used printf-style formatting (didn't work) and newlines
     (messes up the formatting) in panic()
   - throws out a few per-server panic() functions
   - cleans up a tie-in of tty with panic()

merging printf() and panic() statements to be done incrementally.
2010-03-05 15:05:11 +00:00
David van Moolenbroek
bdd4f5857f Fixes for truncate system calls:
- VFS: check for negative sizes in all truncate calls
- VFS: update file size after truncating with fcntl(F_FREESP)
- VFS: move pos/len checks for F_FREESP with l_len!=0 from FS to VFS
- MFS: do not zero data block for small files when fully truncating
- MFS: do not write out freed indirect blocks after freeing space
- MFS: make truncate work correctly with differing zone/block sizes
- tests: add new test50 for truncate call family
2010-02-09 08:12:37 +00:00
David van Moolenbroek
b31119abf5 Mount updates:
- allow mounting with "none" block device
- allow unmounting by mountpoint
- make VFS aware of file system process labels
- allow m3_ca1 to use the full available message size
- use *printf in u/mount(1), as mount(2) uses it already
- fix reference leaks for some mount error cases in VFS
2010-01-12 23:08:50 +00:00
David van Moolenbroek
ac9ab099c8 General cleanup:
- clean up kernel section of minix/com.h somewhat
- remove ALLOCMEM and VM_ALLOCMEM calls
- remove non-safecopy and minix-vmd support from Inet
- remove SYS_VIRVCOPY and SYS_PHYSVCOPY calls
- remove obsolete segment encoding in SYS_SAFECOPY*
- remove DEVCTL call, svrctl(FSDEVUNMAP), map_driverX
- remove declarations of unimplemented svrctl requests
- remove everything related to swapping to disk
- remove floppysetup.sh
- remove traces of rescue device
- update DESCRIBE.sh with new devices
- some other small changes
2010-01-05 19:39:27 +00:00
Thomas Veerman
958b25be50 - Introduce support for sticky bit.
- Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly.
- Clean up MFS by removing old, dead code (backwards compatibility is broken by
  the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all
  functions have proper banners and prototypes.
- VFS should always provide a (syntactically) valid path to the FS; no need for
  the FS to do sanity checks when leaving/entering mount points.
- Fix several bugs in MFS:
  - Several path lookup bugs in MFS.
  - A link can be too big for the path buffer.
  - A mountpoint can become inaccessible when the creation of a new inode
    fails, because the inode already exists and is a mountpoint.
- Introduce support for supplemental groups.
- Add test 46 to test supplemental group functionality (and removed obsolete
  suppl. tests from test 2).
- Clean up VFS (not everything is done yet).
- ISOFS now opens device read-only. This makes the -r flag in the mount command
  unnecessary (but will still report to be mounted read-write).
- Introduce PipeFS. PipeFS is a new FS that handles all anonymous and
  named pipes. However, named pipes still reside on the (M)FS, as they are part
  of the file system on disk. To make this work VFS now has a concept of
  'mapped' inodes, which causes read, write, truncate and stat requests to be
  redirected to the mapped FS, and all other requests to the original FS.
2009-12-20 20:27:14 +00:00
Tomas Hruby
8590ac260d Removed dependency of vfs on NR_TASKS macro
- all macros in consts.h that depend on NR_TASKS replaced by a FP_BLOCKED_ON_*

- fp_suspended removed and replaced by fp_blocked_on. Testing whether a process
  is supended is qeual to testing whether fp_blocked_on is FP_BLOCKED_ON_NONE or
  not

- fp_task is valid only if fp_blocked_on == FP_BLOCKED_ON_OTHER

- no need of special values that do not colide with valid and special endpoints
  since they are not used as endpoints anymore

- suspend only takes FP_BLOCKED_ON_* values not endpoints anymore

- suspend(task) replaced by wait_for(task) which sets fp_task so we remember who
  are we waiting for and suspend sets fp_blocked_on to FP_BLOCKED_ON_OTHER to
  signal that we are waiting for some other process

- some functions should take endpoint_t instead of int, fixed
2009-09-22 21:48:26 +00:00
David van Moolenbroek
f76d75a5ec Various VFS and MFS fixes to improve correctness, consistency and
POSIX compliance.

VFS changes:
* truncate() on a file system mounted read-only no longer panics MFS.
* ftruncate() and fcntl(F_FREESP) now check for write permission on
  the file descriptor instead of the file, write().
* utime(), chown() and fchown() now check for file system read-only
  status.

MFS changes:
* link() and rename() no longer return the internal EENTERMOUNT and
  ELEAVEMOUNT errors to the application as part of a check on the
  source path.
* rename() now treats EENTERMOUNT from the destination path check as
  an error, preventing file system corruption from renaming a normal
  directory to an existing mountpoint directory.
* mountpoints (mounted-on dirs) are hidden better during lookups:
  - if a lookup starts from a mountpoint, the first component has to
    be ".." (anything else being a VFS-FS protocol violation).
  - in that case, the permissions of the mountpoint are not checked.
  - in all other cases, visiting a mountpoint always results in
    EENTERMOUNT.
* a lookup on ".." from a mount root or chroot(2) root no longer
  succeeds if the caller does not have search permission on that
  directory.
* POSIX: getdents() now updates directory access times.
* POSIX: readlink() now returns partial results instead of ERANGE.

Miscellaneous changes:
* semaphore file handling bug (leading to hangs) fixed in test 32.

The VFS changes should now put the burden of checking for read-only
status of file systems entirely on VFS, and limit the access
permission checks that file systems have to perform, to checking
search permission on directories during lookups. From this point on,
any deviation from that spceification should be considered a bug.
Note that for legacy reasons, the root partition is assumed to be
mounted read-write.
2009-05-18 11:27:12 +00:00
Ben Gras
7c88767f75 remove debug msg 2009-05-11 11:57:20 +00:00
David van Moolenbroek
293be6b80b quick cleanup of old mfs cruft from vfs 2009-05-08 14:12:41 +00:00
Ben Gras
dc1238b7b9 make unpause() decrease susp_count, as it shouldn't be decreased
if the process was REVIVING. (susp_count doesn't count those
 processes.) this together with dev_io SELECT suspend side effect
 for asynch. character devices solves the hanging pipe bug. or
 at last vastly improves it.

 added sanity checks, turned off by default.

 made the {NOT_,}{SUSPENDING,REVIVING} constants weirder to
 help sanity checking.
2009-05-08 13:56:41 +00:00
Ben Gras
fd7ef243e4 cleanup of vfs shutdown logic; makes clean unmounts easier (but
needs checking if fp_wd or fp_rd is NULL before use)
2009-04-29 16:59:18 +00:00
Ben Gras
3bb80322d9 suppress more mostly-harmless messages. 2009-03-26 16:11:27 +00:00
Ben Gras
3cc092ff06 . new kernel call sysctl for generic unprivileged system operations;
now used for printing diagnostic messages through the kernel message
   buffer. this lets processes print diagnostics without sending messages
   to tty and log directly, simplifying the message protocol a lot and
   reducing difficulties with deadlocks and other situations in which
   diagnostics are blackholed (e.g. grants don't work). this makes
   DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty
   and log still accept the codes for 'old' binaries. This also simplifies
   diagnostics in several servers and drivers - only tty needs its own
   kputc() now.
 . simplifications in vfs, and some effort to get the vnode references
   right (consistent) even during shutdown. m_mounted_on is now NULL
   for root filesystems (!) (the original and new root), a less awkward
   special case than 'm_mounted_on == m_root_node'. root now has exactly
   one reference, to root, if no files are open, just like all other
   filesystems. m_driver_e is unused.
2009-01-26 17:43:59 +00:00
Ben Gras
c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Philip Homburg
9388a27070 Support for O_REOPEN flag and pass the filp numbet to dev_open. 2008-02-22 14:49:02 +00:00