Commit graph

49 commits

Author SHA1 Message Date
Thomas Veerman
c087a60ed2 VFS: fix GCC compilation error 2012-09-17 15:29:38 +00:00
Thomas Veerman
992799b91f VFS: make all IPC asynchronous
By decoupling synchronous drivers from VFS, we are a big step closer to
supporting driver crashes under all circumstances. That is, VFS can't
become stuck on IPC with a synchronous driver (e.g., INET) and can
recover from crashing block drivers during open/close/ioctl or during
communication with an FS.

In order to maintain serialized communication with a synchronous driver,
the communication is wrapped by a mutex on a per driver basis (not major
numbers as there can be multiple majors with identical endpoints). Majors
that share a driver endpoint point to a single mutex object.

In order to support crashes from block drivers, the file reopen tactic
had to be changed; first reopen files associated with the crashed
driver, then send the new driver endpoint to FSes. This solves a
deadlock between the FS and the block driver;
  - VFS would send REQ_NEW_DRIVER to an FS, but he FS only receives it
    after retrying the current request to the newly started driver.
  - The block driver would refuse the retried request until all files
    had been reopened.
  - VFS would reopen files only after getting a reply from the initial
    REQ_NEW_DRIVER.

When a character special driver crashes, all associated files have to
be marked invalid and closed (or reopened if flagged as such). However,
they can only be closed if a thread holds exclusive access to it. To
obtain exclusive access, the worker thread (which handles the new driver
endpoint event from DS) schedules a new job to garbage collect invalid
files. This way, we can signal the worker thread that was talking to the
crashed driver and will release exclusive access to a file associated
with the crashed driver and prevent the garbage collecting worker thread
from dead locking on that file.

Also, when a character special driver crashes, RS will unmap the driver
and remap it upon restart. During unmapping, associated files are marked
invalid instead of waiting for an endpoint up event from DS, as that
event might come later than new read/write/select requests and thus
cause confusion in the freshly started driver.

When locking a filp, the usage counters are no longer checked. The usage
counter can legally go down to zero during filp invalidation while there
are locks pending.

DS events are handled by a separate worker thread instead of the main
thread as reopening files could lead to another crash and a stuck thread.
An additional worker thread is then necessary to unlock it.

Finally, with everything asynchronous a race condition in do_select
surfaced. A select entry was only marked in use after succesfully sending
initial select requests to drivers and having to wait. When multiple
select() calls were handled there was opportunity that these entries
were overwritten. This had as effect that some select results were
ignored (and select() remained blocking instead if returning) or do_select
tried to access filps that were not present (because thrown away by
secondary select()). This bug manifested itself with sendrecs, but was
very hard to reproduce. However, it became awfully easy to trigger with
asynsends only.
2012-09-17 11:01:45 +00:00
Thomas Veerman
66dbf73049 VFS: fix locking bug in clone_opcl
When VFS runs out of vnodes after closing a vnode in opcl, common_open
will try to unlock a vnode through unlock_filp that has already been
unlocked in clone_opcl. By first obtaining and locking a new vnode this
situation is prevented; if there are no free vnodes, common_open will
unlock a still locked vnode.
2012-07-30 10:01:16 +00:00
Thomas Veerman
db8198d99d VFS: use S_IS* macros 2012-04-27 08:49:38 +00:00
Ben Gras
3945cfbfd3 block ioctls: pass request number 2012-04-18 11:01:15 +02:00
Thomas Veerman
8f55767619 VFS: make m_in job local
By making m_in job local (i.e., each job has its own copy of m_in instead
of refering to the global m_in) we don't have to store and restore m_in
on every thread yield. This reduces overhead. Moreover, remove the
assumption that m_in is preserved. Do_XXX functions have to copy the
system call parameters as soon as possible and only pass those copies to
other functions.

Furthermore, this patch cleans up some code and uses better types in a lot
of places.
2012-04-13 12:50:38 +00:00
Ben Gras
7336a67dfe retire PUBLIC, PRIVATE and FORWARD 2012-03-25 21:58:14 +02:00
Ben Gras
6a73e85ad1 retire _PROTOTYPE
. only good for obsolete K&R support
	. also remove a stray ansi.h and the proto cmd
2012-03-25 16:17:10 +02:00
Tomas Hruby
72b7abd1a1 VFS - no CANCEL for async non-blocking operations
- if an operation (R, W, IOCTL) is non blocking, a flag is set
  and sent to the device.

- nothing changes for sync devices

- asyn devices should reply asap if an operation is non-blocking.
  We must trust the devices, but we had to trust them anyway to
  reply to CANCEL correctly

- we safe sending CANCEL commands to asyn devices. This greatly
  simplifies the protocol. Asynchronous devices can always reply
  when a reply is ready and do not need to deal with other
  situations

- currently, none of our drivers use the flags since they drive
  virtual devices which do not block
2012-03-02 15:44:48 +00:00
Tomas Hruby
369a12704f VFS - dev_style_asyn()
- dev_style_asyn() tests whether a device is asynchronous

 - simplifies code and helps readability
2012-03-02 15:44:47 +00:00
Tomas Hruby
35eb88461d VFS - cancel_nblock()
- duplicate code in dev_io() which sends CANCEL in case of a
  non-blocking operation moved to cancel_nblock()
2012-03-02 15:44:47 +00:00
Thomas Veerman
c540bcb001 VFS: various select fixes
- Fix locking bug when unable to send DEV_SELECT request. Upon failure
  VFS tried to cancel the select operation, but this failed due to trying
  to lock a filp that was already locked to send the request in the first
  place. Do_select_request now handles locking of filps itself instead of
  relying on the caller to do it.  This fixes a crash when killing INET.
- Fix failure to revive a process after a non-blocking select operation
  yielded no ready select operations when replying DEV_SEL_REPL1.
- Improve readability by using OK, SUSPEND, and standard error values as
  results instead of having separate macros in select.
- Don't print not having a driver for a major device; after killing a driver
  select will trigger this printf.
2012-02-17 21:09:07 +00:00
Thomas Veerman
80c4685324 VFS: replace VFS with AVFS 2012-02-13 16:53:21 +00:00
David van Moolenbroek
db087efac4 VFS/FS: REQ_NEW_DRIVER now provides a label 2011-11-30 19:05:26 +01:00
David van Moolenbroek
a9f89a7290 vfs/avfs: map O_ACCMODE to R_BIT|W_BIT on recovery 2011-11-24 13:57:36 +01:00
David van Moolenbroek
b4d909d415 Split block/character protocols and libdriver
This patch separates the character and block driver communication
protocols. The old character protocol remains the same, but a new
block protocol is introduced. The libdriver library is replaced by
two new libraries: libchardriver and libblockdriver. Their exposed
API, and drivers that use them, have been updated accordingly.
Together, libbdev and libblockdriver now completely abstract away
the message format used by the block protocol. As the memory driver
is both a character and a block device driver, it now implements its
own message loop.

The most important semantic change made to the block protocol is that
it is no longer possible to return both partial results and an error
for a single transfer. This simplifies the interaction between the
caller and the driver, as the I/O vector no longer needs to be copied
back. Also, drivers are now no longer supposed to decide based on the
layout of the I/O vector when a transfer should be cut short. Put
simply, transfers are now supposed to either succeed completely, or
result in an error.

After this patch, the state of the various pieces is as follows:
- block protocol: stable
- libbdev API: stable for synchronous communication
- libblockdriver API: needs slight revision (the drvlib/partition API
  in particular; the threading API will also change shortly)
- character protocol: needs cleanup
- libchardriver API: needs cleanup accordingly
- driver restarts: largely unsupported until endpoint changes are
  reintroduced

As a side effect, this patch eliminates several bugs, hacks, and gcc
-Wall and -W warnings all over the place. It probably introduces a
few new ones, too.

Update warning: this patch changes the protocol between MFS and disk
drivers, so in order to use old/new images, the MFS from the ramdisk
must be used to mount all file systems.
2011-11-23 14:06:37 +01:00
Thomas Veerman
aba392e630 Clean up and fix multiple bugs in select:
- Remove redundant code.
 - Always wait for the initial reply from an asynchronous select request,
   even if the select has been satisfied on another file descriptor or
   was canceled due to a serious error.
 - Restart asynchronous selects if upon reply from the driver turns out
   that there are deferred operations (and do not forget we're still
   interested in the results of the deferred operations).
 - Do not hang a non-blocking select when another blocking select on
   the same filp is still blocking.
 - Split blocking operations in read, write, and exceptions (i.e.,
   blocking on read does not imply the write will block as well).
 - Some loops would iterate over OPEN_MAX file descriptors instead of
   the "highest" file descriptor.
 - Use proper internal error return values.
 - A secondary reply from a synchronous driver is essentially the same
   as from an asynchronous driver (the only difference being how the 
   answer is received). Merge.
 - Return proper error code after a driver failure.
 - Auto-detect whether a driver is synchronous or asynchronous.
 - Remove some code duplication.
 - Clean up code (coding style, add missing comments, put all select
   related code together).
2011-04-13 13:25:34 +00:00
David van Moolenbroek
c51cd5fe91 Server/driver protocols: no longer allow third-party copies.
Before safecopies, the IO_ENDPT and DL_ENDPT message fields were needed
to know which actual process to copy data from/to, as that process may
not always be the caller. Now that we have full safecopy support, these
fields have become useless for that purpose: the owner of the grant is
*always* the caller. Allowing the caller to supply another endpoint is
in fact dangerous, because the callee may then end up using a grant
from a third party. One could call this a variant of the confused
deputy problem.

From now on, safecopy calls should always use the caller's endpoint as
grant owner. This fully obsoletes the DL_ENDPT field in the
inet/ethernet protocol. IO_ENDPT has other uses besides identifying the
grant owner though. This patch renames IO_ENDPT to USER_ENDPT, not only
because that is a more fitting name (it should never be used for I/O
after all), but also in order to intentionally break any old system
source code outside the base system. If this patch breaks your code,
fixing it is fairly simple:

- DL_ENDPT should be replaced with m_source;
- IO_ENDPT should be replaced with m_source when used for safecopies;
- IO_ENDPT should be replaced with USER_ENDPT for any other use, e.g.
  when setting REP_ENDPT, matching requests in CANCEL calls, getting
  DEV_SELECT flags, and retrieving of the real user process's endpoint
  in DEV_OPEN.

The changes in this patch are binary backward compatible.
2011-04-11 17:35:05 +00:00
David van Moolenbroek
28f2a169da VFS: bugfixes for handling block-special files:
- on driver restarts, reopen devices on a per-file basis, not per-mount
- do not assume that there is just one vnode per block-special device
- update block-special files in the uncommon mounting success paths, too
- upon mount, sync but also invalidate affected buffers on the root FS
- upon unmount, check whether a vnode is in use before updating it
2011-03-25 10:56:43 +00:00
Thomas Veerman
13ef7f1f38 Prepare VFS to support back calls from PFS. For security reasons and to support
file descriptor passing, PFS does some back calls to VFS. For example, to
verify the validity of a path provided by a process and to tell VFS it must
copy file descriptors from one process to another.
2010-08-30 13:44:07 +00:00
Ben Gras
3badab8b70 vfs - split fp_fd field into fd + callnr fields 2010-07-22 14:55:28 +00:00
Erik van der Kouwe
c0dfa2f3f1 Get rid of asynsend backup copy in VFS 2010-06-25 14:57:54 +00:00
Erik van der Kouwe
498d7d8a4c Don't use kernel responses in servers 2010-06-24 07:37:26 +00:00
Arun Thomas
4c10a31440 Remove legacy MM, FS, and FS_PROC_NR macros 2010-06-08 13:58:01 +00:00
Tomas Hruby
6e25ad8b0a Use of all NIL_* defines converted to NULL 2010-05-10 13:26:00 +00:00
Kees van Reeuwijk
bc314bda91 Remove the types Dev_t, _mnx_Gui, _mnx_Uid, and similar.
Use ANSI-style function declarations where necessary.
2010-04-13 10:58:41 +00:00
Cristiano Giuffrida
65ef539739 Driver mapping refactory.
VFS CHANGES:
- dmap table no longer statically initialized in VFS
- Dropped FSSIGNON svrctl call no longer used by INET

INET CHANGES:
- INET announces its presence to VFS just like any other driver

RS CHANGES:
- The boot image dev table contains all the data to initialize VFS' dmap table
- RS interface supports asynchronous up and update operations now
- RS interface extended to support driver style and flags
2010-04-09 21:56:44 +00:00
Cristiano Giuffrida
48c6bb79f4 Driver refactory for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing drivers instances at startup.
- A newtwork driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock eventual callers sendrecing to the driver.
2010-04-08 13:41:35 +00:00
Kees van Reeuwijk
94a81c840a Removed unused variables, added const where possible. 2010-04-07 11:25:51 +00:00
Kees van Reeuwijk
fc7dced1fa Fix printfs with too few or too many parms, remove unused vars, fix incorrect flag tests, other code cleanup. 2010-04-01 13:25:05 +00:00
Thomas Veerman
4d686f1616 Move allocation of temporary inodes for cloned character special devices from
MFS to PFS.
2010-03-30 15:00:09 +00:00
Ben Gras
35a108b911 panic() cleanup.
this change
   - makes panic() variadic, doing full printf() formatting -
     no more NO_NUM, and no more separate printf() statements
     needed to print extra info (or something in hex) before panicing
   - unifies panic() - same panic() name and usage for everyone -
     vm, kernel and rest have different names/syntax currently
     in order to implement their own luxuries, but no longer
   - throws out the 1st argument, to make source less noisy.
     the panic() in syslib retrieves the server name from the kernel
     so it should be clear enough who is panicing; e.g.
         panic("sigaction failed: %d", errno);
     looks like:
         at_wini(73130): panic: sigaction failed: 0
         syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a
   - throws out report() - printf() is more convenient and powerful
   - harmonizes/fixes the use of panic() - there were a few places
     that used printf-style formatting (didn't work) and newlines
     (messes up the formatting) in panic()
   - throws out a few per-server panic() functions
   - cleans up a tie-in of tty with panic()

merging printf() and panic() statements to be done incrementally.
2010-03-05 15:05:11 +00:00
Ben Gras
adf0b6fb26 No more E{SRC,DST}DIED errno's, replaced by EDEADSRCDST.
The callers don't care about the difference and had to check 3 error
codes instead of one.
2010-03-03 15:47:16 +00:00
Kees van Reeuwijk
1597e701a0 Remove useless variables and the computations on them. 2010-02-19 10:00:32 +00:00
Thomas Veerman
958b25be50 - Introduce support for sticky bit.
- Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly.
- Clean up MFS by removing old, dead code (backwards compatibility is broken by
  the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all
  functions have proper banners and prototypes.
- VFS should always provide a (syntactically) valid path to the FS; no need for
  the FS to do sanity checks when leaving/entering mount points.
- Fix several bugs in MFS:
  - Several path lookup bugs in MFS.
  - A link can be too big for the path buffer.
  - A mountpoint can become inaccessible when the creation of a new inode
    fails, because the inode already exists and is a mountpoint.
- Introduce support for supplemental groups.
- Add test 46 to test supplemental group functionality (and removed obsolete
  suppl. tests from test 2).
- Clean up VFS (not everything is done yet).
- ISOFS now opens device read-only. This makes the -r flag in the mount command
  unnecessary (but will still report to be mounted read-write).
- Introduce PipeFS. PipeFS is a new FS that handles all anonymous and
  named pipes. However, named pipes still reside on the (M)FS, as they are part
  of the file system on disk. To make this work VFS now has a concept of
  'mapped' inodes, which causes read, write, truncate and stat requests to be
  redirected to the mapped FS, and all other requests to the original FS.
2009-12-20 20:27:14 +00:00
Tomas Hruby
8590ac260d Removed dependency of vfs on NR_TASKS macro
- all macros in consts.h that depend on NR_TASKS replaced by a FP_BLOCKED_ON_*

- fp_suspended removed and replaced by fp_blocked_on. Testing whether a process
  is supended is qeual to testing whether fp_blocked_on is FP_BLOCKED_ON_NONE or
  not

- fp_task is valid only if fp_blocked_on == FP_BLOCKED_ON_OTHER

- no need of special values that do not colide with valid and special endpoints
  since they are not used as endpoints anymore

- suspend only takes FP_BLOCKED_ON_* values not endpoints anymore

- suspend(task) replaced by wait_for(task) which sets fp_task so we remember who
  are we waiting for and suspend sets fp_blocked_on to FP_BLOCKED_ON_OTHER to
  signal that we are waiting for some other process

- some functions should take endpoint_t instead of int, fixed
2009-09-22 21:48:26 +00:00
Ben Gras
ece26e2731 don't suspend the process as a side-effect if
device returns SUSPEND if it's select; select already
does this.
2009-05-08 13:50:29 +00:00
Ben Gras
3cc092ff06 . new kernel call sysctl for generic unprivileged system operations;
now used for printing diagnostic messages through the kernel message
   buffer. this lets processes print diagnostics without sending messages
   to tty and log directly, simplifying the message protocol a lot and
   reducing difficulties with deadlocks and other situations in which
   diagnostics are blackholed (e.g. grants don't work). this makes
   DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty
   and log still accept the codes for 'old' binaries. This also simplifies
   diagnostics in several servers and drivers - only tty needs its own
   kputc() now.
 . simplifications in vfs, and some effort to get the vnode references
   right (consistent) even during shutdown. m_mounted_on is now NULL
   for root filesystems (!) (the original and new root), a less awkward
   special case than 'm_mounted_on == m_root_node'. root now has exactly
   one reference, to root, if no files are open, just like all other
   filesystems. m_driver_e is unused.
2009-01-26 17:43:59 +00:00
Ben Gras
c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Philip Homburg
2ec762c60c Asynchronous communication with character specials. 2008-02-22 15:41:07 +00:00
Philip Homburg
4b1cd8c0ec Return EIO if a filedescriptor cannot be re-opened after a driver restart.
Select now returns such a filedescriptor as ready (instead of EBADF). 
Reply before dev_up in FSSIGNON to avoid the problem that a DEV_OPEN
request is received by a driver that expects a reply from the FSSIGNON.
2007-08-15 12:53:52 +00:00
Philip Homburg
f46319037b New VFS interface 2007-08-07 12:52:47 +00:00
Ben Gras
8eb09f6ddc . readall: use lseek64() to read more than 4GB of a device
. vfs: 64-bit offset support for character device i/o
   (also remove unused dev_bio function)
 . memory: /dev/null and /dev/zero are infinitely large, don't stop
   reading/writing at 4GB
2007-04-24 13:27:33 +00:00
Ben Gras
bd27c5240b Typo's. 2007-02-12 12:27:43 +00:00
Ben Gras
8ea438ae93 Retired DEV_{READ,WRITE,GATHER,SCATTER,IOCTL} (safe versions *_S are to
be used and drivers should never receieve these 'unsafe' variants
any more).
2007-02-07 16:22:19 +00:00
Ben Gras
41e9fedf87 Mostly bugfixes of bugs triggered by the test set.
bugfixes:
 SYSTEM:
 . removed
        rc->p_priv->s_flags = 0;
   for the priv struct shared by all user processes in get_priv(). this
   should only be done once. doing a SYS_PRIV_USER in sys_privctl()
   caused the flags of all user processes to be reset, so they were no
   longer PREEMPTIBLE. this happened when RS executed a policy script.
   (this broke test1 in the test set)

 VFS/MFS:
 . chown can change the mode of a file, and chmod arguments are only
   part of the full file mode so the full filemode is slightly magic.
   changed these calls so that the final modes are returned to VFS, so
   that the vnode can be kept up-to-date.
   (this broke test11 in the test set)

 MFS:
 . lookup() checked for sizeof(string) instead of sizeof(user_path),
   truncating long path names
   (caught by test 23)
 . truncate functions neglected to update ctime
   (this broke test16)

 VFS:
 . corner case of an empty filename lookup caused fields of a request
   not to be filled in in the lookup functions, not making it clear
   that the lookup had failed, causing messages to garbage processes,
   causing strange failures.
   (caught by test 30)
 . trust v_size in vnode when doing reads or writes on non-special
   files, truncating i/o where necessary; this is necessary for pipes,
   as MFS can't tell when a pipe has been truncated without it being
   told explicitly each time.
   when the last reader/writer on a pipe closes, tell FS about
   the new size using truncate_vn().
   (this broke test 25, among others)
 . permission check for chdir() had disappeared; added a
   forbidden() call
   (caught by test 23)

new code, shouldn't change anything:
 . introduced RTS_SET, RTS_UNSET, and RTS_ISSET macro's, and their
   LOCK variants. These macros set and clear the p_rts_flags field,
   causing a lot of duplicated logic like

       old_flags = rp->p_rts_flags;            /* save value of the flags */
       rp->p_rts_flags &= ~NO_PRIV;
       if (old_flags != 0 && rp->p_rts_flags == 0) lock_enqueue(rp);

   to change into the simpler

       RTS_LOCK_UNSET(rp, NO_PRIV);

   so the macros take care of calling dequeue() and enqueue() (or lock_*()),
   as the case may be). This makes the code a bit more readable and a
   bit less fragile.
 . removed return code from do_clocktick in CLOCK as it currently
   never replies
 . removed some debug code from VFS
 . fixed grant debug message in device.c
 
preemptive checks, tests, changes:
 . added return code checks of receive() to SYSTEM and CLOCK
 . O_TRUNC should never arrive at MFS (added sanity check and removed
   O_TRUNC code)
 . user_path declared with PATH_MAX+1 to let it be null-terminated
 . checks in MFS to see if strings passed by VFS are null-terminated
 
 IS:
 . static irq name table thrown out
2007-02-01 17:50:02 +00:00
Philip Homburg
9092146be7 VFS cleanup (mostly open). 2007-01-05 16:36:55 +00:00
Philip Homburg
bafc45a309 First cut at 64-bit file offsets in block devices for mkfs/fsck. 2006-11-27 14:21:43 +00:00
Ben Gras
fa0ba56bc9 Merge of VFS by Balasz Gerofi with Minix trunk. 2006-10-25 13:40:36 +00:00
Renamed from servers/fs/device.c (Browse further)