minix

Author	SHA1	Message	Date
Thomas Veerman	7c8b3ddfed	VFS: fix locking bugs .sync and fsync used unnecessarily restrictive locking type .fsync violated locking order by obtaining a vmnt lock after a filp lock .fsync contained a TOCTOU bug .new_node violated locking rules (didn't upgrade lock upon file creation) .do_pipe used unnecessarily restrictive locking type .always lock pipes exclusively; even a read operation might require to do a write on a vnode object (update pipe size) .when opening a file with O_TRUNC, upgrade vnode lock when truncating .utime used unnecessarily restrictive locking type .path parsing: .always acquire VMNT_WRITE or VMNT_EXCL on vmnt and downgrade to VMNT_READ if that was what was actually requested. This prevents the following deadlock scenario: thread A: lock_vmnt(vmp, TLL_READSER); lock_vnode(vp, TLL_READSER); upgrade_vmnt_lock(vmp, TLL_WRITE); thread B: lock_vmnt(vmp, TLL_READ); lock_vnode(vp, TLL_READSER); thread A will be stuck in upgrade_vmnt_lock and thread B is stuck in lock_vnode. This happens when, for example, thread A tries create a new node (open.c:new_node) and thread B tries to do eat_path to change dir (stadir.c:do_chdir). When the path is being resolved, a vnode is always locked with VNODE_OPCL (TLL_READSER) and then downgraded to VNODE_READ if read-only is actually requested. Thread A locks the vmnt with VMNT_WRITE (TLL_READSER) which still allows VMNT_READ locks. Thread B can't acquire a lock on the vnode because thread A has it; Thread A can't upgrade its vmnt lock to VMNT_WRITE (TLL_WRITE) because thread B has a VMNT_READ lock on it. By serializing vmnt locks during path parsing, thread B can only acquire a lock on vmp when thread A has completely finished its operation.	2013-01-11 09:18:35 +00:00
Thomas Veerman	992799b91f	VFS: make all IPC asynchronous By decoupling synchronous drivers from VFS, we are a big step closer to supporting driver crashes under all circumstances. That is, VFS can't become stuck on IPC with a synchronous driver (e.g., INET) and can recover from crashing block drivers during open/close/ioctl or during communication with an FS. In order to maintain serialized communication with a synchronous driver, the communication is wrapped by a mutex on a per driver basis (not major numbers as there can be multiple majors with identical endpoints). Majors that share a driver endpoint point to a single mutex object. In order to support crashes from block drivers, the file reopen tactic had to be changed; first reopen files associated with the crashed driver, then send the new driver endpoint to FSes. This solves a deadlock between the FS and the block driver; - VFS would send REQ_NEW_DRIVER to an FS, but he FS only receives it after retrying the current request to the newly started driver. - The block driver would refuse the retried request until all files had been reopened. - VFS would reopen files only after getting a reply from the initial REQ_NEW_DRIVER. When a character special driver crashes, all associated files have to be marked invalid and closed (or reopened if flagged as such). However, they can only be closed if a thread holds exclusive access to it. To obtain exclusive access, the worker thread (which handles the new driver endpoint event from DS) schedules a new job to garbage collect invalid files. This way, we can signal the worker thread that was talking to the crashed driver and will release exclusive access to a file associated with the crashed driver and prevent the garbage collecting worker thread from dead locking on that file. Also, when a character special driver crashes, RS will unmap the driver and remap it upon restart. During unmapping, associated files are marked invalid instead of waiting for an endpoint up event from DS, as that event might come later than new read/write/select requests and thus cause confusion in the freshly started driver. When locking a filp, the usage counters are no longer checked. The usage counter can legally go down to zero during filp invalidation while there are locks pending. DS events are handled by a separate worker thread instead of the main thread as reopening files could lead to another crash and a stuck thread. An additional worker thread is then necessary to unlock it. Finally, with everything asynchronous a race condition in do_select surfaced. A select entry was only marked in use after succesfully sending initial select requests to drivers and having to wait. When multiple select() calls were handled there was opportunity that these entries were overwritten. This had as effect that some select results were ignored (and select() remained blocking instead if returning) or do_select tried to access filps that were not present (because thrown away by secondary select()). This bug manifested itself with sendrecs, but was very hard to reproduce. However, it became awfully easy to trigger with asynsends only.	2012-09-17 11:01:45 +00:00
Thomas Veerman	77dbd766c1	VFS: Use safe string copy functions	2012-07-16 10:57:43 +00:00
Thomas Veerman	b6ff38065f	VFS: release what can be released Only attempt to release blocked processes that are blocked. There is no use in trying to find more blocked processes than we know that are blocked (on a pipe).	2012-04-27 08:51:02 +00:00
Thomas Veerman	db8198d99d	VFS: use S_IS* macros	2012-04-27 08:49:38 +00:00
Thomas Veerman	96bbc5da3e	VFS: I_PIPE is redundant Also, use S_IS* macros instead of manual comparison.	2012-04-27 08:49:38 +00:00
Thomas Veerman	26ec619a30	VFS: fix filp reuse race Pipes consist of two filps (read filp and write filp) and a shared vnode. When the writer leaves the filp reference count drops to zero and subsequent find_filp()s should not find the filp when a reader looks for it and the reader gets EOF. However, the pipe() system call tries to find two filps, marks them in use, and only after a successful node creation on PFS, overwrites the shared vnode with the new vnode. Consequently, this leaves a small window where a just closed 'pipe write filp' gets reused and marked as present, before becoming the actual new 'pipe write filp' for a new pipe. A reader for the old pipe will think a writer is present and wait for that writer to write something or to leave; both actions should revive the suspended reader. This will never happen and the reader will be stuck forever.	2012-04-13 13:22:57 +00:00
Thomas Veerman	8f55767619	VFS: make m_in job local By making m_in job local (i.e., each job has its own copy of m_in instead of refering to the global m_in) we don't have to store and restore m_in on every thread yield. This reduces overhead. Moreover, remove the assumption that m_in is preserved. Do_XXX functions have to copy the system call parameters as soon as possible and only pass those copies to other functions. Furthermore, this patch cleans up some code and uses better types in a lot of places.	2012-04-13 12:50:38 +00:00
Ben Gras	7336a67dfe	retire PUBLIC, PRIVATE and FORWARD	2012-03-25 21:58:14 +02:00
Ben Gras	6a73e85ad1	retire _PROTOTYPE . only good for obsolete K&R support . also remove a stray ansi.h and the proto cmd	2012-03-25 16:17:10 +02:00
Thomas Veerman	80c4685324	VFS: replace VFS with AVFS	2012-02-13 16:53:21 +00:00
David van Moolenbroek	b4d909d415	Split block/character protocols and libdriver This patch separates the character and block driver communication protocols. The old character protocol remains the same, but a new block protocol is introduced. The libdriver library is replaced by two new libraries: libchardriver and libblockdriver. Their exposed API, and drivers that use them, have been updated accordingly. Together, libbdev and libblockdriver now completely abstract away the message format used by the block protocol. As the memory driver is both a character and a block device driver, it now implements its own message loop. The most important semantic change made to the block protocol is that it is no longer possible to return both partial results and an error for a single transfer. This simplifies the interaction between the caller and the driver, as the I/O vector no longer needs to be copied back. Also, drivers are now no longer supposed to decide based on the layout of the I/O vector when a transfer should be cut short. Put simply, transfers are now supposed to either succeed completely, or result in an error. After this patch, the state of the various pieces is as follows: - block protocol: stable - libbdev API: stable for synchronous communication - libblockdriver API: needs slight revision (the drvlib/partition API in particular; the threading API will also change shortly) - character protocol: needs cleanup - libchardriver API: needs cleanup accordingly - driver restarts: largely unsupported until endpoint changes are reintroduced As a side effect, this patch eliminates several bugs, hacks, and gcc -Wall and -W warnings all over the place. It probably introduces a few new ones, too. Update warning: this patch changes the protocol between MFS and disk drivers, so in order to use old/new images, the MFS from the ramdisk must be used to mount all file systems.	2011-11-23 14:06:37 +01:00
David van Moolenbroek	c51cd5fe91	Server/driver protocols: no longer allow third-party copies. Before safecopies, the IO_ENDPT and DL_ENDPT message fields were needed to know which actual process to copy data from/to, as that process may not always be the caller. Now that we have full safecopy support, these fields have become useless for that purpose: the owner of the grant is always the caller. Allowing the caller to supply another endpoint is in fact dangerous, because the callee may then end up using a grant from a third party. One could call this a variant of the confused deputy problem. From now on, safecopy calls should always use the caller's endpoint as grant owner. This fully obsoletes the DL_ENDPT field in the inet/ethernet protocol. IO_ENDPT has other uses besides identifying the grant owner though. This patch renames IO_ENDPT to USER_ENDPT, not only because that is a more fitting name (it should never be used for I/O after all), but also in order to intentionally break any old system source code outside the base system. If this patch breaks your code, fixing it is fairly simple: - DL_ENDPT should be replaced with m_source; - IO_ENDPT should be replaced with m_source when used for safecopies; - IO_ENDPT should be replaced with USER_ENDPT for any other use, e.g. when setting REP_ENDPT, matching requests in CANCEL calls, getting DEV_SELECT flags, and retrieving of the real user process's endpoint in DEV_OPEN. The changes in this patch are binary backward compatible.	2011-04-11 17:35:05 +00:00
Dirk Vogt	9ed280d1ec	decouple file system server start/termination from mount/umount	2010-11-23 19:34:56 +00:00
Thomas Veerman	13ef7f1f38	Prepare VFS to support back calls from PFS. For security reasons and to support file descriptor passing, PFS does some back calls to VFS. For example, to verify the validity of a path provided by a process and to tell VFS it must copy file descriptors from one process to another.	2010-08-30 13:44:07 +00:00
Tomas Hruby	6e25ad8b0a	Use of all NIL_* defines converted to NULL	2010-05-10 13:26:00 +00:00
Thomas Veerman	958b25be50	- Introduce support for sticky bit. - Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly. - Clean up MFS by removing old, dead code (backwards compatibility is broken by the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all functions have proper banners and prototypes. - VFS should always provide a (syntactically) valid path to the FS; no need for the FS to do sanity checks when leaving/entering mount points. - Fix several bugs in MFS: - Several path lookup bugs in MFS. - A link can be too big for the path buffer. - A mountpoint can become inaccessible when the creation of a new inode fails, because the inode already exists and is a mountpoint. - Introduce support for supplemental groups. - Add test 46 to test supplemental group functionality (and removed obsolete suppl. tests from test 2). - Clean up VFS (not everything is done yet). - ISOFS now opens device read-only. This makes the -r flag in the mount command unnecessary (but will still report to be mounted read-write). - Introduce PipeFS. PipeFS is a new FS that handles all anonymous and named pipes. However, named pipes still reside on the (M)FS, as they are part of the file system on disk. To make this work VFS now has a concept of 'mapped' inodes, which causes read, write, truncate and stat requests to be redirected to the mapped FS, and all other requests to the original FS.	2009-12-20 20:27:14 +00:00
Philip Homburg	047cc090e4	Added filp_state for driver recovery and filp_select_flags to store select state for character specials that use asynch I/O.	2008-02-22 14:19:23 +00:00
Philip Homburg	4b1cd8c0ec	Return EIO if a filedescriptor cannot be re-opened after a driver restart. Select now returns such a filedescriptor as ready (instead of EBADF). Reply before dev_up in FSSIGNON to avoid the problem that a DEV_OPEN request is received by a driver that expects a reply from the FSSIGNON.	2007-08-15 12:53:52 +00:00
Philip Homburg	a116b3aa55	To return the right error, check first is an object is a directory (for mkdir, rmdir/unlink, mknod), simply pipe code by using v_pipe_rd_pos and v_pipe_wr_pos directly. Some cleanup work in open.c	2007-08-08 14:01:36 +00:00
Ben Gras	41e9fedf87	Mostly bugfixes of bugs triggered by the test set. bugfixes: SYSTEM: . removed rc->p_priv->s_flags = 0; for the priv struct shared by all user processes in get_priv(). this should only be done once. doing a SYS_PRIV_USER in sys_privctl() caused the flags of all user processes to be reset, so they were no longer PREEMPTIBLE. this happened when RS executed a policy script. (this broke test1 in the test set) VFS/MFS: . chown can change the mode of a file, and chmod arguments are only part of the full file mode so the full filemode is slightly magic. changed these calls so that the final modes are returned to VFS, so that the vnode can be kept up-to-date. (this broke test11 in the test set) MFS: . lookup() checked for sizeof(string) instead of sizeof(user_path), truncating long path names (caught by test 23) . truncate functions neglected to update ctime (this broke test16) VFS: . corner case of an empty filename lookup caused fields of a request not to be filled in in the lookup functions, not making it clear that the lookup had failed, causing messages to garbage processes, causing strange failures. (caught by test 30) . trust v_size in vnode when doing reads or writes on non-special files, truncating i/o where necessary; this is necessary for pipes, as MFS can't tell when a pipe has been truncated without it being told explicitly each time. when the last reader/writer on a pipe closes, tell FS about the new size using truncate_vn(). (this broke test 25, among others) . permission check for chdir() had disappeared; added a forbidden() call (caught by test 23) new code, shouldn't change anything: . introduced RTS_SET, RTS_UNSET, and RTS_ISSET macro's, and their LOCK variants. These macros set and clear the p_rts_flags field, causing a lot of duplicated logic like old_flags = rp->p_rts_flags; /* save value of the flags / rp->p_rts_flags &= ~NO_PRIV; if (old_flags != 0 && rp->p_rts_flags == 0) lock_enqueue(rp); to change into the simpler RTS_LOCK_UNSET(rp, NO_PRIV); so the macros take care of calling dequeue() and enqueue() (or lock_()), as the case may be). This makes the code a bit more readable and a bit less fragile. . removed return code from do_clocktick in CLOCK as it currently never replies . removed some debug code from VFS . fixed grant debug message in device.c preemptive checks, tests, changes: . added return code checks of receive() to SYSTEM and CLOCK . O_TRUNC should never arrive at MFS (added sanity check and removed O_TRUNC code) . user_path declared with PATH_MAX+1 to let it be null-terminated . checks in MFS to see if strings passed by VFS are null-terminated IS: . static irq name table thrown out	2007-02-01 17:50:02 +00:00
Philip Homburg	bafc45a309	First cut at 64-bit file offsets in block devices for mkfs/fsck.	2006-11-27 14:21:43 +00:00
Ben Gras	fa0ba56bc9	Merge of VFS by Balasz Gerofi with Minix trunk.	2006-10-25 13:40:36 +00:00

23 commits