minix

Author	SHA1	Message	Date
Thomas Veerman	c66fd312d4	VFS-FS protocol: add versioning Change-Id: Ice6fbfd4b535b7435653fa08b27a3378d1cfdbf8	2014-02-18 11:25:00 +01:00
Ben Gras	740c1a7425	libminixfs: allow non-pagesize-multiple FSes The memory-mapped files implementation (mmap() etc.) is implemented with the help of the filesystems using the in-VM FS cache. Filesystems tell it about all cached blocks and their metadata. Metadata is: device offset and, if any (and known), inode number and in-inode offset. VM can then map in requested memory-mapped file blocks, and request them if necessary. A limitation of this system is that filesystem block sizes that are not a multiple of the VM system (and VM hardware) page size are not possible; we can't map blocks in partially. (We can copy, but then the benefits of mapping and sharing the physical pages are gone.) So until this commit, various pieces of caching code assumed page size multiple blocksizes. This isn't strictly necessary as long as mmap() needn't be supported on that FS. This change allows the in-FS cache code (libminixfs) to allocate any-sized blocks, and will not interact with the VM cache for non-pagesize-multiple blocks. In that case it will also signal requestors, by failing 'peek' requests, that mmap() should not be supported on this FS. VM and VFS will then gracefully fail all file-mapping mmap() calls, and exec() will fall back to copying executable blocks instead of mmap()ping executables. As a result, 3 diagnostics that signal file-mapped mmap()s failing (hitherto an unusual occurrence) are disabled, as ld.so does file-mapped mmap()s to map in objects it needs. On FSes not supporting it this situation is legitimate and shouldn't cause so much noise. ld.so will revert to its own minix-specific allocate+copy style of starting executables if mmap()s fail. Change-Id: Iecb1c8090f5e0be28da8f5181bb35084eb18f67b	2013-11-21 10:03:06 +00:00
Gerard	f1b0deacf3	Replaced add64, add64u and add64ul with operators. Change-Id: Ia537f83e15cb686f1b81b34d73596f4298b0a924	2013-11-13 13:11:33 +00:00
Ben Gras	c8f3b10909	fix a few more minix specific warnings . also disable stack protection feature for gcc, causes build errors for pkgsrc gcc on minix Change-Id: I1c6e2bcb4d948098d642543d7b2711284ee55c72	2013-08-27 16:16:03 +00:00
David van Moolenbroek	78d707cd26	VM: support for shared call mask ACLs The VM server now manages its call masks such that all user processes share the same call mask. As a result, an update for the call mask of any user process will apply to all user processes. This is similar to the privilege infrastructure employed by the kernel, and may serve as a template for similar fine-grained restrictions in other servers. Concretely, this patch fixes the problem of "service edit init" not applying the given VM call mask to user processes started from RC scripts during system startup. In addition, this patch makes RS set a proper VM call mask for each recovery script it spawns. Change-Id: I520a30d85a0d3f3502d2b158293a2258825358cf	2013-08-08 23:22:58 +02:00
Antoine Leca	da82f9b2e8	<a.out.h>, MINIX style: remove as obsolete Change-Id: Icc8b7210d60a93ac9cc4610d676dcba270756410	2013-08-06 11:43:35 +02:00
David van Moolenbroek	c4f3c7d66f	VFS: fix access bits in device reopen calls	2013-08-05 12:30:13 +02:00
Lionel Sambuc	0cdf705cc6	Enable optional GCC install and GCC improvements -By adding MKGCC=yes and MKGCCCMDS=yes on the make commandline it is now possible to compile and install GCC on the system. Before doing this, if you are not using the build.sh script, you will need to call the fetch scripts in order to retrieve the sources of GCC and its dependencies. -Reduce difference with NetBSD share/mk Move Minix-specific parameters from bsd.gcc.mk to bsd.own.mk, which is anyway patched, so that bsd.gcc.mk is now aligned on the NetBSD version. -Clean libraries dependencies, compiles stdc++ only if gcc is also compiled (it is part of the gcc sources) -Correct minix.h header sequence, cleanup spec headers. -Fix cross-compilation from a 32bit host targeting MINIX/arm Change-Id: I1b234af18eed4ab5675188244e931b2a2b7bd943	2013-07-12 14:22:03 +02:00
Xiaoguang Sun	64f10ee644	Implement getrusage Implement getrusage. These fields of struct rusage are not supported and always set to zero at this time: long ru_nswap; /* swaps */ long ru_inblock; /* block input operations */ long ru_oublock; /* block output operations */ long ru_msgsnd; /* messages sent */ long ru_msgrcv; /* messages received */ long ru_nvcsw; /* voluntary context switches */ long ru_nivcsw; /* involuntary context switches */ test75.c is the unit test for this new function Change-Id: I3f1eb69de1fce90d087d76773b09021fc6106539	2013-07-01 23:00:47 +02:00
Ben Gras	cdf2f55a90	kernel, arm ucontext: ARM DBG=-g run fixes kernel: . modules can be as big as the space (8MB) between them instead of 4MB; memory is slightly bigger with DBG=-g arm ucontext: . r4 is clobbered by the restore function, as it's used as a scratch register, causing problems for the DBG=-g build . r1-r3 are safe for scratch registers, as they are caller-save, so use r3 instead; and don't bother restoring r1-r3, but preserve r4 vfs: . improve TLL pointer sanity check a bit Change-Id: I0e3cfc367fdc14477e40d04b5e044f288ca4cc7d	2013-06-24 16:57:30 +02:00
Ben Gras	456359aa72	retire 64-bit conversion functions Change-Id: Ib6b81403f877c363a286c654e0524fa1cb781b80	2013-06-24 16:50:57 +02:00
Ben Gras	8f2749cca8	vfs: patch for unpause()/revive() race condition . unpause() and revive() can race - revive() can run during a device i/o unblock, causing two sendnb()s to occur, and the 2nd one to fail . this can easily happen when a process is blocking on tty and is then killed by a signal - tty cancels the i/o and then kills the process by a signal Change-Id: Ia319acaedfa336b78c030a2c4af7246959bdcf87	2013-06-14 23:58:43 +02:00
Ben Gras	9e43052b21	inline sendnb should not call send vector . also vfs has to reply to a vm call - so use asynsend for that Change-Id: I30ac1e591191dea5c99e25b03151a4415d1151b0	2013-06-12 07:04:53 +00:00
Ben Gras	9178749e13	libc syslog, syslogd, logger, uds fixes changes necessary for libc syslog() using a unix domain socket. . libc syslog: don't use send() connect() for unix datagram sockets, minix wants write() and ioctl() . syslogd: listen on _PATH_LOG unix domain socket . logger: warnings fixes . pfs: make uds dgram socket type nonblocking so syslog() doesn't block . vfs: add sanity check for empty fd in unpause() Change-Id: Ied136c6fe0cc288f5a53478f1eebccc1ab1f39fb	2013-06-12 07:04:52 +00:00
Ben Gras	f369157d95	pfs, vfs: increase various limits . pipes in pfs . vnodes in vfs . thread stack sizes in vfs Change-Id: Ib27dedd42f57a90821a5b950cd7ea25cb2b42f3f	2013-05-31 15:42:00 +00:00
Ben Gras	33a7ac7557	vfs: mmap support . libc: add vfs_mmap, a way for vfs to initiate mmap()s. This is a good special case to have as vfs is a slightly different client from regular user processes. It doesn't do it for itself, and has the dev & inode info already so the callback to VFS for the lookup isn't necessary. So it has different info to have to give to VM. . libc: also add minix_mmap64() that accepts a 64-bit offset, even though our off_t is still 32 bit now. . On exec() time, try to mmap() in the executable if available. (It is not yet available in this commit.) . To support mmap(), add do_vm_call that allows VM to lookup (to ino+dev), do i/o from and close FD's on behalf of other processes. Change-Id: I831551e45a6781c74313c450eb9c967a68505932	2013-05-31 15:42:00 +00:00
Ben Gras	2d2a1a077d	panic: declare as printf-style-checked . and related fixes Change-Id: I5131ac57dc53d8aec8d421a34c5ceea383404d7a	2013-05-31 13:35:25 +00:00
Ben Gras	5507a12d7c	vfs: who_p fix Change-Id: I0e04b6460907f5e67f6c90b2038d296d66b9a414	2013-05-31 09:28:38 +00:00
Ben Gras	4ebb889e7a	libsys: panic hook feature . vfs: use it to dump threads stacks Change-Id: I7ae3521fc153a407505f11049629e6d4142cf7c7	2013-05-07 17:18:40 +00:00
Ben Gras	44f34e53d5	VFS: Implement REQ_BPEEK. This commit introduces a new request type called REQ_BPEEK. It requests minor device blocks from the FS. Analogously to REQ_PEEK, it requests the filesystem to get the requested blocks into its cache, without actually copying the result anywhere. Change-Id: If1d06645b0e17553a64b3167091e9d12efeb3d6f	2013-04-24 10:18:16 +00:00
Ben Gras	0cfff08e56	libexec: mmap support, prealloc variants In libexec, split the memory allocation method into cleared and non-cleared. Cleared gives zeroed memory, non-cleared gives 'junk' memory (that will be overwritten anyway, and so needn't be cleared) that is faster to get. Also introduce the 'memmap' method that can be used, if available, to map code and data from executables into a process using the third-party mmap() mode. Change-Id: I26694fd3c21deb8b97e01ed675dfc14719b0672b	2013-04-24 10:18:16 +00:00
Xiaoguang Sun	20e6c9329f	Change function prototype to use endpoint_t instead of int	2013-04-23 17:15:15 +02:00
Ben Gras	072d916c1c	vfs: fix null deref, pfs: add fchmod() . vfs read_only() assumes vnode->v_vmnt is non-NULL, but it can be NULL sometimes . e.g. fchmod() on UDS triggered NULL deref; add a check and add REQ_CHMOD to pfs so unix domain sockets can be fchmod()ded . add to test56 Change-Id: I83c840f101b647516897cc99fcf472116d762012	2013-04-19 17:06:56 +02:00
Ben Gras	cef94e096e	vfs: make m_out non-global m_out is shared between threads as the reply message, and it can happen results get overwritten by another thread before the reply is sent. This change . makes m_out local to the message handling function, declared on the stack of the caller . forces callers of reply() to give it a message, or declare the reply message has no significant fields except for the return code by calling replycode() Change-Id: Id06300083a63c72c00f34f86a5c7d96e4bbdf9f6	2013-04-12 23:40:38 +00:00
Antoine Leca	9131e98a7d	utimens(2) system call Variant of utime(2) with struct timespec (with ns precision) instead of time_t values; also allows for tv_nsec members the values UTIME_NOW (force update to current time) or UTIME_OMIT (allow to set either atim or mtim independently.) Provides a superset of utimes(2), futimes(2), lutimes(2), and futimens(2). Provides the same subset of utimensat(2) as does NetBSD 6. Also import utimens() and lutimeNS() from NetBSD-current.	2013-04-12 18:55:39 +00:00
Antoine Leca	4069cef7f9	Subsecond timestamps support for FS Expand REQ_UTIME to include tv_nsec members (as in struct timespec) in addition to tv_sec==time_t Designed with help from David van Moolenbroek	2013-04-12 11:11:59 +02:00
Thomas Cort	516fec97d9	libc: add clock_settime() system call. This also adds the sys_settime() kernel call which allows for the adjusting of the clock named realtime in the kernel. The existing sys_stime() function is still needed for a separate job (setting the boottime). The boottime is set in the readclock driver. The sys_settime() interface is meant to be flexible and will support both clock_settime() and adjtime() when adjtime() is implemented later. settimeofday() was adjusted to use the clock_settime() interface. One side note discovered during testing: uptime(1) (part of the last(1)), uses wtmp to determine boottime (not Minix's times(2)). This leads `uptime` to report odd results when you set the time to a time prior to boottime. This isn't a new bug introduced by my changes. It's been there for a while.	2013-04-04 15:04:54 +02:00
Thomas Cort	e67fc5771d	libc: add clock_getres()/clock_gettime() system calls. In order to make it more clear that ticks should be used for timers and realtime should be used for timestamps / displaying the date/time, getuptime() was renamed to getticks() and getuptime2() was renamed to getuptime(). Servers, drivers, libraries, tests, etc that use getuptime()/getuptime2() have been updated. In instances where a realtime was calculated, the calculation was changed to use realtime. System calls clock_getres() and clock_gettime() were added to PM/libc.	2013-04-04 15:04:53 +02:00
Thomas Veerman	6ee180f5f7	VFS: wikify README Change-Id: I746f7c8ddabd1e047b8d536df14586c5b1594d55	2013-03-21 15:20:34 +00:00
Ben Gras	4f9139778d	vfs: coredump fix: write zeroes for missing memory	2013-03-20 20:05:31 +00:00
Thomas Veerman	76ddef10da	UDS: terminate canonical path string When you provided a string with junk after the terminating nul to a UNIX domain socket and used bind(2), the canonical path function would not properly terminate the new string. This caused VFS to return ENAMETOOLONG on an otherwise valid path name. Test case is added to test56. Change-Id: I883b6be23d9e4ea13c3cee28cbb3726343df037f	2013-03-08 15:42:32 +00:00
Ben Gras	a9f55a2e46	VFS, FSes: add REQ_PEEK request type REQ_PEEK behaves just like REQ_READ except that it does not copy data anywhere, just obtains the blocks from the FS into the cache. To be used by the future mmap implementation. Change-Id: I1b56de304f0a7152b69a72c8962d04258adb44f9	2013-03-07 10:57:38 +00:00
Lionel Sambuc	8f3fbf7cc1	Cleanup: Remove minix.bootprog.mk The build system distinction between "bootprog" and "service" is meaningless as boot programs are standard services. As minix.service.mk simply imports minix.bootprog.mk, reduce confusion by removing minix.bootprog.mk and placing the rules in minix.service.mk. Change-Id: I4056b1e574bed59a8c890239b41b1a7c7cad63e8	2013-03-06 11:56:56 +01:00
Thomas Veerman	49ad4e8888	Spring cleanup Remove old versions of system calls and system calls that don't have a libc api interface anymore (dup, dup2, creat). VFS still contains support for old system call numbers for the new stat system calls (i.e., 65, 66, 67) to keep supporting old binaries built for MINIX 3.2.1 (prior to the release). Change-Id: I721779b58a50c7eeae20669de24658d55d69b25b	2013-03-06 09:56:08 +00:00
Thomas Veerman	473547c777	VFS: implement pipe2 Change-Id: Iedc8042dd73a903456b25ba665d12577f5589ca2	2013-02-28 10:08:53 +00:00
Thomas Veerman	fa78dc389f	socket: implement SOCK_CLOEXEC and SOCK_NONBLOCK Change-Id: I3fa36fa999c82a192d402cb4d913bd397e106e53	2013-02-28 10:08:53 +00:00
Thomas Veerman	fd610ba1b0	VFS: add ability to open files O_CLOEXEC .adjust libc to make use of it (undo __minix diff) Change-Id: I90a1aa219fcd1b12b6bc60e72176f326eac8184a	2013-02-28 10:08:53 +00:00
Lionel Sambuc	f640210005	Removing obsolete _NBSD_LIBC define Change-Id: Ia6ce84ccdf36cf6f64540b990baaa7d85c53533d	2013-02-26 09:44:24 +00:00
Lionel Sambuc	8e4736f2df	Removing obsolete _MINIX define Change-Id: Id33ac7e973d1c0e249b690fe44a597474fac6076	2013-02-26 09:44:20 +00:00
Thomas Veerman	2b90964e33	VFS: don't garbage collect if file is already closed	2013-02-21 10:29:08 +00:00
Thomas Veerman	cfcce207c1	VFS: prevent unmapping drivers that don't support reopening libchardriver does not support DEV_REOPEN and will return ERESTART when you do try it. This made VFS unhappy and concluded erroneously that the driver was EDEADEPT.	2013-02-21 10:29:08 +00:00
Ben Gras	298b41b523	libexec: detect short files if an exec() fails partway through reading in the sections, the target process is already gone and a defunct process remains. sanity checking the binary beforehand helps that. test10 mutilates binaries and exec()s them on purpose; making an exec() fail cleanly in such cases seems like acceptable behaviour. fixes test10 on ARM. Change-Id: I1ed9bb200ce469d4d349073cadccad5503b2fcb0	2013-02-04 12:04:35 +01:00
Thomas Veerman	06e2adbeaa	VFS: fix select again Change-Id: Ia5e26cdbfe38e3fb293dd57269a76b15c1fe236b	2013-01-25 17:42:36 +00:00
Thomas Veerman	b180f32ab3	VFS/PFS: remove remnants of file position in pipes	2013-01-23 11:14:34 +00:00
Thomas Veerman	306f3ccd6f	VFS: fix select bug on pipes	2013-01-23 11:14:34 +00:00
Lionel Sambuc	f14fb60209	Libraries updates and cleanup * Updating common/lib * Updating lib/csu * Updating lib/libc * Updating libexec/ld.elf_so * Corrected test on __minix in featuretest to actually follow the meaning of the comment. * Cleaned up _REENTRANT-related definitions. * Disabled -D_REENTRANT for libfetch * Removing some unneeded __NBSD_LIBC defines and tests Change-Id: Ic1394baef74d11b9f86b312f5ff4bbc3cbf72ce2	2013-01-14 11:36:26 +01:00
Thomas Veerman	bdfef53dbf	VFS: initialize variables	2013-01-11 12:46:44 +00:00
Thomas Veerman	aa521228a5	VFS: Coverity appeasements	2013-01-11 09:42:01 +00:00
Thomas Veerman	ea8ff9284a	Add stack trace dumps for VFS over serial	2013-01-11 09:18:36 +00:00
Thomas Veerman	625f4ae4a3	VFS: add documentation about internal working	2013-01-11 09:18:36 +00:00
Thomas Veerman	23c5f56e32	VFS: change locking to ease concurrent FSes This patch uses stricter locking for REQ_LINK, REQ_MKDIR, REQ_MKNOD, REQ_RENAME, REQ_RMDIR, REQ_SLINK and REQ_UNLINK. For all requests, VFS locks the directory in which we add or remove an inode with VNODE_WRITE. I.e., the operations have exclusive access to that directory. Furthermore, REQ_CHOWN, REQ_CHMOD, and REQ_FTRUNC now lock the vmnt VMNT_READ; VMNT_WRITE was unnecessary.	2013-01-11 09:18:35 +00:00
Thomas Veerman	3de8d1cf6e	VFS/PFS: remove notion of position in pipes Because pipes have no file position. VFS maintained (file) offsets into a buffer internal to PFS and stored them in vnodes for simplicity, mixing the responsibilities of filp and vnode objects. With this patch PFS ignores the position field in REQ_READ and REQ_WRITE requests making VFS' job a lot simpler.	2013-01-11 09:18:35 +00:00
Thomas Veerman	7c8b3ddfed	VFS: fix locking bugs .sync and fsync used unnecessarily restrictive locking type .fsync violated locking order by obtaining a vmnt lock after a filp lock .fsync contained a TOCTOU bug .new_node violated locking rules (didn't upgrade lock upon file creation) .do_pipe used unnecessarily restrictive locking type .always lock pipes exclusively; even a read operation might require to do a write on a vnode object (update pipe size) .when opening a file with O_TRUNC, upgrade vnode lock when truncating .utime used unnecessarily restrictive locking type .path parsing: .always acquire VMNT_WRITE or VMNT_EXCL on vmnt and downgrade to VMNT_READ if that was what was actually requested. This prevents the following deadlock scenario: thread A: lock_vmnt(vmp, TLL_READSER); lock_vnode(vp, TLL_READSER); upgrade_vmnt_lock(vmp, TLL_WRITE); thread B: lock_vmnt(vmp, TLL_READ); lock_vnode(vp, TLL_READSER); thread A will be stuck in upgrade_vmnt_lock and thread B is stuck in lock_vnode. This happens when, for example, thread A tries to create a new node (open.c:new_node) and thread B tries to do eat_path to change dir (stadir.c:do_chdir). When the path is being resolved, a vnode is always locked with VNODE_OPCL (TLL_READSER) and then downgraded to VNODE_READ if read-only is actually requested. Thread A locks the vmnt with VMNT_WRITE (TLL_READSER) which still allows VMNT_READ locks. Thread B can't acquire a lock on the vnode because thread A has it; Thread A can't upgrade its vmnt lock to VMNT_WRITE (TLL_WRITE) because thread B has a VMNT_READ lock on it. By serializing vmnt locks during path parsing, thread B can only acquire a lock on vmp when thread A has completely finished its operation.	2013-01-11 09:18:35 +00:00
Kees Jongenburger	c0c581a635	vfs:fix for variable 'rfp' set but not used. mount.c: In function 'mount_pfs': mount.c:395:17: error: variable 'rfp' set but not used [-Werror=unused-but-set-variable] Change-Id: I2f22590ab4e3a4a1678e9096626ebca53d2660e6	2013-01-07 09:12:27 +01:00
Ben Gras	8aeac26999	vfs: fix clobbering fd_nr dumpcore: fd_nr can be in use as blocking fd but will then be clobbered by common_open, causing disaster for exiting unpause().	2012-12-11 12:00:57 +01:00
David van Moolenbroek	766047123a	VFS: fix off-by-one in get_name()	2012-11-30 12:24:47 +00:00
Thomas Veerman	179261a9b6	mtab: support moving mount points Also fix canonical_path function; it fails to parse some paths	2012-11-29 10:50:51 +00:00
Thomas Veerman	d9f4f71916	Implement dynamic mtab support With this patch /etc/mtab becomes obsolete.	2012-11-26 15:20:18 +00:00
Thomas Veerman	de83b2a9d9	VFS: change 'last_dir' to match locking assumption new_node makes the assumption that when it does last_dir on a path, a successive advance would not yield a lock on a vmnt, because last_dir already locked the vmnt. This is true except when last_dir resolves to a directory on the parent vmnt of the file that was the result of advance. For example, # cd / # echo foo > home where home is on a different (sub) partition than / is (default install). last_dir would resolve to / and advance would resolve to /home. With this change, last_dir resolves to the root node on the /home partition, making the assumption valid again.	2012-11-26 15:20:18 +00:00
David van Moolenbroek	7dd286e6b8	VFS: do not save device node for new regular files The VFS/FS protocol does not require the file server to supply a special device node number in response to a REQ_CREATE request, as this call creates only regular files. Therefore, VFS should not erroneously save this piece of information from the REQ_CREATE reply either.	2012-11-15 14:29:59 +00:00
Thomas Veerman	14e470be81	VFS: fix TOCTOU bug in sync	2012-11-14 13:24:53 +00:00
Thomas Veerman	ed23a7a7d2	VFS: fix reboot panic with mounted FUSE FS Upon reboot VFS semi-exits all processes and unmounts the file system. However, upon unmount, exiting FUSE file systems might need service from the file system (due to libc). As the FUSE process is halfway through the exit procedure, it doesn't have a valid root directory and working directory. Trying to do system calls then triggers a sanity check in VFS. This fix first exits normal processes which should then allow for unmounting FUSE file systems. Then VFS exits all processes including File Servers and unmounts the rest of the file system.	2012-11-14 13:18:16 +00:00
Thomas Veerman	badec36b33	VFS: fix deadlock when out of worker threads There is a deadlock vulnerability when there are no worker threads available and all of them blocked on a worker thread that's waiting for a reply from a driver or a reply from an FS that needs to make a back call. In these cases the deadlock resolver thread should kick in, but didn't in all cases. Moreover, POSIX calls from File Servers weren't handled properly anymore, which also could lead to deadlocks.	2012-11-14 13:12:37 +00:00
Arne Welzel	e35c4f78d2	VFS: fix check_bsf() locking The check_bsf() macro uses assert(mutex_trylock(&bsf_lock)) and assumes bsf_lock is locked afterwards. This breaks when compiling with NOASSERTS="yes". Also: macro to function transition.	2012-09-28 14:57:34 +02:00
Arne Welzel	7e1074732b	VFS: resolve unused parameter if NOASSERTS="yes" If VFS is compiled with NOASSERTS="yes", ctty_opcl() does not use the op parameter. Change to "non-assert()" sanity check.	2012-09-28 14:57:32 +02:00
Ben Gras	60014efb3e	vfs: pm_dumpcore: always clean up process . whenever this function is called, pm will expect the process to be cleaned up . so don't abort the process entirely on error . fixes a later 'forking on top of in-use child' vfs panic	2012-09-19 17:13:17 +02:00
Thomas Veerman	c087a60ed2	VFS: fix GCC compilation error	2012-09-17 15:29:38 +00:00
Thomas Veerman	3881e732a9	VFS: panic when unmount_all fails	2012-09-17 11:01:46 +00:00
Thomas Veerman	992799b91f	VFS: make all IPC asynchronous By decoupling synchronous drivers from VFS, we are a big step closer to supporting driver crashes under all circumstances. That is, VFS can't become stuck on IPC with a synchronous driver (e.g., INET) and can recover from crashing block drivers during open/close/ioctl or during communication with an FS. In order to maintain serialized communication with a synchronous driver, the communication is wrapped by a mutex on a per driver basis (not major numbers as there can be multiple majors with identical endpoints). Majors that share a driver endpoint point to a single mutex object. In order to support crashes from block drivers, the file reopen tactic had to be changed; first reopen files associated with the crashed driver, then send the new driver endpoint to FSes. This solves a deadlock between the FS and the block driver; - VFS would send REQ_NEW_DRIVER to an FS, but the FS only receives it after retrying the current request to the newly started driver. - The block driver would refuse the retried request until all files had been reopened. - VFS would reopen files only after getting a reply from the initial REQ_NEW_DRIVER. When a character special driver crashes, all associated files have to be marked invalid and closed (or reopened if flagged as such). However, they can only be closed if a thread holds exclusive access to it. To obtain exclusive access, the worker thread (which handles the new driver endpoint event from DS) schedules a new job to garbage collect invalid files. This way, we can signal the worker thread that was talking to the crashed driver and will release exclusive access to a file associated with the crashed driver and prevent the garbage collecting worker thread from deadlocking on that file. Also, when a character special driver crashes, RS will unmap the driver and remap it upon restart. 
During unmapping, associated files are marked invalid instead of waiting for an endpoint up event from DS, as that event might come later than new read/write/select requests and thus cause confusion in the freshly started driver. When locking a filp, the usage counters are no longer checked. The usage counter can legally go down to zero during filp invalidation while there are locks pending. DS events are handled by a separate worker thread instead of the main thread as reopening files could lead to another crash and a stuck thread. An additional worker thread is then necessary to unlock it. Finally, with everything asynchronous a race condition in do_select surfaced. A select entry was only marked in use after successfully sending initial select requests to drivers and having to wait. When multiple select() calls were handled there was opportunity that these entries were overwritten. This had as effect that some select results were ignored (and select() remained blocking instead of returning) or do_select tried to access filps that were not present (because thrown away by secondary select()). This bug manifested itself with sendrecs, but was very hard to reproduce. However, it became awfully easy to trigger with asynsends only.	2012-09-17 11:01:45 +00:00
Ben Gras	e4ac80eb60	various warning/errorwarning fixes for gcc47 . warnings (sometimes promoted to errors) in servers/ and kernel/ . -Os for ext2 boot module to make it small enough	2012-08-27 16:19:18 +02:00
Ben Gras	31d8526346	libexec: add load_offset feature, used for ld.so . ld.so is linked at 0 but it can relocate itself; we wish to load ld.so higher though to trap NULL dereferences. if we know we have to execute ld.so, vfs tells libexec to put it higher.	2012-08-12 23:22:54 +02:00
Thomas Veerman	66dbf73049	VFS: fix locking bug in clone_opcl When VFS runs out of vnodes after closing a vnode in opcl, common_open will try to unlock a vnode through unlock_filp that has already been unlocked in clone_opcl. By first obtaining and locking a new vnode this situation is prevented; if there are no free vnodes, common_open will unlock a still locked vnode.	2012-07-30 10:01:16 +00:00
Thomas Veerman	f6b0d662b5	VFS: check path components for NAME_MAX length	2012-07-30 09:44:58 +00:00
David van Moolenbroek	0b4c154160	VFS: call req_inhibread again	2012-07-19 14:36:51 +00:00
David van Moolenbroek	e0742978f1	VFS: do not resolve symlinks in rename(2)	2012-07-18 14:59:45 +00:00
Thomas Veerman	0d3ccd8908	VFS: fix coverity defects	2012-07-17 10:29:22 +00:00
Thomas Veerman	fd60f03129	VFS: remove support for sync FS communication	2012-07-17 10:12:53 +00:00
Thomas Veerman	06f49fe167	VFS: prevent buffer overflow If an FS returns faulty struct dirent data, VFS could overflow a buffer that holds this data.	2012-07-17 08:49:41 +00:00
Ben Gras	cbcdb838f1	various coverity-inspired fixes . some strncpy/strcpy to strlcpy conversions . new <minix/param.h> to avoid including other minix headers that have colliding definitions with library and commands code, causing parse warnings . removed some dead code / assignments	2012-07-16 14:00:56 +02:00
Thomas Veerman	77dbd766c1	VFS: Use safe string copy functions	2012-07-16 10:57:43 +00:00
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses is prefixed by an objcopy step with __k_unpaged_, so that that code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. 
Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. 
the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
Ben Gras	0fb2f83da9	drop from segments physcopy/vircopy invocations . sys_vircopy always uses D for both src and dst . sys_physcopy uses PHYS_SEG if and only if corresponding endpoint is NONE, so we can derive the mode (PHYS_SEG or D) from the endpoint arg in the kernel, dropping the seg args . fields in msg still filled in for backwards compatibility, using same NONE-logic in the library	2012-06-18 12:28:40 +00:00
Ben Gras	2bfeeed885	drop segment from safecopy invocations . all invocations were S or D, so can safely be dropped to prepare for the segmentless world . still assign D to the SCP_SEG field in the message to make previous kernels usable	2012-06-16 16:22:51 +00:00
Ben Gras	85ff5a947e	dumpcore: use ptrace function to trigger a coredump . dumpcore currently relies on minix segments . also ptrace dumpcore fix	2012-06-15 12:13:50 +02:00
Ben Gras	769af57274	further libexec generalization . new mode for sys_memset: include process so memset can be done in physical or virtual address space. . add a mode to mmap() that lets a process allocate uninitialized memory. . this allows an exec()er (RS, VFS, etc.) to request uninitialized memory from VM and selectively clear the ranges that don't come from a file, leaving no uninitialized memory left for the process to see. . use callbacks for clearing the process, clearing memory in the process, and copying into the process; so that the libexec code can be used from rs, vfs, and in the future, kernel (to load vm) and vm (to load boot-time processes)	2012-06-07 15:15:02 +02:00
Ben Gras	040362e379	exec() cleanup, generalization, improvement . make exec() callers (i.e. vfs and rs) determine the memory layout by explicitly reserving regions using mmap() calls on behalf of the exec()ing process, i.e. handling all of the exec logic, thereby eliminating all special exec() knowledge from VM. . the new procedure is: clear the exec()ing process first, then call third-party mmap()s to reserve memory, then copy the executable file section contents in, all using callbacks tailored to the caller's way of starting an executable . i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM as with rigid 2-section arguments . this naturally allows generalizing exec() by simply loading all ELF sections . drop/merge of lots of duplicate exec() code into libexec . not copying the code sections to vfs and into the executable again is a measurable performance improvement (about 3.3% faster for 'make' in src/servers/)	2012-06-07 15:15:01 +02:00
Ben Gras	41b869d4d6	drop aout support justification: soon we won't be able to execute sep I&D aouts at all (because of the vanishing segments), which was the default mode to generate them so most binaries will be sep I&D. this makes the vfs/rs exec() unification work simpler. after unification, common I&D aout could be added back quite simply.	2012-06-07 12:43:16 +02:00
David van Moolenbroek	1817f7fc07	VFS: fix "process already free" panic on reboot Reported by Claudiu Dan Gheorghe, debugged by Thomas and myself	2012-05-02 17:42:50 +02:00
Thomas Veerman	068d443d12	VFS: unlock vmnt when out of vnodes	2012-04-27 08:51:13 +00:00
Thomas Veerman	b6ff38065f	VFS: release what can be released Only attempt to release blocked processes that are blocked. There is no use in trying to find more blocked processes than we know that are blocked (on a pipe).	2012-04-27 08:51:02 +00:00
Thomas Veerman	7b81254069	VFS: simplify stat for pipes According to POSIX the st_size field of struct stat is undefined for fifos and anonymous pipes. Thus we can do anything we want. We save a copy by not being accurate on pipe sizes.	2012-04-27 08:50:49 +00:00
Thomas Veerman	db8198d99d	VFS: use S_IS* macros	2012-04-27 08:49:38 +00:00
Thomas Veerman	96bbc5da3e	VFS: I_PIPE is redundant Also, use S_IS* macros instead of manual comparison.	2012-04-27 08:49:38 +00:00
Ben Gras	755102d67f	AT_SUN_EXECNAME support . vfs: pass execname in aux vectors . ld.elf_so: use this to expand $ORIGIN . this requires the executable to reserve more space at exec() calling time	2012-04-26 13:32:39 +02:00
David van Moolenbroek	26f817243b	VFS: reimplement truncate mtime/ctime fix POSIX mandates that a file's modification and change time be left untouched upon truncate/ftruncate iff the file size does not change. However, an open(O_TRUNC) call must always update the modification and change time of the file, even if it was already zero-sized. VFS uses the file systems' truncate call to implement O_TRUNC. This patch replaces git-255ae85, which did not take into account the open case. The size check is now moved into VFS, so that individual file systems need not check for this case anymore.	2012-04-20 11:35:59 +02:00
Ben Gras	3945cfbfd3	block ioctls: pass request number	2012-04-18 11:01:15 +02:00
Ben Gras	53002f6f6c	recognize and execute dynamically linked executables . generalize libexec slightly to get some more necessary information from ELF files, e.g. the interpreter . execute dynamically linked executables when exec()ed by VFS . switch to netbsd variant of elf32.h exclusively, solves some conflicting headers	2012-04-16 00:41:42 +00:00
Thomas Veerman	26ec619a30	VFS: fix filp reuse race Pipes consist of two filps (read filp and write filp) and a shared vnode. When the writer leaves, the filp reference count drops to zero and subsequent find_filp()s should not find the filp when a reader looks for it and the reader gets EOF. However, the pipe() system call tries to find two filps, marks them in use, and only after a successful node creation on PFS, overwrites the shared vnode with the new vnode. Consequently, this leaves a small window where a just-closed 'pipe write filp' gets reused and marked as present, before becoming the actual new 'pipe write filp' for a new pipe. A reader for the old pipe will think a writer is present and wait for that writer to write something or to leave; both actions should revive the suspended reader. This will never happen and the reader will be stuck forever.	2012-04-13 13:22:57 +00:00
Thomas Veerman	e292ba487e	VFS: more three-level-lock sanity checking	2012-04-13 13:22:42 +00:00
Thomas Veerman	933120b0b1	VFS: add getting active threads control msg	2012-04-13 13:21:01 +00:00
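
The timestamp rule described in commit 26f817243b (truncate mtime/ctime fix) can be sketched as a single predicate. This is a minimal illustration in C, not the actual VFS code: the function name `truncate_updates_times` and its parameters are hypothetical, and it only encodes the rule as stated in the commit message.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not taken from the MINIX sources).
 * Encodes the rule from the commit message:
 *  - open(O_TRUNC) always updates mtime/ctime, even for an empty file;
 *  - truncate()/ftruncate() update them only if the size changes.
 */
bool truncate_updates_times(off_t cur_size, off_t new_size,
                            bool via_open_trunc)
{
    if (via_open_trunc)
        return true;              /* O_TRUNC: always touch the times */
    return cur_size != new_size;  /* plain truncate: only on a size change */
}
```

With the check in one place, a caller could skip the file system's truncate request entirely when the predicate is false, which matches the commit's note that individual file systems no longer need to test for this case.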
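
Commit 0fb2f83da9 notes that the copy mode for sys_physcopy can be derived from the endpoint argument alone. A rough sketch of that derivation follows. All identifiers and numeric values here (`sketch_endpoint_t`, `SKETCH_NONE`, `SKETCH_PHYS_SEG`, `SKETCH_D`) are placeholders invented for the illustration; the real constants and types live in the MINIX headers and may differ.

```c
/* Placeholder type and values, assumed for illustration only. */
typedef int sketch_endpoint_t;

#define SKETCH_NONE      (-1)  /* stands in for the NONE endpoint */
#define SKETCH_D         1     /* stands in for the D (data) mode */
#define SKETCH_PHYS_SEG  2     /* stands in for the PHYS_SEG mode */

/*
 * Derive the copy mode from the endpoint, as the commit describes:
 * PHYS_SEG if and only if the endpoint is NONE, D otherwise.
 */
int sketch_derive_mode(sketch_endpoint_t ep)
{
    return (ep == SKETCH_NONE) ? SKETCH_PHYS_SEG : SKETCH_D;
}
```

Because the mode is a pure function of the endpoint, the separate segment arguments carry no extra information, which is why the commit can drop them from the call while still filling in the message fields for older kernels.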


378 commits