minix

Author	SHA1	Message	Date
Thomas Veerman	badec36b33	VFS: fix deadlock when out of worker threads There is a deadlock vulnerability when there are no worker threads available and all of them blocked on a worker thread that's waiting for a reply from a driver or a reply from an FS that needs to make a back call. In these cases the deadlock resolver thread should kick in, but didn't in all cases. Moreover, POSIX calls from File Servers weren't handled properly anymore, which also could lead to deadlocks.	2012-11-14 13:12:37 +00:00
Arne Welzel	e35c4f78d2	VFS: fix check_bsf() locking The check_bsf() macro uses assert(mutex_trylock(&bsf_lock)) and assumes bsf_lock is locked afterwards. This breaks when compiling with NOASSERTS="yes". Also: macro to function transition.	2012-09-28 14:57:34 +02:00
Arne Welzel	7e1074732b	VFS: resolve unused parameter if NOASSERTS="yes" If VFS is compiled with NOASSERTS="yes", ctty_opcl() does not use the op parameter. Change to "non-assert()" sanity check.	2012-09-28 14:57:32 +02:00
Ben Gras	60014efb3e	vfs: pm_dumpcore: always clean up process . whenever this function is called, pm will expect the process to be cleaned up . so don't abort the process entirely on error . fixes a later 'forking on top of in-use child' vfs panic	2012-09-19 17:13:17 +02:00
Thomas Veerman	c087a60ed2	VFS: fix GCC compilation error	2012-09-17 15:29:38 +00:00
Thomas Veerman	3881e732a9	VFS: panic when unmount_all fails	2012-09-17 11:01:46 +00:00
Thomas Veerman	992799b91f	VFS: make all IPC asynchronous By decoupling synchronous drivers from VFS, we are a big step closer to supporting driver crashes under all circumstances. That is, VFS can't become stuck on IPC with a synchronous driver (e.g., INET) and can recover from crashing block drivers during open/close/ioctl or during communication with an FS. In order to maintain serialized communication with a synchronous driver, the communication is wrapped by a mutex on a per driver basis (not major numbers as there can be multiple majors with identical endpoints). Majors that share a driver endpoint point to a single mutex object. In order to support crashes from block drivers, the file reopen tactic had to be changed; first reopen files associated with the crashed driver, then send the new driver endpoint to FSes. This solves a deadlock between the FS and the block driver; - VFS would send REQ_NEW_DRIVER to an FS, but he FS only receives it after retrying the current request to the newly started driver. - The block driver would refuse the retried request until all files had been reopened. - VFS would reopen files only after getting a reply from the initial REQ_NEW_DRIVER. When a character special driver crashes, all associated files have to be marked invalid and closed (or reopened if flagged as such). However, they can only be closed if a thread holds exclusive access to it. To obtain exclusive access, the worker thread (which handles the new driver endpoint event from DS) schedules a new job to garbage collect invalid files. This way, we can signal the worker thread that was talking to the crashed driver and will release exclusive access to a file associated with the crashed driver and prevent the garbage collecting worker thread from dead locking on that file. Also, when a character special driver crashes, RS will unmap the driver and remap it upon restart. During unmapping, associated files are marked invalid instead of waiting for an endpoint up event from DS, as that event might come later than new read/write/select requests and thus cause confusion in the freshly started driver. When locking a filp, the usage counters are no longer checked. The usage counter can legally go down to zero during filp invalidation while there are locks pending. DS events are handled by a separate worker thread instead of the main thread as reopening files could lead to another crash and a stuck thread. An additional worker thread is then necessary to unlock it. Finally, with everything asynchronous a race condition in do_select surfaced. A select entry was only marked in use after succesfully sending initial select requests to drivers and having to wait. When multiple select() calls were handled there was opportunity that these entries were overwritten. This had as effect that some select results were ignored (and select() remained blocking instead if returning) or do_select tried to access filps that were not present (because thrown away by secondary select()). This bug manifested itself with sendrecs, but was very hard to reproduce. However, it became awfully easy to trigger with asynsends only.	2012-09-17 11:01:45 +00:00
Ben Gras	e4ac80eb60	various warning/errorwarning fixes for gcc47 . warnings (sometimes promoted to errors) in servers/ and kernel/ . -Os for ext2 boot module to make it small enough	2012-08-27 16:19:18 +02:00
Ben Gras	31d8526346	libexec: add load_offset feature, used for ld.so . ld.so is linked at 0 but it can relocate itself; we wish to load ld.so higher though to trap NULL dereferences. if we know we have to execute ld.so, vfs tells libexec to put it higher.	2012-08-12 23:22:54 +02:00
Thomas Veerman	66dbf73049	VFS: fix locking bug in clone_opcl When VFS runs out of vnodes after closing a vnode in opcl, common_open will try to unlock a vnode through unlock_filp that has already been unlocked in clone_opcl. By first obtaining and locking a new vnode this situation is prevented; if there are no free vnodes, common_open will unlock a still locked vnode.	2012-07-30 10:01:16 +00:00
Thomas Veerman	f6b0d662b5	VFS: check path components for NAME_MAX length	2012-07-30 09:44:58 +00:00
David van Moolenbroek	0b4c154160	VFS: call req_inhibread again	2012-07-19 14:36:51 +00:00
David van Moolenbroek	e0742978f1	VFS: do not resolve symlinks in rename(2)	2012-07-18 14:59:45 +00:00
Thomas Veerman	0d3ccd8908	VFS: fix coverity defects	2012-07-17 10:29:22 +00:00
Thomas Veerman	fd60f03129	VFS: remove support for sync FS communication	2012-07-17 10:12:53 +00:00
Thomas Veerman	06f49fe167	VFS: prevent buffer overflow If an FS returns faulty struct dirent data, VFS could overflow a buffer that holds this data.	2012-07-17 08:49:41 +00:00
Ben Gras	cbcdb838f1	various coverity-inspired fixes . some strncpy/strcpy to strlcpy conversions . new <minix/param.h> to avoid including other minix headers that have colliding definitions with library and commands code, causing parse warnings . removed some dead code / assignments	2012-07-16 14:00:56 +02:00
Thomas Veerman	77dbd766c1	VFS: Use safe string copy functions	2012-07-16 10:57:43 +00:00
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses is prefixed by an objcopy step with __k_unpaged_, so that that code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
Ben Gras	0fb2f83da9	drop from segments physcopy/vircopy invocations . sys_vircopy always uses D for both src and dst . sys_physcopy uses PHYS_SEG if and only if corresponding endpoint is NONE, so we can derive the mode (PHYS_SEG or D) from the endpoint arg in the kernel, dropping the seg args . fields in msg still filled in for backwards compatability, using same NONE-logic in the library	2012-06-18 12:28:40 +00:00
Ben Gras	2bfeeed885	drop segment from safecopy invocations . all invocations were S or D, so can safely be dropped to prepare for the segmentless world . still assign D to the SCP_SEG field in the message to make previous kernels usable	2012-06-16 16:22:51 +00:00
Ben Gras	85ff5a947e	dumpcore: use ptrace function to trigger a coredump . dumpcore currently relies on minix segments . also ptrace dumpcore fix	2012-06-15 12:13:50 +02:00
Ben Gras	769af57274	further libexec generalization . new mode for sys_memset: include process so memset can be done in physical or virtual address space. . add a mode to mmap() that lets a process allocate uninitialized memory. . this allows an exec()er (RS, VFS, etc.) to request uninitialized memory from VM and selectively clear the ranges that don't come from a file, leaving no uninitialized memory left for the process to see. . use callbacks for clearing the process, clearing memory in the process, and copying into the process; so that the libexec code can be used from rs, vfs, and in the future, kernel (to load vm) and vm (to load boot-time processes)	2012-06-07 15:15:02 +02:00
Ben Gras	040362e379	exec() cleanup, generalization, improvement . make exec() callers (i.e. vfs and rs) determine the memory layout by explicitly reserving regions using mmap() calls on behalf of the exec()ing process, i.e. handling all of the exec logic, thereby eliminating all special exec() knowledge from VM. . the new procedure is: clear the exec()ing process first, then call third-party mmap()s to reserve memory, then copy the executable file section contents in, all using callbacks tailored to the caller's way of starting an executable . i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM as with rigid 2-section arguments . this naturally allows generalizing exec() by simply loading all ELF sections . drop/merge of lots of duplicate exec() code into libexec . not copying the code sections to vfs and into the executable again is a measurable performance improvement (about 3.3% faster for 'make' in src/servers/)	2012-06-07 15:15:01 +02:00
Ben Gras	41b869d4d6	drop aout support justification: soon we won't be able to execute sep I&D aouts at all (because of the vanishing segments), which was the default mode to generate them so most binaries will be sep I&D. this makes the vfs/rs exec() unification work simpler. after unification, common I&D aout could be added back quite simply.	2012-06-07 12:43:16 +02:00
David van Moolenbroek	1817f7fc07	VFS: fix "process already free" panic on reboot Reported by Claudiu Dan Gheorghe, debugged by Thomas and myself	2012-05-02 17:42:50 +02:00
Thomas Veerman	068d443d12	VFS: unlock vmnt when out of vnodes	2012-04-27 08:51:13 +00:00
Thomas Veerman	b6ff38065f	VFS: release what can be released Only attempt to release blocked processes that are blocked. There is no use in trying to find more blocked processes than we know that are blocked (on a pipe).	2012-04-27 08:51:02 +00:00
Thomas Veerman	7b81254069	VFS: simplify stat for pipes According to POSIX the st_size field of struct stat is undefined for fifos and anonymous pipes. Thus we can do anything we want. We save a copy by not being accurate on pipe sizes.	2012-04-27 08:50:49 +00:00
Thomas Veerman	db8198d99d	VFS: use S_IS* macros	2012-04-27 08:49:38 +00:00
Thomas Veerman	96bbc5da3e	VFS: I_PIPE is redundant Also, use S_IS* macros instead of manual comparison.	2012-04-27 08:49:38 +00:00
Ben Gras	755102d67f	AT_SUN_EXECNAME support . vfs: pass execname in aux vectors . ld.elf_so: use this to expand $ORIGIN . this requires the executable to reserve more space at exec() calling time	2012-04-26 13:32:39 +02:00
David van Moolenbroek	26f817243b	VFS: reimplement truncate mtime/ctime fix POSIX mandates that a file's modification and change time be left untouched upon truncate/ftruncate iff the file size does not change. However, an open(O_TRUNC) call must always update the modification and change time of the file, even if it was already zero-sized. VFS uses the file systems' truncate call to implement O_TRUNC. This patch replaces git-255ae85, which did not take into account the open case. The size check is now moved into VFS, so that individual file systems need not check for this case anymore.	2012-04-20 11:35:59 +02:00
Ben Gras	3945cfbfd3	block ioctls: pass request number	2012-04-18 11:01:15 +02:00
Ben Gras	53002f6f6c	recognize and execute dynamically linked executables . generalize libexec slightly to get some more necessary information from ELF files, e.g. the interpreter . execute dynamically linked executables when exec()ed by VFS . switch to netbsd variant of elf32.h exclusively, solves some conflicting headers	2012-04-16 00:41:42 +00:00
Thomas Veerman	26ec619a30	VFS: fix filp reuse race Pipes consist of two filps (read filp and write filp) and a shared vnode. When the writer leaves the filp reference count drops to zero and subsequent find_filp()s should not find the filp when a reader looks for it and the reader gets EOF. However, the pipe() system call tries to find two filps, marks them in use, and only after a successful node creation on PFS, overwrites the shared vnode with the new vnode. Consequently, this leaves a small window where a just closed 'pipe write filp' gets reused and marked as present, before becoming the actual new 'pipe write filp' for a new pipe. A reader for the old pipe will think a writer is present and wait for that writer to write something or to leave; both actions should revive the suspended reader. This will never happen and the reader will be stuck forever.	2012-04-13 13:22:57 +00:00
Thomas Veerman	e292ba487e	VFS: more three-level-lock sanity checking	2012-04-13 13:22:42 +00:00
Thomas Veerman	933120b0b1	VFS: add getting active threads control msg	2012-04-13 13:21:01 +00:00
Thomas Veerman	e1a73469c8	VFS: remove debug print	2012-04-13 13:20:28 +00:00
Thomas Veerman	c2bb739760	VFS: let know when skipping reply	2012-04-13 13:19:45 +00:00
Thomas Veerman	91a38b6d4e	VFS: fix dead lock When running out of worker threads to handle device replies a dead lock resolver thread is used. However, it was only used for FS endpoints; it is now used for "system processes" (drivers and FS endpoints). Also, drivers were marked as system process when they were not "forced" to map (i.e., mapping was done before endpoint was alive).	2012-04-13 13:19:10 +00:00
Thomas Veerman	b956493367	VFS: fix new signed/unsigned comparisons	2012-04-13 13:00:11 +00:00
Thomas Veerman	defe329519	VFS: warnings are errors	2012-04-13 12:59:32 +00:00
Thomas Veerman	0d63d9e125	VFS: enable sending control messages	2012-04-13 12:54:55 +00:00
Thomas Veerman	f571466c56	VFS: find job only if request is an transaction	2012-04-13 12:52:52 +00:00
Thomas Veerman	8f55767619	VFS: make m_in job local By making m_in job local (i.e., each job has its own copy of m_in instead of refering to the global m_in) we don't have to store and restore m_in on every thread yield. This reduces overhead. Moreover, remove the assumption that m_in is preserved. Do_XXX functions have to copy the system call parameters as soon as possible and only pass those copies to other functions. Furthermore, this patch cleans up some code and uses better types in a lot of places.	2012-04-13 12:50:38 +00:00
Ben Gras	1e2b3f4326	vfs: more regions for coredumps	2012-04-12 14:29:59 +02:00
Ben Gras	204ae72525	retire _ANSI and <minix/ansi.h>	2012-03-25 21:58:27 +02:00
Ben Gras	7336a67dfe	retire PUBLIC, PRIVATE and FORWARD	2012-03-25 21:58:14 +02:00
Ben Gras	6a73e85ad1	retire _PROTOTYPE . only good for obsolete K&R support . also remove a stray ansi.h and the proto cmd	2012-03-25 16:17:10 +02:00

1 2 3 4 5 ...

266 commits