minix

Author	SHA1	Message	Date
Thomas Veerman	77dbd766c1	VFS: Use safe string copy functions	2012-07-16 10:57:43 +00:00
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses is prefixed by an objcopy step with __k_unpaged_, so that that code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
Ben Gras	769af57274	further libexec generalization . new mode for sys_memset: include process so memset can be done in physical or virtual address space. . add a mode to mmap() that lets a process allocate uninitialized memory. . this allows an exec()er (RS, VFS, etc.) to request uninitialized memory from VM and selectively clear the ranges that don't come from a file, leaving no uninitialized memory left for the process to see. . use callbacks for clearing the process, clearing memory in the process, and copying into the process; so that the libexec code can be used from rs, vfs, and in the future, kernel (to load vm) and vm (to load boot-time processes)	2012-06-07 15:15:02 +02:00
Ben Gras	040362e379	exec() cleanup, generalization, improvement . make exec() callers (i.e. vfs and rs) determine the memory layout by explicitly reserving regions using mmap() calls on behalf of the exec()ing process, i.e. handling all of the exec logic, thereby eliminating all special exec() knowledge from VM. . the new procedure is: clear the exec()ing process first, then call third-party mmap()s to reserve memory, then copy the executable file section contents in, all using callbacks tailored to the caller's way of starting an executable . i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM as with rigid 2-section arguments . this naturally allows generalizing exec() by simply loading all ELF sections . drop/merge of lots of duplicate exec() code into libexec . not copying the code sections to vfs and into the executable again is a measurable performance improvement (about 3.3% faster for 'make' in src/servers/)	2012-06-07 15:15:01 +02:00
Ben Gras	41b869d4d6	drop aout support justification: soon we won't be able to execute sep I&D aouts at all (because of the vanishing segments), which was the default mode to generate them so most binaries will be sep I&D. this makes the vfs/rs exec() unification work simpler. after unification, common I&D aout could be added back quite simply.	2012-06-07 12:43:16 +02:00
Thomas Veerman	7b81254069	VFS: simplify stat for pipes According to POSIX the st_size field of struct stat is undefined for fifos and anonymous pipes. Thus we can do anything we want. We save a copy by not being accurate on pipe sizes.	2012-04-27 08:50:49 +00:00
Thomas Veerman	db8198d99d	VFS: use S_IS* macros	2012-04-27 08:49:38 +00:00
Ben Gras	755102d67f	AT_SUN_EXECNAME support . vfs: pass execname in aux vectors . ld.elf_so: use this to expand $ORIGIN . this requires the executable to reserve more space at exec() calling time	2012-04-26 13:32:39 +02:00
Ben Gras	53002f6f6c	recognize and execute dynamically linked executables . generalize libexec slightly to get some more necessary information from ELF files, e.g. the interpreter . execute dynamically linked executables when exec()ed by VFS . switch to netbsd variant of elf32.h exclusively, solves some conflicting headers	2012-04-16 00:41:42 +00:00
Thomas Veerman	b956493367	VFS: fix new signed/unsigned comparisons	2012-04-13 13:00:11 +00:00
Thomas Veerman	8f55767619	VFS: make m_in job local By making m_in job local (i.e., each job has its own copy of m_in instead of refering to the global m_in) we don't have to store and restore m_in on every thread yield. This reduces overhead. Moreover, remove the assumption that m_in is preserved. Do_XXX functions have to copy the system call parameters as soon as possible and only pass those copies to other functions. Furthermore, this patch cleans up some code and uses better types in a lot of places.	2012-04-13 12:50:38 +00:00
Ben Gras	7336a67dfe	retire PUBLIC, PRIVATE and FORWARD	2012-03-25 21:58:14 +02:00
Ben Gras	6a73e85ad1	retire _PROTOTYPE . only good for obsolete K&R support . also remove a stray ansi.h and the proto cmd	2012-03-25 16:17:10 +02:00
Thomas Veerman	80c4685324	VFS: replace VFS with AVFS	2012-02-13 16:53:21 +00:00
Thomas Veerman	1fc399a5c1	Add permission test for bind and socket Also, apply forbidden patch to VFS from AVFS (fixes hanging test56 if it has the permission test).	2012-01-30 15:16:20 +00:00
Ben Gras	34a8901eb8	vfs,avfs: verify an interpreter was found on #! line . if not, NULL *interp is dereferenced	2011-12-21 23:44:13 +01:00
Thomas Veerman	b4fb061802	Implement issetugid syscall Implement issetugid syscall and provide a test. This gets rid of the scary "Unsecure. Implement me" warning during compilation.	2011-11-28 10:03:43 +00:00
Arun Thomas	530bd5d486	vfs/rs: for ELF, sep_id should be 0	2011-07-26 15:21:07 +02:00
Evgeniy Ivanov	ef0a265086	New stat structure. * VFS and installed MFSes must be in sync before and after this change * Use struct stat from NetBSD. It requires adding new STAT, FSTAT and LSTAT syscalls. Libc modification is both backward and forward compatible. Also new struct stat uses modern field sizes to avoid ABI incompatibility, when we update uid_t, gid_t and company. Exceptions are ino_t and off_t in old libc (though paddings added).	2011-07-12 16:39:55 +02:00
Ben Gras	674cd6fd48	larger i/o buffer for exec() . makes exec() for large executables (e.g. clang, gcc) significantly faster Thanks to Antoine Leca.	2011-05-12 19:12:28 +02:00
Thomas Veerman	f0740680cd	Do not print an error message when a binary is corrupt	2011-04-12 13:09:19 +00:00
Arun Thomas	cd9b4b46f4	libexec: return physaddr info from ELF headers	2011-04-07 12:22:36 +00:00
Ben Gras	f0f34dd8d9	vfs - use a static buffer instead of malloc()+free(), solving recently appeared ENOMEM problems during exec().	2010-12-15 14:43:59 +00:00
Arun Thomas	372b873413	VFS/RS support for ELF	2010-12-10 09:27:56 +00:00
Dirk Vogt	9ed280d1ec	decouple file system server start/termination from mount/umount	2010-11-23 19:34:56 +00:00
Thomas Veerman	13ef7f1f38	Prepare VFS to support back calls from PFS. For security reasons and to support file descriptor passing, PFS does some back calls to VFS. For example, to verify the validity of a path provided by a process and to tell VFS it must copy file descriptors from one process to another.	2010-08-30 13:44:07 +00:00
Thomas Veerman	34a2864e27	Fix a few compile time warnings	2010-07-02 12:41:19 +00:00
Arun Thomas	1bf6d23f34	Make exec() use entry point in a.out header	2010-06-10 14:59:10 +00:00
Arun Thomas	f0a158d8c1	More cleanup to remove MM and FS references	2010-06-10 14:04:46 +00:00
Arun Thomas	4c10a31440	Remove legacy MM, FS, and FS_PROC_NR macros	2010-06-08 13:58:01 +00:00
Tomas Hruby	6e25ad8b0a	Use of all NIL_* defines converted to NULL	2010-05-10 13:26:00 +00:00
Kees van Reeuwijk	86a23c1fbd	Remove U16_t and most other similar types. Rewrite functions to ansi-style declaration if necessary.	2010-04-21 11:05:22 +00:00
Kees van Reeuwijk	bc314bda91	Remove the types Dev_t, _mnx_Gui, _mnx_Uid, and similar. Use ANSI-style function declarations where necessary.	2010-04-13 10:58:41 +00:00
Cristiano Giuffrida	48c6bb79f4	Driver refactory for live update and crash recovery. SYSLIB CHANGES: - DS calls to publish / retrieve labels consider endpoints instead of u32_t. VFS CHANGES: - mapdriver() only adds an entry in the dmap table in VFS. - dev_up() is only executed upon reception of a driver up event. INET CHANGES: - INET no longer searches for existing drivers instances at startup. - A newtwork driver is (re)initialized upon reception of a driver up event. - Networking startup is now race-free by design. No need to waste 5 seconds at startup any more. DRIVER CHANGES: - Every driver publishes driver up events when starting for the first time or in case of restart when recovery actions must be taken in the upper layers. - Driver up events are published by drivers through DS. - For regular drivers, VFS is normally the only subscriber, but not necessarily. For instance, when the filter driver is in use, it must subscribe to driver up events to initiate recovery. - For network drivers, inet is the only subscriber for now. - Every VFS driver is statically linked with libdriver, every network driver is statically linked with libnetdriver. DRIVER LIBRARIES CHANGES: - Libdriver is extended to provide generic receive() and ds_publish() interfaces for VFS drivers. - driver_receive() is a wrapper for sef_receive() also used in driver_task() to discard spurious messages that were meant to be delivered to a previous version of the driver. - driver_receive_mq() is the same as driver_receive() but integrates support for queued messages. - driver_announce() publishes a driver up event for VFS drivers and marks the driver as initialized and expecting a DEV_OPEN message. - Libnetdriver is introduced to provide similar receive() and ds_publish() interfaces for network drivers (netdriver_announce() and netdriver_receive()). - Network drivers all support live update with no state transfer now. KERNEL CHANGES: - Added kernel call statectl for state management. Used by driver_announce() to unblock eventual callers sendrecing to the driver.	2010-04-08 13:41:35 +00:00
Thomas Veerman	958b25be50	- Introduce support for sticky bit. - Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly. - Clean up MFS by removing old, dead code (backwards compatibility is broken by the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all functions have proper banners and prototypes. - VFS should always provide a (syntactically) valid path to the FS; no need for the FS to do sanity checks when leaving/entering mount points. - Fix several bugs in MFS: - Several path lookup bugs in MFS. - A link can be too big for the path buffer. - A mountpoint can become inaccessible when the creation of a new inode fails, because the inode already exists and is a mountpoint. - Introduce support for supplemental groups. - Add test 46 to test supplemental group functionality (and removed obsolete suppl. tests from test 2). - Clean up VFS (not everything is done yet). - ISOFS now opens device read-only. This makes the -r flag in the mount command unnecessary (but will still report to be mounted read-write). - Introduce PipeFS. PipeFS is a new FS that handles all anonymous and named pipes. However, named pipes still reside on the (M)FS, as they are part of the file system on disk. To make this work VFS now has a concept of 'mapped' inodes, which causes read, write, truncate and stat requests to be redirected to the mapped FS, and all other requests to the original FS.	2009-12-20 20:27:14 +00:00
Ben Gras	8d9aa1fe4f	throw out exec debugging message.	2009-09-30 08:36:13 +00:00
Ben Gras	f5459e38db	- some exec debugging prints when errors happen - lookup mounted_on check to avoid NULL dereference - some errors in exec() shouldn't be fatal	2009-09-21 14:49:26 +00:00
Ben Gras	c078ec0331	Basic VM and other minor improvements. Not complete, probably not fully debugged or optimized.	2008-11-19 12:26:10 +00:00
Philip Homburg	f46319037b	New VFS interface	2007-08-07 12:52:47 +00:00
Ben Gras	dc67b37a10	more removing of warning and debug messages.	2007-04-13 14:00:31 +00:00
Philip Homburg	9092146be7	VFS cleanup (mostly open).	2007-01-05 16:36:55 +00:00
Ben Gras	da42185e1c	Removed verbose statements from vfs and mfs	2006-12-22 11:54:42 +00:00
Philip Homburg	bafc45a309	First cut at 64-bit file offsets in block devices for mkfs/fsck.	2006-11-27 14:21:43 +00:00
Ben Gras	fa0ba56bc9	Merge of VFS by Balasz Gerofi with Minix trunk.	2006-10-25 13:40:36 +00:00

44 commits