minix

Author	SHA1	Message	Date
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses is prefixed by an objcopy step with __k_unpaged_, so that that code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. 
Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. 
the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
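The addressing rules in the commit above (processes stay below 0xF0000000, and the copy functions treat a process endpoint of NONE as meaning "physical address") can be illustrated with a minimal C sketch. The names `SK_KERNEL_VIRT_BASE`, `SK_EP_NONE`, and `sk_classify_addr` are hypothetical and do not come from the Minix source; the value chosen for `SK_EP_NONE` is a placeholder.

```c
#include <stdint.h>

/* Assumption: kernel mappings begin here, following the commit message's
 * remark that processes won't use 0xF0000000 and above. */
#define SK_KERNEL_VIRT_BASE 0xF0000000u

/* Hypothetical stand-in for the NONE endpoint; the real value may differ. */
#define SK_EP_NONE (-1)

enum sk_addr_kind {
    SK_ADDR_PHYSICAL,
    SK_ADDR_USER_VIRTUAL,
    SK_ADDR_KERNEL_VIRTUAL
};

/* Classify an (endpoint, address) pair as the message describes:
 * endpoint NONE means the address is physical; any other endpoint means
 * it is virtual, and the high range belongs to the kernel. */
enum sk_addr_kind sk_classify_addr(int endpoint, uint32_t addr)
{
    if (endpoint == SK_EP_NONE)
        return SK_ADDR_PHYSICAL;
    if (addr >= SK_KERNEL_VIRT_BASE)
        return SK_ADDR_KERNEL_VIRTUAL;
    return SK_ADDR_USER_VIRTUAL;
}
```

With only two kinds of address left, a single endpoint argument is enough to tell a copy routine how to interpret the address it is given.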
Ben Gras	230b7775fe	changes for detecting and building for clang/binutils elf and minor fixes: . add ack/clean target to lib, 'unify' clean target . add includes as library dependency . mk: exclude warning options clang doesn't have in non-gcc . set -e in lib/*.sh build files . clang compile error circumvention (disable NOASSERTS for release builds)	2011-06-07 16:49:52 +02:00
David van Moolenbroek	a7285dfabc	Kernel/RS: fix permission computation with 32+ system processes	2010-12-07 10:32:42 +00:00
Tomas Hruby	63e2d73d1b	Fixed brackets in bitmap macros	2010-03-30 08:34:33 +00:00
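The commit message does not say which brackets were fixed, so the following is only a generic C sketch of why bitmap macros need every parameter parenthesized. All `SK_` names are hypothetical and are not the macros from the Minix headers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t sk_chunk_t;
#define SK_CHUNK_BITS 32u

static_assert(sizeof(sk_chunk_t) * 8 == SK_CHUNK_BITS, "chunk width mismatch");

/* Each parameter and each full expansion is parenthesized, so an argument
 * such as `base + 1` is evaluated as a unit. */
#define SK_INDEX(bit)          ((bit) / SK_CHUNK_BITS)
#define SK_MASK(bit)           ((sk_chunk_t)1 << ((bit) % SK_CHUNK_BITS))
#define SK_GET_BIT(map, bit)   (((map)[SK_INDEX(bit)] & SK_MASK(bit)) != 0)
#define SK_SET_BIT(map, bit)   ((map)[SK_INDEX(bit)] |= SK_MASK(bit))
#define SK_UNSET_BIT(map, bit) ((map)[SK_INDEX(bit)] &= ~SK_MASK(bit))

/* Deliberately broken variant for contrast: with the argument `32 + 32`
 * the division binds to the second operand only. */
#define SK_BAD_INDEX(bit)      (bit / SK_CHUNK_BITS)
```

For example, `SK_INDEX(32 + 32)` selects chunk 2, while `SK_BAD_INDEX(32 + 32)` expands to `32 + 32 / 32u` and selects chunk 33.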
Tomas Hruby	1b56fdb33c	Time accounting based on TSC - as there are still KERNEL and IDLE entries, time accounting for kernel and idle time works the same as for any other process - every time we stop accounting for the currently running process, kernel or idle, we read the TSC counter and increment the p_cycles entry. - the process cycles inherently include some of the kernel cycles as we can stop accounting for the process only after we save its context and we start accounting just before we restore its context - this assumes that the system does not scale the CPU frequency which will be true for ... long time ;-)	2010-02-10 15:36:54 +00:00
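The accounting scheme described in the commit above can be sketched in C as follows. This is an illustration under stated assumptions, not the kernel's code: the counter is simulated by a plain variable instead of the hardware TSC, and every `sk_` name is hypothetical. Only the `p_cycles` field name is taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_proc {
    const char *name;
    uint64_t p_cycles;          /* cycles charged to this entry so far */
};

/* Simulated timestamp counter; a real kernel would read the CPU's TSC. */
uint64_t sk_fake_tsc;

uint64_t sk_last_switch;        /* counter value when accounting last started */
struct sk_proc *sk_current;     /* entry currently being charged, or NULL */

/* Stop charging the current entry and start charging `next`. KERNEL and
 * IDLE are ordinary entries here, so they are handled like any process. */
void sk_account_switch(struct sk_proc *next)
{
    uint64_t now = sk_fake_tsc;

    assert(now >= sk_last_switch);  /* assumes a monotonic counter */
    if (sk_current != NULL)
        sk_current->p_cycles += now - sk_last_switch;
    sk_last_switch = now;
    sk_current = next;
}
```

Because the charge is computed only at switch points, whatever the kernel does between saving one context and restoring the next is billed to whichever entry was current, which matches the message's note that process cycles include some kernel cycles.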
Tomas Hruby	c6fec6866f	No locking in kernel code - No locking in RTS_(UN)SET macros - No lock_notify() - Removed unused lock_send() - No lock/unlock macros anymore	2010-02-09 15:26:58 +00:00
Tomas Hruby	728f0f0c49	Removal of the system task * Userspace change to use the new kernel calls - _taskcall(SYSTASK...) changed to _kernel_call(...) - int 32 reused for the kernel calls - _do_kernel_call() to make the trap to kernel - kernel_call() to make the actual kernel call from C using _do_kernel_call() - unlike ipc call the kernel call always succeeds as kernel is always available, however, kernel may return an error * Kernel side implementation of kernel calls - the SYSTEM task does not run, only the proc table entry is preserved - every data_copy(SYSTEM is now data_copy(KERNEL - "locking" is an empty operation now as everything runs in kernel - sys_task() is replaced by kernel_call() which copies the message into kernel, dispatches the call to its handler and finishes by either copying the results back to userspace (if need be) or by suspending the process because of VM - suspended processes are later made runnable once the memory issue is resolved, picked up by the scheduler and only at this time the call is resumed (in fact restarted) which does not need to copy the message from userspace as the message is already saved in the process structure. - no need for the vmrestart queue, the scheduler will restart the system calls - no special case in do_vmctl(), all requests remove the RTS_VMREQUEST flag	2010-02-09 15:20:09 +00:00
Kees van Reeuwijk	a701e290f7	Removed unused symbols. Made some functions PRIVATE, including ones that aren't used anywhere.	2010-01-25 18:13:48 +00:00
Cristiano Giuffrida	d1fd04e72a	Initialization protocol for system services. SYSLIB CHANGES: - SEF framework now supports a new SEF Init request type from RS. 3 different callbacks are available (init_fresh, init_lu, init_restart) to specify initialization code when a service starts fresh, starts after a live update, or restarts. SYSTEM SERVICE CHANGES: - Initialization code for system services is now enclosed in a callback SEF will automatically call at init time. The return code of the callback will tell RS whether the initialization completed successfully. - Each init callback can access information passed by RS to initialize. As of now, each system service has access to the public entries of RS's system process table to gather all the information required to initialize. This design eliminates many existing or potential races at boot time and provides a uniform initialization interface to system services. The same interface will be reused for the upcoming publish/subscribe model to handle dynamic registration / deregistration of system services. VM CHANGES: - Uniform privilege management for all system services. Every service uses the same call mask format. For boot services, VM copies the call mask from init data. For dynamic services, VM still receives the call mask via rs_set_priv call that will be soon replaced by the upcoming publish/subscribe model. RS CHANGES: - The system process table has been reorganized and split into private entries and public entries. Only the latter ones are exposed to system services. - VM call masks are now entirely configured in rs/table.c - RS has now its own slot in the system process table. Only kernel tasks and user processes not included in the boot image are now left out from the system process table. - RS implements the initialization protocol for system services. - For services in the boot image, RS blocks till initialization is complete and panics when failure is reported back. 
Services are initialized in their order of appearance in the boot image priv table and RS blocks to implement synchronous initialization for every system service having the flag SF_SYNCH_BOOT set. - For services started dynamically, the initialization protocol is implemented as though it were the first ping for the service. In this case, if the system service fails to report back (or reports failure), RS brings the service down rather than trying to restart it.	2010-01-08 01:20:42 +00:00
Cristiano Giuffrida	f4574783dc	Rewrite of boot process KERNEL CHANGES: - The kernel only knows about privileges of kernel tasks and the root system process (now RS). - Kernel tasks and the root system process are the only processes that are made schedulable by the kernel at startup. All the other processes in the boot image don't get their privileges set at startup and are inhibited from running by the RTS_NO_PRIV flag. - Removed the assumption on the ordering of processes in the boot image table. System processes can now appear in any order in the boot image table. - Privilege ids can now be assigned both statically or dynamically. The kernel assigns static privilege ids to kernel tasks and the root system process. Each id is directly derived from the process number. - User processes now all share the static privilege id of the root user process (now INIT). - sys_privctl split: we have more calls now to let RS set privileges for system processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS / SYS_PRIV_SET_USER are used to set privileges for a system / user process. - boot image table flags split: PROC_FULLVM is the only flag that has been moved out of the privilege flags and is still maintained in the boot image table. All the other privilege flags are out of the kernel now. RS CHANGES: - RS is the only user-space process who gets to run right after in-kernel startup. - RS uses the boot image table from the kernel and three additional boot image info table (priv table, sys table, dev table) to complete the initialization of the system. - RS checks that the entries in the priv table match the entries in the boot image table to make sure that every process in the boot image gets schedulable. - RS only uses static privilege ids to set privileges for system services in the boot image. 
- RS includes basic memory management support to allocate the boot image buffer dynamically during initialization. The buffer shall contain the executable image of all the system services we would like to restart after a crash. - First step towards decoupling between resource provisioning and resource requirements in RS: RS must know what resources it needs to restart a process and what resources it has currently available. This is useful to tradeoff reliability and resource consumption. When required resources are missing, the process cannot be restarted. In that case, in the future, a system flag will tell RS what to do. For example, if CORE_PROC is set, RS should trigger a system-wide panic because the system can no longer function correctly without a core system process. PM CHANGES: - The process tree built at initialization time is changed to have INIT as root with pid 0, RS child of INIT and all the system services children of RS. This is required to make RS in control of all the system services. - PM no longer registers labels for system services in the boot image. This is now part of RS's initialization process.	2009-12-11 00:08:19 +00:00
David van Moolenbroek	fce9fd4b4e	Add 'getidle' CPU utilization measurement infrastructure	2009-12-02 11:52:26 +00:00
Tomas Hruby	ae75f9d4e5	Removal of the executable flag from files that cannot be executed - 755 -> 644	2009-11-09 10:26:00 +00:00
Ben Gras	9647fbc94e	moved type and constants for random data to include file; added consistency check in random; added source of randomness internal to random using timing; only retrieve random bins that are full.	2009-04-02 15:24:44 +00:00
Ben Gras	3cc092ff06	. new kernel call sysctl for generic unprivileged system operations; now used for printing diagnostic messages through the kernel message buffer. this lets processes print diagnostics without sending messages to tty and log directly, simplifying the message protocol a lot and reducing difficulties with deadlocks and other situations in which diagnostics are blackholed (e.g. grants don't work). this makes DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty and log still accept the codes for 'old' binaries. This also simplifies diagnostics in several servers and drivers - only tty needs its own kputc() now. . simplifications in vfs, and some effort to get the vnode references right (consistent) even during shutdown. m_mounted_on is now NULL for root filesystems (!) (the original and new root), a less awkward special case than 'm_mounted_on == m_root_node'. root now has exactly one reference, to root, if no files are open, just like all other filesystems. m_driver_e is unused.	2009-01-26 17:43:59 +00:00
Ben Gras	c078ec0331	Basic VM and other minor improvements. Not complete, probably not fully debugged or optimized.	2008-11-19 12:26:10 +00:00
Ben Gras	6f77685609	Split of architecture-dependent and -independent functions for i386, mainly in the kernel and headers. This split based on work by Ingmar Alting <iaalting@cs.vu.nl> done for his Minix PowerPC architecture port. . kernel does not program the interrupt controller directly, do any other architecture-dependent operations, or contain assembly any more, but uses architecture-dependent functions in arch/$(ARCH)/. . architecture-dependent constants and types defined in arch/$(ARCH)/include. . <ibm/portio.h> moved to <minix/portio.h>, as they have become, for now, architecture-independent functions. . int86, sdevio, readbios, and iopenable are now i386-specific kernel calls and live in arch/i386/do_* now. . i386 arch now supports even less 86 code; e.g. mpx86.s and klib86.s have gone, and 'machine.protected' is gone (and always taken to be 1 in i386). If 86 support is to return, it should be a new architecture. . prototypes for the architecture-dependent functions defined in kernel/arch/$(ARCH)/*.c but used in kernel/ are in kernel/proto.h . /etc/make.conf included in makefiles and shell scripts that need to know the building architecture; it defines ARCH=<arch>, currently only i386. . some basic per-architecture build support outside of the kernel (lib) . in clock.c, only dequeue a process if it was ready . fixes for new include files files deleted: . mpx/klib.s - only for choosing between mpx/klib86 and -386 . klib86.s - only for 86 i386-specific files files moved (or arch-dependent stuff moved) to arch/i386/: . mpx386.s (entry point) . klib386.s . sconst.h . exception.c . protect.c . protect.h . i8269.c	2006-12-22 15:22:27 +00:00
Philip Homburg	add4be444f	get_sys_bits	2006-06-23 15:32:24 +00:00
Ben Gras	831bc7ecd1	Move bitmap manipulation macros to <minix/bitmap.h>	2006-06-20 09:50:26 +00:00
Ben Gras	1335d5d700	'proc number' is process slot, 'endpoint' are generation-aware process instance numbers, encoded and decoded using macros in <minix/endpoint.h>. proc number -> endpoint migration . proc_nr in the interrupt hook is now an endpoint, proc_nr_e. . m_source for messages and notifies is now an endpoint, instead of proc number. . isokendpt() converts an endpoint to a process number, returns success (but fails if the process number is out of range, the process slot is not a living process, or the given endpoint number does not match the endpoint number in the process slot, indicating an old process). . okendpt() is the same as isokendpt(), but panic()s if the conversion fails. This is mainly used for decoding message.m_source endpoints, and other endpoint numbers in kernel data structures, which should always be correct. . if DEBUG_ENABLE_IPC_WARNINGS is enabled, isokendpt() and okendpt() get passed the __FILE__ and __LINE__ of the calling lines, and print messages about what is wrong with the endpoint number (out of range proc, empty proc, or inconsistent endpoint number), with the caller, making finding where the conversion failed easy without having to include code for every call to print where things went wrong. Sometimes this is harmless (wrong arg to a kernel call), sometimes it's a fatal internal inconsistency (bogus m_source). . some process table fields have had an _e appended to indicate they have become endpoints. . process endpoint is stored in p_endpoint, without generation number. it turns out the kernel never needs the generation number, except when fork()ing, so it's decoded then. . kernel calls all take endpoints as arguments, not proc numbers. the one exception is sys_fork(), which needs to know in which slot to put the child.	2006-03-03 10:00:02 +00:00
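The endpoint scheme described in the commit above can be sketched in C roughly as follows. This is a simplified illustration, not the contents of <minix/endpoint.h>: the `SK_` macros, the table layout, the constants, and the restriction to non-negative slot numbers are all assumptions. `sk_isokendpt` mirrors the three failure cases the message lists for isokendpt().

```c
#include <assert.h>
#include <stddef.h>

#define SK_NR_PROCS        64     /* hypothetical number of process slots */
#define SK_GENERATION_SIZE 1024   /* hypothetical; must exceed SK_NR_PROCS */

/* An endpoint combines a generation count with a slot number. */
#define SK_ENDPOINT(gen, slot) ((gen) * SK_GENERATION_SIZE + (slot))
#define SK_ENDPOINT_SLOT(e)    ((e) % SK_GENERATION_SIZE)
#define SK_ENDPOINT_GEN(e)     ((e) / SK_GENERATION_SIZE)

struct sk_slot {
    int in_use;      /* nonzero if a living process occupies the slot */
    int endpoint;    /* full endpoint of the current occupant */
};

struct sk_slot sk_proc_table[SK_NR_PROCS];

/* Convert an endpoint to a slot number. Returns 1 and stores the slot on
 * success; returns 0 if the slot is out of range, the slot is empty, or
 * the endpoint belongs to an earlier occupant of the slot. */
int sk_isokendpt(int endpoint, int *slot_out)
{
    int slot;

    assert(slot_out != NULL);
    if (endpoint < 0)
        return 0;
    slot = SK_ENDPOINT_SLOT(endpoint);
    if (slot >= SK_NR_PROCS)
        return 0;                               /* out of range */
    if (!sk_proc_table[slot].in_use)
        return 0;                               /* empty slot */
    if (sk_proc_table[slot].endpoint != endpoint)
        return 0;                               /* stale endpoint */
    *slot_out = slot;
    return 1;
}
```

The generation part is what lets a receiver detect that a message was addressed to a process that has since exited and had its slot reused.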
Ben Gras	88ba4b5268	added reenter check to lock_dequeue() to avoid unlocking of interrupts via cause_sig() during an exception. moved lock check configuration to <minix/sys_config.h> instead of kernel/config.h, because the 'relocking' field in kinfo depends on it. other prettification: common locking macro, whether lock timing is on or not.	2006-02-10 16:53:51 +00:00
Ben Gras	d11b2e4b8c	Al's double-blank-line removal request	2005-08-22 15:23:47 +00:00
Jorrit Herder	74711a3b14	Added a check whether a kernel call is allowed (from the process' call mask). Not yet enforced. If a call is denied, this will be kprinted. Please report any such errors, so that I can adjust the mask before returning errors instead of warnings. Wrote CMOS driver. All CMOS code from FS has been removed. Currently the driver only supports get time calls. Set time is left out as an exercise for the book readers ... startup scripts were updated because the CMOS driver is needed early on. (IS got same treatment.) Don't forget to run MAKEDEV cmos in /dev/, otherwise the driver cannot be loaded.	2005-08-04 19:23:03 +00:00
Jorrit Herder	89cf745fe2	Single boot driver loaded, while multiple can be included in the boot image. The user needs to set label=... to choose the driver of his or her choice. This driver will be mapped onto the controller that is set in controller=... Minor cleanup of kernel source code (boot image table now is static).	2005-08-03 16:06:35 +00:00
Jorrit Herder	8866b4d0ef	Kernel changes: - reinstalled priority changing, now in sched() and unready() - reinstalled check on message buffer in sys_call() - reinstalled check in send masks in sys_call() - changed do_fork() to get new privilege structure for SYS_PROCs - removed some processes from boot image---will be dynamically started later	2005-07-26 12:48:34 +00:00
Jorrit Herder	198c976f7e	System processes can be signaled; signals are transformed into a SYS_EVENT message that passes the signal map along. This mechanism is also used for nonuser signals like SIGKMESS, SIGKSTOP, SIGKSIG. Revised comments of many system call handlers. Renamed setpriority to nice.	2005-07-19 12:21:36 +00:00
Philip Homburg	7d4e914618	Random number generator	2005-07-18 15:40:24 +00:00
Jorrit Herder	42ab148155	Reorganized system call library; uses separate file per call now. New configuration header file to include/ exclude functionality. Extracted privileged features from struct proc and create new struct priv. Renamed various system calls for readability.	2005-07-14 15:12:12 +00:00
Jorrit Herder	bac6068857	Rewrite of process scheduling: - current and maximum priority per process; - quantum size and current ticks left per process; - max number of full quantums in a row allowed (otherwise current priority is decremented)	2005-06-30 15:55:19 +00:00
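The per-process bookkeeping listed in the commit above might look roughly like this in C. This is a sketch under assumptions, not the scheduler's code: the struct layout and all `sk_` names are hypothetical, priority is modelled as an integer where a smaller value means a less favoured process, and the message does not say how a lowered priority is later restored, so that part is left out.

```c
#include <assert.h>

struct sk_sched {
    int max_priority;       /* ceiling for `priority` */
    int priority;           /* current priority */
    int quantum_size;       /* ticks granted per quantum */
    int ticks_left;         /* ticks remaining in the current quantum */
    int full_quantums;      /* full quantums consumed in a row */
    int max_full_quantums;  /* allowed number of full quantums in a row */
};

/* Charge one clock tick to the running process. Returns 1 when the
 * quantum has expired and the process should be rescheduled, else 0. */
int sk_clock_tick(struct sk_sched *p)
{
    assert(p->ticks_left > 0);
    if (--p->ticks_left > 0)
        return 0;

    /* A full quantum was consumed: start a fresh one and check whether
     * the process has exceeded its allowance of consecutive quantums. */
    p->ticks_left = p->quantum_size;
    if (++p->full_quantums > p->max_full_quantums) {
        if (p->priority > 0)
            p->priority--;
        p->full_quantums = 0;
    }
    return 1;
}
```

With a quantum of 2 ticks and an allowance of 2 full quantums, a process that never blocks loses one priority level on its sixth tick.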
Ben Gras	3eeff022fb	Added function read_cpu_flags() that returns current cpu flags as a long. This is used to check for interrupts being disabled at the time of a lock() call, if enabled in config.h. The number of times this happens is then counted in the kinfo structure. These events (recursive lockings) lead to nasty race conditions.	2005-06-20 14:53:13 +00:00
Ben Gras	3c7120d830	Changed arguments of timer library functions.	2005-06-17 13:36:01 +00:00
Jorrit Herder	f2a85e58d9	Various updates. * Removed some variants of the SYS_GETINFO calls from the kernel; replaced them with new PM and utils library functionality. Fixed bugs in utils library that used old get_kenv() variant. * Implemented a buffer in the kernel to gather random data. Memory driver periodically checks this for /dev/random. A better random algorithm can now be implemented in the driver. Removed SYS_RANDOM; the SYS_GETINFO call is used instead. * Remove SYS_KMALLOC from the kernel. Memory allocation can now be done at the process manager with new 'other' library functions.	2005-06-03 13:55:06 +00:00
Ben Gras	c977bd8709	Added args to lock() and unlock() to tell them apart, for use when lock timing is enabled in minix/config.h. Added phys_zero() routine to klib386.s that zeroes a range of memory, and added corresponding system call.	2005-06-01 09:37:52 +00:00
Jorrit Herder	0165662cd9	Replaced flagalrm() timers with another technique to check for timeouts. This allowed removing the p_flagarlm timer from the kernel's process table. Furthermore, I merged p_syncalrm and p_signalrm into p_alarm_timer to save even more space. Note that processes can no longer have both a signal and synchronous alarm timer outstanding as of now.	2005-05-31 14:43:04 +00:00
Jorrit Herder	322ec9ef8b	Moved stime, time, times POSIX calls from FS to PM. Removed child time accounting from kernel (now in PM). Large amount of files in this commit is due to system time problems during development.	2005-05-31 09:50:51 +00:00
Jorrit Herder	c2be104821	Improved NOTIFY system: fixed a minor error, ignore pending notifications on SENDREC system calls. To be done: resource (buffer pool) management; make it structurally impossible to run out of buffers.	2005-05-27 12:44:14 +00:00
Jorrit Herder	ccd17ecfed	New NOTIFY system call! Queued at kernel. Duplicate messages (with same source and type) are overwritten with newer flags/ arguments. The interface from within the kernel is lock_notify(). User processes can make a system call with notify(). NOTIFY fully replaces the old notification mechanism.	2005-05-24 10:06:17 +00:00
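The queueing rule in the commit above (a notification with the same source and type as a pending one overwrites it with the newer flags and arguments) can be sketched in C as below. The fixed-size array, the field names, and every `sk_` identifier are assumptions made for illustration; the kernel's actual buffer management is not described in the message.

```c
#include <assert.h>

#define SK_MAX_PENDING 8    /* hypothetical buffer pool size */

struct sk_notification {
    int source;
    int type;
    int flags;
    long arg;
};

struct sk_notify_queue {
    struct sk_notification slot[SK_MAX_PENDING];
    int count;              /* number of pending notifications */
};

/* Queue a notification. A pending entry with the same source and type is
 * overwritten in place. Returns 0 on success, -1 if no buffer is free. */
int sk_notify(struct sk_notify_queue *q, int source, int type,
              int flags, long arg)
{
    int i;

    assert(q->count >= 0 && q->count <= SK_MAX_PENDING);
    for (i = 0; i < q->count; i++) {
        if (q->slot[i].source == source && q->slot[i].type == type) {
            q->slot[i].flags = flags;   /* newer values win */
            q->slot[i].arg = arg;
            return 0;
        }
    }
    if (q->count == SK_MAX_PENDING)
        return -1;
    q->slot[q->count].source = source;
    q->slot[q->count].type = type;
    q->slot[q->count].flags = flags;
    q->slot[q->count].arg = arg;
    q->count++;
    return 0;
}
```

Overwriting duplicates bounds the number of pending entries per (source, type) pair to one, which limits how quickly the pool can fill up.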
Jorrit Herder	1cb880b158	Intermediate update---please await next commit.	2005-05-19 09:36:44 +00:00
Jorrit Herder	ac0995259d	* empty log message *	2005-05-02 14:30:04 +00:00
Jorrit Herder	89ac678b9b	* empty log message *	2005-04-29 15:36:43 +00:00
Ben Gras	9865aeaa79	Initial revision	2005-04-21 14:53:53 +00:00

40 commits