minix

Author	SHA1	Message	Date
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses are prefixed by an objcopy step with __k_unpaged_, so that that code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. 
Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. 
the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
Ben Gras	cfe1ed4df4	profiling related cleanup . do not declare any data in <minix/profile.h> . addr check no longer necessary	2012-07-15 21:56:55 +02:00
Ben Gras	0fb2f83da9	drop segments from physcopy/vircopy invocations . sys_vircopy always uses D for both src and dst . sys_physcopy uses PHYS_SEG if and only if corresponding endpoint is NONE, so we can derive the mode (PHYS_SEG or D) from the endpoint arg in the kernel, dropping the seg args . fields in msg still filled in for backwards compatibility, using same NONE-logic in the library	2012-06-18 12:28:40 +00:00
Ben Gras	0e35eb0c6b	drop segments from safemap/safeunmap invocations	2012-06-18 12:28:40 +00:00
Ben Gras	2bfeeed885	drop segment from safecopy invocations . all invocations were S or D, so can safely be dropped to prepare for the segmentless world . still assign D to the SCP_SEG field in the message to make previous kernels usable	2012-06-16 16:22:51 +00:00
Ben Gras	769af57274	further libexec generalization . new mode for sys_memset: include process so memset can be done in physical or virtual address space. . add a mode to mmap() that lets a process allocate uninitialized memory. . this allows an exec()er (RS, VFS, etc.) to request uninitialized memory from VM and selectively clear the ranges that don't come from a file, leaving no uninitialized memory left for the process to see. . use callbacks for clearing the process, clearing memory in the process, and copying into the process; so that the libexec code can be used from rs, vfs, and in the future, kernel (to load vm) and vm (to load boot-time processes)	2012-06-07 15:15:02 +02:00
Ben Gras	cfb2d7bca5	retire BIOS_SEG and umap_bios . readbios call is now a physical copy with range check in the kernel call instead of BIOS_SEG+umap_bios . requires all access to physical memory in bios range to go through sys_readbios . drivers/dpeth: wasn't using it . adjusted printer	2012-05-09 19:03:59 +02:00
Ben Gras	a149be43fc	use linker to align fpu state save area	2012-04-19 15:06:47 +02:00
Ben Gras	1e399dd8bd	various kernel printing fixes . remove some call cycles by low-level functions invoking printf(); e.g. send_sig() gets a return value that the caller should check . reason: very-early-phase printf() would trigger a printf() causing infinite recursion -> GPF . move serial initialization a little earlier so DEBUG_EXTRA works for serial earlier (e.g. its first instance, for "cstart") . closes tracker item 583: System Fails to Complete Startup with Verbose 2 and 3 Boot Parameters, reported by Stephen Hatton / pikpik.	2012-03-28 18:23:12 +02:00
David van Moolenbroek	9cca9d7566	Kernel: arch-related cleanup - move umap_bios() into arch-specific code - move proc.p_fpu_state access into arch-specific blocks	2012-03-26 14:19:33 +02:00
Ben Gras	7336a67dfe	retire PUBLIC, PRIVATE and FORWARD	2012-03-25 21:58:14 +02:00
Ben Gras	6a73e85ad1	retire _PROTOTYPE . only good for obsolete K&R support . also remove a stray ansi.h and the proto cmd	2012-03-25 16:17:10 +02:00
David van Moolenbroek	70abb127cc	Add sys_vumap() kernel call This new call is a vectored version of sys_umap(). It supports batch lookups, non-contiguous memory, faulting in memory, and basic access checks.	2012-03-24 19:51:13 +01:00
David van Moolenbroek	08af3f672b	Kernel: replace vm_contiguous with vm_lookup_range	2012-03-24 19:51:12 +01:00
Ben Gras	6af9856d4a	libcompat_minix-centric cleanup remove some old minix-userland-specific stuff . /etc/ttytab as a file, and minix-compat function (fttyslot()), replaced by /etc/ttys and new libc functions . also remove minix-specific nlist(), cuserid(), fttyslot(), v8 regex functions and <compat/regex.h> . and remaining minix-only utilities that use them . also unused <compat/pwd.h> and <compat/syslog.h> and redundant <sys/sigcontext.h>	2012-03-16 17:06:24 +01:00
David van Moolenbroek	4b6a98de5f	Kernel: adjust FPU state upon process slot swap This fixes seemingly random FPU exceptions and kernel panics occurring after a system server restart.	2012-03-05 22:32:14 +01:00
Antoine Leca	3fb8cb760c	More cleaning up	2012-02-15 19:04:58 +00:00
Tomas Hruby	758d788bbe	SMP - asyn send SMP safe - we must not deliver messages from/to unstable address spaces. In such a case, we must postpone the delivery. To make sure that a process which is expecting an asynchronous message does not starve, we must remember that we skipped delivery of some messages and we must try to deliver again once the source address space is stable again.	2012-01-13 11:30:01 +00:00
Tomas Hruby	8fa95abae4	SMP - fixed usage of stale TLB entries - when kernel copies from userspace, it must be sure that the TLB entries are not stale and thus the referenced memory is correct - every time we change a process' address space we set p_stale_tlb bits for all CPUs. - Whenever a cpu finds its bit set when it wants to access the process' memory, it refreshes the TLB - it is more conservative than it needs to be but it has lower overhead than checking precisely	2012-01-13 11:30:00 +00:00
Tomas Hruby	0468fca72b	SMP - do_update fix - adjust_proc_slot() must preserve scheduling info, for example on which cpu the process should run - do_update() - consistency check	2012-01-13 11:30:00 +00:00
Tomas Hruby	8d0a1f71bf	KERNEL - do_privctl() fix - after a driver is restarted, do not register permissions which are already set again.	2012-01-13 11:29:59 +00:00
David van Moolenbroek	84662ec4b3	libsys: unbreak getidle()	2011-12-16 16:06:09 +00:00
David van Moolenbroek	1e1db53986	Introduce sys_getregs call, and let vfs use it	2011-11-22 02:07:33 +01:00
David van Moolenbroek	8b00ebde78	Kernel: remove unused MF_ASYNMSG	2011-11-01 19:21:19 +00:00
Arun Thomas	d69519f86a	kernel: don't build cprofile code by default	2011-08-02 15:25:54 +02:00
Arun Thomas	aaefc6f838	Add MKMCONTEXT option	2011-08-02 13:57:31 +02:00
Ben Gras	b984fa41df	Revert "print kernel stacktrace for exceptions in kernel" This reverts commit `eff1369cab`. This was in a working branch and I only intended to commit exception.c. But I committed the exact inverse. Sorry.	2011-07-22 15:01:44 +02:00
Ben Gras	eff1369cab	print kernel stacktrace for exceptions in kernel fpu alignment check feature, checksum feature	2011-07-22 11:03:45 +00:00
Arun Thomas	c356e9997e	kernel: fix GCC warnings	2011-07-18 19:44:59 +02:00
Erik van der Kouwe	6e0f3b3bda	Split off sys_umap_remote from sys_umap sys_umap now supports only: - looking up the physical address of a virtual address in the address space of the caller; - looking up the physical address of a grant for which the caller is the grantee. This is enough for nearly all umap users. The new sys_umap_remote supports lookups in arbitrary address spaces and grants for arbitrary grantees.	2011-06-10 14:28:20 +00:00
Ben Gras	a77c2973b3	fix clang warnings -R in kernel/ and servers/	2011-06-09 16:09:13 +02:00
Erik van der Kouwe	1e5f9dfa14	Make sys_umap on grants check grantee	2011-06-09 05:05:20 +00:00
Erik van der Kouwe	e969b5e11b	Remove unused segctl kernel call	2011-04-26 23:28:23 +02:00
Arun Thomas	25a790a631	VM and kernel support for ELF	2011-02-26 23:00:55 +00:00
Ben Gras	c6e6aa8850	mark forked process as such in the kernel p_name . helps debugging output; you can see the difference between parent and child easily (it's sometimes confusing to see an expected endpoint number with an unexpected name, i.e. before exec()) . when processes crash after fork and before exec, it's an instant hint that that's what's going on, instead of it being the parent (endpoint numbers don't usually convey this) . name returns to 'normal' after exec(), so *F isn't visible normally at all. (Except for RS which forks apparently.)	2011-02-21 15:05:32 +00:00
Ben Gras	07bfb4f4e4	kernel - account for kernel cpu time (ipc, kcalls) in caller	2011-02-08 13:58:32 +00:00
Ben Gras	95702f970b	kernel - doesn't do lock timings any more	2011-02-04 13:42:17 +00:00
David van Moolenbroek	5d8d5e0c3a	change bitchunk_t from 16-bit to 32-bit	2010-12-21 10:44:45 +00:00
Arun Thomas	361f377493	Fix multiboot for ACK-built images Move the profiling buffer to the end of the data segment	2010-12-17 13:47:11 +00:00
David van Moolenbroek	b6f3b7e7f6	Kernel: statistical profiling fixes - create name entries for forked processes as well; - create name entries only for system processes.	2010-12-16 09:46:26 +00:00
David van Moolenbroek	a7285dfabc	Kernel/RS: fix permission computation with 32+ system processes	2010-12-07 10:32:42 +00:00
Tomas Hruby	ac780f36a0	sys_getcpuinfo()	2010-10-26 21:07:50 +00:00
Tomas Hruby	9e01a83636	SMP - reduced TLB flushing - flush TLB of processes only if the page tables have been changed and the page tables of this process are already loaded on this cpu which means that there might be stale entries in TLB. Until now SMP was always flushing TLB to make sure everything is consistent.	2010-10-25 16:21:23 +00:00
Tomas Hruby	5b832396f5	if verbose=1 tell us who registers which irq handler - a useful piece of information when debugging	2010-10-21 17:07:12 +00:00
Ben Gras	c521f2a138	kernel: fix idle time accounting.	2010-10-04 19:12:55 +00:00
Tomas Hruby	87c576584d	Internal 64M buffer for profiling - when profiling is compiled in kernel includes a 64M buffer for samples - 64M is the default used by profile tool as its buffer - when using nmi profiling it is not possible to always copy sample straight to userland as the nmi may (and does) happen in bad moments - reduces sampling overhead as samples are copied out only when profiling stops	2010-09-23 10:49:48 +00:00
Tomas Hruby	e63b85a50b	NMI sampling - if profile --nmi kernel uses NMI watchdog based sampling based on Intel architecture performance counters - using NMI makes kernel profiling possible - watchdog kernel lockup detection is disabled while sampling as we may get unpredictable interrupts in kernel and thus possibly many false positives - if watchdog is not enabled at boot time, profiling enables it and turns it off again when done	2010-09-23 10:49:45 +00:00
Tomas Hruby	db12229ce3	New profile protocol - when kernel profiles a process for the first time it saves an entry describing the process [endpoint|name] - every profile sample is only [endpoint|pc] - profile utility creates a table of endpoint <-> name relations and translates endpoints of samples into names and writing out the results to comply with the processing tools - "task" endpoints like KERNEL are negative thus we must cast it to unsigned when hashing	2010-09-23 10:49:39 +00:00
Tomas Hruby	a665ae3de1	Userspace scheduling - exporting stats - contributed by Bjorn Swift - adds process accounting, for example counting the number of messages sent, how often the process was preempted and how much time it spent in the run queue. These statistics, along with the current cpu load, are sent back to the user-space scheduler in the Out Of Quantum message. - the user-space scheduler may choose to make use of these statistics when making scheduling decisions. For instance the cpu load becomes especially useful when scheduling on multiple cores.	2010-09-19 15:52:12 +00:00
Tomas Hruby	5b8b623765	SMP - lazy FPU - when a process is migrated to a different CPU it may have an active FPU context in the processor registers. We must save it and migrate it together with the process.	2010-09-15 14:11:25 +00:00
Tomas Hruby	6513d20744	SMP - Process is stopped when VM modifies the page tables - RTS_VMINHIBIT flag is used to stop process while VM is fiddling with its pagetables - more generic way of sending synchronous scheduling events among cpus - do the x-cpu smp sched calls only if the target process is runnable. If it is not, it cannot be running and it cannot become runnable because this CPU holds the BKL	2010-09-15 14:11:12 +00:00
Tomas Hruby	906a81a1c7	SMP - runctl() can stop across cpus - if stopping a process that runs on a different CPU we tell the remote cpu to do that	2010-09-15 14:11:09 +00:00
Tomas Hruby	06b6e5624a	SMP - Changed prototype of sys_schedule() - sys_schedule can change only selected values, -1 means that the current value should be kept unchanged. For instance we mostly want to change the scheduling quantum and priority but we want to keep the process at the current cpu - RS can hand off its processes to scheduler - service can read the destination cpu from system.conf - RS can pass the information farther	2010-09-15 14:10:42 +00:00
Tomas Hruby	865e21b884	SMP - CPU local idle stub - each CPU has its own pseudo idle process and its structure - idle cycles accounting is aggregated when exporting to userspace	2010-09-15 14:10:21 +00:00
Tomas Hruby	13a0d5fa5e	SMP - Cpu local variables - most global variables carry information which is specific to the local CPU and each CPU must have its own copy - cpu local variable must be declared in cpulocal.h between DECLARE_CPULOCAL_START and DECLARE_CPULOCAL_END markers using DECLARE_CPULOCAL macro - to access the cpu local data the provided macros must be used get_cpu_var(cpu, name) get_cpu_var_ptr(cpu, name) get_cpulocal_var(name) get_cpulocal_var_ptr(name) - using these macros makes future changes in the implementation possible - switching to ELF will make the declaration of cpu local data much simpler, e.g. CPULOCAL int blah; anywhere in the kernel source code	2010-09-15 14:09:46 +00:00
Tomas Hruby	ce4fd0c0fb	Enable paging - some more code reshuffling	2010-09-15 14:09:41 +00:00
Ben Gras	c0074d3aa9	kernel: fix case of EAX getting clobbered after sigreturn.	2010-07-20 17:10:09 +00:00
Cristiano Giuffrida	20101b3bab	Remove patch leftovers.	2010-07-13 22:40:14 +00:00
Cristiano Giuffrida	f8a8ea0a79	Dynamic configuration in system.conf for boot system services.	2010-07-13 21:11:44 +00:00
Cristiano Giuffrida	8cedace2f5	Scheduling parameters out of the kernel.	2010-07-13 15:30:17 +00:00
Cristiano Giuffrida	8427d774b6	RS live update support.	2010-07-09 18:29:04 +00:00
Cristiano Giuffrida	1f8dbed029	RS crash recovery support.	2010-07-06 22:05:21 +00:00
Ben Gras	f6f814cb02	include, kernel: minor fixes to make compiling and linking work with clang. (fixing warnings)	2010-07-06 11:59:19 +00:00
Tomas Hruby	97eb470bee	Fix	2010-07-01 12:31:53 +00:00
Tomas Hruby	7920d48156	FPU cleanup - last reference to MF_USED_FPU removed - proc_used_fpu() used to test for MF_FPU_INITIALIZED	2010-07-01 12:23:25 +00:00
Erik van der Kouwe	23284ee7bd	User-space scheduling for system processes	2010-07-01 08:32:33 +00:00
Cristiano Giuffrida	06700d05d1	Give RS a page table.	2010-06-28 21:53:37 +00:00
Arun Thomas	c0c8d25799	Rename mkfiles from minix..mk to bsd..mk Makes things easier for pkgsrc	2010-06-25 18:29:09 +00:00
Tomas Hruby	6bc21b6992	Cycle counters zeroed after fork for the child	2010-06-18 14:01:34 +00:00
Tomas Hruby	360de619c0	No linear addresses in message delivery - removes p_delivermsg_lin item from the process structure and code related to it - as the send part, the receive does not need to use the PHYS_COPY_CATCH() and umap_local() couple. - The address space of the target process is installed before delivermsg() is called. - unlike the linear address, the virtual address does not change when paging is turned on nor after fork().	2010-06-11 08:16:10 +00:00
Kees van Reeuwijk	826b9590f2	More endpoint_t correctness. More const correctness. Other code cleanup.	2010-06-08 14:09:18 +00:00
Arun Thomas	4c10a31440	Remove legacy MM, FS, and FS_PROC_NR macros	2010-06-08 13:58:01 +00:00
Erik van der Kouwe	78186ee5f5	Add endpoint checks in scheduling kernel calls	2010-06-08 12:04:21 +00:00
Tomas Hruby	cbc9586c13	Lazy FPU - FPU context is stored only if conflict between 2 FPU users or while exporting context of a process to userspace while it is the active user of FPU - FPU has its owner (fpu_owner) which points to the process whose state is currently loaded in FPU - the FPU exception is only turned on when scheduling a process which is not the owner of FPU - FPU state is restored for the process that generated the FPU exception. This process runs immediately without letting scheduler to pick a new process to resolve the FPU conflict asap, to minimize the FPU thrashing and FPU exception handler execution - faster all non-FPU-exception kernel entries as FPU state is not checked nor saved - removed MF_USED_FPU flag, only MF_FPU_INITIALIZED remains to signal that a process has used FPU in the past	2010-06-07 07:43:17 +00:00
Cristiano Giuffrida	a53514d4a9	Fix range checking in safecopy.	2010-06-04 18:05:38 +00:00
Tomas Hruby	f28acecb78	Removed a buggy assert unintentionally committed in r7044	2010-06-04 10:54:43 +00:00
Ben Gras	2f892aca91	kernel fpu context switching: fix race condition There seems to have been a broken assumption in the fpu context restoring code. It restores the context of the running process, without guarantee that the current process is the one that will be scheduled. This caused fpu saving for a different process to be triggered without fpu hardware being enabled, causing an fpu exception in the kernel. This practically only shows up with DEBUG_RACE on. Fix by thruby+me. The fix . is to only set the fpu-in-use-by-this-process flag in the exception handler, and then take care of fpu restoring when actually returning to userspace And the patch . translates fpu saving and restoring to c in arch_system.c, getting rid of a juicy chunk of assembly . makes osfxsr_feature private to arch_system.c . removes most of the arch dependent code from do_sigsend	2010-06-03 11:32:22 +00:00
Tomas Hruby	40f440b8cd	KCall methods do not depend on m_source and m_type fields - substituted the use of the m_source message field by caller->p_endpoint in kernel calls. It is the same information, just passed more intuitively. - the last dependency on m_type field is removed. - do_unused() is substituted by a check for NULL. - this pretty much removes the dependency of kernel calls on the general message format. In the future this may be used to pass the kcall arguments in a different structure or registers (x86-64, ARM?) The kcall number may be passed in a register already.	2010-06-01 08:54:31 +00:00
Tomas Hruby	ebbd319ac0	do_safecopy split - removes dependency of do_safecopy() on the m_type field of the kcall messages. - instead of do_safecopy() figuring out what action is requested, the correct safecopy method is called right away.	2010-06-01 08:51:37 +00:00
David van Moolenbroek	51ff10d7c0	reset alarm timer on PRIVCTL	2010-05-26 07:10:28 +00:00
Tomas Hruby	451a6890d6	scheduling - time quantum in milliseconds - Currently the cpu time quantum is timer-ticks based. Thus the remaining quantum is decreased only if the process is interrupted by a timer tick. As processes block a lot this typically does not happen for normal user processes. Also the quantum depends on the frequency of the timer. - This change makes the quantum milliseconds based. Internally the milliseconds are translated into cpu cycles. Every time userspace execution is interrupted by kernel the cycles just consumed by the current process are deducted from the remaining quantum. - It makes the quantum system timer frequency independent. - The boot processes quantum is loosely derived from the tick-based quanta and 60Hz timer and subject to future change - the 64bit arithmetics is a little ugly, will be changed once we have compiler support for 64bit integers (soon)	2010-05-25 08:06:14 +00:00
Kees van Reeuwijk	ac14a989b3	Fixed some inconsistent strict typing declarations. Better strict typing.	2010-05-25 07:23:24 +00:00
Erik van der Kouwe	1f11a57141	Oops, last commit included more than was intended	2010-05-20 08:07:47 +00:00
Erik van der Kouwe	5f15ec05b2	More system processes, this was not enough for the release script to run on some configurations	2010-05-20 08:05:07 +00:00
Tomas Hruby	b09bcf6779	Scheduling server (by Bjorn Swift) In this second phase, scheduling is moved from PM to its own scheduler (see r6557 for phase one). In the next phase we hope to a) include useful information in the "out of quantum" message and b) create some simple scheduling policy that makes use of that information. When the system starts up, PM will iterate over its process table and ask SCHED to take over scheduling unprivileged processes. This is done by sending a SCHEDULING_START message to SCHED. This message includes the processes endpoint, the parent's endpoint and its nice level. The scheduler adds this process to its schedproc table, issues a schedctl, and returns its own endpoint to PM - as the endpoint of the effective scheduler. When a process terminates, a SCHEDULING_STOP message is sent to the scheduler. The reason for this effective endpoint is for future compatibility. Some day, we may have a scheduler that, instead of scheduling the process itself, forwards the SCHEDULING_START message on to another scheduler. PM has information on who schedules whom. As such, scheduling messages from user-land are sent through PM. An example is when processes change their priority, using nice(). In that case, a getsetpriority message is sent to PM, which then sends a SCHEDULING_SET_NICE to the process's effective scheduler. When a process is forked through PM, it inherits its parent's scheduler, but is spawned with an empty quantum. As before, a request to fork a process flows through VM before returning to PM, which then wakes up the child process. This flow has been modified slightly so that PM notifies the scheduler of the new process, before waking up the child process. If the scheduler fails to take over scheduling, the child process is torn down and the fork fails with an erroneous value. Process priority is entirely decided upon using nice levels. 
PM stores a copy of each process's nice level and when a child is forked, its parent's nice level is sent in the SCHEDULING_START message. How this level is mapped to a priority queue is up to the scheduler. It should be noted that the nice level is used to determine the max_priority and the parent could have been in a lower priority when it was spawned. To prevent a CPU intensive process from hogging the CPU by continuously forking children that get scheduled in the max_priority, the scheduler should determine in which queue the parent is currently scheduled, and schedule the child in that same queue. Other fixes: The USER_Q in kernel/proc.h was incorrectly defined as NR_SCHED_QUEUES/2. That results in an "off by one" error when converting priority->nice->priority for nice=0. This also had the side effect that if someone were to set the MAX_USER_Q to something else than 0, then USER_Q would be off.	2010-05-18 13:39:04 +00:00
Ben Gras	c5c25e7abc	kernel/vm: change pde table info from single buffer to explicit per-process. makes code in kernel more readable, and allows better sanity checking on using the pde info.	2010-05-12 08:31:05 +00:00
Ben Gras	4e837dcfb3	kernel: more diagnostics for privctl ENOMEM conditions.	2010-04-29 08:50:52 +00:00
Kees van Reeuwijk	b412fb7ad5	Code cleanup: remove unused #include, variables and code,	2010-04-15 18:49:36 +00:00
Tomas Hruby	9b599bac1d	Quantum in fork - This patch removes the time slice split between parent and child in fork. - The time slice of the parent remains unchanged and the child does not have any. - If the process has a scheduler, the scheduler must assign the quantum and priority of the new process and let it run. - If the child does not inherit a scheduler, it is scheduled by the dummy default kernel policy. (servers, drivers, etc.) - In theory, the scheduler can change the quantum even of the parent process and implement any policy for splitting the quantum as neither the parent nor the child are runnable. Sending the out-of_quantum message on behalf of the processes may look like the right solution, however, the scheduler would probably handle the message before the whole fork protocol is finished. This way the scheduler has absolute control when the process should become runnable.	2010-04-10 15:27:38 +00:00
Tomas Hruby	485a037563	do_schedule() cleanup - it is not necessary to test whether the scheduler is a system process as the process already had permissions to make this call. - it is better to test whether the scheduler has permission to make changes to this process before testing whether the values are valid.	2010-04-10 15:17:09 +00:00
Cristiano Giuffrida	48c6bb79f4	Driver refactory for live update and crash recovery. SYSLIB CHANGES: - DS calls to publish / retrieve labels consider endpoints instead of u32_t. VFS CHANGES: - mapdriver() only adds an entry in the dmap table in VFS. - dev_up() is only executed upon reception of a driver up event. INET CHANGES: - INET no longer searches for existing drivers instances at startup. - A network driver is (re)initialized upon reception of a driver up event. - Networking startup is now race-free by design. No need to waste 5 seconds at startup any more. DRIVER CHANGES: - Every driver publishes driver up events when starting for the first time or in case of restart when recovery actions must be taken in the upper layers. - Driver up events are published by drivers through DS. - For regular drivers, VFS is normally the only subscriber, but not necessarily. For instance, when the filter driver is in use, it must subscribe to driver up events to initiate recovery. - For network drivers, inet is the only subscriber for now. - Every VFS driver is statically linked with libdriver, every network driver is statically linked with libnetdriver. DRIVER LIBRARIES CHANGES: - Libdriver is extended to provide generic receive() and ds_publish() interfaces for VFS drivers. - driver_receive() is a wrapper for sef_receive() also used in driver_task() to discard spurious messages that were meant to be delivered to a previous version of the driver. - driver_receive_mq() is the same as driver_receive() but integrates support for queued messages. - driver_announce() publishes a driver up event for VFS drivers and marks the driver as initialized and expecting a DEV_OPEN message. - Libnetdriver is introduced to provide similar receive() and ds_publish() interfaces for network drivers (netdriver_announce() and netdriver_receive()). - Network drivers all support live update with no state transfer now. KERNEL CHANGES: - Added kernel call statectl for state management. 
Used by driver_announce() to unblock eventual callers sendrecing to the driver.	2010-04-08 13:41:35 +00:00
Tomas Hruby	b464da5d73	do_nice.c - this file is not used and should have been remove in r6557	2010-04-06 13:44:03 +00:00
Tomas Hruby	b0d37b81c4	RTS_SYS_LOCK and do_runctl() - No need for RTS_SYS_LOCK as there are no tasks anymore.	2010-04-06 11:18:04 +00:00
Tomas Hruby	cdd6743e88	do_vtimer() - removed comment which is not true anymore as we don't have any tasks. No need to take any special measures.	2010-04-06 11:16:14 +00:00
Arun Thomas	4ed3a0cf3a	Convert kernel over to bsdmake	2010-04-01 22:22:33 +00:00
Kees van Reeuwijk	fc7dced1fa	Fix printfs with too few or too many parms, remove unused vars, fix incorrect flag tests, other code cleanup.	2010-04-01 13:25:05 +00:00
Cristiano Giuffrida	d8b42a755d	Move kernel signal SIGKNDELAY to system signal SIGSNDELAY and fix broken ptrace.	2010-03-31 08:55:12 +00:00
Tomas Hruby	b4cf88a04f	Userspace scheduling - contributed by Bjorn Swift - In this first phase, scheduling is moved from the kernel to the PM server. The next steps are to a) move scheduling to its own server and b) include useful information in the "out of quantum" message, so that the scheduler can make use of this information. - The kernel process table now keeps record of who is responsible for scheduling each process (p_scheduler). When this pointer is NULL, the process will be scheduled by the kernel. If such a process runs out of quantum, the kernel will simply renew its quantum and requeue it. - When PM loads, it will take over scheduling of all running processes, except system processes, using sys_schedctl(). Essentially, this only results in taking over init. As children inherit a scheduler from their parent, user space programs forked by init will inherit PM (for now) as their scheduler. - Once a process has been assigned a scheduler, and runs out of quantum, its RTS_NO_QUANTUM flag will be set and the process dequeued. The kernel will send a message to the scheduler, on the process' behalf, informing the scheduler that it has run out of quantum. The scheduler can take what ever action it pleases, based on its policy, and then reschedule the process using the sys_schedule() system call. - Balance queues does not work as before. While the old in-kernel function used to renew the quantum of processes in the highest priority run queue, the user-space implementation only acts on processes that have been bumped down to a lower priority queue. This approach reacts slower to changes than the old one, but saves us sending a sys_schedule message for each process every time we balance the queues. Currently, when processes are moved up a priority queue, their quantum is also renewed, but this can be fiddled with. - do_nice has been removed from kernel. PM answers to get- and setpriority calls, updates its own nice variable as well as the max_run_queue. 
This will be refactored once scheduling is moved to a separate server. We will probably have PM update its local nice value and then send a message to whoever is scheduling the process. - changes to fix an issue in do_fork() where processes could run out of quantum but bypass the code path that handles it correctly. The future plan is to remove the policy from do_fork() and implement it in userspace too.	2010-03-29 11:07:20 +00:00
Tomas Hruby	12ef495cac	atomicity fix when enabling paging - before enabling paging VM asks kernel to resize its segments. This may cause kernel to segfault if APIC is used and an interrupt happens between this call and the point where paging is enabled. As these are 2 separate vmctl calls it is not atomic. This patch fixes this problem. VM does not ask kernel to resize the segments in a separate call anymore. The new segments limit is part of the "enable paging" call. It generalizes this call in such a way that more information can be passed as need be or the information may be completely different if another architecture requires this.	2010-03-22 07:42:52 +00:00
Cristiano Giuffrida	cb176df60f	New RS and new signal handling for system processes. UPDATING INFO: 20100317: /usr/src/etc/system.conf updated to ignore default kernel calls: copy it (or merge it) to /etc/system.conf. The hello driver (/dev/hello) added to the distribution: # cd /usr/src/commands/scripts && make clean install # cd /dev && MAKEDEV hello KERNEL CHANGES: - Generic signal handling support. The kernel no longer assumes PM as a signal manager for every process. The signal manager of a given process can now be specified in its privilege slot. When a signal has to be delivered, the kernel performs the lookup and forwards the signal to the appropriate signal manager. PM is the default signal manager for user processes, RS is the default signal manager for system processes. To enable ptrace()ing for system processes, it is sufficient to change the default signal manager to PM. This will temporarily disable crash recovery, though. - sys_exit() is now split into sys_exit() (i.e. exit() for system processes, which generates a self-termination signal), and sys_clear() (i.e. used by PM to ask the kernel to clear a process slot when a process exits). - Added a new kernel call (i.e. sys_update()) to swap two process slots and implement live update. PM CHANGES: - POSIX signal handling is no longer allowed for system processes. System signals are split into two fixed categories: termination and non-termination signals. When a non-termination signal is processed, PM transforms the signal into an IPC message and delivers the message to the system process. When a termination signal is processed, PM terminates the process. - PM no longer assumes itself as the signal manager for system processes. It now makes sure that every system signal goes through the kernel before actually being processed. The kernel will then dispatch the signal to the appropriate signal manager which may or may not be PM. SYSLIB CHANGES: - Simplified SEF init and LU callbacks. 
- Added additional predefined SEF callbacks to debug crash recovery and live update. - Fixed a temporary ack in the SEF init protocol. SEF init reply is now completely synchronous. - Added SEF signal event type to provide a uniform interface for system processes to deal with signals. A sef_cb_signal_handler() callback is available for system processes to handle every received signal. A sef_cb_signal_manager() callback is used by signal managers to process system signals on behalf of the kernel. - Fixed a few bugs with memory mapping and DS. VM CHANGES: - Page faults and memory requests coming from the kernel are now implemented using signals. - Added a new VM call to swap two process slots and implement live update. - The call is used by RS at update time and in turn invokes the kernel call sys_update(). RS CHANGES: - RS has been reworked with a better functional decomposition. - Better kernel call masks. com.h now defines the set of very basic kernel calls every system service is allowed to use. This makes system.conf simpler and easier to maintain. In addition, this guarantees a higher level of isolation for system libraries that use one or more kernel calls internally (e.g. printf). - RS is the default signal manager for system processes. By default, RS intercepts every signal delivered to every system process. This makes crash recovery possible before bringing PM and friends in the loop. - RS now supports fast rollback when something goes wrong while initializing the new version during a live update. - Live update is now implemented by keeping the two versions side-by-side and swapping the process slots when the old version is ready to update. - Crash recovery is now implemented by keeping the two versions side-by-side and cleaning up the old version only when the recovery process is complete. DS CHANGES: - Fixed a bug when the process doing ds_publish() or ds_delete() is not known by DS. - Fixed the completely broken support for strings. 
String publishing is now implemented in the system library and simply wraps publishing of memory ranges. Ideally, we should adopt a similar approach for other data types as well. - Test suite fixed. DRIVER CHANGES: - The hello driver has been added to the Minix distribution to demonstrate basic live update and crash recovery functionalities. - Other drivers have been adapted to conform to the new SEF interface.	2010-03-17 01:15:29 +00:00


320 commits