minix

Author	SHA1	Message	Date
Tomas Hruby	b90c2d7026	rename of mode/context switching functions - this patch only renames schedcheck() to switch_to_user(), cycles_accounting_stop() to context_stop() and restart() to restore_user_context() - the motivation is that since the introduction of schedcheck() it has been abused for many things. It deserves a better name. It should express the fact that from the moment we call the function we are in the process of switching to user. - cycles_accounting_stop() was originally a single purpose function. As this function is called at very convenient places it is used for other things too, e.g. (un)locking the kernel. Thus it deserves a better name too. - using the old name, restart() does not call schedcheck(), however calls to restart are replaced by calls to schedcheck() [switch_to_user] and it calls restart() [restore_user_context]	2010-05-18 13:00:39 +00:00
Ben Gras	a1636b85b7	kernel: new DEBUG_RACE option. try to provoke race conditions between processes. it does this by - making all processes interruptible by running out of quantum - giving all processes a single tick of quantum - picking a random runnable process instead of in order, and from a single pool of runnable processes (no priorities) This together with very high HZ values currently provokes some race conditions seen earlier only when running with SMP.	2010-05-08 18:00:03 +00:00
Tomas Hruby	4f962b4798	A small mini_receive() cleanup - this patch substitutes *xpp for sender to increase readability of mini_receive(). - makes sure that the dequeued sender has p_q_link == NULL and that this condition holds when enqueuing the sender again. - it is a sanity check to make sure that the new sender is not enqueued already. Before this change the dequeued sender's p_q_link may not be NULL and it was only set to NULL when enqueued again.	2010-05-07 11:22:49 +00:00
Tomas Hruby	ec56479675	deadlock() - more info - deadlock() is more verbose in case of a detected deadlock. First, it lists all processes in the deadlock group. Then it prints the proc extra info, not only stack trace and register dump	2010-05-03 17:38:54 +00:00
Cristiano Giuffrida	0f353411d7	Set IPC status code only for RECEIVE	2010-04-26 14:43:59 +00:00
Tomas Hruby	9fdb773cdb	A simpler test whether to use kernel's default scheduling - this is a small addition to the userspace scheduling. proc_kernel_scheduler() tests whether to use the default scheduling policy in kernel. It is true if the process' scheduler is NULL _or_ self. Currently none of the tests was complete.	2010-04-10 15:19:25 +00:00
Tomas Hruby	987b87e2ad	Small fixes - do_sync_ipc() is private - fixed typo in a comment	2010-04-06 11:29:31 +00:00
Tomas Hruby	a774cc832f	do_ipc() rearrangements this patch does not add or change any functionality of do_ipc(), it only makes things a little cleaner (hopefully). Until now do_ipc() was responsible for handling all ipc calls. The catch is that SENDA is fairly different which results in some ugly code like this typecasting and variables naming which does not make much sense for SENDA and makes the code hard to read. result = mini_senda(caller_ptr, (asynmsg_t *)m_ptr, (size_t)src_dst_e); As it is called directly from assembly, the new do_ipc() takes as input values of 3 registers in reg_t variables (it used to be 4, however, bit_map wasn't used so I removed it), does the checks common to all ipc calls and call the appropriate handler either for do_sync_ipc() (all except SENDA) or mini_senda() (for SENDA) while typecasting the reg_t values correctly. As a result, handling SENDA differences in do_sync_ipc() is no more needed. Also the code that uses msg_size variable is improved a little bit. arch_do_syscall() is simplified too.	2010-04-06 11:24:26 +00:00
Kees van Reeuwijk	4865e3f4f9	More use of endpoint_t. Other code cleanup.	2010-03-30 14:07:15 +00:00
Tomas Hruby	62203ec287	NOREC_ENTER and NOREC_RETURN checks removed - the reasons for these checks no longer exist - these checks are problematic on SMP	2010-03-29 11:43:10 +00:00
Tomas Hruby	5b52c5aa02	A reliable way for userspace to check if a msg is from kernel - IPC_FLG_MSG_FROM_KERNEL status flag is returned to userspace if the receive was satisfied by a message which was sent by the kernel on behalf of a process. This is perfectly reliable information. - MF_SENDING_FROM_KERNEL flag added to processes to be able to set IPC_FLG_MSG_FROM_KERNEL when finishing receive if the receiver wasn't ready to receive immediately. - PM is changed to use this information to confirm that the scheduling messages are indeed from the kernel and not faked by a process. PM uses sef_receive_status() - get_work() is removed from PM to make the changes simpler	2010-03-29 11:25:01 +00:00
Tomas Hruby	2521cc6bdf	Slightly faster IPC - there are cycles wasted in the IPC call due to a fairly complicated way of copying messages from userland to kernel. Sometimes this complicated way (generic though) is used even for copying within the kernel address space, sometimes it is used for copying in case _no_ copying is necessary. The goal of this patch is to improve this a little bit. - the places where a copy is from user to kernel use the copy_msg_from_user(), kernel-kernel copies are turned into assignments and BuildNotifyMessage uses the delivery buffers to avoid copying. - copy_msg_from_user() was introduced when removing the system task and is about 2/3 faster than using the current mechanism (phys_copy). It also avoids the PHYS_COPY_CATCH macro. Assignment is also faster and no copy is the fastest ;-) so perhaps there will be some hardly noticeable performance gain besides the clean up.	2010-03-29 11:16:37 +00:00
Tomas Hruby	b4cf88a04f	Userspace scheduling - contributed by Bjorn Swift - In this first phase, scheduling is moved from the kernel to the PM server. The next steps are to a) move scheduling to its own server and b) include useful information in the "out of quantum" message, so that the scheduler can make use of this information. - The kernel process table now keeps record of who is responsible for scheduling each process (p_scheduler). When this pointer is NULL, the process will be scheduled by the kernel. If such a process runs out of quantum, the kernel will simply renew its quantum and requeue it. - When PM loads, it will take over scheduling of all running processes, except system processes, using sys_schedctl(). Essentially, this only results in taking over init. As children inherit a scheduler from their parent, user space programs forked by init will inherit PM (for now) as their scheduler. - Once a process has been assigned a scheduler, and runs out of quantum, its RTS_NO_QUANTUM flag will be set and the process dequeued. The kernel will send a message to the scheduler, on the process' behalf, informing the scheduler that it has run out of quantum. The scheduler can take whatever action it pleases, based on its policy, and then reschedule the process using the sys_schedule() system call. - Balance queues does not work as before. While the old in-kernel function used to renew the quantum of processes in the highest priority run queue, the user-space implementation only acts on processes that have been bumped down to a lower priority queue. This approach reacts slower to changes than the old one, but saves us sending a sys_schedule message for each process every time we balance the queues. Currently, when processes are moved up a priority queue, their quantum is also renewed, but this can be fiddled with. - do_nice has been removed from kernel. PM answers to get- and setpriority calls, updates its own nice variable as well as the max_run_queue. 
This will be refactored once scheduling is moved to a separate server. We will probably have PM update its local nice value and then send a message to whoever is scheduling the process. - changes to fix an issue in do_fork() where processes could run out of quantum but bypassing the code path that handles it correctly. The future plan is to remove the policy from do_fork() and implement it in userspace too.	2010-03-29 11:07:20 +00:00
Tomas Hruby	a3ffc0f7ad	Removed NIL_SYS_PROC and NIL_PROC - NIL_PROC replaced by simple NULLs	2010-03-28 09:54:32 +00:00
Kees van Reeuwijk	98493805fd	Lots of const correctness.	2010-03-27 14:31:00 +00:00
Cristiano Giuffrida	9192dbecc9	Preserve the order of IPC messages between two parties. Currently a sequence of messages between a sender A and a receiver B of the form: A.asynsend(M1, B); A.send(M2, B) may result in the receiver receiving M1 first and then M2 or vice versa. This patch makes sure that the original order M1, M2 is always preserved. Note that the order of a hypothetical sequence A.asynsend(M1, B); A.asynsend(M2, B) is already guaranteed by the implementation of asynsend by design. Other senda-based wrappers can define their own semantics.	2010-03-27 00:09:22 +00:00
Cristiano Giuffrida	bde2109b7c	IPC status code for receive(). IPC changes: - receive() is changed to take an additional parameter, which is a pointer to a status code. - The status code is filled in by the kernel to provide additional information to the caller. For now, the kernel only fills in the IPC call used by the sender. Syslib changes: - sef_receive() has been split into sef_receive() (with the original semantics) and sef_receive_status() which exposes the status code to userland. - Ideally, every sys process should gradually switch to sef_receive_status() and use is_ipc_notify() as a dependable way to check for notify. - SEF has been modified to use is_ipc_notify() and demonstrate how to use the new status code.	2010-03-23 00:09:11 +00:00
Ben Gras	0937d6c367	re-establish kernel assert()s. use the regular <assert.h> assert() instead of vmassert() in kernel. throw out some #if 0 code. fix a few assert() conditions. enable by default.	2010-03-10 13:00:05 +00:00
Ben Gras	35a108b911	panic() cleanup. this change - makes panic() variadic, doing full printf() formatting - no more NO_NUM, and no more separate printf() statements needed to print extra info (or something in hex) before panicking - unifies panic() - same panic() name and usage for everyone - vm, kernel and rest have different names/syntax currently in order to implement their own luxuries, but no longer - throws out the 1st argument, to make source less noisy. the panic() in syslib retrieves the server name from the kernel so it should be clear enough who is panicking; e.g. panic("sigaction failed: %d", errno); looks like: at_wini(73130): panic: sigaction failed: 0 syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a - throws out report() - printf() is more convenient and powerful - harmonizes/fixes the use of panic() - there were a few places that used printf-style formatting (didn't work) and newlines (messes up the formatting) in panic() - throws out a few per-server panic() functions - cleans up a tie-in of tty with panic() merging printf() and panic() statements to be done incrementally.	2010-03-05 15:05:11 +00:00
Ben Gras	e6cb76a2e2	no more kprintf - kernel uses libsys printf now, only kputc is special to the kernel.	2010-03-03 15:45:01 +00:00
Ben Gras	18924ea563	New P_BLOCKEDON for kernel - a macro that encodes the "who is this process waiting for" logic, which is duplicated a few times in the kernel. (For a new feature for top.) Introducing it and throwing out ESRCDIED and EDSTDIED (replaced by EDEADSRCDST - so we don't have to care which part of the blocking is failing in system.c) simplifies some code in the kernel and callers that check for E{DEADSRCDST,ESRCDIED,EDSTDIED}, but don't care about the difference, a fair bit, and more significantly doesn't duplicate the 'blocked-on' logic.	2010-03-03 15:32:26 +00:00
Kees van Reeuwijk	1ba0936619	Fix some uses of uninitialized variables.	2010-02-19 10:41:02 +00:00
Kees van Reeuwijk	97c169b93a	Remove some unused #include. Remove some unused variables and computations on them.	2010-02-17 20:24:42 +00:00
Tomas Hruby	1b56fdb33c	Time accounting based on TSC - as there are still KERNEL and IDLE entries, time accounting for kernel and idle time works the same as for any other process - every time we stop accounting for the currently running process, kernel or idle, we read the TSC counter and increment the p_cycles entry. - the process cycles inherently include some of the kernel cycles as we can stop accounting for the process only after we save its context and we start accounting just before we restore its context - this assumes that the system does not scale the CPU frequency which will be true for ... long time ;-)	2010-02-10 15:36:54 +00:00
Tomas Hruby	c9da61022b	intr_disabled() tests removed - we don't need to test this in kernel as we always have interrupts disabled - if interrupts are enabled in kernel, it is only at very carefully chosen places. There are no such places now.	2010-02-09 15:29:58 +00:00
Tomas Hruby	c6fec6866f	No locking in kernel code - No locking in RTS_(UN)SET macros - No lock_notify() - Removed unused lock_send() - No lock/unlock macros anymore	2010-02-09 15:26:58 +00:00
Tomas Hruby	728f0f0c49	Removal of the system task * Userspace change to use the new kernel calls - _taskcall(SYSTASK...) changed to _kernel_call(...) - int 32 reused for the kernel calls - _do_kernel_call() to make the trap to kernel - kernel_call() to make the actual kernel call from C using _do_kernel_call() - unlike ipc call the kernel call always succeeds as kernel is always available, however, kernel may return an error * Kernel side implementation of kernel calls - the SYSTEM task does not run, only the proc table entry is preserved - every data_copy(SYSTEM is now data_copy(KERNEL - "locking" is an empty operation now as everything runs in kernel - sys_task() is replaced by kernel_call() which copies the message into kernel, dispatches the call to its handler and finishes by either copying the results back to userspace (if need be) or by suspending the process because of VM - suspended processes are later made runnable once the memory issue is resolved, picked up by the scheduler and only at this time the call is resumed (in fact restarted) which does not need to copy the message from userspace as the message is already saved in the process structure. - no need for the vmrestart queue, the scheduler will restart the system calls - no special case in do_vmctl(), all requests remove the RTS_VMREQUEST flag	2010-02-09 15:20:09 +00:00
Tomas Hruby	ad9ba944d1	Early address space switch - switch_address_space() implements a switch of the user address space for the destination process - this makes memory of this process easily accessible, e.g. a pointer valid in the userspace can be used with a little complexity to access the process's memory - the switch does not happen only just before we return to userspace, however, it happens right after we know which process we are going to schedule. This happens before we start processing the misc flags of this process so its memory is available - if the process becomes not runnable while processing the misc flags we pick a new process and we switch the address space again which introduces possibly a little bit more overhead, however, it is hopefully hidden by reducing the overheads when we actually access the memory	2010-02-09 15:13:52 +00:00
Tomas Hruby	b14a86ca5c	Sys calls are called ipc calls now - the syscalls are pretty much just ipc calls, however, sendrec() is used to implement system task (sys) calls - sendrec() won't be used anymore for this, therefore ipc calls will become pure ipc calls	2010-02-09 15:13:07 +00:00
Kees van Reeuwijk	c8a11b5453	Fixed some type inconsistencies in the kernel.	2010-01-26 12:26:06 +00:00
Kees van Reeuwijk	b67f788eea	Removed a number of useless #includes	2010-01-26 10:59:01 +00:00
Tomas Hruby	0cfbe936ce	Removed bunch of unused variables in kernel/proc.c	2010-01-22 16:14:57 +00:00
Kees van Reeuwijk	9d247900c0	Remove obsolete m_ptr calculations in try_one() and mini_senda().	2010-01-14 12:04:24 +00:00
Cristiano Giuffrida	b4d6d9db26	Fix bug in IPC deadlock detection code. The old deadlock code was misplaced and unable to deal with asynchronous IPC primitives (notify and senda) effectively. As an example, the following sequence of messages allowed the deadlock detection code to trigger a false positive: 1. A.notify(B) 2. A.receive(B) 3. B.receive(A) 1. B.notify(A) The solution is to run the deadlock detection routine only when a process is about to block in mini_send() or mini_receive().	2009-12-16 23:32:08 +00:00
Cristiano Giuffrida	f4574783dc	Rewrite of boot process KERNEL CHANGES: - The kernel only knows about privileges of kernel tasks and the root system process (now RS). - Kernel tasks and the root system process are the only processes that are made schedulable by the kernel at startup. All the other processes in the boot image don't get their privileges set at startup and are inhibited from running by the RTS_NO_PRIV flag. - Removed the assumption on the ordering of processes in the boot image table. System processes can now appear in any order in the boot image table. - Privilege ids can now be assigned both statically or dynamically. The kernel assigns static privilege ids to kernel tasks and the root system process. Each id is directly derived from the process number. - User processes now all share the static privilege id of the root user process (now INIT). - sys_privctl split: we have more calls now to let RS set privileges for system processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS / SYS_PRIV_SET_USER are used to set privileges for a system / user process. - boot image table flags split: PROC_FULLVM is the only flag that has been moved out of the privilege flags and is still maintained in the boot image table. All the other privilege flags are out of the kernel now. RS CHANGES: - RS is the only user-space process who gets to run right after in-kernel startup. - RS uses the boot image table from the kernel and three additional boot image info table (priv table, sys table, dev table) to complete the initialization of the system. - RS checks that the entries in the priv table match the entries in the boot image table to make sure that every process in the boot image gets schedulable. - RS only uses static privilege ids to set privileges for system services in the boot image. 
- RS includes basic memory management support to allocate the boot image buffer dynamically during initialization. The buffer shall contain the executable image of all the system services we would like to restart after a crash. - First step towards decoupling between resource provisioning and resource requirements in RS: RS must know what resources it needs to restart a process and what resources it has currently available. This is useful to tradeoff reliability and resource consumption. When required resources are missing, the process cannot be restarted. In that case, in the future, a system flag will tell RS what to do. For example, if CORE_PROC is set, RS should trigger a system-wide panic because the system can no longer function correctly without a core system process. PM CHANGES: - The process tree built at initialization time is changed to have INIT as root with pid 0, RS child of INIT and all the system services children of RS. This is required to make RS in control of all the system services. - PM no longer registers labels for system services in the boot image. This is now part of RS's initialization process.	2009-12-11 00:08:19 +00:00
David van Moolenbroek	fce9fd4b4e	Add 'getidle' CPU utilization measurement infrastructure	2009-12-02 11:52:26 +00:00
David van Moolenbroek	6c6e1db676	Kernel: fix faulty trap check	2009-11-28 13:15:07 +00:00
Tomas Hruby	cb9faaebfd	No need for a special idle queue - as the idle task is never placed on any run queue, we don't need any special idle queue. - one more queue available for user processes	2009-11-12 08:47:25 +00:00
Tomas Hruby	ad4dcaab71	Idle task never runs - idle task becomes a pseudo task which is never scheduled. It is never put on any run queue and never enters userspace. An entry for this task still remains in the process table for time accounting - Instead of panicking if there is no process to schedule, pick_proc() returns NULL which is a signal to put the cpu in an idle state and set everything in such a way that after receiving an interrupt it looks like idle task was preempted - idle task is set non-preemptible to avoid handling in the timer interrupt code which makes userspace scheduling simpler as idle task does not need to be handled as a special case.	2009-11-12 08:42:18 +00:00
Tomas Hruby	a972f4bacc	All macros defining rts flags are prefixed with RTS_ - macros used with RTS_SET group of macros to define struct proc p_rts_flags are now prefixed with RTS_ to make things clear	2009-11-10 09:11:13 +00:00
Tomas Hruby	daf7940c69	pick_proc() called only just before returning to userspace - new proc_is_runnable() macro to test whether process is runnable. All tests whether p_rts_flags == 0 converted to use this macro - pick_proc() calls removed from enqueue() and dequeue() - removed the test for recursive calls from pick_proc() as it certainly cannot be called recursively now - PREEMPTED flag to mark processes that were preempted by enqueueing a higher priority process in enqueue() - enqueue_head() to enqueue PREEMPTED processes again at the head of their current priority queue - NO_QUANTUM flag to block and dequeue processes preempted by timer tick with exceeded quantum. They need to be enqueued again in schedcheck() - next_ptr global variable removed	2009-11-09 17:48:31 +00:00
Tomas Hruby	ae75f9d4e5	Removal of the executable flag from files that cannot be executed - 755 -> 644	2009-11-09 10:26:00 +00:00
Tomas Hruby	ebbce7507b	Complete overhaul of mode switching code - after a trap to kernel, the code automatically switches to kernel stack, in the future local to the CPU - k_reenter variable replaced by a test whether the CS is kernel cs or not. The information is passed further if needed. Removes a global variable which would need to be cpu local - no need for global variables describing the exception or trap context. This information is kept on stack and a pointer to this structure is passed to the C code as a single structure - removed loadedcr3 variable and its use replaced by reading the %cr3 register - no need to redisable interrupts in restart() as they are already disabled. - unified handling of traps that push and don't push errorcode - removed save() function as the process context is not saved directly to process table but saved as required by the trap code. Essentially it means that save() code is inlined everywhere not only in the exception handling routine - returning from syscall is more arch independent - it sets the retger in C - top of the x86 stack contains the current CPU id and pointer to the currently scheduled process (the one right interrupted) so the mode switch code can find where to save the context without need to use proc_ptr which will be cpu local in the future and therefore difficult to access in assembler and expensive to access in general - some more clean up of level0 code. No need to read-back the argument passed in %eax from the proc structure. The mode switch code does not clobber the general registers and hence we can just call what is in %eax - many assembly macros in sconst.h as they will be reused by the apic assembly	2009-11-06 09:08:26 +00:00
Tomas Hruby	f2a1f21a39	Clock task split - preemption handled in the clock timer interrupt handler, not in the clock task - more architecture independent clock timer handling code - smp ready as each CPU can have its own timer	2009-11-06 09:04:15 +00:00
Ben Gras	fe35879325	- panic if there's no runnable process - more basic sanity check before recursive enter check (data segment) - try to jump to boot monitor instantly on recursive panic	2009-10-03 11:30:35 +00:00
David van Moolenbroek	b423d7b477	Merge of David's ptrace branch. Summary: o Support for ptrace T_ATTACH/T_DETACH and T_SYSCALL o PM signal handling logic should now work properly, even with debuggers being present o Asynchronous PM/VFS protocol, full IPC support for senda(), and AMF_NOREPLY senda() flag DETAILS Process stop and delay call handling of PM: o Added sys_runctl() kernel call with sys_stop() and sys_resume() aliases, for PM to stop and resume a process o Added exception for sending/syscall-traced processes to sys_runctl(), and matching SIGKREADY pseudo-signal to PM o Fixed PM signal logic to deal with requests from a process after stopping it (so-called "delay calls"), using the SIGKREADY facility o Fixed various PM panics due to race conditions with delay calls versus VFS calls o Removed special PRIO_STOP priority value o Added SYS_LOCK RTS kernel flag, to stop an individual process from running while modifying its process structure Signal and debugger handling in PM: o Fixed debugger signals being dropped if a second signal arrives when the debugger has not retrieved the first one o Fixed debugger signals being sent to the debugger more than once o Fixed debugger signals unpausing process in VFS; removed PM_UNPAUSE_TR protocol message o Detached debugger signals from general signal logic and from being blocked on VFS calls, meaning that even VFS can now be traced o Fixed debugger being unable to receive more than one pending signal in one process stop o Fixed signal delivery being delayed needlessly when multiple signals are pending o Fixed wait test for tracer, which was returning for children that were not waited for o Removed second parallel pending call from PM to VFS for any process o Fixed process becoming runnable between exec() and debugger trap o Added support for notifying the debugger before the parent when a debugged child exits o Fixed debugger death causing child to remain stopped forever o Fixed consistently incorrect use of _NSIG 
Extensions to ptrace(): o Added T_ATTACH and T_DETACH ptrace request, to attach and detach a debugger to and from a process o Added T_SYSCALL ptrace request, to trace system calls o Added T_SETOPT ptrace request, to set trace options o Added TO_TRACEFORK trace option, to attach automatically to children of a traced process o Added TO_ALTEXEC trace option, to send SIGSTOP instead of SIGTRAP upon a successful exec() of the tracee o Extended T_GETUSER ptrace support to allow retrieving a process's priv structure o Removed T_STOP ptrace request again, as it does not help implementing debuggers properly o Added MINIX3-specific ptrace test (test42) o Added proper manual page for ptrace(2) Asynchronous PM/VFS interface: o Fixed asynchronous messages not being checked when receive() is called with an endpoint other than ANY o Added AMF_NOREPLY senda() flag, preventing such messages from satisfying the receive part of a sendrec() o Added asynsend3() that takes optional flags; asynsend() is now a #define passing in 0 as third parameter o Made PM/VFS protocol asynchronous; reintroduced tell_fs() o Made PM_BASE request/reply number range unique o Hacked in a horrible temporary workaround into RS to deal with newly revealed RS-PM-VFS race condition triangle until VFS is asynchronous System signal handling: o Fixed shutdown logic of device drivers; removed old SIGKSTOP signal o Removed is-superuser check from PM's do_procstat() (aka getsigset()) o Added sigset macros to allow system processes to deal with the full signal set, rather than just the POSIX subset Miscellaneous PM fixes: o Split do_getset into do_get and do_set, merging common code and making structure clearer o Fixed setpriority() being able to put to sleep processes using an invalid parameter, or revive zombie processes o Made find_proc() global; removed obsolete proc_from_pid() o Cleanup here and there Also included: o Fixed false-positive boot order kernel warning o Removed last traces of old NOTIFY_FROM code 
THINGS OF POSSIBLE INTEREST o It should now be possible to run PM at any priority, even lower than user processes o No assumptions are made about communication speed between PM and VFS, although communication must be FIFO o A debugger will now receive incoming debuggee signals at kill time only; the process may not yet be fully stopped o A first step has been made towards making the SYSTEM task preemptible	2009-09-30 09:57:22 +00:00
Ben Gras	bcd7d04203	throw out FIXME reminders for release	2009-09-30 07:40:34 +00:00
Ben Gras	cd8b915ed9	Primary goal for these changes is: - no longer have kernel have its own page table that is loaded on every kernel entry (trap, interrupt, exception). the primary purpose is to reduce the number of required reloads. Result: - kernel can only access memory of process that was running when kernel was entered - kernel must be mapped into every process page table, so traps to kernel keep working Problem: - kernel must often access memory of arbitrary processes (e.g. send arbitrary processes messages); this can't happen directly any more; usually because that process' page table isn't loaded at all, sometimes because that memory isn't mapped in at all, sometimes because it isn't mapped in read-write. So: - kernel must be able to map in memory of any process, in its own address space. Implementation: - VM and kernel share a range of memory in which addresses of all page tables of all processes are available. This has two purposes: . Kernel has to know what data to copy in order to map in a range . Kernel has to know where to write the data in order to map it in That last point is because kernel has to write in the currently loaded page table. - Processes and kernel are separated through segments; kernel segments haven't changed. - The kernel keeps the process whose page table is currently loaded in 'ptproc.' - If it wants to map in a range of memory, it writes the value of the page directory entry for that range into the page directory entry in the currently loaded map. There is a slot reserved for such purposes. The kernel can then access this memory directly. - In order to do this, its segment has been increased (and the segments of processes start where it ends). - In the pagefault handler, detect if the kernel is doing 'trappable' memory access (i.e. 
a pagefault isn't a fatal error) and if so, - set the saved instruction pointer to phys_copy_fault, breaking out of phys_copy - set the saved eax register to the address of the page fault, both for sanity checking and for checking in which of the two ranges that phys_copy was called with the fault occured - Some boot-time processes do not have their own page table, and are mapped in with the kernel, and separated with segments. The kernel detects this using HASPT. If such a process has to be scheduled, any page table will work and no page table switch is done. Major changes in kernel are - When accessing user processes memory, kernel no longer explicitly checks before it does so if that memory is OK. It simply makes the mapping (if necessary), tries to do the operation, and traps the pagefault if that memory isn't present; if that happens, the copy function returns EFAULT. So all of the CHECKRANGE_OR_SUSPEND macros are gone. - Kernel no longer has to copy/read and parse page tables. - A message copying optimisation: when messages are copied, and the recipient isn't mapped in, they are copied into a buffer in the kernel. This is done in QueueMess. The next time the recipient is scheduled, this message is copied into its memory. This happens in schedcheck(). This eliminates the mapping/copying step for messages, and makes it easier to deliver messages. This eliminates soft_notify. - Kernel no longer creates a page table at all, so the vm_setbuf and pagetable writing in memory.c is gone. Minor changes in kernel are - ipc_stats thrown out, wasn't used - misc flags all renamed to MF_* - NOREC_* macros to enter and leave functions that should not be called recursively; just sanity checks really - code to fully decode segment selectors and descriptors to print on exceptions - lots of vmassert()s added, only executed if DEBUG_VMASSERT is 1	2009-09-21 14:31:52 +00:00
David van Moolenbroek	b8b8f537bd	IPC privileges fixes Kernel: o Remove s_ipc_sendrec, instead using s_ipc_to for all send primitives o Centralize s_ipc_to bit manipulation, - disallowing assignment of bits pointing to unused priv structs; - preventing send-to-self by not setting bit for own priv struct; - preserving send mask matrix symmetry in all cases o Add IPC send mask checks to SENDA, which were missing entirely somehow o Slightly improve IPC stats accounting for SENDA o Remove SYSTEM from user processes' send mask o Half-fix the dependency between boot image order and process numbers, - correcting the table order of the boot processes; - documenting the order requirement needed for proper send masks; - warning at boot time if the order is violated RS: o Add support in /etc/drivers.conf for servers that talk to user processes, - disallowing IPC to user processes if no "ipc" field is present - adding a special "USER" label to explicitly allow IPC to user processes o Always apply IPC masks when specified; remove -i flag from service(8) o Use kernel send mask symmetry to delay adding IPC permissions for labels that do not exist yet, adding them to that label's process upon creation o Add VM to ipc permissions list for rtl8139 and fxp in drivers.conf Left to future fixes: o Removal of the table order vs process numbers dependency altogether, possibly using per-process send list structures as used for SYSTEM calls o Proper assignment of send masks to boot processes; some of the assigned (~0) masks are much wider than necessary o Proper assignment of IPC send masks for many more servers in drivers.conf o Removal of the debugging warning about the now legitimate case where RS's add_forward_ipc cannot find the IPC destination's label yet	2009-07-02 16:25:31 +00:00
Ben Gras	e0f3a5acf1	- enable ipc warnings by default - ipc checking code in kernel didn't properly catch the sendrec() to self case; added special case check - triggered by PM using stock panic() - needs its own _exit() reported by Joren l'Ami.	2009-04-17 13:46:37 +00:00


130 commits