minix

Author	SHA1	Message	Date
Erik van der Kouwe	22fa466268	Restore poweroff to some of its former glory (on QEMU, at least)	2012-11-21 20:28:37 +01:00
Arun Thomas	263ec1e885	pm: update for ARM	2012-08-12 23:30:54 +02:00
Arun Thomas	6723dcfab7	Replace MACHINE/CHIP macros with compiler macros	2012-08-06 17:49:22 +02:00
Thomas Veerman	238a9a057b	PM: a few Coverity-inspired fixes . initialize variable to prevent negative array indexing . remove dead code	2012-07-30 09:44:58 +00:00
Ben Gras	cbcdb838f1	various coverity-inspired fixes . some strncpy/strcpy to strlcpy conversions . new <minix/param.h> to avoid including other minix headers that have colliding definitions with library and commands code, causing parse warnings . removed some dead code / assignments	2012-07-16 14:00:56 +02:00
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses are prefixed by an objcopy step with __k_unpaged_, so that such code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. 
Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. 
the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
Ben Gras	5e38c802d8	pm: ignore notify() from unknown sender . avoids annoying error message if e.g. buggy drivers send pm notify()s that pm tries to reply() ENOSYS to	2012-06-14 15:36:38 +02:00
Ben Gras	53002f6f6c	recognize and execute dynamically linked executables . generalize libexec slightly to get some more necessary information from ELF files, e.g. the interpreter . execute dynamically linked executables when exec()ed by VFS . switch to netbsd variant of elf32.h exclusively, solves some conflicting headers	2012-04-16 00:41:42 +00:00
Ben Gras	7336a67dfe	retire PUBLIC, PRIVATE and FORWARD	2012-03-25 21:58:14 +02:00
Ben Gras	6a73e85ad1	retire _PROTOTYPE . only good for obsolete K&R support . also remove a stray ansi.h and the proto cmd	2012-03-25 16:17:10 +02:00
Antoine Leca	3fb8cb760c	More cleaning up	2012-02-15 19:04:58 +00:00
Thomas Veerman	a6d0ee24c3	Use correct value for _NSIG User processes can send signals with number up to _NSIG. There are a few signal numbers above that used by the kernel, but should explicitly not be included in the range or range checks in PM will fail. The system processes use a different version of sigaddset, sigdelset, sigemptyset, sigfillset, and sigismember which does not include a range check on signal numbers (as opposed to the normal functions used by normal processes). This patch unbreaks test37 when the boot image is compiled with GCC/Clang.	2012-01-16 11:42:29 +00:00
David van Moolenbroek	6f374faca5	Add "expected size" parameter to getsysinfo() This patch provides basic protection against damage resulting from differently compiled servers blindly copying tables to one another. In every getsysinfo() call, the caller is provided with the expected size of the requested data structure. The callee fails the call if the expected size does not match the data structure's actual size.	2011-12-11 22:34:14 +01:00
Adriana Szekeres	c30f014a89	gcore command to coredump a process	2011-11-22 22:07:41 +01:00
Ben Gras	c24d15b2db	pm: add mproc table sanity check feature . make procfs check it . detects pm/procfs mismatches . was triggered by ack/clang pm/procfs: add padding to mproc struct to align ack/clang layout to fix this	2011-11-18 17:18:10 +01:00
Arun Thomas	62841e2935	pm: remove dead minix_munmap functions	2011-11-02 18:43:59 +01:00
Arun Thomas	9602f63a72	pm: remove dead function	2011-08-11 17:51:27 +02:00
Arun Thomas	25a790a631	VM and kernel support for ELF	2011-02-26 23:00:55 +00:00
David van Moolenbroek	354da24f5b	make getsysinfo() a system-land call	2010-09-14 21:50:05 +00:00
Cristiano Giuffrida	8cedace2f5	Scheduling parameters out of the kernel.	2010-07-13 15:30:17 +00:00
David van Moolenbroek	895850b8cf	move timers code to libsys	2010-07-09 12:58:18 +00:00
Thomas Veerman	34a2864e27	Fix a few compile time warnings	2010-07-02 12:41:19 +00:00
Erik van der Kouwe	23284ee7bd	User-space scheduling for system processes	2010-07-01 08:32:33 +00:00
Arun Thomas	1bf6d23f34	Make exec() use entry point in a.out header	2010-06-10 14:59:10 +00:00
Arun Thomas	f0a158d8c1	More cleanup to remove MM and FS references	2010-06-10 14:04:46 +00:00
Arun Thomas	4c10a31440	Remove legacy MM, FS, and FS_PROC_NR macros	2010-06-08 13:58:01 +00:00
Cristiano Giuffrida	354d88f883	Put initialization code where it belongs.	2010-06-04 18:08:15 +00:00
Kees van Reeuwijk	ed0b81c25c	Removed some unused variables and functions.	2010-06-02 19:41:38 +00:00
Erik van der Kouwe	1f11a57141	Oops, last commit included more than was intended	2010-05-20 08:07:47 +00:00
Erik van der Kouwe	5f15ec05b2	More system processes, this was not enough for the release script to run on some configurations	2010-05-20 08:05:07 +00:00
Ben Gras	bcdaf033b5	pm - fix sched interaction For coredumping processes, PM forgets to inform SCHED that the process has vanished, causing future fork()s to fail.	2010-05-19 13:22:29 +00:00
Tomas Hruby	b09bcf6779	Scheduling server (by Bjorn Swift) In this second phase, scheduling is moved from PM to its own scheduler (see r6557 for phase one). In the next phase we hope to a) include useful information in the "out of quantum" message and b) create some simple scheduling policy that makes use of that information. When the system starts up, PM will iterate over its process table and ask SCHED to take over scheduling unprivileged processes. This is done by sending a SCHEDULING_START message to SCHED. This message includes the process's endpoint, the parent's endpoint and its nice level. The scheduler adds this process to its schedproc table, issues a schedctl, and returns its own endpoint to PM - as the endpoint of the effective scheduler. When a process terminates, a SCHEDULING_STOP message is sent to the scheduler. The reason for this effective endpoint is for future compatibility. Some day, we may have a scheduler that, instead of scheduling the process itself, forwards the SCHEDULING_START message on to another scheduler. PM has information on who schedules whom. As such, scheduling messages from user-land are sent through PM. An example is when processes change their priority, using nice(). In that case, a getsetpriority message is sent to PM, which then sends a SCHEDULING_SET_NICE to the process's effective scheduler. When a process is forked through PM, it inherits its parent's scheduler, but is spawned with an empty quantum. As before, a request to fork a process flows through VM before returning to PM, which then wakes up the child process. This flow has been modified slightly so that PM notifies the scheduler of the new process, before waking up the child process. If the scheduler fails to take over scheduling, the child process is torn down and the fork fails with an erroneous value. Process priority is entirely decided upon using nice levels. 
PM stores a copy of each process's nice level and when a child is forked, its parent's nice level is sent in the SCHEDULING_START message. How this level is mapped to a priority queue is up to the scheduler. It should be noted that the nice level is used to determine the max_priority and the parent could have been in a lower priority when it was spawned. To prevent a CPU intensive process from hogging the CPU by continuously forking children that get scheduled in the max_priority, the scheduler should determine in which queue the parent is currently scheduled, and schedule the child in that same queue. Other fixes: The USER_Q in kernel/proc.h was incorrectly defined as NR_SCHED_QUEUES/2. That results in an "off by one" error when converting priority->nice->priority for nice=0. This also had the side effect that if someone were to set the MAX_USER_Q to something other than 0, then USER_Q would be off.	2010-05-18 13:39:04 +00:00
Erik van der Kouwe	93f3bf5bda	Fix wrong word	2010-04-28 20:37:08 +00:00
Tomas Hruby	86378ff645	PM remembers what it should schedule - while PM implements fork also for RS it needs to remember what to schedule and what not. PM_SCHEDULED flag serves this purpose. - PM only schedules processes that are descendants of init, i.e. normal user processes - after a process is forked PM schedules for the first time only processes that have PM_SCHEDULED set. The others are handled either by kernel or some other scheduler	2010-04-13 10:45:08 +00:00
Tomas Hruby	9b599bac1d	Quantum in fork - This patch removes the time slice split between parent and child in fork. - The time slice of the parent remains unchanged and the child does not have any. - If the process has a scheduler, the scheduler must assign the quantum and priority of the new process and let it run. - If the child does not inherit a scheduler, it is scheduled by the dummy default kernel policy. (servers, drivers, etc.) - In theory, the scheduler can change the quantum even of the parent process and implement any policy for splitting the quantum as neither the parent nor the child is runnable. Sending the out-of-quantum message on behalf of the processes may look like the right solution, however, the scheduler would probably handle the message before the whole fork protocol is finished. This way the scheduler has absolute control when the process should become runnable.	2010-04-10 15:27:38 +00:00
Cristiano Giuffrida	48c6bb79f4	Driver refactoring for live update and crash recovery. SYSLIB CHANGES: - DS calls to publish / retrieve labels consider endpoints instead of u32_t. VFS CHANGES: - mapdriver() only adds an entry in the dmap table in VFS. - dev_up() is only executed upon reception of a driver up event. INET CHANGES: - INET no longer searches for existing driver instances at startup. - A network driver is (re)initialized upon reception of a driver up event. - Networking startup is now race-free by design. No need to waste 5 seconds at startup any more. DRIVER CHANGES: - Every driver publishes driver up events when starting for the first time or in case of restart when recovery actions must be taken in the upper layers. - Driver up events are published by drivers through DS. - For regular drivers, VFS is normally the only subscriber, but not necessarily. For instance, when the filter driver is in use, it must subscribe to driver up events to initiate recovery. - For network drivers, inet is the only subscriber for now. - Every VFS driver is statically linked with libdriver, every network driver is statically linked with libnetdriver. DRIVER LIBRARIES CHANGES: - Libdriver is extended to provide generic receive() and ds_publish() interfaces for VFS drivers. - driver_receive() is a wrapper for sef_receive() also used in driver_task() to discard spurious messages that were meant to be delivered to a previous version of the driver. - driver_receive_mq() is the same as driver_receive() but integrates support for queued messages. - driver_announce() publishes a driver up event for VFS drivers and marks the driver as initialized and expecting a DEV_OPEN message. - Libnetdriver is introduced to provide similar receive() and ds_publish() interfaces for network drivers (netdriver_announce() and netdriver_receive()). - Network drivers all support live update with no state transfer now. KERNEL CHANGES: - Added kernel call statectl for state management. 
Used by driver_announce() to unblock eventual callers sendrecing to the driver.	2010-04-08 13:41:35 +00:00
Arun Thomas	4ed3a0cf3a	Convert kernel over to bsdmake	2010-04-01 22:22:33 +00:00
Tomas Hruby	5b52c5aa02	A reliable way for userspace to check if a msg is from kernel - IPC_FLG_MSG_FROM_KERNEL status flag is returned to userspace if the receive was satisfied by a message which was sent by the kernel on behalf of a process. This is perfectly reliable information. - MF_SENDING_FROM_KERNEL flag added to processes to be able to set IPC_FLG_MSG_FROM_KERNEL when finishing receive if the receiver wasn't ready to receive immediately. - PM is changed to use this information to confirm that the scheduling messages are indeed from the kernel and not faked by a process. PM uses sef_receive_status() - get_work() is removed from PM to make the changes simpler	2010-03-29 11:25:01 +00:00
Tomas Hruby	b4cf88a04f	Userspace scheduling - contributed by Bjorn Swift - In this first phase, scheduling is moved from the kernel to the PM server. The next steps are to a) move scheduling to its own server and b) include useful information in the "out of quantum" message, so that the scheduler can make use of this information. - The kernel process table now keeps record of who is responsible for scheduling each process (p_scheduler). When this pointer is NULL, the process will be scheduled by the kernel. If such a process runs out of quantum, the kernel will simply renew its quantum and requeue it. - When PM loads, it will take over scheduling of all running processes, except system processes, using sys_schedctl(). Essentially, this only results in taking over init. As children inherit a scheduler from their parent, user space programs forked by init will inherit PM (for now) as their scheduler. - Once a process has been assigned a scheduler, and runs out of quantum, its RTS_NO_QUANTUM flag will be set and the process dequeued. The kernel will send a message to the scheduler, on the process' behalf, informing the scheduler that it has run out of quantum. The scheduler can take whatever action it pleases, based on its policy, and then reschedule the process using the sys_schedule() system call. - Balance queues does not work as before. While the old in-kernel function used to renew the quantum of processes in the highest priority run queue, the user-space implementation only acts on processes that have been bumped down to a lower priority queue. This approach reacts slower to changes than the old one, but saves us sending a sys_schedule message for each process every time we balance the queues. Currently, when processes are moved up a priority queue, their quantum is also renewed, but this can be fiddled with. - do_nice has been removed from kernel. PM answers to get- and setpriority calls, updates its own nice variable as well as the max_run_queue. 
This will be refactored once scheduling is moved to a separate server. We will probably have PM update its local nice value and then send a message to whoever is scheduling the process. - changes to fix an issue in do_fork() where processes could run out of quantum while bypassing the code path that handles it correctly. The future plan is to remove the policy from do_fork() and implement it in userspace too.	2010-03-29 11:07:20 +00:00
Cristiano Giuffrida	cb176df60f	New RS and new signal handling for system processes. UPDATING INFO: 20100317: /usr/src/etc/system.conf updated to ignore default kernel calls: copy it (or merge it) to /etc/system.conf. The hello driver (/dev/hello) added to the distribution: # cd /usr/src/commands/scripts && make clean install # cd /dev && MAKEDEV hello KERNEL CHANGES: - Generic signal handling support. The kernel no longer assumes PM as a signal manager for every process. The signal manager of a given process can now be specified in its privilege slot. When a signal has to be delivered, the kernel performs the lookup and forwards the signal to the appropriate signal manager. PM is the default signal manager for user processes, RS is the default signal manager for system processes. To enable ptrace()ing for system processes, it is sufficient to change the default signal manager to PM. This will temporarily disable crash recovery, though. - sys_exit() is now split into sys_exit() (i.e. exit() for system processes, which generates a self-termination signal), and sys_clear() (i.e. used by PM to ask the kernel to clear a process slot when a process exits). - Added a new kernel call (i.e. sys_update()) to swap two process slots and implement live update. PM CHANGES: - Posix signal handling is no longer allowed for system processes. System signals are split into two fixed categories: termination and non-termination signals. When a non-termination signal is processed, PM transforms the signal into an IPC message and delivers the message to the system process. When a termination signal is processed, PM terminates the process. - PM no longer assumes itself as the signal manager for system processes. It now makes sure that every system signal goes through the kernel before being actually processed. The kernel will then dispatch the signal to the appropriate signal manager which may or may not be PM. SYSLIB CHANGES: - Simplified SEF init and LU callbacks. 
- Added additional predefined SEF callbacks to debug crash recovery and live update. - Fixed a temporary ack in the SEF init protocol. SEF init reply is now completely synchronous. - Added SEF signal event type to provide a uniform interface for system processes to deal with signals. A sef_cb_signal_handler() callback is available for system processes to handle every received signal. A sef_cb_signal_manager() callback is used by signal managers to process system signals on behalf of the kernel. - Fixed a few bugs with memory mapping and DS. VM CHANGES: - Page faults and memory requests coming from the kernel are now implemented using signals. - Added a new VM call to swap two process slots and implement live update. - The call is used by RS at update time and in turn invokes the kernel call sys_update(). RS CHANGES: - RS has been reworked with a better functional decomposition. - Better kernel call masks. com.h now defines the set of very basic kernel calls every system service is allowed to use. This makes system.conf simpler and easier to maintain. In addition, this guarantees a higher level of isolation for system libraries that use one or more kernel calls internally (e.g. printf). - RS is the default signal manager for system processes. By default, RS intercepts every signal delivered to every system process. This makes crash recovery possible before bringing PM and friends in the loop. - RS now supports fast rollback when something goes wrong while initializing the new version during a live update. - Live update is now implemented by keeping the two versions side-by-side and swapping the process slots when the old version is ready to update. - Crash recovery is now implemented by keeping the two versions side-by-side and cleaning up the old version only when the recovery process is complete. DS CHANGES: - Fixed a bug when the process doing ds_publish() or ds_delete() is not known by DS. - Fixed the completely broken support for strings. 
String publishing is now implemented in the system library and simply wraps publishing of memory ranges. Ideally, we should adopt a similar approach for other data types as well. - Test suite fixed. DRIVER CHANGES: - The hello driver has been added to the Minix distribution to demonstrate basic live update and crash recovery functionalities. - Other drivers have been adapted to conform to the new SEF interface.	2010-03-17 01:15:29 +00:00
Arun Thomas	1f9ce647cf	Move archtypes.h, fpu.h, and stackframe.h Move archtypes.h to include/ dir, since several servers require it. Move fpu.h and stackframe.h to arch-specific header directory. Make source files and makefiles aware of the new header locations.	2010-03-09 09:41:14 +00:00
Ben Gras	35a108b911	panic() cleanup. this change - makes panic() variadic, doing full printf() formatting - no more NO_NUM, and no more separate printf() statements needed to print extra info (or something in hex) before panicking - unifies panic() - same panic() name and usage for everyone - vm, kernel and rest have different names/syntax currently in order to implement their own luxuries, but no longer - throws out the 1st argument, to make source less noisy. the panic() in syslib retrieves the server name from the kernel so it should be clear enough who is panicking; e.g. panic("sigaction failed: %d", errno); looks like: at_wini(73130): panic: sigaction failed: 0 syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a - throws out report() - printf() is more convenient and powerful - harmonizes/fixes the use of panic() - there were a few places that used printf-style formatting (didn't work) and newlines (messes up the formatting) in panic() - throws out a few per-server panic() functions - cleans up a tie-in of tty with panic() merging printf() and panic() statements to be done incrementally.	2010-03-05 15:05:11 +00:00
Erik van der Kouwe	310876dcec	Kill processes which ignore signals that should not be ignored	2010-01-31 19:13:20 +00:00
Cristiano Giuffrida	d1fd04e72a	Initialization protocol for system services. SYSLIB CHANGES: - SEF framework now supports a new SEF Init request type from RS. 3 different callbacks are available (init_fresh, init_lu, init_restart) to specify initialization code when a service starts fresh, starts after a live update, or restarts. SYSTEM SERVICE CHANGES: - Initialization code for system services is now enclosed in a callback SEF will automatically call at init time. The return code of the callback will tell RS whether the initialization completed successfully. - Each init callback can access information passed by RS to initialize. As of now, each system service has access to the public entries of RS's system process table to gather all the information required to initialize. This design eliminates many existing or potential races at boot time and provides a uniform initialization interface to system services. The same interface will be reused for the upcoming publish/subscribe model to handle dynamic registration / deregistration of system services. VM CHANGES: - Uniform privilege management for all system services. Every service uses the same call mask format. For boot services, VM copies the call mask from init data. For dynamic services, VM still receives the call mask via rs_set_priv call that will be soon replaced by the upcoming publish/subscribe model. RS CHANGES: - The system process table has been reorganized and split into private entries and public entries. Only the latter ones are exposed to system services. - VM call masks are now entirely configured in rs/table.c - RS has now its own slot in the system process table. Only kernel tasks and user processes not included in the boot image are now left out from the system process table. - RS implements the initialization protocol for system services. - For services in the boot image, RS blocks till initialization is complete and panics when failure is reported back. 
Services are initialized in their order of appearance in the boot image priv table and RS blocks to implement synchronous initialization for every system service having the flag SF_SYNCH_BOOT set. - For services started dynamically, the initialization protocol is implemented as though it were the first ping for the service. In this case, if the system service fails to report back (or reports failure), RS brings the service down rather than trying to restart it.	2010-01-08 01:20:42 +00:00
David van Moolenbroek	ac9ab099c8	General cleanup: - clean up kernel section of minix/com.h somewhat - remove ALLOCMEM and VM_ALLOCMEM calls - remove non-safecopy and minix-vmd support from Inet - remove SYS_VIRVCOPY and SYS_PHYSVCOPY calls - remove obsolete segment encoding in SYS_SAFECOPY* - remove DEVCTL call, svrctl(FSDEVUNMAP), map_driverX - remove declarations of unimplemented svrctl requests - remove everything related to swapping to disk - remove floppysetup.sh - remove traces of rescue device - update DESCRIBE.sh with new devices - some other small changes	2010-01-05 19:39:27 +00:00
Cristiano Giuffrida	1f5841c8ed	Basic System Event Framework (SEF) with ping and live update. SYSLIB CHANGES: - SEF must be used by every system process and is thereby part of the system library. - The framework provides a receive() interface (sef_receive) for system processes to automatically catch known system event messages and process them. - SEF provides a default behavior for each type of system event, but allows system processes to register callbacks to override the default behavior. - Custom (local to the process) or predefined (provided by SEF) callback implementations can be registered to SEF. - SEF currently includes support for 2 types of system events: 1. SEF Ping. The event occurs every time RS sends a ping to figure out whether a system process is still alive. The default callback implementation provided by SEF is to notify RS back to let it know the process is alive and kicking. 2. SEF Live update. The event occurs every time RS sends a prepare to update message to let a system process know an update is available and to prepare for it. The live update support is very basic for now. SEF only deals with verifying if the prepare state can be supported by the process, dumping the state for debugging purposes, and providing an event-driven programming model to the process to react to state changes and check in when ready to update. - SEF should be extended in the future to integrate support for more types of system events. Ideally, all the cross-cutting concerns should be integrated into SEF to avoid duplicating code and ease extensibility. Examples include: * PM notify messages primarily used at shutdown. * SYSTEM notify messages primarily used for signals. * CLOCK notify messages used for system alarms. * Debug messages. IS could still be in charge of fkey handling but would forward the debug message to the target process (e.g. PM, if the user requested debug information about PM). 
SEF would then catch the message and do nothing unless the process has registered an appropriate callback to deal with the event. This simplifies the programming model to print debug information, avoids duplicating code, and reduces the effort to print debug information. SYSTEM PROCESSES CHANGES: - Every system process registers SEF callbacks it needs to override the default system behavior and calls sef_startup() right after being started. - sef_startup() does almost nothing now, but will be extended in the future to support callbacks of its own to let RS control and synchronize with every system process at initialization time. - Every system process calls sef_receive() now rather than receive() directly, to let SEF handle predefined system events. RS CHANGES: - RS supports a basic single-component live update protocol now, as follows: * When an update command is issued (via "service update "), RS notifies the target system process to prepare for a specific update state. If the process doesn't respond back in time, the update is aborted. * When the process responds back, RS kills it and marks it for refreshing. * The process is then automatically restarted as for a buggy process and can start running again. * Live update is currently prototyped as a controlled failure.	2009-12-21 14:12:21 +00:00
Thomas Veerman	958b25be50	- Introduce support for sticky bit. - Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly. - Clean up MFS by removing old, dead code (backwards compatibility is broken by the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all functions have proper banners and prototypes. - VFS should always provide a (syntactically) valid path to the FS; no need for the FS to do sanity checks when leaving/entering mount points. - Fix several bugs in MFS: - Several path lookup bugs in MFS. - A link can be too big for the path buffer. - A mountpoint can become inaccessible when the creation of a new inode fails, because the inode already exists and is a mountpoint. - Introduce support for supplemental groups. - Add test 46 to test supplemental group functionality (and removed obsolete suppl. tests from test 2). - Clean up VFS (not everything is done yet). - ISOFS now opens device read-only. This makes the -r flag in the mount command unnecessary (but will still report to be mounted read-write). - Introduce PipeFS. PipeFS is a new FS that handles all anonymous and named pipes. However, named pipes still reside on the (M)FS, as they are part of the file system on disk. To make this work VFS now has a concept of 'mapped' inodes, which causes read, write, truncate and stat requests to be redirected to the mapped FS, and all other requests to the original FS.	2009-12-20 20:27:14 +00:00
Cristiano Giuffrida	f4574783dc	Rewrite of boot process KERNEL CHANGES: - The kernel only knows about privileges of kernel tasks and the root system process (now RS). - Kernel tasks and the root system process are the only processes that are made schedulable by the kernel at startup. All the other processes in the boot image don't get their privileges set at startup and are inhibited from running by the RTS_NO_PRIV flag. - Removed the assumption on the ordering of processes in the boot image table. System processes can now appear in any order in the boot image table. - Privilege ids can now be assigned either statically or dynamically. The kernel assigns static privilege ids to kernel tasks and the root system process. Each id is directly derived from the process number. - User processes now all share the static privilege id of the root user process (now INIT). - sys_privctl split: we have more calls now to let RS set privileges for system processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS / SYS_PRIV_SET_USER are used to set privileges for a system / user process. - boot image table flags split: PROC_FULLVM is the only flag that has been moved out of the privilege flags and is still maintained in the boot image table. All the other privilege flags are out of the kernel now. RS CHANGES: - RS is the only user-space process that gets to run right after in-kernel startup. - RS uses the boot image table from the kernel and three additional boot image info tables (priv table, sys table, dev table) to complete the initialization of the system. - RS checks that the entries in the priv table match the entries in the boot image table to make sure that every process in the boot image becomes schedulable. - RS only uses static privilege ids to set privileges for system services in the boot image. 
- RS includes basic memory management support to allocate the boot image buffer dynamically during initialization. The buffer shall contain the executable image of all the system services we would like to restart after a crash. - First step towards decoupling between resource provisioning and resource requirements in RS: RS must know what resources it needs to restart a process and what resources it has currently available. This is useful to trade off reliability and resource consumption. When required resources are missing, the process cannot be restarted. In that case, in the future, a system flag will tell RS what to do. For example, if CORE_PROC is set, RS should trigger a system-wide panic because the system can no longer function correctly without a core system process. PM CHANGES: - The process tree built at initialization time is changed to have INIT as root with pid 0, RS as a child of INIT and all the system services as children of RS. This is required to put RS in control of all the system services. - PM no longer registers labels for system services in the boot image. This is now part of RS's initialization process.	2009-12-11 00:08:19 +00:00
David van Moolenbroek	4c263d6002	PM: clean up endpoint info API/ABI	2009-10-31 14:09:28 +00:00
David van Moolenbroek	b423d7b477	Merge of David's ptrace branch. Summary: o Support for ptrace T_ATTACH/T_DETACH and T_SYSCALL o PM signal handling logic should now work properly, even with debuggers being present o Asynchronous PM/VFS protocol, full IPC support for senda(), and AMF_NOREPLY senda() flag DETAILS Process stop and delay call handling of PM: o Added sys_runctl() kernel call with sys_stop() and sys_resume() aliases, for PM to stop and resume a process o Added exception for sending/syscall-traced processes to sys_runctl(), and matching SIGKREADY pseudo-signal to PM o Fixed PM signal logic to deal with requests from a process after stopping it (so-called "delay calls"), using the SIGKREADY facility o Fixed various PM panics due to race conditions with delay calls versus VFS calls o Removed special PRIO_STOP priority value o Added SYS_LOCK RTS kernel flag, to stop an individual process from running while modifying its process structure Signal and debugger handling in PM: o Fixed debugger signals being dropped if a second signal arrives when the debugger has not retrieved the first one o Fixed debugger signals being sent to the debugger more than once o Fixed debugger signals unpausing process in VFS; removed PM_UNPAUSE_TR protocol message o Detached debugger signals from general signal logic and from being blocked on VFS calls, meaning that even VFS can now be traced o Fixed debugger being unable to receive more than one pending signal in one process stop o Fixed signal delivery being delayed needlessly when multiple signals are pending o Fixed wait test for tracer, which was returning for children that were not waited for o Removed second parallel pending call from PM to VFS for any process o Fixed process becoming runnable between exec() and debugger trap o Added support for notifying the debugger before the parent when a debugged child exits o Fixed debugger death causing child to remain stopped forever o Fixed consistently incorrect use of _NSIG 
Extensions to ptrace(): o Added T_ATTACH and T_DETACH ptrace requests, to attach and detach a debugger to and from a process o Added T_SYSCALL ptrace request, to trace system calls o Added T_SETOPT ptrace request, to set trace options o Added TO_TRACEFORK trace option, to attach automatically to children of a traced process o Added TO_ALTEXEC trace option, to send SIGSTOP instead of SIGTRAP upon a successful exec() of the tracee o Extended T_GETUSER ptrace support to allow retrieving a process's priv structure o Removed T_STOP ptrace request again, as it does not help implementing debuggers properly o Added MINIX3-specific ptrace test (test42) o Added proper manual page for ptrace(2) Asynchronous PM/VFS interface: o Fixed asynchronous messages not being checked when receive() is called with an endpoint other than ANY o Added AMF_NOREPLY senda() flag, preventing such messages from satisfying the receive part of a sendrec() o Added asynsend3() that takes optional flags; asynsend() is now a #define passing in 0 as third parameter o Made PM/VFS protocol asynchronous; reintroduced tell_fs() o Made PM_BASE request/reply number range unique o Hacked a horrible temporary workaround into RS to deal with newly revealed RS-PM-VFS race condition triangle until VFS is asynchronous System signal handling: o Fixed shutdown logic of device drivers; removed old SIGKSTOP signal o Removed is-superuser check from PM's do_procstat() (aka getsigset()) o Added sigset macros to allow system processes to deal with the full signal set, rather than just the POSIX subset Miscellaneous PM fixes: o Split do_getset into do_get and do_set, merging common code and making structure clearer o Fixed setpriority() being able to put to sleep processes using an invalid parameter, or revive zombie processes o Made find_proc() global; removed obsolete proc_from_pid() o Cleanup here and there Also included: o Fixed false-positive boot order kernel warning o Removed last traces of old NOTIFY_FROM code 
THINGS OF POSSIBLE INTEREST o It should now be possible to run PM at any priority, even lower than user processes o No assumptions are made about communication speed between PM and VFS, although communication must be FIFO o A debugger will now receive incoming debuggee signals at kill time only; the process may not yet be fully stopped o A first step has been made towards making the SYSTEM task preemptible	2009-09-30 09:57:22 +00:00

119 commits