minix/minix/kernel/priv.h

107 lines
4.2 KiB
C
Raw Normal View History

2005-07-14 17:26:26 +02:00
#ifndef PRIV_H
#define PRIV_H
/* Declaration of the system privileges structure. It defines flags, system
* call masks, an synchronous alarm timer, I/O privileges, pending hardware
* interrupts and notifications, and so on.
* System processes each get their own structure with properties, whereas all
* user processes share one structure. This setup provides a clear separation
* between common and privileged process fields and is very space efficient.
*
* Changes:
Rewrite of boot process KERNEL CHANGES: - The kernel only knows about privileges of kernel tasks and the root system process (now RS). - Kernel tasks and the root system process are the only processes that are made schedulable by the kernel at startup. All the other processes in the boot image don't get their privileges set at startup and are inhibited from running by the RTS_NO_PRIV flag. - Removed the assumption on the ordering of processes in the boot image table. System processes can now appear in any order in the boot image table. - Privilege ids can now be assigned both statically or dynamically. The kernel assigns static privilege ids to kernel tasks and the root system process. Each id is directly derived from the process number. - User processes now all share the static privilege id of the root user process (now INIT). - sys_privctl split: we have more calls now to let RS set privileges for system processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS / SYS_PRIV_SET_USER are used to set privileges for a system / user process. - boot image table flags split: PROC_FULLVM is the only flag that has been moved out of the privilege flags and is still maintained in the boot image table. All the other privilege flags are out of the kernel now. RS CHANGES: - RS is the only user-space process who gets to run right after in-kernel startup. - RS uses the boot image table from the kernel and three additional boot image info table (priv table, sys table, dev table) to complete the initialization of the system. - RS checks that the entries in the priv table match the entries in the boot image table to make sure that every process in the boot image gets schedulable. - RS only uses static privilege ids to set privileges for system services in the boot image. - RS includes basic memory management support to allocate the boot image buffer dynamically during initialization. The buffer shall contain the executable image of all the system services we would like to restart after a crash. - First step towards decoupling between resource provisioning and resource requirements in RS: RS must know what resources it needs to restart a process and what resources it has currently available. This is useful to tradeoff reliability and resource consumption. When required resources are missing, the process cannot be restarted. In that case, in the future, a system flag will tell RS what to do. For example, if CORE_PROC is set, RS should trigger a system-wide panic because the system can no longer function correctly without a core system process. PM CHANGES: - The process tree built at initialization time is changed to have INIT as root with pid 0, RS child of INIT and all the system services children of RS. This is required to make RS in control of all the system services. - PM no longer registers labels for system services in the boot image. This is now part of RS's initialization process.
2009-12-11 01:08:19 +01:00
* Nov 22, 2009 rewrite of privilege management (Cristiano Giuffrida)
* Jul 01, 2005 Created. (Jorrit N. Herder)
2005-07-14 17:26:26 +02:00
*/
#include <minix/com.h>
#include <minix/const.h>
#include <minix/priv.h>
#include "kernel/const.h"
#include "kernel/type.h"
#include "kernel/ipc_filter.h"
2005-07-14 17:26:26 +02:00
struct priv {
proc_nr_t s_proc_nr; /* number of associated process */
sys_id_t s_id; /* index of this system structure */
short s_flags; /* PREEMTIBLE, BILLABLE, etc. */
int s_init_flags; /* initialization flags given to the process. */
2005-07-14 17:26:26 +02:00
/* Asynchronous sends */
vir_bytes s_asyntab; /* addr. of table in process' address space */
size_t s_asynsize; /* number of elements in table. 0 when not in
* use
*/
endpoint_t s_asynendpoint; /* the endpoint the asyn table belongs to. */
short s_trap_mask; /* allowed system call traps */
sys_map_t s_ipc_to; /* allowed destination processes */
/* allowed kernel calls */
New RS and new signal handling for system processes. UPDATING INFO: 20100317: /usr/src/etc/system.conf updated to ignore default kernel calls: copy it (or merge it) to /etc/system.conf. The hello driver (/dev/hello) added to the distribution: # cd /usr/src/commands/scripts && make clean install # cd /dev && MAKEDEV hello KERNEL CHANGES: - Generic signal handling support. The kernel no longer assumes PM as a signal manager for every process. The signal manager of a given process can now be specified in its privilege slot. When a signal has to be delivered, the kernel performs the lookup and forwards the signal to the appropriate signal manager. PM is the default signal manager for user processes, RS is the default signal manager for system processes. To enable ptrace()ing for system processes, it is sufficient to change the default signal manager to PM. This will temporarily disable crash recovery, though. - sys_exit() is now split into sys_exit() (i.e. exit() for system processes, which generates a self-termination signal), and sys_clear() (i.e. used by PM to ask the kernel to clear a process slot when a process exits). - Added a new kernel call (i.e. sys_update()) to swap two process slots and implement live update. PM CHANGES: - Posix signal handling is no longer allowed for system processes. System signals are split into two fixed categories: termination and non-termination signals. When a non-termination signaled is processed, PM transforms the signal into an IPC message and delivers the message to the system process. When a termination signal is processed, PM terminates the process. - PM no longer assumes itself as the signal manager for system processes. It now makes sure that every system signal goes through the kernel before being actually processes. The kernel will then dispatch the signal to the appropriate signal manager which may or may not be PM. SYSLIB CHANGES: - Simplified SEF init and LU callbacks. - Added additional predefined SEF callbacks to debug crash recovery and live update. - Fixed a temporary ack in the SEF init protocol. SEF init reply is now completely synchronous. - Added SEF signal event type to provide a uniform interface for system processes to deal with signals. A sef_cb_signal_handler() callback is available for system processes to handle every received signal. A sef_cb_signal_manager() callback is used by signal managers to process system signals on behalf of the kernel. - Fixed a few bugs with memory mapping and DS. VM CHANGES: - Page faults and memory requests coming from the kernel are now implemented using signals. - Added a new VM call to swap two process slots and implement live update. - The call is used by RS at update time and in turn invokes the kernel call sys_update(). RS CHANGES: - RS has been reworked with a better functional decomposition. - Better kernel call masks. com.h now defines the set of very basic kernel calls every system service is allowed to use. This makes system.conf simpler and easier to maintain. In addition, this guarantees a higher level of isolation for system libraries that use one or more kernel calls internally (e.g. printf). - RS is the default signal manager for system processes. By default, RS intercepts every signal delivered to every system process. This makes crash recovery possible before bringing PM and friends in the loop. - RS now supports fast rollback when something goes wrong while initializing the new version during a live update. - Live update is now implemented by keeping the two versions side-by-side and swapping the process slots when the old version is ready to update. - Crash recovery is now implemented by keeping the two versions side-by-side and cleaning up the old version only when the recovery process is complete. DS CHANGES: - Fixed a bug when the process doing ds_publish() or ds_delete() is not known by DS. - Fixed the completely broken support for strings. String publishing is now implemented in the system library and simply wraps publishing of memory ranges. Ideally, we should adopt a similar approach for other data types as well. - Test suite fixed. DRIVER CHANGES: - The hello driver has been added to the Minix distribution to demonstrate basic live update and crash recovery functionalities. - Other drivers have been adapted to conform the new SEF interface.
2010-03-17 02:15:29 +01:00
bitchunk_t s_k_call_mask[SYS_CALL_MASK_SIZE];
2005-07-14 17:26:26 +02:00
New RS and new signal handling for system processes. UPDATING INFO: 20100317: /usr/src/etc/system.conf updated to ignore default kernel calls: copy it (or merge it) to /etc/system.conf. The hello driver (/dev/hello) added to the distribution: # cd /usr/src/commands/scripts && make clean install # cd /dev && MAKEDEV hello KERNEL CHANGES: - Generic signal handling support. The kernel no longer assumes PM as a signal manager for every process. The signal manager of a given process can now be specified in its privilege slot. When a signal has to be delivered, the kernel performs the lookup and forwards the signal to the appropriate signal manager. PM is the default signal manager for user processes, RS is the default signal manager for system processes. To enable ptrace()ing for system processes, it is sufficient to change the default signal manager to PM. This will temporarily disable crash recovery, though. - sys_exit() is now split into sys_exit() (i.e. exit() for system processes, which generates a self-termination signal), and sys_clear() (i.e. used by PM to ask the kernel to clear a process slot when a process exits). - Added a new kernel call (i.e. sys_update()) to swap two process slots and implement live update. PM CHANGES: - Posix signal handling is no longer allowed for system processes. System signals are split into two fixed categories: termination and non-termination signals. When a non-termination signaled is processed, PM transforms the signal into an IPC message and delivers the message to the system process. When a termination signal is processed, PM terminates the process. - PM no longer assumes itself as the signal manager for system processes. It now makes sure that every system signal goes through the kernel before being actually processes. The kernel will then dispatch the signal to the appropriate signal manager which may or may not be PM. SYSLIB CHANGES: - Simplified SEF init and LU callbacks. - Added additional predefined SEF callbacks to debug crash recovery and live update. - Fixed a temporary ack in the SEF init protocol. SEF init reply is now completely synchronous. - Added SEF signal event type to provide a uniform interface for system processes to deal with signals. A sef_cb_signal_handler() callback is available for system processes to handle every received signal. A sef_cb_signal_manager() callback is used by signal managers to process system signals on behalf of the kernel. - Fixed a few bugs with memory mapping and DS. VM CHANGES: - Page faults and memory requests coming from the kernel are now implemented using signals. - Added a new VM call to swap two process slots and implement live update. - The call is used by RS at update time and in turn invokes the kernel call sys_update(). RS CHANGES: - RS has been reworked with a better functional decomposition. - Better kernel call masks. com.h now defines the set of very basic kernel calls every system service is allowed to use. This makes system.conf simpler and easier to maintain. In addition, this guarantees a higher level of isolation for system libraries that use one or more kernel calls internally (e.g. printf). - RS is the default signal manager for system processes. By default, RS intercepts every signal delivered to every system process. This makes crash recovery possible before bringing PM and friends in the loop. - RS now supports fast rollback when something goes wrong while initializing the new version during a live update. - Live update is now implemented by keeping the two versions side-by-side and swapping the process slots when the old version is ready to update. - Crash recovery is now implemented by keeping the two versions side-by-side and cleaning up the old version only when the recovery process is complete. DS CHANGES: - Fixed a bug when the process doing ds_publish() or ds_delete() is not known by DS. - Fixed the completely broken support for strings. String publishing is now implemented in the system library and simply wraps publishing of memory ranges. Ideally, we should adopt a similar approach for other data types as well. - Test suite fixed. DRIVER CHANGES: - The hello driver has been added to the Minix distribution to demonstrate basic live update and crash recovery functionalities. - Other drivers have been adapted to conform the new SEF interface.
2010-03-17 02:15:29 +01:00
endpoint_t s_sig_mgr; /* signal manager for system signals */
2010-07-07 00:05:21 +02:00
endpoint_t s_bak_sig_mgr; /* backup signal manager for system signals */
2005-07-14 17:26:26 +02:00
sys_map_t s_notify_pending; /* bit map with pending notifications */
sys_map_t s_asyn_pending; /* bit map with pending asyn messages */
irq_id_t s_int_pending; /* pending hardware interrupts */
sigset_t s_sig_pending; /* pending signals */
ipc_filter_t *s_ipcf; /* ipc filter (NULL when no filter is set) */
2005-07-14 17:26:26 +02:00
minix_timer_t s_alarm_timer; /* synchronous alarm timer */
2005-07-14 17:26:26 +02:00
reg_t *s_stack_guard; /* stack guard word for kernel tasks */
char s_diag_sig; /* send a SIGKMESS when diagnostics arrive? */
2006-03-10 17:10:05 +01:00
int s_nr_io_range; /* allowed I/O ports */
struct io_range s_io_tab[NR_IO_RANGE];
2006-03-10 17:10:05 +01:00
int s_nr_mem_range; /* allowed memory ranges */
struct minix_mem_range s_mem_tab[NR_MEM_RANGE];
2006-03-10 17:10:05 +01:00
int s_nr_irq; /* allowed IRQ lines */
int s_irq_tab[NR_IRQ];
vir_bytes s_grant_table; /* grant table address of process, or 0 */
int s_grant_entries; /* no. of entries, or 0 */
endpoint_t s_grant_endpoint; /* the endpoint the grant table belongs to */
vir_bytes s_state_table; /* state table address of process, or 0 */
int s_state_entries; /* no. of entries, or 0 */
2005-07-14 17:26:26 +02:00
};
/* Guard word for task stacks. */
#define STACK_GUARD ((reg_t) (sizeof(reg_t) == 2 ? 0xBEEF : 0xDEADBEEF))
/* Magic system structure table addresses. */
Rewrite of boot process KERNEL CHANGES: - The kernel only knows about privileges of kernel tasks and the root system process (now RS). - Kernel tasks and the root system process are the only processes that are made schedulable by the kernel at startup. All the other processes in the boot image don't get their privileges set at startup and are inhibited from running by the RTS_NO_PRIV flag. - Removed the assumption on the ordering of processes in the boot image table. System processes can now appear in any order in the boot image table. - Privilege ids can now be assigned both statically or dynamically. The kernel assigns static privilege ids to kernel tasks and the root system process. Each id is directly derived from the process number. - User processes now all share the static privilege id of the root user process (now INIT). - sys_privctl split: we have more calls now to let RS set privileges for system processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS / SYS_PRIV_SET_USER are used to set privileges for a system / user process. - boot image table flags split: PROC_FULLVM is the only flag that has been moved out of the privilege flags and is still maintained in the boot image table. All the other privilege flags are out of the kernel now. RS CHANGES: - RS is the only user-space process who gets to run right after in-kernel startup. - RS uses the boot image table from the kernel and three additional boot image info table (priv table, sys table, dev table) to complete the initialization of the system. - RS checks that the entries in the priv table match the entries in the boot image table to make sure that every process in the boot image gets schedulable. - RS only uses static privilege ids to set privileges for system services in the boot image. - RS includes basic memory management support to allocate the boot image buffer dynamically during initialization. The buffer shall contain the executable image of all the system services we would like to restart after a crash. - First step towards decoupling between resource provisioning and resource requirements in RS: RS must know what resources it needs to restart a process and what resources it has currently available. This is useful to tradeoff reliability and resource consumption. When required resources are missing, the process cannot be restarted. In that case, in the future, a system flag will tell RS what to do. For example, if CORE_PROC is set, RS should trigger a system-wide panic because the system can no longer function correctly without a core system process. PM CHANGES: - The process tree built at initialization time is changed to have INIT as root with pid 0, RS child of INIT and all the system services children of RS. This is required to make RS in control of all the system services. - PM no longer registers labels for system services in the boot image. This is now part of RS's initialization process.
2009-12-11 01:08:19 +01:00
#define BEG_PRIV_ADDR (&priv[0])
#define END_PRIV_ADDR (&priv[NR_SYS_PROCS])
#define BEG_STATIC_PRIV_ADDR BEG_PRIV_ADDR
#define END_STATIC_PRIV_ADDR (BEG_STATIC_PRIV_ADDR + NR_STATIC_PRIV_IDS)
#define BEG_DYN_PRIV_ADDR END_STATIC_PRIV_ADDR
#define END_DYN_PRIV_ADDR END_PRIV_ADDR
2005-07-14 17:26:26 +02:00
#define priv_addr(i) (ppriv_addr)[(i)]
#define priv_id(rp) ((rp)->p_priv->s_id)
#define priv(rp) ((rp)->p_priv)
#define id_to_nr(id) priv_addr(id)->s_proc_nr
#define nr_to_id(nr) priv(proc_addr(nr))->s_id
2005-07-14 17:26:26 +02:00
IPC privileges fixes Kernel: o Remove s_ipc_sendrec, instead using s_ipc_to for all send primitives o Centralize s_ipc_to bit manipulation, - disallowing assignment of bits pointing to unused priv structs; - preventing send-to-self by not setting bit for own priv struct; - preserving send mask matrix symmetry in all cases o Add IPC send mask checks to SENDA, which were missing entirely somehow o Slightly improve IPC stats accounting for SENDA o Remove SYSTEM from user processes' send mask o Half-fix the dependency between boot image order and process numbers, - correcting the table order of the boot processes; - documenting the order requirement needed for proper send masks; - warning at boot time if the order is violated RS: o Add support in /etc/drivers.conf for servers that talk to user processes, - disallowing IPC to user processes if no "ipc" field is present - adding a special "USER" label to explicitly allow IPC to user processes o Always apply IPC masks when specified; remove -i flag from service(8) o Use kernel send mask symmetry to delay adding IPC permissions for labels that do not exist yet, adding them to that label's process upon creation o Add VM to ipc permissions list for rtl8139 and fxp in drivers.conf Left to future fixes: o Removal of the table order vs process numbers dependency altogether, possibly using per-process send list structures as used for SYSTEM calls o Proper assignment of send masks to boot processes; some of the assigned (~0) masks are much wider than necessary o Proper assignment of IPC send masks for many more servers in drivers.conf o Removal of the debugging warning about the now legitimate case where RS's add_forward_ipc cannot find the IPC destination's label yet
2009-07-02 18:25:31 +02:00
#define may_send_to(rp, nr) (get_sys_bit(priv(rp)->s_ipc_to, nr_to_id(nr)))
#define may_asynsend_to(rp, nr) (may_send_to(rp, nr) || (rp)->p_nr == nr)
IPC privileges fixes Kernel: o Remove s_ipc_sendrec, instead using s_ipc_to for all send primitives o Centralize s_ipc_to bit manipulation, - disallowing assignment of bits pointing to unused priv structs; - preventing send-to-self by not setting bit for own priv struct; - preserving send mask matrix symmetry in all cases o Add IPC send mask checks to SENDA, which were missing entirely somehow o Slightly improve IPC stats accounting for SENDA o Remove SYSTEM from user processes' send mask o Half-fix the dependency between boot image order and process numbers, - correcting the table order of the boot processes; - documenting the order requirement needed for proper send masks; - warning at boot time if the order is violated RS: o Add support in /etc/drivers.conf for servers that talk to user processes, - disallowing IPC to user processes if no "ipc" field is present - adding a special "USER" label to explicitly allow IPC to user processes o Always apply IPC masks when specified; remove -i flag from service(8) o Use kernel send mask symmetry to delay adding IPC permissions for labels that do not exist yet, adding them to that label's process upon creation o Add VM to ipc permissions list for rtl8139 and fxp in drivers.conf Left to future fixes: o Removal of the table order vs process numbers dependency altogether, possibly using per-process send list structures as used for SYSTEM calls o Proper assignment of send masks to boot processes; some of the assigned (~0) masks are much wider than necessary o Proper assignment of IPC send masks for many more servers in drivers.conf o Removal of the debugging warning about the now legitimate case where RS's add_forward_ipc cannot find the IPC destination's label yet
2009-07-02 18:25:31 +02:00
2005-07-14 17:26:26 +02:00
/* The system structures table and pointers to individual table slots. The
* pointers allow faster access because now a process entry can be found by
* indexing the psys_addr array, while accessing an element i requires a
* multiplication with sizeof(struct sys) to determine the address.
*/
EXTERN struct priv priv[NR_SYS_PROCS]; /* system properties table */
EXTERN struct priv *ppriv_addr[NR_SYS_PROCS]; /* direct slot pointers */
/* Make sure the system can boot. The following sanity check verifies that
* the system privileges table is large enough for the number of processes
* in the boot image.
*/
#if (NR_BOOT_PROCS > NR_SYS_PROCS)
#error NR_SYS_PROCS must be larger than NR_BOOT_PROCS
#endif
#endif /* PRIV_H */