Commit graph

40 commits

Author SHA1 Message Date
Cristiano Giuffrida f4574783dc Rewrite of boot process
KERNEL CHANGES:
- The kernel only knows about privileges of kernel tasks and the root system
process (now RS).
- Kernel tasks and the root system process are the only processes that are made
schedulable by the kernel at startup. All the other processes in the boot image
don't get their privileges set at startup and are inhibited from running by the
RTS_NO_PRIV flag.
- Removed the assumption on the ordering of processes in the boot image table.
System processes can now appear in any order in the boot image table.
- Privilege ids can now be assigned both statically or dynamically. The kernel
assigns static privilege ids to kernel tasks and the root system process. Each
id is directly derived from the process number.
- User processes now all share the static privilege id of the root user
process (now INIT).
- sys_privctl split: we have more calls now to let RS set privileges for system
processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the
RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS /
SYS_PRIV_SET_USER are used to set privileges for a system / user process.
- boot image table flags split: PROC_FULLVM is the only flag that has been
moved out of the privilege flags and is still maintained in the boot image
table. All the other privilege flags are out of the kernel now.

RS CHANGES:
- RS is the only user-space process who gets to run right after in-kernel
startup.
- RS uses the boot image table from the kernel and three additional boot image
info table (priv table, sys table, dev table) to complete the initialization
of the system.
- RS checks that the entries in the priv table match the entries in the boot
image table to make sure that every process in the boot image gets schedulable.
- RS only uses static privilege ids to set privileges for system services in
the boot image.
- RS includes basic memory management support to allocate the boot image buffer
dynamically during initialization. The buffer shall contain the executable
image of all the system services we would like to restart after a crash.
- First step towards decoupling between resource provisioning and resource
requirements in RS: RS must know what resources it needs to restart a process
and what resources it has currently available. This is useful to tradeoff
reliability and resource consumption. When required resources are missing, the
process cannot be restarted. In that case, in the future, a system flag will
tell RS what to do. For example, if CORE_PROC is set, RS should trigger a
system-wide panic because the system can no longer function correctly without
a core system process.

PM CHANGES:
- The process tree built at initialization time is changed to have INIT as root
with pid 0, RS child of INIT and all the system services children of RS. This
is required to make RS in control of all the system services.
- PM no longer registers labels for system services in the boot image. This is
now part of RS's initialization process.
2009-12-11 00:08:19 +00:00
David van Moolenbroek 4924d1a9b5 RS changes:
- add new "control" config directive, to let drivers restart drivers
  (by Jorrit Herder)
- fix bug causing system processes to be started twice sometimes
2009-12-02 09:54:50 +00:00
David van Moolenbroek fe7b2f1652 RS fixes:
- fix resource leak (PCI ACLs) when child fails right after exec
- fix resource leak (memory) when child exec fails at all
- fix race condition setting VM call privileges for new child
- make dev_execve() return a proper result, and check this result
- remove RS_EXECFAILED, as it should behave exactly like RS_EXITING
- add more clarifying comments about starting servers
2009-11-28 13:23:45 +00:00
David van Moolenbroek 8f9a90192f RS: disable harmless warning 2009-10-01 19:21:57 +00:00
David van Moolenbroek b423d7b477 Merge of David's ptrace branch. Summary:
o Support for ptrace T_ATTACH/T_DETACH and T_SYSCALL
o PM signal handling logic should now work properly, even with debuggers
  being present
o Asynchronous PM/VFS protocol, full IPC support for senda(), and
  AMF_NOREPLY senda() flag

DETAILS

Process stop and delay call handling of PM:
o Added sys_runctl() kernel call with sys_stop() and sys_resume()
  aliases, for PM to stop and resume a process
o Added exception for sending/syscall-traced processes to sys_runctl(),
  and matching SIGKREADY pseudo-signal to PM
o Fixed PM signal logic to deal with requests from a process after
  stopping it (so-called "delay calls"), using the SIGKREADY facility
o Fixed various PM panics due to race conditions with delay calls versus
  VFS calls
o Removed special PRIO_STOP priority value
o Added SYS_LOCK RTS kernel flag, to stop an individual process from
  running while modifying its process structure

Signal and debugger handling in PM:
o Fixed debugger signals being dropped if a second signal arrives when
  the debugger has not retrieved the first one
o Fixed debugger signals being sent to the debugger more than once
o Fixed debugger signals unpausing process in VFS; removed PM_UNPAUSE_TR
  protocol message
o Detached debugger signals from general signal logic and from being
  blocked on VFS calls, meaning that even VFS can now be traced
o Fixed debugger being unable to receive more than one pending signal in
  one process stop
o Fixed signal delivery being delayed needlessly when multiple signals
  are pending
o Fixed wait test for tracer, which was returning for children that were
  not waited for
o Removed second parallel pending call from PM to VFS for any process
o Fixed process becoming runnable between exec() and debugger trap
o Added support for notifying the debugger before the parent when a
  debugged child exits
o Fixed debugger death causing child to remain stopped forever
o Fixed consistently incorrect use of _NSIG

Extensions to ptrace():
o Added T_ATTACH and T_DETACH ptrace request, to attach and detach a
  debugger to and from a process
o Added T_SYSCALL ptrace request, to trace system calls
o Added T_SETOPT ptrace request, to set trace options
o Added TO_TRACEFORK trace option, to attach automatically to children
  of a traced process
o Added TO_ALTEXEC trace option, to send SIGSTOP instead of SIGTRAP upon
  a successful exec() of the tracee
o Extended T_GETUSER ptrace support to allow retrieving a process's priv
  structure
o Removed T_STOP ptrace request again, as it does not help implementing
  debuggers properly
o Added MINIX3-specific ptrace test (test42)
o Added proper manual page for ptrace(2)

Asynchronous PM/VFS interface:
o Fixed asynchronous messages not being checked when receive() is called
  with an endpoint other than ANY
o Added AMF_NOREPLY senda() flag, preventing such messages from
  satisfying the receive part of a sendrec()
o Added asynsend3() that takes optional flags; asynsend() is now a
  #define passing in 0 as third parameter
o Made PM/VFS protocol asynchronous; reintroduced tell_fs()
o Made PM_BASE request/reply number range unique
o Hacked in a horrible temporary workaround into RS to deal with newly
  revealed RS-PM-VFS race condition triangle until VFS is asynchronous

System signal handling:
o Fixed shutdown logic of device drivers; removed old SIGKSTOP signal
o Removed is-superuser check from PM's do_procstat() (aka getsigset())
o Added sigset macros to allow system processes to deal with the full
  signal set, rather than just the POSIX subset

Miscellaneous PM fixes:
o Split do_getset into do_get and do_set, merging common code and making
  structure clearer
o Fixed setpriority() being able to put to sleep processes using an
  invalid parameter, or revive zombie processes
o Made find_proc() global; removed obsolete proc_from_pid()
o Cleanup here and there

Also included:
o Fixed false-positive boot order kernel warning
o Removed last traces of old NOTIFY_FROM code

THINGS OF POSSIBLE INTEREST

o It should now be possible to run PM at any priority, even lower than
  user processes
o No assumptions are made about communication speed between PM and VFS,
  although communication must be FIFO
o A debugger will now receive incoming debuggee signals at kill time
  only; the process may not yet be fully stopped
o A first step has been made towards making the SYSTEM task preemptible
2009-09-30 09:57:22 +00:00
Ben Gras 32fa22fc2d RS_LOOKUP feature for libc functions that want to access servers.
let ipc talk to all USER processes and vice versa.

pm sig wrapper notify has to be called from two files.

actually install include files.
2009-09-21 15:25:15 +00:00
Ben Gras e402927576 support for vm priv system.
fix NULL envp ptr.
2009-09-21 14:49:04 +00:00
David van Moolenbroek 979bcfc195 - sys_privctl: don't mix message types
- sys_privctl: remove CTL_MM_PRIV (third parameter)
- remove obsolete sys_svrctl.c library file
2009-09-06 12:37:13 +00:00
Thomas Veerman b47483433c Added a hack to start binaries from the boot image only. In particular, setting
bin_img=1 in the boot monitor will make sure that during the boot procedure the
mfs binary that is part of the boot image is the only binary that is used to
mount partitions. This is useful when for some reason the mfs binary on disk 
malfunctions, rendering Minix unable to boot. By setting bin_img=1, the binary
on disk is ignored and the binary in the boot image is used instead.

- 'service' now accepts an additional flag -r. -r implies -c. -r instructs RS
  to first look in memory if the binary has already been copied to memory and
  execute that version, instead of loading the binary from disk. For example,
  the first time a MFS is being started it is copied (-c) to memory and
  executed from there. The second time MFS is being started this way, RS will
  look in memory for a previously copied MFS binary and reuse it if it exists.
- The mount and newroot commands now accept an additional flag -i, which
  instructs them to set the MS_REUSE flag in the mount flags.
- The mount system call now supports the MS_REUSE flag and invokes 'service'
  with the -r flag when MS_REUSE is set.
- /etc/rc and the rc script that's included in the boot image check for the
  existence of the bin_img flag in the boot monitor, and invoke mount and 
  newroot with the -i flag accordingly.
2009-08-18 11:36:01 +00:00
David van Moolenbroek 013241a006 RS: the plural of 'child' is 'children' 2009-07-11 17:59:05 +00:00
David van Moolenbroek b8b8f537bd IPC privileges fixes
Kernel:
o Remove s_ipc_sendrec, instead using s_ipc_to for all send primitives
o Centralize s_ipc_to bit manipulation,
  - disallowing assignment of bits pointing to unused priv structs;
  - preventing send-to-self by not setting bit for own priv struct;
  - preserving send mask matrix symmetry in all cases
o Add IPC send mask checks to SENDA, which were missing entirely somehow
o Slightly improve IPC stats accounting for SENDA
o Remove SYSTEM from user processes' send mask
o Half-fix the dependency between boot image order and process numbers,
  - correcting the table order of the boot processes;
  - documenting the order requirement needed for proper send masks;
  - warning at boot time if the order is violated

RS:
o Add support in /etc/drivers.conf for servers that talk to user processes,
  - disallowing IPC to user processes if no "ipc" field is present
  - adding a special "USER" label to explicitly allow IPC to user processes
o Always apply IPC masks when specified; remove -i flag from service(8)
o Use kernel send mask symmetry to delay adding IPC permissions for labels
  that do not exist yet, adding them to that label's process upon creation
o Add VM to ipc permissions list for rtl8139 and fxp in drivers.conf

Left to future fixes:
o Removal of the table order vs process numbers dependency altogether,
  possibly using per-process send list structures as used for SYSTEM calls
o Proper assignment of send masks to boot processes;
  some of the assigned (~0) masks are much wider than necessary
o Proper assignment of IPC send masks for many more servers in drivers.conf
o Removal of the debugging warning about the now legitimate case where RS's
  add_forward_ipc cannot find the IPC destination's label yet
2009-07-02 16:25:31 +00:00
Ben Gras 8b72765e39 ignore errors of pipe read (can happen with shutdown now,
now that all fd's are closed neatly in vfs), change messaging
in unexpected restarts
2009-05-06 15:38:32 +00:00
Ben Gras a12113e476 process restarts are pretty rare/serious. 2009-04-27 14:07:47 +00:00
Ben Gras 3cc092ff06 . new kernel call sysctl for generic unprivileged system operations;
now used for printing diagnostic messages through the kernel message
   buffer. this lets processes print diagnostics without sending messages
   to tty and log directly, simplifying the message protocol a lot and
   reducing difficulties with deadlocks and other situations in which
   diagnostics are blackholed (e.g. grants don't work). this makes
   DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty
   and log still accept the codes for 'old' binaries. This also simplifies
   diagnostics in several servers and drivers - only tty needs its own
   kputc() now.
 . simplifications in vfs, and some effort to get the vnode references
   right (consistent) even during shutdown. m_mounted_on is now NULL
   for root filesystems (!) (the original and new root), a less awkward
   special case than 'm_mounted_on == m_root_node'. root now has exactly
   one reference, to root, if no files are open, just like all other
   filesystems. m_driver_e is unused.
2009-01-26 17:43:59 +00:00
Ben Gras c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Philip Homburg ca8291c815 Support for restricting limiting IPC to a set of endpoints. Not enabled by
default, pass -i to service. Do not reply to bogus request types. Reply using
sendnb.
2008-02-21 16:20:22 +00:00
Philip Homburg 1f04287b3f Removed dmap table. Publish endpoint in DS before calling mapdriver5. 2007-08-07 12:24:06 +00:00
Philip Homburg 56a68dc32b Hack in service to use RS_START instead of RS_UP. RS reports the use of RS_UP. 2007-05-02 15:20:28 +00:00
Philip Homburg 02a229f14d Publish endpoints in ds. 2007-04-27 13:03:33 +00:00
Philip Homburg b613f5cb4b Report and detect exec failures using a pipe.
XXX Hardcoded values for s_ipc_to and s_ipc_sendrec.
2007-04-23 14:47:04 +00:00
Ben Gras b267d42531 removed or optionalized verbose/debugging messages 2007-02-16 15:50:30 +00:00
Ben Gras 73e4e31376 Don't reply to the caller on RS_DOWN until process is actually dead -
otherwise (e.g.) mounts right after an unmount of the same device don't
work (duplicate label).
2007-01-22 16:44:03 +00:00
Ben Gras 2194bc0310 vfs/mount/rs/service changes:
. changed umount() and mount() to call 'service', so that it can include
   a custom label, so that umount() works again (RS slot gets freed now).
   merged umount() and mount() into one file to encode keep this label
   knowledge in one file.
 . removed obsolete RS_PID field and RS_RESCUE rescue command
 . added label to RS_START struct
 . vfs no longer does kill of fs process on unmount (which was failing
   due to RS_PID request not working)
 . don't assume that if error wasn't one of three errors, that no error
   occured in vfs/request.c
mfs changes:
 . added checks to copy statements to truncate copies at buffer sizes
   (left in debug code for now)
 . added checks for null-terminatedness, if less than NAME_MAX was copied
 . added checks for copy function success
is changes: 
 . dump rs label
drivers.conf changes:
 . added acl for mfs so that mfs can be started with 'service start',
   so that a custom label can be provided
2007-01-22 15:25:41 +00:00
Philip Homburg 0c1d433f60 rs changes (also use driver configurations in the image ramdisk) 2006-10-31 13:35:04 +00:00
Ben Gras fa0ba56bc9 Merge of VFS by Balasz Gerofi with Minix trunk. 2006-10-25 13:40:36 +00:00
Philip Homburg f9ccfca2a1 (Incomplete) support for access control in PCI (pci_set_acl).
-script argument to service for crash recovery scripts
-config argument to service for driver resource configuration
restart command in service to restart a driver after a crash (for use in
crash recovery scripts).
down and refresh now take labels instead of pids.
verious changes in rs to make this work.
2006-10-20 15:01:32 +00:00
Ben Gras b888922d62 Added 'service run' to run a service without restart. 2006-08-15 15:54:51 +00:00
Philip Homburg c3cf4ef460 Fixed off by one error in backoff code. Limit backoff to 1 second for
disk drivers.
2006-05-15 12:08:43 +00:00
Philip Homburg e4967b06bb Special code for restarting disk drivers (-c flag in service). 2006-05-11 14:58:33 +00:00
Ben Gras 6c2a1bac7b endpoint fixes for RS 2006-03-08 14:38:35 +00:00
Ben Gras 7967177710 endpoint-aware conversion of servers.
'who', indicating caller number in pm and fs and some other servers, has
been removed in favour of 'who_e' (endpoint) and 'who_p' (proc nr.).

In both PM and FS, isokendpt() convert endpoints to process slot
numbers, returning OK if it was a valid and consistent endpoint number.
okendpt() does the same but panic()s if it doesn't succeed. (In PM,
this is pm_isok..)

pm and fs keep their own records of process endpoints in their proc tables,
which are needed to make kernel calls about those processes.

message field names have changed.

fs drivers are endpoints.

fs now doesn't try to get out of driver deadlock, as the protocol isn't
supposed to let that happen any more. (A warning is printed if ELOCKED
is detected though.)

fproc[].fp_task (indicating which driver the process is suspended on)
became an int.

PM and FS now get endpoint numbers of initial boot processes from the
kernel. These happen to be the same as the old proc numbers, to let
user processes reach them with the old numbers, but FS and PM don't know
that. All new processes after INIT, even after the generation number
wraps around, get endpoint numbers with generation 1 and higher, so
the first instances of the boot processes are the only processes ever
to have endpoint numbers in the old proc number range.

More return code checks of sys_* functions have been added.

IS has become endpoint-aware. Ditched the 'text' and 'data' fields
in the kernel dump (which show locations, not sizes, so aren't terribly
useful) in favour of the endpoint number. Proc number is still visible.

Some other dumps (e.g. dmap, rs) show endpoint numbers now too which got
the formatting changed.

PM reading segments using rw_seg() has changed - it uses other fields
in the message now instead of encoding the segment and process number and
fd in the fd field. For that it uses _read_pm() and _write_pm() which to
_taskcall()s directly in pm/misc.c.

PM now sys_exit()s itself on panic(), instead of sys_abort().

RS also talks in endpoints instead of process numbers.
2006-03-03 10:20:58 +00:00
Philip Homburg ee2253ec52 Use the sys_privctl library function. 2006-01-27 13:20:06 +00:00
Jorrit Herder 78f20c3959 Rest ... 2005-10-21 13:46:47 +00:00
Jorrit Herder 9333141704 New rescue functionality. 2005-10-21 13:28:26 +00:00
Ben Gras 1a37474437 . minor formatting fixes (spaces, newlines) of messages
. check pids for being > 0 before kill()ing them (0 and negative
  numbers have special meanings that shouldn't be used)
2005-10-21 11:13:17 +00:00
Jorrit Herder 2a98fed515 New Reincarnation Server functionality.
- service refresh: to cleanly stop and restart a server or driver
- binary exponential backoff: don't restart in a loop
2005-10-20 20:31:18 +00:00
Ben Gras 11146aba3d Newline after startup msg 2005-10-20 18:54:53 +00:00
Jorrit Herder 5a9dec8bd2 New signal handling behaviour at PM (services can be killed).
New Shift-F6 dump for RS server at IS.
New getnpid, getnproc, getpproc library calls at PM.
New reincarnation server (basic functionality is there now).
2005-10-12 15:07:38 +00:00
Ben Gras 42fbd9aced Andy's formatting changes. 2005-09-11 16:45:46 +00:00
Jorrit Herder 7bf400a709 *** empty log message *** 2005-08-23 11:31:32 +00:00
Renamed from servers/sm/manager.c (Browse further)