Commit graph

35 commits

Author SHA1 Message Date
Cristiano Giuffrida
cb176df60f New RS and new signal handling for system processes.
UPDATING INFO:
20100317:
        /usr/src/etc/system.conf updated to ignore default kernel calls: copy
        it (or merge it) to /etc/system.conf.
        The hello driver (/dev/hello) added to the distribution:
        # cd /usr/src/commands/scripts && make clean install
        # cd /dev && MAKEDEV hello

KERNEL CHANGES:
- Generic signal handling support. The kernel no longer assumes PM as a signal
manager for every process. The signal manager of a given process can now be
specified in its privilege slot. When a signal has to be delivered, the kernel
performs the lookup and forwards the signal to the appropriate signal manager.
PM is the default signal manager for user processes, RS is the default signal
manager for system processes. To enable ptrace()ing for system processes, it
is sufficient to change the default signal manager to PM. This will temporarily
disable crash recovery, though.
- sys_exit() is now split into sys_exit() (i.e. exit() for system processes,
which generates a self-termination signal), and sys_clear() (i.e. used by PM
to ask the kernel to clear a process slot when a process exits).
- Added a new kernel call (i.e. sys_update()) to swap two process slots and
implement live update.

PM CHANGES:
- Posix signal handling is no longer allowed for system processes. System
signals are split into two fixed categories: termination and non-termination
signals. When a non-termination signaled is processed, PM transforms the signal
into an IPC message and delivers the message to the system process. When a
termination signal is processed, PM terminates the process.
- PM no longer assumes itself as the signal manager for system processes. It now
makes sure that every system signal goes through the kernel before being
actually processes. The kernel will then dispatch the signal to the appropriate
signal manager which may or may not be PM.

SYSLIB CHANGES:
- Simplified SEF init and LU callbacks.
- Added additional predefined SEF callbacks to debug crash recovery and
live update.
- Fixed a temporary ack in the SEF init protocol. SEF init reply is now
completely synchronous.
- Added SEF signal event type to provide a uniform interface for system
processes to deal with signals. A sef_cb_signal_handler() callback is
available for system processes to handle every received signal. A
sef_cb_signal_manager() callback is used by signal managers to process
system signals on behalf of the kernel.
- Fixed a few bugs with memory mapping and DS.

VM CHANGES:
- Page faults and memory requests coming from the kernel are now implemented
using signals.
- Added a new VM call to swap two process slots and implement live update.
- The call is used by RS at update time and in turn invokes the kernel call
sys_update().

RS CHANGES:
- RS has been reworked with a better functional decomposition.
- Better kernel call masks. com.h now defines the set of very basic kernel calls
every system service is allowed to use. This makes system.conf simpler and
easier to maintain. In addition, this guarantees a higher level of isolation
for system libraries that use one or more kernel calls internally (e.g. printf).
- RS is the default signal manager for system processes. By default, RS
intercepts every signal delivered to every system process. This makes crash
recovery possible before bringing PM and friends in the loop.
- RS now supports fast rollback when something goes wrong while initializing
the new version during a live update.
- Live update is now implemented by keeping the two versions side-by-side and
swapping the process slots when the old version is ready to update.
- Crash recovery is now implemented by keeping the two versions side-by-side
and cleaning up the old version only when the recovery process is complete.

DS CHANGES:
- Fixed a bug when the process doing ds_publish() or ds_delete() is not known
by DS.
- Fixed the completely broken support for strings. String publishing is now
implemented in the system library and simply wraps publishing of memory ranges.
Ideally, we should adopt a similar approach for other data types as well.
- Test suite fixed.

DRIVER CHANGES:
- The hello driver has been added to the Minix distribution to demonstrate basic
live update and crash recovery functionalities.
- Other drivers have been adapted to conform the new SEF interface.
2010-03-17 01:15:29 +00:00
Ben Gras
e6cb76a2e2 no more kprintf - kernel uses libsys printf now, only kputc is special
to the kernel.
2010-03-03 15:45:01 +00:00
Tomas Hruby
c6fec6866f No locking in kernel code
- No locking in RTS_(UN)SET macros

- No lock_notify()

- Removed unused lock_send()

- No lock/unlock macros anymore
2010-02-09 15:26:58 +00:00
Tomas Hruby
728f0f0c49 Removal of the system task
* Userspace change to use the new kernel calls

	- _taskcall(SYSTASK...) changed to _kernel_call(...)

	- int 32 reused for the kernel calls

	- _do_kernel_call() to make the trap to kernel

	- kernel_call() to make the actuall kernel call from C using
	  _do_kernel_call()

	- unlike ipc call the kernel call always succeeds as kernel is
	  always available, however, kernel may return an error

* Kernel side implementation of kernel calls

	- the SYSTEm task does not run, only the proc table entry is
	  preserved

	- every data_copy(SYSTEM is no data_copy(KERNEL

	- "locking" is an empty operation now as everything runs in
	  kernel

	- sys_task() is replaced by kernel_call() which copies the
	  message into kernel, dispatches the call to its handler and
	  finishes by either copying the results back to userspace (if
	  need be) or by suspending the process because of VM

	- suspended processes are later made runnable once the memory
	  issue is resolved, picked up by the scheduler and only at
	  this time the call is resumed (in fact restarted) which does
	  not need to copy the message from userspace as the message
	  is already saved in the process structure.

	- no ned for the vmrestart queue, the scheduler will restart
	  the system calls

	- no special case in do_vmctl(), all requests remove the
	  RTS_VMREQUEST flag
2010-02-09 15:20:09 +00:00
Tomas Hruby
cca24d06d8 This patch removes the global variables who_p and who_e from the
kernel (sys task).  The main reason is that these would have to become
cpu local variables on SMP.  Once the system task is not a task but a
genuine part of the kernel there is even less reason to have these
extra variables as proc_ptr will already contain all neccessary
information. In addition converting who_e to the process pointer and
back again all the time will be avoided.

Although proc_ptr will contain all important information, accessing it
as a cpu local variable will be fairly expensive, hence the value
would be assigned to some on stack local variable. Therefore it is
better to add the 'caller' argument to the syscall handlers to pass
the value on stack anyway. It also clearly denotes on who's behalf is
the syscall being executed.

This patch also ANSIfies the syscall function headers.

Last but not least, it also fixes a potential bug in virtual_copy_f()
in case the check is disabled. So far the function in case of a
failure could possible reuse an old who_p in case this function had
not been called from the system task.

virtual_copy_f() takes the caller as a parameter too. In case the
checking is disabled, the caller must be NULL and non NULL if it is
enabled as we must be able to suspend the caller.
2010-02-03 09:04:48 +00:00
Cristiano Giuffrida
c5b309ff07 Merge of Wu's GSOC 09 branch (src.20090525.r4372.wu)
Main changes:
- COW optimization for safecopy.
- safemap, a grant-based interface for sharing memory regions between processes.
- Integration with safemap and complete rework of DS, supporting new data types
  natively (labels, memory ranges, memory mapped ranges).
- For further information:
  http://wiki.minix3.org/en/SummerOfCode2009/MemoryGrants

Additional changes not included in the original Wu's branch:
- Fixed unhandled case in VM when using COW optimization for safecopy in case
  of a block that has already been shared as SMAP.
- Better interface and naming scheme for sys_saferevmap and ds_retrieve_map
  calls.
- Better input checking in syslib: check for page alignment when creating
  memory mapping grants.
- DS notifies subscribers when an entry is deleted.
- Documented the behavior of indirect grants in case of memory mapping.
- Test suite in /usr/src/test/safeperf|safecopy|safemap|ds/* reworked
  and extended.
- Minor fixes and general cleanup.
- TO-DO: Grant ids should be generated and managed the way endpoints are to make
sure grant slots are never misreused.
2010-01-14 15:24:16 +00:00
Cristiano Giuffrida
d1fd04e72a Initialization protocol for system services.
SYSLIB CHANGES:
- SEF framework now supports a new SEF Init request type from RS. 3 different
callbacks are available (init_fresh, init_lu, init_restart) to specify
initialization code when a service starts fresh, starts after a live update,
or restarts.

SYSTEM SERVICE CHANGES:
- Initialization code for system services is now enclosed in a callback SEF will
automatically call at init time. The return code of the callback will
tell RS whether the initialization completed successfully.
- Each init callback can access information passed by RS to initialize. As of
now, each system service has access to the public entries of RS's system process
table to gather all the information required to initialize. This design
eliminates many existing or potential races at boot time and provides a uniform
initialization interface to system services. The same interface will be reused
for the upcoming publish/subscribe model to handle dynamic 
registration / deregistration of system services.

VM CHANGES:
- Uniform privilege management for all system services. Every service uses the
same call mask format. For boot services, VM copies the call mask from init
data. For dynamic services, VM still receives the call mask via rs_set_priv
call that will be soon replaced by the upcoming publish/subscribe model.

RS CHANGES:
- The system process table has been reorganized and split into private entries
and public entries. Only the latter ones are exposed to system services.
- VM call masks are now entirely configured in rs/table.c
- RS has now its own slot in the system process table. Only kernel tasks and
user processes not included in the boot image are now left out from the system
process table.
- RS implements the initialization protocol for system services.
- For services in the boot image, RS blocks till initialization is complete and
panics when failure is reported back. Services are initialized in their order of
appearance in the boot image priv table and RS blocks to implements synchronous
initialization for every system service having the flag SF_SYNCH_BOOT set.
- For services started dynamically, the initialization protocol is implemented
as though it were the first ping for the service. In this case, if the
system service fails to report back (or reports failure), RS brings the service
down rather than trying to restart it.
2010-01-08 01:20:42 +00:00
Cristiano Giuffrida
f4574783dc Rewrite of boot process
KERNEL CHANGES:
- The kernel only knows about privileges of kernel tasks and the root system
process (now RS).
- Kernel tasks and the root system process are the only processes that are made
schedulable by the kernel at startup. All the other processes in the boot image
don't get their privileges set at startup and are inhibited from running by the
RTS_NO_PRIV flag.
- Removed the assumption on the ordering of processes in the boot image table.
System processes can now appear in any order in the boot image table.
- Privilege ids can now be assigned both statically or dynamically. The kernel
assigns static privilege ids to kernel tasks and the root system process. Each
id is directly derived from the process number.
- User processes now all share the static privilege id of the root user
process (now INIT).
- sys_privctl split: we have more calls now to let RS set privileges for system
processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the
RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS /
SYS_PRIV_SET_USER are used to set privileges for a system / user process.
- boot image table flags split: PROC_FULLVM is the only flag that has been
moved out of the privilege flags and is still maintained in the boot image
table. All the other privilege flags are out of the kernel now.

RS CHANGES:
- RS is the only user-space process who gets to run right after in-kernel
startup.
- RS uses the boot image table from the kernel and three additional boot image
info table (priv table, sys table, dev table) to complete the initialization
of the system.
- RS checks that the entries in the priv table match the entries in the boot
image table to make sure that every process in the boot image gets schedulable.
- RS only uses static privilege ids to set privileges for system services in
the boot image.
- RS includes basic memory management support to allocate the boot image buffer
dynamically during initialization. The buffer shall contain the executable
image of all the system services we would like to restart after a crash.
- First step towards decoupling between resource provisioning and resource
requirements in RS: RS must know what resources it needs to restart a process
and what resources it has currently available. This is useful to tradeoff
reliability and resource consumption. When required resources are missing, the
process cannot be restarted. In that case, in the future, a system flag will
tell RS what to do. For example, if CORE_PROC is set, RS should trigger a
system-wide panic because the system can no longer function correctly without
a core system process.

PM CHANGES:
- The process tree built at initialization time is changed to have INIT as root
with pid 0, RS child of INIT and all the system services children of RS. This
is required to make RS in control of all the system services.
- PM no longer registers labels for system services in the boot image. This is
now part of RS's initialization process.
2009-12-11 00:08:19 +00:00
Tomas Hruby
a972f4bacc All macros defining rts flags are prefixed with RTS_
- macros used with RTS_SET group of macros to define struct proc p_rts_flags are
  now prefixed with RTS_ to make things clear
2009-11-10 09:11:13 +00:00
Ben Gras
7e73260cf5 - enable remembering of device memory ranges set by PCI and
told to kernel
  - makes VM ask the kernel if a certain process is allowed
    to map in a range of physical memory (VM rounds it to page
    boundaries afterwards - but it's impossible to map anything
    smaller otherwise so I assume this is safe, i.e. there won't
    be anything else in that page; certainly no regular memory)
  - VM permission check cleanup (no more hardcoded calls, less
    hardcoded logic, more readable main loop), a loose end left
    by GQ
  - remove do_copy warning, as the ipc server triggers this but
    it's no more harmful than the special cases already excluded
    explicitly (VFS, PM, etc).
2009-11-03 11:12:23 +00:00
David van Moolenbroek
979bcfc195 - sys_privctl: don't mix message types
- sys_privctl: remove CTL_MM_PRIV (third parameter)
- remove obsolete sys_svrctl.c library file
2009-09-06 12:37:13 +00:00
David van Moolenbroek
b8b8f537bd IPC privileges fixes
Kernel:
o Remove s_ipc_sendrec, instead using s_ipc_to for all send primitives
o Centralize s_ipc_to bit manipulation,
  - disallowing assignment of bits pointing to unused priv structs;
  - preventing send-to-self by not setting bit for own priv struct;
  - preserving send mask matrix symmetry in all cases
o Add IPC send mask checks to SENDA, which were missing entirely somehow
o Slightly improve IPC stats accounting for SENDA
o Remove SYSTEM from user processes' send mask
o Half-fix the dependency between boot image order and process numbers,
  - correcting the table order of the boot processes;
  - documenting the order requirement needed for proper send masks;
  - warning at boot time if the order is violated

RS:
o Add support in /etc/drivers.conf for servers that talk to user processes,
  - disallowing IPC to user processes if no "ipc" field is present
  - adding a special "USER" label to explicitly allow IPC to user processes
o Always apply IPC masks when specified; remove -i flag from service(8)
o Use kernel send mask symmetry to delay adding IPC permissions for labels
  that do not exist yet, adding them to that label's process upon creation
o Add VM to ipc permissions list for rtl8139 and fxp in drivers.conf

Left to future fixes:
o Removal of the table order vs process numbers dependency altogether,
  possibly using per-process send list structures as used for SYSTEM calls
o Proper assignment of send masks to boot processes;
  some of the assigned (~0) masks are much wider than necessary
o Proper assignment of IPC send masks for many more servers in drivers.conf
o Removal of the debugging warning about the now legitimate case where RS's
  add_forward_ipc cannot find the IPC destination's label yet
2009-07-02 16:25:31 +00:00
Ben Gras
c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Philip Homburg
3c2e122d6d Disabled code to set ipc_stats_target. 2008-02-22 10:58:27 +00:00
Philip Homburg
7541e0753b Separate permissions for sendrec. Actually initialize send/sendrec permissions
for data supplied by rs.
2007-04-23 13:30:04 +00:00
Ben Gras
7507ebfeca remove debug message 2007-03-30 15:17:03 +00:00
Ben Gras
3bb73b431b add/re-enable at_wini debug output 2007-02-21 17:49:35 +00:00
Ben Gras
a47531cc97 removed some verbose messages 2007-02-16 15:53:10 +00:00
Ben Gras
41e9fedf87 Mostly bugfixes of bugs triggered by the test set.
bugfixes:
 SYSTEM:
 . removed
        rc->p_priv->s_flags = 0;
   for the priv struct shared by all user processes in get_priv(). this
   should only be done once. doing a SYS_PRIV_USER in sys_privctl()
   caused the flags of all user processes to be reset, so they were no
   longer PREEMPTIBLE. this happened when RS executed a policy script.
   (this broke test1 in the test set)

 VFS/MFS:
 . chown can change the mode of a file, and chmod arguments are only
   part of the full file mode so the full filemode is slightly magic.
   changed these calls so that the final modes are returned to VFS, so
   that the vnode can be kept up-to-date.
   (this broke test11 in the test set)

 MFS:
 . lookup() checked for sizeof(string) instead of sizeof(user_path),
   truncating long path names
   (caught by test 23)
 . truncate functions neglected to update ctime
   (this broke test16)

 VFS:
 . corner case of an empty filename lookup caused fields of a request
   not to be filled in in the lookup functions, not making it clear
   that the lookup had failed, causing messages to garbage processes,
   causing strange failures.
   (caught by test 30)
 . trust v_size in vnode when doing reads or writes on non-special
   files, truncating i/o where necessary; this is necessary for pipes,
   as MFS can't tell when a pipe has been truncated without it being
   told explicitly each time.
   when the last reader/writer on a pipe closes, tell FS about
   the new size using truncate_vn().
   (this broke test 25, among others)
 . permission check for chdir() had disappeared; added a
   forbidden() call
   (caught by test 23)

new code, shouldn't change anything:
 . introduced RTS_SET, RTS_UNSET, and RTS_ISSET macro's, and their
   LOCK variants. These macros set and clear the p_rts_flags field,
   causing a lot of duplicated logic like

       old_flags = rp->p_rts_flags;            /* save value of the flags */
       rp->p_rts_flags &= ~NO_PRIV;
       if (old_flags != 0 && rp->p_rts_flags == 0) lock_enqueue(rp);

   to change into the simpler

       RTS_LOCK_UNSET(rp, NO_PRIV);

   so the macros take care of calling dequeue() and enqueue() (or lock_*()),
   as the case may be). This makes the code a bit more readable and a
   bit less fragile.
 . removed return code from do_clocktick in CLOCK as it currently
   never replies
 . removed some debug code from VFS
 . fixed grant debug message in device.c
 
preemptive checks, tests, changes:
 . added return code checks of receive() to SYSTEM and CLOCK
 . O_TRUNC should never arrive at MFS (added sanity check and removed
   O_TRUNC code)
 . user_path declared with PATH_MAX+1 to let it be null-terminated
 . checks in MFS to see if strings passed by VFS are null-terminated
 
 IS:
 . static irq name table thrown out
2007-02-01 17:50:02 +00:00
Ben Gras
7195fe3325 System statistical and call profiling
support by Rogier Meurs <rogier@meurs.org>.
2006-10-30 15:53:38 +00:00
Philip Homburg
dd3ee082b2 Initialize priv from user supplied priv structure in SYS_PRIV_INIT.
Added SYS_PRIV_USER call to downgrade a privileged process to a user process.
2006-10-20 14:42:48 +00:00
Ben Gras
002922fa4c New kernel call, SYS_PARAMCTL, that sets parameters of the caller
and is therefore unprivileged. Used to set grant tables.
2006-06-23 15:07:41 +00:00
Ben Gras
3061d7b17a Changed do_devio not to require DIO_TYPE, but to extract type
from DIO_REQUEST. Also do_vdevio. Also do_sdevio, but this
function also supports grant id's and offsets.

do_segctl: rename protected to prot.

do_umap: support for GRANT_SEG umap.

do_privctl: support SYS_PRIV_SET_GRANTS, which sets location and size
of in-own-address-space grant table.

do_safecopy: functions to verify and perform 'safe' (grant-based) copies.
2006-06-20 10:03:10 +00:00
Ben Gras
1335d5d700 'proc number' is process slot, 'endpoint' are generation-aware process
instance numbers, encoded and decoded using macros in <minix/endpoint.h>.

proc number -> endpoint migration
  . proc_nr in the interrupt hook is now an endpoint, proc_nr_e.
  . m_source for messages and notifies is now an endpoint, instead of
    proc number.
  . isokendpt() converts an endpoint to a process number, returns
    success (but fails if the process number is out of range, the
    process slot is not a living process, or the given endpoint
    number does not match the endpoint number in the process slot,
    indicating an old process).
  . okendpt() is the same as isokendpt(), but panic()s if the conversion
    fails. This is mainly used for decoding message.m_source endpoints,
    and other endpoint numbers in kernel data structures, which should
    always be correct.
  . if DEBUG_ENABLE_IPC_WARNINGS is enabled, isokendpt() and okendpt()
    get passed the __FILE__ and __LINE__ of the calling lines, and
    print messages about what is wrong with the endpoint number
    (out of range proc, empty proc, or inconsistent endpoint number),
    with the caller, making finding where the conversion failed easy
    without having to include code for every call to print where things
    went wrong. Sometimes this is harmless (wrong arg to a kernel call),
    sometimes it's a fatal internal inconsistency (bogus m_source).
  . some process table fields have been appended an _e to indicate it's
    become and endpoint.
  . process endpoint is stored in p_endpoint, without generation number.
    it turns out the kernel never needs the generation number, except
    when fork()ing, so it's decoded then.
  . kernel calls all take endpoints as arguments, not proc numbers.
    the one exception is sys_fork(), which needs to know in which slot
    to put the child.
2006-03-03 10:00:02 +00:00
Philip Homburg
38a16399f8 Store resource lists for drivers. Limited checks to enforce those lists. 2006-01-27 13:21:12 +00:00
Ben Gras
32514fb5f9 Al's system call -> kernel call renaming 2005-10-14 08:58:59 +00:00
Jorrit Herder
a01645b788 New scheduling code in kernel. Work in progress.
Round-robin within one priority queue works fine.
Ageing algorithm to be done.
2005-08-19 16:43:28 +00:00
Jorrit Herder
2a165d972e Moved "Changes" comments from system/do_....c to system.h.
All changes are now in a single header file.
2005-08-10 10:23:55 +00:00
Jorrit Herder
b96c389e78 Various small cleanups and comments added. 2005-08-05 09:41:15 +00:00
Jorrit Herder
74711a3b14 Check if kernel calls is allowed (from process' call mask) added. Not yet
enforced. If a call is denied, this will be kprinted. Please report any such
errors, so that I can adjust the mask before returning errors instead of
warnings.

Wrote CMOS driver. All CMOS code from FS has been removed. Currently the
driver only supports get time calls. Set time is left out as an exercise
for the book readers ... startup scripts were updated because the CMOS driver
is needed early on. (IS got same treatment.) Don't forget to run MAKEDEV cmos
in /dev/, otherwise the driver cannot be loaded.
2005-08-04 19:23:03 +00:00
Jorrit Herder
e561081545 Miscellaneous clean ups and fixes to the kernel.
Support for FLOPPY in boot image. (Set controller=fd at boot monitor.)
Moved major device numbers to <minix/dmap.h> (maybe rename to dev.h?)
2005-08-04 09:26:36 +00:00
Jorrit Herder
ab7c0a9926 Cleaned up table. Moved policies to table.
Small fixes to do_copy, do_privctl and do_fork.
2005-08-02 15:28:09 +00:00
Jorrit Herder
0946d128cd - Kernel call handlers cleaned up. More strict checking of input parameters.
- Moved generic_handler() from system.c to system/do_irqctl.c.
- Set privileges of system processes somewhat stricter.
2005-07-29 15:26:23 +00:00
Jorrit Herder
8866b4d0ef Kernel changes:
- reinstalled priority changing, now in sched() and unready()
- reinstalled check on message buffer in sys_call()
- reinstalled check in send masks in sys_call()
- changed do_fork() to get new privilege structure for SYS_PROCs
- removed some processes from boot image---will be dynamically started later
2005-07-26 12:48:34 +00:00
Jorrit Herder
80816ab001 *** empty log message *** 2005-07-22 09:20:43 +00:00