Commit graph

41 commits

Author SHA1 Message Date
Lionel Sambuc 433d6423c3 New sources layout
Change-Id: Ic716f336b7071063997cf5b4dae6d50e0b4631e9
2014-07-31 16:00:30 +02:00
Ben Gras 50e2064049 No more intel/minix segments.
This commit removes all traces of Minix segments (the text/data/stack
memory map abstraction in the kernel) and significance of Intel segments
(hardware segments like CS, DS that add offsets to all addressing before
page table translation). This ultimately simplifies the memory layout
and addressing and makes the same layout possible on non-Intel
architectures.

There are only two types of addresses in the world now: virtual
and physical; even the kernel and processes have the same virtual
address space. Kernel and user processes can be distinguished at a
glance as processes won't use 0xF0000000 and above.

No static pre-allocated memory sizes exist any more.

Changes to booting:
        . The pre_init.c leaves the kernel and modules exactly as
          they were left by the bootloader in physical memory
        . The kernel starts running using physical addressing,
          loaded at a fixed location given in its linker script by the
          bootloader.  All code and data in this phase are linked to
          this fixed low location.
        . It makes a bootstrap pagetable to map itself to a
          fixed high location (also in linker script) and jumps to
          the high address. All code and data then use this high addressing.
        . All code/data symbols linked at the low addresses is prefixed by
          an objcopy step with __k_unpaged_*, so that that code cannot
          reference highly-linked symbols (which aren't valid yet) or vice
          versa (symbols that aren't valid any more).
        . The two addressing modes are separated in the linker script by
          collecting the unpaged_*.o objects and linking them with low
          addresses, and linking the rest high. Some objects are linked
          twice, once low and once high.
        . The bootstrap phase passes a lot of information (e.g. free memory
          list, physical location of the modules, etc.) using the kinfo
          struct.
        . After this bootstrap the low-linked part is freed.
        . The kernel maps in VM into the bootstrap page table so that VM can
          begin executing. Its first job is to make page tables for all other
          boot processes. So VM runs before RS, and RS gets a fully dynamic,
          VM-managed address space. VM gets its privilege info from RS as usual
          but that happens after RS starts running.
        . Both the kernel loading VM and VM organizing boot processes happen
	  using the libexec logic. This removes the last reason for VM to
	  still know much about exec() and vm/exec.c is gone.

Further Implementation:
        . All segments are based at 0 and have a 4 GB limit.
        . The kernel is mapped in at the top of the virtual address
          space so as not to constrain the user processes.
        . Processes do not use segments from the LDT at all; there are
          no segments in the LDT any more, so no LLDT is needed.
        . The Minix segments T/D/S are gone and so none of the
          user-space or in-kernel copy functions use them. The copy
          functions use a process endpoint of NONE to realize it's
          a physical address, virtual otherwise.
        . The umap call only makes sense to translate a virtual address
          to a physical address now.
        . Segments-related calls like newmap and alloc_segments are gone.
        . All segments-related translation in VM is gone (vir2map etc).
        . Initialization in VM is simpler as no moving around is necessary.
        . VM and all other boot processes can be linked wherever they wish
          and will be mapped in at the right location by the kernel and VM
          respectively.

Other changes:
        . The multiboot code is less special: it does not use mb_print
          for its diagnostics any more but uses printf() as normal, saving
          the output into the diagnostics buffer, only printing to the
          screen using the direct print functions if a panic() occurs.
        . The multiboot code uses the flexible 'free memory map list'
          style to receive the list of free memory if available.
        . The kernel determines the memory layout of the processes to
          a degree: it tells VM where the kernel starts and ends and
          where the kernel wants the top of the process to be. VM then
          uses this entire range, i.e. the stack is right at the top,
          and mmap()ped bits of memory are placed below that downwards,
          and the break grows upwards.

Other Consequences:
        . Every process gets its own page table as address spaces
          can't be separated any more by segments.
        . As all segments are 0-based, there is no distinction between
          virtual and linear addresses, nor between userspace and
          kernel addresses.
        . Less work is done when context switching, leading to a net
          performance increase. (8% faster on my machine for 'make servers'.)
	. The layout and configuration of the GDT makes sysenter and syscall
	  possible.
2012-07-15 22:30:15 +02:00
Ben Gras 230b7775fe changes for detecting and building for clang/binutils elf
and minor fixes:
 . add ack/clean target to lib, 'unify' clean target
 . add includes as library dependency
 . mk: exclude warning options clang doesn't have in non-gcc
 . set -e in lib/*.sh build files
 . clang compile error circumvention (disable NOASSERTS for release builds)
2011-06-07 16:49:52 +02:00
David van Moolenbroek a7285dfabc Kernel/RS: fix permission computation with 32+ system processes 2010-12-07 10:32:42 +00:00
Tomas Hruby 63e2d73d1b Fixed brackets in bitmap macros 2010-03-30 08:34:33 +00:00
Tomas Hruby 1b56fdb33c Time accounting based on TSC
- as thre are still KERNEL and IDLE entries, time accounting for
  kernel and idle time works the same as for any other process

- everytime we stop accounting for the currently running process,
  kernel or idle, we read the TSC counter and increment the p_cycles
  entry.

- the process cycles inherently include some of the kernel cycles as
  we can stop accounting for the process only after we save its
  context and we start accounting just before we restore its context

- this assumes that the system does not scale the CPU frequency which
  will be true for ... long time ;-)
2010-02-10 15:36:54 +00:00
Tomas Hruby c6fec6866f No locking in kernel code
- No locking in RTS_(UN)SET macros

- No lock_notify()

- Removed unused lock_send()

- No lock/unlock macros anymore
2010-02-09 15:26:58 +00:00
Tomas Hruby 728f0f0c49 Removal of the system task
* Userspace change to use the new kernel calls

	- _taskcall(SYSTASK...) changed to _kernel_call(...)

	- int 32 reused for the kernel calls

	- _do_kernel_call() to make the trap to kernel

	- kernel_call() to make the actuall kernel call from C using
	  _do_kernel_call()

	- unlike ipc call the kernel call always succeeds as kernel is
	  always available, however, kernel may return an error

* Kernel side implementation of kernel calls

	- the SYSTEm task does not run, only the proc table entry is
	  preserved

	- every data_copy(SYSTEM is no data_copy(KERNEL

	- "locking" is an empty operation now as everything runs in
	  kernel

	- sys_task() is replaced by kernel_call() which copies the
	  message into kernel, dispatches the call to its handler and
	  finishes by either copying the results back to userspace (if
	  need be) or by suspending the process because of VM

	- suspended processes are later made runnable once the memory
	  issue is resolved, picked up by the scheduler and only at
	  this time the call is resumed (in fact restarted) which does
	  not need to copy the message from userspace as the message
	  is already saved in the process structure.

	- no ned for the vmrestart queue, the scheduler will restart
	  the system calls

	- no special case in do_vmctl(), all requests remove the
	  RTS_VMREQUEST flag
2010-02-09 15:20:09 +00:00
Kees van Reeuwijk a701e290f7 Removed unused symbols.
Made some functions PRIVATE, including ones that aren't used anywhere.
2010-01-25 18:13:48 +00:00
Cristiano Giuffrida d1fd04e72a Initialization protocol for system services.
SYSLIB CHANGES:
- SEF framework now supports a new SEF Init request type from RS. 3 different
callbacks are available (init_fresh, init_lu, init_restart) to specify
initialization code when a service starts fresh, starts after a live update,
or restarts.

SYSTEM SERVICE CHANGES:
- Initialization code for system services is now enclosed in a callback SEF will
automatically call at init time. The return code of the callback will
tell RS whether the initialization completed successfully.
- Each init callback can access information passed by RS to initialize. As of
now, each system service has access to the public entries of RS's system process
table to gather all the information required to initialize. This design
eliminates many existing or potential races at boot time and provides a uniform
initialization interface to system services. The same interface will be reused
for the upcoming publish/subscribe model to handle dynamic 
registration / deregistration of system services.

VM CHANGES:
- Uniform privilege management for all system services. Every service uses the
same call mask format. For boot services, VM copies the call mask from init
data. For dynamic services, VM still receives the call mask via rs_set_priv
call that will be soon replaced by the upcoming publish/subscribe model.

RS CHANGES:
- The system process table has been reorganized and split into private entries
and public entries. Only the latter ones are exposed to system services.
- VM call masks are now entirely configured in rs/table.c
- RS has now its own slot in the system process table. Only kernel tasks and
user processes not included in the boot image are now left out from the system
process table.
- RS implements the initialization protocol for system services.
- For services in the boot image, RS blocks till initialization is complete and
panics when failure is reported back. Services are initialized in their order of
appearance in the boot image priv table and RS blocks to implements synchronous
initialization for every system service having the flag SF_SYNCH_BOOT set.
- For services started dynamically, the initialization protocol is implemented
as though it were the first ping for the service. In this case, if the
system service fails to report back (or reports failure), RS brings the service
down rather than trying to restart it.
2010-01-08 01:20:42 +00:00
Cristiano Giuffrida f4574783dc Rewrite of boot process
KERNEL CHANGES:
- The kernel only knows about privileges of kernel tasks and the root system
process (now RS).
- Kernel tasks and the root system process are the only processes that are made
schedulable by the kernel at startup. All the other processes in the boot image
don't get their privileges set at startup and are inhibited from running by the
RTS_NO_PRIV flag.
- Removed the assumption on the ordering of processes in the boot image table.
System processes can now appear in any order in the boot image table.
- Privilege ids can now be assigned both statically or dynamically. The kernel
assigns static privilege ids to kernel tasks and the root system process. Each
id is directly derived from the process number.
- User processes now all share the static privilege id of the root user
process (now INIT).
- sys_privctl split: we have more calls now to let RS set privileges for system
processes. SYS_PRIV_ALLOW / SYS_PRIV_DISALLOW are only used to flip the
RTS_NO_PRIV flag and allow / disallow a process from running. SYS_PRIV_SET_SYS /
SYS_PRIV_SET_USER are used to set privileges for a system / user process.
- boot image table flags split: PROC_FULLVM is the only flag that has been
moved out of the privilege flags and is still maintained in the boot image
table. All the other privilege flags are out of the kernel now.

RS CHANGES:
- RS is the only user-space process who gets to run right after in-kernel
startup.
- RS uses the boot image table from the kernel and three additional boot image
info table (priv table, sys table, dev table) to complete the initialization
of the system.
- RS checks that the entries in the priv table match the entries in the boot
image table to make sure that every process in the boot image gets schedulable.
- RS only uses static privilege ids to set privileges for system services in
the boot image.
- RS includes basic memory management support to allocate the boot image buffer
dynamically during initialization. The buffer shall contain the executable
image of all the system services we would like to restart after a crash.
- First step towards decoupling between resource provisioning and resource
requirements in RS: RS must know what resources it needs to restart a process
and what resources it has currently available. This is useful to tradeoff
reliability and resource consumption. When required resources are missing, the
process cannot be restarted. In that case, in the future, a system flag will
tell RS what to do. For example, if CORE_PROC is set, RS should trigger a
system-wide panic because the system can no longer function correctly without
a core system process.

PM CHANGES:
- The process tree built at initialization time is changed to have INIT as root
with pid 0, RS child of INIT and all the system services children of RS. This
is required to make RS in control of all the system services.
- PM no longer registers labels for system services in the boot image. This is
now part of RS's initialization process.
2009-12-11 00:08:19 +00:00
David van Moolenbroek fce9fd4b4e Add 'getidle' CPU utilization measurement infrastructure 2009-12-02 11:52:26 +00:00
Tomas Hruby ae75f9d4e5 Removal of the executable flag from files that cannot be executed
- 755 -> 644
2009-11-09 10:26:00 +00:00
Ben Gras 9647fbc94e moved type and constants for random data to include file;
added consistency check in random; added source of randomness
internal to random using timing; only retrieve random bins that are full.
2009-04-02 15:24:44 +00:00
Ben Gras 3cc092ff06 . new kernel call sysctl for generic unprivileged system operations;
now used for printing diagnostic messages through the kernel message
   buffer. this lets processes print diagnostics without sending messages
   to tty and log directly, simplifying the message protocol a lot and
   reducing difficulties with deadlocks and other situations in which
   diagnostics are blackholed (e.g. grants don't work). this makes
   DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty
   and log still accept the codes for 'old' binaries. This also simplifies
   diagnostics in several servers and drivers - only tty needs its own
   kputc() now.
 . simplifications in vfs, and some effort to get the vnode references
   right (consistent) even during shutdown. m_mounted_on is now NULL
   for root filesystems (!) (the original and new root), a less awkward
   special case than 'm_mounted_on == m_root_node'. root now has exactly
   one reference, to root, if no files are open, just like all other
   filesystems. m_driver_e is unused.
2009-01-26 17:43:59 +00:00
Ben Gras c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Ben Gras 6f77685609 Split of architecture-dependent and -independent functions for i386,
mainly in the kernel and headers. This split based on work by
Ingmar Alting <iaalting@cs.vu.nl> done for his Minix PowerPC architecture
port.

 . kernel does not program the interrupt controller directly, do any
   other architecture-dependent operations, or contain assembly any more,
   but uses architecture-dependent functions in arch/$(ARCH)/.
 . architecture-dependent constants and types defined in arch/$(ARCH)/include.
 . <ibm/portio.h> moved to <minix/portio.h>, as they have become, for now,
   architecture-independent functions.
 . int86, sdevio, readbios, and iopenable are now i386-specific kernel calls
   and live in arch/i386/do_* now.
 . i386 arch now supports even less 86 code; e.g. mpx86.s and klib86.s have
   gone, and 'machine.protected' is gone (and always taken to be 1 in i386).
   If 86 support is to return, it should be a new architecture.
 . prototypes for the architecture-dependent functions defined in
   kernel/arch/$(ARCH)/*.c but used in kernel/ are in kernel/proto.h
 . /etc/make.conf included in makefiles and shell scripts that need to
   know the building architecture; it defines ARCH=<arch>, currently only
   i386.
 . some basic per-architecture build support outside of the kernel (lib)
 . in clock.c, only dequeue a process if it was ready
 . fixes for new include files

files deleted:
 . mpx/klib.s - only for choosing between mpx/klib86 and -386
 . klib86.s - only for 86

i386-specific files files moved (or arch-dependent stuff moved) to arch/i386/:
 . mpx386.s (entry point)
 . klib386.s
 . sconst.h
 . exception.c
 . protect.c
 . protect.h
 . i8269.c
2006-12-22 15:22:27 +00:00
Philip Homburg add4be444f get_sys_bits 2006-06-23 15:32:24 +00:00
Ben Gras 831bc7ecd1 Move bitmap manipulation macros to <minix/bitmap.h> 2006-06-20 09:50:26 +00:00
Ben Gras 1335d5d700 'proc number' is process slot, 'endpoint' are generation-aware process
instance numbers, encoded and decoded using macros in <minix/endpoint.h>.

proc number -> endpoint migration
  . proc_nr in the interrupt hook is now an endpoint, proc_nr_e.
  . m_source for messages and notifies is now an endpoint, instead of
    proc number.
  . isokendpt() converts an endpoint to a process number, returns
    success (but fails if the process number is out of range, the
    process slot is not a living process, or the given endpoint
    number does not match the endpoint number in the process slot,
    indicating an old process).
  . okendpt() is the same as isokendpt(), but panic()s if the conversion
    fails. This is mainly used for decoding message.m_source endpoints,
    and other endpoint numbers in kernel data structures, which should
    always be correct.
  . if DEBUG_ENABLE_IPC_WARNINGS is enabled, isokendpt() and okendpt()
    get passed the __FILE__ and __LINE__ of the calling lines, and
    print messages about what is wrong with the endpoint number
    (out of range proc, empty proc, or inconsistent endpoint number),
    with the caller, making finding where the conversion failed easy
    without having to include code for every call to print where things
    went wrong. Sometimes this is harmless (wrong arg to a kernel call),
    sometimes it's a fatal internal inconsistency (bogus m_source).
  . some process table fields have been appended an _e to indicate it's
    become and endpoint.
  . process endpoint is stored in p_endpoint, without generation number.
    it turns out the kernel never needs the generation number, except
    when fork()ing, so it's decoded then.
  . kernel calls all take endpoints as arguments, not proc numbers.
    the one exception is sys_fork(), which needs to know in which slot
    to put the child.
2006-03-03 10:00:02 +00:00
Ben Gras 88ba4b5268 added reenter check to lock_dequeue() to avoid unlocking of interrupts
via cause_sig() during an exception.

moved lock check configuration to <minix/sys_config.h> instead of
kernel/config.h, because the 'relocking' field in kinfo depends on it.

other prettification: common locking macro, whether lock timing is on or
not.
2006-02-10 16:53:51 +00:00
Ben Gras d11b2e4b8c Al's double-blank-line removal request 2005-08-22 15:23:47 +00:00
Jorrit Herder 74711a3b14 Check if kernel calls is allowed (from process' call mask) added. Not yet
enforced. If a call is denied, this will be kprinted. Please report any such
errors, so that I can adjust the mask before returning errors instead of
warnings.

Wrote CMOS driver. All CMOS code from FS has been removed. Currently the
driver only supports get time calls. Set time is left out as an exercise
for the book readers ... startup scripts were updated because the CMOS driver
is needed early on. (IS got same treatment.) Don't forget to run MAKEDEV cmos
in /dev/, otherwise the driver cannot be loaded.
2005-08-04 19:23:03 +00:00
Jorrit Herder 89cf745fe2 Single boot driver loaded, while multiple can be included in the boot image.
The user needs to set label=... to choose the driver of his or her choice.
This driver will be mapped onto the controller that is set in controller=...

Minor cleanup of kernel source code (boot image table now is static).
2005-08-03 16:06:35 +00:00
Jorrit Herder 8866b4d0ef Kernel changes:
- reinstalled priority changing, now in sched() and unready()
- reinstalled check on message buffer in sys_call()
- reinstalled check in send masks in sys_call()
- changed do_fork() to get new privilege structure for SYS_PROCs
- removed some processes from boot image---will be dynamically started later
2005-07-26 12:48:34 +00:00
Jorrit Herder 198c976f7e System processes can be signaled; signals are transformed in SYS_EVENT message
that passes signal map along. This mechanisms is also used for nonuser signals
like SIGKMESS, SIGKSTOP, SIGKSIG.

Revised comments of many system call handlers. Renamed setpriority to nice.
2005-07-19 12:21:36 +00:00
Philip Homburg 7d4e914618 Random number generator 2005-07-18 15:40:24 +00:00
Jorrit Herder 42ab148155 Reorganized system call library; uses separate file per call now.
New configuration header file to include/ exclude functionality.
Extracted privileged features from struct proc and create new struct priv.
Renamed various system calls for readability.
2005-07-14 15:12:12 +00:00
Jorrit Herder bac6068857 Rewrite of process scheduling:
- current and maximum priority per process;
- quantum size and current ticks left per process;
- max number of full quantums in a row allow
  (otherwise current priority is decremented)
2005-06-30 15:55:19 +00:00
Ben Gras 3eeff022fb Added function read_cpu_flags() that returns current cpu flags as a
long.  This is used to check for interrupts being disabled at the time
of a lock() call, if enabled in config.h. The number of times this
happens is then counted in the kinfo structure. These events (recursive
lockings) lead to nasty race conditions.
2005-06-20 14:53:13 +00:00
Ben Gras 3c7120d830 Changed arguments of timer library functions. 2005-06-17 13:36:01 +00:00
Jorrit Herder f2a85e58d9 Various updates.
* Removed some variants of the SYS_GETINFO calls from the kernel;
  replaced them with new PM and utils libary functionality. Fixed
  bugs in utils library that used old get_kenv() variant.
* Implemented a buffer in the kernel to gather random data.
  Memory driver periodically checks this for /dev/random.
  A better random algorithm can now be implemented in the driver.
  Removed SYS_RANDOM; the SYS_GETINFO call is used instead.
* Remove SYS_KMALLOC from the kernel. Memory allocation can now
  be done at the process manager with new 'other' library functions.
2005-06-03 13:55:06 +00:00
Ben Gras c977bd8709 Added args to lock() and unlock() to tell them apart, for use
when lock timing is enabled in minix/config.h.

Added phys_zero() routine to klib386.s that zeroes a range of memory, and
added corresponding system call.
2005-06-01 09:37:52 +00:00
Jorrit Herder 0165662cd9 Replaced flagalrm() timers with another technique to check for timeouts.
This allowed removing the p_flagarlm timer from the kernel's process table.
Furthermore, I merged p_syncalrm and p_signalrm into p_alarm_timer to save
even more space. Note that processes can no longer have both a signal and
synchronous alarm timer outstanding as of now.
2005-05-31 14:43:04 +00:00
Jorrit Herder 322ec9ef8b Moved stime, time, times POSIX calls from FS to PM. Removed child time
accounting from kernel (now in PM).  Large amount of files in this commit
is due to system time problems during development.
2005-05-31 09:50:51 +00:00
Jorrit Herder c2be104821 Improved NOTIFY system: fixed a minor error, ignore pending notifications
on SENDREC system calls. To be done: resource (buffer pool) management;
make it structurally impossible to run out of buffers.
2005-05-27 12:44:14 +00:00
Jorrit Herder ccd17ecfed New NOTIFY system call! Queued at kernel. Duplicate messages (with same source
and type) are overwritten with newer flags/ arguments. The interface from
within the kernel is lock_notify(). User processes can make a system call with
notify(). NOTIFY fully replaces the old notification mechanism.
2005-05-24 10:06:17 +00:00
Jorrit Herder 1cb880b158 Intermediate update---please await next commit. 2005-05-19 09:36:44 +00:00
Jorrit Herder ac0995259d *** empty log message *** 2005-05-02 14:30:04 +00:00
Jorrit Herder 89ac678b9b *** empty log message *** 2005-04-29 15:36:43 +00:00
Ben Gras 9865aeaa79 Initial revision 2005-04-21 14:53:53 +00:00