502 commits

Author SHA1 Message Date
Tomas Hruby
a8111c5027 Various small scheduling related fixes 2010-05-26 07:16:39 +00:00
Tomas Hruby
451a6890d6 scheduling - time quantum in milliseconds
- Currently the cpu time quantum is timer-ticks based. Thus the
  remaining quantum is decreased only if the process is interrupted
  by a timer tick. As processes block a lot this typically does not
  happen for normal user processes. Also the quantum depends on the
  frequency of the timer.

- This change makes the quantum milliseconds based. Internally the
  milliseconds are translated into cpu cycles. Every time userspace
  execution is interrupted by the kernel, the cycles just consumed by
  the current process are deducted from the remaining quantum.

- It makes the quantum independent of the system timer frequency.

- The boot processes' quantum is loosely derived from the tick-based
  quanta and a 60Hz timer, and is subject to future change

- the 64bit arithmetic is a little ugly; it will be changed once we have
  compiler support for 64bit integers (soon)
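
A minimal sketch of the accounting described above, assuming a compiler with native 64-bit integers (which the commit says was not yet available). The struct, the function names, and the CPU frequency parameter are hypothetical and not taken from the MINIX sources.

```c
#include <stdint.h>

/* Hypothetical per-process record; the field name is illustrative. */
struct proc_quantum {
    uint64_t cycles_left;   /* remaining quantum, in CPU cycles */
};

/* Translate a quantum given in milliseconds into CPU cycles. */
static uint64_t ms_to_cycles(uint32_t quantum_ms, uint64_t cpu_hz)
{
    return (cpu_hz / 1000u) * quantum_ms;
}

/*
 * Deduct the cycles a process consumed since it last entered userspace.
 * Returns 1 if the quantum is now exhausted, 0 otherwise.
 */
static int charge_cycles(struct proc_quantum *p, uint64_t consumed)
{
    if (consumed >= p->cycles_left) {
        p->cycles_left = 0;
        return 1;
    }
    p->cycles_left -= consumed;
    return 0;
}
```

Because the deduction happens on every kernel entry rather than on timer ticks, a process that blocks often is still charged for the time it actually ran.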
2010-05-25 08:06:14 +00:00
Erik van der Kouwe
1f11a57141 Oops, last commit included more than was intended 2010-05-20 08:07:47 +00:00
Erik van der Kouwe
5f15ec05b2 More system processes, this was not enough for the release script to run on some configurations 2010-05-20 08:05:07 +00:00
Tomas Hruby
b09bcf6779 Scheduling server (by Bjorn Swift)
In this second phase, scheduling is moved from PM to its own
scheduler (see r6557 for phase one). In the next phase we hope to a)
include useful information in the "out of quantum" message and b)
create some simple scheduling policy that makes use of that
information.

When the system starts up, PM will iterate over its process table and
ask SCHED to take over scheduling unprivileged processes. This is
done by sending a SCHEDULING_START message to SCHED. This message
includes the process's endpoint, the parent's endpoint and its nice
level. The scheduler adds this process to its schedproc table, issues
a schedctl, and returns its own endpoint to PM - as the endpoint of
the effective scheduler. When a process terminates, a SCHEDULING_STOP
message is sent to the scheduler.

The reason for this effective endpoint is for future compatibility.
Some day, we may have a scheduler that, instead of scheduling the
process itself, forwards the SCHEDULING_START message on to another
scheduler.

PM has information on who schedules whom. As such, scheduling
messages from user-land are sent through PM. An example is when
processes change their priority, using nice(). In that case, a
getsetpriority message is sent to PM, which then sends a
SCHEDULING_SET_NICE to the process's effective scheduler.

When a process is forked through PM, it inherits its parent's
scheduler, but is spawned with an empty quantum. As before, a request
to fork a process flows through VM before returning to PM, which then
wakes up the child process. This flow has been modified slightly so
that PM notifies the scheduler of the new process, before waking up
the child process. If the scheduler fails to take over scheduling,
the child process is torn down and the fork fails with an error
value.

Process priority is entirely decided upon using nice levels. PM
stores a copy of each process's nice level and when a child is
forked, its parent's nice level is sent in the SCHEDULING_START
message. How this level is mapped to a priority queue is up to the
scheduler. It should be noted that the nice level is used to
determine the max_priority and the parent could have been in a lower
priority when it was spawned. To prevent a CPU intensive process from
hogging the CPU by continuously forking children that get scheduled
in the max_priority, the scheduler should determine in which queue
the parent is currently scheduled, and schedule the child in that
same queue.
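
The queue-inheritance rule in the paragraph above could be sketched as follows. This is an illustration under assumptions, not the SCHED server's code: the table layout, the queue count, and the names `lookup` and `start_scheduling` are all hypothetical, and queue 0 is assumed to be the highest priority.

```c
#include <stddef.h>

#define NR_SCHED_QUEUES 16  /* assumed queue count; 0 is the highest priority */
#define NR_SCHEDPROCS   8   /* small table, for illustration only */

struct schedproc {
    int in_use;
    int endpoint;
    int max_priority;   /* best queue the nice level allows */
    int priority;       /* queue the process currently runs in */
};

static struct schedproc schedproc[NR_SCHEDPROCS];

/* Find the table entry for an endpoint, or NULL if it is not scheduled here. */
static struct schedproc *lookup(int endpoint)
{
    for (int i = 0; i < NR_SCHEDPROCS; i++)
        if (schedproc[i].in_use && schedproc[i].endpoint == endpoint)
            return &schedproc[i];
    return NULL;
}

/* Handle a SCHEDULING_START-style request. Returns 0 on success, -1 on failure. */
static int start_scheduling(int endpoint, int parent_endpoint, int max_priority)
{
    struct schedproc *parent = lookup(parent_endpoint);
    struct schedproc *slot = NULL;

    if (max_priority < 0 || max_priority >= NR_SCHED_QUEUES)
        return -1;
    for (int i = 0; i < NR_SCHEDPROCS; i++)
        if (!schedproc[i].in_use) { slot = &schedproc[i]; break; }
    if (slot == NULL)
        return -1;              /* table full: the fork would fail */

    slot->in_use = 1;
    slot->endpoint = endpoint;
    slot->max_priority = max_priority;
    slot->priority = max_priority;
    /* A demoted parent passes its current (worse) queue on to the child. */
    if (parent != NULL && parent->priority > max_priority)
        slot->priority = parent->priority;
    return 0;
}
```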

Other fixes: The USER_Q in kernel/proc.h was incorrectly defined as
NR_SCHED_QUEUES/2. That results in an "off by one" error when
converting priority->nice->priority for nice=0. This also had the
side effect that if someone were to set the MAX_USER_Q to something
other than 0, then USER_Q would be off.
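
To see how such an off-by-one can arise, here is a small worked example. The queue count, the nice range, the linear mapping, and the replacement definition of USER_Q are assumptions chosen for illustration; the commit message does not state the actual values or the corrected macro.

```c
/* Assumed constants, for illustration only. */
#define NR_SCHED_QUEUES 16
#define MAX_USER_Q      0                       /* highest user priority */
#define MIN_USER_Q      (NR_SCHED_QUEUES - 1)   /* lowest user priority */
#define NICE_MIN        (-20)
#define NICE_MAX        20

/* The definition the commit calls incorrect. */
#define USER_Q_OLD (NR_SCHED_QUEUES / 2)

/* One possible midpoint that also follows MAX_USER_Q (an assumption). */
#define USER_Q_NEW ((MIN_USER_Q - MAX_USER_Q) / 2 + MAX_USER_Q)

/* Linear mapping of a nice level onto the user queues (assumed formula). */
static int nice_to_priority(int nice)
{
    return MAX_USER_Q +
        (nice - NICE_MIN) * (MIN_USER_Q - MAX_USER_Q + 1) /
        (NICE_MAX - NICE_MIN + 1);
}
```

With these numbers nice=0 maps to queue 20*16/41 = 7, while NR_SCHED_QUEUES/2 is 8, so a round trip through the old USER_Q lands one queue off. The alternative midpoint evaluates to 7 and shifts along with MAX_USER_Q.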
2010-05-18 13:39:04 +00:00
David van Moolenbroek
9ba65d2ea8 This patch switches the MINIX3 ethernet driver stack from a port-based
model to an instance-based model. Each ethernet driver instance is now
responsible for exactly one network interface card. The port field in
/etc/inet.conf now acts as an instance field instead.

This patch also updates the data link protocol. This update:
- eliminates the concept of ports entirely;
- eliminates DL_GETNAME entirely;
- standardizes on using m_source for IPC and DL_ENDPT for safecopies;
- removes error codes from TASK/STAT replies, as they were unused;
- removes a number of other old or unused fields;
- names and renames a few other fields.

All ethernet drivers have been changed to:
- conform to the new protocol, and exactly that;
- take on an instance number based on a given "instance" argument;
- skip that number of PCI devices in probe iterations;
- use config tables and environment variables based on that number;
- no longer be limited to a predefined maximum of cards in any way;
- get rid of any leftover non-safecopy support and other ancient junk;
- have a correct banner protocol figure, or none at all.
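
The "skip that number of PCI devices" rule can be pictured with the sketch below. It scans a plain array instead of calling the real PCI library, and the device IDs, `is_supported`, and `probe` are hypothetical names used only for illustration.

```c
/* Simplified stand-in for a PCI device entry. */
struct pci_dev {
    unsigned short vid;   /* vendor ID */
    unsigned short did;   /* device ID */
};

/* Hypothetical match: this driver handles exactly one vendor/device pair. */
static int is_supported(const struct pci_dev *d)
{
    return d->vid == 0x8086 && d->did == 0x1229;
}

/*
 * Return the index of the card that driver instance `instance` should
 * claim, or -1 if fewer than instance+1 matching cards are present.
 */
static int probe(const struct pci_dev *bus, int ndevs, int instance)
{
    int skip = instance;

    for (int i = 0; i < ndevs; i++) {
        if (!is_supported(&bus[i]))
            continue;
        if (skip > 0) {     /* this card belongs to an earlier instance */
            skip--;
            continue;
        }
        return i;
    }
    return -1;
}
```

Since each instance skips exactly as many matching cards as its instance number, no fixed upper bound on the number of cards is needed.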

Other changes:
* Inet.conf is now taken to be line-based, and supports #-comments.
  No existing installations are expected to be affected by this.
* A new, select-based asynchio library replaces the old one.
  Kindly contributed by Kees J. Bot.
* Inet now supports use of select() on IP devices.
  Combined, the last two changes speed up dhcpd
  considerably in the presence of multiple interfaces.
* A small bug has been fixed in nonamed.
2010-05-17 22:22:53 +00:00
Erik van der Kouwe
7570df267f Full 64-bit multiplication and division added to u64 library 2010-05-17 16:44:26 +00:00
Arun Thomas
5706670029 Convert boot/ and commands/ over to bsdmake 2010-05-12 16:28:54 +00:00
Ben Gras
c5c25e7abc kernel/vm: change pde table info from single buffer to explicit per-process.
makes code in kernel more readable, and allows better sanity checking on
using the pde info.
2010-05-12 08:31:05 +00:00
Erik van der Kouwe
b7bf2733d6 Intermediate boot verbosity level EXTRA (2), MAX moved to 3 2010-05-10 18:07:59 +00:00
Tomas Hruby
5c63cac05a Removed defines not used since r6844. 2010-05-10 13:29:04 +00:00
Tomas Hruby
6e25ad8b0a Use of all NIL_* defines converted to NULL 2010-05-10 13:26:00 +00:00
Ben Gras
f78d8e74fd secondary cache feature in vm.
A new call to vm lets processes yield a part of their memory to vm,
together with an id, getting newly allocated memory in return. vm is
allowed to forget about it if it runs out of memory. processes can ask
for it back using the same id. (These two operations are normally
combined in a single call.)

It can be used as an as-big-as-memory-will-allow block cache for
filesystems, which is how mfs now uses it.
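
The yield/get-back semantics can be modelled in a few lines. This toy version only shows the "keyed by id, may be forgotten" behaviour; it does not hand out replacement memory as the real call does, and the slot count, block size, and function names are made up for the example.

```c
#include <string.h>

#define BLOCK_SIZE  16   /* toy block size */
#define CACHE_SLOTS 4    /* stands in for "as much memory as is free" */

struct cache_slot {
    int used;
    unsigned long id;
    char data[BLOCK_SIZE];
};

static struct cache_slot slots[CACHE_SLOTS];
static int next_victim;   /* round-robin eviction pointer */

/* Hand a block over to the cache under the given id. */
static void yield_block(unsigned long id, const char *data)
{
    struct cache_slot *s = NULL;

    for (int i = 0; i < CACHE_SLOTS; i++)
        if (!slots[i].used) { s = &slots[i]; break; }
    if (s == NULL) {
        /* Out of memory: the cache is allowed to forget an older block. */
        s = &slots[next_victim];
        next_victim = (next_victim + 1) % CACHE_SLOTS;
    }
    s->used = 1;
    s->id = id;
    memcpy(s->data, data, BLOCK_SIZE);
}

/* Ask for a block back. Returns 1 and fills `out`, or 0 if it was forgotten. */
static int get_block(unsigned long id, char *out)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (slots[i].used && slots[i].id == id) {
            memcpy(out, slots[i].data, BLOCK_SIZE);
            slots[i].used = 0;
            return 1;
        }
    }
    return 0;
}
```

A file system using such a cache has to treat a miss as normal and fall back to reading the block from disk.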
2010-05-05 11:35:04 +00:00
Ben Gras
029e809780 driver.h: increase max no. of open minors. 2010-05-03 19:43:54 +00:00
Erik van der Kouwe
4b34ff6903 Add syslib function to obtain CPU frequency 2010-05-03 19:41:04 +00:00
Ben Gras
99a13341bd cpufeature() - rename _SSEx and correct logic in cpufeature() in lib 2010-04-29 19:08:49 +00:00
Cristiano Giuffrida
0164957abb Unified crash recovery and live update.
RS CHANGES:
- Crash recovery is now implemented like live update. Two instances are kept
side by side and the dead version is live updated into the new one. The endpoint
doesn't change and the failure is not exposed (by default) to other system
services.
- The new instance can be created reactively (when a crash is detected) or
proactively. In the latter case, RS can be instructed to keep a replica of
the system service to perform a hot swap when the service fails. The flag
SF_USE_REPL is set in that case.
- The new flag SF_USE_REPL is supported for services in the boot image and
dynamically started services through the RS interface (i.e. -p option in the
service utility).
- Fixed a bug that freed unallocated memory for core system services.
2010-04-27 11:17:30 +00:00
Tomas Hruby
30798fc3e1 Removed unused prototype 2010-04-26 23:39:05 +00:00
Tomas Hruby
f51eea4b32 Changed pagefault delivery to VM
this patch changes the way pagefaults are delivered to VM. It adopts
the same model as the out-of-quantum messages sent by kernel to a
scheduler.

- every time a userspace pagefault occurs, the kernel creates a message
  which is sent to VM on behalf of the faulting process

- the process is blocked on delivery to VM in the standard IPC code
  instead of waiting in a special in-kernel queue (stack) and is not
  runnable until VM tells the kernel that the pagefault is resolved and
  that it is free to clear the RTS_PAGEFAULT flag.

- VM does not need to call the kernel and poll the pagefault information,
  which saves many (1/2?) calls and kernel calls that return "no more
  data"

- VM notification by kernel does not need to use signals

- each entry in the proc table is 12 bytes smaller (~3k saved)
2010-04-26 23:21:26 +00:00
Kees van Reeuwijk
86a23c1fbd Remove U16_t and most other similar types. Rewrite functions to ansi-style
declaration if necessary.
2010-04-21 11:05:22 +00:00
Kees van Reeuwijk
8a304627a3 Forgot to add two new files to SVN. 2010-04-20 07:17:03 +00:00
Kees van Reeuwijk
e85f78a20b Add some support for wchar_t. 2010-04-19 15:20:24 +00:00
Erik van der Kouwe
7de730afe4 Add scancode reading capability to TTY 2010-04-15 06:55:42 +00:00
Kees van Reeuwijk
8005ac2c64 Add timerisclear() macro. 2010-04-14 17:51:39 +00:00
Kees van Reeuwijk
fa3adedf63 Remove some duplicate declarations in headers.
Explicitly declare some functions as returning void.
2010-04-13 15:22:38 +00:00
Kees van Reeuwijk
bc314bda91 Remove the types Dev_t, _mnx_Gui, _mnx_Uid, and similar.
Use ANSI-style function declarations where necessary.
2010-04-13 10:58:41 +00:00
Cristiano Giuffrida
65ef539739 Driver mapping refactoring.
VFS CHANGES:
- dmap table no longer statically initialized in VFS
- Dropped FSSIGNON svrctl call no longer used by INET

INET CHANGES:
- INET announces its presence to VFS just like any other driver

RS CHANGES:
- The boot image dev table contains all the data to initialize VFS' dmap table
- RS interface supports asynchronous up and update operations now
- RS interface extended to support driver style and flags
2010-04-09 21:56:44 +00:00
Ben Gras
1c8c8aa4d8 isblank() implementation. 2010-04-08 15:00:25 +00:00
Cristiano Giuffrida
48c6bb79f4 Driver refactoring for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing driver instances at startup.
- A network driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock any callers blocked in a sendrec to the driver.
2010-04-08 13:41:35 +00:00
Kees van Reeuwijk
c114df82ec Rename all uses of U8_t to u8_t and remove U8_t, remove unused I8_t,
Remove all uses of U16_t and U32_t in pci-related code.
If necessary to avoid problems, change functions to ansi-style declaration.
2010-04-07 13:35:56 +00:00
Cristiano Giuffrida
d8b42a755d Move kernel signal SIGKNDELAY to system signal SIGSNDELAY and fix broken ptrace. 2010-03-31 08:55:12 +00:00
Lorenzo Cavallaro
8dfc7699a6 The cdecl calling convention requires arguments to be pushed onto
the stack in reverse order to easily support variadic arguments. Thus,
instead of using the proper stdarg.h macros (that nowadays are
compiler-dependent), it may be tempting to directly take the address of
the last argument and consider it as the start of an array. This is
a shortcut that avoids looping to get all the arguments, as the caller
already pushed them on the stack before the call to the function.

Unfortunately, such an assumption is strictly compiler-dependent and
compilers are free to move the last argument on the stack, as a local
variable, and return the address of the location where the argument was
stored, if asked for. This will break things as the rest of the
arguments in the array are stored elsewhere (typically, a couple of
words above the location where the argument was stored).

This patch fixes the issue by allowing ACK to take the shortcut and
enabling gcc/llvm-gcc to follow the right way.
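
For reference, the portable route through stdarg.h looks like the sketch below. The function `sum_ints` is a made-up example and not part of the patch.

```c
#include <stdarg.h>

/*
 * Sum `count` int arguments using the stdarg.h macros.
 * The non-portable shortcut would instead compute `&count + 1` and index
 * it as an int array, which only works if the compiler keeps the named
 * argument in its original stack slot next to the variadic ones.
 */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* start after the last named argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

For example, `sum_ints(3, 1, 2, 3)` evaluates to 6 regardless of how the compiler lays out the arguments.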
2010-03-30 09:36:46 +00:00
Tomas Hruby
63e2d73d1b Fixed brackets in bitmap macros 2010-03-30 08:34:33 +00:00
Tomas Hruby
5b52c5aa02 A reliable way for userspace to check if a msg is from kernel
- IPC_FLG_MSG_FROM_KERNEL status flag is returned to userspace if the
  receive was satisfied by a message which was sent by the kernel on
  behalf of a process. This is perfectly reliable information.

- MF_SENDING_FROM_KERNEL flag added to processes to be able to set
  IPC_FLG_MSG_FROM_KERNEL when finishing receive if the receiver
  wasn't ready to receive immediately.

- PM is changed to use this information to confirm that the scheduling
  messages are indeed from the kernel and not faked by a process.

  PM uses sef_receive_status()

- get_work() is removed from PM to make the changes simpler
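
A receiver-side check along these lines might look as follows. The bit layout of the status word, the flag value, the message type, and the helper name are all assumptions made for this sketch; only the flag name IPC_FLG_MSG_FROM_KERNEL comes from the commit message.

```c
/* All numeric values below are assumed for illustration. */
#define IPC_STATUS_FLAGS_SHIFT   16
#define IPC_FLG_MSG_FROM_KERNEL  0x1
#define IPC_STATUS_FLAGS(status) ((status) >> IPC_STATUS_FLAGS_SHIFT)
#define SCHEDULING_NO_QUANTUM    0x100   /* hypothetical message type */

/* Stripped-down message with just the fields this check needs. */
struct message {
    int m_source;   /* claimed sender */
    int m_type;     /* request type */
};

/*
 * Accept an "out of quantum" message only if the status word returned by
 * the receive says the kernel sent it on the process's behalf.
 */
static int is_genuine_no_quantum(const struct message *m, int status)
{
    if (m->m_type != SCHEDULING_NO_QUANTUM)
        return 0;
    return (IPC_STATUS_FLAGS(status) & IPC_FLG_MSG_FROM_KERNEL) != 0;
}
```

A process can put any type it likes into a message it sends itself, but it cannot set the status flag, so the check does not rely on message contents alone.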
2010-03-29 11:25:01 +00:00
Tomas Hruby
b4cf88a04f Userspace scheduling
- contributed by Bjorn Swift

- In this first phase, scheduling is moved from the kernel to the PM
  server. The next steps are to a) move scheduling to its own server
  and b) include useful information in the "out of quantum" message,
  so that the scheduler can make use of this information.

- The kernel process table now keeps record of who is responsible for
  scheduling each process (p_scheduler). When this pointer is NULL,
  the process will be scheduled by the kernel. If such a process runs
  out of quantum, the kernel will simply renew its quantum and requeue
  it.

- When PM loads, it will take over scheduling of all running
  processes, except system processes, using sys_schedctl().
  Essentially, this only results in taking over init. As children
  inherit a scheduler from their parent, user space programs forked by
  init will inherit PM (for now) as their scheduler.

 - Once a process has been assigned a scheduler, and runs out of
   quantum, its RTS_NO_QUANTUM flag will be set and the process
   dequeued. The kernel will send a message to the scheduler, on the
   process' behalf, informing the scheduler that it has run out of
   quantum. The scheduler can take whatever action it pleases, based
   on its policy, and then reschedule the process using the
   sys_schedule() system call.

- Balance queues does not work as before. While the old in-kernel
  function used to renew the quantum of processes in the highest
  priority run queue, the user-space implementation only acts on
  processes that have been bumped down to a lower priority queue.
  This approach reacts slower to changes than the old one, but saves
  us sending a sys_schedule message for each process every time we
  balance the queues. Currently, when processes are moved up a
  priority queue, their quantum is also renewed, but this can be
  fiddled with.

- do_nice has been removed from kernel. PM answers to get- and
  setpriority calls, updates its own nice variable as well as the
  max_run_queue. This will be refactored once scheduling is moved to a
  separate server. We will probably have PM update its local nice
  value and then send a message to whoever is scheduling the process.

- changes to fix an issue in do_fork() where processes could run out
  of quantum while bypassing the code path that handles it correctly.
  The future plan is to remove the policy from do_fork() and implement
  it in userspace too.
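
One policy consistent with the description above is sketched here: a process that runs out of quantum drops one queue, and the periodic balance step only touches processes sitting below their best allowed queue. The policy details, constants, and names are assumptions; in the real system each change would be followed by a sys_schedule() call to the kernel, which this model leaves out.

```c
#define MIN_USER_Q      15    /* assumed lowest-priority queue */
#define DEFAULT_QUANTUM 200   /* assumed quantum, arbitrary units */

struct sproc {
    int max_priority;   /* best queue allowed (lower number = better) */
    int priority;       /* current queue */
    unsigned quantum;   /* quantum handed out on the next schedule */
};

/* Called when the kernel reports that the process used up its quantum. */
static void on_out_of_quantum(struct sproc *p)
{
    if (p->priority < MIN_USER_Q)
        p->priority++;              /* bump down one queue */
    p->quantum = DEFAULT_QUANTUM;   /* renew before rescheduling */
}

/*
 * Periodic balancing: move demoted processes up one queue and renew
 * their quantum. Processes already at max_priority are left alone.
 * Returns the number of processes that would need a reschedule.
 */
static int balance_queues(struct sproc *procs, int n)
{
    int moved = 0;

    for (int i = 0; i < n; i++) {
        if (procs[i].priority > procs[i].max_priority) {
            procs[i].priority--;
            procs[i].quantum = DEFAULT_QUANTUM;
            moved++;
        }
    }
    return moved;
}
```

Skipping processes that are already at their best queue is what saves one message per process on every balance pass.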
2010-03-29 11:07:20 +00:00
Lorenzo Cavallaro
a16308efdb cdecl calling convention expects the callee to pop the hidden pointer on
struct return. For example, GCC and LLVM comply with this (tested on IA32).

ACK doesn't seem to follow this convention and expects the caller to clean up
the stack. Compiling hand-written ACK-compliant assembly code (returning a 
struct) with GCC or LLVM used to break things (4-bytes misaligned stack).

The patch fixes this problem.
2010-03-24 17:25:17 +00:00
Kees van Reeuwijk
407316e451 More const correctness.
Removed prototype for unimplemented getpgid() function.
Removed a value return from a void function.
2010-03-23 14:25:09 +00:00
Cristiano Giuffrida
bde2109b7c IPC status code for receive().
IPC changes:
- receive() is changed to take an additional parameter, which is a pointer to
a status code.
- The status code is filled in by the kernel to provide additional information
to the caller. For now, the kernel only fills in the IPC call used by the
sender.

Syslib changes:
- sef_receive() has been split into sef_receive() (with the original semantics)
and sef_receive_status() which exposes the status code to userland.
- Ideally, every sys process should gradually switch to sef_receive_status()
and use is_ipc_notify() as a dependable way to check for notify.
- SEF has been modified to use is_ipc_notify() and demonstrate how to use the
new status code.
2010-03-23 00:09:11 +00:00
Cristiano Giuffrida
ef95bf1bb9 Print stacktrace when a system service fails or when a core dump has to be generated for a user process. 2010-03-22 22:46:29 +00:00
Arun Thomas
436d6012a3 Convert drivers/ and servers/ over to bsdmake
-Move libdriver to lib/
-Install all boot image services on filesystem to aid restartability
2010-03-22 21:25:22 +00:00
Tomas Hruby
12ef495cac atomicity fix when enabling paging
- before enabling paging VM asks kernel to resize its segments. This
  may cause kernel to segfault if APIC is used and an interrupt
  happens between this and paging being enabled. As these are 2 separate
  vmctl calls it is not atomic. This patch fixes this problem. VM does
  not ask kernel to resize the segments in a separate call anymore.
  The new segments limit is part of the "enable paging" call. It
  generalizes this call in such a way that more information can be
  passed as need be or the information may be completely different if
  another architecture requires this.
2010-03-22 07:42:52 +00:00
Kees van Reeuwijk
4432f197c1 Add a define for NSIG. 2010-03-17 13:43:34 +00:00
Cristiano Giuffrida
cb176df60f New RS and new signal handling for system processes.
UPDATING INFO:
20100317:
        /usr/src/etc/system.conf updated to ignore default kernel calls: copy
        it (or merge it) to /etc/system.conf.
        The hello driver (/dev/hello) added to the distribution:
        # cd /usr/src/commands/scripts && make clean install
        # cd /dev && MAKEDEV hello

KERNEL CHANGES:
- Generic signal handling support. The kernel no longer assumes PM as a signal
manager for every process. The signal manager of a given process can now be
specified in its privilege slot. When a signal has to be delivered, the kernel
performs the lookup and forwards the signal to the appropriate signal manager.
PM is the default signal manager for user processes, RS is the default signal
manager for system processes. To enable ptrace()ing for system processes, it
is sufficient to change the default signal manager to PM. This will temporarily
disable crash recovery, though.
- sys_exit() is now split into sys_exit() (i.e. exit() for system processes,
which generates a self-termination signal), and sys_clear() (i.e. used by PM
to ask the kernel to clear a process slot when a process exits).
- Added a new kernel call (i.e. sys_update()) to swap two process slots and
implement live update.

PM CHANGES:
- Posix signal handling is no longer allowed for system processes. System
signals are split into two fixed categories: termination and non-termination
signals. When a non-termination signal is processed, PM transforms the signal
into an IPC message and delivers the message to the system process. When a
termination signal is processed, PM terminates the process.
- PM no longer assumes itself as the signal manager for system processes. It now
makes sure that every system signal goes through the kernel before being
actually processed. The kernel will then dispatch the signal to the appropriate
signal manager which may or may not be PM.

SYSLIB CHANGES:
- Simplified SEF init and LU callbacks.
- Added additional predefined SEF callbacks to debug crash recovery and
live update.
- Fixed a temporary ack in the SEF init protocol. SEF init reply is now
completely synchronous.
- Added SEF signal event type to provide a uniform interface for system
processes to deal with signals. A sef_cb_signal_handler() callback is
available for system processes to handle every received signal. A
sef_cb_signal_manager() callback is used by signal managers to process
system signals on behalf of the kernel.
- Fixed a few bugs with memory mapping and DS.

VM CHANGES:
- Page faults and memory requests coming from the kernel are now implemented
using signals.
- Added a new VM call to swap two process slots and implement live update.
- The call is used by RS at update time and in turn invokes the kernel call
sys_update().

RS CHANGES:
- RS has been reworked with a better functional decomposition.
- Better kernel call masks. com.h now defines the set of very basic kernel calls
every system service is allowed to use. This makes system.conf simpler and
easier to maintain. In addition, this guarantees a higher level of isolation
for system libraries that use one or more kernel calls internally (e.g. printf).
- RS is the default signal manager for system processes. By default, RS
intercepts every signal delivered to every system process. This makes crash
recovery possible before bringing PM and friends in the loop.
- RS now supports fast rollback when something goes wrong while initializing
the new version during a live update.
- Live update is now implemented by keeping the two versions side-by-side and
swapping the process slots when the old version is ready to update.
- Crash recovery is now implemented by keeping the two versions side-by-side
and cleaning up the old version only when the recovery process is complete.

DS CHANGES:
- Fixed a bug when the process doing ds_publish() or ds_delete() is not known
by DS.
- Fixed the completely broken support for strings. String publishing is now
implemented in the system library and simply wraps publishing of memory ranges.
Ideally, we should adopt a similar approach for other data types as well.
- Test suite fixed.

DRIVER CHANGES:
- The hello driver has been added to the Minix distribution to demonstrate basic
live update and crash recovery functionalities.
- Other drivers have been adapted to conform to the new SEF interface.
2010-03-17 01:15:29 +00:00
Kees van Reeuwijk
d89e33fc92 Suppressed some warnings in the WIFSIGNALED macro. 2010-03-15 18:33:29 +00:00
Thomas Veerman
bef0e3eb63 - Add support for the ucontext system calls (getcontext, setcontext,
swapcontext, and makecontext).
- Fix VM to not erroneously think the stack segment and data segment have
  collided when a user-space thread invokes brk().
- Add test51 to test ucontext functionality.
- Add man pages for ucontext system calls.
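
For readers unfamiliar with these calls, the sketch below switches between two contexts using the standard ucontext interface. It is a generic illustration, not code from test51; the stack size and the names `worker` and `run_worker` are arbitrary choices.

```c
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];
static int steps;

/* Runs on its own stack and yields once in the middle. */
static void worker(void)
{
    steps++;
    swapcontext(&worker_ctx, &main_ctx);   /* yield back to the caller */
    steps++;
    /* Returning from here resumes the context named in uc_link. */
}

/* Run the worker to completion and return how many steps it took. */
static int run_worker(void)
{
    steps = 0;
    getcontext(&worker_ctx);               /* initialise the context */
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;        /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&main_ctx, &worker_ctx);   /* run worker until it yields */
    swapcontext(&main_ctx, &worker_ctx);   /* resume worker until it returns */
    return steps;
}
```

Each context needs its own stack, which is why a user-space thread's stack can lie far from the process's main stack and must not be mistaken for a stack/data collision.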
2010-03-12 15:58:41 +00:00
Kees van Reeuwijk
23e97af1b4 Add an UNUSED annotation, and use it in libsys. 2010-03-11 14:23:33 +00:00
Kees van Reeuwijk
88ac328e6b Add prototypes for a bunch of time-related functions. Surprisingly,
they were in the implementation, but not in the header files.
2010-03-09 22:10:58 +00:00
Kees van Reeuwijk
a34d34bc1f Add a set of declarations to math.h. Since we don't actually have
implementations for these functions, we lean on GNU builtin functions
for using them, so these declarations are also conditional on using
a GNU compiler.
2010-03-09 22:05:20 +00:00
Arun Thomas
1f9ce647cf Move archtypes.h, fpu.h, and stackframe.h
Move archtypes.h to include/ dir, since several servers require it. Move
fpu.h and stackframe.h to arch-specific header directory. Make source
files and makefiles aware of the new header locations.
2010-03-09 09:41:14 +00:00
Arun Thomas
2a8fabf4ad Include directory reorg and makefile updates.
-Convert the include directory over to using bsdmake
 syntax
-Update/add mkfiles
-Modify install(1) so that it can create symlinks
-Update makefiles to use new install(1) options
-Rename /usr/include/ibm to /usr/include/i386
-Create /usr/include/machine symlink to arch header files
-Move vm_i386.h to its new home in the /usr/include/i386
-Update source files to #include the header files at their
 new homes.
-Add new gnu-includes target for building GCC headers
2010-03-08 11:04:59 +00:00