Commit graph

3011 commits

Author SHA1 Message Date
Tomas Hruby
b4cf88a04f Userspace scheduling
- cotributed by Bjorn Swift

- In this first phase, scheduling is moved from the kernel to the PM
  server. The next steps are to a) moving scheduling to its own server
  and b) include useful information in the "out of quantum" message,
  so that the scheduler can make use of this information.

- The kernel process table now keeps record of who is responsible for
  scheduling each process (p_scheduler). When this pointer is NULL,
  the process will be scheduled by the kernel. If such a process runs
  out of quantum, the kernel will simply renew its quantum an requeue
  it.

- When PM loads, it will take over scheduling of all running
  processes, except system processes, using sys_schedctl().
  Essentially, this only results in taking over init. As children
  inherit a scheduler from their parent, user space programs forked by
  init will inherit PM (for now) as their scheduler.

 - Once a process has been assigned a scheduler, and runs out of
   quantum, its RTS_NO_QUANTUM flag will be set and the process
   dequeued. The kernel will send a message to the scheduler, on the
   process' behalf, informing the scheduler that it has run out of
   quantum. The scheduler can take what ever action it pleases, based
   on its policy, and then reschedule the process using the
   sys_schedule() system call.

- Balance queues does not work as before. While the old in-kernel
  function used to renew the quantum of processes in the highest
  priority run queue, the user-space implementation only acts on
  processes that have been bumped down to a lower priority queue.
  This approach reacts slower to changes than the old one, but saves
  us sending a sys_schedule message for each process every time we
  balance the queues. Currently, when processes are moved up a
  priority queue, their quantum is also renewed, but this can be
  fiddled with.

- do_nice has been removed from kernel. PM answers to get- and
  setpriority calls, updates it's own nice variable as well as the
  max_run_queue. This will be refactored once scheduling is moved to a
  separate server. We will probably have PM update it's local nice
  value and then send a message to whoever is scheduling the process.

- changes to fix an issue in do_fork() where processes could run out
  of quantum but bypassing the code path that handles it correctly.
  The future plan is to remove the policy from do_fork() and implement
  it in userspace too.
2010-03-29 11:07:20 +00:00
Tomas Hruby
a3ffc0f7ad Removed NIL_SYS_PROC and NIL_PROC
- NIL_PROC replaced by simple NULLs
2010-03-28 09:54:32 +00:00
Kees van Reeuwijk
98493805fd Lots of const correctness. 2010-03-27 14:31:00 +00:00
Cristiano Giuffrida
9192dbecc9 Preserve the order of IPC messages between two parties.
Currently a sequence of messages between a sender A and a receiver B of the
form: A.asynsend(M1, B); A.send(M2, B) may result in the receiver receiving
M1 first and then M2 or viceversa. This patch makes sure that the original
order M1, M2 is always preserved.

Note that the order of a hypotetical sequence A.asynsend(M1, B);
A.asynsend(M2, B) is already guaranteed by the implementation of
asynsend by design. Other senda-based wrappers can define their own
semantics.
2010-03-27 00:09:22 +00:00
Tomas Hruby
8e5a82fd49 Comment in proc.h
- This comment is not correct as the pproc_addr array does not exist.
2010-03-26 13:19:04 +00:00
Tomas Hruby
1dd6f5573a Direction flag
- ack assumes that the direction flag in eflags is clear when
  assigning two structures. It is implemented by a call to a built-in
  function which is like memcpy but needs the flag to be clear
  otherwise rubish is copied. This patch fixes the kernel entries.
2010-03-26 12:29:52 +00:00
Lorenzo Cavallaro
a16308efdb cdecl calling convention expects the callee to pop the hidden pointer on
struct return. For example, GCC and LLVM comply with this (tested on IA32).

ACK doesn't seem to follow this convention and expects the caller to clean up
the stack. Compiling hand-written ACK-compliant assembly code (returning a 
struct) with GCC or LLVM used to break things (4-bytes misaligned stack).

The patch fixes this problem.
2010-03-24 17:25:17 +00:00
Arun Thomas
5fd3f34273 Minor docs/UPDATING fix 2010-03-24 13:41:38 +00:00
Arun Thomas
14ca3ea017 Sort docs/UPDATING entries reverse chronologically 2010-03-24 10:13:19 +00:00
Arun Thomas
0ea82663e0 Fix crtso building with GCC 2010-03-24 10:11:17 +00:00
Kees van Reeuwijk
407316e451 More const correctness.
Removed prototype for unimplemented getpgid() function.
Removed a value return from a void function.
2010-03-23 14:25:09 +00:00
Tomas Hruby
a6957b7847 Fixed prototype in cat 2010-03-23 13:36:16 +00:00
Tomas Hruby
8451a86f0a Interrupts hadling while idle
- When the cpu halts, the interrupts are enable so the cpu may be
  woken up. When the interrupt handler returns but another interrupt
  is available it is also serviced immediately. This is not a problem
  per-se. It only slightly breaks time accounting as idle accounted is
  for the kernel time in the interrupt handler.
  
  
-  As the big kernel lock is lock/unlocked in the smp branch in the
   time acounting functions as they are called exactly at the places
   we need to take the lock) this leads to a deadlock.

- we make sure that once the interrupt handler returns from the nested
  trap, the interrupts are disabled. This means that only one
  interrupt is serviced after idle is interrupted.

- this requires the loop in apic timer calibration to keep reenabling
  the interrupts. I admit it is a little bit hackish (one line),
  however, this code is a stupid corner case at the boot time.
  Hopefully it does not matter too much.
2010-03-23 13:35:01 +00:00
Cristiano Giuffrida
bde2109b7c IPC status code for receive().
IPC changes:
- receive() is changed to take an additional parameter, which is a pointer to
a status code.
- The status code is filled in by the kernel to provide additional information
to the caller. For now, the kernel only fills in the IPC call used by the
sender.

Syslib changes:
- sef_receive() has been split into sef_receive() (with the original semantics)
and sef_receive_status() which exposes the status code to userland.
- Ideally, every sys process should gradually switch to sef_receive_status()
and use is_ipc_notify() as a dependable way to check for notify.
- SEF has been modified to use is_ipc_notify() and demonstrate how to use the
new status code.
2010-03-23 00:09:11 +00:00
Cristiano Giuffrida
45db6482e8 Prioritized NOTIFY messages for reliable asynchonrous delivery of system events. 2010-03-22 23:44:55 +00:00
Cristiano Giuffrida
ef95bf1bb9 Print stacktrace when a system service fails or when a core dump has to be generated for a user process. 2010-03-22 22:46:29 +00:00
Arun Thomas
436d6012a3 Convert drivers/ and servers/ over to bsdmake
-Move libdriver to lib/
-Install all boot image services on filesystem to aid restartability
2010-03-22 21:25:22 +00:00
Kees van Reeuwijk
c33102ea6b Miscellaneous code cleanup. 2010-03-22 20:43:06 +00:00
Ben Gras
4b2310a7ee only print 1 every 1000 spurious interrupts (per interrupt). 2010-03-22 13:55:51 +00:00
Tomas Hruby
12ef495cac atomicity fix when enabling paging
- before enabling paging VM asks kernel to resize its segments. This
  may cause kernel to segfault if APIC is used and an interrupt
  happens between this and paging enabled. As these are 2 separate
  vmctl calls it is not atomic. This patch fixes this problem. VM does
  not ask kernel to resize the segments in a separate call anymore.
  The new segments limit is part of the "enable paging" call. It
  generalizes this call in such a way that more information can be
  passed as need be or the information may be completely different if
  another architecture requires this.
2010-03-22 07:42:52 +00:00
Tomas Hruby
a5094f7d7f Kernel dumps its registers when exception
- if an exception occurs in kernel and this exception is not handled
  in an sane way and the kernel crashes, it also dumps what was loaded
  in the general purpose registers exactly at the time of the
  exception to help to debug the problem
2010-03-20 14:59:18 +00:00
Erik van der Kouwe
b42c66ed10 this patch adds access to the debug breakpoints to
the kernel. They are not used atm, but having them in trunk allows them
to be easily used when needed. To set a breakpoint that triggers when
the variable foo is written to (the most common use case), one calls:

breakpoint_set(vir2phys((vir_bytes) &foo), 0,
  BREAKPOINT_FLAG_MODE_GLOBAL |
  BREAKPOINT_FLAG_RW_WRITE |
  BREAKPOINT_FLAG_LEN_4);

It can later be disabled using:

breakpoint_set(vir2phys((vir_bytes) &foo), 0,
  BREAKPOINT_FLAG_MODE_OFF);

There are some limitations:

- There are at most four breakpoints (hardware limit); the index of the
  breakpoint (0-3) is specified as the second parameter of
  breakpoint_set.

- The breakpoint exception in the kernel is not handled and causes a
  panic; it would be reasonably easy to change this by inspecing DR6,
  printing a message, disabling the breakpoint and continuing. However,
  in my experience even just a panic can be very useful.

- Breakpoints can be set only in the part of the address space that is
  in every page table. It is useful for the kernel, but to use this for
  user processes would require saving and restoring the debug registers
  as part of the context switch. Although the CPU provides support for
  local breakpoints (I implemened this as BREAKPOINT_FLAG_LOCAL) they
  only work if task switching is used.
2010-03-19 19:15:20 +00:00
Erik van der Kouwe
19ff96081c Specify missing return type 2010-03-19 19:07:00 +00:00
Ben Gras
ec30f25d0c VM: fix kernel mappings for children of non-paged parents. 2010-03-18 17:17:31 +00:00
Tomas Hruby
61348227a7 docs/UPDATING correction 2010-03-18 17:00:32 +00:00
Tomas Hruby
a0602c06a3 Fixed kernel stack comment 2010-03-18 16:18:22 +00:00
Ben Gras
e595328986 remove 3 awk files - don't have generated files in svn. 2010-03-18 14:15:48 +00:00
Ben Gras
f250bfaa13 change messy CREATEPDE macro to clean little function.
forget about the dirtypde bitmap and WIPEPDE/DONEPDE macros too.

check if mapping happens to already be in place, and if so, don't
reload cr3 (on the account of that mapping, that is).

don't reload cr3 unconditionally.
2010-03-18 13:35:41 +00:00
Erik van der Kouwe
c3e73f0793 Provide a warning is a kernel call has been denied, to ease system.conf debugging 2010-03-17 18:23:51 +00:00
Kees van Reeuwijk
4683d6b713 Let awk.old install its awk as /usr/bin/awk.old.
Add one-true-awk as the new awk that installs to /usr/bin/awk.
2010-03-17 16:16:52 +00:00
Kees van Reeuwijk
365495b530 Moved current awk sources to awk.old. 2010-03-17 16:11:48 +00:00
Kees van Reeuwijk
4432f197c1 Add a define for NSIG. 2010-03-17 13:43:34 +00:00
Cristiano Giuffrida
cb176df60f New RS and new signal handling for system processes.
UPDATING INFO:
20100317:
        /usr/src/etc/system.conf updated to ignore default kernel calls: copy
        it (or merge it) to /etc/system.conf.
        The hello driver (/dev/hello) added to the distribution:
        # cd /usr/src/commands/scripts && make clean install
        # cd /dev && MAKEDEV hello

KERNEL CHANGES:
- Generic signal handling support. The kernel no longer assumes PM as a signal
manager for every process. The signal manager of a given process can now be
specified in its privilege slot. When a signal has to be delivered, the kernel
performs the lookup and forwards the signal to the appropriate signal manager.
PM is the default signal manager for user processes, RS is the default signal
manager for system processes. To enable ptrace()ing for system processes, it
is sufficient to change the default signal manager to PM. This will temporarily
disable crash recovery, though.
- sys_exit() is now split into sys_exit() (i.e. exit() for system processes,
which generates a self-termination signal), and sys_clear() (i.e. used by PM
to ask the kernel to clear a process slot when a process exits).
- Added a new kernel call (i.e. sys_update()) to swap two process slots and
implement live update.

PM CHANGES:
- Posix signal handling is no longer allowed for system processes. System
signals are split into two fixed categories: termination and non-termination
signals. When a non-termination signaled is processed, PM transforms the signal
into an IPC message and delivers the message to the system process. When a
termination signal is processed, PM terminates the process.
- PM no longer assumes itself as the signal manager for system processes. It now
makes sure that every system signal goes through the kernel before being
actually processes. The kernel will then dispatch the signal to the appropriate
signal manager which may or may not be PM.

SYSLIB CHANGES:
- Simplified SEF init and LU callbacks.
- Added additional predefined SEF callbacks to debug crash recovery and
live update.
- Fixed a temporary ack in the SEF init protocol. SEF init reply is now
completely synchronous.
- Added SEF signal event type to provide a uniform interface for system
processes to deal with signals. A sef_cb_signal_handler() callback is
available for system processes to handle every received signal. A
sef_cb_signal_manager() callback is used by signal managers to process
system signals on behalf of the kernel.
- Fixed a few bugs with memory mapping and DS.

VM CHANGES:
- Page faults and memory requests coming from the kernel are now implemented
using signals.
- Added a new VM call to swap two process slots and implement live update.
- The call is used by RS at update time and in turn invokes the kernel call
sys_update().

RS CHANGES:
- RS has been reworked with a better functional decomposition.
- Better kernel call masks. com.h now defines the set of very basic kernel calls
every system service is allowed to use. This makes system.conf simpler and
easier to maintain. In addition, this guarantees a higher level of isolation
for system libraries that use one or more kernel calls internally (e.g. printf).
- RS is the default signal manager for system processes. By default, RS
intercepts every signal delivered to every system process. This makes crash
recovery possible before bringing PM and friends in the loop.
- RS now supports fast rollback when something goes wrong while initializing
the new version during a live update.
- Live update is now implemented by keeping the two versions side-by-side and
swapping the process slots when the old version is ready to update.
- Crash recovery is now implemented by keeping the two versions side-by-side
and cleaning up the old version only when the recovery process is complete.

DS CHANGES:
- Fixed a bug when the process doing ds_publish() or ds_delete() is not known
by DS.
- Fixed the completely broken support for strings. String publishing is now
implemented in the system library and simply wraps publishing of memory ranges.
Ideally, we should adopt a similar approach for other data types as well.
- Test suite fixed.

DRIVER CHANGES:
- The hello driver has been added to the Minix distribution to demonstrate basic
live update and crash recovery functionalities.
- Other drivers have been adapted to conform the new SEF interface.
2010-03-17 01:15:29 +00:00
David van Moolenbroek
7685e98304 typo 2010-03-16 16:21:28 +00:00
Cristiano Giuffrida
83d1f45578 Fixed a bug in interrupt handling code when removing a handler in case of
a shared IRQ.
2010-03-16 10:20:36 +00:00
Erik van der Kouwe
2ba76cfec6 Minor correction in UPDATING 2010-03-16 08:16:13 +00:00
Arun Thomas
9944688d2b Convert man/ over to new make 2010-03-16 00:15:43 +00:00
Kees van Reeuwijk
d89e33fc92 Suppressed some warnings in the WIFSIGNALED macro. 2010-03-15 18:33:29 +00:00
Thomas Veerman
1749530ec1 Fix bug item #405: missing # in front of comment 2010-03-15 10:42:51 +00:00
Thomas Veerman
bef0e3eb63 - Add support for the ucontext system calls (getcontext, setcontext,
swapcontext, and makecontext).
- Fix VM to not erroneously think the stack segment and data segment have
  collided when a user-space thread invokes brk().
- Add test51 to test ucontext functionality.
- Add man pages for ucontext system calls.
2010-03-12 15:58:41 +00:00
Kees van Reeuwijk
3efbb8f133 Let the commands/simple/tr.c understand about '\t', '\r', and '\n'. 2010-03-12 09:58:44 +00:00
Kees van Reeuwijk
23e97af1b4 Add an UNUSED annotation, and use it in libsys. 2010-03-11 14:23:33 +00:00
Erik van der Kouwe
da25ecf758 Work around KVM unreal mode bug by avoiding unreal mode 2010-03-10 15:32:31 +00:00
Kees van Reeuwijk
5df6b80093 Clean up code in preparation for using gcc warnings. 2010-03-10 13:19:27 +00:00
Ben Gras
0937d6c367 re-establish kernel assert()s.
use the regular <assert.h> assert() instead of vmassert() in
kernel. throw out some #if 0 code. fix a few assert() conditions.
enable by default.
2010-03-10 13:00:05 +00:00
Kees van Reeuwijk
88ac328e6b Add prototypes for a bunch of time-related functions. Surprisingly,
they were in the implementation, but not in the header files.
2010-03-09 22:10:58 +00:00
Kees van Reeuwijk
a34d34bc1f Add a set of declarations to math.h. Since we don't actually have
implementations for these functions, we lean on GNU builtin functions
for using them, so these declarations are also conditional on using
a GNU compiler.
2010-03-09 22:05:20 +00:00
Dirk Vogt
44231405cf fix newsigset/oldsigset references 2010-03-09 20:46:26 +00:00
Kees van Reeuwijk
cf95efbad1 Prevent the use of an unitialized variable for block size in CRC calculation. 2010-03-09 16:21:41 +00:00
Arun Thomas
4206784d82 Flex: Fix install(1) invocation in build 2010-03-09 09:43:53 +00:00