Commit graph

690 commits

Author SHA1 Message Date
Tomas Hruby b09bcf6779 Scheduling server (by Bjorn Swift)
In this second phase, scheduling is moved from PM to its own
scheduler (see r6557 for phase one). In the next phase we hope to a)
include useful information in the "out of quantum" message and b)
create some simple scheduling policy that makes use of that
information.

When the system starts up, PM will iterate over its process table and
ask SCHED to take over scheduling unprivileged processes. This is
done by sending a SCHEDULING_START message to SCHED. This message
includes the processes endpoint, the parent's endpoint and its nice
level. The scheduler adds this process to its schedproc table, issues
a schedctl, and returns its own endpoint to PM - as the endpoint of
the effective scheduler. When a process terminates, a SCHEDULING_STOP
message is sent to the scheduler.

The reason for this effective endpoint is for future compatibility.
Some day, we may have a scheduler that, instead of scheduling the
process itself, forwards the SCHEDULING_START message on to another
scheduler.

PM has information on who schedules whom. As such, scheduling
messages from user-land are sent through PM. An example is when
processes change their priority, using nice(). In that case, a
getsetpriority message is sent to PM, which then sends a
SCHEDULING_SET_NICE to the process's effective scheduler.

When a process is forked through PM, it inherits its parent's
scheduler, but is spawned with an empty quantum. As before, a request
to fork a process flows through VM before returning to PM, which then
wakes up the child process. This flow has been modified slightly so
that PM notifies the scheduler of the new process, before waking up
the child process. If the scheduler fails to take over scheduling,
the child process is torn down and the fork fails with an erroneous
value.

Process priority is entirely decided upon using nice levels. PM
stores a copy of each process's nice level and when a child is
forked, its parent's nice level is sent in the SCHEDULING_START
message. How this level is mapped to a priority queue is up to the
scheduler. It should be noted that the nice level is used to
determine the max_priority and the parent could have been in a lower
priority when it was spawned. To prevent a CPU intensive process from
hawking the CPU by continuously forking children that get scheduled
in the max_priority, the scheduler should determine in which queue
the parent is currently scheduled, and schedule the child in that
same queue.

Other fixes: The USER_Q in kernel/proc.h was incorrectly defined as
NR_SCHED_QUEUES/2. That results in a "off by one" error when
converting priority->nice->priority for nice=0. This also had the
side effect that if someone were to set the MAX_USER_Q to something
else than 0, then USER_Q would be off.
2010-05-18 13:39:04 +00:00
David van Moolenbroek 9ba65d2ea8 This patch switches the MINIX3 ethernet driver stack from a port-based
model to an instance-based model. Each ethernet driver instance is now
responsible for exactly one network interface card. The port field in
/etc/inet.conf now acts as an instance field instead.

This patch also updates the data link protocol. This update:
- eliminates the concept of ports entirely;
- eliminates DL_GETNAME entirely;
- standardizes on using m_source for IPC and DL_ENDPT for safecopies;
- removes error codes from TASK/STAT replies, as they were unused;
- removes a number of other old or unused fields;
- names and renames a few other fields.

All ethernet drivers have been changed to:
- conform to the new protocol, and exactly that;
- take on an instance number based on a given "instance" argument;
- skip that number of PCI devices in probe iterations;
- use config tables and environment variables based on that number;
- no longer be limited to a predefined maximum of cards in any way;
- get rid of any leftover non-safecopy support and other ancient junk;
- have a correct banner protocol figure, or none at all.

Other changes:
* Inet.conf is now taken to be line-based, and supports #-comments.
  No existing installations are expected to be affected by this.
* A new, select-based asynchio library replaces the old one.
  Kindly contributed by Kees J. Bot.
* Inet now supports use of select() on IP devices.
  Combined, the last two changes together speed up dhcpd
  considerably in the presence of multiple interfaces.
* A small bug has been fixed in nonamed.
2010-05-17 22:22:53 +00:00
David van Moolenbroek ce386974bc DS: base number of data entries on NR_SYS_PROCS 2010-05-12 13:21:15 +00:00
Ben Gras c5c25e7abc kernel/vm: change pde table info from single buffer to explicit per-process.
makes code in kernel more readable, and allows better sanity checking on
using the pde info.
2010-05-12 08:31:05 +00:00
Cristiano Giuffrida 23204787d5 - Fixed a bug when running out of priv structures.
- Tell VM about VM calls for every new service instance.
2010-05-11 20:49:42 +00:00
Tomas Hruby d3e991a7b6 PM signal handling check too strict
- this panic may be unnecessarily triggered if PM gets the delayed
  stop signal from kernel before it gets reply from VFS to the UNPAUSE
  call.

- after this change PM does not proceed to delivering the signal until
  the reply from VFS is received. Perhaps PM could deliver the signal
  straight away as it knows that the process does not run. Possibly
 i dangerous.

- the signal is deliverd immediately after the UNPAUSE reply as the
  pending signals are always checked at the moment.
2010-05-10 14:27:22 +00:00
Tomas Hruby 6e25ad8b0a Use of all NIL_* defines converted to NULL 2010-05-10 13:26:00 +00:00
Ben Gras d5a0af826a vm: use arch_map2str to print pagefault info, to properly display code addrs 2010-05-08 17:25:54 +00:00
Tomas Hruby 7c334e2670 RS - fixed timeouts
- rs does not assume hz==60

- rs adjusts its timeout ticks by the system clock frequency

- drivers have time to reply if hz is set too high (e.g. 1000+) for
  instance when debugging
2010-05-07 18:12:16 +00:00
Thomas Veerman 0aceb25535 Small cleanup of dead and/or redundant code. 2010-05-06 09:32:40 +00:00
Ben Gras b6bb75963b vm: remove leftover diag print 2010-05-05 15:26:48 +00:00
Ben Gras 86e1b9d770 fsctl.h doesn't exist. 2010-05-05 11:49:41 +00:00
Ben Gras f78d8e74fd secondary cache feature in vm.
A new call to vm lets processes yield a part of their memory to vm,
together with an id, getting newly allocated memory in return. vm is
allowed to forget about it if it runs out of memory. processes can ask
for it back using the same id. (These two operations are normally
combined in a single call.)

It can be used as a as-big-as-memory-will-allow block cache for
filesystems, which is how mfs now uses it.
2010-05-05 11:35:04 +00:00
Ben Gras 4ac5eb7832 rs: stacktrace if system process exits early. 2010-04-29 08:50:17 +00:00
Cristiano Giuffrida 83ef7119f6 Don't panic when out of priv structures. 2010-04-28 20:41:23 +00:00
Erik van der Kouwe 93f3bf5bda Fix wrong word 2010-04-28 20:37:08 +00:00
Thomas Veerman f9317dc039 Scan all processes for that might be blocked on a lock 2010-04-28 11:54:22 +00:00
Erik van der Kouwe d17590fcf4 Avoid sbrk (in favour of malloc) in RS where possible 2010-04-28 08:35:54 +00:00
Cristiano Giuffrida 0164957abb Unified crash recovery and live update.
RS CHANGES:
- Crash recovery is now implemented like live update. Two instances are kept
side by side and the dead version is live updated into the new one. The endpoint
doesn't change and the failure is not exposed (by default) to other system
services.
- The new instance can be created reactively (when a crash is detected) or
proactively. In the latter case, RS can be instructed to keep a replica of
the system service to perform a hot swap when the service fails. The flag
SF_USE_REPL is set in that case.
- The new flag SF_USE_REPL is supported for services in the boot image and
dynamically started services through the RS interface (i.e. -p option in the
service utility).
- Fixed a free unallocated memory bug for core system services.
2010-04-27 11:17:30 +00:00
Tomas Hruby f51eea4b32 Changed pagefault delivery to VM
this patch changes the way pagefaults are delivered to VM. It adopts
the same model as the out-of-quantum messages sent by kernel to a
scheduler.

- everytime a userspace pagefault occurs, kernel creates a message
  which is sent to VM on behalf of the faulting process

- the process is blocked on delivery to VM in the standard IPC code
  instead of waiting in a spacial in-kernel queue (stack) and is not
  runnable until VM tell kernel that the pagefault is resolved and is
  free to clear the RTS_PAGEFAULT flag.

- VM does not need call kernel and poll the pagefault information
  which saves many (1/2?) calls and kernel calls that return "no more
  data"

- VM notification by kernel does not need to use signals

- each entry in proc table is by 12 bytes smaller (~3k save)
2010-04-26 23:21:26 +00:00
Ben Gras 94edf4fa12 vfs: start at vmnt[0] to sync mounted filesystems, not vmnt[1]. 2010-04-26 17:12:34 +00:00
Kees van Reeuwijk e24ed988d6 Fix some compilation errors with the gcc compiler, fix some recent warnings. 2010-04-22 13:59:34 +00:00
Kees van Reeuwijk 86a23c1fbd Remove U16_t and most other similar types. Rewrite functions to ansi-style
declaration if necessary.
2010-04-21 11:05:22 +00:00
Kees van Reeuwijk e85f78a20b Add some support for wchar_t. 2010-04-19 15:20:24 +00:00
David van Moolenbroek cfb108afc7 fix mfs/isofs signal handling 2010-04-15 16:10:28 +00:00
Ben Gras e0792d72d7 vm: util.S not used currently; leave it out. 2010-04-13 15:02:32 +00:00
Ben Gras 5c17d5e02f vm: include no-caching bits in PTF_ALLFLAGS for flags sanity check. 2010-04-13 11:08:08 +00:00
Ben Gras 1f1f8d2207 vm: don't force physical addresses to be nonzero. 2010-04-13 11:01:40 +00:00
Kees van Reeuwijk bc314bda91 Remove the types Dev_t, _mnx_Gui, _mnx_Uid, and similar.
Use ANSI-style function declarations where necessary.
2010-04-13 10:58:41 +00:00
Tomas Hruby 86378ff645 PM remembers what it should schedule
- while PM implements fork also for RS it needs to remember what to
  schedule and what not. PM_SCHEDULED flag serves this purpose.

- PM only schedules processes that are descendaints of init, i.e. normal
  user processes

- after a process is forked PM schedules for the first time only
  processes that have PM_SCHEDULED set. The others are handled iether
  by kernel or some other scheduler
2010-04-13 10:45:08 +00:00
Ben Gras 5f7c37bb84 vm: remove assert, map in of phys addr 0 is legit sometimes. 2010-04-13 10:39:46 +00:00
Ben Gras 27fc7ab1f3 vm: use assert() instead of vm_assert(); remove vm_assert(). 2010-04-12 12:37:28 +00:00
Ben Gras c78250332d let vm use physically fragmented memory for allocations.
map_copy_ph_block is replaced by map_clone_ph_block, which can
replace a single physical block by multiple physical blocks.

also,
 . merge map_mem.c with region.c, as they manipulate the same
   data structures
 . NOTRUNNABLE removed as sanity check
 . use direct functions for ALLOC_MEM and FREE_MEM again
 . add some checks to shared memory mapping code
 . fix for data structure integrity when using shared memory
 . fix sanity checks
2010-04-12 11:25:24 +00:00
Ben Gras 76fbf21026 ipc server: don't print as many errors, to make ipc test less noisy. 2010-04-12 11:06:15 +00:00
Cristiano Giuffrida 66a8efba53 Fixed escape warning. 2010-04-12 08:39:59 +00:00
Tomas Hruby 9b599bac1d Quantum in fork
- This patch removes the time slice split between parent and child in
  fork.

- The time slice of the parent remains unchanged and the child does
  not have any.

- If the process has a scheduler, the scheduler must assign the
  quantum and priority of the new process and let it run.

- If the child does not inherit a scheduler, it is scheduled by the
  dummy default kernel policy. (servers, drivers, etc.)

- In theory, the scheduler can change the quantum even of the parent
  process and implement any policy for splitting the quantum as
  neither the parent nor the child are runnable.  Sending the
  out-of_quantum message on behalf of the processes may look like the
  right solution, however, the scheduler would probably handle the
  message before the whole fork protocol is finished. This way the
  scheduler has absolute control when the process should become
  runnable.
2010-04-10 15:27:38 +00:00
Tomas Hruby 1a31d158ad Restructure and simplyfycation of the scheduling code in PM a little bit.
- It introduces schedule_process() which makes a kernel call to set
  the scheduling parameters of a process. It is used in the next patch.
2010-04-10 15:24:49 +00:00
Cristiano Giuffrida 65ef539739 Driver mapping refactory.
VFS CHANGES:
- dmap table no longer statically initialized in VFS
- Dropped FSSIGNON svrctl call no longer used by INET

INET CHANGES:
- INET announces its presence to VFS just like any other driver

RS CHANGES:
- The boot image dev table contains all the data to initialize VFS' dmap table
- RS interface supports asynchronous up and update operations now
- RS interface extended to support driver style and flags
2010-04-09 21:56:44 +00:00
Cristiano Giuffrida 98d1cf7064 Fixed gcc -Wall warnings. 2010-04-08 15:02:32 +00:00
Cristiano Giuffrida 48c6bb79f4 Driver refactory for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing drivers instances at startup.
- A newtwork driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock eventual callers sendrecing to the driver.
2010-04-08 13:41:35 +00:00
Kees van Reeuwijk 94a81c840a Removed unused variables, added const where possible. 2010-04-07 11:25:51 +00:00
Arun Thomas 4ed3a0cf3a Convert kernel over to bsdmake 2010-04-01 22:22:33 +00:00
Kees van Reeuwijk fc7dced1fa Fix printfs with too few or too many parms, remove unused vars, fix incorrect flag tests, other code cleanup. 2010-04-01 13:25:05 +00:00
Kees van Reeuwijk c3f649557e Lots of const correctness, other cleanup. 2010-04-01 12:51:31 +00:00
Cristiano Giuffrida d8b42a755d Move kernel signal SIGKNDELAY to system signal SIGSNDELAY and fix broken ptrace. 2010-03-31 08:55:12 +00:00
Thomas Veerman 4d686f1616 Move allocation of temporary inodes for cloned character special devices from
MFS to PFS.
2010-03-30 15:00:09 +00:00
Kees van Reeuwijk 4865e3f4f9 More use of endpoint_t. Other code cleanup. 2010-03-30 14:07:15 +00:00
Ben Gras bc0e36f402 fix null deref; vmnt->mounted_on is NULL legitimately for root.
changed check+panic to assert().

added assert().
2010-03-29 11:39:54 +00:00
Tomas Hruby 5b52c5aa02 A reliable way for userspace to check if a msg is from kernel
- IPC_FLG_MSG_FROM_KERNEL status flag is returned to userspace if the
  receive was satisfied by s message which was sent by the kernel on
  behalf of a process. This perfectly reliale information.

- MF_SENDING_FROM_KERNEL flag added to processes to be able to set
  IPC_FLG_MSG_FROM_KERNEL when finishing receive if the receiver
  wasn't ready to receive immediately.

- PM is changed to use this information to confirm that the scheduling
  messages are indeed from the kernel and not faked by a process.

  PM uses sef_receive_status()

- get_work() is removed from PM to make the changes simpler
2010-03-29 11:25:01 +00:00
Tomas Hruby b4cf88a04f Userspace scheduling
- cotributed by Bjorn Swift

- In this first phase, scheduling is moved from the kernel to the PM
  server. The next steps are to a) moving scheduling to its own server
  and b) include useful information in the "out of quantum" message,
  so that the scheduler can make use of this information.

- The kernel process table now keeps record of who is responsible for
  scheduling each process (p_scheduler). When this pointer is NULL,
  the process will be scheduled by the kernel. If such a process runs
  out of quantum, the kernel will simply renew its quantum an requeue
  it.

- When PM loads, it will take over scheduling of all running
  processes, except system processes, using sys_schedctl().
  Essentially, this only results in taking over init. As children
  inherit a scheduler from their parent, user space programs forked by
  init will inherit PM (for now) as their scheduler.

 - Once a process has been assigned a scheduler, and runs out of
   quantum, its RTS_NO_QUANTUM flag will be set and the process
   dequeued. The kernel will send a message to the scheduler, on the
   process' behalf, informing the scheduler that it has run out of
   quantum. The scheduler can take what ever action it pleases, based
   on its policy, and then reschedule the process using the
   sys_schedule() system call.

- Balance queues does not work as before. While the old in-kernel
  function used to renew the quantum of processes in the highest
  priority run queue, the user-space implementation only acts on
  processes that have been bumped down to a lower priority queue.
  This approach reacts slower to changes than the old one, but saves
  us sending a sys_schedule message for each process every time we
  balance the queues. Currently, when processes are moved up a
  priority queue, their quantum is also renewed, but this can be
  fiddled with.

- do_nice has been removed from kernel. PM answers to get- and
  setpriority calls, updates it's own nice variable as well as the
  max_run_queue. This will be refactored once scheduling is moved to a
  separate server. We will probably have PM update it's local nice
  value and then send a message to whoever is scheduling the process.

- changes to fix an issue in do_fork() where processes could run out
  of quantum but bypassing the code path that handles it correctly.
  The future plan is to remove the policy from do_fork() and implement
  it in userspace too.
2010-03-29 11:07:20 +00:00