Commit graph

160 commits

Author SHA1 Message Date
Ben Gras 196021cd82 drop safemap code 2012-10-30 13:55:42 +01:00
Arne Welzel bf33a1c097 libsys: add sys_safememset() 2012-09-26 02:18:00 +02:00
Ben Gras 2d72cbec41 SYSENTER/SYSCALL support
. add cpufeature detection of both
	. use it for both ipc and kernelcall traps, using a register
	  for call number
	. SYSENTER/SYSCALL does not save any context, therefore userland
	  has to save it
	. to accomodate multiple kernel entry/exit types, the entry
	  type is recorded in the process struct. hitherto all types
	  were interrupt (soft int, exception, hard int); now SYSENTER/SYSCALL
	  is new, with the difference that context is not fully restored
	  from proc struct when running the process again. this can't be
	  done as some information is missing.
	. complication: cases in which the kernel has to fully change
	  process context (i.e. sigreturn). in that case the exit type
	  is changed from SYSENTER/SYSEXIT to soft-int (i.e. iret) and
	  context is fully restored from the proc struct. this does mean
	  the PC and SP must change, as the sysenter/sysexit userland code
	  will otherwise try to restore its own context. this is true in the
	  sigreturn case.
	. override all usage by setting libc_ipc=1
2012-09-24 15:53:43 +02:00
Ben Gras 2cb560297c VM: remove unused dma memory support functions from vm
. unused calls / data structures
2012-09-18 13:17:47 +02:00
David van Moolenbroek 41df1b59f1 libsys: let optset parse largeish positive values
Note that strtoul() also parses negative numbers correctly.
2012-09-03 12:20:17 +00:00
Ben Gras 053fa581b5 vm: remove stack handling for signals
. moved to the kernel as the handling was only
	  reading it; the kernel may as well write it too
2012-08-29 17:31:38 +02:00
Arun Thomas fd43d93ce5 ARM support for system libraries 2012-08-28 13:49:27 -04:00
Arun Thomas a57a591f25 Reorganize arch consts and types
-DEFAULT_HZ const moved to archconst.h
-cpu_info struct moved to archtypes.h
2012-08-16 17:07:43 +02:00
Arun Thomas 697f0d097f Rename sys_vmctl_get_cr3_i386 2012-08-12 23:30:54 +02:00
David van Moolenbroek ebc85e54c4 libsys: resolve Coverity warnings 2012-08-09 00:16:36 +02:00
David van Moolenbroek 49aed1ad97 libsys: remove unused stacktrace variant 2012-08-09 00:16:35 +02:00
David van Moolenbroek 8caec1b57b libsys: 64-bit numbers support for printf()
Change some drivers accordingly.
2012-07-26 09:45:05 +00:00
Ben Gras cbcdb838f1 various coverity-inspired fixes
. some strncpy/strcpy to strlcpy conversions
	. new <minix/param.h> to avoid including other minix headers
	  that have colliding definitions with library and commands code,
	  causing parse warnings
	. removed some dead code / assignments
2012-07-16 14:00:56 +02:00
Ben Gras 50e2064049 No more intel/minix segments.
This commit removes all traces of Minix segments (the text/data/stack
memory map abstraction in the kernel) and significance of Intel segments
(hardware segments like CS, DS that add offsets to all addressing before
page table translation). This ultimately simplifies the memory layout
and addressing and makes the same layout possible on non-Intel
architectures.

There are only two types of addresses in the world now: virtual
and physical; even the kernel and processes have the same virtual
address space. Kernel and user processes can be distinguished at a
glance as processes won't use 0xF0000000 and above.

No static pre-allocated memory sizes exist any more.

Changes to booting:
        . The pre_init.c leaves the kernel and modules exactly as
          they were left by the bootloader in physical memory
        . The kernel starts running using physical addressing,
          loaded at a fixed location given in its linker script by the
          bootloader.  All code and data in this phase are linked to
          this fixed low location.
        . It makes a bootstrap pagetable to map itself to a
          fixed high location (also in linker script) and jumps to
          the high address. All code and data then use this high addressing.
        . All code/data symbols linked at the low addresses is prefixed by
          an objcopy step with __k_unpaged_*, so that that code cannot
          reference highly-linked symbols (which aren't valid yet) or vice
          versa (symbols that aren't valid any more).
        . The two addressing modes are separated in the linker script by
          collecting the unpaged_*.o objects and linking them with low
          addresses, and linking the rest high. Some objects are linked
          twice, once low and once high.
        . The bootstrap phase passes a lot of information (e.g. free memory
          list, physical location of the modules, etc.) using the kinfo
          struct.
        . After this bootstrap the low-linked part is freed.
        . The kernel maps in VM into the bootstrap page table so that VM can
          begin executing. Its first job is to make page tables for all other
          boot processes. So VM runs before RS, and RS gets a fully dynamic,
          VM-managed address space. VM gets its privilege info from RS as usual
          but that happens after RS starts running.
        . Both the kernel loading VM and VM organizing boot processes happen
	  using the libexec logic. This removes the last reason for VM to
	  still know much about exec() and vm/exec.c is gone.

Further Implementation:
        . All segments are based at 0 and have a 4 GB limit.
        . The kernel is mapped in at the top of the virtual address
          space so as not to constrain the user processes.
        . Processes do not use segments from the LDT at all; there are
          no segments in the LDT any more, so no LLDT is needed.
        . The Minix segments T/D/S are gone and so none of the
          user-space or in-kernel copy functions use them. The copy
          functions use a process endpoint of NONE to realize it's
          a physical address, virtual otherwise.
        . The umap call only makes sense to translate a virtual address
          to a physical address now.
        . Segments-related calls like newmap and alloc_segments are gone.
        . All segments-related translation in VM is gone (vir2map etc).
        . Initialization in VM is simpler as no moving around is necessary.
        . VM and all other boot processes can be linked wherever they wish
          and will be mapped in at the right location by the kernel and VM
          respectively.

Other changes:
        . The multiboot code is less special: it does not use mb_print
          for its diagnostics any more but uses printf() as normal, saving
          the output into the diagnostics buffer, only printing to the
          screen using the direct print functions if a panic() occurs.
        . The multiboot code uses the flexible 'free memory map list'
          style to receive the list of free memory if available.
        . The kernel determines the memory layout of the processes to
          a degree: it tells VM where the kernel starts and ends and
          where the kernel wants the top of the process to be. VM then
          uses this entire range, i.e. the stack is right at the top,
          and mmap()ped bits of memory are placed below that downwards,
          and the break grows upwards.

Other Consequences:
        . Every process gets its own page table as address spaces
          can't be separated any more by segments.
        . As all segments are 0-based, there is no distinction between
          virtual and linear addresses, nor between userspace and
          kernel addresses.
        . Less work is done when context switching, leading to a net
          performance increase. (8% faster on my machine for 'make servers'.)
	. The layout and configuration of the GDT makes sysenter and syscall
	  possible.
2012-07-15 22:30:15 +02:00
Thomas Veerman f93afa00e9 Remove MINIXSRCDIR and use NETBSDSRCDIR
NETBSDSRCDIR is used all over the place anyway, and this reduces
our diff with NetBSD a little.
2012-06-18 10:53:35 +00:00
Ben Gras 0fb2f83da9 drop from segments physcopy/vircopy invocations
. sys_vircopy always uses D for both src and dst
	. sys_physcopy uses PHYS_SEG if and only if corresponding
	  endpoint is NONE, so we can derive the mode (PHYS_SEG or D)
	  from the endpoint arg in the kernel, dropping the seg args
	. fields in msg still filled in for backwards compatability,
	  using same NONE-logic in the library
2012-06-18 12:28:40 +00:00
Ben Gras 0e35eb0c6b drop segments from safemap/safeunmap invocations 2012-06-18 12:28:40 +00:00
Ben Gras 2bfeeed885 drop segment from safecopy invocations
. all invocations were S or D, so can safely be dropped
	  to prepare for the segmentless world
	. still assign D to the SCP_SEG field in the message
	  to make previous kernels usable
2012-06-16 16:22:51 +00:00
Ben Gras 769af57274 further libexec generalization
. new mode for sys_memset: include process so memset can be
	  done in physical or virtual address space.
	. add a mode to mmap() that lets a process allocate uninitialized
	  memory.
	. this allows an exec()er (RS, VFS, etc.) to request uninitialized
	  memory from VM and selectively clear the ranges that don't come
	  from a file, leaving no uninitialized memory left for the process
	  to see.
	. use callbacks for clearing the process, clearing memory in the
	  process, and copying into the process; so that the libexec code
	  can be used from rs, vfs, and in the future, kernel (to load vm)
	  and vm (to load boot-time processes)
2012-06-07 15:15:02 +02:00
Ben Gras 040362e379 exec() cleanup, generalization, improvement
. make exec() callers (i.e. vfs and rs) determine the
	  memory layout by explicitly reserving regions using
	  mmap() calls on behalf of the exec()ing process,
	  i.e. handling all of the exec logic, thereby eliminating
	  all special exec() knowledge from VM.
	. the new procedure is: clear the exec()ing process
	  first, then call third-party mmap()s to reserve memory, then
	  copy the executable file section contents in, all using callbacks
	  tailored to the caller's way of starting an executable
	. i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM
	  as with rigid 2-section arguments
	. this naturally allows generalizing exec() by simply loading
	  all ELF sections
	. drop/merge of lots of duplicate exec() code into libexec
	. not copying the code sections to vfs and into the executable
	  again is a measurable performance improvement (about 3.3% faster
	  for 'make' in src/servers/)
2012-06-07 15:15:01 +02:00
Ben Gras ee4016155e vm: add third-party mmap() mode and PROCCTL
these two functions will be used to support all exec() functionality
going into a single library shared by RS and VFS and exec() knowledge
leaving VM.

	. third-party mmap: allow certain processes (VFS, RS) to
	  do mmap() on behalf of another process
	. PROCCTL: used to free and clear a process' address space
2012-06-07 12:43:16 +02:00
David van Moolenbroek 060399d9dd SEF: add sef_cancel()
This function allows the caller to cancel receiving a message from a
SEF callback. The receive function will then return EINTR.
2012-04-09 16:35:57 +02:00
David van Moolenbroek 6aa61efd09 VBOX: add host/guest communication interface
This interface can be used by other system processes by means of the
newly provided vbox API in libsys.
2012-04-09 15:56:20 +02:00
Ben Gras 204ae72525 retire _ANSI and <minix/ansi.h> 2012-03-25 21:58:27 +02:00
Ben Gras 7336a67dfe retire PUBLIC, PRIVATE and FORWARD 2012-03-25 21:58:14 +02:00
Ben Gras 6a73e85ad1 retire _PROTOTYPE
. only good for obsolete K&R support
	. also remove a stray ansi.h and the proto cmd
2012-03-25 16:17:10 +02:00
David van Moolenbroek 70abb127cc Add sys_vumap() kernel call
This new call is a vectored version of sys_umap(). It supports batch
lookups, non-contiguous memory, faulting in memory, and basic access
checks.
2012-03-24 19:51:13 +01:00
David van Moolenbroek 21ed531c8f pci: remove pci_init1 API call 2012-03-07 23:56:08 +01:00
David van Moolenbroek ca95f69f25 drivers: resolve compiler warnings 2012-03-05 22:32:55 +01:00
Ben Gras c543dcf205 fix for -lsys assert(): call panic()
. call panic() instead of abort() so that stacktraces are printed
	. also call printf(..) instead of fprintf(stderr, ..)
2012-02-24 13:09:39 +01:00
David van Moolenbroek 2c685f34e0 Cut PM out of the adddma/deldma/getdma call path 2012-01-14 00:27:06 +01:00
David van Moolenbroek 84662ec4b3 libsys: unbreak getidle() 2011-12-16 16:06:09 +00:00
David van Moolenbroek 6f374faca5 Add "expected size" parameter to getsysinfo()
This patch provides basic protection against damage resulting from
differently compiled servers blindly copying tables to one another.
In every getsysinfo() call, the caller is provided with the expected
size of the requested data structure. The callee fails the call if
the expected size does not match the data structure's actual size.
2011-12-11 22:34:14 +01:00
David van Moolenbroek 2f622b3a51 SEF: default to endpoint-changing restart 2011-12-05 16:28:07 +01:00
Thomas Veerman a209c3ae12 Fix a ton of compiler warnings
This patch fixes most of current reasons to generate compiler warnings.
The changes consist of:
 - adding missing casts
 - hiding or unhiding function declarations
 - including headers where missing
 - add __UNCONST when assigning a const char * to a char *
 - adding missing return statements
 - changing some types from unsigned to signed, as the code seems to want
   signed ints
 - converting old-style function definitions to current style (i.e.,
   void func(param1, param2) short param1, param2; {...} to
   void func (short param1, short param2) {...})
 - making the compiler silent about signed vs unsigned comparisons. We
   have too many of those in the new libc to fix.

A number of bugs in the test set were fixed. These bugs were never
triggered with our old libc. Consequently, these tests are now forced to
link with the new libc or they will generate errors (in particular tests 43
and 55).

Most changes in NetBSD libc are limited to moving aroudn "#ifndef __minix"
or stuff related to Minix-specific things (code in sys-minix or gen/minix).
2011-11-14 10:07:49 +00:00
David van Moolenbroek 2602861f23 Move optset.c into libsys; remove redundant copies 2011-11-07 16:16:08 +01:00
Arun Thomas 05341dfeb0 sef: build sef_debug_header only when needed 2011-09-19 18:17:45 +02:00
Arun Thomas 92fa3189ab MKSYSDEBUG: conditionally compile more debug code 2011-09-16 15:25:26 +02:00
Arun Thomas 4ca68d42a0 Add MKLIVEUPDATE and MKSTATECTL 2011-09-02 16:57:22 +02:00
Arun Thomas 5cddf6dce8 libsys: don't pull in strerror 2011-08-30 21:10:34 +02:00
Arun Thomas 86b061078b Build gcov code only if MKCOVERAGE is yes 2011-08-09 10:39:33 +02:00
Ben Gras 02081e4b62 rename mmap() and munmap()
. it's a good extra interface to have but doesn't
	  meet standardised functionality
	. applications (in pkgsrc) find it and expect
	  full functionality the minix mmap doesn't offter
	. on the whole probably better to hide these functions
	  (mmap and friends) until they are grown up; the base system
	  can use the new minix_* names
2011-07-16 13:01:19 +02:00
Evgeniy Ivanov 5da4a0bd56 Move minimal libc from libsys into separate lib.
Now users can choose between libsys, libsys + libminc and
libsys + libc. E.g. PUFFS/FUSE servers need libsys + libc while
old servers can use libsys + libminc.
2011-07-09 22:32:38 +02:00
Ben Gras 9c01ceb576 introduce sqrt_approx() in -lsys
. use this to avoid -lm dependency in mfs
2011-07-04 02:51:12 +02:00
Gianluca Guida cc17b27a2b Build NetBSD libc library in world in ELF mode.
3 sets of libraries are built now:
  . ack: all libraries that ack can compile (/usr/lib/i386/)
  . clang+elf: all libraries with minix headers (/usr/lib/)
  . clang+elf: all libraries with netbsd headers (/usr/netbsd/)

Once everything can be compiled with netbsd libraries and headers, the
/usr/netbsd hierarchy will be obsolete and its libraries compiled with
netbsd headers will be installed in /usr/lib, and its headers
in /usr/include. (i.e. minix libc and current minix headers set
will be gone.)

To use the NetBSD libc system (libraries + headers) before
it is the default libc, see:
   http://wiki.minix3.org/en/DevelopersGuide/UsingNetBSDCode
This wiki page also documents the maintenance of the patch
files of minix-specific changes to imported NetBSD code.

Changes in this commit:
  . libsys: Add NBSD compilation and create a safe NBSD-based libc.
  . Port rest of libraries (except libddekit) to new header system.
  . Enable compilation of libddekit with new headers.
  . Enable kernel compilation with new headers.
  . Enable drivers compilation with new headers.
  . Port legacy commands to new headers and libc.
  . Port servers to new headers.
  . Add <sys/sigcontext.h> in compat library.
  . Remove dependency file in tree.
  . Enable compilation of common/lib/libc/atomic in libsys
  . Do not generate RCSID strings in libc.
  . Temporarily disable zoneinfo as they are incompatible with NetBSD format
  . obj-nbsd for .gitignore
  . Procfs: use only integer arithmetic. (Antoine Leca)
  . Increase ramdisk size to create NBSD-based images.
  . Remove INCSYMLINKS handling hack.
  . Add nbsd_include/sys/exec_elf.h
  . Enable ELF compilation with NBSD libc.
  . Add 'make nbsdsrc' in tools to download reference NetBSD sources.
  . Automate minix-port.patch creation.
  . Avoid using fstavfs() as it is *extremely* slow and unneeded.
  . Set err() as PRIVATE to avoid name clash with libc.
  . [NBSD] servers/vm: remove compilation warnings.
  . u32 is not a long in NBSD headers.
  . UPDATING info on netbsd hierarchy
  . commands fixes for netbsd libc
2011-06-24 11:46:30 +02:00
Ben Gras e3f68488ee fix many clang warnings in lib/ 2011-06-23 19:25:36 +02:00
Erik van der Kouwe 6e0f3b3bda Split off sys_umap_remote from sys_umap
sys_umap now supports only:
- looking up the physical address of a virtual address in the address space
  of the caller;
- looking up the physical address of a grant for which the caller is the
  grantee.

This is enough for nearly all umap users. The new sys_umap_remote supports
lookups in arbitrary address spaces and grants for arbitrary grantees.
2011-06-10 14:28:20 +00:00
Erik van der Kouwe e969b5e11b Remote unused segctl kernel call 2011-04-26 23:28:23 +02:00
David van Moolenbroek c51cd5fe91 Server/driver protocols: no longer allow third-party copies.
Before safecopies, the IO_ENDPT and DL_ENDPT message fields were needed
to know which actual process to copy data from/to, as that process may
not always be the caller. Now that we have full safecopy support, these
fields have become useless for that purpose: the owner of the grant is
*always* the caller. Allowing the caller to supply another endpoint is
in fact dangerous, because the callee may then end up using a grant
from a third party. One could call this a variant of the confused
deputy problem.

From now on, safecopy calls should always use the caller's endpoint as
grant owner. This fully obsoletes the DL_ENDPT field in the
inet/ethernet protocol. IO_ENDPT has other uses besides identifying the
grant owner though. This patch renames IO_ENDPT to USER_ENDPT, not only
because that is a more fitting name (it should never be used for I/O
after all), but also in order to intentionally break any old system
source code outside the base system. If this patch breaks your code,
fixing it is fairly simple:

- DL_ENDPT should be replaced with m_source;
- IO_ENDPT should be replaced with m_source when used for safecopies;
- IO_ENDPT should be replaced with USER_ENDPT for any other use, e.g.
  when setting REP_ENDPT, matching requests in CANCEL calls, getting
  DEV_SELECT flags, and retrieving of the real user process's endpoint
  in DEV_OPEN.

The changes in this patch are binary backward compatible.
2011-04-11 17:35:05 +00:00
David van Moolenbroek c4928b2df9 libsys: fix micro_delay() 2011-04-08 16:57:44 +00:00
Thomas Veerman 2cde22ee10 Enable a process to find out what the error code was when delivery of an
asynchronous message resulted in an error.

The model here is that:
 - Iff a sender wishes to be notified, the sender MUST check for errors
   BEFORE sending another asynchronous message.

The reason is that in order to remember the error code, we can't clean up
the message table and hence we risk running out of table space. This is
less of a problem when the sender enables notifications only for errors.
2011-04-08 15:23:12 +00:00
Arun Thomas 25a790a631 VM and kernel support for ELF 2011-02-26 23:00:55 +00:00
Ben Gras dc1cc91df1 <ansi.h> -> <minix/ansi.h> 2011-01-28 11:35:02 +00:00
Erik van der Kouwe 04229f0581 Servers request TSC freq from kernel rather than each one measuring it individually 2011-01-11 11:03:37 +00:00
Dirk Vogt c22564335f Added possibility to inject input events to tty
M    include/Makefile
A    include/minix/input.h
M    include/minix/com.h
M    drivers/tty/keyboard.c
M    drivers/tty/tty.c
M    drivers/tty/tty.h
M    include/minix/syslib.h
M    lib/libsys/Makefile
A    lib/libsys/input.c
2010-11-17 14:53:07 +00:00
Arun Thomas aaaad89244 Use int64 functions consistently
Instead of manipulating the u64_t type directly, use the
ex64hi()/ex64lo()/make64() functions.
2010-11-07 23:35:29 +00:00
Erik van der Kouwe b0eaf0bc27 make system server vprintf check for NULL 2010-10-04 17:53:18 +00:00
Ben Gras 68de328ac1 make the asynsend table size NPROCS-dependent.
this is a fix for e.g. the situation where lots of processes die
instantly, and PM has to send an asyn msg for each one to VFS, and
panics if there are too many. there are likely more situations in
which this table should be dependent on the no. of processes.

reported by pikpik on #minix3.
2010-10-01 14:39:04 +00:00
Tomas Hruby 74c5cd7668 The profile utility can set the sprofiling mode
- profile --nmi | --rtc sets the profiling mode

- --rtc is default, uses BIOS RTC, cannot profile kernel the presetted
  frequency values apply

- --nmi is only available in APIC mode as it uses the NMI watchdog, -f
  allows any frequency in Hz

- both modes use compatible data structures
2010-09-23 10:49:42 +00:00
David van Moolenbroek adbc4e4ea7 libsys: tsc_to_micros support for large TSC delta values 2010-09-23 09:26:42 +00:00
Ben Gras 250fb23dc0 lib/libsys/gcov.c - fix gcc warning 2010-09-20 11:36:41 +00:00
Tomas Hruby 06b6e5624a SMP - Changed prototype of sys_schedule()
- sys_schedule can change only selected values, -1 means that the
  current value should be kept unchanged. For instance we mostly want
  to change the scheduling quantum and priority but we want to keep
  the process at the current cpu

- RS can hand off its processes to scheduler

- service can read the destination cpu from system.conf

- RS can pass the information farther
2010-09-15 14:10:42 +00:00
David van Moolenbroek 354da24f5b make getsysinfo() a system-land call 2010-09-14 21:50:05 +00:00
Ben Gras 23311d9819 lib: fixes to make clang not error 2010-09-13 15:50:54 +00:00
Ben Gras c81f201c8c added missing sef_gcov.c 2010-08-25 13:23:32 +00:00
Ben Gras 5d6c2aae0a gcov support, based on work contributed by Anton Kuijsten. 2010-08-25 13:06:43 +00:00
Arun Thomas de231a713e Move MIN() and MAX() macros to sys/params.h 2010-08-21 13:10:41 +00:00
Erik van der Kouwe b337d3f8e5 move rrrrrrread_tsc from libsys to libc so anyone can use it 2010-08-20 18:43:56 +00:00
Arun Thomas 9a21d1a2fd Macros for symbols used in both ASM and C
-The macros take care of prepending the leading underscore when
 necessary.
2010-08-17 16:44:07 +00:00
Cristiano Giuffrida 8cedace2f5 Scheduling parameters out of the kernel. 2010-07-13 15:30:17 +00:00
David van Moolenbroek 1ecdac623a libsys: add standard condition spinning primitives 2010-07-12 23:14:40 +00:00
Cristiano Giuffrida 8427d774b6 RS live update support. 2010-07-09 18:29:04 +00:00
David van Moolenbroek 895850b8cf move timers code to libsys 2010-07-09 12:58:18 +00:00
Cristiano Giuffrida 1f8dbed029 RS crash recovery support. 2010-07-06 22:05:21 +00:00
Ben Gras 68db8ed0b9 lib: fixes for warnings that clang has for libraries. 2010-07-06 12:08:22 +00:00
David van Moolenbroek 2488cc6442 PCI: expose BAR sizes 2010-07-01 09:10:16 +00:00
Erik van der Kouwe 4690e8b015 Opps, forgot to svn add these files 2010-07-01 08:38:15 +00:00
Erik van der Kouwe 23284ee7bd User-space scheduling for system processes 2010-07-01 08:32:33 +00:00
Cristiano Giuffrida 377f4e7e31 Fix and comment a race in SEF Init 2010-06-27 09:01:15 +00:00
Arun Thomas c0c8d25799 Rename mkfiles from minix.*.mk to bsd.*.mk
Makes things easier for pkgsrc
2010-06-25 18:29:09 +00:00
David van Moolenbroek eeab8e0680 libdriver: make partition code use a contiguous buffer 2010-06-13 10:40:22 +00:00
Arun Thomas 1b2c01db1b Makefile updates:
Turn on optimization
Remove some redundancy in FLAGS
2010-06-11 16:05:36 +00:00
Arun Thomas f0a158d8c1 More cleanup to remove MM and FS references 2010-06-10 14:04:46 +00:00
Thomas Veerman 6bbcab3ec4 Clean up MFS a bit:
- Remove unused includes.
 - Add include guards to headers.
 - Use unsigned variables in case they're never going to hold a negative
   value. This causes GCC's complaints to disappear and should make flexelint
   a lot happier, too.
 - Make functions private when they're used only within a module.
 - Remove unused variables.
 - Add casts where appropriate.
2010-06-01 12:35:33 +00:00
Tomas Hruby a8111c5027 Various small scheduling related fixes 2010-05-26 07:16:39 +00:00
Erik van der Kouwe 1f11a57141 Oops, last commit included more than was intended 2010-05-20 08:07:47 +00:00
Erik van der Kouwe 5f15ec05b2 More system processes, this was not enough for the release script to run on some configurations 2010-05-20 08:05:07 +00:00
Tomas Hruby b09bcf6779 Scheduling server (by Bjorn Swift)
In this second phase, scheduling is moved from PM to its own
scheduler (see r6557 for phase one). In the next phase we hope to a)
include useful information in the "out of quantum" message and b)
create some simple scheduling policy that makes use of that
information.

When the system starts up, PM will iterate over its process table and
ask SCHED to take over scheduling unprivileged processes. This is
done by sending a SCHEDULING_START message to SCHED. This message
includes the processes endpoint, the parent's endpoint and its nice
level. The scheduler adds this process to its schedproc table, issues
a schedctl, and returns its own endpoint to PM - as the endpoint of
the effective scheduler. When a process terminates, a SCHEDULING_STOP
message is sent to the scheduler.

The reason for this effective endpoint is for future compatibility.
Some day, we may have a scheduler that, instead of scheduling the
process itself, forwards the SCHEDULING_START message on to another
scheduler.

PM has information on who schedules whom. As such, scheduling
messages from user-land are sent through PM. An example is when
processes change their priority, using nice(). In that case, a
getsetpriority message is sent to PM, which then sends a
SCHEDULING_SET_NICE to the process's effective scheduler.

When a process is forked through PM, it inherits its parent's
scheduler, but is spawned with an empty quantum. As before, a request
to fork a process flows through VM before returning to PM, which then
wakes up the child process. This flow has been modified slightly so
that PM notifies the scheduler of the new process, before waking up
the child process. If the scheduler fails to take over scheduling,
the child process is torn down and the fork fails with an erroneous
value.

Process priority is entirely decided upon using nice levels. PM
stores a copy of each process's nice level and when a child is
forked, its parent's nice level is sent in the SCHEDULING_START
message. How this level is mapped to a priority queue is up to the
scheduler. It should be noted that the nice level is used to
determine the max_priority and the parent could have been in a lower
priority when it was spawned. To prevent a CPU intensive process from
hawking the CPU by continuously forking children that get scheduled
in the max_priority, the scheduler should determine in which queue
the parent is currently scheduled, and schedule the child in that
same queue.

Other fixes: The USER_Q in kernel/proc.h was incorrectly defined as
NR_SCHED_QUEUES/2. That results in a "off by one" error when
converting priority->nice->priority for nice=0. This also had the
side effect that if someone were to set the MAX_USER_Q to something
else than 0, then USER_Q would be off.
2010-05-18 13:39:04 +00:00
Ben Gras c5c25e7abc kernel/vm: change pde table info from single buffer to explicit per-process.
makes code in kernel more readable, and allows better sanity checking on
using the pde info.
2010-05-12 08:31:05 +00:00
Ben Gras f78d8e74fd secondary cache feature in vm.
A new call to vm lets processes yield a part of their memory to vm,
together with an id, getting newly allocated memory in return. vm is
allowed to forget about it if it runs out of memory. processes can ask
for it back using the same id. (These two operations are normally
combined in a single call.)

It can be used as a as-big-as-memory-will-allow block cache for
filesystems, which is how mfs now uses it.
2010-05-05 11:35:04 +00:00
Erik van der Kouwe 4b34ff6903 Add syslib function to obtain CPU frequency 2010-05-03 19:41:04 +00:00
Erik van der Kouwe a033e6fcae Add missing newline at end of file 2010-04-27 13:30:46 +00:00
Cristiano Giuffrida 0164957abb Unified crash recovery and live update.
RS CHANGES:
- Crash recovery is now implemented like live update. Two instances are kept
side by side and the dead version is live updated into the new one. The endpoint
doesn't change and the failure is not exposed (by default) to other system
services.
- The new instance can be created reactively (when a crash is detected) or
proactively. In the latter case, RS can be instructed to keep a replica of
the system service to perform a hot swap when the service fails. The flag
SF_USE_REPL is set in that case.
- The new flag SF_USE_REPL is supported for services in the boot image and
dynamically started services through the RS interface (i.e. -p option in the
service utility).
- Fixed a free unallocated memory bug for core system services.
2010-04-27 11:17:30 +00:00
Tomas Hruby f51eea4b32 Changed pagefault delivery to VM
this patch changes the way pagefaults are delivered to VM. It adopts
the same model as the out-of-quantum messages sent by kernel to a
scheduler.

- everytime a userspace pagefault occurs, kernel creates a message
  which is sent to VM on behalf of the faulting process

- the process is blocked on delivery to VM in the standard IPC code
  instead of waiting in a spacial in-kernel queue (stack) and is not
  runnable until VM tell kernel that the pagefault is resolved and is
  free to clear the RTS_PAGEFAULT flag.

- VM does not need call kernel and poll the pagefault information
  which saves many (1/2?) calls and kernel calls that return "no more
  data"

- VM notification by kernel does not need to use signals

- each entry in proc table is by 12 bytes smaller (~3k save)
2010-04-26 23:21:26 +00:00
David van Moolenbroek aacbfc41cc intercept puts() in libsys, for gcc 2010-04-23 20:23:33 +00:00
Cristiano Giuffrida 48c6bb79f4 Driver refactory for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing drivers instances at startup.
- A newtwork driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock eventual callers sendrecing to the driver.
2010-04-08 13:41:35 +00:00
Kees van Reeuwijk c114df82ec Rename all uses of U8_t to u8_t and remove U8_t, remove unused I8_t,
Remove all uses of U16_t and U32_t in pci-related code.
If necessary to avoid problems, change functions to ansi-style declaration.
2010-04-07 13:35:56 +00:00
Kees van Reeuwijk 0a04f49d2b Fixed some incorrect uses of printf-like functions. 2010-04-01 14:30:36 +00:00
Cristiano Giuffrida d8b42a755d Move kernel signal SIGKNDELAY to system signal SIGSNDELAY and fix broken ptrace. 2010-03-31 08:55:12 +00:00
Tomas Hruby b4cf88a04f Userspace scheduling
- cotributed by Bjorn Swift

- In this first phase, scheduling is moved from the kernel to the PM
  server. The next steps are to a) moving scheduling to its own server
  and b) include useful information in the "out of quantum" message,
  so that the scheduler can make use of this information.

- The kernel process table now keeps record of who is responsible for
  scheduling each process (p_scheduler). When this pointer is NULL,
  the process will be scheduled by the kernel. If such a process runs
  out of quantum, the kernel will simply renew its quantum an requeue
  it.

- When PM loads, it will take over scheduling of all running
  processes, except system processes, using sys_schedctl().
  Essentially, this only results in taking over init. As children
  inherit a scheduler from their parent, user space programs forked by
  init will inherit PM (for now) as their scheduler.

 - Once a process has been assigned a scheduler, and runs out of
   quantum, its RTS_NO_QUANTUM flag will be set and the process
   dequeued. The kernel will send a message to the scheduler, on the
   process' behalf, informing the scheduler that it has run out of
   quantum. The scheduler can take what ever action it pleases, based
   on its policy, and then reschedule the process using the
   sys_schedule() system call.

- Balance queues does not work as before. While the old in-kernel
  function used to renew the quantum of processes in the highest
  priority run queue, the user-space implementation only acts on
  processes that have been bumped down to a lower priority queue.
  This approach reacts slower to changes than the old one, but saves
  us sending a sys_schedule message for each process every time we
  balance the queues. Currently, when processes are moved up a
  priority queue, their quantum is also renewed, but this can be
  fiddled with.

- do_nice has been removed from kernel. PM answers to get- and
  setpriority calls, updates it's own nice variable as well as the
  max_run_queue. This will be refactored once scheduling is moved to a
  separate server. We will probably have PM update it's local nice
  value and then send a message to whoever is scheduling the process.

- changes to fix an issue in do_fork() where processes could run out
  of quantum but bypassing the code path that handles it correctly.
  The future plan is to remove the policy from do_fork() and implement
  it in userspace too.
2010-03-29 11:07:20 +00:00