Commit graph

276 commits

Author SHA1 Message Date
Ben Gras
0b79eac642 mmap: accept non-PROT_WRITE MAP_SHARED mappings
Currently we don't accept writable file mmap()s, as there is no
system in place to guarantee dirty buffers would make it back to
disk. But we can actually accept MAP_SHARED for PROT_READ mappings,
meaning the ranges aren't writable at all (and no private copy is
made as with MAP_PRIVATE), as it turns out a fairly large class of
usage.

	. fail writable MAP_SHARED mappings at runtime
	. reduces some minix-specific patches
	. lets binutils gold build on minix without further patching

Change-Id: If2896c0a555328ac5b324afa706063fc6d86519e
2014-07-28 17:05:20 +02:00
Ben Gras
565f13088f make vfs & filesystems use failable copying
Change the kernel to add features to vircopy and safecopies so that
transparent copy fixing won't happen to avoid deadlocks, and such copies
fail with EFAULT.

Transparently making copying work from filesystems (as normally done by
the kernel & VM when copying fails because of missing/readonly memory)
is problematic as it can happen that, for file-mapped ranges, that that
same filesystem that is blocked on the copy request is needed to satisfy
the memory range, leading to deadlock. Dito for VFS itself, if done with
a blocking call.

This change makes the copying done from a filesystem fail in such cases
with EFAULT by VFS adding the CPF_TRY flag to the grants. If a FS call
fails with EFAULT, VFS will then request the range to be made available
to VM after the FS is unblocked, allowing it to be used to satisfy the
range if need be in another VFS thread.

Similarly, for datacopies that VFS itself does, it uses the failable
vircopy variant and callers use a wrapper that talk to VM if necessary
to get the copy to work.

	. kernel: add CPF_TRY flag to safecopies
	. kernel: only request writable ranges to VM for the
	  target buffer when copying fails
	. do copying in VFS TRY-first
	. some fixes in VM to build SANITYCHECK mode
	. add regression test for the cases where
	  - a FS system call needs memory mapped in a process that the
	    FS itself must map.
	  - such a range covers more than one file-mapped region.
	. add 'try' mode to vircopy, physcopy
	. add flags field to copy kernel call messages
	. if CP_FLAG_TRY is set, do not transparently try
	  to fix memory ranges
	. for use by VFS when accessing user buffers to avoid
	  deadlock
	. remove some obsolete backwards compatability assignments
        . VFS: let thread scheduling work for VM requests too
          Allows VFS to make calls to VM while suspending and resuming
          the currently running thread. Does currently not work for the
          main thread.
        . VM: add fix memory range call for use by VFS

Change-Id: I295794269cea51a3163519a9cfe5901301d90b32
2014-07-28 17:05:14 +02:00
David van Moolenbroek
f199fc0bfe VM: fix corruption from recursive PDE allocation
Change-Id: I6176b849fefca4bed3e92648b0d72ff47658915c
2014-07-28 17:05:13 +02:00
Ben Gras
ed9076ccb4 64-bit VFS_VMCALL_OFFSET
Change-Id: I29725365a199f850420cd0e4e3902cf70dffe9ad
2014-07-28 17:05:10 +02:00
Lionel Sambuc
84d9c625bf Synchronize on NetBSD-CVS (2013/12/1 12:00:00 UTC)
- Fix for possible unset uid/gid in toproto
 - Fix for default mtree style
 - Update libelf
 - Importing libexecinfo
 - Resynchronize GCC, mpc, gmp, mpfr
 - build.sh: Replace params with show-params.
     This has been done as the make target has been renamed in the same
     way, while a new target named params has been added. This new
     target generates a file containing all the parameters, instead of
     printing it on the console.
 - Update test48 with new etc/services (Fix by Ben Gras <ben@minix3.org)
     get getservbyport() out of the inner loop

Change-Id: Ie6ad5226fa2621ff9f0dee8782ea48f9443d2091
2014-07-28 17:05:06 +02:00
Ben Gras
be9fe09e97 x86 multiboot.h
Change-Id: I245564a98fb9e2572b88f8feb7411ad6800a543c
2014-03-03 20:47:05 +01:00
Ben Gras
dda632a24f drop the minix_ prefixes for mmap and munmap
also cleanup of various minix-specific changes, cleanup of
mmap-related testing.

Change-Id: I289a4fc50cf8a13df4a6082038d860853a4bd024
2014-03-03 20:47:03 +01:00
Lionel Sambuc
301f5f87f0 Renamed m_vm_vfs to m_vm_vfs_mmap.
Stay coherent with the naming scheme of the messages.

Change-Id: Icc0e13a88ec29263502166c0e6eec81cdb974663
2014-03-03 20:47:00 +01:00
Lionel Sambuc
175d3e7eae Changing the message union to anonymous.
This allows us to write things like this:
  message m;
  m.m_notify.interrupts = new_value;

or
  message *mp;
  mp->m_notify.interrupts = new_value;

The shorthands macro have been adapted for the new scheme, and will be
kept as long as we have generic messages being used.

Change-Id: Icfd02b5f126892b1d5d2cebe8c8fb02b180000f7
2014-03-03 20:46:47 +01:00
Ben Gras
88be7bd333 Use netbsd <sys/mman.h>
Change-Id: I80e9cffc80140383a6faf692248573c64d282b4a
2014-03-03 20:37:27 +01:00
Ben Gras
25719b5d92 bigger message
Change-Id: Ie770140c55799bdc3bb8f0ad6994d59938155a1a
2014-03-02 12:28:31 +01:00
Lionel Sambuc
c3fc9df84a Adding ipc_ prefix to ipc primitives
* Also change _orig to _intr for clarity
 * Cleaned up {IPC,KER}VEC
 * Renamed _minix_kernel_info_struct to get_minix_kerninfo
 * Merged _senda.S into _ipc.S
 * Moved into separate files get_minix_kerninfo and _do_kernel_call
 * Adapted do_kernel_call to follow same _ convention as ipc functions
 * Drop patches in libc/net/send.c and libc/include/namespace.h

Change-Id: If4ea21ecb65435170d7d87de6c826328e84c18d0
2014-03-01 09:05:01 +01:00
David van Moolenbroek
24ec0d73b5 Clean up interface to PM and VFS
- introduce new call numbers, names, and field aliases;
- initialize request messages to zero for all ABI calls;
- format callnr.h in the same way as com.h;
- redo call tables in both servers;
- remove param.h namespace pollution in the servers;
- make brk(2) go to VM directly, rather than through PM;
- remove obsolete BRK, UTIME, and WAIT calls;
- clean up path copying routine in VFS;
- move remaining system calls from libminlib to libc;
- correct some errno-related mistakes in libc routines.

Change-Id: I2d8ec5d061cd7e0b30c51ffd77aa72ebf84e2565
2014-03-01 09:05:01 +01:00
David van Moolenbroek
6b3f4dc157 Input infrastructure, INPUT server, PCKBD driver
This commit separates the low-level keyboard driver from TTY, putting
it in a separate driver (PCKBD). The commit also separates management
of raw input devices from TTY, and puts it in a separate server
(INPUT). All keyboard and mouse input from hardware is sent by drivers
to the INPUT server, which either sends it to a process that has
opened a raw input device, or otherwise forwards it to TTY for
standard processing.

Design by Dirk Vogt. Prototype by Uli Kastlunger.

Additional changes made to the prototype:

- the event communication is now based on USB HID codes; all input
  drivers have to use USB codes to describe events;
- all TTY keymaps have been converted to USB format, with the effect
  that a single keymap covers all keys; there is no (static) escaped
  keymap anymore;
- further keymap tweaks now allow remapping of literally all keys;
- input device renumbering and protocol rewrite;
- INPUT server rewrite, with added support for cancel and select;
- PCKBD reimplementation, including PC/AT-to-USB translation;
- support for manipulating keyboard LEDs has been added;
- keyboard and mouse multiplexer devices have been added to INPUT,
  primarily so that an X server need only open two devices;
- a new "libinputdriver" library abstracts away protocol details from
  input drivers, and should be used by all future input drivers;
- both INPUT and PCKBD can be restarted;
- TTY is now scheduled by KERNEL, so that it won't be punished for
  running a lot; without this, simply running "yes" on the console
  kills the system;
- the KIOCBELL IOCTL has been moved to /dev/console;
- support for the SCANCODES termios setting has been removed;
- obsolete keymap compression has been removed;
- the obsolete Olivetti M24 keymap has been removed.

Change-Id: I3a672fb8c4fd566734e4b46d3994b4b7fc96d578
2014-03-01 09:04:55 +01:00
David van Moolenbroek
de975579a4 Rename SYSCTL kernel call to DIAGCTL
Change-Id: I1b17373f01808d887dcbeab493838946fbef4ef6
2014-03-01 09:04:54 +01:00
Lionel Sambuc
9fab85c2de Replacing timer_t by netbsd's timer_t
* Renamed struct timer to struct minix_timer
 * Renamed timer_t to minix_timer_t
 * Ensured all the code uses the minix_timer_t typedef
 * Removed ifdef around _BSD_TIMER_T
 * Removed include/timers.h and merged it into include/minix/timers.h
 * Resolved prototype conflict by renaming kernel's (re)set_timer
   to (re)set_kernel_timer.

Change-Id: I56f0f30dfed96e1a0575d92492294cf9a06468a5
2014-03-01 09:04:54 +01:00
David van Moolenbroek
b48542d914 VM: readd support for forgetting cached FS blocks
Not all services involved in block I/O go through VM to access the
blocks they need.  As a result, the blocks in VM may become stale,
possibly causing corruption when the stale copy is restored by a
service that does go through VM later on.  This patch restores support
for forgetting cached blocks that belong to a particular device, and
makes the relevant file systems use this functionality 1) when
requested by VFS through REQ_FLUSH, and 2) upon unmount.

Change-Id: I0758c5ed8fe4b5ba81d432595d2113175776aff8
2014-03-01 09:04:53 +01:00
Lionel Sambuc
cfd3379bb1 Removing CSU patches
* Removed startup code patches in lib/csu regarding kernel to userland
   ABI.

 * Aligned stack layout on NetBSD stack layout.

 * Generate valid stack pointers instead of offsets by taking into account
   _minix_kerninfo->kinfo->user_sp.

 * Refactored stack generation, by moving part of execve in two
   functions {minix_stack_params(), minix_stack_fill()} and using them
   in execve(), rs and vm.

 * Changed load offset of rtld (ld.so) to:
      execi.args.stack_high - execi.args.stack_size - 0xa00000
   which is 10MB below the main executable stack.

Change-Id: I839daf3de43321cded44105634102d419cb36cec
2014-02-18 11:25:02 +01:00
Thomas Veerman
c1a31d53d9 stat.h: remove some big_ types
Change-Id: I84017db3d54edfb823cc52e02d0b07fccb003988
2014-02-18 11:25:01 +01:00
Ben Gras
274fdff60c VM bugfix & regression test
The bug in the offset correction code for the 'shrink region from
below' case can easily case an assert(foundregion->offset == offset)
to trigger (if the blocks are touched afterwards, e.g. on fork())
as the offsets become wrong. This commit is a fix & regression test.

Change-Id: I28ed403e3891362a2dea674a49e786d3450d2983
2014-01-09 18:28:11 +01:00
Kees Jongenburger
84c0f81c09 arm:refactor move pagetable to the common directory.
Change-Id: Ic3178241f86156f729f016017f61017653ddbd23
2013-12-17 14:41:35 +01:00
Ben Gras
740c1a7425 libminixfs: allow non-pagesize-multiple FSes
The memory-mapped files implementation (mmap() etc.) is implemented with
the help of the filesystems using the in-VM FS cache. Filesystems tell it
about all cached blocks and their metadata. Metadata is: device offset and,
if any (and known), inode number and in-inode offset. VM can then map in
requested memory-mapped file blocks, and request them if necessary.

A limitation of this system is that filesystem block sizes that are not
a multiple of the VM system (and VM hardware) page size are not possible;
we can't map blocks in partially. (We can copy, but then the benefits of
mapping and sharing the physical pages is gone.) So until before this
commit various pieces of caching code assumed page size multiple
blocksizes. This isn't strictly necessary as long as mmap() needn't be
supported on that FS.

This change allows the in-FS cache code (libminixfs) to allocate any-sized
blocks, and will not interact with the VM cache for non-pagesize-multiple
blocks. In that case it will also signal requestors, by failing 'peek'
requests, that mmap() should not be supported on this FS. VM and VFS
will then gracefully fail all file-mapping mmap() calls, and exec() will
fall back to copying executable blocks instead of mmap()ping executables.

As a result, 3 diagnostics that signal file-mapped mmap()s failing
(hitherto an unusual occurence) are disabled, as ld.so does file-mapped
mmap()s to map in objects it needs. On FSes not supporting it this situation
is legitimate and shouldn't cause so much noise. ld.so will revert to its own
minix-specific allocate+copy style of starting executables if mmap()s fail.

Change-Id: Iecb1c8090f5e0be28da8f5181bb35084eb18f67b
2013-11-21 10:03:06 +00:00
Kees Jongenburger
73ed75a454 arm:vm allow per memory type flags.
Change-Id: Id5a5bd479bdfbbc3fb52a85c29e1d7712a1171a7
2013-09-26 12:11:28 +02:00
Kees Jongenburger
d77debb5b7 arm:caching access the l1 pages over cacheable memory.
When we start using a new pagetable (for a new process)
the last part is to ensure the pagetable itself can be
accessed by VM. This is done in pt_bind by updating
the "pagetable of pagetables" and we want this mapping
to match other mappings to the l1 pagetable.

Change-Id: I7b506fd75553917fdc1abd25b55e4b2f25ccbf8d
2013-09-26 11:57:44 +02:00
Kees Jongenburger
91004287be arm:caching mark memory as cacheable.
kernel mappings that are not marked as  VMMF_UNCACHED are now mapped
as cacheable.
2013-09-26 11:54:36 +02:00
Kees Jongenburger
0f23130180 arm:caching introduce _CACHED defines
Introduce ARM_VM_SECTION_CACHED and ARM_VM_PTE_CACHED to ensure we
are using the correct caching flags everywhere.
2013-09-26 11:54:36 +02:00
Kees Jongenburger
b82f01ea69 arm:clarify pagetable code.
Make it clear that non RW mapped memory is mapped RO.
2013-09-26 11:45:44 +02:00
Kees Jongenburger
184fe46a79 arm:use correct address mask for sections. 2013-09-26 09:06:05 +02:00
Ben Gras
b5951f9663 vm: enable filemap=1 by default
. turns on mmap() functionality for files by default
	. also causes exec() to use it to map in executables
	  without copying and with sharing those pages with the
	  disk cache and other instances of the executable

Change-Id: Idb94dfe110eed916cf83b12c45e1a77241a2cee5
2013-09-13 12:56:41 +00:00
Ben Gras
b538531449 vm: make WARNS=5 proof
Change-Id: I737ded223daf04f1c0c85a2e8e6b36c8fdcd07db
2013-09-06 11:51:20 +02:00
David van Moolenbroek
8e87bd84b4 Remove VM_VFS_REPLY from VM_BASIC_CALLS
Change-Id: I0a03f1c95fd7ef87cecb01a028f59696a8447738
2013-08-08 23:23:13 +02:00
David van Moolenbroek
78d707cd26 VM: support for shared call mask ACLs
The VM server now manages its call masks such that all user processes
share the same call mask. As a result, an update for the call mask of
any user process will apply to all user processes. This is similar to
the privilege infrastructure employed by the kernel, and may serve as
a template for similar fine-grained restrictions in other servers.

Concretely, this patch fixes the problem of "service edit init" not
applying the given VM call mask to user processes started from RC
scripts during system startup.

In addition, this patch makes RS set a proper VM call mask for each
recovery script it spawns.

Change-Id: I520a30d85a0d3f3502d2b158293a2258825358cf
2013-08-08 23:22:58 +02:00
Antoine Leca
521de2a716 Env_memory_parse: move into VM
Only possible user of that function.
While here, fix bug about NR_MEMS vs MAXMEMMAP
NR_MEMS=16 is currently less than MAXMEMMAP=40; avoid overriding
2013-08-07 16:30:27 +00:00
Antoine Leca
cfc36e5fd3 Drop obsolete <minix/compiler.h> and <minix/crtso.h>
Change-Id: I05da32bd2bdf014b6fd5c39d6e808d3c73812dc0
2013-08-07 16:28:39 +00:00
Antoine Leca
ed53e5e5a4 VM: added missing dependency 2013-08-06 11:46:33 +02:00
Antoine Leca
da82f9b2e8 <a.out.h>, MINIX style: remove as obsolete
Change-Id: Icc8b7210d60a93ac9cc4610d676dcba270756410
2013-08-06 11:43:35 +02:00
Xiaoguang Sun
64f10ee644 Implement getrusage
Implement getrusage.
These fields of struct rusage are not supported and always set to zero at this time
long ru_nswap;           /* swaps */
long ru_inblock;         /* block input operations */
long ru_oublock;         /* block output operations */
long ru_msgsnd;          /* messages sent */
long ru_msgrcv;          /* messages received */
long ru_nvcsw;           /* voluntary context switches */
long ru_nivcsw;          /* involuntary context switches */

test75.c is the unit test for this new function

Change-Id: I3f1eb69de1fce90d087d76773b09021fc6106539
2013-07-01 23:00:47 +02:00
Kees Jongenburger
a2fcba659c arm:remove pre 1:1 mapping workarounds.
Change-Id: I5a690cf5a561cdca9b9c1f031402f80fd203c92d
2013-06-12 16:42:20 +02:00
Ben Gras
d579bb21f4 vm: bugfix: cover vfs_request-in-callback case
Change-Id: I16fccd9938fe8edab83c6c2e327d27d65ff20224
2013-06-05 15:02:56 +00:00
Ben Gras
1cc6f4295d vm: handle disappearing process case
Change-Id: Id96759883e4cdb175c79dcef7ef5ff254612101f
2013-05-31 17:31:55 +00:00
Ben Gras
49b9165251 vm: mmap support
. test74 for mmap functionality
	. vm: add a mem_file memory type that specifies an mmap()ped
	  memory range, backed by a file
	. add fdref, an object that keeps track of FD references within
	  VM per process and so knows how to de-duplicate the use of FD's
	  by various mmap()ped ranges; there can be many more than there can
	  be FD's
	. turned off for now, enable with 'filemap=1' as boot option

Change-Id: I640b1126cdaa522a0560301cf6732b7661555672
2013-05-31 15:42:01 +00:00
Ben Gras
2d2a1a077d panic: declare as printf-style-checked
. and related fixes

Change-Id: I5131ac57dc53d8aec8d421a34c5ceea383404d7a
2013-05-31 13:35:25 +00:00
Ben Gras
4fb3945025 vm: a bit more informative about failed pagefaults
Change-Id: I2b72dfb9291670cb837dfdb279f519892575d4a6
2013-05-29 15:25:45 +00:00
Kees Jongenburger
ffea6706f0 arm:also use 1MB sections for mapping AM335X device memory.
Change-Id: Idc0b285fcbabe8ec4c0be9a600b6a720c0bd3ffc
2013-05-24 15:47:04 +02:00
Kees Jongenburger
1e1ff96aea arm:vm caching fix.
Improve reliability by using write trough cache.
2013-05-16 20:39:20 +02:00
Kees Jongenburger
a3f6529ee2 arm:vm header cleanup. 2013-05-16 20:39:19 +02:00
Kees Jongenburger
e6bac75a8b ARM:Rename ARM_BIG_PAGE to ARM_SECTION.
The natural term to use when talking about MINIX big pages on ARM
is SECTION. A section is a level 1 page table entry pointing to
a 1MB area.

Change-Id: I9bd27ca99bc772126c31c27a537b1415db20c4a6
2013-04-29 11:42:26 +02:00
Ben Gras
0cfff08e56 libexec: mmap support, prealloc variants
In libexec, split the memory allocation method into cleared and
non-cleared. Cleared gives zeroed memory, non-cleared gives 'junk'
memory (that will be overwritten anyway, and so needn't be cleared)
that is faster to get.

Also introduce the 'memmap' method that can be used, if available,
to map code and data from executables into a process using the
third-party mmap() mode.

Change-Id: I26694fd3c21deb8b97e01ed675dfc14719b0672b
2013-04-24 10:18:16 +00:00
Ben Gras
49eb1f4806 vm: new secondary cache code
Primary purpose of change: to support the mmap implementation, VM must
know both (a) about some block metadata for FS cache blocks, i.e.
inode numbers and inode offsets where applicable; and (b) know about
*all* cache blocks, i.e.  also of the FS primary caches and not just
the blocks that spill into the secondary one. This changes the
interface and VM data structures.

This change is only for the interface (libminixfs) and VM data
structures; the filesystem code is unmodified, so although the
secondary cache will be used as normal, blocks will not be annotated
with inode information until the FS is modified to provide this
information. Until it is modified, mmap of files will fail gracefully
on such filesystems.

This is indicated to VFS/VM by returning ENOSYS for REQ_PEEK.

Change-Id: I1d2df6c485e6c5e89eb28d9055076cc02629594e
2013-04-24 10:18:16 +00:00
Ben Gras
7421728360 VM: memtype fix
Memory types in VM are described by methods. Each mapped region has
a type, and all pages instantiated get that type on creation.
Individual page types has to be able to change though. This commit
changes the code to use the memory types of the individual pages,
where appropriate, instead of just the higher-level region, in case
it has changed. This is needed to e.g. support future copy-on-write
MAP_PRIVATE mmap modes.

Change-Id: I5523db14ac036ec774a54392fb67f9acb8725731
2013-04-24 10:18:15 +00:00