Commit graph

57 commits

Author SHA1 Message Date
Jean-Baptiste Boric b1d068470b isofs: reworked for better performance
isofs now uses an in-memory directory listing built on-the-fly instead
of parsing the ISO 9660 data structures over and over for almost every
request. This yields huge performance improvements.

The directory listing is allocated dynamically, but Minix servers aren't
normally supposed to do that because critical servers would crash if the
system runs out of memory. isofs is quite frugal, won't allocate memory
after having the whole directory tree cached and is not that critical
(its most important job is to serve as a root file system during
installation).

The benefits and elegance of this scheme far outweigh this small
problem in practice.

Change-Id: I13d070388c07d274cbee0645cbc50295c447c5b6
2015-10-07 12:40:24 +02:00
David van Moolenbroek b80da2a01d commands: move manpages into command directories
Change-Id: Icf8a2d26629a1822725022c9ee21c587d3c4c3b4
2015-09-28 14:06:06 +00:00
David van Moolenbroek d91f738bd8 Kernel: export clock information on kernel page
Please note that this information is for use by system services only!
The clock facility is not ready to be used directly by userland, and
thus, this kernel page extension is NOT part of the userland ABI.

For service programmers' convenience, change the prototype of
getticks(3) to return the uptime clock value directly, since the call
can no longer fail.

Correct the sys_times(2) reply message to use the right field type
for the boot time.

Restructure the kernel internals a bit so as to have all the clock
stuff closer together.

Change-Id: Ifc050b7bd253aecbe46e3bd7d7cc75bd86e45555
2015-09-23 12:00:46 +00:00
David van Moolenbroek 594df55e53 Abstract away minix_kerninfo access
Instead of importing an external _minix_kerninfo variable, any code
using the shared kernel page should now call get_minix_kerninfo(3).
Since this is the only logical name for such a function, rename the
previous get_minix_kerninfo call to ipc_minix_kerninfo.

Change-Id: I2e424b6fb55aa55d3da850187f1f7a0b7cbbf910
2015-09-21 15:09:04 +00:00
David van Moolenbroek e4d99eb9b0 Basic live rerandomization infrastructure
This commit adds a basic infrastructure to support Address Space
Randomization (ASR).  In a nutshell, using the already imported ASR
LLVM pass, multiple versions can be generated for the same system
service, each with a randomized, different address space layout.
Combined with the magic instrumentation for state transfer, a system
service can be live updated into another ASR-randomized version at
runtime, thus providing live rerandomization.

Since MINIX3 is not yet capable of running LLVM linker passes, the
ASR-randomized service binaries have to be pregenerated during
crosscompilation.  These pregenerated binaries can then be cycled
through at runtime.  This patch provides the basic proof-of-concept
infrastructure for both these parts.

In order to support pregeneration, the clientctl host script has
been extended with a "buildasr" command.  It is to be used after
building the entire system with bitcode and magic support, and will
produce a given number of ASR-randomized versions of all system
services.  These services are placed in /usr/service/asr in the
image that is generated as final step by the "buildasr" command.

In order to support runtime updating, a new update_asr(8) command
has been added to MINIX3.  This command attempts to live-update the
running system services into their next ASR-randomized versions.
For now, this command is not run automatically, and thus must be
invoked manually.

Technical notes:

- For various reasons, magic instrumentation is x86-only for now,
  and ASR functionality is therefore to be used on x86 only as well.
- The ASR-randomized binaries are placed in numbered subdirectories
  so as not to have to change their actual program names, which are
  assumed to be static in various places (system.conf, procfs).
- The root partition is typically too small to contain all the
  produced binaries, which is why we introduce /usr/service.  There
  is a symlink from /service/asr to /usr/service/asr for no other
  reason than to let userland continue to assume that all services
  are reachable through /service.
- The ASR count field (r_asr_count/ASRcount) maintained by RS is not
  used within RS in any way; it is only passed through procfs to
  userland in order to allow update_asr(8) to keep track of which
  version is currently loaded without having to maintain its own state.
- Ideally, pre-instrumentation linking of a service would remove all
  its randomized versions.  Currently, the user is assumed not to
  perform ASR instrumentation and then recompile system services
  without performing ASR instrumentation again, as the randomized
  binaries included in the image would then be stale.  This aspect
  has to be improved later.
- Various other issues are flagged in the comments of the various
  parts of this patch.
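
As an illustration of the numbered-subdirectory scheme and the ASR count
described above, the following is a minimal sketch of how a tool might pick
the next pregenerated version of a service.  It is not the update_asr(8)
implementation: the helper name, the "<base>/<version>/<name>" path layout,
zero-based numbering, and wrap-around after the last version are all
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: given the currently loaded version 'cur' (as a tool
 * might read it from the ASR count) and the number of pregenerated versions,
 * build the path of the next version to cycle to.
 *
 * Assumed layout: <base>/<version>/<name>, with versions numbered from 0.
 * Returns the next version number, or -1 if the path does not fit in 'buf'.
 */
static int
next_asr_path(char *buf, size_t size, const char *base, const char *name,
    unsigned int cur, unsigned int nversions)
{
	unsigned int next;
	int r;

	assert(nversions > 0);

	/* Cycle through the versions, wrapping around after the last one. */
	next = (cur + 1) % nversions;

	r = snprintf(buf, size, "%s/%u/%s", base, next, name);
	if (r < 0 || (size_t)r >= size)
		return -1;	/* output error or truncated path */

	return (int)next;
}
```

Because the program name itself stays unchanged and only the directory
component varies, anything that keys on the service name is unaffected by
the switch, which is the point made in the notes above.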

Change-Id: I093ad57f31c18305591f64b2d491272288aa0937
2015-09-17 17:15:03 +00:00
David van Moolenbroek 129adfeb53 Annotations and tweaks for live update
This change is necessary for instrumentation-aided state transfer.

Change-Id: I24be938009f02e302a15083f9a7a11824975e42b
2015-09-17 17:13:38 +00:00
Lionel Sambuc 8b0f8559ee VM: set recovery policy to restart
- Update proc to select restart policy for VM
- Update testrelpol to test the supported modes of recovery for VM
- Small code cleanups in testrelpol as well.

Change-Id: I6958e100865c2429b9435f3f7cc7d018046378c3
2015-09-17 13:45:43 +00:00
Cristiano Giuffrida 0c474453d1 tests: Expand the reliability test suite.
Change-Id: Ic7f90f2d4edae1f72f98b34bda70891330c27941
2015-09-17 13:37:40 +00:00
Cristiano Giuffrida 3f82ac6a4e services: Selectively enable stateful restart.
Change-Id: Ibf6afa3041013ca714e28b673abb1329cd72d2d5
2015-09-17 13:36:01 +00:00
Cristiano Giuffrida 0e78c0166c Switch to stateful restart.
The following services have been updated to support stateful restarts:
 - Drivers: tty
 - Filesystems: isofs, mfs, pfs, libvtreefs-based file servers
 - System servers: tty, ds, pm, vfs, vm

Change-Id: Ie84baa3ba1774047b3ae519808fe4116928edabb
2015-09-17 13:26:22 +00:00
Cristiano Giuffrida 50b7f13f9f Add live update-friendly annotations.
Change-Id: I7d7d79893836a20799ca548a350f3288e92581f0
2015-09-17 13:25:38 +00:00
David van Moolenbroek fefec20e6b procfs: do not list init in /proc/services
It is not a system service.

Change-Id: Ibfbf08aa52095826c19172e517bcbd292e7944a0
2015-09-07 22:56:19 +00:00
David van Moolenbroek 4472b590c7 libminixfs: rework prefetch API
This patch changes the prefetch API so that file systems must now
provide a set of block numbers, rather than a set of buffers.  The
result is a leaner and more well-defined API; linear computation of
the range of blocks to prefetch; duplicates no longer interfering
with the prefetch process; guaranteed inclusion of the block needed
next into the prefetch range; and, limits and policy decisions better
established by libminixfs now actually being moved into libminixfs.

Change-Id: I7e44daf2d2d164bc5e2f1473ad717f3ff0f0a77f
2015-08-14 18:39:30 +00:00
David van Moolenbroek 6c46a77d95 libminixfs: better support for read errors and EOF
- The lmfs_get_block*(3) API calls may now return an error.  The idea
  is to encourage a next generation of file system services to do a
  better job at dealing with block read errors than the MFS-derived
  implementations do.  These existing file systems have been changed
  to panic immediately upon getting a block read error, in order not
  to let unchecked errors cause corruption.  Note that libbdev already
  retries failing I/O operations a few times first.

- The libminixfs block device I/O module (bio.c) now deals properly
  with end-of-file conditions on block devices.  Since a device or
  partition size may not be a multiple of the root file system's block
  size, support for partial block retrieval has been added, with a new
  internal lmfs_get_partial_block(3) call.  A new test program,
  test85, tests the new handling of EOF conditions when reading,
  writing, and memory-mapping a block device.
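
The end-of-file case above comes down to simple arithmetic: when the device
size is not a multiple of the block size, the last block is only partly
backed by the device.  The sketch below shows that computation in isolation.
The helper name and signature are hypothetical and are not part of the
libminixfs API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return how many bytes of block number 'block' lie
 * within a device of 'dev_size' bytes, for a block size of 'block_size'
 * bytes.  The result is 0 for a block that starts at or past the end of
 * the device.
 */
static size_t
valid_bytes_in_block(uint64_t dev_size, size_t block_size, uint64_t block)
{
	uint64_t start;

	assert(block_size > 0);

	/* Reject far-out block numbers before multiplying (no overflow). */
	if (block > dev_size / block_size)
		return 0;

	start = block * (uint64_t)block_size;
	if (start >= dev_size)
		return 0;				/* entirely past EOF */
	if (dev_size - start < block_size)
		return (size_t)(dev_size - start);	/* partial last block */
	return block_size;				/* full block */
}
```

For example, a 10752-byte device with 4096-byte blocks has two full blocks
followed by a partial block of 2560 bytes; a caller would only transfer
that many bytes for the last block.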

Change-Id: I05e35b6b8851488328a2679da635ebba0c6d08ce
2015-08-14 18:39:26 +00:00
David van Moolenbroek 1311233cfb libminixfs: keep track of block usage
This patch changes the libminixfs API and implementation such that the
library is at all times aware of how many total and used blocks there
are in the file system.  This removes the last upcall of libminixfs
into file systems (fs_blockstats).  In the process, make this part of
the libminixfs API a little prettier and more robust.  Change file
systems accordingly.  Since this change only adds to MFS being unable
to deal with zones and blocks having different sizes, fail to mount
such file systems immediately rather than triggering an assert later.

Change-Id: I078e589c7e1be1fa691cf391bf5dfddd1baf2c86
2015-08-14 18:39:21 +00:00
David van Moolenbroek 0314acfb2d libminixfs: miscellaneous API cleanup
Mostly removal of unused parameters from calls.

Change-Id: I0eb7b568265d1669492d958e78b9e69d7cf6fc05
2015-08-14 18:39:00 +00:00
David van Moolenbroek cb9453ca63 libminixfs: add support for peeking blocks
With this change, the lmfs_get_block*(3) functions allow the caller to
specify that it only wants the block if it is in the cache or the
secondary VM cache.  If the block is not found there, the functions
return NULL.  Previously, the PREFETCH method would be used to this
end instead, which was both abuse in name and less efficient.

Change-Id: Ieb5a15b67fa25d2008a8eeef9d126ac908fc2395
2015-08-13 13:46:50 +00:00
David van Moolenbroek d75faf18d9 libminixfs: add support for memory-mapped holes
When VM asks a file system to provide a block to satisfy a page fault
on a file memory mapping, the file system previously had no way to
inform VM that the block is a hole, since there is no corresponding
block on the underlying device.  To work around this, MFS and ext2
would actually allocate a block for the hole when asked by VM, which
not only defeats the point of holes in the first place, but also does
not work on read-only file systems.  With this patch, a new libminixfs
call allows the file system to inform VM about holes.  This issue does
raise the question as to whether the VM cache is using the right data
structures, since there are now two places where we have to fake a
device offset.  This will have to be revisited in the future.

The patch changes file systems accordingly, and adds a test to test74.

Change-Id: Ib537d56b3f30a8eb05bc1f63c92b5c7428d18f4c
2015-08-13 13:46:48 +00:00
David van Moolenbroek e94f856b38 libminixfs/VM: fix memory-mapped file corruption
This patch employs one solution to resolve two independent but related
issues.  Both issues are the result of one fundamental aspect of the
way VM's memory mapping works: VM uses its cache to map in blocks for
memory-mapped file regions, and for blocks already in the VM cache, VM
does not go to the file system before mapping them in.  To preserve
consistency between the FS and VM caches, VM relies on being informed
about all updates to file contents through the block cache.  The two
issues are both the result of VM not being properly informed about
such updates:

 1. Once a file system provides libminixfs with an inode association
    (inode number + inode offset) for a disk block, this association
    is not broken until a new inode association is provided for it.
    If a block is freed and reallocated as a metadata (non-inode)
    block, its old association is maintained, and may be supplied to
    VM's secondary cache.  Due to reuse of inodes, it is possible
    that the same inode association becomes valid for an actual file
    block again.  In that case, when that new file is memory-mapped,
    under certain circumstances, VM may end up using the metadata
    block to satisfy a page fault on the file, due to the stale inode
    association.  The result is a corrupted memory mapping, with the
    application seeing data other than the current file contents
    mapped in at the file block.

 2. When a hole is created in a file, the underlying block is freed
    from the device, but VM is not informed of this update, and thus,
    if VM's cache contains the block with its previous inode
    association, this block will remain there.  As a result, if an
    application subsequently memory-maps the file, VM will map in the
    old block at the position of the hole, rather than an all-zeroes
    block.  Thus, again, the result is a corrupted memory mapping.

This patch resolves both issues by making the file system inform the
minixfs library about blocks being freed, so that libminixfs can
break the inode association for that block, both in its own cache and
in the VM cache.  Since libminixfs does not know whether VM has the
block in its cache or not, it makes a call to VM for each block being
freed.  Thus, this change introduces more calls to VM, but it solves
the correctness issues at hand; optimizations may be introduced
later.  On the upside, all freed blocks are now marked as clean,
which should result in fewer blocks being written back to the device,
and the blocks are removed from the caches entirely, which should
result in slightly better cache usage.
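
The first issue and its resolution can be modeled with a toy cache.  The
sketch below is not libminixfs or VM code; every name in it is hypothetical.
It only assumes what is described above: an inode association is replaced
only when a new one is supplied, and freeing a block removes it from the
cache together with its association.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS	8
#define TOY_NO_INODE	0	/* assumed marker: no inode association */

/* Toy cache entry: a device block with an optional inode association. */
struct toy_buf {
	int in_use;
	uint64_t dev_block;	/* block number on the device */
	uint64_t ino;		/* associated inode, or TOY_NO_INODE */
	uint64_t ino_off;	/* offset within that inode */
};

static struct toy_buf toy_cache[TOY_SLOTS];

/* Find a cached block by its device block number. */
static struct toy_buf *
toy_find_by_block(uint64_t dev_block)
{
	size_t i;

	for (i = 0; i < TOY_SLOTS; i++)
		if (toy_cache[i].in_use && toy_cache[i].dev_block == dev_block)
			return &toy_cache[i];
	return NULL;
}

/*
 * Cache a device block.  An existing inode association is only replaced
 * when a new one is supplied, so a metadata access (TOY_NO_INODE) leaves
 * an old association in place.
 */
static void
toy_get_block(uint64_t dev_block, uint64_t ino, uint64_t ino_off)
{
	struct toy_buf *bp = toy_find_by_block(dev_block);
	size_t i;

	if (bp == NULL) {
		for (i = 0; i < TOY_SLOTS && toy_cache[i].in_use; i++)
			;
		assert(i < TOY_SLOTS);	/* toy cache: no eviction */
		bp = &toy_cache[i];
		bp->in_use = 1;
		bp->dev_block = dev_block;
		bp->ino = TOY_NO_INODE;
		bp->ino_off = 0;
	}
	if (ino != TOY_NO_INODE) {
		bp->ino = ino;
		bp->ino_off = ino_off;
	}
}

/* Look up a block by inode association, as a page fault on a mapping would. */
static struct toy_buf *
toy_find_by_inode(uint64_t ino, uint64_t ino_off)
{
	size_t i;

	if (ino == TOY_NO_INODE)
		return NULL;
	for (i = 0; i < TOY_SLOTS; i++)
		if (toy_cache[i].in_use && toy_cache[i].ino == ino &&
		    toy_cache[i].ino_off == ino_off)
			return &toy_cache[i];
	return NULL;
}

/* The fix: freeing a block drops it, and its association, from the cache. */
static void
toy_free_block(uint64_t dev_block)
{
	struct toy_buf *bp = toy_find_by_block(dev_block);

	if (bp != NULL) {
		bp->in_use = 0;
		bp->ino = TOY_NO_INODE;
	}
}
```

In this model, caching block 100 for inode 5 and then reusing it as
metadata without calling toy_free_block() leaves a lookup for inode 5 still
returning block 100, which is the stale mapping described in issue 1.
Calling toy_free_block(100) when the block is freed makes that lookup fail,
so the file system has to be consulted again.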

This patch is necessary but not sufficient to resolve the situation
with respect to memory mapping of file holes in general.  Therefore,
this patch extends test 74 with a (rather particular but effective)
test for the first issue, but not yet with a test for the second one.

This fixes #90.

Change-Id: Iad8b134d2f88a884f15d3fc303e463280749c467
2015-08-13 13:46:46 +00:00
David van Moolenbroek da32b6c32e orinoco: retire
This code is MPL-licensed and thus does not belong in the MINIX3
source tree.

Change-Id: I10388b05e90e83b95414cf9b469e50f49bc1db31
2015-07-20 16:55:15 +00:00
Lionel Sambuc 67b4718325 log: announce presence during startup
Set its restart policy to "reset".

Change-Id: I54f350d9d0d9bc571abd9630f27f4c961c7c0778
2015-06-29 10:57:38 +00:00
Cristiano Giuffrida a8f606defa procfs: add service pid information
Change-Id: I163ca4c6c6db45cca41515644ac6c2acd0807ee8
2015-06-29 10:56:53 +00:00
David van Moolenbroek f5321d8d55 procfs: do not list inactive services
Each /proc/service entry must have a unique label.  With cloning,
multiple RS services may have the same label.  Since we are not
actually interested in inactive services (for now), eliminate those
entries, leaving only the active service which will then indeed have
a unique label in the list.  This resolves a procfs crash.

Change-Id: I0de7ef8fd186ab13f3e22e46416504fd981c09aa
2015-06-29 10:56:43 +00:00
David van Moolenbroek 0eabb93c0c procfs: retrieve both RS tables from RS at once
Previously, procfs would retrieve the rproc and rprocpub tables from
RS in two separate calls.  This allowed for a race condition where the
tables could change in between the calls, resulting in a panic in
procfs under certain circumstances.  RS now implements a new method
for getsysinfo that allows the retrieval of both tables at once.

Change-Id: I5ec22d25898361270c90e805a43fc6d76ad9e29d
2015-06-29 10:56:30 +00:00
David van Moolenbroek da21d85025 Add PTYFS, Unix98 pseudo terminal support
This patch adds support for Unix98 pseudo terminals, that is,
posix_openpt(3), grantpt(3), unlockpt(3), /dev/ptmx, and /dev/pts/.
The latter is implemented with a new pseudo file system, PTYFS.

In effect, this patch adds secure support for unprivileged pseudo
terminal allocation, allowing programs such as tmux(1) to be used by
non-root users as well.  Test77 has been extended with new tests, and
no longer needs to run as root.
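
For reference, the calls named above are used together as in the following
minimal sketch.  It uses only the standard POSIX interfaces; the helper name
is made up for the example, and error handling is reduced to returning -1.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600	/* expose posix_openpt(3) and friends */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate a Unix98 pseudo terminal master and copy
 * the name of the corresponding slave device into 'name'.  Returns the
 * master file descriptor, or -1 on failure.
 */
static int
open_pty_master(char *name, size_t size)
{
	const char *slave;
	int fd;

	assert(name != NULL && size > 0);

	/* Open a new master; do not make it our controlling terminal. */
	if ((fd = posix_openpt(O_RDWR | O_NOCTTY)) < 0)
		return -1;

	/* Set up slave ownership and permissions, then unlock the slave. */
	if (grantpt(fd) != 0 || unlockpt(fd) != 0 ||
	    (slave = ptsname(fd)) == NULL || strlen(slave) >= size) {
		close(fd);
		return -1;
	}

	strcpy(name, slave);	/* fits: length was checked above */
	return fd;
}
```

A caller would then open the returned slave name to obtain the terminal
side of the pair.  None of these steps requires root privileges, which is
what allows unprivileged programs to allocate pseudo terminals.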

The new functionality is optional.  To revert to the old behavior,
remove the "ptyfs" entry from /etc/fstab.

Technical notes:

o The reason for not implementing the NetBSD /dev/ptm approach is that
  implementing the corresponding ioctl (TIOCPTMGET) would require
  adding a number of extremely hairy exceptions to VFS, including the
  PTY driver having to create new file descriptors for its own device
  nodes.

o PTYFS is required for Unix98 PTYs in order to avoid that the PTY
  driver has to be aware of old-style PTY naming schemes and even has
  to call chmod(2) on a disk-backed file system.  PTY cannot be its
  own PTYFS since a character driver may currently not also be a file
  system.  However, PTYFS may be subsumed into a DEVFS in the future.

o The Unix98 PTY behavior differs somewhat from NetBSD's, in that
  slave nodes are created on ptyfs only upon the first call to
  grantpt(3).  This approach obviates the need to revoke access as
  part of the grantpt(3) call.

o Shutting down PTY may leave slave nodes on PTYFS, but once PTY is
  restarted, these leftover slave nodes will be removed before they
  create a security risk.  Unmounting PTYFS will make existing PTY
  slaves permanently unavailable, and absence of PTYFS will block
  allocation of new Unix98 PTYs until PTYFS is (re)mounted.

Change-Id: I822b43ba32707c8815fd0f7d5bb7a438f51421c1
2015-06-23 17:43:46 +00:00
David van Moolenbroek 22840dea11 libfsdriver: preinitialize stat.st_ino
The stat.st_ino field must always be filled with the inode number
given as part of the fdr_stat request anyway, so libfsdriver can
simply fill in the number and allow the file system not to bother.

Change-Id: Ia7a849d0b23dfc83010df0d48fa26e4225427694
2015-06-23 14:38:04 +00:00
David van Moolenbroek af4345b097 isofs: do not link against libc
This change requires a small patch to libc, in order to avoid that
libminc has to pull in a large chunk of libc just for mktime(3).

Change-Id: I48e598b3716eff626cac461f78a41e32334e6b28
2015-06-07 17:01:45 +00:00
David van Moolenbroek dfc3261535 PFS, inet: use static UID to drop privileges
Previously, services would obtain the user ID of "service" through
getpwnam(3).  While this approach is conceptually better, it also
imposes linking against libc which in turn causes problems with
printf(3), which already led to PFS no longer dropping privileges at
all.  For now, we hardcode SERVICE_UID and use that instead.

In the future, two changes should allow removal of SERVICE_UID again:
- "service edit" should cause RS to request that a service (such as
  PFS) drop privileges through SEF, using the user ID resolved by
  service(8), or something similar;
- a future devfs should make it possible for inet to start without
  root privileges altogether.

Change-Id: Ie02a1e888cde325806fc0ae76909943ac42c9b96
2015-06-06 21:42:48 +00:00
David van Moolenbroek 75e18fe498 Add 3c90x: 3Com 3C90xB/C network driver
Change-Id: Iba0bbcb3b1b69a7c204abdc81cf3afe59b6bfaae
2015-02-10 13:47:28 +00:00
Lionel Sambuc 41ba8c04cc Restart policies: Add testing and ProcFS DB
- Expose in procfs the service status and supported recovery policies.
- This adds a test (testrelpol.sh) to exercise the restart policies of
  the system services and drivers.

NOTE:
  The policy support information is temporarily hardcoded in ProcFS, but
  this has to be replaced by properly retrieving this information from
  RS, which should in turn be setup on a per service basis, at
  initialization time.

Change-Id: I0cb1516a450355b38d0c46b1a8b3d9e841a2c029
2014-12-10 23:11:25 +01:00
David van Moolenbroek 31b6611abf procfs: add /proc/service directory
This directory is filled dynamically with regular files, one for each
service that RS knows about, named after its label.  Its contents are
still subject to (heavy) change, but currently expose the service's
endpoint and number of restarts so far.

Change-Id: Ie58c824bcb6382c8da7a714e59fee87329970b4b
2014-11-12 12:13:53 +00:00
David van Moolenbroek f1abbce725 procfs: convert to KNF
Change-Id: Ib4252f199af0f9597745dcd2c11a7f761738671f
2014-11-12 12:13:47 +00:00
David van Moolenbroek 52be5c0afb libvtreefs: API changes/extensions, part 2
- rename start_vtreefs to run_vtreefs, since the function returns upon
  termination these days;
- add get_inode_slots function to retrieve the number of indexed slots;
- add support for extra per-inode data for arbitrary storage.

Change-Id: If2d365d7b478a1cecc9e20fb2b3e70c1a1cf7243
2014-11-12 12:13:43 +00:00
David van Moolenbroek 5eefd0fec2 libvtreefs: API changes/extensions, part 1
- move primary I/O buffer into vtreefs; change read hook API;
- add hooks for write, truncate, symlink, mknod, unlink, chmod/chown;
- modernize message_hook;
- change procfs, devman, gpio accordingly.

Change-Id: I9f0669e41195efa3253032e95d93f0a78e9d68d6
2014-11-12 12:13:38 +00:00
Emmanuel Blot f92baba71c Fix bad cast from u16_t to ssize_t
2014-11-12 12:13:28 +00:00
David van Moolenbroek 7ee000e54a procfs: compile in x86 support only for x86 target
Issue reported by Emmanuel Blot.

Change-Id: I7f5b1b65273e6ac841d5451e0be7b0e1c92d537c
2014-11-12 12:13:23 +00:00
David van Moolenbroek 92601f58cb ext2: perform super I/O with contiguous memory
Issue reported by Antoine Leca.

Change-Id: Ie6f3ab6c1943b0b7ea9d5a68d4c24b92bab17233
2014-11-11 21:43:55 +00:00
Ben Gras f53651de01 VM,MFS: better handling of some exceptional cases
Fix for problems reported by Alejandro Hernández:
	. VM unmap: handle case where there is no nextvr

Fixes for problems found by running Melkor ELF fuzzing tool:
	. VM: better handle case where region prealloc fails by
	  freeing memory that was allocated so far
	. MFS fs_readwrite: EOF check should happen for read and
	  peek requests, not just read

This fixes #4.

Change-Id: I2adf4eebdfb4c48a297beff0478eed5c917a53a4
2014-11-10 17:51:57 +01:00
Lionel Sambuc 9e77ef5013 Enhancing /proc/pci
- Adding missing fields for PCI device lookup
- Adding the domain (for now set to zero) as part of the slot name

Change-Id: Iebaf3b21f6ab5024738cbc1dea66d5ad3ada175d
2014-11-10 14:43:27 +01:00
David van Moolenbroek c2f99d7c3a isofs: rename source directory to "isofs"
Change-Id: Ibe630f720b4399e7ebbbd850650036fbaa9cec7b
2014-09-18 13:00:57 +00:00
David van Moolenbroek edfcb02885 isofs: basic improvements
- fix for "out of extents" panic;
- return ENOENT when a file name does not exist;
- inode count sanity check upon unmount.

Change-Id: Icb97dbaf7c8aec463438f06b341defca357094b2
2014-09-18 13:00:52 +00:00
David van Moolenbroek e2dc2c8954 isofs: use libfsdriver
Change-Id: I5ced800eec92f651f31d9c77c3129fe837ca4614
2014-09-18 13:00:47 +00:00
Jean-Baptiste Boric 3e08d38e8e iso9660fs: rewrite ISO 9660 file system server
iso9660fs has been cleaned up and debugged. It now supports:
 * ISO 9660 Level 3,
 * System Use Sharing Protocol (SUSP),
 * Rock Ridge Interchange Protocol (RRIP).

The following Rock Ridge features are supported:
 * POSIX file attributes (PX),
 * POSIX device number (PN),
 * Symbolic links (SL),
 * Alternate file name (NM),
 * Timestamps in 7-byte format (TF).

Change-Id: Ib227411bdda5bc10a957b27ad05fafdc95eca35f
2014-09-18 13:00:42 +00:00
David van Moolenbroek 1858c65d72 Revert "Temporarily disable the is9600 FS server"
This reverts commit ab5c98ee5a.
2014-09-18 12:59:18 +00:00
David van Moolenbroek 30d9b70391 PFS: rewrite, restyle
- remove the buffer pool, inode bitmap, and inode hash table, and
  simplify the code accordingly;
- use theoretically slightly more optimal buffer management;
- put the entire source in one file, instead of having many files
  with one or two functions each;
- convert the code to KNF style.

Change-Id: Ib8f6f0bd99fbc6eb9098fba718e71b8e560783d9
2014-09-18 12:46:28 +00:00
David van Moolenbroek f859061eaf PFS: use libfsdriver
In order to avoid creating libfsdriver exceptions, two changes to VFS
are necessary:

- the returned position field for reads/writes is no longer abused to
  return the new pipe size; VFS is perfectly capable of updating the
  size itself;
- during system startup, PFS is now sent a mount request, just like all
  other file systems.

In proper "two steps forward, one step back" fashion, the latter point
has the consequence that PFS can no longer drop its privileges at
startup.  This is probably best resolved with a more general solution
for all boot image system services.  The upside is that PFS no longer
needs to be linked with libc.

Change-Id: I92e2410cdb0d93d0e6107bae10bc08efc2dbb8b3
2014-09-18 12:46:28 +00:00
David van Moolenbroek 970d95ecd5 ext2: use libfsdriver
- fix panic on truncating files with holes;
- remove block-based readahead, to match MFS.

Change-Id: I385552f8019e9c013a6cb937bcc8e4e7181a4a50
2014-09-18 12:46:27 +00:00
David van Moolenbroek ccaeedb267 MFS: use libfsdriver
Change-Id: Ib658c7dea47b81a417755b0554a75288117b431a
2014-09-18 12:46:27 +00:00
David van Moolenbroek ad80a203db Move clock_time into libsys
Change-Id: Ibc5034617e6f6581de7c4a166ca075b3c357fa82
2014-09-18 12:46:26 +00:00
David van Moolenbroek 0dc5c83ec2 libvtreefs: use libfsdriver
Change-Id: I0e6446bd0ccc3b89edc237be441ebfd92585f352
2014-09-18 12:46:26 +00:00