This patch adds support for the wait4 system call, and with that the
wait3 call as well. The implementation is absolutely minimal: only
user and system times of the exited child are returned (with all other
rusage fields left zero), and there is no support for tracers. Still,
this should cover the main use cases of wait4.
Change-Id: I7a04589a8423a23990ab39aa38e85d535556743a
- the userland call is now made to PM only, and PM relays the call to
other servers as appropriate; this is an ABI change that will
ultimately allow us to add proper support for wait3() and the like;
for the moment there is backward compatibility;
- the getrusage-specific kernel subcall has been removed, as it
provided only redundant functionality, and did not provide the means
to be extended correctly in the future - namely, allowing the kernel
to return different values depending on whether resource usage of
the caller (self) or its children was requested;
- VM is now told whether resource usage of the caller (self) or its
children is requested, and it refrains from filling in wrong values
for information it does not have;
- VM now uses the correct unit for the ru_maxrss values;
- VFS is cut out of the loop entirely, since it does not provide any
values at the moment; a comment explains how it should be readded.
Change-Id: I27b0f488437dec3d8e784721c67b03f2f853120f
The current values were both inaccurate (especially for dynamically
linked executables) and using the wrong unit (bytes, instead of
kilobytes times ticks-of-execution). For now we are better off not
populating these fields at all.
Change-Id: I195a8fa8db909e64a833eec25f59c9ee0b89bdc5
Please note that this information is for use by system services only!
The clock facility is not ready to be used directly by userland, and
thus, this kernel page extension is NOT part of the userland ABI.
For service programmers' convenience, change the prototype of the
getticks(3) to return the uptime clock value directly, since the call
can no longer fail.
Correct the sys_times(2) reply message to use the right field type
for the boot time.
Restructure the kernel internals a bit so as to have all the clock
stuff closer together.
Change-Id: Ifc050b7bd253aecbe46e3bd7d7cc75bd86e45555
- test multicomponent live update with and without rs and/or vm;
- retry the update a few times if the failure code suggests it might
be a transient failure.
Change-Id: I5fce256bb418be257353ed21428f672d851d974d
- Update proc to select restart policy for VM
- Update testrelpol to test the supported modes of recovery for VM
- Small code cleanups in testrelpol as well.
Change-Id: I6958e100865c2429b9435f3f7cc7d018046378c3
If arguments are provided, the services list to test is set from those,
instead of initializing it with every currently running service.
If such arguments are present, also skip LiveUpdate tests.
Change-Id: I14f874666a610072a5ff4a60516e59cf04dc9e31
For dynamically linked executables, the interpreter is passed a
file descriptor of the binary being executed. To this end, VFS
opens the target executable, but opening the file fails if it is
not readable, even when it is executable. With this patch, when
opening the executable, it verifies the X bit rather than the R
bit on the file, thus allowing the execution of dynamically
linked binaries that are executable but not readable.
Add test86 to verify correctness.
Change-Id: If3514add6a33b33d52c05a0a627d757bff118d77
- The lmfs_get_block*(3) API calls may now return an error. The idea
is to encourage a next generation of file system services to do a
better job at dealing with block read errors than the MFS-derived
implementations do. These existing file systems have been changed
to panic immediately upon getting a block read error, in order to
let unchecked errors cause corruption. Note that libbdev already
retries failing I/O operations a few times first.
- The libminixfs block device I/O module (bio.c) now deals properly
with end-of-file conditions on block devices. Since a device or
partition size may not be a multiple of the root file system's block
size, support for partial block retrival has been added, with a new
internal lmfs_get_partial_block(3) call. A new test program,
test85, tests the new handling of EOF conditions when reading,
writing, and memory-mapping a block device.
Change-Id: I05e35b6b8851488328a2679da635ebba0c6d08ce
This patch changes the libminixfs API and implementation such that the
library is at all times aware of how many total and used blocks there
are in the file system. This removes the last upcall of libminixfs
into file systems (fs_blockstats). In the process, make this part of
the libminixfs API a little prettier and more robust. Change file
systems accordingly. Since this change only adds to MFS being unable
to deal with zones and blocks having different sizes, fail to mount
such file systems immediately rather than triggering an assert later.
Change-Id: I078e589c7e1be1fa691cf391bf5dfddd1baf2c86
When VM asks a file system to provide a block to satisfy a page fault
on a file memory mapping, the file system previously had no way to
inform VM that the block is a hole, since there is no corresponding
block on the underlying device. To work around this, MFS and ext2
would actually allocate a block for the hole when asked by VM, which
not only defeats the point of holes in the first place, but also does
not work on read-only file systems. With this patch, a new libminixfs
call allows the file system to inform VM about holes. This issue does
raise the question as to whether the VM cache is using the right data
structures, since there are now two places where we have to fake a
device offset. This will have to be revisited in the future.
The patch changes file systems accordingly, and adds a test to test74.
Change-Id: Ib537d56b3f30a8eb05bc1f63c92b5c7428d18f4c
This patch employs one solution to resolve two independent but related
issues. Both issues are the result of one fundamental aspect of the
way VM's memory mapping works: VM uses its cache to map in blocks for
memory-mapped file regions, and for blocks already in the VM cache, VM
does not go to the file system before mapping them in. To preserve
consistency between the FS and VM caches, VM relies on being informed
about all updates to file contents through the block cache. The two
issues are both the result of VM not being properly informed about
such updates:
1. Once a file system provides libminixfs with an inode association
(inode number + inode offset) for a disk block, this association
is not broken until a new inode association is provided for it.
If a block is freed and reallocated as a metadata (non-inode)
block, its old association is maintained, and may be supplied to
VM's secondary cache. Due to reuse of inodes, it is possible
that the same inode association becomes valid for an actual file
block again. In that case, when that new file is memory-mapped,
under certain circumstances, VM may end up using the metadata
block to satisfy a page fault on the file, due to the stale inode
association. The result is a corrupted memory mapping, with the
application seeing data other than the current file contents
mapped in at the file block.
2. When a hole is created in a file, the underlying block is freed
from the device, but VM is not informed of this update, and thus,
if VM's cache contains the block with its previous inode
association, this block will remain there. As a result, if an
application subsequently memory-maps the file, VM will map in the
old block at the position of the hole, rather than an all-zeroes
block. Thus, again, the result is a corrupted memory mapping.
This patch resolves both issues by making the file system inform the
minixfs library about blocks being freed, so that libminixfs can
break the inode association for that block, both in its own cache and
in the VM cache. Since libminixfs does not know whether VM has the
block in its cache or not, it makes a call to VM for each block being
freed. Thus, this change introduces more calls to VM, but it solves
the correctness issues at hand; optimizations may be introduced
later. On the upside, all freed blocks are now marked as clean,
which should result in fewer blocks being written back to the device,
and the blocks are removed from the caches entirely, which should
result in slightly better cache usage.
This patch is necessary but not sufficient to resolve the situation
with respect to memory mapping of file holes in general. Therefore,
this patch extends test 74 with a (rather particular but effective)
test for the first issue, but not yet with a test for the second one.
This fixes#90.
Change-Id: Iad8b134d2f88a884f15d3fc303e463280749c467
This test connects to a remote HTTP server to retrieve files, using various
chunk sizes and concurrency settings to exercise the network stack. The test
is only performed is USENETWORK=yes. This test requires the following URLs to
remain available: http://test82.minix3.org/test1.txt and
http://test82.minix3.org/test2.bin. The former contains a 'Hello world'
message followed by a newline, the latter all 16-bit values in increasing
order, using big-endian notation.
Change-Id: I696106482fb1658f9657be2b6845a1b37a3d6172
These new tests are largely based on the code from test 56 (UDS). Common code
is moved into a separate file common-socket.c. In some instances the tests
are too strict for TCP/UDP sockets, which may not always react instantly to
whatever happens on the other side (even locally). For these cases, the
ignore_* fields in struct socket_test_info indicate that there needs to be
an exception. There are also tests where it seems the functionality of inet
is either incorrect or incomplete with regard to the POSIX standard. In these
cases, the bug_* fields are used to document the issues while avoiding
failure of the test.
Change-Id: Ia860deb4559d42608790451936b1aade866faebc
This patch introduces USENETWORK environment variable to determine whether to
use the network or not, instead of the unreliable ping test; set to 'yes' to
enable network usage.
Change-Id: I9e26fa95b5b990fd94f5978db8de0dd73496d314
This patch adds support for Unix98 pseudo terminals, that is,
posix_openpt(3), grantpt(3), unlockpt(3), /dev/ptmx, and /dev/pts/.
The latter is implemented with a new pseudo file system, PTYFS.
In effect, this patch adds secure support for unprivileged pseudo
terminal allocation, allowing programs such as tmux(1) to be used by
non-root users as well. Test77 has been extended with new tests, and
no longer needs to run as root.
The new functionality is optional. To revert to the old behavior,
remove the "ptyfs" entry from /etc/fstab.
Technical nodes:
o The reason for not implementing the NetBSD /dev/ptm approach is that
implementing the corresponding ioctl (TIOCPTMGET) would require
adding a number of extremely hairy exceptions to VFS, including the
PTY driver having to create new file descriptors for its own device
nodes.
o PTYFS is required for Unix98 PTYs in order to avoid that the PTY
driver has to be aware of old-style PTY naming schemes and even has
to call chmod(2) on a disk-backed file system. PTY cannot be its
own PTYFS since a character driver may currently not also be a file
system. However, PTYFS may be subsumed into a DEVFS in the future.
o The Unix98 PTY behavior differs somewhat from NetBSD's, in that
slave nodes are created on ptyfs only upon the first call to
grantpt(3). This approach obviates the need to revoke access as
part of the grantpt(3) call.
o Shutting down PTY may leave slave nodes on PTYFS, but once PTY is
restarted, these leftover slave nodes will be removed before they
create a security risk. Unmounting PTYFS will make existing PTY
slaves permanently unavailable, and absence of PTYFS will block
allocation of new Unix98 PTYs until PTYFS is (re)mounted.
Change-Id: I822b43ba32707c8815fd0f7d5bb7a438f51421c1
This patch fixes two related issues:
- If a large (>PIPE_BUF) pipe write is processed partially, only to be
followed by a write error condition, then the process is left in an
incorrect state, possibly causing VFS to crash on a subsequent call.
- If such a partially processed large pipe write ends up resulting in
an EPIPE error, no corresponding SIGPIPE signal is generated.
The corrected behavior is tested in test68.
Change-Id: I5540e61ab6bcc60a31201485eda04bc49ece2ca8
The original one-shot page patch (git-e321f65) did not account for the
possibility of pagefaults happening while copying memory in the
kernel. This allowed a simple cp(1) from vbfs to hang the system,
since VM was repeatedly requesting the same page from the file system.
With this fix, VM no longer tries to fetch the same memory-mapped page
from VFS more than once per memory handling request from the kernel.
In addition to fixing the original issue, this change should make
handling memory somewhat more robust and ever-so-slightly faster.
Test74 has been extended with a simple test for this case.
Change-Id: I6e565f3750141e51b52ec98c938f8e1aa40070d0
- Expose in procfs the service status and supported recovery policies.
- This adds a test (testrelpol.sh) to exercise the restart policies of
the system services and drivers.
NOTE:
The policy support information is temporarily hardcoded in ProcFS, but
this has to be replaced by properly retrieving this information from
RS, which should in turn be setup on a per service basis, at
initialization time.
Change-Id: I0cb1516a450355b38d0c46b1a8b3d9e841a2c029
This patch adds (very limited) support for memory-mapping pages on
file systems that are mounted on the special "none" device and that
do not implement PEEK support by themselves. This includes hgfs,
vbfs, and procfs.
The solution is implemented in libvtreefs, and consists of allocating
pages, filling them with content by calling the file system's READ
functionality, passing the pages to VM, and freeing them again. A new
VM flag is used to indicate that these pages should be mapped in only
once, and thus not cached beyond their single use. This prevents
stale data from getting mapped in without the involvement of the file
system, which would be problematic on file systems where file contents
may become outdated at any time. No VM caching means no sharing and
poor performance, but mmap no longer fails on these file systems.
Compared to a libc-based approach, this patch retains the on-demand
nature of mmap. Especially tail(1) is known to map in a large file
area only to use a small portion of it.
All file systems now need to be given permission for the SETCACHEPAGE
and CLEARCACHE calls to VM.
A very basic regression test is added to test74.
Change-Id: I17afc4cb97315b515cad1542521b98f293b6b559
The test would sometimes fail because an alarm triggered before the
system call to be interrupted by the alarm could be started.
Change-Id: Ia507720a1f2d259afde1f97b7edd03f22cbd4810
The new functionality aims to save each file system server from having
to implement its own block I/O routines just so that it can serve as a
root file system. The new source file (bio.c) lists the requirements
that file system servers have to fulfill in order to use the routines.
Change-Id: Ia0190fd5c30e8c2097ed8f4b0e3ccde1827e0b92
The file system may not be expecting these upcalls at arbitrary
moments, while they serve only as a performance optimization anyway.
Change-Id: I0748fd1f6c2645ddbb64466093ee36025aac45e0
The minixfs library only ever submits vector elements (and reads) of
the system page size. The test implementation was expecting vector
elements (and reads) of the file system block size. The resulting
mismatch caused I/O to fail in various ways, even though this did not
have an effect on the actual test.
Change-Id: I02f4a3efcd4a32916435d82c7d5798e6b78f0a27
Updating the current block size before flushing the cache, which still
contained blocks with the old block size, resulted in triggering an
assert on position alignment.
Change-Id: I7a83f3d3bc57bafc08aa6c8df64fbf978273bbfd
Known limitations:
- comment for now testisofs, as iso9660fs is known to be broken.
Benefits:
- near 3x speed improvement on C++ code compilation, bringing down
make build to from 44min down to 21min.
- Allows for X applications to work properly, which should be available
in near-term future through pkgsrc for 3.3.0.
Change-Id: I8f4179a7ea925ed381642add32cfd8c5822217e4