Not all services involved in block I/O go through VM to access the
blocks they need. As a result, the blocks in VM may become stale,
possibly causing corruption when the stale copy is restored by a
service that does go through VM later on. This patch restores support
for forgetting cached blocks that belong to a particular device, and
makes the relevant file systems use this functionality 1) when
requested by VFS through REQ_FLUSH, and 2) upon unmount.
Change-Id: I0758c5ed8fe4b5ba81d432595d2113175776aff8
The memory-mapped files implementation (mmap() etc.) is implemented with
the help of the filesystems using the in-VM FS cache. Filesystems tell it
about all cached blocks and their metadata. Metadata is: device offset and,
if any (and known), inode number and in-inode offset. VM can then map in
requested memory-mapped file blocks, and request them if necessary.
A limitation of this system is that filesystem block sizes that are not
a multiple of the VM system (and VM hardware) page size are not possible;
we can't map blocks in partially. (We can copy, but then the benefits of
mapping and sharing the physical pages is gone.) So until before this
commit various pieces of caching code assumed page size multiple
blocksizes. This isn't strictly necessary as long as mmap() needn't be
supported on that FS.
This change allows the in-FS cache code (libminixfs) to allocate any-sized
blocks, and will not interact with the VM cache for non-pagesize-multiple
blocks. In that case it will also signal requestors, by failing 'peek'
requests, that mmap() should not be supported on this FS. VM and VFS
will then gracefully fail all file-mapping mmap() calls, and exec() will
fall back to copying executable blocks instead of mmap()ping executables.
As a result, 3 diagnostics that signal file-mapped mmap()s failing
(hitherto an unusual occurence) are disabled, as ld.so does file-mapped
mmap()s to map in objects it needs. On FSes not supporting it this situation
is legitimate and shouldn't cause so much noise. ld.so will revert to its own
minix-specific allocate+copy style of starting executables if mmap()s fail.
Change-Id: Iecb1c8090f5e0be28da8f5181bb35084eb18f67b
. initial workaround for assert() firing on iovec
size on ARM. likely due to alloc_contig() allocating
unusually mapped memory in STATICINIT.
. for the same reason use the regular cache i/o functions
to read the superblock in mfs - avoid the alloc_contig()
that STATICINIT does.
Change-Id: I3d8dc635b1cf2666e55b0393feae74cc25b8fed4
. this is OK (although it wastes some memory) as
long as the VM interface isn't used, which has its
own checks in libsys
Change-Id: I28decd367b2cd5c01482bdc71615c65ab61c9a71
libminixfs may now be informed of changes to the block usage on the
filesystem. if the net change becomes big enough, libminixfs may
resize the cache based on the new usage.
. update the 2 FSes to provide this information to libminixfs
Change-Id: I158815a11da801fd5572a8de89c9e6c039b82650
This commit introduces a new request type called REQ_BPEEK. It
requests minor device blocks from the FS. Analogously to REQ_PEEK,
it requests the filesystem to get the requested blocks into its
cache, without actually copying the result anywhere.
Change-Id: If1d06645b0e17553a64b3167091e9d12efeb3d6f
Primary purpose of change: to support the mmap implementation, VM must
know both (a) about some block metadata for FS cache blocks, i.e.
inode numbers and inode offsets where applicable; and (b) know about
*all* cache blocks, i.e. also of the FS primary caches and not just
the blocks that spill into the secondary one. This changes the
interface and VM data structures.
This change is only for the interface (libminixfs) and VM data
structures; the filesystem code is unmodified, so although the
secondary cache will be used as normal, blocks will not be annotated
with inode information until the FS is modified to provide this
information. Until it is modified, mmap of files will fail gracefully
on such filesystems.
This is indicated to VFS/VM by returning ENOSYS for REQ_PEEK.
Change-Id: I1d2df6c485e6c5e89eb28d9055076cc02629594e
This commit removes the secondary cache code implementation from
VM and its usage from libminixfs. It is to be replaced by a new
implementation.
Change-Id: I8fa3af06330e7604c7e0dd4cbe39d3ce353a05b1
. test70: regression test for m_out vfs race condition
The following tests use testcache.c to generate test i/o
patterns, generate random write data and verify the reads.
. test71: blackbox full-stack test of FS operation, testing
using the regular VFS interface crazy i/o patterns
with various working set sizes, triggering only
primary cache, also secondary cache, and finally
disk i/o and verifying contents all the time
. test72: unit test of libminixfs, implementing
functions it needs from -lsys and -lblockdriver
and the client in order to simulate a working
cache client and backend environment.
. test73: blackbox test of secondary vm cache in isolation
Change-Id: I1287e9753182b8719e634917ad158e3c1e079ceb
Remove old versions of system calls and system calls that don't have
a libc api interface anymore (dup, dup2, creat).
VFS still contains support for old system call numbers for the new stat
system calls (i.e., 65, 66, 67) to keep supporting old binaries built for
MINIX 3.2.1 (prior to the release).
Change-Id: I721779b58a50c7eeae20669de24658d55d69b25b
. 'anonymous' cache blocks (retrieved with NO_DEV as dev
parameter) were used to implement read()s from holes in
inodes that should return zeroes
. this is an awkward special case in the cache code though
and there's a more direct way to implement the same functionality:
instead of copying from a new, anonymous, zero block, to
the user target buffer, simply sys_safememset the user target
buffer directly. as this was the only use of this feature,
this is all that's needed to simplify the cache code a little.
Add primary cache management feature to libminixfs as mfs and ext2
currently do separately, remove cache code from mfs and ext2, and make
them use the libminixfs interface. This makes all fields of the buf
struct private to libminixfs and FS clients aren't supposed to access
them at all. Only the opaque 'void *data' field (the FS block contents,
used to be called bp) is to be accessed by the FS client.
The main purpose is to implement the interface to the 2ndary vm cache
just once, get rid of some code duplication, and add a little
abstraction to reduce the code inertia of the whole caching business.
Some minor sanity checking and prohibition done by mfs in this code
as removed from the generic primary cache code as a result:
- checking all inodes are not in use when allocating/resizing
the cache
- checking readonly filesystems aren't written to
- checking the superblock isn't written to on mounted filesystems
The minixfslib code relies on fs_blockstats() in the client filesystem to
return some FS usage information.
. all invocations were S or D, so can safely be dropped
to prepare for the segmentless world
. still assign D to the SCP_SEG field in the message
to make previous kernels usable
This patch fixes most of current reasons to generate compiler warnings.
The changes consist of:
- adding missing casts
- hiding or unhiding function declarations
- including headers where missing
- add __UNCONST when assigning a const char * to a char *
- adding missing return statements
- changing some types from unsigned to signed, as the code seems to want
signed ints
- converting old-style function definitions to current style (i.e.,
void func(param1, param2) short param1, param2; {...} to
void func (short param1, short param2) {...})
- making the compiler silent about signed vs unsigned comparisons. We
have too many of those in the new libc to fix.
A number of bugs in the test set were fixed. These bugs were never
triggered with our old libc. Consequently, these tests are now forced to
link with the new libc or they will generate errors (in particular tests 43
and 55).
Most changes in NetBSD libc are limited to moving aroudn "#ifndef __minix"
or stuff related to Minix-specific things (code in sys-minix or gen/minix).
. move cache size heuristic from mfs there
so mfs and ext2 can share it
. add vfs credentials retrieving function, with
backwards compatability from previous struct
format, to be used by both ext2 and mfs
. fix for ext2 - STATICINIT was fed no.
of bytes instead of no. of elements, overallocating
memory by a megabyte or two for the superblock