Commit graph

382 commits

Author SHA1 Message Date
Thomas Veerman
bdfef53dbf VFS: initialize variables 2013-01-11 12:46:44 +00:00
Thomas Veerman
aa521228a5 VFS: Coverity appeasements 2013-01-11 09:42:01 +00:00
Thomas Veerman
ea8ff9284a Add stack trace dumps for VFS over serial 2013-01-11 09:18:36 +00:00
Thomas Veerman
625f4ae4a3 VFS: add documentation about internal working 2013-01-11 09:18:36 +00:00
Thomas Veerman
23c5f56e32 VFS: change locking to ease concurrent FSes
This patch uses stricter locking for REQ_LINK, REQ_MKDIR, REQ_MKNOD,
REQ_RENAME, REQ_RMDIR, REQ_SLINK and REQ_UNLINK. For all requests, VFS
locks the directory in which we add or remove an inode with VNODE_WRITE.
I.e., the operations have exclusive access to that directory.

Furthermore, REQ_CHOWN, REQ_CHMOD, and REQ_FTRUNC now lock the vmnt
VMNT_READ; VMNT_WRITE was unnecessary.
2013-01-11 09:18:35 +00:00
Thomas Veerman
3de8d1cf6e VFS/PFS: remove notion of position in pipes
Because pipes have no file position. VFS maintained (file) offsets into a
buffer internal to PFS and stored them in vnodes for simplicity, mixing
the responsibilities of filp and vnode objects.

With this patch PFS ignores the position field in REQ_READ and REQ_WRITE
requests making VFS' job a lot simpler.
2013-01-11 09:18:35 +00:00
Thomas Veerman
7c8b3ddfed VFS: fix locking bugs
.sync and fsync used unnecessarily restrictive locking type
.fsync violated locking order by obtaining a vmnt lock after a filp lock
.fsync contained a TOCTOU bug
.new_node violated locking rules (didn't upgrade lock upon file creation)
.do_pipe used unnecessarily restrictive locking type
.always lock pipes exclusively; even a read operation might require to do
 a write on a vnode object (update pipe size)
.when opening a file with O_TRUNC, upgrade vnode lock when truncating
.utime used unnecessarily restrictive locking type
.path parsing:
  .always acquire VMNT_WRITE or VMNT_EXCL on vmnt and downgrade to
   VMNT_READ if that was what was actually requested. This prevents the
   following deadlock scenario:
   thread A:
     lock_vmnt(vmp, TLL_READSER);
     lock_vnode(vp, TLL_READSER);
     upgrade_vmnt_lock(vmp, TLL_WRITE);

   thread B:
     lock_vmnt(vmp, TLL_READ);
     lock_vnode(vp, TLL_READSER);

   thread A will be stuck in upgrade_vmnt_lock and thread B is stuck in
   lock_vnode. This happens when, for example, thread A tries create a
   new node (open.c:new_node) and thread B tries to do eat_path to
   change dir (stadir.c:do_chdir). When the path is being resolved, a
   vnode is always locked with VNODE_OPCL (TLL_READSER) and then
   downgraded to VNODE_READ if read-only is actually requested. Thread
   A locks the vmnt with VMNT_WRITE (TLL_READSER) which still allows
   VMNT_READ locks. Thread B can't acquire a lock on the vnode because
   thread A has it; Thread A can't upgrade its vmnt lock to VMNT_WRITE
   (TLL_WRITE) because thread B has a VMNT_READ lock on it.

   By serializing vmnt locks during path parsing, thread B can only
   acquire a lock on vmp when thread A has completely finished its
   operation.
2013-01-11 09:18:35 +00:00
Kees Jongenburger
c0c581a635 vfs:fix for variable 'rfp' set but not used.
mount.c: In function 'mount_pfs':
mount.c:395:17: error: variable 'rfp' set but not used [-Werror=unused-but-set-variable]

Change-Id: I2f22590ab4e3a4a1678e9096626ebca53d2660e6
2013-01-07 09:12:27 +01:00
Ben Gras
8aeac26999 vfs: fix clobbering fd_nr
dumpcore: fd_nr can be in use as blocking fd but will then be clobbered
by common_open, causing disaster for exiting unpause().
2012-12-11 12:00:57 +01:00
David van Moolenbroek
766047123a VFS: fix off-by-one in get_name() 2012-11-30 12:24:47 +00:00
Thomas Veerman
179261a9b6 mtab: support moving mount points
Also fix canonical_path function; it fails to parse some paths
2012-11-29 10:50:51 +00:00
Thomas Veerman
d9f4f71916 Implement dynamic mtab support
With this patch /etc/mtab becomes obsolete.
2012-11-26 15:20:18 +00:00
Thomas Veerman
de83b2a9d9 VFS: change 'last_dir' to match locking assumption
new_node makes the assumption that when it does last_dir on a path, a
successive advance would not yield a lock on a vmnt, because last_dir
already locked the vmnt. This is true except when last_dir resolves
to a directory on the parent vmnt of the file that was the result of
advance. For example,
 # cd /
 # echo foo > home
where home is on a different (sub) partition than / is (default
install). last_dir would resolve to / and advance would resolve to
/home.

With this change, last_dir resolves to the root node on the /home
partition, making the assumption valid again.
2012-11-26 15:20:18 +00:00
David van Moolenbroek
7dd286e6b8 VFS: do not save device node for new regular files
The VFS/FS protocol does not require the file server to supply a
special device node number in response to a REQ_CREATE request, as
this call creates only regular files. Therefore, VFS should not
erroneously save this piece of information from the REQ_CREATE reply
either.
2012-11-15 14:29:59 +00:00
Thomas Veerman
14e470be81 VFS: fix TOCTOU bug in sync 2012-11-14 13:24:53 +00:00
Thomas Veerman
ed23a7a7d2 VFS: fix reboot panic with mounted FUSE FS
Upon reboot VFS semi-exits all processes and unmounts the file system.
However, upon unmount, exiting FUSE file systems might need service from
the file system (due to libc). As the FUSE process is halfway the exit
procedure, it doesn't have a valid root directory and working directory.
Trying to do system calls then triggers a sanity check in VFS.

This fix first exits normal processes which should then allow for
unmounting FUSE file systems. Then VFS exits all processes including
File Servers and unmounts the rest of the file system.
2012-11-14 13:18:16 +00:00
Thomas Veerman
badec36b33 VFS: fix deadlock when out of worker threads
There is a deadlock vulnerability when there are no worker threads
available and all of them blocked on a worker thread that's waiting for a
reply from a driver or a reply from an FS that needs to make a back call. In
these cases the deadlock resolver thread should kick in, but didn't in all
cases. Moreover, POSIX calls from File Servers weren't handled properly
anymore, which also could lead to deadlocks.
2012-11-14 13:12:37 +00:00
Arne Welzel
e35c4f78d2 VFS: fix check_bsf() locking
The check_bsf() macro uses assert(mutex_trylock(&bsf_lock)) and
assumes bsf_lock is locked afterwards. This breaks when compiling
with NOASSERTS="yes". Also: macro to function transition.
2012-09-28 14:57:34 +02:00
Arne Welzel
7e1074732b VFS: resolve unused parameter if NOASSERTS="yes"
If VFS is compiled with NOASSERTS="yes", ctty_opcl() does not
use the op parameter. Change to "non-assert()" sanity check.
2012-09-28 14:57:32 +02:00
Ben Gras
60014efb3e vfs: pm_dumpcore: always clean up process
. whenever this function is called, pm will expect
	  the process to be cleaned up
	. so don't abort the process entirely on error
	. fixes a later 'forking on top of in-use child' vfs panic
2012-09-19 17:13:17 +02:00
Thomas Veerman
c087a60ed2 VFS: fix GCC compilation error 2012-09-17 15:29:38 +00:00
Thomas Veerman
3881e732a9 VFS: panic when unmount_all fails 2012-09-17 11:01:46 +00:00
Thomas Veerman
992799b91f VFS: make all IPC asynchronous
By decoupling synchronous drivers from VFS, we are a big step closer to
supporting driver crashes under all circumstances. That is, VFS can't
become stuck on IPC with a synchronous driver (e.g., INET) and can
recover from crashing block drivers during open/close/ioctl or during
communication with an FS.

In order to maintain serialized communication with a synchronous driver,
the communication is wrapped by a mutex on a per driver basis (not major
numbers as there can be multiple majors with identical endpoints). Majors
that share a driver endpoint point to a single mutex object.

In order to support crashes from block drivers, the file reopen tactic
had to be changed; first reopen files associated with the crashed
driver, then send the new driver endpoint to FSes. This solves a
deadlock between the FS and the block driver;
  - VFS would send REQ_NEW_DRIVER to an FS, but he FS only receives it
    after retrying the current request to the newly started driver.
  - The block driver would refuse the retried request until all files
    had been reopened.
  - VFS would reopen files only after getting a reply from the initial
    REQ_NEW_DRIVER.

When a character special driver crashes, all associated files have to
be marked invalid and closed (or reopened if flagged as such). However,
they can only be closed if a thread holds exclusive access to it. To
obtain exclusive access, the worker thread (which handles the new driver
endpoint event from DS) schedules a new job to garbage collect invalid
files. This way, we can signal the worker thread that was talking to the
crashed driver and will release exclusive access to a file associated
with the crashed driver and prevent the garbage collecting worker thread
from dead locking on that file.

Also, when a character special driver crashes, RS will unmap the driver
and remap it upon restart. During unmapping, associated files are marked
invalid instead of waiting for an endpoint up event from DS, as that
event might come later than new read/write/select requests and thus
cause confusion in the freshly started driver.

When locking a filp, the usage counters are no longer checked. The usage
counter can legally go down to zero during filp invalidation while there
are locks pending.

DS events are handled by a separate worker thread instead of the main
thread as reopening files could lead to another crash and a stuck thread.
An additional worker thread is then necessary to unlock it.

Finally, with everything asynchronous a race condition in do_select
surfaced. A select entry was only marked in use after succesfully sending
initial select requests to drivers and having to wait. When multiple
select() calls were handled there was opportunity that these entries
were overwritten. This had as effect that some select results were
ignored (and select() remained blocking instead if returning) or do_select
tried to access filps that were not present (because thrown away by
secondary select()). This bug manifested itself with sendrecs, but was
very hard to reproduce. However, it became awfully easy to trigger with
asynsends only.
2012-09-17 11:01:45 +00:00
Ben Gras
e4ac80eb60 various warning/errorwarning fixes for gcc47
. warnings (sometimes promoted to errors) in servers/ and kernel/
 . -Os for ext2 boot module to make it small enough
2012-08-27 16:19:18 +02:00
Ben Gras
31d8526346 libexec: add load_offset feature, used for ld.so
. ld.so is linked at 0 but it can relocate itself; we
	  wish to load ld.so higher though to trap NULL dereferences.
	  if we know we have to execute ld.so, vfs tells libexec to put it
	  higher.
2012-08-12 23:22:54 +02:00
Thomas Veerman
66dbf73049 VFS: fix locking bug in clone_opcl
When VFS runs out of vnodes after closing a vnode in opcl, common_open
will try to unlock a vnode through unlock_filp that has already been
unlocked in clone_opcl. By first obtaining and locking a new vnode this
situation is prevented; if there are no free vnodes, common_open will
unlock a still locked vnode.
2012-07-30 10:01:16 +00:00
Thomas Veerman
f6b0d662b5 VFS: check path components for NAME_MAX length 2012-07-30 09:44:58 +00:00
David van Moolenbroek
0b4c154160 VFS: call req_inhibread again 2012-07-19 14:36:51 +00:00
David van Moolenbroek
e0742978f1 VFS: do not resolve symlinks in rename(2) 2012-07-18 14:59:45 +00:00
Thomas Veerman
0d3ccd8908 VFS: fix coverity defects 2012-07-17 10:29:22 +00:00
Thomas Veerman
fd60f03129 VFS: remove support for sync FS communication 2012-07-17 10:12:53 +00:00
Thomas Veerman
06f49fe167 VFS: prevent buffer overflow
If an FS returns faulty struct dirent data, VFS could overflow
a buffer that holds this data.
2012-07-17 08:49:41 +00:00
Ben Gras
cbcdb838f1 various coverity-inspired fixes
. some strncpy/strcpy to strlcpy conversions
	. new <minix/param.h> to avoid including other minix headers
	  that have colliding definitions with library and commands code,
	  causing parse warnings
	. removed some dead code / assignments
2012-07-16 14:00:56 +02:00
Thomas Veerman
77dbd766c1 VFS: Use safe string copy functions 2012-07-16 10:57:43 +00:00
Ben Gras
50e2064049 No more intel/minix segments.
This commit removes all traces of Minix segments (the text/data/stack
memory map abstraction in the kernel) and significance of Intel segments
(hardware segments like CS, DS that add offsets to all addressing before
page table translation). This ultimately simplifies the memory layout
and addressing and makes the same layout possible on non-Intel
architectures.

There are only two types of addresses in the world now: virtual
and physical; even the kernel and processes have the same virtual
address space. Kernel and user processes can be distinguished at a
glance as processes won't use 0xF0000000 and above.

No static pre-allocated memory sizes exist any more.

Changes to booting:
        . The pre_init.c leaves the kernel and modules exactly as
          they were left by the bootloader in physical memory
        . The kernel starts running using physical addressing,
          loaded at a fixed location given in its linker script by the
          bootloader.  All code and data in this phase are linked to
          this fixed low location.
        . It makes a bootstrap pagetable to map itself to a
          fixed high location (also in linker script) and jumps to
          the high address. All code and data then use this high addressing.
        . All code/data symbols linked at the low addresses is prefixed by
          an objcopy step with __k_unpaged_*, so that that code cannot
          reference highly-linked symbols (which aren't valid yet) or vice
          versa (symbols that aren't valid any more).
        . The two addressing modes are separated in the linker script by
          collecting the unpaged_*.o objects and linking them with low
          addresses, and linking the rest high. Some objects are linked
          twice, once low and once high.
        . The bootstrap phase passes a lot of information (e.g. free memory
          list, physical location of the modules, etc.) using the kinfo
          struct.
        . After this bootstrap the low-linked part is freed.
        . The kernel maps in VM into the bootstrap page table so that VM can
          begin executing. Its first job is to make page tables for all other
          boot processes. So VM runs before RS, and RS gets a fully dynamic,
          VM-managed address space. VM gets its privilege info from RS as usual
          but that happens after RS starts running.
        . Both the kernel loading VM and VM organizing boot processes happen
	  using the libexec logic. This removes the last reason for VM to
	  still know much about exec() and vm/exec.c is gone.

Further Implementation:
        . All segments are based at 0 and have a 4 GB limit.
        . The kernel is mapped in at the top of the virtual address
          space so as not to constrain the user processes.
        . Processes do not use segments from the LDT at all; there are
          no segments in the LDT any more, so no LLDT is needed.
        . The Minix segments T/D/S are gone and so none of the
          user-space or in-kernel copy functions use them. The copy
          functions use a process endpoint of NONE to realize it's
          a physical address, virtual otherwise.
        . The umap call only makes sense to translate a virtual address
          to a physical address now.
        . Segments-related calls like newmap and alloc_segments are gone.
        . All segments-related translation in VM is gone (vir2map etc).
        . Initialization in VM is simpler as no moving around is necessary.
        . VM and all other boot processes can be linked wherever they wish
          and will be mapped in at the right location by the kernel and VM
          respectively.

Other changes:
        . The multiboot code is less special: it does not use mb_print
          for its diagnostics any more but uses printf() as normal, saving
          the output into the diagnostics buffer, only printing to the
          screen using the direct print functions if a panic() occurs.
        . The multiboot code uses the flexible 'free memory map list'
          style to receive the list of free memory if available.
        . The kernel determines the memory layout of the processes to
          a degree: it tells VM where the kernel starts and ends and
          where the kernel wants the top of the process to be. VM then
          uses this entire range, i.e. the stack is right at the top,
          and mmap()ped bits of memory are placed below that downwards,
          and the break grows upwards.

Other Consequences:
        . Every process gets its own page table as address spaces
          can't be separated any more by segments.
        . As all segments are 0-based, there is no distinction between
          virtual and linear addresses, nor between userspace and
          kernel addresses.
        . Less work is done when context switching, leading to a net
          performance increase. (8% faster on my machine for 'make servers'.)
	. The layout and configuration of the GDT makes sysenter and syscall
	  possible.
2012-07-15 22:30:15 +02:00
Ben Gras
0fb2f83da9 drop from segments physcopy/vircopy invocations
. sys_vircopy always uses D for both src and dst
	. sys_physcopy uses PHYS_SEG if and only if corresponding
	  endpoint is NONE, so we can derive the mode (PHYS_SEG or D)
	  from the endpoint arg in the kernel, dropping the seg args
	. fields in msg still filled in for backwards compatability,
	  using same NONE-logic in the library
2012-06-18 12:28:40 +00:00
Ben Gras
2bfeeed885 drop segment from safecopy invocations
. all invocations were S or D, so can safely be dropped
	  to prepare for the segmentless world
	. still assign D to the SCP_SEG field in the message
	  to make previous kernels usable
2012-06-16 16:22:51 +00:00
Ben Gras
85ff5a947e dumpcore: use ptrace function to trigger a coredump
. dumpcore currently relies on minix segments
	. also ptrace dumpcore fix
2012-06-15 12:13:50 +02:00
Ben Gras
769af57274 further libexec generalization
. new mode for sys_memset: include process so memset can be
	  done in physical or virtual address space.
	. add a mode to mmap() that lets a process allocate uninitialized
	  memory.
	. this allows an exec()er (RS, VFS, etc.) to request uninitialized
	  memory from VM and selectively clear the ranges that don't come
	  from a file, leaving no uninitialized memory left for the process
	  to see.
	. use callbacks for clearing the process, clearing memory in the
	  process, and copying into the process; so that the libexec code
	  can be used from rs, vfs, and in the future, kernel (to load vm)
	  and vm (to load boot-time processes)
2012-06-07 15:15:02 +02:00
Ben Gras
040362e379 exec() cleanup, generalization, improvement
. make exec() callers (i.e. vfs and rs) determine the
	  memory layout by explicitly reserving regions using
	  mmap() calls on behalf of the exec()ing process,
	  i.e. handling all of the exec logic, thereby eliminating
	  all special exec() knowledge from VM.
	. the new procedure is: clear the exec()ing process
	  first, then call third-party mmap()s to reserve memory, then
	  copy the executable file section contents in, all using callbacks
	  tailored to the caller's way of starting an executable
	. i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM
	  as with rigid 2-section arguments
	. this naturally allows generalizing exec() by simply loading
	  all ELF sections
	. drop/merge of lots of duplicate exec() code into libexec
	. not copying the code sections to vfs and into the executable
	  again is a measurable performance improvement (about 3.3% faster
	  for 'make' in src/servers/)
2012-06-07 15:15:01 +02:00
Ben Gras
41b869d4d6 drop aout support
justification: soon we won't be able to execute sep I&D aouts at
all (because of the vanishing segments), which was the default mode
to generate them so most binaries will be sep I&D.

this makes the vfs/rs exec() unification work simpler.

after unification, common I&D aout could be added back quite simply.
2012-06-07 12:43:16 +02:00
David van Moolenbroek
1817f7fc07 VFS: fix "process already free" panic on reboot
Reported by Claudiu Dan Gheorghe, debugged by Thomas and myself
2012-05-02 17:42:50 +02:00
Thomas Veerman
068d443d12 VFS: unlock vmnt when out of vnodes 2012-04-27 08:51:13 +00:00
Thomas Veerman
b6ff38065f VFS: release what can be released
Only attempt to release blocked processes that are blocked. There is
no use in trying to find more blocked processes than we know that are
blocked (on a pipe).
2012-04-27 08:51:02 +00:00
Thomas Veerman
7b81254069 VFS: simplify stat for pipes
According to POSIX the st_size field of struct stat is undefined for
fifos and anonymous pipes. Thus we can do anything we want. We save a
copy by not being accurate on pipe sizes.
2012-04-27 08:50:49 +00:00
Thomas Veerman
db8198d99d VFS: use S_IS* macros 2012-04-27 08:49:38 +00:00
Thomas Veerman
96bbc5da3e VFS: I_PIPE is redundant
Also, use S_IS* macros instead of manual comparison.
2012-04-27 08:49:38 +00:00
Ben Gras
755102d67f AT_SUN_EXECNAME support
. vfs: pass execname in aux vectors
	. ld.elf_so: use this to expand $ORIGIN
	. this requires the executable to reserve more
	  space at exec() calling time
2012-04-26 13:32:39 +02:00
David van Moolenbroek
26f817243b VFS: reimplement truncate mtime/ctime fix
POSIX mandates that a file's modification and change time be left
untouched upon truncate/ftruncate iff the file size does not change.
However, an open(O_TRUNC) call must always update the modification and
change time of the file, even if it was already zero-sized. VFS uses
the file systems' truncate call to implement O_TRUNC. This patch
replaces git-255ae85, which did not take into account the open case.
The size check is now moved into VFS, so that individual file systems
need not check for this case anymore.
2012-04-20 11:35:59 +02:00
Ben Gras
3945cfbfd3 block ioctls: pass request number 2012-04-18 11:01:15 +02:00
Ben Gras
53002f6f6c recognize and execute dynamically linked executables
. generalize libexec slightly to get some more necessary information
	  from ELF files, e.g. the interpreter
	. execute dynamically linked executables when exec()ed by VFS
	. switch to netbsd variant of elf32.h exclusively, solves some
	  conflicting headers
2012-04-16 00:41:42 +00:00
Thomas Veerman
26ec619a30 VFS: fix filp reuse race
Pipes consist of two filps (read filp and write filp) and a shared
vnode. When the writer leaves the filp reference count drops to
zero and subsequent find_filp()s should not find the filp when a
reader looks for it and the reader gets EOF. However, the pipe()
system call tries to find two filps, marks them in use, and only
after a successful node creation on PFS, overwrites the shared
vnode with the new vnode. Consequently, this leaves a small window
where a just closed 'pipe write filp' gets reused and marked as
present, before becoming the actual new 'pipe write filp' for a new
pipe. A reader for the old pipe will think a writer is present and
wait for that writer to write something or to leave; both actions
should revive the suspended reader. This will never happen and the
reader will be stuck forever.
2012-04-13 13:22:57 +00:00
Thomas Veerman
e292ba487e VFS: more three-level-lock sanity checking 2012-04-13 13:22:42 +00:00
Thomas Veerman
933120b0b1 VFS: add getting active threads control msg 2012-04-13 13:21:01 +00:00
Thomas Veerman
e1a73469c8 VFS: remove debug print 2012-04-13 13:20:28 +00:00
Thomas Veerman
c2bb739760 VFS: let know when skipping reply 2012-04-13 13:19:45 +00:00
Thomas Veerman
91a38b6d4e VFS: fix dead lock
When running out of worker threads to handle device replies a dead
lock resolver thread is used. However, it was only used for FS
endpoints; it is now used for "system processes" (drivers and FS
endpoints). Also, drivers were marked as system process when they
were not "forced" to map (i.e., mapping was done before endpoint was
alive).
2012-04-13 13:19:10 +00:00
Thomas Veerman
b956493367 VFS: fix new signed/unsigned comparisons 2012-04-13 13:00:11 +00:00
Thomas Veerman
defe329519 VFS: warnings are errors 2012-04-13 12:59:32 +00:00
Thomas Veerman
0d63d9e125 VFS: enable sending control messages 2012-04-13 12:54:55 +00:00
Thomas Veerman
f571466c56 VFS: find job only if request is an transaction 2012-04-13 12:52:52 +00:00
Thomas Veerman
8f55767619 VFS: make m_in job local
By making m_in job local (i.e., each job has its own copy of m_in instead
of refering to the global m_in) we don't have to store and restore m_in
on every thread yield. This reduces overhead. Moreover, remove the
assumption that m_in is preserved. Do_XXX functions have to copy the
system call parameters as soon as possible and only pass those copies to
other functions.

Furthermore, this patch cleans up some code and uses better types in a lot
of places.
2012-04-13 12:50:38 +00:00
Ben Gras
1e2b3f4326 vfs: more regions for coredumps 2012-04-12 14:29:59 +02:00
Ben Gras
204ae72525 retire _ANSI and <minix/ansi.h> 2012-03-25 21:58:27 +02:00
Ben Gras
7336a67dfe retire PUBLIC, PRIVATE and FORWARD 2012-03-25 21:58:14 +02:00
Ben Gras
6a73e85ad1 retire _PROTOTYPE
. only good for obsolete K&R support
	. also remove a stray ansi.h and the proto cmd
2012-03-25 16:17:10 +02:00
David van Moolenbroek
e8d2d2f6b6 libminc-related updates
- add files needed for acpi, ahci, fbd, vfs to libminc
- remove "-lc" from their respective makefiles
- remove setenv from libminc (requires initialization)
2012-03-12 23:16:45 +01:00
Tomas Hruby
72b7abd1a1 VFS - no CANCEL for async non-blocking operations
- if an operation (R, W, IOCTL) is non blocking, a flag is set
  and sent to the device.

- nothing changes for sync devices

- asyn devices should reply asap if an operation is non-blocking.
  We must trust the devices, but we had to trust them anyway to
  reply to CANCEL correctly

- we safe sending CANCEL commands to asyn devices. This greatly
  simplifies the protocol. Asynchronous devices can always reply
  when a reply is ready and do not need to deal with other
  situations

- currently, none of our drivers use the flags since they drive
  virtual devices which do not block
2012-03-02 15:44:48 +00:00
Tomas Hruby
f19d8df184 VFS : simplification of handling asyn selects
- select_request_async() returns no ops by default

- wantops in do_select() always set correctly, do_select() does
  not need a special case for SUSPEND (and ugly code)
2012-03-02 15:44:48 +00:00
Tomas Hruby
369a12704f VFS - dev_style_asyn()
- dev_style_asyn() tests whether a device is asynchronous

 - simplifies code and helps readability
2012-03-02 15:44:47 +00:00
Tomas Hruby
35eb88461d VFS - cancel_nblock()
- duplicate code in dev_io() which sends CANCEL in case of a
  non-blocking operation moved to cancel_nblock()
2012-03-02 15:44:47 +00:00
Thomas Veerman
1efb51b1de VFS: improve crashed FS resource cleanup
When VFS detects that an FS has crashed and tries to clean up
resources, it marks fairly late in the process that a vmnt is not
to be used again (to send requests to). This allows a thread to
become blocked on a vmnt after all blocked threads were stopped, but
before it finds out it shouldn't try to send to that vmnt.
2012-02-22 13:54:35 +00:00
Thomas Veerman
5ff845212e VFS: remove unused variables 2012-02-21 10:21:05 +00:00
Thomas Veerman
0c1cd8720a VFS: fix last_dir not returning last directory
If the provided path was only a single component (i.e., without
slashes), then last_dir would return early and skip the symlink
detection (i.e., check whether the path ends in a symlink and resolve
that first before returning). This bug triggered an assert in open
which expects that an advance after an last_dir (with VMNT_WRITE lock)
does not yield another vmnt lock.
2012-02-21 10:21:05 +00:00
Thomas Veerman
230ea1ce13 VFS: remove erroneous assert
The assert was meant as an additional check to the assert in link.c:198.
The reasoning behind the assert in link.c:198 is that once you've
obtained a write lock on a vmnt, you can't get an additional read lock
on the same vmnt. However, that does not always hold for the assert in
path.c:281 where the situation could be that you've obtained a read lock
and managed to get another read lock (this is possible). In other words,
the assert in path.c:281 is not the right place to check for that
situation.
2012-02-20 09:17:42 +00:00
Thomas Veerman
c540bcb001 VFS: various select fixes
- Fix locking bug when unable to send DEV_SELECT request. Upon failure
  VFS tried to cancel the select operation, but this failed due to trying
  to lock a filp that was already locked to send the request in the first
  place. Do_select_request now handles locking of filps itself instead of
  relying on the caller to do it.  This fixes a crash when killing INET.
- Fix failure to revive a process after a non-blocking select operation
  yielded no ready select operations when replying DEV_SEL_REPL1.
- Improve readability by using OK, SUSPEND, and standard error values as
  results instead of having separate macros in select.
- Don't print not having a driver for a major device; after killing a driver
  select will trigger this printf.
2012-02-17 21:09:07 +00:00
Arun Thomas
ff56906879 Remove obsolete INSTALLFLAGS from makefiles 2012-02-16 23:26:38 +01:00
Ben Gras
2fe8fb192f Full switch to clang/ELF. Drop ack. Simplify.
There is important information about booting non-ack images in
docs/UPDATING. ack/aout-format images can't be built any more, and
booting clang/ELF-format ones is a little different. Updating to the
new boot monitor is recommended.

Changes in this commit:

	. drop boot monitor -> allowing dropping ack support
	. facility to copy ELF boot files to /boot so that old boot monitor
	  can still boot fairly easily, see UPDATING
	. no more ack-format libraries -> single-case libraries
	. some cleanup of OBJECT_FMT, COMPILER_TYPE, etc cases
	. drop several ack toolchain commands, but not all support
	  commands (e.g. aal is gone but acksize is not yet).
	. a few libc files moved to netbsd libc dir
	. new /bin/date as minix date used code in libc/
	. test compile fix
	. harmonize includes
	. /usr/lib is no longer special: without ack, /usr/lib plays no
	  kind of special bootstrapping role any more and bootstrapping
	  is done exclusively through packages, so releases depend even
	  less on the state of the machine making them now.
	. rename nbsd_lib* to lib*
	. reduce mtree
2012-02-14 14:52:02 +01:00
Thomas Veerman
80c4685324 VFS: replace VFS with AVFS 2012-02-13 16:53:21 +00:00
Thomas Veerman
4498750810 libchardriver: fix open reply for async devices 2012-02-09 14:17:54 +00:00
Thomas Veerman
1fc399a5c1 Add permission test for bind and socket
Also, apply forbidden patch to VFS from AVFS (fixes hanging test56 if
it has the permission test).
2012-01-30 15:16:20 +00:00
Thomas Veerman
0bd011affd PM: extend srv_fork to set a specific UID
Currently, all servers and drivers run as root as they are forks of
RS. srv_fork now tells PM with which credentials to run the resulting
fork. Subsequently, PM lets VFS now as well.

This patch also fixes the following bugs:
 - RS doesn't initialize the setugid variable during exec, causing the
   servers and drivers to run setuid rendering the srv_fork extension
   useless.
 - PM erroneously tells VFS to run processes setuid. This doesn't
   actually lead to setuid processes as VFS sets {r,e}uid and {r,e}gid
   properly before checking PM's approval.
2012-01-30 15:16:19 +00:00
David van Moolenbroek
c89aaf7a87 vfs/avfs: renumber stat calls so as to be unique
The old stat call numbers are still supported for a while.
2012-01-14 00:27:07 +01:00
David van Moolenbroek
2c685f34e0 Cut PM out of the adddma/deldma/getdma call path 2012-01-14 00:27:06 +01:00
David van Moolenbroek
8cb7ba7951 Remove obsolete PROCSTAT/getsigset call. 2012-01-14 00:27:06 +01:00
Ben Gras
34a8901eb8 vfs,avfs: verify an interpreter was found on #! line
. if not, NULL *interp is dereferenced
2011-12-21 23:44:13 +01:00
David van Moolenbroek
6f374faca5 Add "expected size" parameter to getsysinfo()
This patch provides basic protection against damage resulting from
differently compiled servers blindly copying tables to one another.
In every getsysinfo() call, the caller is provided with the expected
size of the requested data structure. The callee fails the call if
the expected size does not match the data structure's actual size.
2011-12-11 22:34:14 +01:00
David van Moolenbroek
9701e9dfd2 Servers: cleanup of some gcc -W warnings 2011-12-11 22:33:37 +01:00
Thomas Veerman
0a61519eea Provide core dumping support for AVFS 2011-12-08 10:47:11 +00:00
David van Moolenbroek
9221586f37 vfs/avfs: req_newdriver should use fs_sendrec
Using sendrec directly only results in problems. While it is not
clear whether using fs_sendrec is the best option, it is at least
an improvement.

Also remove some legacy cruft.
2011-12-05 16:28:09 +01:00
David van Moolenbroek
db087efac4 VFS/FS: REQ_NEW_DRIVER now provides a label 2011-11-30 19:05:26 +01:00
Thomas Veerman
b4fb061802 Implement issetugid syscall
Implement issetugid syscall and provide a test. This gets rid of the
scary "Unsecure. Implement me" warning during compilation.
2011-11-28 10:03:43 +00:00
David van Moolenbroek
a9f89a7290 vfs/avfs: map O_ACCMODE to R_BIT|W_BIT on recovery 2011-11-24 13:57:36 +01:00
David van Moolenbroek
b4d909d415 Split block/character protocols and libdriver
This patch separates the character and block driver communication
protocols. The old character protocol remains the same, but a new
block protocol is introduced. The libdriver library is replaced by
two new libraries: libchardriver and libblockdriver. Their exposed
API, and drivers that use them, have been updated accordingly.
Together, libbdev and libblockdriver now completely abstract away
the message format used by the block protocol. As the memory driver
is both a character and a block device driver, it now implements its
own message loop.

The most important semantic change made to the block protocol is that
it is no longer possible to return both partial results and an error
for a single transfer. This simplifies the interaction between the
caller and the driver, as the I/O vector no longer needs to be copied
back. Also, drivers are now no longer supposed to decide based on the
layout of the I/O vector when a transfer should be cut short. Put
simply, transfers are now supposed to either succeed completely, or
result in an error.

After this patch, the state of the various pieces is as follows:
- block protocol: stable
- libbdev API: stable for synchronous communication
- libblockdriver API: needs slight revision (the drvlib/partition API
  in particular; the threading API will also change shortly)
- character protocol: needs cleanup
- libchardriver API: needs cleanup accordingly
- driver restarts: largely unsupported until endpoint changes are
  reintroduced

As a side effect, this patch eliminates several bugs, hacks, and gcc
-Wall and -W warnings all over the place. It probably introduces a
few new ones, too.

Update warning: this patch changes the protocol between MFS and disk
drivers, so in order to use old/new images, the MFS from the ramdisk
must be used to mount all file systems.
2011-11-23 14:06:37 +01:00
David van Moolenbroek
1e1db53986 Introduce sys_getregs call, and let vfs use it 2011-11-22 02:07:33 +01:00
Adriana Szekeres
c30f014a89 gcore command to coredump a process 2011-11-22 22:07:41 +01:00
Adriana Szekeres
eaa29370f4 ELF core files 2011-11-22 22:07:40 +01:00
David van Moolenbroek
0bb27bb0b1 Servers: remove ABI comment 2011-11-07 22:24:59 +01:00
David van Moolenbroek
b02c260ecb Miscellaneous legacy cleanup 2011-11-07 22:20:55 +01:00
Thomas Veerman
203937456e Fix off-by-one errors and increase PATH_MAX to 1024
In some places it was assumed that PATH_MAX does not include a
terminating null character.

Increases PATH_MAX to 1024 to get in sync with NetBSD. Required some
rewriting in AVFS to keep memory usage low (the stack in use by a thread
is very small).
2011-09-12 09:00:24 +00:00
Thomas Veerman
d4b72e81b2 Cleanup servers to make GCC/Clang a little happier 2011-09-08 13:57:03 +00:00
Thomas Veerman
8a266a478e Increase gid_t and uid_t to 32 bits
Increase gid_t and uid_t to 32 bits and provide backwards compatibility
where needed.
2011-09-05 13:56:14 +00:00
Arun Thomas
86b061078b Build gcov code only if MKCOVERAGE is yes 2011-08-09 10:39:33 +02:00
Ben Gras
c4ea2a195c getsid() implementation 2011-08-02 22:16:59 +02:00
Thomas Veerman
ece4c9d565 Add DEV_CLONE_A dev type 2011-07-27 12:23:03 +00:00
Arun Thomas
530bd5d486 vfs/rs: for ELF, sep_id should be 0 2011-07-26 15:21:07 +02:00
Thomas Veerman
902e0e27e0 Don't panic if owner has vanished before reply 2011-07-15 14:11:34 +00:00
Evgeniy Ivanov
ef0a265086 New stat structure.
* VFS and installed MFSes must be in sync before and after this change *

Use struct stat from NetBSD. It requires adding new STAT, FSTAT and LSTAT
syscalls. Libc modification is both backward and forward compatible.

Also new struct stat uses modern field sizes to avoid ABI
incompatibility, when we update uid_t, gid_t and company.
Exceptions are ino_t and off_t in old libc (though paddings added).
2011-07-12 16:39:55 +02:00
Ben Gras
a9d15dd3e4 pm, vfs: don't print something for bogus calls 2011-07-05 13:21:48 +02:00
Ben Gras
86a226680b vfs: don't SUSPEND for unknown calls
. returning ENOSYS helps for implementing
	  new calls with forwards compatability
2011-07-02 17:19:13 +02:00
Arun Thomas
93ae43f577 boot: Add multiboot support
Not yet fully spec-compliant; work in progress
2011-06-24 17:21:51 +02:00
Gianluca Guida
cc17b27a2b Build NetBSD libc library in world in ELF mode.
3 sets of libraries are built now:
  . ack: all libraries that ack can compile (/usr/lib/i386/)
  . clang+elf: all libraries with minix headers (/usr/lib/)
  . clang+elf: all libraries with netbsd headers (/usr/netbsd/)

Once everything can be compiled with netbsd libraries and headers, the
/usr/netbsd hierarchy will be obsolete and its libraries compiled with
netbsd headers will be installed in /usr/lib, and its headers
in /usr/include. (i.e. minix libc and current minix headers set
will be gone.)

To use the NetBSD libc system (libraries + headers) before
it is the default libc, see:
   http://wiki.minix3.org/en/DevelopersGuide/UsingNetBSDCode
This wiki page also documents the maintenance of the patch
files of minix-specific changes to imported NetBSD code.

Changes in this commit:
  . libsys: Add NBSD compilation and create a safe NBSD-based libc.
  . Port rest of libraries (except libddekit) to new header system.
  . Enable compilation of libddekit with new headers.
  . Enable kernel compilation with new headers.
  . Enable drivers compilation with new headers.
  . Port legacy commands to new headers and libc.
  . Port servers to new headers.
  . Add <sys/sigcontext.h> in compat library.
  . Remove dependency file in tree.
  . Enable compilation of common/lib/libc/atomic in libsys
  . Do not generate RCSID strings in libc.
  . Temporarily disable zoneinfo as they are incompatible with NetBSD format
  . obj-nbsd for .gitignore
  . Procfs: use only integer arithmetic. (Antoine Leca)
  . Increase ramdisk size to create NBSD-based images.
  . Remove INCSYMLINKS handling hack.
  . Add nbsd_include/sys/exec_elf.h
  . Enable ELF compilation with NBSD libc.
  . Add 'make nbsdsrc' in tools to download reference NetBSD sources.
  . Automate minix-port.patch creation.
  . Avoid using fstavfs() as it is *extremely* slow and unneeded.
  . Set err() as PRIVATE to avoid name clash with libc.
  . [NBSD] servers/vm: remove compilation warnings.
  . u32 is not a long in NBSD headers.
  . UPDATING info on netbsd hierarchy
  . commands fixes for netbsd libc
2011-06-24 11:46:30 +02:00
Ben Gras
a77c2973b3 fix clang warnings -R in kernel/ and servers/ 2011-06-09 16:09:13 +02:00
Ben Gras
674cd6fd48 larger i/o buffer for exec()
. makes exec() for large executables (e.g. clang, gcc)
    significantly faster

Thanks to Antoine Leca.
2011-05-12 19:12:28 +02:00
Thomas Veerman
aba392e630 Clean up and fix multiple bugs in select:
- Remove redundant code.
 - Always wait for the initial reply from an asynchronous select request,
   even if the select has been satisfied on another file descriptor or
   was canceled due to a serious error.
 - Restart asynchronous selects if upon reply from the driver turns out
   that there are deferred operations (and do not forget we're still
   interested in the results of the deferred operations).
 - Do not hang a non-blocking select when another blocking select on
   the same filp is still blocking.
 - Split blocking operations in read, write, and exceptions (i.e.,
   blocking on read does not imply the write will block as well).
 - Some loops would iterate over OPEN_MAX file descriptors instead of
   the "highest" file descriptor.
 - Use proper internal error return values.
 - A secondary reply from a synchronous driver is essentially the same
   as from an asynchronous driver (the only difference being how the 
   answer is received). Merge.
 - Return proper error code after a driver failure.
 - Auto-detect whether a driver is synchronous or asynchronous.
 - Remove some code duplication.
 - Clean up code (coding style, add missing comments, put all select
   related code together).
2011-04-13 13:25:34 +00:00
Thomas Veerman
f0740680cd Do not print an error message when a binary is corrupt 2011-04-12 13:09:19 +00:00
David van Moolenbroek
c51cd5fe91 Server/driver protocols: no longer allow third-party copies.
Before safecopies, the IO_ENDPT and DL_ENDPT message fields were needed
to know which actual process to copy data from/to, as that process may
not always be the caller. Now that we have full safecopy support, these
fields have become useless for that purpose: the owner of the grant is
*always* the caller. Allowing the caller to supply another endpoint is
in fact dangerous, because the callee may then end up using a grant
from a third party. One could call this a variant of the confused
deputy problem.

From now on, safecopy calls should always use the caller's endpoint as
grant owner. This fully obsoletes the DL_ENDPT field in the
inet/ethernet protocol. IO_ENDPT has other uses besides identifying the
grant owner though. This patch renames IO_ENDPT to USER_ENDPT, not only
because that is a more fitting name (it should never be used for I/O
after all), but also in order to intentionally break any old system
source code outside the base system. If this patch breaks your code,
fixing it is fairly simple:

- DL_ENDPT should be replaced with m_source;
- IO_ENDPT should be replaced with m_source when used for safecopies;
- IO_ENDPT should be replaced with USER_ENDPT for any other use, e.g.
  when setting REP_ENDPT, matching requests in CANCEL calls, getting
  DEV_SELECT flags, and retrieving of the real user process's endpoint
  in DEV_OPEN.

The changes in this patch are binary backward compatible.
2011-04-11 17:35:05 +00:00
Arun Thomas
cd9b4b46f4 libexec: return physaddr info from ELF headers 2011-04-07 12:22:36 +00:00
David van Moolenbroek
28f2a169da VFS: bugfixes for handling block-special files:
- on driver restarts, reopen devices on a per-file basis, not per-mount
- do not assume that there is just one vnode per block-special device
- update block-special files in the uncommon mounting success paths, too
- upon mount, sync but also invalidate affected buffers on the root FS
- upon unmount, check whether a vnode is in use before updating it
2011-03-25 10:56:43 +00:00
Erik van der Kouwe
36f9c1155a Restart process after response from async driver on non-blocking select 2011-02-23 10:27:48 +00:00
Ben Gras
287fee89cb add NOASSERTS make flag that disables assert()s (NDEBUG=1).
. made some checks in vfs/vnode.c also respond to NDEBUG=1.
  . turned on in release builds
2011-02-16 18:58:30 +00:00
Ben Gras
dc1cc91df1 <ansi.h> -> <minix/ansi.h> 2011-01-28 11:35:02 +00:00
Ben Gras
f0f34dd8d9 vfs - use a static buffer instead of malloc()+free(), solving
recently appeared ENOMEM problems during exec().
2010-12-15 14:43:59 +00:00
Arun Thomas
372b873413 VFS/RS support for ELF 2010-12-10 09:27:56 +00:00
Arun Thomas
cc26fb5ec4 vfs: terminate string in rdlink_direct
Fixes test56 when compiled with GCC.
2010-12-01 16:24:50 +00:00
Dirk Vogt
5e1e763506 removed unneeded global var 2010-11-24 16:30:13 +00:00
Dirk Vogt
9ed280d1ec decouple file system server start/termination from mount/umount 2010-11-23 19:34:56 +00:00
Arun Thomas
f0ab18377d GCC/clang: int64 routines in C 2010-11-12 18:38:10 +00:00
Erik van der Kouwe
9235536f38 Fix select-related bugs: missing cancellations led to potentially forgetting notifies, especially in the case of async drivers 2010-10-08 12:50:52 +00:00
David van Moolenbroek
354da24f5b make getsysinfo() a system-land call 2010-09-14 21:50:05 +00:00
Thomas Veerman
13ef7f1f38 Prepare VFS to support back calls from PFS. For security reasons and to support
file descriptor passing, PFS does some back calls to VFS. For example, to
verify the validity of a path provided by a process and to tell VFS it must
copy file descriptors from one process to another.
2010-08-30 13:44:07 +00:00
Ben Gras
5d6c2aae0a gcov support, based on work contributed by Anton Kuijsten. 2010-08-25 13:06:43 +00:00
Thomas Veerman
c8cfcab5db - Make sure there's space left in the vmnt table for another mount point.
- Increase mount point limit.
2010-08-17 10:02:50 +00:00
Ben Gras
3badab8b70 vfs - split fp_fd field into fd + callnr fields 2010-07-22 14:55:28 +00:00
Erik van der Kouwe
739f2d7536 Fix comment 2010-07-15 14:47:08 +00:00
Thomas Veerman
5aff633a0b Make RS and VFS aware of new UDS major. Contributed by Thomas Cort 2010-07-15 13:51:38 +00:00
David van Moolenbroek
895850b8cf move timers code to libsys 2010-07-09 12:58:18 +00:00
Thomas Veerman
34a2864e27 Fix a few compile time warnings 2010-07-02 12:41:19 +00:00
Arun Thomas
c0c8d25799 Rename mkfiles from minix.*.mk to bsd.*.mk
Makes things easier for pkgsrc
2010-06-25 18:29:09 +00:00
Erik van der Kouwe
c0dfa2f3f1 Get rid of asynsend backup copy in VFS 2010-06-25 14:57:54 +00:00
Erik van der Kouwe
498d7d8a4c Don't use kernel responses in servers 2010-06-24 07:37:26 +00:00
Ben Gras
fc01683584 include, vfs: statvfs, fstatvfs calls, contributed by Buccapatnam Tirumala, Gautam. 2010-06-23 23:53:50 +00:00
Ben Gras
19b790eb53 vfs: don't use a mountpoint if it's in use for anything else.
(this avoids data structure confusion if a mountpoint is reused as
a mountpoint until that's properly fixed.)
2010-06-11 11:41:56 +00:00
Arun Thomas
1bf6d23f34 Make exec() use entry point in a.out header 2010-06-10 14:59:10 +00:00
Arun Thomas
f0a158d8c1 More cleanup to remove MM and FS references 2010-06-10 14:04:46 +00:00
Kees van Reeuwijk
826b9590f2 More endpoint_t correctness.
More const correctness.
Other code cleanup.
2010-06-08 14:09:18 +00:00
Arun Thomas
4c10a31440 Remove legacy MM, FS, and FS_PROC_NR macros 2010-06-08 13:58:01 +00:00
Thomas Veerman
6bbcab3ec4 Clean up MFS a bit:
- Remove unused includes.
 - Add include guards to headers.
 - Use unsigned variables in case they're never going to hold a negative
   value. This causes GCC's complaints to disappear and should make flexelint
   a lot happier, too.
 - Make functions private when they're used only within a module.
 - Remove unused variables.
 - Add casts where appropriate.
2010-06-01 12:35:33 +00:00
Tomas Hruby
6e25ad8b0a Use of all NIL_* defines converted to NULL 2010-05-10 13:26:00 +00:00
Thomas Veerman
0aceb25535 Small cleanup of dead and/or redundant code. 2010-05-06 09:32:40 +00:00