Commit graph

399 commits

Author SHA1 Message Date
Thomas Veerman 77dbd766c1 VFS: Use safe string copy functions 2012-07-16 10:57:43 +00:00
Ben Gras 50e2064049 No more intel/minix segments.
This commit removes all traces of Minix segments (the text/data/stack
memory map abstraction in the kernel) and significance of Intel segments
(hardware segments like CS, DS that add offsets to all addressing before
page table translation). This ultimately simplifies the memory layout
and addressing and makes the same layout possible on non-Intel
architectures.

There are only two types of addresses in the world now: virtual
and physical; even the kernel and processes have the same virtual
address space. Kernel and user processes can be distinguished at a
glance as processes won't use 0xF0000000 and above.

No static pre-allocated memory sizes exist any more.

Changes to booting:
        . The pre_init.c leaves the kernel and modules exactly as
          they were left by the bootloader in physical memory
        . The kernel starts running using physical addressing,
          loaded at a fixed location given in its linker script by the
          bootloader.  All code and data in this phase are linked to
          this fixed low location.
        . It makes a bootstrap pagetable to map itself to a
          fixed high location (also in linker script) and jumps to
          the high address. All code and data then use this high addressing.
        . All code/data symbols linked at the low addresses is prefixed by
          an objcopy step with __k_unpaged_*, so that that code cannot
          reference highly-linked symbols (which aren't valid yet) or vice
          versa (symbols that aren't valid any more).
        . The two addressing modes are separated in the linker script by
          collecting the unpaged_*.o objects and linking them with low
          addresses, and linking the rest high. Some objects are linked
          twice, once low and once high.
        . The bootstrap phase passes a lot of information (e.g. free memory
          list, physical location of the modules, etc.) using the kinfo
          struct.
        . After this bootstrap the low-linked part is freed.
        . The kernel maps in VM into the bootstrap page table so that VM can
          begin executing. Its first job is to make page tables for all other
          boot processes. So VM runs before RS, and RS gets a fully dynamic,
          VM-managed address space. VM gets its privilege info from RS as usual
          but that happens after RS starts running.
        . Both the kernel loading VM and VM organizing boot processes happen
	  using the libexec logic. This removes the last reason for VM to
	  still know much about exec() and vm/exec.c is gone.

Further Implementation:
        . All segments are based at 0 and have a 4 GB limit.
        . The kernel is mapped in at the top of the virtual address
          space so as not to constrain the user processes.
        . Processes do not use segments from the LDT at all; there are
          no segments in the LDT any more, so no LLDT is needed.
        . The Minix segments T/D/S are gone and so none of the
          user-space or in-kernel copy functions use them. The copy
          functions use a process endpoint of NONE to realize it's
          a physical address, virtual otherwise.
        . The umap call only makes sense to translate a virtual address
          to a physical address now.
        . Segments-related calls like newmap and alloc_segments are gone.
        . All segments-related translation in VM is gone (vir2map etc).
        . Initialization in VM is simpler as no moving around is necessary.
        . VM and all other boot processes can be linked wherever they wish
          and will be mapped in at the right location by the kernel and VM
          respectively.

Other changes:
        . The multiboot code is less special: it does not use mb_print
          for its diagnostics any more but uses printf() as normal, saving
          the output into the diagnostics buffer, only printing to the
          screen using the direct print functions if a panic() occurs.
        . The multiboot code uses the flexible 'free memory map list'
          style to receive the list of free memory if available.
        . The kernel determines the memory layout of the processes to
          a degree: it tells VM where the kernel starts and ends and
          where the kernel wants the top of the process to be. VM then
          uses this entire range, i.e. the stack is right at the top,
          and mmap()ped bits of memory are placed below that downwards,
          and the break grows upwards.

Other Consequences:
        . Every process gets its own page table as address spaces
          can't be separated any more by segments.
        . As all segments are 0-based, there is no distinction between
          virtual and linear addresses, nor between userspace and
          kernel addresses.
        . Less work is done when context switching, leading to a net
          performance increase. (8% faster on my machine for 'make servers'.)
	. The layout and configuration of the GDT makes sysenter and syscall
	  possible.
2012-07-15 22:30:15 +02:00
Ben Gras 0fb2f83da9 drop from segments physcopy/vircopy invocations
. sys_vircopy always uses D for both src and dst
	. sys_physcopy uses PHYS_SEG if and only if corresponding
	  endpoint is NONE, so we can derive the mode (PHYS_SEG or D)
	  from the endpoint arg in the kernel, dropping the seg args
	. fields in msg still filled in for backwards compatability,
	  using same NONE-logic in the library
2012-06-18 12:28:40 +00:00
Ben Gras 2bfeeed885 drop segment from safecopy invocations
. all invocations were S or D, so can safely be dropped
	  to prepare for the segmentless world
	. still assign D to the SCP_SEG field in the message
	  to make previous kernels usable
2012-06-16 16:22:51 +00:00
Ben Gras 85ff5a947e dumpcore: use ptrace function to trigger a coredump
. dumpcore currently relies on minix segments
	. also ptrace dumpcore fix
2012-06-15 12:13:50 +02:00
Ben Gras 769af57274 further libexec generalization
. new mode for sys_memset: include process so memset can be
	  done in physical or virtual address space.
	. add a mode to mmap() that lets a process allocate uninitialized
	  memory.
	. this allows an exec()er (RS, VFS, etc.) to request uninitialized
	  memory from VM and selectively clear the ranges that don't come
	  from a file, leaving no uninitialized memory left for the process
	  to see.
	. use callbacks for clearing the process, clearing memory in the
	  process, and copying into the process; so that the libexec code
	  can be used from rs, vfs, and in the future, kernel (to load vm)
	  and vm (to load boot-time processes)
2012-06-07 15:15:02 +02:00
Ben Gras 040362e379 exec() cleanup, generalization, improvement
. make exec() callers (i.e. vfs and rs) determine the
	  memory layout by explicitly reserving regions using
	  mmap() calls on behalf of the exec()ing process,
	  i.e. handling all of the exec logic, thereby eliminating
	  all special exec() knowledge from VM.
	. the new procedure is: clear the exec()ing process
	  first, then call third-party mmap()s to reserve memory, then
	  copy the executable file section contents in, all using callbacks
	  tailored to the caller's way of starting an executable
	. i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM
	  as with rigid 2-section arguments
	. this naturally allows generalizing exec() by simply loading
	  all ELF sections
	. drop/merge of lots of duplicate exec() code into libexec
	. not copying the code sections to vfs and into the executable
	  again is a measurable performance improvement (about 3.3% faster
	  for 'make' in src/servers/)
2012-06-07 15:15:01 +02:00
Ben Gras 41b869d4d6 drop aout support
justification: soon we won't be able to execute sep I&D aouts at
all (because of the vanishing segments), which was the default mode
to generate them so most binaries will be sep I&D.

this makes the vfs/rs exec() unification work simpler.

after unification, common I&D aout could be added back quite simply.
2012-06-07 12:43:16 +02:00
David van Moolenbroek 1817f7fc07 VFS: fix "process already free" panic on reboot
Reported by Claudiu Dan Gheorghe, debugged by Thomas and myself
2012-05-02 17:42:50 +02:00
Thomas Veerman 068d443d12 VFS: unlock vmnt when out of vnodes 2012-04-27 08:51:13 +00:00
Thomas Veerman b6ff38065f VFS: release what can be released
Only attempt to release blocked processes that are blocked. There is
no use in trying to find more blocked processes than we know that are
blocked (on a pipe).
2012-04-27 08:51:02 +00:00
Thomas Veerman 7b81254069 VFS: simplify stat for pipes
According to POSIX the st_size field of struct stat is undefined for
fifos and anonymous pipes. Thus we can do anything we want. We save a
copy by not being accurate on pipe sizes.
2012-04-27 08:50:49 +00:00
Thomas Veerman db8198d99d VFS: use S_IS* macros 2012-04-27 08:49:38 +00:00
Thomas Veerman 96bbc5da3e VFS: I_PIPE is redundant
Also, use S_IS* macros instead of manual comparison.
2012-04-27 08:49:38 +00:00
Ben Gras 755102d67f AT_SUN_EXECNAME support
. vfs: pass execname in aux vectors
	. ld.elf_so: use this to expand $ORIGIN
	. this requires the executable to reserve more
	  space at exec() calling time
2012-04-26 13:32:39 +02:00
David van Moolenbroek 26f817243b VFS: reimplement truncate mtime/ctime fix
POSIX mandates that a file's modification and change time be left
untouched upon truncate/ftruncate iff the file size does not change.
However, an open(O_TRUNC) call must always update the modification and
change time of the file, even if it was already zero-sized. VFS uses
the file systems' truncate call to implement O_TRUNC. This patch
replaces git-255ae85, which did not take into account the open case.
The size check is now moved into VFS, so that individual file systems
need not check for this case anymore.
2012-04-20 11:35:59 +02:00
Ben Gras 3945cfbfd3 block ioctls: pass request number 2012-04-18 11:01:15 +02:00
Ben Gras 53002f6f6c recognize and execute dynamically linked executables
. generalize libexec slightly to get some more necessary information
	  from ELF files, e.g. the interpreter
	. execute dynamically linked executables when exec()ed by VFS
	. switch to netbsd variant of elf32.h exclusively, solves some
	  conflicting headers
2012-04-16 00:41:42 +00:00
Thomas Veerman 26ec619a30 VFS: fix filp reuse race
Pipes consist of two filps (read filp and write filp) and a shared
vnode. When the writer leaves the filp reference count drops to
zero and subsequent find_filp()s should not find the filp when a
reader looks for it and the reader gets EOF. However, the pipe()
system call tries to find two filps, marks them in use, and only
after a successful node creation on PFS, overwrites the shared
vnode with the new vnode. Consequently, this leaves a small window
where a just closed 'pipe write filp' gets reused and marked as
present, before becoming the actual new 'pipe write filp' for a new
pipe. A reader for the old pipe will think a writer is present and
wait for that writer to write something or to leave; both actions
should revive the suspended reader. This will never happen and the
reader will be stuck forever.
2012-04-13 13:22:57 +00:00
Thomas Veerman e292ba487e VFS: more three-level-lock sanity checking 2012-04-13 13:22:42 +00:00
Thomas Veerman 933120b0b1 VFS: add getting active threads control msg 2012-04-13 13:21:01 +00:00
Thomas Veerman e1a73469c8 VFS: remove debug print 2012-04-13 13:20:28 +00:00
Thomas Veerman c2bb739760 VFS: let know when skipping reply 2012-04-13 13:19:45 +00:00
Thomas Veerman 91a38b6d4e VFS: fix dead lock
When running out of worker threads to handle device replies a dead
lock resolver thread is used. However, it was only used for FS
endpoints; it is now used for "system processes" (drivers and FS
endpoints). Also, drivers were marked as system process when they
were not "forced" to map (i.e., mapping was done before endpoint was
alive).
2012-04-13 13:19:10 +00:00
Thomas Veerman b956493367 VFS: fix new signed/unsigned comparisons 2012-04-13 13:00:11 +00:00
Thomas Veerman defe329519 VFS: warnings are errors 2012-04-13 12:59:32 +00:00
Thomas Veerman 0d63d9e125 VFS: enable sending control messages 2012-04-13 12:54:55 +00:00
Thomas Veerman f571466c56 VFS: find job only if request is an transaction 2012-04-13 12:52:52 +00:00
Thomas Veerman 8f55767619 VFS: make m_in job local
By making m_in job local (i.e., each job has its own copy of m_in instead
of refering to the global m_in) we don't have to store and restore m_in
on every thread yield. This reduces overhead. Moreover, remove the
assumption that m_in is preserved. Do_XXX functions have to copy the
system call parameters as soon as possible and only pass those copies to
other functions.

Furthermore, this patch cleans up some code and uses better types in a lot
of places.
2012-04-13 12:50:38 +00:00
Ben Gras 1e2b3f4326 vfs: more regions for coredumps 2012-04-12 14:29:59 +02:00
Ben Gras 204ae72525 retire _ANSI and <minix/ansi.h> 2012-03-25 21:58:27 +02:00
Ben Gras 7336a67dfe retire PUBLIC, PRIVATE and FORWARD 2012-03-25 21:58:14 +02:00
Ben Gras 6a73e85ad1 retire _PROTOTYPE
. only good for obsolete K&R support
	. also remove a stray ansi.h and the proto cmd
2012-03-25 16:17:10 +02:00
David van Moolenbroek e8d2d2f6b6 libminc-related updates
- add files needed for acpi, ahci, fbd, vfs to libminc
- remove "-lc" from their respective makefiles
- remove setenv from libminc (requires initialization)
2012-03-12 23:16:45 +01:00
Tomas Hruby 72b7abd1a1 VFS - no CANCEL for async non-blocking operations
- if an operation (R, W, IOCTL) is non blocking, a flag is set
  and sent to the device.

- nothing changes for sync devices

- asyn devices should reply asap if an operation is non-blocking.
  We must trust the devices, but we had to trust them anyway to
  reply to CANCEL correctly

- we safe sending CANCEL commands to asyn devices. This greatly
  simplifies the protocol. Asynchronous devices can always reply
  when a reply is ready and do not need to deal with other
  situations

- currently, none of our drivers use the flags since they drive
  virtual devices which do not block
2012-03-02 15:44:48 +00:00
Tomas Hruby f19d8df184 VFS : simplification of handling asyn selects
- select_request_async() returns no ops by default

- wantops in do_select() always set correctly, do_select() does
  not need a special case for SUSPEND (and ugly code)
2012-03-02 15:44:48 +00:00
Tomas Hruby 369a12704f VFS - dev_style_asyn()
- dev_style_asyn() tests whether a device is asynchronous

 - simplifies code and helps readability
2012-03-02 15:44:47 +00:00
Tomas Hruby 35eb88461d VFS - cancel_nblock()
- duplicate code in dev_io() which sends CANCEL in case of a
  non-blocking operation moved to cancel_nblock()
2012-03-02 15:44:47 +00:00
Thomas Veerman 1efb51b1de VFS: improve crashed FS resource cleanup
When VFS detects that an FS has crashed and tries to clean up
resources, it marks fairly late in the process that a vmnt is not
to be used again (to send requests to). This allows a thread to
become blocked on a vmnt after all blocked threads were stopped, but
before it finds out it shouldn't try to send to that vmnt.
2012-02-22 13:54:35 +00:00
Thomas Veerman 5ff845212e VFS: remove unused variables 2012-02-21 10:21:05 +00:00
Thomas Veerman 0c1cd8720a VFS: fix last_dir not returning last directory
If the provided path was only a single component (i.e., without
slashes), then last_dir would return early and skip the symlink
detection (i.e., check whether the path ends in a symlink and resolve
that first before returning). This bug triggered an assert in open
which expects that an advance after an last_dir (with VMNT_WRITE lock)
does not yield another vmnt lock.
2012-02-21 10:21:05 +00:00
Thomas Veerman 230ea1ce13 VFS: remove erroneous assert
The assert was meant as an additional check to the assert in link.c:198.
The reasoning behind the assert in link.c:198 is that once you've
obtained a write lock on a vmnt, you can't get an additional read lock
on the same vmnt. However, that does not always hold for the assert in
path.c:281 where the situation could be that you've obtained a read lock
and managed to get another read lock (this is possible). In other words,
the assert in path.c:281 is not the right place to check for that
situation.
2012-02-20 09:17:42 +00:00
Thomas Veerman c540bcb001 VFS: various select fixes
- Fix locking bug when unable to send DEV_SELECT request. Upon failure
  VFS tried to cancel the select operation, but this failed due to trying
  to lock a filp that was already locked to send the request in the first
  place. Do_select_request now handles locking of filps itself instead of
  relying on the caller to do it.  This fixes a crash when killing INET.
- Fix failure to revive a process after a non-blocking select operation
  yielded no ready select operations when replying DEV_SEL_REPL1.
- Improve readability by using OK, SUSPEND, and standard error values as
  results instead of having separate macros in select.
- Don't print not having a driver for a major device; after killing a driver
  select will trigger this printf.
2012-02-17 21:09:07 +00:00
Arun Thomas ff56906879 Remove obsolete INSTALLFLAGS from makefiles 2012-02-16 23:26:38 +01:00
Ben Gras 2fe8fb192f Full switch to clang/ELF. Drop ack. Simplify.
There is important information about booting non-ack images in
docs/UPDATING. ack/aout-format images can't be built any more, and
booting clang/ELF-format ones is a little different. Updating to the
new boot monitor is recommended.

Changes in this commit:

	. drop boot monitor -> allowing dropping ack support
	. facility to copy ELF boot files to /boot so that old boot monitor
	  can still boot fairly easily, see UPDATING
	. no more ack-format libraries -> single-case libraries
	. some cleanup of OBJECT_FMT, COMPILER_TYPE, etc cases
	. drop several ack toolchain commands, but not all support
	  commands (e.g. aal is gone but acksize is not yet).
	. a few libc files moved to netbsd libc dir
	. new /bin/date as minix date used code in libc/
	. test compile fix
	. harmonize includes
	. /usr/lib is no longer special: without ack, /usr/lib plays no
	  kind of special bootstrapping role any more and bootstrapping
	  is done exclusively through packages, so releases depend even
	  less on the state of the machine making them now.
	. rename nbsd_lib* to lib*
	. reduce mtree
2012-02-14 14:52:02 +01:00
Thomas Veerman 80c4685324 VFS: replace VFS with AVFS 2012-02-13 16:53:21 +00:00
Thomas Veerman 4498750810 libchardriver: fix open reply for async devices 2012-02-09 14:17:54 +00:00
Thomas Veerman 1fc399a5c1 Add permission test for bind and socket
Also, apply forbidden patch to VFS from AVFS (fixes hanging test56 if
it has the permission test).
2012-01-30 15:16:20 +00:00
Thomas Veerman 0bd011affd PM: extend srv_fork to set a specific UID
Currently, all servers and drivers run as root as they are forks of
RS. srv_fork now tells PM with which credentials to run the resulting
fork. Subsequently, PM lets VFS now as well.

This patch also fixes the following bugs:
 - RS doesn't initialize the setugid variable during exec, causing the
   servers and drivers to run setuid rendering the srv_fork extension
   useless.
 - PM erroneously tells VFS to run processes setuid. This doesn't
   actually lead to setuid processes as VFS sets {r,e}uid and {r,e}gid
   properly before checking PM's approval.
2012-01-30 15:16:19 +00:00
David van Moolenbroek c89aaf7a87 vfs/avfs: renumber stat calls so as to be unique
The old stat call numbers are still supported for a while.
2012-01-14 00:27:07 +01:00
David van Moolenbroek 2c685f34e0 Cut PM out of the adddma/deldma/getdma call path 2012-01-14 00:27:06 +01:00
David van Moolenbroek 8cb7ba7951 Remove obsolete PROCSTAT/getsigset call. 2012-01-14 00:27:06 +01:00
Ben Gras 34a8901eb8 vfs,avfs: verify an interpreter was found on #! line
. if not, NULL *interp is dereferenced
2011-12-21 23:44:13 +01:00
David van Moolenbroek 6f374faca5 Add "expected size" parameter to getsysinfo()
This patch provides basic protection against damage resulting from
differently compiled servers blindly copying tables to one another.
In every getsysinfo() call, the caller is provided with the expected
size of the requested data structure. The callee fails the call if
the expected size does not match the data structure's actual size.
2011-12-11 22:34:14 +01:00
David van Moolenbroek 9701e9dfd2 Servers: cleanup of some gcc -W warnings 2011-12-11 22:33:37 +01:00
Thomas Veerman 0a61519eea Provide core dumping support for AVFS 2011-12-08 10:47:11 +00:00
David van Moolenbroek 9221586f37 vfs/avfs: req_newdriver should use fs_sendrec
Using sendrec directly only results in problems. While it is not
clear whether using fs_sendrec is the best option, it is at least
an improvement.

Also remove some legacy cruft.
2011-12-05 16:28:09 +01:00
David van Moolenbroek db087efac4 VFS/FS: REQ_NEW_DRIVER now provides a label 2011-11-30 19:05:26 +01:00
Thomas Veerman b4fb061802 Implement issetugid syscall
Implement issetugid syscall and provide a test. This gets rid of the
scary "Unsecure. Implement me" warning during compilation.
2011-11-28 10:03:43 +00:00
David van Moolenbroek a9f89a7290 vfs/avfs: map O_ACCMODE to R_BIT|W_BIT on recovery 2011-11-24 13:57:36 +01:00
David van Moolenbroek b4d909d415 Split block/character protocols and libdriver
This patch separates the character and block driver communication
protocols. The old character protocol remains the same, but a new
block protocol is introduced. The libdriver library is replaced by
two new libraries: libchardriver and libblockdriver. Their exposed
API, and drivers that use them, have been updated accordingly.
Together, libbdev and libblockdriver now completely abstract away
the message format used by the block protocol. As the memory driver
is both a character and a block device driver, it now implements its
own message loop.

The most important semantic change made to the block protocol is that
it is no longer possible to return both partial results and an error
for a single transfer. This simplifies the interaction between the
caller and the driver, as the I/O vector no longer needs to be copied
back. Also, drivers are now no longer supposed to decide based on the
layout of the I/O vector when a transfer should be cut short. Put
simply, transfers are now supposed to either succeed completely, or
result in an error.

After this patch, the state of the various pieces is as follows:
- block protocol: stable
- libbdev API: stable for synchronous communication
- libblockdriver API: needs slight revision (the drvlib/partition API
  in particular; the threading API will also change shortly)
- character protocol: needs cleanup
- libchardriver API: needs cleanup accordingly
- driver restarts: largely unsupported until endpoint changes are
  reintroduced

As a side effect, this patch eliminates several bugs, hacks, and gcc
-Wall and -W warnings all over the place. It probably introduces a
few new ones, too.

Update warning: this patch changes the protocol between MFS and disk
drivers, so in order to use old/new images, the MFS from the ramdisk
must be used to mount all file systems.
2011-11-23 14:06:37 +01:00
David van Moolenbroek 1e1db53986 Introduce sys_getregs call, and let vfs use it 2011-11-22 02:07:33 +01:00
Adriana Szekeres c30f014a89 gcore command to coredump a process 2011-11-22 22:07:41 +01:00
Adriana Szekeres eaa29370f4 ELF core files 2011-11-22 22:07:40 +01:00
David van Moolenbroek 0bb27bb0b1 Servers: remove ABI comment 2011-11-07 22:24:59 +01:00
David van Moolenbroek b02c260ecb Miscellaneous legacy cleanup 2011-11-07 22:20:55 +01:00
Thomas Veerman 203937456e Fix off-by-one errors and increase PATH_MAX to 1024
In some places it was assumed that PATH_MAX does not include a
terminating null character.

Increases PATH_MAX to 1024 to get in sync with NetBSD. Required some
rewriting in AVFS to keep memory usage low (the stack in use by a thread
is very small).
2011-09-12 09:00:24 +00:00
Thomas Veerman d4b72e81b2 Cleanup servers to make GCC/Clang a little happier 2011-09-08 13:57:03 +00:00
Thomas Veerman 8a266a478e Increase gid_t and uid_t to 32 bits
Increase gid_t and uid_t to 32 bits and provide backwards compatibility
where needed.
2011-09-05 13:56:14 +00:00
Arun Thomas 86b061078b Build gcov code only if MKCOVERAGE is yes 2011-08-09 10:39:33 +02:00
Ben Gras c4ea2a195c getsid() implementation 2011-08-02 22:16:59 +02:00
Thomas Veerman ece4c9d565 Add DEV_CLONE_A dev type 2011-07-27 12:23:03 +00:00
Arun Thomas 530bd5d486 vfs/rs: for ELF, sep_id should be 0 2011-07-26 15:21:07 +02:00
Thomas Veerman 902e0e27e0 Don't panic if owner has vanished before reply 2011-07-15 14:11:34 +00:00
Evgeniy Ivanov ef0a265086 New stat structure.
* VFS and installed MFSes must be in sync before and after this change *

Use struct stat from NetBSD. It requires adding new STAT, FSTAT and LSTAT
syscalls. Libc modification is both backward and forward compatible.

Also new struct stat uses modern field sizes to avoid ABI
incompatibility, when we update uid_t, gid_t and company.
Exceptions are ino_t and off_t in old libc (though paddings added).
2011-07-12 16:39:55 +02:00
Ben Gras a9d15dd3e4 pm, vfs: don't print something for bogus calls 2011-07-05 13:21:48 +02:00
Ben Gras 86a226680b vfs: don't SUSPEND for unknown calls
. returning ENOSYS helps for implementing
	  new calls with forwards compatability
2011-07-02 17:19:13 +02:00
Arun Thomas 93ae43f577 boot: Add multiboot support
Not yet fully spec-compliant; work in progress
2011-06-24 17:21:51 +02:00
Gianluca Guida cc17b27a2b Build NetBSD libc library in world in ELF mode.
3 sets of libraries are built now:
  . ack: all libraries that ack can compile (/usr/lib/i386/)
  . clang+elf: all libraries with minix headers (/usr/lib/)
  . clang+elf: all libraries with netbsd headers (/usr/netbsd/)

Once everything can be compiled with netbsd libraries and headers, the
/usr/netbsd hierarchy will be obsolete and its libraries compiled with
netbsd headers will be installed in /usr/lib, and its headers
in /usr/include. (i.e. minix libc and current minix headers set
will be gone.)

To use the NetBSD libc system (libraries + headers) before
it is the default libc, see:
   http://wiki.minix3.org/en/DevelopersGuide/UsingNetBSDCode
This wiki page also documents the maintenance of the patch
files of minix-specific changes to imported NetBSD code.

Changes in this commit:
  . libsys: Add NBSD compilation and create a safe NBSD-based libc.
  . Port rest of libraries (except libddekit) to new header system.
  . Enable compilation of libddekit with new headers.
  . Enable kernel compilation with new headers.
  . Enable drivers compilation with new headers.
  . Port legacy commands to new headers and libc.
  . Port servers to new headers.
  . Add <sys/sigcontext.h> in compat library.
  . Remove dependency file in tree.
  . Enable compilation of common/lib/libc/atomic in libsys
  . Do not generate RCSID strings in libc.
  . Temporarily disable zoneinfo as they are incompatible with NetBSD format
  . obj-nbsd for .gitignore
  . Procfs: use only integer arithmetic. (Antoine Leca)
  . Increase ramdisk size to create NBSD-based images.
  . Remove INCSYMLINKS handling hack.
  . Add nbsd_include/sys/exec_elf.h
  . Enable ELF compilation with NBSD libc.
  . Add 'make nbsdsrc' in tools to download reference NetBSD sources.
  . Automate minix-port.patch creation.
  . Avoid using fstavfs() as it is *extremely* slow and unneeded.
  . Set err() as PRIVATE to avoid name clash with libc.
  . [NBSD] servers/vm: remove compilation warnings.
  . u32 is not a long in NBSD headers.
  . UPDATING info on netbsd hierarchy
  . commands fixes for netbsd libc
2011-06-24 11:46:30 +02:00
Ben Gras a77c2973b3 fix clang warnings -R in kernel/ and servers/ 2011-06-09 16:09:13 +02:00
Ben Gras 674cd6fd48 larger i/o buffer for exec()
. makes exec() for large executables (e.g. clang, gcc)
    significantly faster

Thanks to Antoine Leca.
2011-05-12 19:12:28 +02:00
Thomas Veerman aba392e630 Clean up and fix multiple bugs in select:
- Remove redundant code.
 - Always wait for the initial reply from an asynchronous select request,
   even if the select has been satisfied on another file descriptor or
   was canceled due to a serious error.
 - Restart asynchronous selects if upon reply from the driver turns out
   that there are deferred operations (and do not forget we're still
   interested in the results of the deferred operations).
 - Do not hang a non-blocking select when another blocking select on
   the same filp is still blocking.
 - Split blocking operations in read, write, and exceptions (i.e.,
   blocking on read does not imply the write will block as well).
 - Some loops would iterate over OPEN_MAX file descriptors instead of
   the "highest" file descriptor.
 - Use proper internal error return values.
 - A secondary reply from a synchronous driver is essentially the same
   as from an asynchronous driver (the only difference being how the 
   answer is received). Merge.
 - Return proper error code after a driver failure.
 - Auto-detect whether a driver is synchronous or asynchronous.
 - Remove some code duplication.
 - Clean up code (coding style, add missing comments, put all select
   related code together).
2011-04-13 13:25:34 +00:00
Thomas Veerman f0740680cd Do not print an error message when a binary is corrupt 2011-04-12 13:09:19 +00:00
David van Moolenbroek c51cd5fe91 Server/driver protocols: no longer allow third-party copies.
Before safecopies, the IO_ENDPT and DL_ENDPT message fields were needed
to know which actual process to copy data from/to, as that process may
not always be the caller. Now that we have full safecopy support, these
fields have become useless for that purpose: the owner of the grant is
*always* the caller. Allowing the caller to supply another endpoint is
in fact dangerous, because the callee may then end up using a grant
from a third party. One could call this a variant of the confused
deputy problem.

From now on, safecopy calls should always use the caller's endpoint as
grant owner. This fully obsoletes the DL_ENDPT field in the
inet/ethernet protocol. IO_ENDPT has other uses besides identifying the
grant owner though. This patch renames IO_ENDPT to USER_ENDPT, not only
because that is a more fitting name (it should never be used for I/O
after all), but also in order to intentionally break any old system
source code outside the base system. If this patch breaks your code,
fixing it is fairly simple:

- DL_ENDPT should be replaced with m_source;
- IO_ENDPT should be replaced with m_source when used for safecopies;
- IO_ENDPT should be replaced with USER_ENDPT for any other use, e.g.
  when setting REP_ENDPT, matching requests in CANCEL calls, getting
  DEV_SELECT flags, and retrieving of the real user process's endpoint
  in DEV_OPEN.

The changes in this patch are binary backward compatible.
2011-04-11 17:35:05 +00:00
Arun Thomas cd9b4b46f4 libexec: return physaddr info from ELF headers 2011-04-07 12:22:36 +00:00
David van Moolenbroek 28f2a169da VFS: bugfixes for handling block-special files:
- on driver restarts, reopen devices on a per-file basis, not per-mount
- do not assume that there is just one vnode per block-special device
- update block-special files in the uncommon mounting success paths, too
- upon mount, sync but also invalidate affected buffers on the root FS
- upon unmount, check whether a vnode is in use before updating it
2011-03-25 10:56:43 +00:00
Erik van der Kouwe 36f9c1155a Restart process after response from async driver on non-blocking select 2011-02-23 10:27:48 +00:00
Ben Gras 287fee89cb add NOASSERTS make flag that disables assert()s (NDEBUG=1).
. made some checks in vfs/vnode.c also respond to NDEBUG=1.
  . turned on in release builds
2011-02-16 18:58:30 +00:00
Ben Gras dc1cc91df1 <ansi.h> -> <minix/ansi.h> 2011-01-28 11:35:02 +00:00
Ben Gras f0f34dd8d9 vfs - use a static buffer instead of malloc()+free(), solving
recently appeared ENOMEM problems during exec().
2010-12-15 14:43:59 +00:00
Arun Thomas 372b873413 VFS/RS support for ELF 2010-12-10 09:27:56 +00:00
Arun Thomas cc26fb5ec4 vfs: terminate string in rdlink_direct
Fixes test56 when compiled with GCC.
2010-12-01 16:24:50 +00:00
Dirk Vogt 5e1e763506 removed unneeded global var 2010-11-24 16:30:13 +00:00
Dirk Vogt 9ed280d1ec decouple file system server start/termination from mount/umount 2010-11-23 19:34:56 +00:00
Arun Thomas f0ab18377d GCC/clang: int64 routines in C 2010-11-12 18:38:10 +00:00
Erik van der Kouwe 9235536f38 Fix select-related bugs: missing cancellations led to potentially forgetting notifies, especially in the case of async drivers 2010-10-08 12:50:52 +00:00
David van Moolenbroek 354da24f5b make getsysinfo() a system-land call 2010-09-14 21:50:05 +00:00
Thomas Veerman 13ef7f1f38 Prepare VFS to support back calls from PFS. For security reasons and to support
file descriptor passing, PFS does some back calls to VFS. For example, to
verify the validity of a path provided by a process and to tell VFS it must
copy file descriptors from one process to another.
2010-08-30 13:44:07 +00:00
Ben Gras 5d6c2aae0a gcov support, based on work contributed by Anton Kuijsten. 2010-08-25 13:06:43 +00:00
Thomas Veerman c8cfcab5db - Make sure there's space left in the vmnt table for another mount point.
- Increase mount point limit.
2010-08-17 10:02:50 +00:00
Ben Gras 3badab8b70 vfs - split fp_fd field into fd + callnr fields 2010-07-22 14:55:28 +00:00
Erik van der Kouwe 739f2d7536 Fix comment 2010-07-15 14:47:08 +00:00
Thomas Veerman 5aff633a0b Make RS and VFS aware of new UDS major. Contributed by Thomas Cort 2010-07-15 13:51:38 +00:00
David van Moolenbroek 895850b8cf move timers code to libsys 2010-07-09 12:58:18 +00:00
Thomas Veerman 34a2864e27 Fix a few compile time warnings 2010-07-02 12:41:19 +00:00
Arun Thomas c0c8d25799 Rename mkfiles from minix.*.mk to bsd.*.mk
Makes things easier for pkgsrc
2010-06-25 18:29:09 +00:00
Erik van der Kouwe c0dfa2f3f1 Get rid of asynsend backup copy in VFS 2010-06-25 14:57:54 +00:00
Erik van der Kouwe 498d7d8a4c Don't use kernel responses in servers 2010-06-24 07:37:26 +00:00
Ben Gras fc01683584 include, vfs: statvfs, fstatvfs calls, contributed by Buccapatnam Tirumala, Gautam. 2010-06-23 23:53:50 +00:00
Ben Gras 19b790eb53 vfs: don't use a mountpoint if it's in use for anything else.
(this avoids data structure confusion if a mountpoint is reused as
a mountpoint until that's properly fixed.)
2010-06-11 11:41:56 +00:00
Arun Thomas 1bf6d23f34 Make exec() use entry point in a.out header 2010-06-10 14:59:10 +00:00
Arun Thomas f0a158d8c1 More cleanup to remove MM and FS references 2010-06-10 14:04:46 +00:00
Kees van Reeuwijk 826b9590f2 More endpoint_t correctness.
More const correctness.
Other code cleanup.
2010-06-08 14:09:18 +00:00
Arun Thomas 4c10a31440 Remove legacy MM, FS, and FS_PROC_NR macros 2010-06-08 13:58:01 +00:00
Thomas Veerman 6bbcab3ec4 Clean up MFS a bit:
- Remove unused includes.
 - Add include guards to headers.
 - Use unsigned variables in case they're never going to hold a negative
   value. This causes GCC's complaints to disappear and should make flexelint
   a lot happier, too.
 - Make functions private when they're used only within a module.
 - Remove unused variables.
 - Add casts where appropriate.
2010-06-01 12:35:33 +00:00
Tomas Hruby 6e25ad8b0a Use of all NIL_* defines converted to NULL 2010-05-10 13:26:00 +00:00
Thomas Veerman 0aceb25535 Small cleanup of dead and/or redundant code. 2010-05-06 09:32:40 +00:00
Thomas Veerman f9317dc039 Scan all processes for that might be blocked on a lock 2010-04-28 11:54:22 +00:00
Ben Gras 94edf4fa12 vfs: start at vmnt[0] to sync mounted filesystems, not vmnt[1]. 2010-04-26 17:12:34 +00:00
Kees van Reeuwijk 86a23c1fbd Remove U16_t and most other similar types. Rewrite functions to ansi-style
declaration if necessary.
2010-04-21 11:05:22 +00:00
Kees van Reeuwijk bc314bda91 Remove the types Dev_t, _mnx_Gui, _mnx_Uid, and similar.
Use ANSI-style function declarations where necessary.
2010-04-13 10:58:41 +00:00
Cristiano Giuffrida 66a8efba53 Fixed escape warning. 2010-04-12 08:39:59 +00:00
Cristiano Giuffrida 65ef539739 Driver mapping refactory.
VFS CHANGES:
- dmap table no longer statically initialized in VFS
- Dropped FSSIGNON svrctl call no longer used by INET

INET CHANGES:
- INET announces its presence to VFS just like any other driver

RS CHANGES:
- The boot image dev table contains all the data to initialize VFS' dmap table
- RS interface supports asynchronous up and update operations now
- RS interface extended to support driver style and flags
2010-04-09 21:56:44 +00:00
Cristiano Giuffrida 48c6bb79f4 Driver refactory for live update and crash recovery.
SYSLIB CHANGES:
- DS calls to publish / retrieve labels consider endpoints instead of u32_t.

VFS CHANGES:
- mapdriver() only adds an entry in the dmap table in VFS.
- dev_up() is only executed upon reception of a driver up event.

INET CHANGES:
- INET no longer searches for existing drivers instances at startup.
- A newtwork driver is (re)initialized upon reception of a driver up event.
- Networking startup is now race-free by design. No need to waste 5 seconds
at startup any more.

DRIVER CHANGES:
- Every driver publishes driver up events when starting for the first time or
in case of restart when recovery actions must be taken in the upper layers.
- Driver up events are published by drivers through DS. 
- For regular drivers, VFS is normally the only subscriber, but not necessarily.
For instance, when the filter driver is in use, it must subscribe to driver
up events to initiate recovery.
- For network drivers, inet is the only subscriber for now.
- Every VFS driver is statically linked with libdriver, every network driver
is statically linked with libnetdriver.

DRIVER LIBRARIES CHANGES:
- Libdriver is extended to provide generic receive() and ds_publish() interfaces
for VFS drivers.
- driver_receive() is a wrapper for sef_receive() also used in driver_task()
to discard spurious messages that were meant to be delivered to a previous
version of the driver.
- driver_receive_mq() is the same as driver_receive() but integrates support
for queued messages.
- driver_announce() publishes a driver up event for VFS drivers and marks
the driver as initialized and expecting a DEV_OPEN message.
- Libnetdriver is introduced to provide similar receive() and ds_publish()
interfaces for network drivers (netdriver_announce() and netdriver_receive()).
- Network drivers all support live update with no state transfer now.

KERNEL CHANGES:
- Added kernel call statectl for state management. Used by driver_announce() to
unblock eventual callers sendrecing to the driver.
2010-04-08 13:41:35 +00:00
Kees van Reeuwijk 94a81c840a Removed unused variables, added const where possible. 2010-04-07 11:25:51 +00:00
Kees van Reeuwijk fc7dced1fa Fix printfs with too few or too many parms, remove unused vars, fix incorrect flag tests, other code cleanup. 2010-04-01 13:25:05 +00:00
Thomas Veerman 4d686f1616 Move allocation of temporary inodes for cloned character special devices from
MFS to PFS.
2010-03-30 15:00:09 +00:00
Ben Gras bc0e36f402 fix null deref; vmnt->mounted_on is NULL legitimately for root.
changed check+panic to assert().

added assert().
2010-03-29 11:39:54 +00:00
Arun Thomas 436d6012a3 Convert drivers/ and servers/ over to bsdmake
-Move libdriver to lib/
-Install all boot image services on filesystem to aid restartability
2010-03-22 21:25:22 +00:00
Kees van Reeuwijk c33102ea6b Miscellaneous code cleanup. 2010-03-22 20:43:06 +00:00
Cristiano Giuffrida cb176df60f New RS and new signal handling for system processes.
UPDATING INFO:
20100317:
        /usr/src/etc/system.conf updated to ignore default kernel calls: copy
        it (or merge it) to /etc/system.conf.
        The hello driver (/dev/hello) added to the distribution:
        # cd /usr/src/commands/scripts && make clean install
        # cd /dev && MAKEDEV hello

KERNEL CHANGES:
- Generic signal handling support. The kernel no longer assumes PM as a signal
manager for every process. The signal manager of a given process can now be
specified in its privilege slot. When a signal has to be delivered, the kernel
performs the lookup and forwards the signal to the appropriate signal manager.
PM is the default signal manager for user processes, RS is the default signal
manager for system processes. To enable ptrace()ing for system processes, it
is sufficient to change the default signal manager to PM. This will temporarily
disable crash recovery, though.
- sys_exit() is now split into sys_exit() (i.e. exit() for system processes,
which generates a self-termination signal), and sys_clear() (i.e. used by PM
to ask the kernel to clear a process slot when a process exits).
- Added a new kernel call (i.e. sys_update()) to swap two process slots and
implement live update.

PM CHANGES:
- Posix signal handling is no longer allowed for system processes. System
signals are split into two fixed categories: termination and non-termination
signals. When a non-termination signaled is processed, PM transforms the signal
into an IPC message and delivers the message to the system process. When a
termination signal is processed, PM terminates the process.
- PM no longer assumes itself as the signal manager for system processes. It now
makes sure that every system signal goes through the kernel before being
actually processes. The kernel will then dispatch the signal to the appropriate
signal manager which may or may not be PM.

SYSLIB CHANGES:
- Simplified SEF init and LU callbacks.
- Added additional predefined SEF callbacks to debug crash recovery and
live update.
- Fixed a temporary ack in the SEF init protocol. SEF init reply is now
completely synchronous.
- Added SEF signal event type to provide a uniform interface for system
processes to deal with signals. A sef_cb_signal_handler() callback is
available for system processes to handle every received signal. A
sef_cb_signal_manager() callback is used by signal managers to process
system signals on behalf of the kernel.
- Fixed a few bugs with memory mapping and DS.

VM CHANGES:
- Page faults and memory requests coming from the kernel are now implemented
using signals.
- Added a new VM call to swap two process slots and implement live update.
- The call is used by RS at update time and in turn invokes the kernel call
sys_update().

RS CHANGES:
- RS has been reworked with a better functional decomposition.
- Better kernel call masks. com.h now defines the set of very basic kernel calls
every system service is allowed to use. This makes system.conf simpler and
easier to maintain. In addition, this guarantees a higher level of isolation
for system libraries that use one or more kernel calls internally (e.g. printf).
- RS is the default signal manager for system processes. By default, RS
intercepts every signal delivered to every system process. This makes crash
recovery possible before bringing PM and friends in the loop.
- RS now supports fast rollback when something goes wrong while initializing
the new version during a live update.
- Live update is now implemented by keeping the two versions side-by-side and
swapping the process slots when the old version is ready to update.
- Crash recovery is now implemented by keeping the two versions side-by-side
and cleaning up the old version only when the recovery process is complete.

DS CHANGES:
- Fixed a bug when the process doing ds_publish() or ds_delete() is not known
by DS.
- Fixed the completely broken support for strings. String publishing is now
implemented in the system library and simply wraps publishing of memory ranges.
Ideally, we should adopt a similar approach for other data types as well.
- Test suite fixed.

DRIVER CHANGES:
- The hello driver has been added to the Minix distribution to demonstrate basic
live update and crash recovery functionalities.
- Other drivers have been adapted to conform the new SEF interface.
2010-03-17 01:15:29 +00:00
David van Moolenbroek 27d53256e4 VFS fixes:
- do not use uninitialized req_breadwrite results upon failure
- improve ".." ELEAVEMOUNT correctness check
2010-03-08 22:05:27 +00:00
Ben Gras 35a108b911 panic() cleanup.
this change
   - makes panic() variadic, doing full printf() formatting -
     no more NO_NUM, and no more separate printf() statements
     needed to print extra info (or something in hex) before panicing
   - unifies panic() - same panic() name and usage for everyone -
     vm, kernel and rest have different names/syntax currently
     in order to implement their own luxuries, but no longer
   - throws out the 1st argument, to make source less noisy.
     the panic() in syslib retrieves the server name from the kernel
     so it should be clear enough who is panicing; e.g.
         panic("sigaction failed: %d", errno);
     looks like:
         at_wini(73130): panic: sigaction failed: 0
         syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a
   - throws out report() - printf() is more convenient and powerful
   - harmonizes/fixes the use of panic() - there were a few places
     that used printf-style formatting (didn't work) and newlines
     (messes up the formatting) in panic()
   - throws out a few per-server panic() functions
   - cleans up a tie-in of tty with panic()

merging printf() and panic() statements to be done incrementally.
2010-03-05 15:05:11 +00:00
Ben Gras adf0b6fb26 No more E{SRC,DST}DIED errno's, replaced by EDEADSRCDST.
The callers don't care about the difference and had to check 3 error
codes instead of one.
2010-03-03 15:47:16 +00:00
Kees van Reeuwijk f3c98fdca2 Fixed a number of cases where a bits in an integer were tested
incorrectly, resulting in real (and nasty) bugs.
2010-03-02 12:55:39 +00:00
Kees van Reeuwijk 1597e701a0 Remove useless variables and the computations on them. 2010-02-19 10:00:32 +00:00
Arun Thomas b706112487 Incorporate bsdmake into buildsystem and reorganize libs 2010-02-16 14:41:33 +00:00
David van Moolenbroek bdd4f5857f Fixes for truncate system calls:
- VFS: check for negative sizes in all truncate calls
- VFS: update file size after truncating with fcntl(F_FREESP)
- VFS: move pos/len checks for F_FREESP with l_len!=0 from FS to VFS
- MFS: do not zero data block for small files when fully truncating
- MFS: do not write out freed indirect blocks after freeing space
- MFS: make truncate work correctly with differing zone/block sizes
- tests: add new test50 for truncate call family
2010-02-09 08:12:37 +00:00
Ben Gras 35b471ad94 removal of unused vm<->vfs code. 2010-02-03 13:35:17 +00:00
Kees van Reeuwijk 2ba237cd4e Fixed a number of uses of uninitialized variables by adding assertions
or other sanity checks, code reshuffling, or fixing broken behavior.
2010-01-27 10:23:58 +00:00
Thomas Veerman 9a7cd8e254 Pipe vnodes are always mapped. 2010-01-27 09:30:39 +00:00
Thomas Veerman ee2e57b4dc Add return statement after failed dev_open (fixes open count in at_wini) 2010-01-21 15:02:29 +00:00
Thomas Veerman ca9280e097 - Fix dangling symlink regression
- Make open(2) more POSIX compliant
- Add a test case for dangling symlinks and open() syscall with O_CREAT and
  O_EXCL on a symlink.
- Update open(2) man page to reflect change.
2010-01-21 09:32:15 +00:00
Cristiano Giuffrida c5b309ff07 Merge of Wu's GSOC 09 branch (src.20090525.r4372.wu)
Main changes:
- COW optimization for safecopy.
- safemap, a grant-based interface for sharing memory regions between processes.
- Integration with safemap and complete rework of DS, supporting new data types
  natively (labels, memory ranges, memory mapped ranges).
- For further information:
  http://wiki.minix3.org/en/SummerOfCode2009/MemoryGrants

Additional changes not included in the original Wu's branch:
- Fixed unhandled case in VM when using COW optimization for safecopy in case
  of a block that has already been shared as SMAP.
- Better interface and naming scheme for sys_saferevmap and ds_retrieve_map
  calls.
- Better input checking in syslib: check for page alignment when creating
  memory mapping grants.
- DS notifies subscribers when an entry is deleted.
- Documented the behavior of indirect grants in case of memory mapping.
- Test suite in /usr/src/test/safeperf|safecopy|safemap|ds/* reworked
  and extended.
- Minor fixes and general cleanup.
- TO-DO: Grant ids should be generated and managed the way endpoints are to make
sure grant slots are never misreused.
2010-01-14 15:24:16 +00:00
David van Moolenbroek b31119abf5 Mount updates:
- allow mounting with "none" block device
- allow unmounting by mountpoint
- make VFS aware of file system process labels
- allow m3_ca1 to use the full available message size
- use *printf in u/mount(1), as mount(2) uses it already
- fix reference leaks for some mount error cases in VFS
2010-01-12 23:08:50 +00:00
Cristiano Giuffrida d1fd04e72a Initialization protocol for system services.
SYSLIB CHANGES:
- SEF framework now supports a new SEF Init request type from RS. 3 different
callbacks are available (init_fresh, init_lu, init_restart) to specify
initialization code when a service starts fresh, starts after a live update,
or restarts.

SYSTEM SERVICE CHANGES:
- Initialization code for system services is now enclosed in a callback SEF will
automatically call at init time. The return code of the callback will
tell RS whether the initialization completed successfully.
- Each init callback can access information passed by RS to initialize. As of
now, each system service has access to the public entries of RS's system process
table to gather all the information required to initialize. This design
eliminates many existing or potential races at boot time and provides a uniform
initialization interface to system services. The same interface will be reused
for the upcoming publish/subscribe model to handle dynamic 
registration / deregistration of system services.

VM CHANGES:
- Uniform privilege management for all system services. Every service uses the
same call mask format. For boot services, VM copies the call mask from init
data. For dynamic services, VM still receives the call mask via rs_set_priv
call that will be soon replaced by the upcoming publish/subscribe model.

RS CHANGES:
- The system process table has been reorganized and split into private entries
and public entries. Only the latter ones are exposed to system services.
- VM call masks are now entirely configured in rs/table.c
- RS has now its own slot in the system process table. Only kernel tasks and
user processes not included in the boot image are now left out from the system
process table.
- RS implements the initialization protocol for system services.
- For services in the boot image, RS blocks till initialization is complete and
panics when failure is reported back. Services are initialized in their order of
appearance in the boot image priv table and RS blocks to implements synchronous
initialization for every system service having the flag SF_SYNCH_BOOT set.
- For services started dynamically, the initialization protocol is implemented
as though it were the first ping for the service. In this case, if the
system service fails to report back (or reports failure), RS brings the service
down rather than trying to restart it.
2010-01-08 01:20:42 +00:00
David van Moolenbroek ac9ab099c8 General cleanup:
- clean up kernel section of minix/com.h somewhat
- remove ALLOCMEM and VM_ALLOCMEM calls
- remove non-safecopy and minix-vmd support from Inet
- remove SYS_VIRVCOPY and SYS_PHYSVCOPY calls
- remove obsolete segment encoding in SYS_SAFECOPY*
- remove DEVCTL call, svrctl(FSDEVUNMAP), map_driverX
- remove declarations of unimplemented svrctl requests
- remove everything related to swapping to disk
- remove floppysetup.sh
- remove traces of rescue device
- update DESCRIBE.sh with new devices
- some other small changes
2010-01-05 19:39:27 +00:00
Cristiano Giuffrida 1f5841c8ed Basic System Event Framework (SEF) with ping and live update.
SYSLIB CHANGES:
- SEF must be used by every system process and is thereby part of the system
library.
- The framework provides a receive() interface (sef_receive) for system
processes to automatically catch known system even messages and process them.
- SEF provides a default behavior for each type of system event, but allows
system processes to register callbacks to override the default behavior.
- Custom (local to the process) or predefined (provided by SEF) callback
implementations can be registered to SEF.
- SEF currently includes support for 2 types of system events:
  1. SEF Ping. The event occurs every time RS sends a ping to figure out
  whether a system process is still alive. The default callback implementation
  provided by SEF is to notify RS back to let it know the process is alive
  and kicking.
  2. SEF Live update. The event occurs every time RS sends a prepare to update
  message to let a system process know an update is available and to prepare
  for it. The live update support is very basic for now. SEF only deals with
  verifying if the prepare state can be supported by the process, dumping the
  state for debugging purposes, and providing an event-driven programming
  model to the process to react to state changes check-in when ready to update.
- SEF should be extended in the future to integrate support for more types of
system events. Ideally, all the cross-cutting concerns should be integrated into
SEF to avoid duplicating code and ease extensibility. Examples include:
  * PM notify messages primarily used at shutdown.
  * SYSTEM notify messages primarily used for signals.
  * CLOCK notify messages used for system alarms.
  * Debug messages. IS could still be in charge of fkey handling but would
  forward the debug message to the target process (e.g. PM, if the user
  requested debug information about PM). SEF would then catch the message and
  do nothing unless the process has registered an appropriate callback to
  deal with the event. This simplifies the programming model to print debug
  information, avoids duplicating code, and reduces the effort to print
  debug information.

SYSTEM PROCESSES CHANGES:
- Every system process registers SEF callbacks it needs to override the default
system behavior and calls sef_startup() right after being started.
- sef_startup() does almost nothing now, but will be extended in the future to
support callbacks of its own to let RS control and synchronize with every
system process at initialization time.
- Every system process calls sef_receive() now rather than receive() directly,
to let SEF handle predefined system events.

RS CHANGES:
- RS supports a basic single-component live update protocol now, as follows:
  * When an update command is issued (via "service update *"), RS notifies the
  target system process to prepare for a specific update state.
  * If the process doesn't respond back in time, the update is aborted.
  * When the process responds back, RS kills it and marks it for refreshing.
  * The process is then automatically restarted as for a buggy process and can
  start running again.
  * Live update is currently prototyped as a controlled failure.
2009-12-21 14:12:21 +00:00
Thomas Veerman 6aa43dc9e4 Fix typo and a bug causing vnode references to become too low. 2009-12-21 09:36:34 +00:00
Thomas Veerman 958b25be50 - Introduce support for sticky bit.
- Revise VFS-FS protocol and update VFS/MFS/ISOFS accordingly.
- Clean up MFS by removing old, dead code (backwards compatibility is broken by
  the new VFS-FS protocol, anyway) and rewrite other parts. Also, make sure all
  functions have proper banners and prototypes.
- VFS should always provide a (syntactically) valid path to the FS; no need for
  the FS to do sanity checks when leaving/entering mount points.
- Fix several bugs in MFS:
  - Several path lookup bugs in MFS.
  - A link can be too big for the path buffer.
  - A mountpoint can become inaccessible when the creation of a new inode
    fails, because the inode already exists and is a mountpoint.
- Introduce support for supplemental groups.
- Add test 46 to test supplemental group functionality (and removed obsolete
  suppl. tests from test 2).
- Clean up VFS (not everything is done yet).
- ISOFS now opens device read-only. This makes the -r flag in the mount command
  unnecessary (but will still report to be mounted read-write).
- Introduce PipeFS. PipeFS is a new FS that handles all anonymous and
  named pipes. However, named pipes still reside on the (M)FS, as they are part
  of the file system on disk. To make this work VFS now has a concept of
  'mapped' inodes, which causes read, write, truncate and stat requests to be
  redirected to the mapped FS, and all other requests to the original FS.
2009-12-20 20:27:14 +00:00
David van Moolenbroek be2087ecf9 Filter driver by Wu Bingzheng et al 2009-12-02 10:08:58 +00:00
David van Moolenbroek f89388c241 Kernel, servers: remove unused proto.h definitions 2009-10-31 14:11:50 +00:00
David van Moolenbroek 6c4197f77e PM, VFS: remove unused param.h definitions 2009-10-29 13:29:04 +00:00
Ben Gras c373473f24 add prototype for wait_for() to fix compiler warning. 2009-10-05 16:43:02 +00:00
David van Moolenbroek b423d7b477 Merge of David's ptrace branch. Summary:
o Support for ptrace T_ATTACH/T_DETACH and T_SYSCALL
o PM signal handling logic should now work properly, even with debuggers
  being present
o Asynchronous PM/VFS protocol, full IPC support for senda(), and
  AMF_NOREPLY senda() flag

DETAILS

Process stop and delay call handling of PM:
o Added sys_runctl() kernel call with sys_stop() and sys_resume()
  aliases, for PM to stop and resume a process
o Added exception for sending/syscall-traced processes to sys_runctl(),
  and matching SIGKREADY pseudo-signal to PM
o Fixed PM signal logic to deal with requests from a process after
  stopping it (so-called "delay calls"), using the SIGKREADY facility
o Fixed various PM panics due to race conditions with delay calls versus
  VFS calls
o Removed special PRIO_STOP priority value
o Added SYS_LOCK RTS kernel flag, to stop an individual process from
  running while modifying its process structure

Signal and debugger handling in PM:
o Fixed debugger signals being dropped if a second signal arrives when
  the debugger has not retrieved the first one
o Fixed debugger signals being sent to the debugger more than once
o Fixed debugger signals unpausing process in VFS; removed PM_UNPAUSE_TR
  protocol message
o Detached debugger signals from general signal logic and from being
  blocked on VFS calls, meaning that even VFS can now be traced
o Fixed debugger being unable to receive more than one pending signal in
  one process stop
o Fixed signal delivery being delayed needlessly when multiple signals
  are pending
o Fixed wait test for tracer, which was returning for children that were
  not waited for
o Removed second parallel pending call from PM to VFS for any process
o Fixed process becoming runnable between exec() and debugger trap
o Added support for notifying the debugger before the parent when a
  debugged child exits
o Fixed debugger death causing child to remain stopped forever
o Fixed consistently incorrect use of _NSIG

Extensions to ptrace():
o Added T_ATTACH and T_DETACH ptrace request, to attach and detach a
  debugger to and from a process
o Added T_SYSCALL ptrace request, to trace system calls
o Added T_SETOPT ptrace request, to set trace options
o Added TO_TRACEFORK trace option, to attach automatically to children
  of a traced process
o Added TO_ALTEXEC trace option, to send SIGSTOP instead of SIGTRAP upon
  a successful exec() of the tracee
o Extended T_GETUSER ptrace support to allow retrieving a process's priv
  structure
o Removed T_STOP ptrace request again, as it does not help implementing
  debuggers properly
o Added MINIX3-specific ptrace test (test42)
o Added proper manual page for ptrace(2)

Asynchronous PM/VFS interface:
o Fixed asynchronous messages not being checked when receive() is called
  with an endpoint other than ANY
o Added AMF_NOREPLY senda() flag, preventing such messages from
  satisfying the receive part of a sendrec()
o Added asynsend3() that takes optional flags; asynsend() is now a
  #define passing in 0 as third parameter
o Made PM/VFS protocol asynchronous; reintroduced tell_fs()
o Made PM_BASE request/reply number range unique
o Hacked in a horrible temporary workaround into RS to deal with newly
  revealed RS-PM-VFS race condition triangle until VFS is asynchronous

System signal handling:
o Fixed shutdown logic of device drivers; removed old SIGKSTOP signal
o Removed is-superuser check from PM's do_procstat() (aka getsigset())
o Added sigset macros to allow system processes to deal with the full
  signal set, rather than just the POSIX subset

Miscellaneous PM fixes:
o Split do_getset into do_get and do_set, merging common code and making
  structure clearer
o Fixed setpriority() being able to put to sleep processes using an
  invalid parameter, or revive zombie processes
o Made find_proc() global; removed obsolete proc_from_pid()
o Cleanup here and there

Also included:
o Fixed false-positive boot order kernel warning
o Removed last traces of old NOTIFY_FROM code

THINGS OF POSSIBLE INTEREST

o It should now be possible to run PM at any priority, even lower than
  user processes
o No assumptions are made about communication speed between PM and VFS,
  although communication must be FIFO
o A debugger will now receive incoming debuggee signals at kill time
  only; the process may not yet be fully stopped
o A first step has been made towards making the SYSTEM task preemptible
2009-09-30 09:57:22 +00:00
Ben Gras 8d9aa1fe4f throw out exec debugging message. 2009-09-30 08:36:13 +00:00
Tomas Hruby 97fe6a4ba5 Broken pipes fix
- fix for the broken partial pipes r/w operations
2009-09-24 16:03:25 +00:00
Tomas Hruby f53377ed67 Removed the broken PROC_EVENT and SYN_ALARM from VFS 2009-09-22 22:11:20 +00:00
Tomas Hruby 8590ac260d Removed dependency of vfs on NR_TASKS macro
- all macros in consts.h that depend on NR_TASKS replaced by a FP_BLOCKED_ON_*

- fp_suspended removed and replaced by fp_blocked_on. Testing whether a process
  is supended is qeual to testing whether fp_blocked_on is FP_BLOCKED_ON_NONE or
  not

- fp_task is valid only if fp_blocked_on == FP_BLOCKED_ON_OTHER

- no need of special values that do not colide with valid and special endpoints
  since they are not used as endpoints anymore

- suspend only takes FP_BLOCKED_ON_* values not endpoints anymore

- suspend(task) replaced by wait_for(task) which sets fp_task so we remember who
  are we waiting for and suspend sets fp_blocked_on to FP_BLOCKED_ON_OTHER to
  signal that we are waiting for some other process

- some functions should take endpoint_t instead of int, fixed
2009-09-22 21:48:26 +00:00
Ben Gras f5459e38db - some exec debugging prints when errors happen
- lookup mounted_on check to avoid NULL dereference
 - some errors in exec() shouldn't be fatal
2009-09-21 14:49:26 +00:00
David van Moolenbroek 42f0bf7dda VFS: fetch_name() buffer underflow (reported by John Peace, bug #305) 2009-08-29 08:22:50 +00:00
Tomas Hruby f3e0c5c381 VFS quits gracefully if mount fails and mounted_on remains uninitialized 2009-08-18 13:30:05 +00:00
David van Moolenbroek d82e260a90 Support for setitimer(ITIMER_REAL). 2009-08-15 16:09:32 +00:00
Thomas Veerman ce916bcb91 Fixed a minor select bug:
- When one does a select on a file descriptor that is meaningless for that particular file type, select shall indicate that the file descriptor is ready for that particular operation and that the file descriptor has no exceptional condition pending.
2009-07-14 09:39:05 +00:00
David van Moolenbroek 1a9e07b0e5 PM: fix ptrace(T_EXIT) 'exit_proc: not idle' race condition. 2009-07-11 13:22:56 +00:00
David van Moolenbroek 9797d17d54 move symlink type check for readlink() into VFS, and return the right POSIX error 2009-05-20 09:46:06 +00:00
David van Moolenbroek 50b77e3529 VFS consistency: use I_PIPE/NO_PIPE when checking v_pipe 2009-05-19 14:34:44 +00:00
David van Moolenbroek f76d75a5ec Various VFS and MFS fixes to improve correctness, consistency and
POSIX compliance.

VFS changes:
* truncate() on a file system mounted read-only no longer panics MFS.
* ftruncate() and fcntl(F_FREESP) now check for write permission on
  the file descriptor instead of the file, write().
* utime(), chown() and fchown() now check for file system read-only
  status.

MFS changes:
* link() and rename() no longer return the internal EENTERMOUNT and
  ELEAVEMOUNT errors to the application as part of a check on the
  source path.
* rename() now treats EENTERMOUNT from the destination path check as
  an error, preventing file system corruption from renaming a normal
  directory to an existing mountpoint directory.
* mountpoints (mounted-on dirs) are hidden better during lookups:
  - if a lookup starts from a mountpoint, the first component has to
    be ".." (anything else being a VFS-FS protocol violation).
  - in that case, the permissions of the mountpoint are not checked.
  - in all other cases, visiting a mountpoint always results in
    EENTERMOUNT.
* a lookup on ".." from a mount root or chroot(2) root no longer
  succeeds if the caller does not have search permission on that
  directory.
* POSIX: getdents() now updates directory access times.
* POSIX: readlink() now returns partial results instead of ERANGE.

Miscellaneous changes:
* semaphore file handling bug (leading to hangs) fixed in test 32.

The VFS changes should now put the burden of checking for read-only
status of file systems entirely on VFS, and limit the access
permission checks that file systems have to perform, to checking
search permission on directories during lookups. From this point on,
any deviation from that spceification should be considered a bug.
Note that for legacy reasons, the root partition is assumed to be
mounted read-write.
2009-05-18 11:27:12 +00:00
Ben Gras 7c88767f75 remove debug msg 2009-05-11 11:57:20 +00:00
David van Moolenbroek 0ac1aaccca Limited support for nested FS->VFS requests during VFS->FS call.
- Changed VFS-FS protocol to only store OK or negative error code in
  m_type field of reply messages.
- Changed VFS to treat nonzero positive replies from FS as requests.
- Added backwards compatibility to VFS and MFS.
No protection of global data structures is provided in VFS, so many
VFS calls cannot be made safely by FS servers during many FS calls.
Use with caution (or, preferably, not at all).
2009-05-11 10:02:28 +00:00
David van Moolenbroek e08b38a5c4 regression fix: vfs lookup passes incorrect chroot information after crossing mountpoints 2009-05-09 17:53:22 +00:00
David van Moolenbroek 293be6b80b quick cleanup of old mfs cruft from vfs 2009-05-08 14:12:41 +00:00
Ben Gras dc1238b7b9 make unpause() decrease susp_count, as it shouldn't be decreased
if the process was REVIVING. (susp_count doesn't count those
 processes.) this together with dev_io SELECT suspend side effect
 for asynch. character devices solves the hanging pipe bug. or
 at last vastly improves it.

 added sanity checks, turned off by default.

 made the {NOT_,}{SUSPENDING,REVIVING} constants weirder to
 help sanity checking.
2009-05-08 13:56:41 +00:00
David van Moolenbroek 113b1ec5f3 remove unused global variable from vfs 2009-05-08 13:54:01 +00:00
Ben Gras ece26e2731 don't suspend the process as a side-effect if
device returns SUSPEND if it's select; select already
does this.
2009-05-08 13:50:29 +00:00
Ben Gras 746e138036 turn off scary looking debug messages. 2009-05-07 09:57:43 +00:00
Ben Gras fd7ef243e4 cleanup of vfs shutdown logic; makes clean unmounts easier (but
needs checking if fp_wd or fp_rd is NULL before use)
2009-04-29 16:59:18 +00:00
Ben Gras 73ee8b8b99 don't make susp_count negative. 2009-04-02 11:44:26 +00:00
Ben Gras 3bb80322d9 suppress more mostly-harmless messages. 2009-03-26 16:11:27 +00:00
Ben Gras 2d1c884e35 suppress these noisy, alarming messages. 2009-03-26 15:56:08 +00:00
Ben Gras 8af5f877bc 2009-03-04 17:44:34 +00:00
Ben Gras 3f6e061948 fix error check 2009-03-04 17:38:27 +00:00
Ben Gras 570b9cd753 Checking wrong inode pointer for refcount in mount (!) 2009-02-17 09:50:02 +00:00
Ben Gras 3cc092ff06 . new kernel call sysctl for generic unprivileged system operations;
now used for printing diagnostic messages through the kernel message
   buffer. this lets processes print diagnostics without sending messages
   to tty and log directly, simplifying the message protocol a lot and
   reducing difficulties with deadlocks and other situations in which
   diagnostics are blackholed (e.g. grants don't work). this makes
   DIAGNOSTICS(_S), ASYN_DIAGNOSTICS and DIAG_REPL obsolete, although tty
   and log still accept the codes for 'old' binaries. This also simplifies
   diagnostics in several servers and drivers - only tty needs its own
   kputc() now.
 . simplifications in vfs, and some effort to get the vnode references
   right (consistent) even during shutdown. m_mounted_on is now NULL
   for root filesystems (!) (the original and new root), a less awkward
   special case than 'm_mounted_on == m_root_node'. root now has exactly
   one reference, to root, if no files are open, just like all other
   filesystems. m_driver_e is unused.
2009-01-26 17:43:59 +00:00
Ben Gras 4984a86f32 don't hang on disappearing filesystem. 2009-01-26 13:02:41 +00:00
Ben Gras 86e7e4828e sanity check function 2009-01-20 13:43:18 +00:00
Ben Gras 45ec30f6af mostly harmless sanity checks. 2009-01-20 13:43:00 +00:00
Ben Gras d2757d4b73 debug buffer slightly usabler. 2008-12-19 15:19:42 +00:00
Ben Gras 834d9d34e8 Initialize deferred field. This seems to fix a hanging select() bug. 2008-12-17 14:20:08 +00:00
Ben Gras 34d5401ed4 put put_vnode() back where it belongs! 2008-12-16 16:11:24 +00:00
Ben Gras ccf70aa989 system_hz replaces HZ 2008-12-11 14:48:05 +00:00
Ben Gras 7d674f4b8e no more HZ; less debugging statements 2008-12-11 14:47:48 +00:00
Ben Gras b9a0d46ea9 debug out 2008-12-11 14:46:46 +00:00
Ben Gras 3287b7f7d8 don't hang old binaries 2008-12-11 14:45:49 +00:00
Ben Gras 5e1bb6eb63 added some code to debug why filesystems won't unmount 2008-12-11 14:45:31 +00:00
Ben Gras c078ec0331 Basic VM and other minor improvements.
Not complete, probably not fully debugged or optimized.
2008-11-19 12:26:10 +00:00
Philip Homburg f82a1c4df7 Fixed include files. 2008-02-25 14:35:54 +00:00
Philip Homburg bc7e3c02a3 Asynchronous select implementation. 2008-02-22 15:46:59 +00:00
Philip Homburg ff7eae2ad8 Private copy of kputc to support asynch communication with log device. 2008-02-22 15:43:33 +00:00
Philip Homburg 2ec762c60c Asynchronous communication with character specials. 2008-02-22 15:41:07 +00:00