This single function allows copying file descriptors from and to
processes, and closing a previously copied remote file descriptor.
This function replaces the five FD-related UDS backcalls. While it
limits the total number of in-flight file descriptors to OPEN_MAX,
this change greatly improves crash recovery support of UDS, since all
in-flight file descriptors will be closed instead of keeping them
open indefinitely (causing VFS to crash on system shutdown). With the
new copyfd call, UDS becomes simpler, and the concept of filps is no
longer exposed outside of VFS.
This patch also moves the checkperms(2) stub into libminlib, thus
fully abstracting away message details of VFS communication from UDS.
Change-Id: Idd32ad390a566143c8ef66955e5ae2c221cff966
This commit separates the low-level keyboard driver from TTY, putting
it in a separate driver (PCKBD). The commit also separates management
of raw input devices from TTY, and puts it in a separate server
(INPUT). All keyboard and mouse input from hardware is sent by drivers
to the INPUT server, which either sends it to a process that has
opened a raw input device, or otherwise forwards it to TTY for
standard processing.
Design by Dirk Vogt. Prototype by Uli Kastlunger.
Additional changes made to the prototype:
- the event communication is now based on USB HID codes; all input
drivers have to use USB codes to describe events;
- all TTY keymaps have been converted to USB format, with the effect
that a single keymap covers all keys; there is no (static) escaped
keymap anymore;
- further keymap tweaks now allow remapping of literally all keys;
- input device renumbering and protocol rewrite;
- INPUT server rewrite, with added support for cancel and select;
- PCKBD reimplementation, including PC/AT-to-USB translation;
- support for manipulating keyboard LEDs has been added;
- keyboard and mouse multiplexer devices have been added to INPUT,
primarily so that an X server need only open two devices;
- a new "libinputdriver" library abstracts away protocol details from
input drivers, and should be used by all future input drivers;
- both INPUT and PCKBD can be restarted;
- TTY is now scheduled by KERNEL, so that it won't be punished for
running a lot; without this, simply running "yes" on the console
kills the system;
- the KIOCBELL IOCTL has been moved to /dev/console;
- support for the SCANCODES termios setting has been removed;
- obsolete keymap compression has been removed;
- the obsolete Olivetti M24 keymap has been removed.
Change-Id: I3a672fb8c4fd566734e4b46d3994b4b7fc96d578
Most systems provide the full version number in the
'release' field and the kernel version in 'version'.
Minix used to split the full version number between
release and version which caused problems for pkgsrc
and other applications. This patch brings Minix's
uname in line with other systems such as NetBSD.
It also brings the getty banner in line with NetBSD.
Old Minix uname:
sysname->Minix
nodename->10.0.2.15
release->3
version->2.1
machine->i686
New Minix uname:
sysname->Minix
nodename->10.0.2.15
release->3.2.1
version->Minix 3.2.1 (GENERIC)
machine->i686
Change-Id: I966633dfdcf2f9485966bb0d0d042afc45bbeb7d
* Renamed struct timer to struct minix_timer
* Renamed timer_t to minix_timer_t
* Ensured all the code uses the minix_timer_t typedef
* Removed ifdef around _BSD_TIMER_T
* Removed include/timers.h and merged it into include/minix/timers.h
* Resolved prototype conflict by renaming kernel's (re)set_timer
to (re)set_kernel_timer.
Change-Id: I56f0f30dfed96e1a0575d92492294cf9a06468a5
This call copies a file descriptor from a remote process into the
calling process. The call is for the VND driver only, and in the
future, ACLs will prevent any other process from using this call.
Change-Id: Ib16fdd1f1a12cb38a70d7e441dad91bc86898f6d
- prefix them with VFS_ as they are going to VFS;
- give these calls normal call numbers;
- give them their own set of message field aliases;
- also make do_mapdriver a regular call.
Change-Id: I2140439f288b06d699a1f65438bd8306509b259e
The T_DUMPCORE implementation was not only broken - it would currently
produce a coredump of the tracer process rather than the traced
process - but also deeply flawed, and fixing it would require serious
alteration of PM's internal state machine. It should be possible to
implement the same functionality in userland, and that is now the
suggested way forward. For now, also remove the (identical) utilities
using T_DUMPCORE: dumpcore(1) and gcore(1).
Change-Id: I1d51be19c739362b8a5833de949b76382a1edbcc
* Removed startup code patches in lib/csu regarding kernel to userland
ABI.
* Aligned stack layout on NetBSD stack layout.
* Generate valid stack pointers instead of offsets by taking into account
_minix_kerninfo->kinfo->user_sp.
* Refactored stack generation, by moving part of execve in two
functions {minix_stack_params(), minix_stack_fill()} and using them
in execve(), rs and vm.
* Changed load offset of rtld (ld.so) to:
execi.args.stack_high - execi.args.stack_size - 0xa00000
which is 10MB below the main executable stack.
Change-Id: I839daf3de43321cded44105634102d419cb36cec
The following types are modified (old -> new):
* _BSD_USECONDS_T_ int -> unsigned int
* __socklen_t __int32_t -> __uint32_t
* blksize_t uint32_t -> int32_t
* rlim_t uint32_t -> uint64_t
On ARM:
* _BSD_CLOCK_T_ int -> unsigned int
On Intel:
* _BSD_CLOCK_T_ int -> unsigned long
bin/cat is also updated in order to fix warnings.
_BSD_TIMER_T_ has still to be aligned.
Change-Id: I2b4fda024125a19901120546c4e22e443ba5e9d7
for clang, fix warnings in drivers/, lib/, servers/, sys/, common/.
by turning off fatal warnings (takes effect if the default is on),
fixing warnings or reducing the warning level.
Change-Id: Ia1b4bc877c879ba783158081b59aa6ebb021a50f
Some ARM chips handle power-off with RTC alarms. PM notifies
readclock (the driver for RTCs) about the impending power-off.
If the power-off mechanism is an RTC alarm, readclock will
set the alarm. If not, there is no effect.
Change-Id: Iee00066def2a0f742cdf0dbde8e32b376edf1b78
Implement getrusage.
These fields of struct rusage are not supported and always set to zero at this time
long ru_nswap; /* swaps */
long ru_inblock; /* block input operations */
long ru_oublock; /* block output operations */
long ru_msgsnd; /* messages sent */
long ru_msgrcv; /* messages received */
long ru_nvcsw; /* voluntary context switches */
long ru_nivcsw; /* involuntary context switches */
test75.c is the unit test for this new function
Change-Id: I3f1eb69de1fce90d087d76773b09021fc6106539
. libc: add vfs_mmap, a way for vfs to initiate mmap()s.
This is a good special case to have as vfs is a slightly
different client from regular user processes. It doesn't do it
for itself, and has the dev & inode info already so the callback
to VFS for the lookup isn't necessary. So it has different info
to have to give to VM.
. libc: also add minix_mmap64() that accepts a 64-bit offset, even
though our off_t is still 32 bit now.
. On exec() time, try to mmap() in the executable if available.
(It is not yet available in this commit.)
. To support mmap(), add do_vm_call that allows VM to lookup
(to ino+dev), do i/o from and close FD's on behalf of other
processes.
Change-Id: I831551e45a6781c74313c450eb9c967a68505932
The previous test would return EFAULT as soon as the group pointer
was NULL, while it is sensible when the count is also 0.
In that case, the SETGROUP syscall is expected to clear all the
group entries as the new set is empty.
Change-Id: I07b7e1d1f023a52e3035d53f7d9b42b660e039e8
Variant of utime(2) with struct timespec (with ns precision)
instead of time_t values; also allows for tv_nsec members
the values UTIME_NOW (force update to current time) or
UTIME_OMIT (allow to set either atim or mtim independently.)
Provides a superset of utimes(2), futimes(2), lutimes(2),
and futimens(2).
Provides the same subset of utimensat(2) as does NetBSD 6.
Also import utimens() and lutimeNS() from NetBSD-current.
This also adds the sys_settime() kernel call which allows for the adjusting
of the clock named realtime in the kernel. The existing sys_stime()
function is still needed for a separate job (setting the boottime). The
boottime is set in the readclock driver. The sys_settime() interface is
meant to be flexible and will support both clock_settime() and adjtime()
when adjtime() is implemented later.
settimeofday() was adjusted to use the clock_settime() interface.
One side note discovered during testing: uptime(1) (part of the last(1)),
uses wtmp to determine boottime (not Minix's times(2)). This leads `uptime`
to report odd results when you set the time to a time prior to boottime.
This isn't a new bug introduced by my changes. It's been there for a while.
In order to make it more clear that ticks should be used for timers
and realtime should be used for timestamps / displaying the date/time,
getuptime() was renamed to getticks() and getuptime2() was renamed to
getuptime().
Servers, drivers, libraries, tests, etc that use getuptime()/getuptime2()
have been updated. In instances where a realtime was calculated, the
calculation was changed to use realtime.
System calls clock_getres() and clock_gettime() were added to PM/libc.
The build system distinction between "bootprog" and "service" is
meaningless as boot programs are standard services.
As minix.service.mk simply imports minix.bootprog.mk, reduce confusion
by removing minix.bootprog.mk and placing the rules in minix.service.mk.
Change-Id: I4056b1e574bed59a8c890239b41b1a7c7cad63e8
Remove old versions of system calls and system calls that don't have
a libc api interface anymore (dup, dup2, creat).
VFS still contains support for old system call numbers for the new stat
system calls (i.e., 65, 66, 67) to keep supporting old binaries built for
MINIX 3.2.1 (prior to the release).
Change-Id: I721779b58a50c7eeae20669de24658d55d69b25b
. also make other out-of-memory conditions less fatal
. add a test case for a user program using all the memory
it can
. remove some diagnostic prints for situations that are normal
when running out of memory so running the test isn't noisy
. some strncpy/strcpy to strlcpy conversions
. new <minix/param.h> to avoid including other minix headers
that have colliding definitions with library and commands code,
causing parse warnings
. removed some dead code / assignments
This commit removes all traces of Minix segments (the text/data/stack
memory map abstraction in the kernel) and significance of Intel segments
(hardware segments like CS, DS that add offsets to all addressing before
page table translation). This ultimately simplifies the memory layout
and addressing and makes the same layout possible on non-Intel
architectures.
There are only two types of addresses in the world now: virtual
and physical; even the kernel and processes have the same virtual
address space. Kernel and user processes can be distinguished at a
glance as processes won't use 0xF0000000 and above.
No static pre-allocated memory sizes exist any more.
Changes to booting:
. The pre_init.c leaves the kernel and modules exactly as
they were left by the bootloader in physical memory
. The kernel starts running using physical addressing,
loaded at a fixed location given in its linker script by the
bootloader. All code and data in this phase are linked to
this fixed low location.
. It makes a bootstrap pagetable to map itself to a
fixed high location (also in linker script) and jumps to
the high address. All code and data then use this high addressing.
. All code/data symbols linked at the low addresses is prefixed by
an objcopy step with __k_unpaged_*, so that that code cannot
reference highly-linked symbols (which aren't valid yet) or vice
versa (symbols that aren't valid any more).
. The two addressing modes are separated in the linker script by
collecting the unpaged_*.o objects and linking them with low
addresses, and linking the rest high. Some objects are linked
twice, once low and once high.
. The bootstrap phase passes a lot of information (e.g. free memory
list, physical location of the modules, etc.) using the kinfo
struct.
. After this bootstrap the low-linked part is freed.
. The kernel maps in VM into the bootstrap page table so that VM can
begin executing. Its first job is to make page tables for all other
boot processes. So VM runs before RS, and RS gets a fully dynamic,
VM-managed address space. VM gets its privilege info from RS as usual
but that happens after RS starts running.
. Both the kernel loading VM and VM organizing boot processes happen
using the libexec logic. This removes the last reason for VM to
still know much about exec() and vm/exec.c is gone.
Further Implementation:
. All segments are based at 0 and have a 4 GB limit.
. The kernel is mapped in at the top of the virtual address
space so as not to constrain the user processes.
. Processes do not use segments from the LDT at all; there are
no segments in the LDT any more, so no LLDT is needed.
. The Minix segments T/D/S are gone and so none of the
user-space or in-kernel copy functions use them. The copy
functions use a process endpoint of NONE to realize it's
a physical address, virtual otherwise.
. The umap call only makes sense to translate a virtual address
to a physical address now.
. Segments-related calls like newmap and alloc_segments are gone.
. All segments-related translation in VM is gone (vir2map etc).
. Initialization in VM is simpler as no moving around is necessary.
. VM and all other boot processes can be linked wherever they wish
and will be mapped in at the right location by the kernel and VM
respectively.
Other changes:
. The multiboot code is less special: it does not use mb_print
for its diagnostics any more but uses printf() as normal, saving
the output into the diagnostics buffer, only printing to the
screen using the direct print functions if a panic() occurs.
. The multiboot code uses the flexible 'free memory map list'
style to receive the list of free memory if available.
. The kernel determines the memory layout of the processes to
a degree: it tells VM where the kernel starts and ends and
where the kernel wants the top of the process to be. VM then
uses this entire range, i.e. the stack is right at the top,
and mmap()ped bits of memory are placed below that downwards,
and the break grows upwards.
Other Consequences:
. Every process gets its own page table as address spaces
can't be separated any more by segments.
. As all segments are 0-based, there is no distinction between
virtual and linear addresses, nor between userspace and
kernel addresses.
. Less work is done when context switching, leading to a net
performance increase. (8% faster on my machine for 'make servers'.)
. The layout and configuration of the GDT makes sysenter and syscall
possible.
. sys_vircopy always uses D for both src and dst
. sys_physcopy uses PHYS_SEG if and only if corresponding
endpoint is NONE, so we can derive the mode (PHYS_SEG or D)
from the endpoint arg in the kernel, dropping the seg args
. fields in msg still filled in for backwards compatability,
using same NONE-logic in the library
. make exec() callers (i.e. vfs and rs) determine the
memory layout by explicitly reserving regions using
mmap() calls on behalf of the exec()ing process,
i.e. handling all of the exec logic, thereby eliminating
all special exec() knowledge from VM.
. the new procedure is: clear the exec()ing process
first, then call third-party mmap()s to reserve memory, then
copy the executable file section contents in, all using callbacks
tailored to the caller's way of starting an executable
. i.e. no more explicit EXEC_NEWMEM-style calls in PM or VM
as with rigid 2-section arguments
. this naturally allows generalizing exec() by simply loading
all ELF sections
. drop/merge of lots of duplicate exec() code into libexec
. not copying the code sections to vfs and into the executable
again is a measurable performance improvement (about 3.3% faster
for 'make' in src/servers/)
. generalize libexec slightly to get some more necessary information
from ELF files, e.g. the interpreter
. execute dynamically linked executables when exec()ed by VFS
. switch to netbsd variant of elf32.h exclusively, solves some
conflicting headers
remove some old minix-userland-specific stuff
. /etc/ttytab as a file, and minix-compat function (fftyslot()),
replaced by /etc/ttys and new libc functions
. also remove minix-specific nlist(), cuserid(), fttyslot(), v8 regex
functions and <compat/regex.h>
. and remaining minix-only utilities that use them
. also unused <compat/pwd.h> and <compat/syslog.h> and
redundant <sys/sigcontext.h>
Currently, all servers and drivers run as root as they are forks of
RS. srv_fork now tells PM with which credentials to run the resulting
fork. Subsequently, PM lets VFS now as well.
This patch also fixes the following bugs:
- RS doesn't initialize the setugid variable during exec, causing the
servers and drivers to run setuid rendering the srv_fork extension
useless.
- PM erroneously tells VFS to run processes setuid. This doesn't
actually lead to setuid processes as VFS sets {r,e}uid and {r,e}gid
properly before checking PM's approval.
User processes can send signals with number up to _NSIG. There are a few
signal numbers above that used by the kernel, but should explicitly not
be included in the range or range checks in PM will fail.
The system processes use a different version of sigaddset, sigdelset,
sigemptyset, sigfillset, and sigismember which does not include a range
check on signal numbers (as opposed to the normal functions used by normal
processes).
This patch unbreaks test37 when the boot image is compiled with GCC/Clang.
This patch provides basic protection against damage resulting from
differently compiled servers blindly copying tables to one another.
In every getsysinfo() call, the caller is provided with the expected
size of the requested data structure. The callee fails the call if
the expected size does not match the data structure's actual size.
. make procfs check it
. detects pm/procfs mismatches
. was triggered by ack/clang pm/procfs:
add padding to mproc struct to align ack/clang layout
to fix this
. it's a good extra interface to have but doesn't
meet standardised functionality
. applications (in pkgsrc) find it and expect
full functionality the minix mmap doesn't offter
. on the whole probably better to hide these functions
(mmap and friends) until they are grown up; the base system
can use the new minix_* names
* VFS and installed MFSes must be in sync before and after this change *
Use struct stat from NetBSD. It requires adding new STAT, FSTAT and LSTAT
syscalls. Libc modification is both backward and forward compatible.
Also new struct stat uses modern field sizes to avoid ABI
incompatibility, when we update uid_t, gid_t and company.
Exceptions are ino_t and off_t in old libc (though paddings added).
3 sets of libraries are built now:
. ack: all libraries that ack can compile (/usr/lib/i386/)
. clang+elf: all libraries with minix headers (/usr/lib/)
. clang+elf: all libraries with netbsd headers (/usr/netbsd/)
Once everything can be compiled with netbsd libraries and headers, the
/usr/netbsd hierarchy will be obsolete and its libraries compiled with
netbsd headers will be installed in /usr/lib, and its headers
in /usr/include. (i.e. minix libc and current minix headers set
will be gone.)
To use the NetBSD libc system (libraries + headers) before
it is the default libc, see:
http://wiki.minix3.org/en/DevelopersGuide/UsingNetBSDCode
This wiki page also documents the maintenance of the patch
files of minix-specific changes to imported NetBSD code.
Changes in this commit:
. libsys: Add NBSD compilation and create a safe NBSD-based libc.
. Port rest of libraries (except libddekit) to new header system.
. Enable compilation of libddekit with new headers.
. Enable kernel compilation with new headers.
. Enable drivers compilation with new headers.
. Port legacy commands to new headers and libc.
. Port servers to new headers.
. Add <sys/sigcontext.h> in compat library.
. Remove dependency file in tree.
. Enable compilation of common/lib/libc/atomic in libsys
. Do not generate RCSID strings in libc.
. Temporarily disable zoneinfo as they are incompatible with NetBSD format
. obj-nbsd for .gitignore
. Procfs: use only integer arithmetic. (Antoine Leca)
. Increase ramdisk size to create NBSD-based images.
. Remove INCSYMLINKS handling hack.
. Add nbsd_include/sys/exec_elf.h
. Enable ELF compilation with NBSD libc.
. Add 'make nbsdsrc' in tools to download reference NetBSD sources.
. Automate minix-port.patch creation.
. Avoid using fstavfs() as it is *extremely* slow and unneeded.
. Set err() as PRIVATE to avoid name clash with libc.
. [NBSD] servers/vm: remove compilation warnings.
. u32 is not a long in NBSD headers.
. UPDATING info on netbsd hierarchy
. commands fixes for netbsd libc
sys_umap now supports only:
- looking up the physical address of a virtual address in the address space
of the caller;
- looking up the physical address of a grant for which the caller is the
grantee.
This is enough for nearly all umap users. The new sys_umap_remote supports
lookups in arbitrary address spaces and grants for arbitrary grantees.
- profile --nmi | --rtc sets the profiling mode
- --rtc is default, uses BIOS RTC, cannot profile kernel the presetted
frequency values apply
- --nmi is only available in APIC mode as it uses the NMI watchdog, -f
allows any frequency in Hz
- both modes use compatible data structures
- EBADCPU is returned is scheduler tries to run a process on a CPU
that either does not exist or isn't booted
- this change was originally meant to deal with stupid cpuid
instruction which provides totally useless information about
hyper-threading and MPS which does not deal with ht at all. ACPI
provides correct information. If ht is turned off it looks like some
CPUs failed to boot. Nevertheless this patch may be handy for
testing/benchmarking in the future.
- sys_schedule can change only selected values, -1 means that the
current value should be kept unchanged. For instance we mostly want
to change the scheduling quantum and priority but we want to keep
the process at the current cpu
- RS can hand off its processes to scheduler
- service can read the destination cpu from system.conf
- RS can pass the information farther
- machine information contains the number of cpus and the bsp id
- a dummy SMP scheduler which keeps all system processes on BSP and
all other process on APs. The scheduler remembers how many processes
are assigned to each CPU and always picks the one with the least
processes for a new process.