minix/servers/vfs/utility.c

216 lines
6 KiB
C
Raw Normal View History

2005-04-21 16:53:53 +02:00
/* This file contains a few general purpose utility routines.
*
* The entry points into this file are
* clock_timespec: ask the clock task for the real time
* copy_path: copy a path name from a path request from userland
2005-04-21 16:53:53 +02:00
* fetch_name: go get a path name from user space
* panic: something awful has occurred; MINIX cannot continue
* in_group: determines if group 'grp' is in rfp->fp_sgroups[]
2005-04-21 16:53:53 +02:00
*/
#include "fs.h"
#include <minix/callnr.h>
endpoint-aware conversion of servers. 'who', indicating caller number in pm and fs and some other servers, has been removed in favour of 'who_e' (endpoint) and 'who_p' (proc nr.). In both PM and FS, isokendpt() convert endpoints to process slot numbers, returning OK if it was a valid and consistent endpoint number. okendpt() does the same but panic()s if it doesn't succeed. (In PM, this is pm_isok..) pm and fs keep their own records of process endpoints in their proc tables, which are needed to make kernel calls about those processes. message field names have changed. fs drivers are endpoints. fs now doesn't try to get out of driver deadlock, as the protocol isn't supposed to let that happen any more. (A warning is printed if ELOCKED is detected though.) fproc[].fp_task (indicating which driver the process is suspended on) became an int. PM and FS now get endpoint numbers of initial boot processes from the kernel. These happen to be the same as the old proc numbers, to let user processes reach them with the old numbers, but FS and PM don't know that. All new processes after INIT, even after the generation number wraps around, get endpoint numbers with generation 1 and higher, so the first instances of the boot processes are the only processes ever to have endpoint numbers in the old proc number range. More return code checks of sys_* functions have been added. IS has become endpoint-aware. Ditched the 'text' and 'data' fields in the kernel dump (which show locations, not sizes, so aren't terribly useful) in favour of the endpoint number. Proc number is still visible. Some other dumps (e.g. dmap, rs) show endpoint numbers now too which got the formatting changed. PM reading segments using rw_seg() has changed - it uses other fields in the message now instead of encoding the segment and process number and fd in the fd field. For that it uses _read_pm() and _write_pm() which to _taskcall()s directly in pm/misc.c. PM now sys_exit()s itself on panic(), instead of sys_abort(). RS also talks in endpoints instead of process numbers.
2006-03-03 11:20:58 +01:00
#include <minix/endpoint.h>
2005-04-21 16:53:53 +02:00
#include <unistd.h>
2007-10-23 16:19:16 +02:00
#include <stdlib.h>
2012-07-13 18:08:06 +02:00
#include <string.h>
#include <assert.h>
#include <time.h>
2005-04-21 16:53:53 +02:00
#include "file.h"
Mostly bugfixes of bugs triggered by the test set. bugfixes: SYSTEM: . removed rc->p_priv->s_flags = 0; for the priv struct shared by all user processes in get_priv(). this should only be done once. doing a SYS_PRIV_USER in sys_privctl() caused the flags of all user processes to be reset, so they were no longer PREEMPTIBLE. this happened when RS executed a policy script. (this broke test1 in the test set) VFS/MFS: . chown can change the mode of a file, and chmod arguments are only part of the full file mode so the full filemode is slightly magic. changed these calls so that the final modes are returned to VFS, so that the vnode can be kept up-to-date. (this broke test11 in the test set) MFS: . lookup() checked for sizeof(string) instead of sizeof(user_path), truncating long path names (caught by test 23) . truncate functions neglected to update ctime (this broke test16) VFS: . corner case of an empty filename lookup caused fields of a request not to be filled in in the lookup functions, not making it clear that the lookup had failed, causing messages to garbage processes, causing strange failures. (caught by test 30) . trust v_size in vnode when doing reads or writes on non-special files, truncating i/o where necessary; this is necessary for pipes, as MFS can't tell when a pipe has been truncated without it being told explicitly each time. when the last reader/writer on a pipe closes, tell FS about the new size using truncate_vn(). (this broke test 25, among others) . permission check for chdir() had disappeared; added a forbidden() call (caught by test 23) new code, shouldn't change anything: . introduced RTS_SET, RTS_UNSET, and RTS_ISSET macro's, and their LOCK variants. These macros set and clear the p_rts_flags field, causing a lot of duplicated logic like old_flags = rp->p_rts_flags; /* save value of the flags */ rp->p_rts_flags &= ~NO_PRIV; if (old_flags != 0 && rp->p_rts_flags == 0) lock_enqueue(rp); to change into the simpler RTS_LOCK_UNSET(rp, NO_PRIV); so the macros take care of calling dequeue() and enqueue() (or lock_*()), as the case may be). This makes the code a bit more readable and a bit less fragile. . removed return code from do_clocktick in CLOCK as it currently never replies . removed some debug code from VFS . fixed grant debug message in device.c preemptive checks, tests, changes: . added return code checks of receive() to SYSTEM and CLOCK . O_TRUNC should never arrive at MFS (added sanity check and removed O_TRUNC code) . user_path declared with PATH_MAX+1 to let it be null-terminated . checks in MFS to see if strings passed by VFS are null-terminated IS: . static irq name table thrown out
2007-02-01 18:50:02 +01:00
#include "vmnt.h"
2005-04-21 16:53:53 +02:00
/*===========================================================================*
* copy_path *
2005-04-21 16:53:53 +02:00
*===========================================================================*/
int copy_path(char *dest, size_t size)
2005-04-21 16:53:53 +02:00
{
/* Go get the path for a path request. Put the result in in 'dest', which
* should be at least PATH_MAX in size.
2005-04-21 16:53:53 +02:00
*/
vir_bytes name;
size_t len;
assert(size >= PATH_MAX);
name = (vir_bytes) job_m_in.VFS_PATH_NAME;
len = job_m_in.VFS_PATH_LEN;
if (len > size) { /* 'len' includes terminating-nul */
Mostly bugfixes of bugs triggered by the test set. bugfixes: SYSTEM: . removed rc->p_priv->s_flags = 0; for the priv struct shared by all user processes in get_priv(). this should only be done once. doing a SYS_PRIV_USER in sys_privctl() caused the flags of all user processes to be reset, so they were no longer PREEMPTIBLE. this happened when RS executed a policy script. (this broke test1 in the test set) VFS/MFS: . chown can change the mode of a file, and chmod arguments are only part of the full file mode so the full filemode is slightly magic. changed these calls so that the final modes are returned to VFS, so that the vnode can be kept up-to-date. (this broke test11 in the test set) MFS: . lookup() checked for sizeof(string) instead of sizeof(user_path), truncating long path names (caught by test 23) . truncate functions neglected to update ctime (this broke test16) VFS: . corner case of an empty filename lookup caused fields of a request not to be filled in in the lookup functions, not making it clear that the lookup had failed, causing messages to garbage processes, causing strange failures. (caught by test 30) . trust v_size in vnode when doing reads or writes on non-special files, truncating i/o where necessary; this is necessary for pipes, as MFS can't tell when a pipe has been truncated without it being told explicitly each time. when the last reader/writer on a pipe closes, tell FS about the new size using truncate_vn(). (this broke test 25, among others) . permission check for chdir() had disappeared; added a forbidden() call (caught by test 23) new code, shouldn't change anything: . introduced RTS_SET, RTS_UNSET, and RTS_ISSET macro's, and their LOCK variants. These macros set and clear the p_rts_flags field, causing a lot of duplicated logic like old_flags = rp->p_rts_flags; /* save value of the flags */ rp->p_rts_flags &= ~NO_PRIV; if (old_flags != 0 && rp->p_rts_flags == 0) lock_enqueue(rp); to change into the simpler RTS_LOCK_UNSET(rp, NO_PRIV); so the macros take care of calling dequeue() and enqueue() (or lock_*()), as the case may be). This makes the code a bit more readable and a bit less fragile. . removed return code from do_clocktick in CLOCK as it currently never replies . removed some debug code from VFS . fixed grant debug message in device.c preemptive checks, tests, changes: . added return code checks of receive() to SYSTEM and CLOCK . O_TRUNC should never arrive at MFS (added sanity check and removed O_TRUNC code) . user_path declared with PATH_MAX+1 to let it be null-terminated . checks in MFS to see if strings passed by VFS are null-terminated IS: . static irq name table thrown out
2007-02-01 18:50:02 +01:00
err_code = ENAMETOOLONG;
return(EGENERIC);
}
/* Is the string contained in the message? If not, perform a normal copy. */
if (len > M3_STRING)
return fetch_name(name, len, dest);
2005-04-21 16:53:53 +02:00
/* Just copy the path from the message */
strncpy(dest, job_m_in.VFS_PATH_BUF, len);
if (dest[len - 1] != '\0') {
err_code = ENAMETOOLONG;
return(EGENERIC);
}
return(OK);
}
/*===========================================================================*
* fetch_name *
*===========================================================================*/
int fetch_name(vir_bytes path, size_t len, char *dest)
{
/* Go get path and put it in 'dest'. */
int r;
if (len > PATH_MAX) { /* 'len' includes terminating-nul */
err_code = ENAMETOOLONG;
return(EGENERIC);
}
/* Check name length for validity. */
if (len > SSIZE_MAX) {
err_code = EINVAL;
return(EGENERIC);
}
/* String is not contained in the message. Get it from user space. */
make vfs & filesystems use failable copying Change the kernel to add features to vircopy and safecopies so that transparent copy fixing won't happen to avoid deadlocks, and such copies fail with EFAULT. Transparently making copying work from filesystems (as normally done by the kernel & VM when copying fails because of missing/readonly memory) is problematic as it can happen that, for file-mapped ranges, that that same filesystem that is blocked on the copy request is needed to satisfy the memory range, leading to deadlock. Dito for VFS itself, if done with a blocking call. This change makes the copying done from a filesystem fail in such cases with EFAULT by VFS adding the CPF_TRY flag to the grants. If a FS call fails with EFAULT, VFS will then request the range to be made available to VM after the FS is unblocked, allowing it to be used to satisfy the range if need be in another VFS thread. Similarly, for datacopies that VFS itself does, it uses the failable vircopy variant and callers use a wrapper that talk to VM if necessary to get the copy to work. . kernel: add CPF_TRY flag to safecopies . kernel: only request writable ranges to VM for the target buffer when copying fails . do copying in VFS TRY-first . some fixes in VM to build SANITYCHECK mode . add regression test for the cases where - a FS system call needs memory mapped in a process that the FS itself must map. - such a range covers more than one file-mapped region. . add 'try' mode to vircopy, physcopy . add flags field to copy kernel call messages . if CP_FLAG_TRY is set, do not transparently try to fix memory ranges . for use by VFS when accessing user buffers to avoid deadlock . remove some obsolete backwards compatability assignments . VFS: let thread scheduling work for VM requests too Allows VFS to make calls to VM while suspending and resuming the currently running thread. Does currently not work for the main thread. . VM: add fix memory range call for use by VFS Change-Id: I295794269cea51a3163519a9cfe5901301d90b32
2014-01-16 14:22:13 +01:00
r = sys_datacopy_wrapper(who_e, path, VFS_PROC_NR, (vir_bytes) dest, len);
if (r != OK) {
err_code = EINVAL;
return(r);
2005-04-21 16:53:53 +02:00
}
2012-02-13 16:28:04 +01:00
if (dest[len - 1] != '\0') {
err_code = ENAMETOOLONG;
return(EGENERIC);
}
return(OK);
2005-04-21 16:53:53 +02:00
}
/*===========================================================================*
* isokendpt_f *
2005-04-21 16:53:53 +02:00
*===========================================================================*/
int isokendpt_f(const char *file, int line, endpoint_t endpoint, int *proc,
int fatal)
2005-04-21 16:53:53 +02:00
{
int failed = 0;
endpoint_t ke;
*proc = _ENDPOINT_P(endpoint);
2012-02-13 16:28:04 +01:00
if (endpoint == NONE) {
printf("VFS %s:%d: endpoint is NONE\n", file, line);
failed = 1;
2012-02-13 16:28:04 +01:00
} else if (*proc < 0 || *proc >= NR_PROCS) {
printf("VFS %s:%d: proc (%d) from endpoint (%d) out of range\n",
file, line, *proc, endpoint);
failed = 1;
2012-02-13 16:28:04 +01:00
} else if ((ke = fproc[*proc].fp_endpoint) != endpoint) {
if(ke == NONE) {
assert(fproc[*proc].fp_pid == PID_FREE);
2012-02-13 16:28:04 +01:00
} else {
printf("VFS %s:%d: proc (%d) from endpoint (%d) doesn't match "
"known endpoint (%d)\n", file, line, *proc, endpoint,
fproc[*proc].fp_endpoint);
assert(fproc[*proc].fp_pid != PID_FREE);
}
failed = 1;
}
if(failed && fatal)
panic("isokendpt_f failed");
2010-06-24 09:37:26 +02:00
return(failed ? EDEADEPT : OK);
2005-04-21 16:53:53 +02:00
}
endpoint-aware conversion of servers. 'who', indicating caller number in pm and fs and some other servers, has been removed in favour of 'who_e' (endpoint) and 'who_p' (proc nr.). In both PM and FS, isokendpt() convert endpoints to process slot numbers, returning OK if it was a valid and consistent endpoint number. okendpt() does the same but panic()s if it doesn't succeed. (In PM, this is pm_isok..) pm and fs keep their own records of process endpoints in their proc tables, which are needed to make kernel calls about those processes. message field names have changed. fs drivers are endpoints. fs now doesn't try to get out of driver deadlock, as the protocol isn't supposed to let that happen any more. (A warning is printed if ELOCKED is detected though.) fproc[].fp_task (indicating which driver the process is suspended on) became an int. PM and FS now get endpoint numbers of initial boot processes from the kernel. These happen to be the same as the old proc numbers, to let user processes reach them with the old numbers, but FS and PM don't know that. All new processes after INIT, even after the generation number wraps around, get endpoint numbers with generation 1 and higher, so the first instances of the boot processes are the only processes ever to have endpoint numbers in the old proc number range. More return code checks of sys_* functions have been added. IS has become endpoint-aware. Ditched the 'text' and 'data' fields in the kernel dump (which show locations, not sizes, so aren't terribly useful) in favour of the endpoint number. Proc number is still visible. Some other dumps (e.g. dmap, rs) show endpoint numbers now too which got the formatting changed. PM reading segments using rw_seg() has changed - it uses other fields in the message now instead of encoding the segment and process number and fd in the fd field. For that it uses _read_pm() and _write_pm() which to _taskcall()s directly in pm/misc.c. PM now sys_exit()s itself on panic(), instead of sys_abort(). RS also talks in endpoints instead of process numbers.
2006-03-03 11:20:58 +01:00
/*===========================================================================*
* clock_timespec *
endpoint-aware conversion of servers. 'who', indicating caller number in pm and fs and some other servers, has been removed in favour of 'who_e' (endpoint) and 'who_p' (proc nr.). In both PM and FS, isokendpt() convert endpoints to process slot numbers, returning OK if it was a valid and consistent endpoint number. okendpt() does the same but panic()s if it doesn't succeed. (In PM, this is pm_isok..) pm and fs keep their own records of process endpoints in their proc tables, which are needed to make kernel calls about those processes. message field names have changed. fs drivers are endpoints. fs now doesn't try to get out of driver deadlock, as the protocol isn't supposed to let that happen any more. (A warning is printed if ELOCKED is detected though.) fproc[].fp_task (indicating which driver the process is suspended on) became an int. PM and FS now get endpoint numbers of initial boot processes from the kernel. These happen to be the same as the old proc numbers, to let user processes reach them with the old numbers, but FS and PM don't know that. All new processes after INIT, even after the generation number wraps around, get endpoint numbers with generation 1 and higher, so the first instances of the boot processes are the only processes ever to have endpoint numbers in the old proc number range. More return code checks of sys_* functions have been added. IS has become endpoint-aware. Ditched the 'text' and 'data' fields in the kernel dump (which show locations, not sizes, so aren't terribly useful) in favour of the endpoint number. Proc number is still visible. Some other dumps (e.g. dmap, rs) show endpoint numbers now too which got the formatting changed. PM reading segments using rw_seg() has changed - it uses other fields in the message now instead of encoding the segment and process number and fd in the fd field. For that it uses _read_pm() and _write_pm() which to _taskcall()s directly in pm/misc.c. PM now sys_exit()s itself on panic(), instead of sys_abort(). RS also talks in endpoints instead of process numbers.
2006-03-03 11:20:58 +01:00
*===========================================================================*/
struct timespec clock_timespec(void)
endpoint-aware conversion of servers. 'who', indicating caller number in pm and fs and some other servers, has been removed in favour of 'who_e' (endpoint) and 'who_p' (proc nr.). In both PM and FS, isokendpt() convert endpoints to process slot numbers, returning OK if it was a valid and consistent endpoint number. okendpt() does the same but panic()s if it doesn't succeed. (In PM, this is pm_isok..) pm and fs keep their own records of process endpoints in their proc tables, which are needed to make kernel calls about those processes. message field names have changed. fs drivers are endpoints. fs now doesn't try to get out of driver deadlock, as the protocol isn't supposed to let that happen any more. (A warning is printed if ELOCKED is detected though.) fproc[].fp_task (indicating which driver the process is suspended on) became an int. PM and FS now get endpoint numbers of initial boot processes from the kernel. These happen to be the same as the old proc numbers, to let user processes reach them with the old numbers, but FS and PM don't know that. All new processes after INIT, even after the generation number wraps around, get endpoint numbers with generation 1 and higher, so the first instances of the boot processes are the only processes ever to have endpoint numbers in the old proc number range. More return code checks of sys_* functions have been added. IS has become endpoint-aware. Ditched the 'text' and 'data' fields in the kernel dump (which show locations, not sizes, so aren't terribly useful) in favour of the endpoint number. Proc number is still visible. Some other dumps (e.g. dmap, rs) show endpoint numbers now too which got the formatting changed. PM reading segments using rw_seg() has changed - it uses other fields in the message now instead of encoding the segment and process number and fd in the fd field. For that it uses _read_pm() and _write_pm() which to _taskcall()s directly in pm/misc.c. PM now sys_exit()s itself on panic(), instead of sys_abort(). RS also talks in endpoints instead of process numbers.
2006-03-03 11:20:58 +01:00
{
/* This routine returns the time in seconds since 1.1.1970. MINIX is an
* astrophysically naive system that assumes the earth rotates at a constant
* rate and that such things as leap seconds do not exist.
*/
2007-08-07 14:52:47 +02:00
register int r;
struct timespec tv;
clock_t uptime;
clock_t realtime;
2007-08-07 14:52:47 +02:00
time_t boottime;
r = getuptime(&uptime, &realtime, &boottime);
2007-08-07 14:52:47 +02:00
if (r != OK)
panic("clock_timespec err: %d", r);
tv.tv_sec = boottime + (realtime/system_hz);
/* We do not want to overflow, and system_hz can be as high as 50kHz */
assert(system_hz < LONG_MAX/40000);
tv.tv_nsec = (realtime%system_hz) * 40000 / system_hz * 25000;
return tv;
endpoint-aware conversion of servers. 'who', indicating caller number in pm and fs and some other servers, has been removed in favour of 'who_e' (endpoint) and 'who_p' (proc nr.). In both PM and FS, isokendpt() convert endpoints to process slot numbers, returning OK if it was a valid and consistent endpoint number. okendpt() does the same but panic()s if it doesn't succeed. (In PM, this is pm_isok..) pm and fs keep their own records of process endpoints in their proc tables, which are needed to make kernel calls about those processes. message field names have changed. fs drivers are endpoints. fs now doesn't try to get out of driver deadlock, as the protocol isn't supposed to let that happen any more. (A warning is printed if ELOCKED is detected though.) fproc[].fp_task (indicating which driver the process is suspended on) became an int. PM and FS now get endpoint numbers of initial boot processes from the kernel. These happen to be the same as the old proc numbers, to let user processes reach them with the old numbers, but FS and PM don't know that. All new processes after INIT, even after the generation number wraps around, get endpoint numbers with generation 1 and higher, so the first instances of the boot processes are the only processes ever to have endpoint numbers in the old proc number range. More return code checks of sys_* functions have been added. IS has become endpoint-aware. Ditched the 'text' and 'data' fields in the kernel dump (which show locations, not sizes, so aren't terribly useful) in favour of the endpoint number. Proc number is still visible. Some other dumps (e.g. dmap, rs) show endpoint numbers now too which got the formatting changed. PM reading segments using rw_seg() has changed - it uses other fields in the message now instead of encoding the segment and process number and fd in the fd field. For that it uses _read_pm() and _write_pm() which to _taskcall()s directly in pm/misc.c. PM now sys_exit()s itself on panic(), instead of sys_abort(). RS also talks in endpoints instead of process numbers.
2006-03-03 11:20:58 +01:00
}
/*===========================================================================*
* in_group *
*===========================================================================*/
2012-03-25 20:25:53 +02:00
int in_group(struct fproc *rfp, gid_t grp)
{
int i;
2012-02-13 16:28:04 +01:00
for (i = 0; i < rfp->fp_ngroups; i++)
if (rfp->fp_sgroups[i] == grp)
return(OK);
return(EINVAL);
}
make vfs & filesystems use failable copying Change the kernel to add features to vircopy and safecopies so that transparent copy fixing won't happen to avoid deadlocks, and such copies fail with EFAULT. Transparently making copying work from filesystems (as normally done by the kernel & VM when copying fails because of missing/readonly memory) is problematic as it can happen that, for file-mapped ranges, that that same filesystem that is blocked on the copy request is needed to satisfy the memory range, leading to deadlock. Dito for VFS itself, if done with a blocking call. This change makes the copying done from a filesystem fail in such cases with EFAULT by VFS adding the CPF_TRY flag to the grants. If a FS call fails with EFAULT, VFS will then request the range to be made available to VM after the FS is unblocked, allowing it to be used to satisfy the range if need be in another VFS thread. Similarly, for datacopies that VFS itself does, it uses the failable vircopy variant and callers use a wrapper that talk to VM if necessary to get the copy to work. . kernel: add CPF_TRY flag to safecopies . kernel: only request writable ranges to VM for the target buffer when copying fails . do copying in VFS TRY-first . some fixes in VM to build SANITYCHECK mode . add regression test for the cases where - a FS system call needs memory mapped in a process that the FS itself must map. - such a range covers more than one file-mapped region. . add 'try' mode to vircopy, physcopy . add flags field to copy kernel call messages . if CP_FLAG_TRY is set, do not transparently try to fix memory ranges . for use by VFS when accessing user buffers to avoid deadlock . remove some obsolete backwards compatability assignments . VFS: let thread scheduling work for VM requests too Allows VFS to make calls to VM while suspending and resuming the currently running thread. Does currently not work for the main thread. . VM: add fix memory range call for use by VFS Change-Id: I295794269cea51a3163519a9cfe5901301d90b32
2014-01-16 14:22:13 +01:00
/*===========================================================================*
* sys_datacopy_wrapper *
*===========================================================================*/
int sys_datacopy_wrapper(endpoint_t src, vir_bytes srcv,
endpoint_t dst, vir_bytes dstv, size_t len)
{
/* Safe function to copy data from or to a user buffer.
* VFS has to be a bit more careful as a regular copy
* might trigger VFS action needed by VM while it's
* blocked on the kernel call. This wrapper tries the
* copy, invokes VM itself asynchronously if necessary,
* then tries the copy again.
*
* This function assumes it's between VFS and a user process,
* and therefore one of the endpoints is SELF (VFS).
*/
int r;
endpoint_t them = NONE;
vir_bytes themv = -1;
int writable = -1;
r = sys_datacopy_try(src, srcv, dst, dstv, len);
if(src == VFS_PROC_NR) src = SELF;
if(dst == VFS_PROC_NR) dst = SELF;
assert(src == SELF || dst == SELF);
if(src == SELF) { them = dst; themv = dstv; writable = 1; }
if(dst == SELF) { them = src; themv = srcv; writable = 0; }
assert(writable >= 0);
assert(them != SELF);
if(r == EFAULT) {
/* The copy has failed with EFAULT, this means the kernel has
* given up but it might not be a legitimate error. Ask VM.
*/
if((r=vm_vfs_procctl_handlemem(them, themv, len, writable)) != OK) {
return r;
}
r = sys_datacopy_try(src, srcv, dst, dstv, len);
}
return r;
}