File Descriptors – More than Meets the Eye – Pt. 3
A search on “what is a file descriptor” turns up:
http://en.wikipedia.org/wiki/File_descriptor
“In computer programming, a file descriptor (FD) is an abstract indicator for accessing a file” — the “file” keyword is clickable and it goes on to compare “files” to “documents” on a computer.
The meaning is elusive. If you look at “ways to obtain an FD”, a socket() call provides a file descriptor. It’s hard to think of a network socket as a file or a document, and then there is the fact that a sockfs exists… I’m really not getting this filesystem stuff.
“Generally, a file descriptor is an index for an entry in a kernel-resident array data structure containing the details of open files. In POSIX this data structure is called a file descriptor table, and each process has its own file descriptor table. The process passes the file descriptor to the kernel through a system call, and the kernel will access the file on behalf of the process. The process itself cannot read or write the file descriptor table directly.
…
In Unix-like systems, file descriptors can refer to any Unix file type named in a file system. As well as regular files, this includes directories, block and character devices (also called “special files”), Unix domain sockets, and named pipes. File descriptors can also refer to other objects that do not normally exist in the file system, such as anonymous pipes and network sockets.”
‘other objects’: notice they don’t say ‘files’, because nobody refers to a network socket as a file!!! There is some mysterious zone around these file descriptors, and it all happens at the kernel level.
I am digging to find out what this is all about. Where does a file descriptor come from? How can we “create” one? My mind’s eye landed on Solaris Doors: a door_create syscall will return a “Door descriptor”, a file descriptor without a name in the filesystem, whatever that is. Seriously, what is a filesystem anymore… It’s so abstract.
But I knew I had an opportunity to visualize how a file descriptor gets created, or at least see wtf Doors does to “create” one.
I’m using CScope on the Solaris 2.8 source tree. Here’s what I find:
osnet_volume/usr/src/uts/common/fs/doorfs/door_sys.c
int
door_create(void (*pc_cookie)(), void *data_cookie, uint_t attributes)
{
    int fd;
    proc_t *p = ttoproc(curthread);
    int err;

    if ((attributes & ~(DOOR_UNREF | DOOR_PRIVATE | DOOR_UNREF_MULTI)) ||
        ((attributes & (DOOR_UNREF | DOOR_UNREF_MULTI)) ==
        (DOOR_UNREF | DOOR_UNREF_MULTI)))
        return (set_errno(EINVAL));

    if ((err = door_create_common(pc_cookie, data_cookie, attributes,
        p, &fd, NULL)) != 0)
        return (set_errno(err));

    f_setfd(fd, FD_CLOEXEC);
    return (fd);
}
Note: FD_CLOEXEC means that the file descriptor should be closed if the process execs.
osnet_volume/usr/src/uts/common/sys/thread.h
/*
 * proctot(x)
 *      convert a proc pointer to a thread pointer. this only works with
 *      procs that have only one lwp.
 *
 * proctolwp(x)
 *      convert a proc pointer to a lwp pointer. this only works with
 *      procs that have only one lwp.
 *
 * ttolwp(x)
 *      convert a thread pointer to its lwp pointer.
 *
 * ttoproc(x)
 *      convert a thread pointer to its proc pointer.
 *
 * lwptot(x)
 *      convert a lwp pointer to its thread pointer.
 *
 * lwptoproc(x)
 *      convert a lwp to its proc pointer.
 */
#define proctot(x)      ((x)->p_tlist)
#define proctolwp(x)    ((x)->p_tlist->t_lwp)
#define ttolwp(x)       ((x)->t_lwp)
#define ttoproc(x)      ((x)->t_procp)
#define lwptot(x)       ((x)->lwp_thread)
#define lwptoproc(x)    ((x)->lwp_procp)
osnet_volume/usr/src/uts/common/sys/door.h
/* Attributes originally obtained from door_create operation */
#define DOOR_UNREF          0x01    /* Deliver an unref notification with door */
#define DOOR_PRIVATE        0x02    /* Use a private pool of server threads */
#define DOOR_UNREF_MULTI    0x10    /* Deliver unref notification more than once */

/* Attributes (additional) returned with door_info and door_desc_t data */
#define DOOR_LOCAL          0x04    /* Descriptor is local to current process */
#define DOOR_REVOKED        0x08    /* Door has been revoked */
#define DOOR_IS_UNREF       0x20    /* Door is currently unreferenced */

[..]

extern kmutex_t door_knob;
extern kcondvar_t door_cv;
extern size_t door_max_arg;
back in door_sys.c:
/*
 * Common code for creating user and kernel doors. If a door was
 * created, stores a file structure pointer in the location pointed
 * to by fpp (if fpp is non-NULL) and returns 0. Also, if a non-NULL
 * pointer to a file descriptor is passed in as fdp, allocates a file
 * descriptor representing the door. If a door could not be created,
 * returns an error.
 */
static int
door_create_common(void (*pc_cookie)(), void *data_cookie, uint_t attributes,
    proc_t *p, int *fdp, file_t **fpp)
{
    door_node_t *dp;
    vnode_t *vp;
    struct file *fp;
    extern struct vnodeops door_vnodeops;
    static door_id_t index = 0;

    dp = kmem_zalloc(sizeof (door_node_t), KM_SLEEP);

    dp->door_target = p;
    dp->door_data = data_cookie;
    dp->door_pc = pc_cookie;
    dp->door_flags = attributes;
    vp = DTOV(dp);
    mutex_init(&vp->v_lock, NULL, MUTEX_DEFAULT, NULL);
    cv_init(&vp->v_cv, NULL, CV_DEFAULT, NULL);
    vp->v_op = &door_vnodeops;
    vp->v_type = VDOOR;
    vp->v_vfsp = &door_vfs;
    vp->v_data = (caddr_t)vp;
    VN_HOLD(vp);
    mutex_enter(&door_knob);
    dp->door_index = index++;
    /* add to per-process door list */
    door_list_insert(dp);
    mutex_exit(&door_knob);

    if (falloc(vp, FREAD | FWRITE, &fp, fdp)) {
        /*
         * If the file table is full, remove the door from the
         * per-process list, free the door, and return NULL.
         */
        mutex_enter(&door_knob);
        door_list_delete(dp);
        mutex_exit(&door_knob);
        kmem_free(dp, sizeof (door_node_t));
        return (EMFILE);
    }
    if (fdp != NULL)
        setf(*fdp, fp);
    mutex_exit(&fp->f_tlock);

    if (fpp != NULL)
        *fpp = fp;
    return (0);
}
KM_SLEEP: allow sleeping until memory is available
osnet_volume/usr/src/uts/common/sys/door.h
/*
 * Underlying 'filesystem' object definition
 */
typedef struct door_node {
    vnode_t           door_vnode;
    struct proc       *door_target;   /* Proc handling this doors invoc's. */
    struct door_node  *door_list;     /* List of active doors in proc */
    struct door_node  *door_ulist;    /* Unref list */
    void              (*door_pc)();   /* Door server entry point */
    void              *door_data;     /* Cookie passed during invocations */
    door_id_t         door_index;     /* Used as a uniquifier */
    door_attr_t       door_flags;     /* State associated with door */
    uint_t            door_active;    /* Number of active invocations */
    struct _kthread   *door_servers;  /* Private pool of server threads */
} door_node_t;
/usr/include/sys/door.h
#define VTOD(v) ((struct door_node *)(v))
#define DTOV(d) ((struct vnode *)(d))
osnet_volume/usr/src/uts/common/sys/vnode.h
/*
 * The vnode is the focus of all file activity in UNIX.
 * A vnode is allocated for each active file, each current
 * directory, each mounted-on file, and the root.
 */

/*
 * vnode types. VNON means no type. These values are unrelated to
 * values in on-disk inodes.
 */
typedef enum vtype {
    VNON    = 0,
    VREG    = 1,
    VDIR    = 2,
    VBLK    = 3,
    VCHR    = 4,
    VLNK    = 5,
    VFIFO   = 6,
    VDOOR   = 7,
    VPROC   = 8,
    VSOCK   = 9,
    VBAD    = 10
} vtype_t;

/*
 * All of the fields in the vnode are read-only once they are initialized
 * (created) except for:
 *      v_flag:         protected by v_lock
 *      v_count:        protected by v_lock
 *      v_pages:        file system must keep page list in sync with file size
 *      v_filocks:      protected by flock_lock in flock.c
 *      v_shrlocks:     protected by v_lock
 */
/* XX64 Can fields be reordered? */
typedef struct vnode {
    kmutex_t            v_lock;             /* protects vnode fields */
    ushort_t            v_flag;             /* vnode flags (see below) */
    uint_t              v_count;            /* reference count */
    struct vfs          *v_vfsmountedhere;  /* ptr to vfs mounted here */
    struct vnodeops     *v_op;              /* vnode operations */
    struct vfs          *v_vfsp;            /* ptr to containing VFS */
    struct stdata       *v_stream;          /* associated stream */
    struct page         *v_pages;           /* vnode pages list */
    enum vtype          v_type;             /* vnode type */
    dev_t               v_rdev;             /* device (VCHR, VBLK) */
    caddr_t             v_data;             /* private data for fs */
    struct filock       *v_filocks;         /* ptr to filock list */
    struct shrlocklist  *v_shrlocks;        /* ptr to shrlock list */
    kcondvar_t          v_cv;               /* synchronize locking */
    void                *v_locality;        /* hook for locality info */
} vnode_t;
osnet_volume/usr/src/uts/common/sys/types.h
typedef char *caddr_t;      /* ?<core address> type */
man mutex_init (mutex_init(3THR), Threads Library Functions, SunOS 5.8, last change 10 Sep 1998):

SYNOPSIS
    cc -mt [ flag... ] file...[ library... ]
    #include <thread.h>
    #include <synch.h>

    int mutex_init(mutex_t *mp, int type, void * arg);
    int mutex_lock(mutex_t *mp);
    int mutex_trylock(mutex_t *mp);
    int mutex_unlock(mutex_t *mp);
    int mutex_destroy(mutex_t *mp);

DESCRIPTION
    Mutual exclusion locks (mutexes) prevent multiple threads from simultaneously executing critical sections of code which access shared data (that is, mutexes are used to serialize the execution of threads). All mutexes must be global. A successful call for a mutex lock by way of mutex_lock() will cause another thread that is also trying to lock the same mutex to block until the owner thread unlocks it by way of mutex_unlock(). Threads within the same process or within other processes can share mutexes.

    Mutexes can synchronize threads within the same process or in other processes. Mutexes can be used to synchronize threads between processes if the mutexes are allocated in writable memory and shared among the cooperating processes (see mmap(2)), and have been initialized for this task.

  Initialize
    Mutexes are either intra-process or inter-process, depending upon the argument passed implicitly or explicitly to the initialization of that mutex. A statically allocated mutex does not need to be explicitly initialized; by default, a statically allocated mutex is initialized with all zeros and its scope is set to be within the calling process.

    For inter-process synchronization, a mutex needs to be allocated in memory shared between these processes. Since the memory for such a mutex must be allocated dynamically, the mutex needs to be explicitly initialized using mutex_init().

    The mutex_init() function initializes the mutex referenced by mp with the type specified by type. Upon successful initialization the state of the mutex becomes initialized and unlocked. No current type uses arg although a future type may specify additional behavior parameters by way of arg. type may be one of the following:
Something’s not right. The mutex_init in Doors takes 4 arguments, and they have different types. That man page describes the userland threads library; what Doors calls is the kernel mutex interface.
osnet_volume/usr/src/uts/common/sys/mutex.h
/*
 * Public interface to mutual exclusion locks. See mutex(9F) for details.
 *
 * The basic mutex type is MUTEX_ADAPTIVE, which is expected to be used
 * in almost all of the kernel. MUTEX_SPIN provides interrupt blocking
 * and must be used in interrupt handlers above LOCK_LEVEL. The iblock
 * cookie argument to mutex_init() encodes the interrupt level to block.
 * The iblock cookie must be NULL for adaptive locks.
 *
 * MUTEX_DEFAULT is the type usually specified (except in drivers) to
 * mutex_init(). It is identical to MUTEX_ADAPTIVE.
 *
 * MUTEX_DRIVER is always used by drivers. mutex_init() converts this to
 * either MUTEX_ADAPTIVE or MUTEX_SPIN depending on the iblock cookie.
 *
 * Mutex statistics can be gathered on the fly, without rebooting or
 * recompiling the kernel, via the lockstat driver (lockstat(7D)).
 */
typedef enum {
    MUTEX_ADAPTIVE = 0,     /* spin if owner is running, otherwise block */
    MUTEX_SPIN = 1,         /* block interrupts and spin */
    MUTEX_DRIVER = 4,       /* driver (DDI) mutex */
    MUTEX_DEFAULT = 6       /* kernel default mutex */
} kmutex_type_t;

typedef struct mutex {
#ifdef _LP64
    void *_opaque[1];
#else
    void *_opaque[2];
#endif
} kmutex_t;

#ifdef _KERNEL

#define MUTEX_HELD(x)       (mutex_owned(x))
#define MUTEX_NOT_HELD(x)   (!mutex_owned(x) || panicstr)

extern void mutex_init(kmutex_t *, char *, kmutex_type_t, void *);
extern void mutex_destroy(kmutex_t *);
extern void mutex_enter(kmutex_t *);
extern int mutex_tryenter(kmutex_t *);
extern void mutex_exit(kmutex_t *);
extern int mutex_owned(kmutex_t *);
extern struct _kthread *mutex_owner(kmutex_t *);
I found the SOB:
osnet_volume/usr/src/uts/common/os/mutex.c
/*
 * The iblock cookie 'ibc' is the spl level associated with the lock;
 * this alone determines whether the lock will be ADAPTIVE or SPIN.
 * The only exception is the case when 'ibc' is exactly LOCK_LEVEL,
 * which we treat as ADAPTIVE unless SPIN is explicitly requested.
 * At present, the only lock with this dubious property is reaplock.
 */
/* ARGSUSED */
void
mutex_init(kmutex_t *mp, char *name, kmutex_type_t type, void *ibc)
{
    mutex_impl_t *lp = (mutex_impl_t *)mp;

    ASSERT(ibc < (void *)KERNELBASE);   /* see 1215173 */

    if ((int)ibc >= ipltospl(LOCK_LEVEL) && ibc < (void *)KERNELBASE &&
        (SPIN_LOCK((int)ibc) || type == MUTEX_SPIN)) {
        ASSERT(type != MUTEX_ADAPTIVE && type != MUTEX_DEFAULT);
        MUTEX_SET_TYPE(lp, MUTEX_SPIN);
        LOCK_INIT_CLEAR(&lp->m_spin.m_spinlock);
        LOCK_INIT_HELD(&lp->m_spin.m_dummylock);
        lp->m_spin.m_minspl = (int)ibc;
    } else {
        ASSERT(type != MUTEX_SPIN);
        MUTEX_SET_TYPE(lp, MUTEX_ADAPTIVE);
        MUTEX_CLEAR_LOCK_AND_WAITERS(lp);
    }
}
I think I might die if I analyze that code to the bottom, so I won’t. I’ll just note that we initialize a mutex, and that we use a mutex to serialize access to data across threads. : )