
\chapter{Implementation}
\label{impl}

This chapter outlines the software design and architecture of ICFS, detailing how these elements address the challenge of fine-grained access control. Subsequent sections introduce the FUSE framework, the architectural strategies employed to mitigate unauthorised filesystem access, methods for managing process-specific permissions, and the implementation of access dialogues.

\section{FUSE Framework}
\label{impl:fuse}

To regulate filesystem operations, ICFS employs the FUSE (Filesystem in Userspace) framework \cite{FUSE}, which intercepts filesystem calls. FUSE enables the creation of custom filesystems or filesystem layers in user space, offering flexibility and ease of implementation. It provides an API through which developers define filesystem behaviour. Once a program implementing this API (hereafter termed the FUSE application) is started, the custom filesystem is mounted at a specified location, and filesystem operations on that location are handled by the methods defined through the API.

ICFS implements this API in C using the libfuse3 library \cite{LIBFUSE}. It initialises the FUSE daemon via the \verb|fuse_main()| function, which manages communication between the kernel and the FUSE application. Rather than directly overriding system calls, FUSE interacts with the kernel through \verb|/dev/fuse|, a specialised device file that translates filesystem requests into API method invocations using a dedicated protocol.

ICFS does not have a backing store (a separate filesystem that contains actual data). Instead, it functions as a so-called passthrough filesystem, where system calls are forwarded to the original filesystem, if access control policies allow them.

To enforce access restrictions, ICFS mounts directly over the target directory, intercepting all access requests directed to it. Because the mount is part of Linux's Virtual Filesystem (VFS) layer, all processes interacting with the protected directory are routed through ICFS. ICFS itself, however, retains direct access to the underlying files by opening the directory with the \verb|O_PATH| flag before mounting. Subsequent operations are executed using ``at''-suffixed system calls like \verb|openat()|, which resolve paths relative to that file descriptor \cite{MANOPEN} and thereby bypass ICFS's own layer.
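The following minimal sketch illustrates the descriptor-relative access pattern described above. It is not taken from the ICFS sources: the helper names \verb|open_underlying_root| and \verb|passthrough_open| are hypothetical, and the sketch assumes that request paths arrive with a leading slash, as they do in the libfuse high-level API.

```c
#define _GNU_SOURCE /* asks glibc to expose O_PATH in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: called before mounting. An O_PATH descriptor
 * refers to the underlying directory itself, so it can still reach the
 * original files after another filesystem is mounted on top. */
int open_underlying_root(const char *mountpoint)
{
    assert(mountpoint != NULL);
    return open(mountpoint, O_PATH | O_DIRECTORY);
}

/* Hypothetical helper: opens an existing file relative to the saved
 * descriptor instead of resolving the path through the mountpoint.
 * O_CREAT is not supported here because no mode argument is passed. */
int passthrough_open(int root_fd, const char *path, int flags)
{
    assert(path != NULL);
    /* Strip leading slashes: an absolute path would make openat()
     * ignore root_fd altogether. */
    while (*path == '/')
        path++;
    if (*path == '\0')
        path = "."; /* the root of the protected directory */
    return openat(root_fd, path, flags);
}
```

A caller would invoke \verb|open_underlying_root()| once at startup, keep the descriptor for the lifetime of the daemon, and release every descriptor returned by \verb|passthrough_open()| with \verb|close()|.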

\section{Permission Tables}

To enforce an access control policy over time, the filesystem needs to store user decisions in an appropriate data structure. As described in \autoref{icfs:model}, ICFS can give out two types of permissions: temporary and permanent. To accommodate this access control model, ICFS implements two data structures: a temporary permissions table and a permanent permissions table, which we describe in detail in \autoref{impl:temp} and \autoref{impl:perm} respectively.

So that child processes inherit permissions, both tables rely on procfs to find the parent of a process. When a permission check for the requesting process yields no result, the check is repeated recursively for its parent processes, traversing the process tree upwards.

\subsection{Temporary Permissions}
\label{impl:temp}

To function, the temporary permissions table must contain all the information needed to identify a process and to associate it with the files to which its access is allowed or denied. We identify processes by comparing the following characteristics:

\begin{itemize}
	\item Process ID: Number that uniquely identifies a process on Linux systems.
	\item Start time: The time the process started after system boot. The value is expressed in clock ticks.
\end{itemize}

The process is considered the same if and only if both characteristics match.

At first, it might seem that factoring in start time is excessive. However, using PID as the only identifying property of a process is problematic: PID is only unique among the currently running processes, not across the entire uptime of the system. Processes can not only acquire the PID of another, already finished process by accident, but also attempt to request a specific PID \cite{SOSETPID}. ICFS obtains the PID of the requesting process from libfuse and uses it to look up the start time in procfs.

The temporary permissions table consists of tuples $(pid, starttime, allowed, denied)$, where $allowed$ and $denied$ are sequences of files that the process is allowed or denied to access, respectively.

In our implementation, entries are organised in a hash map, with PIDs as keys. This provides the quick lookup of entries that filesystem operations require. ICFS uses the hash map implementation from the Convenient Containers library \cite{CC}, which is well-tested and has an intuitive interface that simplified development.

One disadvantage of this data structure is that it has no inherent mechanism for removing entries that are no longer valid (e.g. permissions of a process that has already finished).

Unfortunately, we have not found an efficient way to remove expired entries from the temporary permissions table. On Linux, a process cannot be notified of the termination of other processes unless they are its children or the tracking process runs with superuser permissions \cite{SOPROCNOTIF}. Hence, we resorted to a garbage-collection approach: an independent thread periodically checks the validity of every entry in the table and erases those that are invalid.
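A single pass of such a sweep might look like the sketch below. It is deliberately simplified and does not reflect the actual ICFS data structures: the table is a plain array instead of a hash map, the file lists are omitted, and the validity check is a hypothetical callback. In a real sweep the callback would compare the stored start time with the one currently reported by procfs; \verb|demo_is_valid| is merely a stand-in that makes the sketch self-contained. The periodic thread would call \verb|sweep_expired()| at fixed intervals while holding whatever lock protects the table.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified table entry; the allowed and denied file lists are
 * omitted from this sketch. */
struct temp_entry {
    int pid;
    unsigned long long start_time;
};

/* Hypothetical validity check: non-zero if the entry still refers to
 * a running process. */
typedef int (*still_valid_fn)(const struct temp_entry *entry);

/* Removes invalid entries in place, preserving the order of the
 * remaining ones, and returns the new number of entries. */
size_t sweep_expired(struct temp_entry *entries, size_t count,
                     still_valid_fn still_valid)
{
    assert(still_valid != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (still_valid(&entries[i]))
            entries[kept++] = entries[i];
    }
    return kept;
}

/* Stand-in check for demonstration only: treats a start time of zero
 * as "expired". */
int demo_is_valid(const struct temp_entry *entry)
{
    return entry->start_time != 0;
}
```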

\subsection{Permanent Permissions}
\label{impl:perm}

Since permanent permissions are granted to all processes with the same executable, only its filename is needed for identification. Because the permissions have to persist across filesystem restarts, the table needs to be stored on disk. Hence, we chose SQLite \cite{SQLITE} as the backend for the permanent permissions table. It is well-tested and lightweight, making it an ideal choice for a program like ICFS.

Due to the specifics of relational databases, the permissions are stored as a relation \allowbreak$(executable, filename, type)$, where $executable$ is the filename of the executable, $filename$ is the filename of the file that the permission targets, and $type$ is a boolean value indicating whether the permission allows or prohibits access to the target file.

The database is stored in a file on disk that the user chooses during startup. The database file is protected from outside access using standard POSIX permissions: during installation, a special user is created for ICFS, the owner UID of the executable is set to the UID of the new user, and the setuid bit is set, which allows other users to launch ICFS as the special user. On startup, the database file is created as the special user, and its access mode is set to prohibit access by any other user. After the database is opened, the effective UID of the ICFS process is switched to the UID of the user that originally started it (the real UID) using the \verb|setuid| system call. The database remains open for the rest of the runtime of ICFS.
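The startup sequence described above can be sketched as follows. The sketch is illustrative only: the helper names are hypothetical, and a plain file descriptor stands in for the SQLite database handle that ICFS actually keeps open.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates (or opens) a file that only its owner
 * may read and write. Intended to run while the effective UID is
 * still that of the dedicated user. */
int open_private_file(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;
    /* The mode passed to open() is ignored when the file already
     * exists, so the access mode is enforced explicitly. */
    if (fchmod(fd, S_IRUSR | S_IWUSR) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: switches the effective UID to the real UID of
 * the invoking user and verifies that the switch took effect.
 * Returns 0 on success, -1 on failure. */
int drop_to_real_user(void)
{
    if (setuid(getuid()) != 0)
        return -1;
    return geteuid() == getuid() ? 0 : -1;
}
```

The descriptor obtained before the switch remains usable afterwards, which is why the file can stay inaccessible to the invoking user while the running process continues to use it. One point worth noting: on Linux, when the calling process is not privileged, \verb|setuid| changes only the effective UID and leaves the saved set-user-ID untouched; \verb|setresuid| is the call to consider if the switch must be irreversible.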

Unfortunately, in the current version of ICFS there is no way to edit the permanent permission table. We discuss this limitation in more detail in \autoref{eval:future}.

\section{Access Dialogues}

Access dialogues are implemented as a separate program that the FUSE daemon spawns using the \verb|popen| function provided by the standard C library. The daemon passes the following arguments, in this order:

\begin{itemize}
	\item The PID of the requesting process.
	\item Path to the process's executable.
	\item ICFS's mountpoint.
	\item Name of the file the process is attempting to access.
\end{itemize}

After user interaction, the dialogue terminates with an exit code indicating the decision and outputs the relevant filename to standard output. The daemon validates the filename’s existence and relevance to the original request.
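The exchange between the daemon and the dialogue can be sketched as below. This is a minimal illustration, not the ICFS implementation: \verb|run_dialogue| is a hypothetical helper that takes an already assembled command line, and the meaning of individual exit codes is left open because it is specific to the dialogue program.

```c
#define _POSIX_C_SOURCE 200809L /* for popen() and pclose() */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper: runs `command`, stores the first line of its
 * standard output (without the trailing newline) in `line`, and
 * returns the exit code of the command, or -1 on failure. */
int run_dialogue(const char *command, char *line, size_t line_size)
{
    assert(command != NULL && line != NULL && line_size > 0);
    FILE *out = popen(command, "r");
    if (out == NULL)
        return -1;
    if (fgets(line, (int)line_size, out) == NULL)
        line[0] = '\0';
    else
        line[strcspn(line, "\n")] = '\0';
    /* Drain any remaining output so the child can exit normally. */
    char scratch[256];
    while (fgets(scratch, sizeof scratch, out) != NULL)
        ;
    int status = pclose(out);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because \verb|popen| hands the command line to a shell, any argument derived from a filename or an executable path has to be quoted or escaped when the command string is assembled.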

For example, if a process requests access to \verb|~/Documents/book.pdf|, but the dialogue specifies a nonexistent file or an unrelated file like \verb|~/Documents/other.txt|, the dialogue reappears. Currently, input validation occurs solely within the FUSE daemon, not the dialogue itself. Integrating real-time feedback on input validity into the dialogue interface could enhance usability, as noted in \autoref{eval:future}.

The access dialogue program is written in C and uses the GTK4 \cite{GTK} and libadwaita \cite{ADW} libraries for its graphical interface.