Fundamentals of Operating Systems

Aside from the defenses and misjudgments that take place in the human mind, the single most decisive factor in making or breaking cybersecurity is the operating system. It functions as the central nervous system of every kind of data-processing device. This module discusses threats and protections related to operating systems, but it opens with a dense introduction to the operating-system features that are important to understand. For motivation, the introduction already points to many security aspects that will be revisited later. On this page, security-related points are marked with ◼ (25 occurrences after this one).

Right at the beginning, it is necessary to define two concepts:

Security domain:

a part of the entirety of running code within which the same security policy applies and where attacks need not be suspected. Typically, the core parts of the operating system (the kernel) form one domain, the processes of each individual user another, and intermediate operating-system components possibly a third. Security domains treat each other with suspicion and checks, or operate in as much isolation as possible, although all of them naturally use the kernel and typically trust it.

Hypervisor:

When hardware is implemented in software, the result is called a virtual machine, and the entity that interfaces with the real hardware is the hypervisor, also known as the virtual machine monitor. A hypervisor can also run not directly on hardware but on top of a host operating system. Because a hypervisor is largely similar to an operating system (it is, in essence, an emulator), it will be mentioned only in the few places where some important difference matters.

The nature of operating systems

A computer operating system (OS) is, as its name suggests, a system that allows a device to be used or operated. A program could in principle use the hardware even without an OS—for example, the OS itself is such a program. One purpose of an OS is to make the applications running on top of it more or less independent of the current hardware configuration. It is obvious that different types of devices require fairly different operating systems, and the user of a device has more or less opportunity to modify the OS structure or features:

mainframe – server – multiprocessor device – personal computer – mobile device – real-time system – embedded system – smart card –
distributed system (with a network OS) – virtual machine (i.e., another OS underneath, e.g., a hypervisor)

Between the operating system and the hardware there may be a ”thin” layer of hardware-specific software, namely the HAL, hardware abstraction layer. This layer, as well as possible microprogramming that uses the processor’s actual instruction set, is ignored in this course. Fundamentally, the devices accessed via the OS are:

  • Processor (CPU), especially with its set of registers, including the program counter (PC), which points to the next instruction to be executed.
  • Main memory (RAM).
  • Secondary storage (or mass storage), which is also related to device control and where logical (and virtual) addresses are translated into physical ones.
  • Device controllers, which mediate interaction between the CPU and I/O devices. External data communication is also involved here.
  • Buses: each typically connects several parts of the computer. In addition to internal CPU buses, there may be various device buses.

Insofar as these are interconnected, they may be referred to collectively as the central unit, as distinct from peripherals such as printers. All devices are put to work by the user, typically through processes created by application programs. These are the actual users of the OS. To an ordinary human user, usually only the OS command interpreter is visible, which itself is just an application program, whether command-line based or graphical. The term shell for the user interface is descriptive. Often, the user role is taken by an automatically started process such as a web server.

An application programmer must also be aware of interfaces beyond those directed to hardware. Programs use the OS via system calls. The collection of available calls is called an API (application programming interface). It implements the abstraction of hardware and creates, as it were, a ”machine” that the programmer uses. It may be divided into different interfaces for the central unit, peripherals, and network connections. (The term API also refers to the interface by which programmers connect their code to software written by others.)
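As a rough illustration of these two levels, Python exposes both a near-system-call interface (`os.open` and friends, thin wrappers over the POSIX calls of the same names) and a buffered high-level API that the runtime eventually translates into the same underlying calls. A minimal sketch (the file name is arbitrary):

```python
import os
import tempfile

def write_via_syscalls(path: str, data: bytes) -> None:
    # os.open/os.write/os.close map closely to the POSIX system calls
    # of the same name: this is (nearly) the raw system-call interface.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

def read_via_api(path: str) -> bytes:
    # The buffered open() is the higher-level API; the runtime turns
    # it into the same underlying calls behind the scenes.
    with open(path, "rb") as f:
        return f.read()

path = os.path.join(tempfile.gettempdir(), "api_demo.bin")
write_via_syscalls(path, b"hello, kernel")
print(read_via_api(path))  # b'hello, kernel'
```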

◼ From a security perspective, a difficulty is that an intruder or malware author can operate at a much lower level than an ordinary user. One well-known attack target has been the boot sector.

When a computer is started, the processor begins executing instructions from ROM (and nowadays typically from flash memory). These are used to check that the central unit and the BIOS residing in ROM function as expected (“self-test”). The result is stored.

After this, the BIOS (firmware, basic input output system) initializes mass storage devices and retrieves from there, from the boot sector, the basic part of the operating system, the bootstrap loader. This is a small program whose purpose is to prepare the environment for the operating system and load it into memory. Preparation means setting up the OS data structures in a suitable order and also loading necessary driver programs. (Because the OS thus in a sense loads the OS, this is the same phenomenon as Baron Münchausen rescuing himself from a swamp by pulling on his bootstraps—booting.) Control then passes to the OS. The checks performed before the OS starts can be quite complex (◼ cf. TPM, trusted platform module).

This material provides a very rough introduction to operating systems, and many things only appear when their related security is discussed. Suitable introductory texts can be found online for obtaining a more detailed perspective. An entire book is also available: ”Operating Systems: Three Easy Pieces”. Much more is involved than what security considerations require, and the OS is a very complex piece of software. One can be convinced of this by examining a diagram of the Linux kernel organized by functionality and layers.

What does the program counter have to do with the operating system?

The operating system as separator and integrator

In order for demanding security policies to be realized, ◼ the computer hardware may be required to have tamper-resistance properties, and at least part of the operating system must be especially trustworthy. Here, hardware and the system that uses it are examined only under “ordinary” requirements, and even then the focus is on security aspects related to program execution. Central to this is separation. ◼ In short, users, that is, their processes, are kept sufficiently separate from each other (confidentiality and integrity) and from the operating system and hardware (integrity).

This is necessary even when a user fully owns the device. Device settings and operating system structures are so complex that ◼ accidental modification can compromise data in one way or another—including by disrupting processing altogether (availability). Many OS structures are related to availability, but in such cases the issue is mainly efficiency and fairness, and these aspects do not normally fall under security. In critical, real-time applications, the operating system must ensure that certain processes always receive processor time when they need it.

Important basic OS structures in this general view are as follows. Items in brackets are mentioned to help grasp the whole; they relate to operating system security tasks but are somewhat further from the hardware.

  • processes: privileges, interrupts, threads, concurrency, scheduling. ◼ [user identifiers, groups, authentication, accounting]
  • memory: different memory types (ROM, RAM…), addressing structure, reference techniques, virtual memory.
  • files and peripherals. ◼ [access rights, search paths, confinement, “mounting” resources]

Behind the OS separation task lies the modern OS need to manage numerous shared resources used by multiple actors. In principle, the approaches are as follows:

  • Sequentially, one entity at a time: printers, and in some contexts, tape devices.
  • Sequentially via an interrupt mechanism: processes on the CPU; one special concern is preventing deadlock. The scheduler (dispatcher) tries to ensure performance and fair sharing so that no process starves.
  • In parallel: process memory allocation and files, and possibly management of parallel processors. In addition to separating actors, the OS must manage memory and disk space without fragmenting them into pieces that are either too small to use or so numerous as to slow execution excessively.
  • Pipelined or staged: as in the internal implementation of the processor itself, which is not an OS concern. Speculative execution is close to this; see the discussion of attacker models in that section.

◼ Only the first is strict separation, and not even in the sense required by multi-level security mechanisms (MLS). Even in sequential process execution, it may be necessary to ensure that registers are cleared. The use of shared data structures could be thought of as “overlapping”. Nevertheless, the OS must ensure exclusion via locks so that it, too, proceeds sequentially without collisions. This means, among other things, that outdated data is not written over newer data.
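The exclusion just described — a lock guaranteeing that concurrent updates proceed one at a time, so stale data never overwrites newer data — can be illustrated in miniature with a lock protecting a shared counter (Python sketch; without the lock, interleaved read-modify-write steps could lose updates):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n: int) -> None:
    # Each iteration is a read-modify-write of shared state; the lock
    # makes it atomic with respect to the other threads.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 — no update was lost
```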

Despite separations, information exchange between processes can occur in many ways. One administrative goal is the synchronization of processes doing the same work. Common mechanisms include:

  • files
  • memory (between threads of the same process and via segmentation)
  • signals, delivered to a process via its control block; these can be used, for example, to terminate a process
  • semaphores, special data structures whose purpose is to implement exclusion
  • message passing via various mechanisms:
    • mailboxes
    • queues (e.g., pipes and named pipes)
    • message channels (e.g., sockets)

Above, four major challenges for the OS related to concurrently progressing interdependent processes have been mentioned: exclusion, synchronization, deadlock, and starvation. These topics are covered in the aforementioned book’s second “easy piece” under the heading Concurrency.
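Message passing through a queue, one of the mechanisms listed above, can be sketched with Python's thread-safe queue as an in-process stand-in for pipes and mailboxes (names are illustrative; a bounded queue also gives implicit synchronization, since the producer blocks when it is full and the consumer when it is empty):

```python
import queue
import threading

# A bounded queue acts like a mailbox or pipe between two concurrent
# actors; the "STOP" sentinel is a common way to end the conversation.
mailbox: "queue.Queue[str]" = queue.Queue(maxsize=4)
results = []

def consumer() -> None:
    while True:
        msg = mailbox.get()
        if msg == "STOP":
            break
        results.append(msg.upper())

t = threading.Thread(target=consumer)
t.start()
for word in ["files", "signals", "semaphores"]:
    mailbox.put(word)
mailbox.put("STOP")
t.join()
print(results)  # ['FILES', 'SIGNALS', 'SEMAPHORES']
```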

In the list of basic structures above, the bracketed items are discussed in a later section, with the exception of confinement. Confinement means that a process’s view of the file system is restricted to a specific directory and its subdirectories. Which of the following Linux commands is for this purpose? To interpret the names, note that ch=change, acl=access control list, grp=group, mod=mode, own=owner, root=root.
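The idea of confinement — restricting a process’s view to one subtree of the file system — can be sketched as a userland path check that resolves symlinks and ".." components before deciding whether a reference stays inside the permitted subtree (illustrative Python; the directory names are made up, and a real OS enforces this at a much lower level):

```python
from pathlib import Path

def is_confined(path: str, root: str) -> bool:
    # Resolve symlinks and ".." components first, then check that the
    # result still lies inside root; checking the raw string would be
    # fooled by "../" escapes.
    resolved = Path(root, path).resolve()
    return resolved.is_relative_to(Path(root).resolve())

print(is_confined("logs/app.log", "/srv/jail"))      # True
print(is_confined("../../etc/passwd", "/srv/jail"))  # False — escape attempt
```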
The security objective of availability appears relatively little in the later operating-system modules; the reason was mentioned above. Availability is nevertheless associated here with several things—all but one of the following; which one?

Processes in operating systems

Applications executed by the machine’s processor usually consist of multiple processes. In addition, there are numerous processes through which the operating system (OS) performs management tasks, in particular ensuring that processes—and thereby each application—receive sufficient processor time to function properly. This is done in such a way that as large a portion of processor time as possible is spent on application processes rather than administration. Administrative overhead is caused especially by process switching and the many transfers and checks associated with switching.

A process may be privileged or a user process, or in some systems its protection state (mode) may be something in between. This separation can be implemented based on hardware-provided protection bits; for example, already in Intel’s historical 80386/486 there are two bits that give four modes, or protection rings. User processes are on the outermost ring and obtain services from privileged processes by invoking them like subroutines. Other rings may include, from the inside outward, the kernel, other OS components, and utilities. ◼ Through these modes, hardware-level multi-level security can be achieved. Similarly, multi-level security is supported by the fact that processes operate in their own address spaces and communicate with each other only via OS-provided services.

Both Unix and Windows use only the innermost and outermost of the mentioned protection rings: in Windows “kernel mode” and “user mode”, in Unix rings 0 and 3. In Windows, user programs invoke the operating system via APIs, and communication between user processes and operating-system subsystems is handled by the Local Procedure Call facility. ◼ One fundamental security-related mechanism in Windows is locking for managing concurrency, by which a user can prevent others from accessing an object.

Threads (”lightweight processes”) within the same process share the same address space. This makes context switching between them more efficient, but on the other hand ◼ the OS cannot apply security mechanisms between them in the same way as between processes.
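The shared address space of threads can be demonstrated directly: both threads below mutate the very same list object, with no OS-enforced boundary between their memories, unlike separate processes (Python sketch):

```python
import threading

shared = []  # one object in the single address space all threads see

def appender(tag: str) -> None:
    # Each thread appends to the same list; nothing in the OS
    # separates their views of this memory.
    for i in range(3):
        shared.append(f"{tag}{i}")

t1 = threading.Thread(target=appender, args=("a",))
t2 = threading.Thread(target=appender, args=("b",))
t1.start()
t2.start()
t1.join()
t2.join()
print(sorted(shared))  # ['a0', 'a1', 'a2', 'b0', 'b1', 'b2']
```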

An interrupt means that a currently executing process is temporarily set aside and something else is done that is more important at that moment (for example, something the program cannot “do itself”) or “fairer” (such as running another user’s process in between). There are many kinds of interrupts:

  • software interrupts, caused by the executing process itself:
    • programmed: the process calls the OS. System calls are needed especially for I/O, when a process uses a peripheral device.
    • program errors (e.g., division by zero, invalid memory reference)
  • external interrupts
    • clock, e.g. 100 times per second
    • I/O interrupts, which occur when a transfer completes. At that point, a process’s status must be changed to ready.

Interrupts have a hierarchy that can also be considered a security structure: a lower-level interrupt does not interrupt the handling of a higher-level one. External interrupts can additionally be temporarily blocked by setting a mask, so that the OS can allocate time to a more important process. There are, however, some fault interrupts (e.g., memory problems) that cannot be masked.

An interrupt may thus mean switching to execute another user’s process, but first there is always a transition to the OS interrupt handler, whose address is read from the interrupt vector. ◼ From a security standpoint, it is crucial that no unauthorized party can modify that address.
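A user-space analogue of the interrupt vector is the signal mechanism: the OS keeps a per-process table of handler addresses, and delivering a signal transfers control to the registered handler — which is exactly why that table must be protected from tampering. A POSIX-only Python sketch:

```python
import signal

events = []

def handler(signum, frame):
    # Control arrives here because our address was registered in the
    # process's signal table — a user-space "interrupt vector" entry.
    events.append(signum)

signal.signal(signal.SIGUSR1, handler)  # install the "vector entry"
signal.raise_signal(signal.SIGUSR1)     # deliver the signal to ourselves
print(events == [signal.SIGUSR1])       # True
```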

Information about running processes is stored in OS tables, from which utilities such as Unix’s ps or lsof (process status, list open files) obtain their data. ◼ A skilled attacker may cause their own process not to appear in ps output, but even then the issue is not that the information is absent from the process table. Rather, the attacker has modified ps or the mechanism by which its output is displayed. Instead of skill, an attacker may use a well-crafted exploitation kit (rootkit).

The process table contains process control blocks (PCB), which include information such as:

  • identifiers for the process, its user, and the current effective user (cf. SUID programs); also the protection mode, if it is not derived from the user
  • state information, in particular whether the process is running, waiting to run, waiting for an I/O operation to complete or for some other event (and which event), or possibly swapped out
  • information that must be loaded into registers when execution resumes, especially the program counter and the bounds of the process’s memory region
  • priority, both current and base level
  • list of open files
  • information about other required resources
  • data used to track CPU time and other resource usage and that may be used for accounting
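A toy version of a process control block, holding a handful of the fields listed above, can be sketched as a data structure together with the step a scheduler performs on a context switch (illustrative Python; real PCBs are far richer, and the field names here are made up):

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    # A miniature process control block; real ones also hold saved
    # registers, memory bounds, accounting data, and much more.
    pid: int
    uid: int
    state: str = "ready"      # ready / running / waiting / swapped
    program_counter: int = 0
    priority: int = 0
    open_files: list = field(default_factory=list)

def dispatch(pcb: PCB, saved_pc: int) -> None:
    # On a context switch the scheduler restores the saved program
    # counter and marks the process as running.
    pcb.program_counter = saved_pc
    pcb.state = "running"

p = PCB(pid=42, uid=1000)
dispatch(p, saved_pc=0x4000)
print(p.state, hex(p.program_counter))  # running 0x4000
```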
How do processes, threads, and interrupts mainly relate to each other?

Memory management

Main memory is a data store that programs access directly—without having to specify a storage medium on which the data resides. This type of memory also exists in different forms, of which ◼ from a security perspective it is important to distinguish memory that can only be read (ROM) from memory that processes can, at least in principle, write to. Naturally, the former can be trusted more than the latter. Intermediate forms include EPROM, EEPROM (cf. smart cards), and WROM (write-once memory).

Some memory, although it belongs to a process address space and thus to main memory, is in fact stored on a secondary storage device as part of the implementation of virtual memory—that is, it does not belong to the processor’s memory space. In such structures, ◼ the separateness, even removability, of secondary storage (disk) must be considered. Can someone access it via the hardware? Does plain data (non-encrypted) remain if the system crashes?

In addition to secondary storage and main memory, the OS also manages cache memory, the fastest type of memory. It tries to anticipate which data a process will need next, keeping such data in cache once it has been fetched from main memory. Cache therefore sits not between secondary and main memory, but between main memory and processor registers. (Cf. web browsing cache or DNS cache.)

Actual use of main memory occurs when the OS receives a memory reference R from a user process. Then one of the following happens: the OS (or its MMU, memory management unit)

  • splits and reassembles R based on what it has read from various address tables, which form a tree-like structure so that the size of each table remains reasonable;
  • adds a base address to the relative part (“offset”) contained in R;
  • checks whether R falls within an area permitted to the process. This realizes the reference monitor idea. In modern systems, entries in address tables contain, in addition to pointers, several flags (bits) by which access can be restricted, in particular allowing access only to the operating system itself.
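The simplest form of this check, base-and-bounds translation, fits in a few lines: the reference is relative to a base address, and anything outside the permitted region is rejected before it ever reaches memory — the reference-monitor idea in miniature (Python sketch; the addresses are made up):

```python
def translate(ref: int, base: int, limit: int) -> int:
    # Reject out-of-bounds references before translating; a real MMU
    # raises a hardware trap here instead of an exception.
    if not 0 <= ref < limit:
        raise MemoryError(f"reference {ref:#x} outside process bounds")
    return base + ref

print(hex(translate(0x10, base=0x8000, limit=0x100)))  # 0x8010
try:
    translate(0x200, base=0x8000, limit=0x100)
except MemoryError as e:
    print("trap:", e)
```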

These operations depend on the memory structure, which offers the following possibilities:

  • If there is only one user (as in a smart card), separating the user’s process(es) from the OS memory area requires only a fixed boundary (fence) beyond which references are not allowed. A natural implementation uses relative addresses, which are relocated to physical ones only when the program is loaded.
  • In multi-user systems, both lower and upper bounds are required, and in addition, because processes typically have dynamic data, limits are needed for that as well.
  • If processes need to share memory, or if a particular process wants to protect part of the memory it references, for example from writing, ◼ memory units (words or somewhat larger but still small units) can be equipped with access-control bits. Such tagged architecture consumes memory but is flexible. Despite its advantages, it did not become widespread mainly for compatibility reasons, although renewed interest emerged in the late 2010s (cf. ARM MTE, CHERI, SPARC ADI). For comparison, modern Intel and ARM processors provide no‑execute support via NX or XD (No Execute / Execute Disabled) bits. These permissions are encoded in page‑table entries, whereas tagged architectures associate protection information with memory itself, not with pages.
  • Segmentation: the program (and data areas) is divided into parts that are logical named entities and placed in contiguous memory areas. ◼ The OS maintains an address table and all memory references go through it, enabling various security checks (also for multiple referencing processes). On the other hand, especially dynamic references outside a segment must be monitored. Segmentation has its own implementation overhead and also causes memory fragmentation. In Unix, segmentation is strictly pragmatic: each process has only the segments “text” for program code, “data” for non-executable program parts such as variables and constants, and “stack” for temporary data of variable size. Segmentation is giving way to paging.
  • Paging divides a program into fixed-size pieces that are placed in memory and referenced similarly to segments. Memory is filled efficiently and page boundary crossings are conveniently handled as address overflows to a new page. The OS manages page allocation, but ◼ cannot isolate access control to the same degree as with segmentation.
    • By paging segments, the benefits of both systems can be achieved, at the cost of an additional step in reference handling.
    • With virtual memory, the page table also contains a presence bit. If it is 0 and the operation is a read, a page fault occurs and the process waits until the page is fetched from secondary storage. Even for write operations, the page must at some point be fetched in full so that it can be written back in modified form.
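A page-table lookup with a presence bit can be sketched as follows: the virtual address splits into a page number and an offset, and a cleared presence bit triggers a page fault instead of a memory access (Python sketch; the frame numbers and page size are illustrative):

```python
PAGE_SIZE = 4096

# page number -> (frame number, present bit); frame None models a
# page currently living on secondary storage.
page_table = {0: (7, True), 1: (None, False)}

def access(vaddr: int) -> int:
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame, present = page_table[page]
    if not present:
        # A real OS would suspend the process here until the page has
        # been fetched from disk; we just report the fault.
        raise LookupError(f"page fault on page {page}")
    return frame * PAGE_SIZE + offset

print(access(100))  # 7*4096 + 100 = 28772
try:
    access(PAGE_SIZE + 5)
except LookupError as e:
    print(e)
```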

File systems

Persistent storage of data on secondary storage devices resembles the management of main memory. In secondary storage, however, the logical unit of data is a file, and files are organized into directories. From a cybersecurity perspective, file management involves more challenges, because files are handled in more ways than main memory, there are often more actors involved, storage locations may vary in quality, and storage typically takes longer.

The following is a partially structured list of what is associated with a file. ◼ If any of these could be said to have less connection to cybersecurity than the others, it would perhaps be performance.

  • name — structure — type
  • access — attributes
  • operations (creation, opening, lookup, reading, writing, closing, deletion) — system calls
  • file representation in memory: buffer, i.e. disk cache, which is different from the main memory cache
  • directory structures
    • single-level — hierarchical
    • paths
    • directory operations
  • file system implementation
    • file and directory descriptors
    • storage structure
    • file location, possibly on multiple disks; caches
    • shared files
    • performance — reliability
    • disk space management
    • log-structured file systems
    • interpretations — encodings, especially in the case of multimedia

◼ Insofar as the file system is not examined from the perspective of secondary storage devices or virtual memory, the associated security issues are closer to the user than to the hardware. In Unix, even peripheral devices and their drivers are abstracted, as all of them are treated like files.

Alongside the Unix file system, well-known ones include Windows FAT and NTFS (file allocation table and New Technology File System). Also important are network file systems such as NFS (1984 — version 4.2: RFC 7862, 2016) and Windows SMB (Server Message Block; its free implementation is known as Samba). The latter allow files located on another machine to be accessed as if they were local. NFS operates over the UDP protocol (and nowadays also TCP) and continues to function “automatically” after a server crash. ◼ The NFS system has been modified to take Kerberos authentication into account during the startup phase. With the SMB protocol, not only files but also, for example, printers can be shared. Platform-independent further development (since 1996) is known as CIFS, Common Internet File System.

The operating system as a device controller

The core functions of an operating system can be considered the management of the processor and memory, and in comparison all other devices are merely “auxiliary” (or even peripherals). For these, the OS contains separate processes that “know” the structure of the device and hide it from other parts of the OS. These processes are called device drivers. The central OS-managed mechanisms for I/O between the processor and peripherals are queues and buffers. Queues store (or link to) the parameters of I/O requests waiting to be processed. Buffers are data structures into which the stream of bits being transferred flows and from which it is consumed by the target at its own pace. The idea is that the sender of bits can write as efficiently as possible, even if the receiver is not constantly reading.
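The buffer idea above can be sketched as a fixed-size ring buffer: the producer (say, a driver filling it from a device) and the consumer advance independent indices, so the sender need not wait for the receiver on every byte (illustrative Python; a real kernel buffer would add locking and blocking):

```python
class RingBuffer:
    # Producer writes at head, consumer reads at tail; count tracks
    # how many bytes are in flight between them.
    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0  # next write position
        self.tail = 0  # next read position
        self.count = 0

    def put(self, b: int) -> bool:
        if self.count == self.size:
            return False  # buffer full — producer must wait
        self.buf[self.head] = b
        self.head = (self.head + 1) % self.size
        self.count += 1
        return True

    def get(self):
        if self.count == 0:
            return None  # buffer empty — consumer must wait
        b = self.buf[self.tail]
        self.tail = (self.tail + 1) % self.size
        self.count -= 1
        return b

rb = RingBuffer(4)
for byte in b"spool":
    rb.put(byte)  # the fifth byte is refused: the buffer is full
out = bytes(iter(rb.get, None))
print(out)  # b'spoo'
```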

Actual use of devices takes place via their controllers. These can be quite intelligent, allowing drivers to operate at a higher level of abstraction. For example, network controllers may implement complex protocols. In addition, between drivers and controllers there may be separate hardware accelerators, and some controllers can transfer data directly between the device and main memory without CPU involvement (DMA, direct memory access). Devices are connected to device buses. Two standards important for personal computers are SCSI and USB (small computer system interface; universal serial bus).

Although drivers were said to be separate “providers of abstraction”, they nevertheless often belong to the OS kernel. They are important for performance and ◼ in any case require more privileges at some point than user processes. This can lead to security problems, since drivers are produced by many vendors and new ones may have to be introduced into the operating system. One solution to this, and more generally to running untrusted code, is ◼ the wrapper technique: this involves installing a new software layer between programs and resources, without requiring changes on either side. It thus “wraps” the program and forwards all or some of its system calls. The wrapper can be configured to wait for certain kinds of calls, and when they occur it can block the call, modify it, or modify its return value. If the wrapped target is in the kernel, the wrapper itself must also be there. In practice, wrappers for application programs can also be loadable kernel modules.
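The wrapper technique can be sketched with a decorator interposed between callers and a stand-in “system call”: it forwards, audits, or refuses operations without changing either side. A Python sketch — `fake_syscall` and the operation names are hypothetical, standing in for a real call-dispatch point:

```python
import functools

def wrap_calls(blocked=(), audit=None):
    # Interposes a layer between callers and the wrapped target:
    # every call can be logged, blocked, or passed through unchanged.
    def deco(func):
        @functools.wraps(func)
        def wrapper(op, *args):
            if audit is not None:
                audit.append(op)
            if op in blocked:
                raise PermissionError(f"{op} blocked by wrapper")
            return func(op, *args)
        return wrapper
    return deco

log = []

@wrap_calls(blocked={"unlink"}, audit=log)
def fake_syscall(op, *args):
    # Hypothetical stand-in for a real system-call entry point.
    return f"{op} ok"

print(fake_syscall("open", "/tmp/x"))  # open ok
try:
    fake_syscall("unlink", "/tmp/x")
except PermissionError as e:
    print(e)
print(log)  # ['open', 'unlink'] — both calls were audited
```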

An important multi-level driver structure is related to disk files. To user processes, these are naturally presented as streams of characters, whereas disk operations are most efficiently performed in blocks. To implement this, the OS should maintain a disk buffer in main memory. ◼ This involves an integrity problem: if a process or the entire machine crashes, the file on disk may not correspond to the view that the process last had of it.
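The window for this integrity problem can be narrowed (though not eliminated) by flushing buffers explicitly: a flush empties the user-space buffer into the OS disk cache, and `fsync` asks the OS to push that cache to the device. A Python sketch (the file name is arbitrary):

```python
import os
import tempfile

def durable_write(path: str, data: bytes) -> None:
    # flush() drains the user-space buffer into the OS disk cache;
    # os.fsync() then asks the OS to commit that cache to the device,
    # shrinking the crash window between "written" and "on disk".
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

p = os.path.join(tempfile.gettempdir(), "durable_demo.bin")
durable_write(p, b"committed")
with open(p, "rb") as f:
    print(f.read())  # b'committed'
```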

Secure-by-Default

At the beginning of this module, it is stated that there are many kinds of devices running operating systems. There are also many kinds of operating system users. One is comfortable with the command line or editing the Windows registry; for another, even creating a folder may be unfamiliar. With respect to consumer devices and their operating systems (Windows, Mac, Android, iOS), it cannot be assumed that the average user has sufficient knowledge and skills to configure the device securely. Therefore, default settings should be secure.

A practical example of this is that Windows machines have Microsoft’s cybersecurity software enabled by default, along with a preconfigured firewall. Microsoft’s cybersecurity software also disables itself if the user installs a third-party cybersecurity product. Correspondingly, it automatically re-enables itself if the third-party product is no longer detected. The motivation is that two simultaneously running cybersecurity programs would only get in each other’s way, weakening security. Automatic reactivation is intended to avoid a situation where the user is left completely unprotected if, for example, they remove a third-party cybersecurity product after a trial period expires.

You may have noticed that Linux distributions were not mentioned at all in the list of operating systems in the first paragraph of this green box. This is because secure-by-default is not implemented with the same rigor even in the most common Linux distributions, such as Ubuntu. For example, the firewall is not enabled by default. In discussions of this topic, the justification often given is that a standard Ubuntu installation does not by default include background processes or servers bound to internet ports*—and thus a firewall is not needed**. On the other hand, the default state of the firewall is not mentioned at any stage of the installation, and using many programs without one leaves the system quite vulnerable. Setting up a firewall is therefore entirely the user’s responsibility, yet a user accustomed to another system (e.g. Windows) may easily assume that the default settings keep them sufficiently safe. This is thus a counterexample to secure-by-default. It is also worth noting that different systems have very different security cultures. If you are reading this as a future software designer or developer, the course staff encourage leaning toward secure-by-default.

*Side note: background processes are referred to as services in Windows; in Linux the term is daemon.

**From a security perspective, this justification is rather weak. It is also difficult to appeal to minimalism, given that Ubuntu offers to install various secondary components during installation, yet does not say a single word about the firewall.
