Operating system hardening

The best way to secure operating systems and virtual machines is to build them without vulnerabilities, which in practice means that security is integrated by design (in accordance with the design principles). For example, one can use formal methods to prove that certain classes of errors cannot exist in software or hardware and that the system is functionally correct. Scaling such proofs to very large systems remains difficult, but the field is advancing rapidly and has reached the stage where key components—such as microkernels, file systems, and compilers—have been verified against formal specifications. Moreover, not all components need to be verified: to guarantee isolation, a secure microkernel or hypervisor together with a few additional components is sufficient. Verifying other components may be desirable, but it is not strictly necessary for isolation. Naturally, any proof is only as good as the underlying specification—if the specification is flawed, a perfectly conforming implementation does not help.

Despite these efforts, it has not been possible to eliminate all vulnerabilities from large, real-world systems. To defend against the attacks described in the threat model, modern operating systems employ a range of techniques that complement the isolation and mediation mechanisms discussed in the preceding section. Here, five categories of defenses used to harden operating systems are considered:

  • Information hiding. The idea is to prevent attackers from locating the memory regions an attack needs (code, control data, and other targets of control transfers) by randomizing their locations between executions.
  • Control-flow restrictions, which aim to ensure that execution follows only those paths intended by the programmer. While attackers often target control-flow transfers, these restrictions also help mitigate accidental errors.
  • Partitioning, which augments the now-standard separation of data and executable code. The most important partition lies between the kernel and user processes, enforced by protection rings; side-channel vulnerabilities such as Meltdown have pushed newer designs to add protection in the opposite direction as well, isolating the kernel from user-space memory rather than only the reverse.
  • Code and data integrity checks. These extend older techniques such as hash-based file integrity checks. A newer development is remote attestation, in which an external party verifies the system’s integrity, particularly during the boot process.
  • Anomaly detection. This is a runtime counterpart and continuation of integrity checking, and a special case of monitoring a system against its allowed behavior. Rather than comparing against strict allow lists, observed behavior is typically compared to a learned notion of normality.

These hardening measures are implemented by operating-system designers and developers, often in cooperation with processor manufacturers. An organization deploying operating systems on its computers can additionally enable IDS solutions on hosts or networks to raise alerts on anomalies. Beyond that, systems should be hardened during deployment by reducing the attack surface. This process begins with changing default passwords. Unnecessary software and services should be removed entirely or at least disabled. Likewise, unnecessary user accounts or local login accounts should be removed; on server systems, the only remaining local login account is often root.

Two questions to consider:

  • Which of the five hardening techniques has the least effect on the size of the attack surface—that is, on what an attacker can reasonably even attempt?
  • It is often said that keeping the operating system up to date is crucial. Why is it not mentioned explicitly above?

Information hiding (advanced)

One of the most important defensive techniques in most modern operating systems is hiding information of interest to attackers. In particular, by randomizing the locations of all relevant memory regions—code, heap, global data, and stack—attackers cannot easily determine where to redirect control flow or identify which addresses contain sensitive data. The term Address Space Layout Randomization (ASLR) was coined in 2001, when this randomization was implemented in the Linux kernel (also discussed in software security). Similar features soon appeared in other operating systems. The first general-purpose systems to enable ASLR by default were OpenBSD in 2003 and Linux in 2005. Windows and macOS followed in 2007. These early implementations randomized only user-space address layouts; kernel randomization, known as Kernel ASLR (KASLR), appeared in major operating systems roughly a decade later.

The idea of KASLR is simple, but its design involves several non-trivial decisions. For example, how random should random be? Specifically, which parts of an address should be randomized? Suppose the Linux kernel’s code occupies 1 GiB (= 2^30 bytes) of address space and can start at any 2 MiB (= 2^21 bytes) boundary. The amount of entropy available for randomization is 30 − 21 = 9 bits. In other words, at most 512 guesses are needed to locate the kernel code (the same figure follows directly from 1 GiB / 2 MiB = 512). If an attacker finds a vulnerability that allows redirecting kernel control flow from user space to a guessed address, attacking a few hundred machines might suffice to achieve a reasonable probability that at least one guess succeeds—even if the others crash in the process.
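The arithmetic is easy to check in code. The sketch below recomputes the entropy and the number of possible base addresses from the (hypothetical) region size and alignment used above:

    #include <stdio.h>

    int main(void) {
        /* Hypothetical KASLR parameters from the text:
         * 1 GiB of kernel address space, 2 MiB alignment. */
        unsigned long region_bits = 30;  /* 1 GiB = 2^30 bytes */
        unsigned long align_bits  = 21;  /* 2 MiB = 2^21 bytes */

        unsigned long entropy = region_bits - align_bits;  /* 9 bits */
        unsigned long slots   = 1UL << entropy;            /* 512 possible bases */

        printf("entropy: %lu bits, possible base addresses: %lu\n",
               entropy, slots);
        /* A uniformly random guess succeeds with probability 1/512 per
         * attempt, so a few hundred independent tries give a good chance
         * of at least one hit. */
        return 0;
    }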

Another critical design decision concerns what to randomize. Most current implementations use coarse-grained randomization: they randomize the base address of code, heap, or stack, but all elements within remain at fixed offsets relative to that base. This approach is simple and fast. However, once an attacker obtains even a single code pointer—via an information leak, for example—they can infer the addresses of all instructions. The same applies to heap, stack, and other structures. It is therefore unsurprising that information leaks are highly prized by attackers today.
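The following sketch shows why one leak is enough under coarse-grained randomization. The symbol offsets and the leaked address are invented for illustration; in practice an attacker would read the real offsets from the kernel binary:

    #include <stdio.h>

    /* Hypothetical link-time offsets of two kernel functions, known to
     * the attacker from the kernel image. */
    #define OFFSET_LEAKED_FN 0x1a2b30ULL  /* function whose address leaked */
    #define OFFSET_TARGET_FN 0xfe4420ULL  /* function the attacker wants */

    int main(void) {
        /* Suppose an information leak revealed this runtime address. */
        unsigned long long leaked = 0xffffffffa01a2b30ULL;

        /* Coarse-grained KASLR slides the whole code region by a single
         * base offset, so one leaked pointer pins down the entire layout. */
        unsigned long long base   = leaked - OFFSET_LEAKED_FN;
        unsigned long long target = base + OFFSET_TARGET_FN;

        printf("inferred base:   0x%llx\n", base);
        printf("inferred target: 0x%llx\n", target);
        return 0;
    }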

Finer-grained randomization is possible. For example, randomization can occur at the level of pages or functions. If function order within a memory region is randomized, knowing the base address of kernel code alone is insufficient for an attacker. One can go further and shuffle memory blocks, instructions (possibly inserting unused instructions), or even register allocations. Such fine-grained techniques incur costs in space and performance.

Fine-grained randomization can also be applied to data. Research prototypes have scattered heap allocations, global variables, and even stack variables throughout memory. Naturally, this too introduces performance and memory overhead.

Even with KASLR, coarse randomization is only a weak defense against memory-error exploitation. Numerous publications have demonstrated that KASLR can often be bypassed via information leaks, pointer disclosures, or side channels.

Control-flow restrictions (advanced)

Regulating the operating system’s control flow targets a different dimension of defense than hiding memory locations. By ensuring that attackers cannot redirect execution to arbitrary code, exploitation of memory errors is hindered even if such errors cannot be eliminated. The best-known example is Control-Flow Integrity (CFI). Many toolchains (such as LLVM and Microsoft Visual Studio) support CFI, and it has been incorporated into the Windows kernel as Control Flow Guard since 2017.

Conceptually, CFI is straightforward: ensure that program execution always follows the static control-flow graph. A function return should transfer control only to the point from which the function was called. An indirect call—via a function pointer in C or a virtual function in C++—should only target the entry points of functions that the code is allowed to call. To implement this protection, all legitimate targets of indirect control transfers (returns, indirect calls, and indirect jumps) are identified and associated with the respective instructions. At runtime, each control transfer is checked against this set; if it falls outside the allowed targets, CFI raises an alert or terminates the program.
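A minimal sketch of such a check for indirect calls appears below. The allowed-target table is written by hand here; a real CFI implementation derives it from the control-flow graph at compile time and emits a much faster inline check:

    #include <stdio.h>
    #include <stdlib.h>

    typedef void (*handler_t)(void);

    static void handler_a(void) { puts("handler_a"); }
    static void handler_b(void) { puts("handler_b"); }
    static void not_allowed(void) { puts("should never run"); }

    /* Legitimate targets for this call site, which an instrumenting
     * compiler would compute statically from the control-flow graph. */
    static const handler_t allowed_targets[] = { handler_a, handler_b };

    static void cfi_checked_call(handler_t fn) {
        for (size_t i = 0;
             i < sizeof(allowed_targets) / sizeof(allowed_targets[0]); i++) {
            if (fn == allowed_targets[i]) {
                fn();  /* target is in the allowed set: proceed */
                return;
            }
        }
        /* Control-flow violation: alert and terminate instead of jumping. */
        fputs("CFI violation: illegal indirect call target\n", stderr);
        abort();
    }

    int main(void) {
        cfi_checked_call(handler_a);    /* allowed: runs normally */
        cfi_checked_call(not_allowed);  /* not in the set: aborts */
        return 0;
    }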

As with ASLR, CFI exists in many variants, ranging from coarse-grained to fine-grained, and from context-insensitive to context-sensitive. As with ASLR, most current implementations employ only the simplest, coarsest forms. Coarse CFI relaxes rules for performance reasons. For instance, instead of restricting a function return to the precise call site, it might be allowed to return to any possible call site. While weaker than fine-grained CFI, this still significantly reduces attackers’ freedom and is much faster to check at runtime.

On modern systems, some forms of CFI are—or will be—supported directly by hardware. Intel’s Control-flow Enforcement Technology (CET) supports shadow stacks for return integrity and indirect branch tracking for forward-edge integrity, albeit in a coarse-grained manner. ARM processors provide pointer authentication, which prevents tampering by storing a pointer authentication code (PAC) in the upper bits of pointer values. This mechanism functions similarly to a message authentication code (MAC).
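The toy model below captures the idea behind pointer authentication. The keyed hash, the 16-bit PAC width, and the key value are all invented stand-ins: real hardware computes the PAC with dedicated instructions using per-context keys and a block cipher rather than this hash.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static uint64_t secret_key = 0x9e3779b97f4a7c15ULL;  /* per-process secret */

    /* Toy keyed hash standing in for the hardware MAC. */
    static uint64_t toy_mac(uint64_t ptr, uint64_t context) {
        uint64_t x = ptr ^ context ^ secret_key;
        x ^= x >> 33; x *= 0xff51afd7ed558ccdULL; x ^= x >> 33;
        return x >> 48;                 /* keep 16 bits for the PAC */
    }

    /* Sign: store the PAC in the unused upper bits of the pointer. */
    static uint64_t pac_sign(uint64_t ptr, uint64_t ctx) {
        return ptr | (toy_mac(ptr, ctx) << 48);
    }

    /* Authenticate: recompute the PAC and strip it, or abort on mismatch. */
    static uint64_t pac_auth(uint64_t signed_ptr, uint64_t ctx) {
        uint64_t ptr = signed_ptr & 0x0000ffffffffffffULL;
        if ((signed_ptr >> 48) != toy_mac(ptr, ctx)) {
            fputs("pointer authentication failed\n", stderr);
            abort();
        }
        return ptr;
    }

    int main(void) {
        uint64_t p = 0x00007f0012345678ULL;
        uint64_t s = pac_sign(p, /*ctx=*/1);
        printf("authenticated: 0x%llx\n",
               (unsigned long long)pac_auth(s, 1));
        pac_auth(s ^ 0x10, 1);  /* a tampered pointer fails the check */
        return 0;
    }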

Unfortunately, CFI only mitigates attacks that corrupt control flow—such as tampering with return addresses, function pointers, or jump targets—and is ineffective against other attacks. For example, it cannot prevent memory corruption that elevates the privilege level of the current process (e.g. by setting its effective UID to root). Given the success of control-flow protections, one may ask whether analogous protections are possible for data. The answer is “yes, but.” Data-Flow Integrity (DFI) statically specifies, for each load instruction, which store instructions may have produced the value. Store instructions are labeled, and the allowed label sets are recorded. At runtime, each memory byte is associated with the label of the last instruction that wrote to it. When a load occurs, the system checks whether the last store belongs to the permitted set; if not, an alert is raised. Unlike CFI, DFI has not seen wide deployment, apparently due to its significant performance overhead.
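The bookkeeping DFI requires can be sketched as follows: a shadow array records the label of the last store to each byte, and every load checks that label against its statically computed set. The labels and allowed sets are hand-written here; a real implementation derives them by static analysis and instruments every memory access, which is where the overhead comes from.

    #include <stdio.h>
    #include <stdlib.h>

    #define MEM_SIZE 16

    static unsigned char  memory[MEM_SIZE];
    static unsigned short last_writer[MEM_SIZE];  /* shadow: label of last store */

    /* Instrumented store: record which (statically labeled) instruction wrote. */
    static void dfi_store(size_t addr, unsigned char val, unsigned short label) {
        memory[addr] = val;
        last_writer[addr] = label;
    }

    /* Instrumented load: check that the last writer is one of the stores
     * that static analysis says may legitimately produce this value. */
    static unsigned char dfi_load(size_t addr,
                                  const unsigned short *allowed, size_t n) {
        for (size_t i = 0; i < n; i++)
            if (last_writer[addr] == allowed[i])
                return memory[addr];
        fprintf(stderr, "DFI violation at address %zu\n", addr);
        abort();
    }

    int main(void) {
        unsigned short ok[] = { 1 };         /* allowed writers for this load */

        dfi_store(0, 42, /*label=*/1);       /* expected store instruction */
        printf("%u\n", dfi_load(0, ok, 1));  /* allowed: read succeeds */

        dfi_store(0, 99, /*label=*/7);       /* unexpected writer, e.g. overflow */
        printf("%u\n", dfi_load(0, ok, 1));  /* label not in set: aborts */
        return 0;
    }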

Partitioning (advanced)

Modern operating systems enforce a strict separation between code and data. Each memory page is either executable or writable, but not both. This policy, commonly called W⊕X (write xor execute), prevents executing instructions from data regions and modifying existing code. Without the possibility of code injection, attackers seeking to control execution must reuse existing code. Similar mechanisms prevent sensitive kernel structures—such as system call tables or interrupt vectors—from being modified after initialization. All major operating systems support these mechanisms, typically using hardware support such as the processor’s no-execute bit, although names and details vary (for example, Microsoft’s Data Execution Prevention, DEP).
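W⊕X is visible to ordinary programs too, for example in just-in-time compilers, which must write generated code to a page and only then make it executable. A minimal Linux/x86-64 illustration (the page is never writable and executable at the same time):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 4096;
        /* x86-64 machine code for: mov eax, 7; ret */
        unsigned char code[] = { 0xb8, 0x07, 0x00, 0x00, 0x00, 0xc3 };

        unsigned char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) return 1;

        memcpy(page, code, sizeof(code));  /* write while the page is writable */

        /* Drop write permission before making the page executable (W xor X). */
        if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) return 1;

        int (*fn)(void) = (int (*)(void))page;
        printf("generated code returned %d\n", fn());  /* prints 7 */

        munmap(page, len);
        return 0;
    }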

In a previous section, it was shown how operating systems use CPU protection rings to prevent user processes from accessing arbitrary kernel data or executing kernel code. Protection is also needed in the opposite direction, as demonstrated by recent side-channel attacks (Meltdown, Spectre, and others). Previously, for efficiency reasons, the kernel was mapped into every process’s address space and system calls were handled using the process’s page tables. Linux operated this way from its inception in 1991 until December 2017 (i.e. until Meltdown). Although kernel pages were marked with the supervisor bit, preventing user access, risks remained. For example, a null pointer dereference in kernel code might normally cause only a crash, but if an attacker first maps their own code at virtual address zero, the faulty dereference can instead lead the kernel to execute attacker-controlled code with kernel privileges.

To prevent attackers from injecting malicious instructions into the kernel via user space and to mitigate side channels, stronger isolation than protection rings alone is required. Many processors now support SMAP and SMEP (Supervisor Mode Access Prevention and Supervisor Mode Execution Prevention). When enabled via control-register bits, any attempt by the kernel to access (SMAP) or execute (SMEP) user memory triggers a page fault. Naturally, SMAP must be temporarily disabled when the kernel legitimately needs to access user memory.

On older processors without hardware Meltdown fixes, operating systems such as Linux fully separate kernel page tables from process page tables. The kernel may still access user pages when necessary, but permissions differ. In particular, marking pages as non-executable effectively provides SMEP-like behavior. Mitigating other speculative-execution vulnerabilities (such as Spectre or RIDL) is more problematic. Some defenses aim to disable speculation entirely. Others, like Windows, restrict speculation to code within the same security domain and CPU core, while OpenBSD disabled hyperthreading on Intel processors entirely (from 2018 onward). Until hardware evolves further, it remains unclear how sufficient and composable such mitigations are.

Advanced side-channel attacks exploit aggressive resource sharing in modern systems: multiple security domains share caches, TLBs, branch predictor state, arithmetic units, and more. While sharing improves efficiency, it also creates side channels, as highlighted by the principle of minimizing shared mechanisms. To mitigate such attacks, operating systems may sacrifice efficiency by partitioning resources at a fine granularity. For example, via cache allocation or page coloring, different processes can be given access to disjoint cache regions. Unfortunately, partitioning is not always straightforward, and is currently unavailable for many low-level resources.

Code and data integrity checks (advanced)

One way to reduce the impact of malicious code in an operating system is to ensure that code and data are immutable and originate from a trusted vendor. For example, Windows has for many years required driver signing. Some newer versions go further, combining software checks with hardware assistance so that the system effectively runs only trusted code, including applications. Microsoft refers to this approach as Device Guard. Even privileged malware cannot easily execute unauthorized applications, because enforcement mechanisms reside in a hardware-assisted virtualized environment. Most code-signing solutions focus on operating-system extensions, which the OS verifies for integrity and authenticity. Similar processes are widely applied to updates.

All of this is effective provided that trust has a solid foundation. Here, the trust anchor (cf. concept map) consists of the code that verifies signatures and the operating system itself. But how can one be sure that these have not been compromised by a bootkit? Ensuring the integrity of boot-time software requires multiple stages, typically aligned with the boot process itself. Booting has been multi-stage since the earliest commercial computers. Even the IBM 701 of the early 1950s initiated booting by loading a single word from punched cards.

Secure boot begins with a trust anchor (or root of trust) that initiates the boot process and is typically hardware-based, such as a microcontroller. It starts execution from internal immutable memory or protected internal flash that cannot be reprogrammed (or only under strict authentication). Modern Apple systems, for example, include a T2 Security Chip that provides a hardware root of trust for secure boot. Google has developed a similar processor called Titan. We now examine how hardware-backed trust ensures that a system boots securely.

In general-purpose systems, boot stages begin with firmware, which may load a bootloader, which in turn loads the OS kernel. The kernel may load additional boot drivers until the OS is fully initialized. All these stages must be protected. For example, UEFI can secure the first stage using Secure Boot, verifying that the bootloader is signed with a key matching firmware-stored key material. The bootloader can then verify the digital signature of the OS kernel before loading it. The kernel, in turn, verifies all additional components, such as drivers or integrated anti-malware tools, before execution. By starting anti-malware software early, it can later verify subsequent components and extend the chain of trust to the fully initialized OS.
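Every link in this chain follows the same verify-then-execute pattern, sketched below. The byte comparison is a toy stand-in for real signature verification against firmware-stored key material:

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Toy stand-in for signature verification: real firmware would check
     * an RSA/ECDSA signature against keys stored in the platform. */
    static const unsigned char expected_image[] = "bootloader-v1";

    static bool stage_authentic(const unsigned char *image, size_t len) {
        return len == sizeof(expected_image) &&
               memcmp(image, expected_image, len) == 0;
    }

    /* One link in the chain of trust: the currently running (already
     * verified) stage checks the next stage before handing over control. */
    static void load_next_stage(const unsigned char *image, size_t len) {
        if (!stage_authentic(image, len)) {
            fputs("secure boot: verification failed, halting\n", stderr);
            abort();  /* refuse to extend the chain of trust */
        }
        puts("next stage verified; transferring control");
        /* ...jump to the verified image here... */
    }

    int main(void) {
        load_next_stage(expected_image, sizeof(expected_image));  /* accepted */
        load_next_stage((const unsigned char *)"evil", 5);        /* halts */
        return 0;
    }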

Is the system now securely running? Possibly—but how can this be verified, especially remotely? Attestation allows a remote party to detect system modifications. Attestation typically relies on specialized hardware such as a TPM (Trusted Platform Module), which acts as a root of trust and records step-by-step measurements of loaded software. TPMs support cryptographic operations, key generation and management, secure storage, and crucially, integrity measurements, as also discussed in the module on hardware.

To prove integrity, a TPM may provide cryptographic hashes of all boot-time components. The verifier compares them to known-good values. Often less communication suffices: the TPM forms a hash chain of measurements, stored in PCR registers (Platform Configuration Registers), which are reset to known values at every boot. If the current PCR value is X and a new module with hash Y is measured, the TPM stores hash(X,Y). For remote attestation, the verifier sends a nonce, and the TPM returns a signed response computed over the nonce and PCR values. The verifier thus knows the response is fresh and originates from the TPM, and can reconstruct the chain to ensure that correct code was loaded at every stage.
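The extend operation itself is just a hash chain. The sketch below reproduces it with SHA-256, assuming OpenSSL is available (compile with -lcrypto); in reality the TPM performs the computation internally and never lets software write a PCR directly:

    #include <openssl/sha.h>
    #include <stdio.h>
    #include <string.h>

    #define PCR_LEN SHA256_DIGEST_LENGTH  /* 32 bytes */

    /* PCR_new = hash(PCR_old || measurement); measurements here are
     * assumed to be at most 64 bytes long. */
    static void pcr_extend(unsigned char pcr[PCR_LEN],
                           const unsigned char *measurement, size_t len) {
        unsigned char buf[PCR_LEN + 64];
        memcpy(buf, pcr, PCR_LEN);
        memcpy(buf + PCR_LEN, measurement, len);
        SHA256(buf, PCR_LEN + len, pcr);
    }

    int main(void) {
        unsigned char pcr[PCR_LEN] = { 0 };  /* reset to a known value at boot */
        unsigned char m1[PCR_LEN], m2[PCR_LEN];

        /* Measurements are hashes of the loaded boot components. */
        SHA256((const unsigned char *)"bootloader", 10, m1);
        SHA256((const unsigned char *)"kernel", 6, m2);

        pcr_extend(pcr, m1, PCR_LEN);
        pcr_extend(pcr, m2, PCR_LEN);

        /* A verifier replaying the same measurements in the same order
         * reproduces this value; any deviation breaks the chain. */
        for (int i = 0; i < PCR_LEN; i++) printf("%02x", pcr[i]);
        putchar('\n');
        return 0;
    }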

Integrity checking may also continue at runtime. For example, a hypervisor can perform self-checks on its virtual machines to verify that code and data structures remain intact. This technique is known as Virtual Machine Introspection (VMI). VMI may reside within the hypervisor or in a separate application. Besides code, commonly inspected structures include process lists (is a rootkit hiding?), system call tables, and interrupt vectors.

Anomaly detection (advanced)

A monitor—whether implemented in a hypervisor or an operating system—can also be used to watch for unusual events (anomaly detection, cf. also an earlier module). For instance, a system that crashes hundreds of times in succession may be under an attack that is trying to guess the address-space randomization. Of course, such conclusions are indirect. Anomaly detection systems must strike a balance between raising too many false alarms and missing genuine attacks.
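At its simplest, such detection is a threshold comparison against a learned baseline, as in the toy sketch below. The baseline and tolerance factor are invented for illustration; real systems estimate them from historical data and must tune them to balance the two error types:

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical learned baseline and alerting tolerance. */
    #define BASELINE_CRASHES_PER_HOUR 2.0
    #define ALERT_FACTOR              10.0

    static bool is_anomalous(double crashes_last_hour) {
        return crashes_last_hour > BASELINE_CRASHES_PER_HOUR * ALERT_FACTOR;
    }

    int main(void) {
        double observed = 300.0;  /* e.g. repeated crashes from ASLR guessing */
        if (is_anomalous(observed))
            puts("ALERT: crash rate far above learned baseline");
        return 0;
    }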
