Attacks on coordinated distributed systems (advanced)¶
Because distributed systems rely primarily on message passing for both communication and coordination, attacks can also be grouped at the communication level. This grouping complements the earlier structuring of the attack surface:
- Timing‑based attacks: message omission and premature, delayed, or out‑of‑order delivery. Collisions and denial‑of‑service also belong to this group because they typically appear as disturbances in message delivery that prevent access to communication channels or resources.
- Data‑based attacks: spoofing, mimicking, replay, information leakage (including via side channels), and content manipulation. Manipulated message contents manifest as Byzantine failures.
A disturbance, even one that is not an attack, also has a duration: it may be transient, random, intermittent, or permanent. In addition, attacks may consist of multiple simultaneous events, possibly involving cooperation among several attackers. In such cases the resulting disturbance combines timing, data, duration, and dispersed locations.
As is known from information influence operations, an attacker may be satisfied simply with undermining trust in the assumptions about how resources, services, and the underlying coordination function. An attack on a resource often does not harm the resource itself but primarily affects the service running on it.
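The timing‑based disturbances listed above can be made visible at the receiver with per‑sender sequence numbers. The following is only a minimal sketch: the class `SequenceMonitor` and its labels are hypothetical names chosen for illustration, and a real protocol would also have to authenticate the sequence numbers.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceMonitor:
    """Tracks the sequence numbers received from one sender (hypothetical helper)."""
    highest: int = -1
    seen: set = field(default_factory=set)

    def classify(self, seq: int) -> str:
        if seq in self.seen:
            return "duplicate"      # same number delivered twice: possible replay
        self.seen.add(seq)
        if seq < self.highest:
            return "late"           # out-of-order delivery filling an earlier gap
        # A jump past highest + 1 means earlier messages were omitted or delayed.
        label = "in-order" if seq == self.highest + 1 else "gap"
        self.highest = seq
        return label


monitor = SequenceMonitor()
labels = [monitor.classify(s) for s in [0, 1, 3, 2, 3, 4]]
print(labels)
# ['in-order', 'in-order', 'gap', 'late', 'duplicate', 'in-order']
```

The labels say nothing about intent: a gap looks the same whether it is caused by congestion or by an attacker, which is why duration and recurrence of the disturbance matter for the assessment.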
Below are some attack scenarios. Considering the enormous diversity of resource‑ and service‑based distribution, this is only a set of illustrative examples.
Coordination of resources, i.e. the infrastructure view (advanced)¶
From resource distribution we highlight only two important examples: cloud services and client‑server models. Attacks are discussed after a general overview.
Cloud model¶
In this context, the most fully outsourced model, SaaS (Software as a Service), needs less attention because it corresponds directly to the client‑server model. In both the IaaS and PaaS models the customer controls its own software and data; in PaaS nothing more, and in IaaS also the operating system. IaaS (Infrastructure as a Service) thus provides the hardware, while PaaS (Platform as a Service) additionally provides the runtime environment for the customer’s applications.
From a security perspective, cloud computing can be decomposed into components that reveal aspects of the attack surface. Like datacenter infrastructures, which combine computing and storage resources, the cloud is a combination of geographically distributed resources available to the user on demand. The user has only a hazy, “cloud‑like” view of the exact location and composition of the resources but sees a virtualized set of highly available, reliable, and scalable resources and services. The user specifies the attributes of interest as service‑level objectives:
- performance,
- reliability,
- replication and isolation properties in terms of VM (virtual machine) types and numbers,
- latency,
- the level of security regarding cryptography and other mechanisms in computation and communication,
- cost parameters for service delivery expressed as service level agreements (SLA).
The exact composition and location of the resources, and the mechanisms combining them, are hidden from the user (transparent in the distributed‑systems sense). Users directly see only their own software and data (PaaS) or, in addition, their operating system (IaaS). Cloud mechanisms include authentication, access control, resource brokering, VM startup, scheduling, monitoring, reconfiguration, load balancing, communication structures, interfaces, storage, and many others. Together with physical resources and interfaces, these constitute the attack surface.
Client‑server model¶
The client‑server model is a software architecture in which client programs (often, but not necessarily, programs used directly by a user) communicate with a server program that appears centralized from their viewpoint. Server hardware is often replicated to improve availability, and the server software itself may be layered or hierarchical, as in multitier multitenancy models. As noted earlier, the SaaS cloud model can be interpreted as a client‑server implementation.
Attacks and defense¶
The perspectives presented here apply, with adaptation, to other resource‑coordinated distribution models as well, and many issues are quite general.
Resources. Attacks on resources compromise their availability.
Access control techniques, including firewalls, can limit external access. Authorization procedures determine which access rights are actually granted. Other approaches include sandboxing (running a program within strict boundaries) and a tamper‑resistant Trusted Computing Base (TCB) that enforces coordination and monitors resource use.
Availability can also be compromised indirectly when resources and the services running on them become partitioned, for example through attacks on communication channels. For services, integrity is then also endangered.
Access control. Attacks exploit identity deception and theft. For resources the effect is on availability; for data and services, integrity and confidentiality are also affected.
Intrusion detection systems (IDS) are typical mitigation techniques. These are supplemented with periodic or random authentication checks. The validity of identities recognized by the system must also be verified regularly.
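The periodic authentication checks and the regular verification of identities mentioned above can be sketched as a simple session test. This is an illustration only: `Session`, `MAX_AGE`, and the `revoked` set are assumptions made for the example, not part of any particular system.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user: str
    authenticated_at: float  # seconds on some monotonic clock


MAX_AGE = 300.0  # assumed re-authentication interval in seconds


def needs_reauthentication(session: Session, now: float, revoked: set) -> bool:
    """Return True if the identity must be verified again before further access."""
    if session.user in revoked:
        return True  # the identity is no longer valid at all
    return now - session.authenticated_at > MAX_AGE


session = Session(user="alice", authenticated_at=0.0)
print(needs_reauthentication(session, now=100.0, revoked=set()))
print(needs_reauthentication(session, now=301.0, revoked=set()))
```

A random check, as opposed to a periodic one, could be obtained by drawing the interval anew after each successful authentication.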
Virtual machine (VM). A common issue is information leakage from the VM via side‑channel attacks or similar means. The result is a loss of integrity and confidentiality of the VM’s services.
VM security was discussed in connection with operating systems. Three aspects are highlighted here: detecting leakage, the level at which the leakage occurs, and handling it. “Taint analysis” (tracking, statically or at run time, how sensitive data flows through a program) is an effective technique for detecting leakage at the data level. Since side‑channel attacks often occur at the hardware level and are influenced by process‑switching schedulers, hardware performance counters are widely used for detection. Handling VM corruption often begins by tightening system‑level trust specifications (which processes may access what) and examining these analytically, formally, or via stress‑testing. Hypervisors are commonly used to enforce VM operations.
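The idea behind taint analysis can be illustrated with a small run‑time variant: sensitive values carry a mark that survives operations on them, and an output function refuses marked data. The sketch below is an assumption‑laden toy, with `Tainted` and `send_to_network` as hypothetical names; it only follows string concatenation, whereas real tools track many more operations.

```python
class Tainted(str):
    """A string carrying a taint mark (hypothetical helper class)."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # anything + tainted stays tainted
        return Tainted(str.__add__(other, self))


def send_to_network(data: str) -> str:
    """Output sink: refuses data derived from a tainted source."""
    if isinstance(data, Tainted):
        raise PermissionError("tainted data reached an output sink")
    return data


secret = Tainted("key=1234")        # source of sensitive data
message = "payload: " + secret      # the mark propagates through concatenation

try:
    send_to_network(message)
    leak_blocked = False
except PermissionError:
    leak_blocked = True
print(leak_blocked)
```

Side channels are outside the reach of this kind of tracking, because they leak information through timing or resource use rather than through explicit data flow.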
Scheduler. Abnormal task or resource allocation indicates scheduler corruption. Such a deviation can be detected using access control. If an attacker takes control of the scheduler, inconsistencies in system state or resource‑to‑task binding can be filtered out through coordination mechanisms designed to preserve consistency. These attacks generally affect availability and integrity.
Notably, scheduler corruption as such does not typically breach confidentiality.
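One way to picture how coordination mechanisms filter out inconsistent resource‑to‑task bindings is a majority comparison across replicas of the scheduler state. The sketch below assumes that such replicas exist and that fewer than half of them are corrupted; the function name and the data layout are hypothetical.

```python
from collections import Counter


def majority_binding(replica_views: dict) -> tuple:
    """Return (agreed_binding, suspect_replicas) from per-replica task->resource maps."""
    counts = Counter(tuple(sorted(view.items())) for view in replica_views.values())
    winner, votes = counts.most_common(1)[0]
    if votes <= len(replica_views) // 2:
        raise ValueError("no majority: the state cannot be repaired automatically")
    agreed = dict(winner)
    suspects = sorted(name for name, view in replica_views.items() if view != agreed)
    return agreed, suspects


views = {
    "replica-1": {"task-1": "vm-a", "task-2": "vm-b"},
    "replica-2": {"task-1": "vm-a", "task-2": "vm-b"},
    "replica-3": {"task-1": "vm-a", "task-2": "vm-z"},  # deviating binding
}
agreed, suspects = majority_binding(views)
print(agreed, suspects)
```

The comparison only detects deviation from the majority; if the attacker controls most replicas, the corrupted binding wins the vote.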
Broker. A cloud broker resembles a scheduler but operates at a higher level. Corruption primarily affects resource availability.
Similar defensive mechanisms apply as with scheduler corruption. Backup brokers may exist; without them the system may need to be halted.
Communication. Because communication is central to resource coordination, corruption severely impacts availability. Failure to perform replication, assign resources to tasks, etc., fundamentally jeopardizes system functionality.
Several communication‑related techniques are presented in their own module, including retries, ACK/NACK‑based mechanisms, and cryptographically protected channels.
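As a rough illustration of how retries, ACK/NACK, and cryptographic protection fit together, the sketch below authenticates each frame with an HMAC and resends until an ACK arrives. The pre‑shared key, the frame layout, and the deterministic `lossy_channel` are assumptions made for the example; it uses only the Python standard library modules `hmac` and `hashlib`.

```python
import hashlib
import hmac

KEY = b"shared-demo-key"  # assumption: sender and receiver share this key


def tag(seq: int, payload: bytes) -> bytes:
    """Authentication tag over the sequence number and the payload."""
    return hmac.new(KEY, seq.to_bytes(8, "big") + payload, hashlib.sha256).digest()


def receive(seq: int, payload: bytes, mac: bytes, delivered: dict) -> str:
    if not hmac.compare_digest(mac, tag(seq, payload)):
        return "NACK"                    # manipulated or forged frame
    delivered.setdefault(seq, payload)   # a duplicate is acknowledged but stored once
    return "ACK"


def lossy_channel(drops: int):
    """Hypothetical channel that drops the first `drops` frames, then delivers."""
    state = {"left": drops}

    def channel(seq, payload, mac):
        if state["left"] > 0:
            state["left"] -= 1
            return None                  # message omission
        return seq, payload, mac

    return channel


def send_with_retries(seq, payload, channel, delivered, max_tries=5) -> int:
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_tries + 1):
        frame = channel(seq, payload, tag(seq, payload))
        if frame is not None and receive(*frame, delivered) == "ACK":
            return attempt
    raise TimeoutError("no acknowledgement received")


delivered = {}
attempts = send_with_retries(7, b"replicate block 42", lossy_channel(2), delivered)
print(attempts, delivered[7])
```

Retries address omission and the tag addresses content manipulation; confidentiality would additionally require encryption, which is left out here.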
Monitoring and accounting. Incorrect information about system or service state may compromise confidentiality, integrity, and availability.
Procedures that uphold consistency of system state apply here. The replication and coordination mechanisms introduced earlier are exactly those used to maintain correct data and prevent disturbances.
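A simple instance of using replication to keep monitoring data correct is to collect the same measurement from several monitors and take the median. The sketch assumes at most `max_faulty` monitors report wrong values and that at least `2 * max_faulty + 1` reports are available; the names are hypothetical.

```python
from statistics import median


def robust_reading(reports: dict, max_faulty: int) -> float:
    """Median of replicated reports; stays within the range of the correct values."""
    if len(reports) < 2 * max_faulty + 1:
        raise ValueError("need at least 2f+1 reports to mask f faulty ones")
    return median(reports.values())


# CPU load as reported by three monitors, one of which is lying.
reports = {"monitor-1": 0.42, "monitor-2": 0.40, "monitor-3": 0.99}
print(robust_reading(reports, max_faulty=1))
```

With the values above the outlier has no influence on the result, whereas an average would have been shifted by it.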
Coordination of services, i.e. the application view (advanced)¶
Here we focus on the idea of blockchains, but for balance we first mention other systems. In general, the essential property of a service‑coordination model is service integrity supported by some level of consistency and reasonable availability. In contrast, in the resource‑coordination model, availability of resources and access to them are dominant.
Event services, databases¶
Event services include data mining, banking, and stock‑trade events. Databases and monetary services require consistency, which in banking need not be strong in all respects. For data mining and data retrieval, weaker consistency models suffice. Retrieval may operate with geographically separate datacenters and produce outdated results, but this may be adequate if service requirements are eventually met.
A simpler and often faster storage structure than a database is the KVS (key‑value store), a type of associative memory in which each value (not necessarily structured like database fields) is stored and retrieved by its key. A KVS lends itself naturally to distribution, and any consistency model from strict to eventual may suit the application.
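The eventual end of the consistency range can be sketched with two replicas of a key‑value store that accept writes independently and reconcile later. This is a minimal illustration assuming version numbers supplied by the writer and a last‑writer‑wins rule; `Replica` and `anti_entropy` are hypothetical names.

```python
class Replica:
    """One copy of a key-value store; each entry is (version, value)."""

    def __init__(self):
        self.data = {}

    def put(self, key, version, value):
        # Last-writer-wins: keep only the entry with the highest version.
        if key not in self.data or version > self.data[key][0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key)


def anti_entropy(a: Replica, b: Replica) -> None:
    """Exchange entries in both directions so that the replicas converge."""
    for src, dst in ((a, b), (b, a)):
        for key, (version, value) in list(src.data.items()):
            dst.put(key, version, value)


r1, r2 = Replica(), Replica()
r1.put("x", 1, "old")
r2.put("x", 1, "old")
r1.put("x", 2, "new")      # the update has reached only r1 so far
stale = r2.get("x")        # a read at r2 still returns the old value
anti_entropy(r1, r2)
fresh = r2.get("x")
print(stale, fresh)
```

A stricter consistency model would instead block the read at `r2` until the update had been propagated, trading availability and latency for freshness.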
Blockchains and cryptocurrencies¶
It is difficult to achieve consistent bookkeeping of transactions in a distributed system where actors do not trust each other and may be unreliable. Blockchains provide distributed and public bookkeeping without a central coordinator. The ledger is stored as multiple copies across the network. Whenever a participant submits a transaction to the ledger, other participants check that the transaction is valid, and valid transactions are grouped into blocks that are appended to the chain.
No record in a block can be changed retroactively without also changing all subsequent blocks. Such modifications require network consensus, so an attacker cannot perform them unilaterally. Because of this, participants can easily verify transactions. Blockchains form the basis of numerous cryptocurrencies, the most prominent being Bitcoin.
Technically, a blockchain is a list of blocks containing transaction records. The above properties follow from the fact that each block contains a timestamp and the cryptographic hash of the previous block. If a block is altered without altering all subsequent blocks, the hash stored in the next block no longer matches, making the tampering detectable.
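The hash linking described above can be sketched in a few lines. The block layout (a dictionary with `prev_hash`, `timestamp`, and `transactions`) and the JSON serialization are assumptions for illustration and do not correspond to the format of any particular blockchain.

```python
import copy
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    body = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest()


def append_block(chain: list, transactions: list, timestamp: int) -> None:
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "timestamp": timestamp,
                  "transactions": transactions})


def first_broken_link(chain: list):
    """Index of the first block whose prev_hash does not match, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = []
append_block(chain, ["alice->bob:5"], 1)
append_block(chain, ["bob->carol:2"], 2)
append_block(chain, ["carol->dave:1"], 3)

tampered = copy.deepcopy(chain)
tampered[1]["transactions"][0] = "bob->mallory:200"   # retroactive change

print(first_broken_link(chain), first_broken_link(tampered))
# None 2
```

The check exposes the change in block 1 through the hash stored in block 2. Changing the most recent block is not caught by this check alone, which is one reason the network must also agree on the head of the chain.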
When used as a distributed ledger, blockchains operate in a peer‑to‑peer network. Members of such a network participate in the protocol that verifies newly submitted blocks. Blockchains are an example of widely used systems that tolerate Byzantine failure well.
The general blockchain concept allows any party to participate and contains no access restrictions. This applies to many widely used cryptocurrencies, including Bitcoin. Restricted participation is also possible, where an administrative entity grants permissions.
To prevent denial‑of‑service attacks and other abuse, participation requires some form of proof‑of‑work (PoW). Work here means spending computation time on an otherwise meaningless task: searching by brute force for an input whose hash has a required form. This is an effective way to limit service misuse in general, e.g. spam, because the required work is costly to perform but easy to verify. However, PoW systems also lead to high energy consumption and, depending on the nature of the work, can impose unreasonably high barriers to participation, as in certain cryptocurrencies that in practice require specialized hardware.
Breaking the cryptography of a blockchain is currently computationally infeasible but may become feasible with new technologies such as quantum computing. Moreover, while collusion attacks, in which cooperating participants gain a controlling share of the system, are too expensive in large systems, they may be possible in systems with fewer participants.
Blockchain nodes must remain in continuous contact to compare information. In an eclipse attack, the attacker isolates some nodes from the rest of the network by controlling their connections, which may trick these nodes into wasting computation or confirming invalid transactions. Consensus may thereby be endangered.
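The asymmetry between doing the work and checking it can be shown with a small hash puzzle. The sketch below is not the scheme of any specific cryptocurrency: the 8‑byte nonce encoding and the difficulty of 12 bits are assumptions chosen so that the example finishes quickly.

```python
import hashlib


def verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """One hash evaluation: is the digest below the target?"""
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def solve(data: bytes, difficulty_bits: int) -> int:
    """Brute-force search for the smallest nonce that passes verify()."""
    nonce = 0
    while not verify(data, nonce, difficulty_bits):
        nonce += 1
    return nonce


data = b"block with transactions"
nonce = solve(data, difficulty_bits=12)   # about 2**12 attempts on average
print(nonce, verify(data, nonce, 12))
```

Each additional difficulty bit doubles the expected search effort, while verification remains a single hash computation.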
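Why collusion becomes realistic only with a large share of the system can be illustrated with a standard gambler's‑ruin estimate, which also appears in the Bitcoin white paper: if colluding attackers hold a fraction q of the computing power and are z blocks behind, the probability that they ever catch up is (q/p)^z with p = 1 - q, and 1 if q is at least p. The function below is a simplified model under that assumption, not a full security analysis.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that attackers with power share q ever make up z blocks."""
    if not 0.0 <= q <= 1.0 or z < 0:
        raise ValueError("q must be in [0, 1] and z must be non-negative")
    p = 1.0 - q
    if q >= p:
        return 1.0          # a majority eventually overtakes the honest chain
    return (q / p) ** z


for share in (0.1, 0.3, 0.5):
    print(share, catch_up_probability(share, z=6))
```

In a small system a share close to one half is far cheaper to obtain, which is the situation the text warns about.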