Building privacy

Growing concerns about privacy in society have made privacy by design a popular approach. Privacy by design means that protection is planned, implemented, and integrated into products and services in order to alleviate users’ concerns. So far, however, there is only limited literature on how this is done in practice in information systems.

When designing privacy-preserving systems, the main principles are:

Minimise trust requirements: limit the need to trust the behaviour of other parties when handling sensitive information. For example, in the case of electronic citizens’ initiatives, the use of cryptographic primitives removes the need for users to trust authorities to protect users’ identities.
Minimise risk: limit the probability and impact of data breaches. For example, in the Tor network compromising a single Tor router does not reveal sensitive information about a user’s web browsing. If the entry node is compromised, the actual destination of messages is not revealed, only the next node. If, on the other hand, the exit node is compromised, it is not revealed where the message originally came from, only the previous node.
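
The Tor example can be illustrated with a toy model. The sketch below uses no real cryptography: nested dictionaries stand in for encryption layers, and the relay names and the `build_onion` and `route` helpers are hypothetical. It only shows which neighbours each relay gets to see under these simplifying assumptions.

```python
# Toy model of layered ("onion") routing. Nested dicts stand in for
# encryption layers; real Tor uses actual cryptography and circuits.

def build_onion(path, destination, payload):
    """Wrap the payload so that each layer names only the following hop."""
    # Innermost layer: read by the exit relay, names the destination.
    onion = {"next": destination, "inner": payload}
    # Add outer layers for the earlier relays, working backwards.
    for relay in reversed(path[1:]):
        onion = {"next": relay, "inner": onion}
    return onion


def route(sender, path, onion):
    """Pass the onion along the path and record what each relay observes."""
    views = {}
    previous = sender
    for relay in path:
        # A relay sees who handed it the message and where to send it next.
        views[relay] = {"previous": previous, "next": onion["next"]}
        previous = relay
        onion = onion["inner"]  # "peel" one layer
    return views, onion  # after the last relay, only the payload is left


path = ["entry", "middle", "exit"]  # hypothetical relay names
onion = build_onion(path, "example.org", "GET /")
views, delivered = route("alice", path, onion)

for relay, view in views.items():
    print(relay, view)
```

In this model the entry relay sees the sender but only the middle relay as the next hop, while the exit relay sees the destination but only the middle relay as the previous hop.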

In order to implement these principles in practice, the aim is to minimise:

collection: limit data collection and storage;
disclosure: limit the flow of information to parties other than those whom the data concerns. This applies to both direct (the actual content) and indirect information (database queries, etc.);
replication: limit the number of parties storing or processing data;
centralisation: avoid single points of failure with respect to the system’s privacy features;
linkability: limit an attacker’s ability to combine data;
retention: limit the storage duration of data.
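
Two of the goals above, limiting linkability and limiting retention, can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not a complete design: the keys, the record layout, and the `pseudonym` and `purge_expired` helpers are hypothetical, and the pseudonyms limit linkability only as long as the context keys are kept secret and separate.

```python
import hashlib
import hmac
from datetime import datetime, timedelta


def pseudonym(user_id: str, context_key: bytes) -> str:
    """Derive a context-specific pseudonym with a keyed hash (HMAC-SHA256)."""
    return hmac.new(context_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def purge_expired(records, now, max_age):
    """Keep only records that are at most max_age old (retention limit)."""
    return [r for r in records if now - r["stored_at"] <= max_age]


# Hypothetical keys; in practice each would be random and stored separately.
billing_key = b"hypothetical-billing-key"
analytics_key = b"hypothetical-analytics-key"

# The same user gets unrelated identifiers in the two contexts, so the two
# data sets cannot be joined on the identifier without access to the keys.
billing_id = pseudonym("alice@example.org", billing_key)
analytics_id = pseudonym("alice@example.org", analytics_key)

now = datetime(2024, 6, 1)
records = [
    {"user": analytics_id, "stored_at": datetime(2024, 5, 25)},
    {"user": analytics_id, "stored_at": datetime(2023, 1, 1)},
]
kept = purge_expired(records, now, max_age=timedelta(days=30))
print(len(kept))
```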

The first task is to identify data flows in the system that should be eliminated. These include all flows where information is transmitted to parties other than those whom the data concerns. After this, one should consider what the minimum amount of information required to accomplish the task is. Unnecessary data flows can be eliminated with, for example, the following methods:

Use of local data: sensitive data can be processed on the user’s device and only the result is transmitted.
Data encryption: data can be encrypted locally and only the encrypted data is transmitted onward.
Use of privacy-preserving cryptographic protocols: if the data must also be processed, it can be processed locally using a privacy-preserving protocol before being sent to the service, thereby reducing the disclosure of information within the service.
Obfuscation: data can be processed locally using various obfuscation methods, for example by reducing precision or adding noise.
Anonymisation: identifying information is removed from the data and an anonymous channel is used for transmission.
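
As an illustration of the obfuscation method, the following sketch reduces the precision of a location and adds random noise to a numeric value before anything leaves the device. The function names, the coordinates, and the choice of Laplace-distributed noise are assumptions made for the example; how much precision to drop or noise to add depends on the application.

```python
import random


def coarsen_location(lat, lon, decimals=2):
    """Reduce precision: two decimals is roughly a one-kilometre grid."""
    return round(lat, decimals), round(lon, decimals)


def add_laplace_noise(value, scale, rng=random):
    """Add Laplace(0, scale) noise to a value.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return value + noise


# Processed locally; only the obfuscated values would be transmitted.
precise = (60.16986, 24.93838)  # hypothetical coordinates
print(coarsen_location(*precise))

rng = random.Random(0)  # seeded only to make the example repeatable
daily_steps = 8200
print(add_laplace_noise(daily_steps, scale=200, rng=rng))
```

Reducing precision hides the exact value but is deterministic, whereas added noise gives a different result on each run, so the service can still compute useful averages over many users without learning any single exact value.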

It is also important to evaluate the impact of privacy technologies. A systematic evaluation typically has three steps. First, the mechanism is modelled as a probabilistic transformation, that is, for a given input certain outputs are more likely than others. Second, a threat model is defined: what the attacker can observe and what they know in advance. Third, it is assumed that the attacker knows the privacy mechanism, and one considers how they would attempt to circumvent it. This is usually done by analysing probability distributions or, for example, by using machine learning to determine what the attacker could infer. The result is a probability distribution describing what an attacker could infer for each input, and it can be used to measure the attacker’s inference capability.
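
The evaluation steps can be made concrete with a small worked example. The sketch below assumes a simple mechanism that is not taken from the text: a randomised response in which a yes/no answer is reported truthfully with probability 0.75 and flipped otherwise. The attacker is assumed to know this mechanism and to hold a uniform prior belief; the function names and the numbers are illustrative.

```python
def posterior(prior, channel, output):
    """Bayes' rule: the attacker's belief about the input after seeing an output."""
    joint = {x: prior[x] * channel[x][output] for x in prior}
    total = sum(joint.values())
    return {x: joint[x] / total for x in joint}


def attacker_success(prior, channel):
    """Probability that an attacker who always guesses the most likely input is right."""
    outputs = next(iter(channel.values())).keys()
    return sum(max(prior[x] * channel[x][y] for x in prior) for y in outputs)


# Step 1: the mechanism as a probabilistic transformation, P(output | input).
p_truth = 0.75
channel = {
    "yes": {"yes": p_truth, "no": 1 - p_truth},
    "no": {"yes": 1 - p_truth, "no": p_truth},
}

# Step 2: the threat model. The attacker observes the output and has a prior.
prior = {"yes": 0.5, "no": 0.5}

# Step 3: the attacker knows the mechanism and inverts it with Bayes' rule.
belief = posterior(prior, channel, "yes")
baseline = max(prior.values())          # best guess without any observation
success = attacker_success(prior, channel)
print(belief, baseline, success)
```

With these numbers, observing the answer raises the attacker's chance of guessing correctly from 0.5 to 0.75. Comparing the two values is one simple way to measure the attacker's inference capability.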

Everyday privacy

You may recall the recommendation from the GDPR module example to reject website cookies whenever possible. That example did not, however, explain why this is worth doing. We will therefore look at what is perhaps the most problematic type of cookie from a privacy perspective: third-party tracking cookies. These include cookies set by advertising providers (for example amazon-adsystem.com, doubleclick.net) that track users across websites.

For example, when a website you visit contains an advertising banner, it may not be stored on the website itself; instead, your browser retrieves it from the advertiser’s site, such as amazon-adsystem.com. At the same time, the advertiser sets its own cookie in your browser. If you then visit another website that uses the same advertiser, the advertiser can track your browsing using this cookie. The advertiser thus learns which websites you visit and, for example, what products you view in online stores. In this way, it can target advertisements to you on other websites as well. Perhaps the most enthusiastic bargain hunters are pleased to receive relevant advertisements based on their browsing behaviour, but the price is their own privacy.
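
The mechanism described above can be simulated in a few lines. The `AdServer` and `Browser` classes, the domain `ads.example`, and the page names are hypothetical, and the model is deliberately simplified: it ignores how the advertiser learns the embedding page and any other signals that could be used for tracking.

```python
import uuid


class AdServer:
    """Hypothetical third-party advertiser embedded on several websites."""

    def __init__(self):
        self.visits = {}  # cookie value -> pages on which the banner was loaded

    def serve_banner(self, cookie, embedding_page):
        if cookie is None:
            cookie = uuid.uuid4().hex  # no cookie yet: set a new identifier
        self.visits.setdefault(cookie, []).append(embedding_page)
        return cookie


class Browser:
    def __init__(self, allow_third_party_cookies=True):
        self.allow = allow_third_party_cookies
        self.cookie_jar = {}  # third-party domain -> cookie value

    def visit(self, page, ad_domain, ad_server):
        # Loading the banner sends any stored cookie for the ad domain.
        cookie = self.cookie_jar.get(ad_domain) if self.allow else None
        new_cookie = ad_server.serve_banner(cookie, page)
        if self.allow:
            self.cookie_jar[ad_domain] = new_cookie


pages = ["news.example/politics", "shop.example/running-shoes"]

tracker = AdServer()
accepting = Browser(allow_third_party_cookies=True)
for page in pages:
    accepting.visit(page, "ads.example", tracker)

other_tracker = AdServer()
blocking = Browser(allow_third_party_cookies=False)
for page in pages:
    blocking.visit(page, "ads.example", other_tracker)

print(list(tracker.visits.values()))        # one profile containing both pages
print(list(other_tracker.visits.values()))  # two unrelated single-page entries
```

When the cookie is accepted, the advertiser can join both visits into one profile. When it is blocked, each visit appears to come from a different, unknown browser in this model.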

Advertisements are not, however, the only source of tracking cookies. Many websites contain “share” buttons or embedded content from various social media platforms. Following the same principle as advertising banners, these elements are retrieved from their providers’ servers. For example, embedded Facebook posts or social sharing widgets included on a news website are loaded directly from Facebook’s servers. At the same time, Facebook may learn via its tracking technologies that you have visited that page. Until early 2026, this also applied to Facebook’s “Like” button. Since then, Meta has discontinued these external plugins, but similar tracking mechanisms continue to exist through other embedded content. This raises the question of whether social media companies need to know so much about anyone’s browsing behaviour. This is, of course, how they make their money: by using and selling user data.

There is also good news: third-party tracking cookies are increasingly restricted by major browser providers. Firefox and Safari block them by default (as of 2023), significantly limiting cross-site tracking. Chromium-based browsers such as Chrome and Microsoft Edge take a more gradual approach: they still allow third-party cookies, but provide users with controls and tracking prevention features, while also developing alternative, more privacy-focused technologies.

New ad-targeting technologies are, of course, emerging, and many of them attempt to take better account of privacy. Nevertheless, it appears that you will still disclose more information about yourself by accepting cookies than by rejecting them.