Lowering the cost of anonymization

a PhD thesis

Among the numerous variants of differential privacy listed in Chapter 2, two main variants model adversaries with partial background knowledge, using indistinguishability: noiseless privacy [46, 116], and distributional DP [36]. This work discusses the shortcomings of distributional DP in Section 3.1.2, and Section 3.1.3 uses the formalism of noiseless privacy to define active and passive partial knowledge DP.

Other variants also model adversaries with partial background knowledge, but are not based on indistinguishability: instead, they directly constrain the posterior knowledge of an attacker as a function of their prior knowledge. Among those are adversarial privacy [326], membership privacy [254], and aposteriori noiseless privacy [46]. It is straightforward to adapt the examples given in this chapter to show that these definitions suffer from the same flaws as noiseless privacy when the data has correlations. Because of space constraints, we do not study them in detail.

Several other definitions have been proposed. Pufferfish privacy [228] can be seen as a generalization of noiseless privacy; similarly, coupled-worlds privacy [36] (and its inference-based variant) generalizes distributional differential privacy. Instead of protecting individual tuples, these frameworks protect arbitrary sensitive properties of the data. It is straightforward to generalize our results to these more generic frameworks.
