Ted is writing things

On privacy, research, and privacy research.

A friendly, non-technical introduction to differential privacy


Differential privacy is getting a lot of attention lately. Companies and governments are starting to publish data anonymized with this notion. Universities are offering courses about it. Statisticians are getting acquainted with this new approach to protecting data. Open-source organizations are publishing tooling to make differential privacy easier to use.

So, you might be wondering: what's the hype all about? What even is differential privacy? What makes it so special? How does it work in practice? And, perhaps more importantly, can I understand it without having to read a bunch of complicated equations?

The good news is: you've come to the right place. Welcome to my friendly blog post series about differential privacy! It provides simple explanations for the core concepts behind differential privacy. It is meant for a wide, non-technical audience: it doesn't assume any prior knowledge, uses as little math as possible, and illustrates everything with simple examples and diagrams.

Diagram: the same process shown twice, with the bottom line missing one person in the database. A double arrow labeled "basically the same" points to the two outputs.

Sounds interesting? Excellent! Start with these two articles.

Then, this blog post series splits into two branches. You can read one or the other in any order, depending on what you're most interested in.

The first branch is about the how: what techniques can you use to achieve differential privacy? It's a little bit technical, though I still keep it as simple as I can. If that doesn't sound interesting, feel free to skip over and go directly to the next section!

The second branch of this series is about the why. In which contexts can differential privacy be used? Why do organizations decide to adopt it? What policy questions does it raise? The articles in this branch are accessible to non-technical folks, and are all self-contained. You can read them in any order you like!

Finally, one article lists the known real-world deployments of DP, along with their privacy parameters.

This series isn't finished. I have a list of future articles I'd like to write… and I'm adding new ideas to this list faster than I'm writing blog posts! If you're looking for further things to read on differential privacy, you can do two things.

  • You can check out this reading list I curated. I particularly recommend it if you're looking for more formal content: textbooks with mathematical proofs, scientific papers, etc.
  • You can follow me on Mastodon or subscribe to this blog's RSS feed to keep updated about future posts.

All opinions here are my own, not my employer's.   |   Feedback on these posts is very welcome! Please reach out via e-mail (damien@desfontain.es) or Mastodon for comments and suggestions.   |   Interested in deploying formal anonymization methods? My colleagues and I at Tumult Labs can help. Contact me at damien@tmlt.io, and let's chat!