Differential Privacy

A mathematical framework for sharing information about a dataset while protecting the privacy of individuals in that dataset through controlled noise addition.

Also known as: DP, Statistical Privacy

What is Differential Privacy?

Differential Privacy is a mathematical definition of privacy that provides provable guarantees for the individuals in a dataset. It works by adding carefully calibrated noise to query results or data, so that the output reveals almost nothing about whether any single individual's record is present.

Core Concept

Definition: A mechanism M satisfies ε-differential privacy if, for any two databases D and D' that differ in a single record, and for any set S of possible outputs:

P[M(D) ∈ S] ≤ e^ε × P[M(D') ∈ S]
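
For example, with ε = 0.1 we have e^ε ≈ 1.105, so adding or removing any one person's record changes the probability of any output by at most about 10.5%.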

Key Parameters

Epsilon (ε)

  • Privacy loss parameter: bounds how much any one record can shift the output distribution
  • Lower ε = stronger privacy
  • Trades off against accuracy: stronger privacy means noisier results

Delta (δ)

  • Allowed probability that the ε guarantee fails
  • Usually very small (e.g., much less than 1/n for a dataset of n records)
  • Appears in the relaxed (ε, δ)-DP definition

Noise Mechanisms

Laplace Mechanism For numeric queries. Adds Laplace noise with scale proportional to the query's sensitivity divided by ε.
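
A minimal sketch of the Laplace mechanism in Python (the function name and values here are illustrative, not taken from any particular library):

    import numpy as np

    def laplace_mechanism(true_value, sensitivity, epsilon):
        # Scale the noise to sensitivity / epsilon: lower epsilon -> more noise.
        scale = sensitivity / epsilon
        return true_value + np.random.laplace(loc=0.0, scale=scale)

    # A counting query has sensitivity 1, since adding or removing one
    # record changes the count by at most 1.
    noisy_count = laplace_mechanism(true_value=1234, sensitivity=1, epsilon=0.5)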

Gaussian Mechanism For (ε, δ)-DP. Adds Gaussian noise calibrated to the sensitivity, ε, and δ.
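
A sketch using the classical calibration σ = sensitivity × sqrt(2 ln(1.25/δ)) / ε, which holds for 0 < ε < 1 (names are illustrative):

    import math
    import numpy as np

    def gaussian_mechanism(true_value, sensitivity, epsilon, delta):
        # Classical (epsilon, delta)-DP calibration; requires 0 < epsilon < 1.
        # Note: the Gaussian mechanism is calibrated to L2 sensitivity.
        sigma = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
        return true_value + np.random.normal(loc=0.0, scale=sigma)

    noisy_value = gaussian_mechanism(1234.0, sensitivity=1.0, epsilon=0.5, delta=1e-5)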

Exponential Mechanism For categorical (non-numeric) outputs. Selects a result with probability weighted by a utility score.
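
A sketch of the exponential mechanism, assuming a utility function with known sensitivity; the candidates and scores below are made up for illustration:

    import numpy as np

    def exponential_mechanism(candidates, utilities, sensitivity, epsilon):
        utilities = np.asarray(utilities, dtype=float)
        # Weight each candidate by exp(epsilon * utility / (2 * sensitivity));
        # subtracting the max first keeps the exponentials numerically stable.
        scores = epsilon * (utilities - utilities.max()) / (2 * sensitivity)
        probs = np.exp(scores)
        probs /= probs.sum()
        return np.random.choice(candidates, p=probs)

    choice = exponential_mechanism(["A", "B", "C"], utilities=[10, 8, 3],
                                   sensitivity=1, epsilon=1.0)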

Applications

  • Census data release
  • Machine learning
  • Location data
  • Health analytics
  • Search queries

Benefits

  • Provable guarantees
  • Composition theorems (see the budget example after this list)
  • Flexible implementation
  • Regulatory compliance
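
Basic composition says privacy losses add: answering k queries that are each ε-DP is at most kε-DP overall. A hypothetical budget tracker (not any particular library's API) makes this concrete:

    class PrivacyBudget:
        # Track cumulative epsilon under basic composition: losses add up.
        def __init__(self, total_epsilon):
            self.total = total_epsilon
            self.spent = 0.0

        def spend(self, epsilon):
            if self.spent + epsilon > self.total:
                raise RuntimeError("privacy budget exhausted")
            self.spent += epsilon

    budget = PrivacyBudget(total_epsilon=1.0)
    for _ in range(10):
        budget.spend(0.1)  # ten eps = 0.1 queries compose to eps = 1.0 total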

Challenges

  • Accuracy trade-offs
  • Parameter selection
  • Complex to implement
  • Computational cost

Implementations

  • Google DP library
  • Apple (device analytics)
  • US Census Bureau
  • OpenDP