Daniel Winograd-Cort University of Pennsylvania, USA, gives the first presentation in the third panel, Applications, in the ICFP 2017 conference. Co-written by Andreas Haeberlen and Aaron Roth, University of Pennsylvania, USA.
Differential privacy is a widely studied theory for analyzing sensitive data with a strong privacy guarantee--any change in an individual's data can have only a small statistical effect on the result--and a growing number of programming languages now support differentially private data analysis. A common shortcoming of these languages is poor support for adaptivity. In practice, a data analyst rarely wants to run just one function over a sensitive database, nor even a predetermined sequence of functions with fixed privacy parameters; rather, she wants to engage in an interaction where, at each step, both the choice of the next function and its privacy parameters are informed by the results of prior functions. Existing languages support this scenario using a simple composition theorem, which often gives rather loose bounds on the actual privacy cost of composite functions, substantially reducing how much computation can be performed within a given privacy budget. The theory of differential privacy includes other theorems with much better bounds, but these have not yet been incorporated into programming languages.
We propose a novel framework for adaptive composition that is elegant, practical, and implementable. It consists of a reformulation based on typed functional programming of the privacy filters of Rogers et al (2016), together with a concrete realization of this framework in the design and implementation of a new language, called Adaptive Fuzz. Adaptive Fuzz transplants the core static type system of Fuzz to the adaptive setting by wrapping the Fuzz typechecker and runtime system in an outer adaptive layer, allowing Fuzz programs to be conveniently constructed and type-checked on the fly. We describe an interpreter for Adaptive Fuzz and report results from two case studies demonstrating its effectiveness for implementing common statistical algorithms over real data sets.