Cloaking Functions: Differential Privacy with Gaussian Processes

Embodiment Factors


	computer	human
bits/min	billions	2,000
billion calculations/s	~100	a billion
embodiment	20 minutes	5 billion years

Evolved Relationship with Information

New Flow of Information

Evolved Relationship

Rasmussen and Williams (2006)

Differential Privacy, summary

We want to protect a user from a linkage attack…

…while still performing inference over the whole group.
Making a dataset private is more than just erasing names.

Narayanan and Felten (2014); Ohm (2010); Barth-Jones (2012)

To achieve a level of privacy one needs to add randomness to the data.
This is a fundamental feature of differential privacy.

See The Algorithmic Foundations of Differential Privacy by Dwork and Roth (2014) for a rigorous introduction to the framework.
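
For reference, the standard definition can be stated as follows. The symbols \(\mathcal{M}\) (the randomised mechanism) and \(S\) (a set of possible outputs) are notation introduced only for this statement; \(D\) and \(D^\prime\) are neighbouring datasets that differ in a single record. A mechanism is \((\varepsilon, \delta)\)-differentially private if, for all such \(D\), \(D^\prime\) and all \(S\),

\[P\left(\mathcal{M}(D) \in S\right) \leq e^{\varepsilon} P\left(\mathcal{M}(D^\prime) \in S\right) + \delta\]

Smaller \(\varepsilon\) and \(\delta\) mean the output distribution changes less when one record changes, which is what makes a linkage attack harder.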

Differential Privacy for Gaussian Processes

We have a dataset in which the inputs, \(\mathbf{X}\), are public. The outputs, \(\mathbf{ y}\), we want to keep private.

Data consists of the heights and ages of 287 women from a census of the !Kung (Howell, 1967).

Vectors and Functions

Hall et al. (2013) showed that a released version \(\tilde{f}\) of a function \(f\) can be made \((\varepsilon, \delta)\)-differentially private by adding a scaled sample from a GP prior to \(f\).
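
A minimal sketch of that mechanism, evaluated on a finite grid of test points. It assumes the noise scale takes the form \(\Delta\, c(\delta) / \varepsilon\) with \(c(\delta) = \sqrt{2 \log (2/\delta)}\), where \(\Delta\) bounds how far \(f\) can move in the RKHS when one record changes. The helper names (`eq_kernel`, `noise_scale`, `privatise`), the toy function and all parameter values are illustrative choices, not taken from the talk.

```python
import numpy as np


def eq_kernel(a, b, lengthscale=1.0):
    """Exponentiated quadratic kernel with unit variance, so 0 < k <= 1."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def noise_scale(sensitivity, epsilon, delta):
    """Assumed scale: sensitivity * c(delta) / epsilon, c(delta) = sqrt(2 log(2/delta))."""
    return sensitivity * np.sqrt(2.0 * np.log(2.0 / delta)) / epsilon


def privatise(f_values, x_test, sensitivity, epsilon, delta, lengthscale, rng):
    """Add a scaled GP prior sample to function values evaluated at x_test."""
    K_test = eq_kernel(x_test, x_test, lengthscale)
    # Jitter keeps the Cholesky factorisation numerically stable.
    L = np.linalg.cholesky(K_test + 1e-6 * np.eye(len(x_test)))
    gp_sample = L @ rng.standard_normal(len(x_test))
    return f_values + noise_scale(sensitivity, epsilon, delta) * gp_sample


rng = np.random.default_rng(0)
x_test = np.linspace(0.0, 10.0, 20)
f_values = np.sin(x_test)  # stand-in for the function to be released
f_private = privatise(f_values, x_test, sensitivity=1.0, epsilon=1.0,
                      delta=0.01, lengthscale=2.0, rng=rng)
```

Because the scale is inversely proportional to \(\varepsilon\), halving \(\varepsilon\) doubles the amplitude of the added sample.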

3 pages of maths ahead!

Applied to Gaussian Processes

We applied this method to the GP posterior.
The covariance of the posterior only depends on the inputs, \(\mathbf{X}\). So we can compute this without applying DP.
The mean function, \(f_D(\mathbf{ x}_*)\), does depend on \(\mathbf{ y}\). \[f_D(\mathbf{ x}_*) = \mathbf{ k}(\mathbf{ x}_*, \mathbf{X}) \mathbf{K}^{-1} \mathbf{ y}\]
We are interested in finding

\[|| f_D(\mathbf{ x}_*) - f_{D^\prime}(\mathbf{ x}_*) ||_H^2\]

…how much the mean function (in the RKHS) can change due to a change in \(\mathbf{ y}\).
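
The split between covariance and mean can be checked numerically. The sketch below is illustrative only: the toy data, the EQ kernel settings and the helper names are assumptions, and \(\mathbf{K}\) is taken to include a noise variance on its diagonal. It fits the same inputs with two output vectors that differ in one element and compares the posteriors.

```python
import numpy as np


def eq_kernel(a, b, lengthscale=1.0):
    """Exponentiated quadratic kernel with unit variance."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(X, y, x_star, lengthscale=1.0, noise_var=0.1):
    """Posterior mean and covariance at x_star.

    K includes the noise variance on its diagonal (an assumption of this sketch).
    """
    K = eq_kernel(X, X, lengthscale) + noise_var * np.eye(len(X))
    k_star = eq_kernel(x_star, X, lengthscale)
    mean = k_star @ np.linalg.solve(K, y)
    cov = (eq_kernel(x_star, x_star, lengthscale)
           - k_star @ np.linalg.solve(K, k_star.T))
    return mean, cov


rng = np.random.default_rng(1)
X = np.linspace(0.0, 5.0, 8)
y = np.sin(X) + 0.1 * rng.standard_normal(len(X))
y_prime = y.copy()
y_prime[3] += 1.0  # neighbouring dataset: a single output perturbed

x_star = np.linspace(0.0, 5.0, 20)
mean, cov = gp_posterior(X, y, x_star)
mean_prime, cov_prime = gp_posterior(X, y_prime, x_star)

cov_unchanged = np.allclose(cov, cov_prime)        # covariance ignores y
max_mean_shift = np.max(np.abs(mean - mean_prime))  # the mean does not
```

Only the mean moves, so only the mean needs to be protected.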

Applied to Gaussian Processes

Using the representer theorem, we can write \[|| f_D(\mathbf{ x}_*) - f_{D^\prime}(\mathbf{ x}_*) ||_H^2\]

as:

\[\Big|\Big|\sum_{i=1}^nk(\mathbf{ x}_*,\mathbf{ x}_i) \left(\alpha_i - \alpha^\prime_i\right)\Big|\Big|_H^2\]

where \(\boldsymbol{\alpha} - \boldsymbol{\alpha}^\prime = \mathbf{K}^{-1} \left(\mathbf{ y}- \mathbf{ y}^\prime \right)\)

L2 Norm

We constrain the kernel so that \(-1\leq k(\cdot,\cdot) \leq 1\), and we only allow one element of \(\mathbf{ y}\) and \(\mathbf{ y}^\prime\) to differ (by at most \(d\)).
So only one column of \(\mathbf{K}^{-1}\) contributes to the change in the mean, and the sum runs over the elements of that column.
The distance above can then be shown to be no greater than \(d\;||\mathbf{K}^{-1}||_\infty\).
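
The argument above can be sanity-checked numerically. The sketch below is a check under stated assumptions, not a proof: the inputs, the value of \(d\), the noise variance added to \(\mathbf{K}\) and the helper `eq_kernel` are all illustrative. It perturbs each output in turn by \(d\) and compares the largest pointwise change in the mean, \(|f_D(\mathbf{ x}_*) - f_{D^\prime}(\mathbf{ x}_*)|\) over a grid of test points, against \(d\;||\mathbf{K}^{-1}||_\infty\). Note that it checks the pointwise change, not the RKHS norm itself.

```python
import numpy as np


def eq_kernel(a, b, lengthscale=1.0):
    """Exponentiated quadratic kernel with unit variance, so |k| <= 1."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)


X = np.linspace(0.0, 5.0, 8)
x_star = np.linspace(-1.0, 6.0, 200)
d = 2.0          # largest allowed change in a single output
noise_var = 0.1  # assumed noise variance, included in K

K = eq_kernel(X, X) + noise_var * np.eye(len(X))
K_inv = np.linalg.inv(K)
k_star = eq_kernel(x_star, X)

# Infinity norm: largest absolute row sum (K_inv is symmetric, so this
# equals the largest absolute column sum).
bound = d * np.linalg.norm(K_inv, ord=np.inf)

# Worst observed change in the mean over every choice of perturbed output.
worst = 0.0
for j in range(len(X)):
    delta_y = np.zeros(len(X))
    delta_y[j] = d
    delta_alpha = K_inv @ delta_y          # d times column j of K_inv
    change = np.abs(k_star @ delta_alpha)  # pointwise change on the grid
    worst = max(worst, change.max())

bound_holds = worst <= bound
```

The bound holds here because each kernel value has magnitude at most one, so the change at any test point is at most the sum of the absolute entries of one scaled column of \(\mathbf{K}^{-1}\).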

Applied to Gaussian Processes

This ‘works’ in that it allows DP predictions… but keeping the added noise small enough to be useful requires a value of \(\varepsilon\) that is too large to offer much privacy (here it is 100).

EQ kernel, \(\ell = 25\) years, \(\Delta = 100\) cm

Inducing Inputs

Using sparse methods (i.e. inducing inputs) can help reduce the sensitivity a little. We’ll see more on this later.

Cloaking

So far we’ve made the whole posterior mean function private…

…what if we just concentrate on making particular predictions private?