Published January 10, 2009 | Version: Submitted
Discussion Paper | Open Access

Differential Privacy with Compression

Abstract

This work studies formal utility and privacy guarantees for a simple multiplicative database transformation in which the data are compressed by a random linear or affine transformation, substantially reducing the number of data records while preserving the number of original input variables. We provide an analysis framework inspired by the recent concept of differential privacy (Dwork, 2006). Our goal is to show that, despite the general difficulty of achieving the differential privacy guarantee, it is possible to publish synthetic data that are useful for a number of common statistical learning applications, including high-dimensional sparse regression (Zhou et al., 2007), principal component analysis (PCA), and other statistical measures (Liu et al., 2006) based on the covariance of the initial data.
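The abstract describes compressing the data matrix with a random linear map so that far fewer records are published while covariance-based statistics (the basis for PCA and regression) are approximately preserved. A minimal sketch of that idea, using a Gaussian random projection — the dimensions and scaling below are illustrative assumptions, not the paper's specific construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n records with d variables, compressed to k records.
n, d, k = 5000, 10, 500
X = rng.standard_normal((n, d))

# Random linear compression: Phi has i.i.d. N(0, 1/k) entries, so
# E[Phi.T @ Phi] = I_n and the compressed data Y = Phi @ X satisfies
# Y.T @ Y ~= X.T @ X in expectation -- the covariance structure survives
# even though only k << n rows are released.
Phi = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, n))
Y = Phi @ X  # k records, still d variables

cov_X = X.T @ X / n
cov_Y = Y.T @ Y / n  # same normalization; Phi preserves scale in expectation

rel_err = np.linalg.norm(cov_Y - cov_X) / np.linalg.norm(cov_X)
```

With k on the order of hundreds, `rel_err` is small, which is why covariance-driven analyses such as PCA remain usable on the compressed release.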

Additional Information

We thank Avrim Blum and John Lafferty for helpful discussions. KL is supported in part by an NSF Graduate Research Fellowship. LW and SZ's research is supported by NSF grant CCF-0625879, a Google research grant and a grant from Carnegie Mellon's Cylab.

Attached Files

Submitted - 0901.1365.pdf (149.5 kB)
md5:b5f56434b456e0ebddf52d9e55e34b78

Additional details

Identifiers

Eprint ID
96884
Resolver ID
CaltechAUTHORS:20190702-110751042

Funding

NSF Graduate Research Fellowship
NSF CCF-0625879
Google
Carnegie Mellon University (Cylab)

Dates

Created
2019-07-08
Created from EPrint's datestamp field
Updated
2023-06-02
Created from EPrint's last_modified field