Nan Zhang CNS Division Of Computer and Network Systems
CSE Direct For Computer & Info Scie & Enginr
Start Date: July 15, 2010
End Date: June 30, 2017 (Estimated)
Awarded Amount to Date: $1,326,660.00
Investigator(s):
John Abowd john.abowd@cornell.edu (Principal Investigator)
Johannes Gehrke (Former Principal Investigator)
Lars Vilhuber (Co-Principal Investigator)
John Abowd (Former Co-Principal Investigator)
Sponsor:
Cornell University
373 Pine Tree Road
Ithaca, NY 14850-2820
(607) 255-5014
Safely managing the release of data containing confidential information about individuals is a problem of great societal importance. Governments, institutions, and researchers collect data whose release can have enormous benefits to society by influencing public policy or advancing scientific knowledge. But dissemination of these data can only happen if the privacy of the respondents' data is preserved or if the amount of disclosure is limited.
The goal of this research project is to bridge the gap between the statistics and computer science communities, and between theory and practice, in limiting disclosure. The research focuses on limiting statistical disclosure using synthetic data, the most advanced method from the statistics community, which enables the construction of public data sets with strong statistical properties; the research incorporates formal privacy guarantees from the computer science community into this approach. Techniques focus on household data and relational data, addressing real problems motivated by the U.S. Census Bureau and related agencies.
The team's approach rests on three developments: novel techniques for boosting the utility of synthetic data generation methods with formal privacy guarantees; novel formal privacy models that formalize attackers implicitly considered in the statistics literature, together with new attacker models that allow an exploration of the space between weak and strong adversaries; and novel techniques designed for census or survey data about households, which have a relational structure.
The research has broad impact by influencing the methodology of statistical agencies around the world. The project also develops an open-source toolkit for limiting disclosure in data publishing with formal privacy guarantees; it integrates undergraduate students into research, and it creates educational material for practitioners responsible for safe data handling.
For further information see the project web site at the URL:
www.cs.cornell.edu/bigreddata/privacy
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Michaela Götz, Ashwin Machanavajjhala, Guozhang Wang, Xiaokui Xiao, Johannes Gehrke. "Publishing Search Logs - A Comparative Study of Privacy Guarantees," IEEE Transactions on Knowledge and Data Engineering, v.24, 2012, p. 520.
Johannes Gehrke, Beng Chin Ooi, Evaggelia Pitoura. "Guest Editorial: Special Section on the International Conference on Data Engineering," IEEE Trans. Knowl. Data Eng., v.26, 2014, p. 1298.
Johannes Gehrke, Nikos Mamoulis. "Front Matter," PVLDB, v.6, 2013, p. i.
Abowd, John M.; McKinney, Kevin L. "Noise infusion as a confidentiality protection measure for graph-based statistics," Statistical Journal of the IAOS, v.32, 2016, p. 127.
BOOKS/ONE TIME PROCEEDING
Xiaokui Xiao, Gabriel Bender, Michael Hay, Johannes Gehrke. "iReduct: differential privacy with reduced relative errors," 07/15/2010-06/30/2011, Timos K. Sellis, Renee J. Miller, Anastasios Kementsietsidis, Yannis Velegrakis, "Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12-16, 2011," 2011, "http://dblp.uni-trier.de/rec/bibtex/conf/sigmod/XiaoBHG11".
Johannes Gehrke, Edward Lui, Rafael Pass. "Towards Privacy for Social Networks: A Zero-Knowledge Based Definition of Privacy," 07/15/2010-06/30/2011, Yuval Ishai, "Theory of Cryptography - 8th Theory of Cryptography Conference, TCC 2011, Providence, RI, USA, March 28-30, 2011. Proceedings," 2011, "conf/tcc/GehrkeLP11".
Xiaokui Xiao, Gabriel Bender, Michael Hay, Johannes Gehrke. "iReduct: differential privacy with reduced relative errors," 07/01/2011-06/30/2012, Timos K. Sellis, Renee J. Miller, Anastasios Kementsietsidis, Yannis Velegrakis, "Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12-16, 2011," 2011, "http://dblp.uni-trier.de/rec/bibtex/conf/sigmod/XiaoBHG11".
Johannes Gehrke, Edward Lui, Rafael Pass. "Towards Privacy for Social Networks: A Zero-Knowledge Based Definition of Privacy," 07/01/2011-06/30/2012, Yuval Ishai, "Theory of Cryptography - 8th Theory of Cryptography Conference, TCC 2011, Providence, RI, USA, March 28-30, 2011. Proceedings," 2011, "conf/tcc/GehrkeLP11".
Truls Amundsen Bjørklund, Michaela Götz, Johannes Gehrke, Nils Grimsmo. "Workload-aware indexing for keyword search in social networks," 07/01/2011-06/30/2012, Craig Macdonald, Iadh Ounis, Ian Ruthven, "Proceedings of the 20th ACM Conference on Information and Knowledge Management, CIKM 2011," 2011, "[BGGG2011:Indexing]".
Johannes Gehrke, Michael Hay, Edward Lui, Rafael Pass. "Crowd-Blending Privacy," 07/01/2011-06/30/2012, "Proceedings of the 22nd International Cryptology Conference," 2012, "[GHLP2012:Crowd]".