De-identifying government datasets: Techniques and governance
NIST has published Special Publication (SP) 800-188, De-Identifying Government Datasets: Techniques and Governance.
De-identification removes identifying information from a dataset so that the remaining data cannot be linked to specific individuals. Government agencies can use de-identification to reduce the privacy risks associated with collecting, processing, archiving, distributing, or publishing government data.
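To make the idea concrete, here is a minimal Python sketch of one traditional approach: dropping direct identifiers and coarsening quasi-identifiers. The field names ("name", "ssn", "email", "age", "zip") and the generalization rules are assumptions chosen for illustration; they are not taken from SP 800-188, and this kind of transformation does not by itself guarantee that records cannot be re-identified.

```python
from typing import Dict

# Hypothetical field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "ssn", "email"}


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record: Dict[str, object]) -> Dict[str, object]:
    """Drop direct identifiers and coarsen two quasi-identifiers."""
    # Remove fields that identify a person on their own.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Coarsen fields that could identify a person in combination.
    if "age" in out:
        out["age"] = generalize_age(int(out["age"]))
    if "zip" in out:
        out["zip"] = str(out["zip"])[:3] + "**"
    return out


record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "age": 34,
    "zip": "20899",
    "diagnosis": "asthma",
}
result = deidentify(record)
print(result)
```

The sensitive attribute ("diagnosis") is kept because it is the data of interest. Whether the remaining combination of fields is still linkable depends on the dataset and on what outside information an attacker holds.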
Previously, NIST published NIST Internal Report (IR) 8053, De-Identification of Personal Information, which provided a survey of de-identification and re-identification techniques. SP 800-188 provides specific guidance to government agencies that wish to use de-identification.
This final document was authored by experts at NIST and the U.S. Census Bureau. It references up-to-date research and practices for both traditional de-identification approaches and formal privacy methods, such as differential privacy, for creating de-identified datasets.
This document also addresses other approaches for making datasets that contain sensitive information available to researchers and for public transparency. Where appropriate, this document cautions users about the inherent limitations of traditional de-identification approaches when compared to formal privacy methods, such as differential privacy.
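As an illustration of what a formal privacy method looks like in practice, the following Python sketch applies the Laplace mechanism, a standard building block of differential privacy, to a counting query. The function names, the sample data, and the choice of epsilon are assumptions for illustration and are not drawn from SP 800-188. Naive floating-point noise sampling like this is also not suitable for production use.

```python
import random
from typing import Callable, Iterable


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponential variates with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(
    values: Iterable[int],
    predicate: Callable[[int], bool],
    epsilon: float,
    rng: random.Random,
) -> float:
    """Return a noisy count; a counting query has sensitivity 1."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    # Noise scale = sensitivity / epsilon = 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of seven survey respondents.
ages = [23, 35, 41, 58, 62, 67, 70]
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
noisy = dp_count(ages, lambda a: a >= 60, epsilon=1.0, rng=rng)
print(noisy)
```

Smaller values of epsilon add more noise and give stronger privacy protection. Unlike the traditional approaches, the guarantee does not depend on assumptions about what outside information an attacker might hold.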
More information: Simson Garfinkel, De-Identifying Government Datasets, NIST (2023). DOI: 10.6028/NIST.SP.800-188