Debate over new census privacy measures overlooks larger issues with data error in Title I funding

Controversy surrounds the U.S. Census Bureau's new measures to preserve privacy, but a new study examines how existing data error can pose an even larger problem for evidence-based policies. The cornerstone of the Census Bureau's updated privacy measures, differential privacy, requires injecting statistical uncertainty, or noise, when sharing sensitive data. Scholars, politicians, and activists have raised concerns about the effect of this noise on crucial uses of census data. Yet most analyses of trade-offs around differential privacy overlook deeper uncertainties in census data. In a new study, researchers examined how education policies that use census data misallocate funds as a result of statistical uncertainty.

The study found that misallocations due to noise injected for privacy can be small or negligible compared with misallocations due to existing sources of data error, such as misreporting or non-response. The study also found that simple policy reforms could help funding formulas address the unequal distribution of uncertainty from data error and smooth the way for new privacy protections, offering an avenue for compromise between targeted policy, equity, and better privacy protections.

The study, conducted by researchers at Carnegie Mellon University (CMU) and published in Science, focuses on Title I of the Elementary and Secondary Education Act, which provides financial assistance to school districts with high numbers of children from low-income families to help ensure that all children meet state education standards. Federal funds are allocated through formulas based primarily on Census estimates of poverty and the cost of education in every state. In 2021, the U.S. government appropriated more than $16.5 billion in Title I funds to more than 13,000 school districts and other local education agencies.

In this study, researchers used an exact simulation of the Title I allocation process to compare the policy impacts of noise injected for privacy with the impacts of existing statistical uncertainty. Specifically, they compared the impacts of quantified data error and of a possible differentially private noise injection mechanism. For example, of the $11.7 billion in 2021 Title I funds this study examined, $1.06 billion was allocated away from some districts in an average run of the simulation due to data error alone. This figure increased by just $50 million when the researchers injected noise to provide relatively strong privacy protection.
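
The comparison described above can be illustrated with a toy sketch. This is not the study's simulation or the actual Title I formula: it assumes a simple proportional split of a budget, models existing data error as Gaussian noise with a fixed relative size, and models privacy noise as Laplace noise with scale 1/epsilon. The function names, the error model, and all parameters are hypothetical and chosen only for illustration.

```python
import random


def allocate(counts, budget):
    """Split a budget in proportion to non-negative counts (toy formula, not Title I)."""
    clipped = [max(c, 0.0) for c in counts]
    total = sum(clipped)
    if total == 0:
        return [budget / len(counts)] * len(counts)
    return [budget * c / total for c in clipped]


def misallocation(true_alloc, noisy_alloc):
    """Total dollars moved away from districts that receive less than they should."""
    return sum(max(t - n, 0.0) for t, n in zip(true_alloc, noisy_alloc))


def laplace(rng, scale):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def simulate(true_counts, budget, rel_error, epsilon, runs=200, seed=0):
    """Average misallocation from data error alone, and from data error plus privacy noise.

    rel_error: assumed relative standard deviation of existing data error.
    epsilon:   assumed privacy parameter; Laplace noise uses scale 1 / epsilon.
    """
    rng = random.Random(seed)
    baseline = allocate(true_counts, budget)
    error_only = 0.0
    error_plus_privacy = 0.0
    for _ in range(runs):
        with_error = [c + rng.gauss(0, rel_error * c) for c in true_counts]
        with_privacy = [c + laplace(rng, 1 / epsilon) for c in with_error]
        error_only += misallocation(baseline, allocate(with_error, budget))
        error_plus_privacy += misallocation(baseline, allocate(with_privacy, budget))
    return error_only / runs, error_plus_privacy / runs


if __name__ == "__main__":
    # Hypothetical counts of eligible children in four districts.
    counts = [500, 1200, 80, 3000]
    print(simulate(counts, budget=1_000_000, rel_error=0.1, epsilon=0.1))
```

Running both noise sources against the same baseline allocation lets the two averages be compared directly, which is the kind of side-by-side comparison the researchers report.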

"We paid special attention to the way Title I implicitly concentrates the negative impacts of statistical uncertainty on marginalized groups," explains Ryan Steed, a Ph.D. student at CMU's Heinz College, who led the study. "Weakening privacy protection does little to help these groups, and for them, participating in a Census survey can be especially risky."

The results show that misallocations due to statistical uncertainty particularly disadvantage marginalized groups (e.g., Black and Asian students; districts with large populations of Hispanic students). Whether a demographic group lost funding depended on whether its members tended to live in high- or low-poverty districts, including those in denser, usually urban districts.

"However, we also identified policy reforms that could reduce the disparate impacts of both data error and privacy mechanisms," notes Steven Wu, assistant professor at CMU's School of Computer Science. "For example, using multi-year averages, rather than estimates from a single year, decreased both overall misallocation and disparities in outcomes."
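
One way to see why multi-year averages can help is a short sketch, assuming that each year's estimate carries independent error around a poverty count that does not change from year to year. Both assumptions are simplifications made here for illustration, and the function name and parameters are hypothetical; the study's own analysis may differ.

```python
import random
import statistics


def spread_of_estimates(true_count, sd, years, runs=2000, seed=0):
    """Standard deviation of an estimate formed by averaging `years` noisy yearly values.

    Assumes independent Gaussian error each year and a constant true count.
    """
    rng = random.Random(seed)
    averaged = []
    for _ in range(runs):
        yearly = [true_count + rng.gauss(0, sd) for _ in range(years)]
        averaged.append(statistics.fmean(yearly))
    return statistics.pstdev(averaged)


if __name__ == "__main__":
    print("single year:", spread_of_estimates(1000, 100, years=1))
    print("three-year average:", spread_of_estimates(1000, 100, years=3))
```

Under these assumptions the spread of the averaged estimate shrinks roughly with the square root of the number of years, so a formula fed by the average sees less noise than one fed by a single year.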

Among the study's limitations, the authors point out that it does not account for systematic undercounts and many other unquantified forms of statistical uncertainty that affect poverty estimates, including previous measures to protect privacy such as data swapping.

"Our results suggest that the impacts of differential privacy relative to other sources of error in census data could be minimal," notes Alessandro Acquisti, professor of information technology and public policy at CMU's Heinz College, who coauthored the study. "Simply acknowledging the effects of data error could improve future policy design for both funding formulas and disclosure avoidance."

More information: Ryan Steed et al, Policy impacts of statistical uncertainty and privacy, Science (2022). DOI: 10.1126/science.abq4481

Journal information: Science

Provided by Carnegie Mellon University
