How federal confidential data contribute to the 'public good'

Anya Litvak, Pittsburgh Post-Gazette

Published in Business News

PITTSBURGH — The federal government collects a lot of granular confidential data about people and businesses that is never released to the public.

Some of it lives inside nondescript secure rooms scattered across the U.S. called Federal Statistical Research Data Centers. Penn State University has one, as does Ohio State University. Researchers who want to study this confidential data have to apply to the agency that keeps it for approval, a process that takes months.

Randy Walsh, an economics professor at the University of Pittsburgh, launched his campaign to bring a Federal Statistical Research Data Center to the Steel City eight years ago. In April, he celebrated the official opening in a windowless room in the basement of the Cathedral of Learning.

“It’s beautiful,” Walsh said, describing a space with wood paneling and gray carpet, with 10 computers and a conference room, which requires a badge and an FBI background check to enter.

Inside is a repository of access-restricted data collected by federal agencies and made available for research in service of “the public good.”

This confidential data from the Census Bureau, the IRS, labor and health agencies, among others, has been used to assess policy decisions, identify interventions, track the health of communities and find links between people’s home and work lives.

Walsh, a co-director of the new data center along with Brian Kovak, professor of economics and public policy at CMU’s Heinz College of Information Systems and Public Policy, said there are fewer than a dozen researchers in Pittsburgh who are currently able to work inside the center. In the Post-Gazette’s latest “In Conversation With” Q&A, he said he hopes to attract many more.

This interview has been edited for clarity and length.

Post-Gazette: Can you give me a sense of some of the data that's available?

Walsh: Why don’t I talk about some of the research being done. (In 2023) Claudia Goldin won the Nobel Prize in economics for a lot of work that she has done on gender inequality and wages. A large part of that work was done leveraging data from the FSRDC, tracking individuals and their wages.

If you’ve ever run across a thing called the Opportunity Atlas, it’s this guy Raj Chetty. He's a MacArthur Genius. He's done lots of work understanding how the neighborhood where you grow up impacts your later life outcome. And all of that work is done using this confidential data in the Census data center, where I have all the demographics of where people live.

Stuff that folks in Pittsburgh are either already doing in a data center somewhere else, or that they’ve got in the pipeline includes looking at how firms make decisions about job relocation. So how do firms decide where to move their production across the U.S. and take jobs with them? We've got a proposal in right now trying to understand what are the determinants of homeownership rates in a city or in a neighborhood?

 

We’ve got someone looking at …. what are the benefits, in terms of my wages, of being willing to change metropolitan areas? Someone's got a proposal to look at impacts of different policies designed to respond to overdose deaths. And another project that looks at a suite of different public policies that are designed to help people get off of unemployment and back into the workforce.

PG: You’re an applied microeconomist. What does that mean?

Walsh: The “applied” means that I do work with data and the “micro” means that I tend to look at individual behavior, rather than, like, inflation or business cycles or the macro economy. I look at urban economics. I do economic history work. I do stuff on environment, I do stuff on housing market and things like that.

PG: Do you have a favorite government data set?

Walsh: I tend to study neighborhood composition and neighborhood outcomes, and so I love just the basic Census data set with all the demographics down to a really fine spatial scale, because it lets me see things that I otherwise couldn't.

The Census has been doing all this work to link individuals across these different data sets — to connect people from healthcare data with where they live, to their unemployment insurance file, which tells you about their employment history. Being able to track someone as they move in and out of employment with different employers — those linkages that you can only get in these data centers are just really important.

PG: Does this imply that somebody with access can pick a person and learn everything about them?

Walsh: I can't do it with your name. For every project, every piece of data you're going to use, every link you can use has to be defined and approved, and I'm never going to see your name. And if I ever tried… if I really wanted to, say, find out about this reporter (and) what can I learn about her? There would be jail time and huge fines in my future if I ever got caught doing that.

PG: Why do we need one (of these) in Pittsburgh when there are already 37 of them?

Walsh: We were starting to see that as we were recruiting faculty or recruiting PhD students, they (would say), “Well, you know, this other university I'm looking at has a census research data center.” Not having the data center here means the cost of using it is really expensive. You've got to travel physically to another city. And so that really inhibits folks putting in the energy to put in a proposal.

If the research that was being done there wasn't that important, maybe it wouldn't matter. But it turns out, (for) a lot of the research in social science and in business and in public health, access to these data sets really makes a difference.


©2026 PG Publishing Co. Visit at post-gazette.com. Distributed by Tribune Content Agency, LLC.

 
