How Racial Data Gets ‘Cleaned’ in the U.S. Census
The national survey offers more identity choices than ever—until those choices get scrubbed away.
While early racial data were gathered to feed an obsession with racial purity, and were even used to locate Japanese Americans for internment during World War II, over time the Census Bureau settled on bureaucracy to explain its work. And yet, a simple count of the population remains ideologically loaded. These data are not neutral or objective information about the population. Instead they reflect changing political priorities and techniques to grasp how the country’s population is seen, and how resources are made available to it.
* * *
Shortly after the country’s founding, the U.S. government began collecting data on the racial and ethnic make-up of every person in each household. Every decennial census ushers in some new language meant to enhance its accuracy and reliability as a measurement of the entire national population. There’s symbolic power in being represented on the census, in being counted. But as the political scientist Melissa Nobles shows in her book Shades of Citizenship, these data also track compliance with civil-rights legislation, particularly where voting districts are concerned. They are linked to federal resources, intensifying public agitation around the categories.
During the years between each census, researchers, activists, politicians, and interest groups lobby for the rewording of a label, the addition (or elimination) of a category, or the disaggregation of another, such as Asian or American Indian or Alaska Native. In 2000, for example, “Hispanic or Latino, or Spanish origins” was reclassified from racial to ethnic data. Respondents were also allowed to select multiple boxes to reflect multiracial heritage for the first time. Additional changes that affect how the racial makeup of the country is represented are underway, including the creation of a separate category for people of Middle Eastern and North African descent (referred to as MENA).
Shifts in racial classifications raise questions about what exactly is being counted, how people interpret the same questions differently, and what to do about people’s changing perceptions of their racial background. In 2015, the Pew Research Center reported that at least 9.8 million people identified with a different racial or ethnic background than they had in 2000. When someone appears to “change” races, the resulting data is sometimes construed as erroneous.
Errors in reporting and recording certainly do happen. But if racial data must be cleaned, then some data is dirty. And that dirtiness is undeniably political. Some responses are more likely to be diagnosed as dirty. Given the goal of creating information that is comparable from one national census to the next, the data most under suspicion are those that correspond to the categories most in flux: people who checked more than one box, for example, or those who saw themselves as members of different racial or ethnic groups at different times.
While data cleansing can raise ethical questions about altering people’s responses, it offers a bureaucratic solution to a difficult position for the Census Bureau. The bureau is under public pressure to modify its data-collection methods, on the one hand. But, on the other, it is also expected to provide reliable data that is comparable over time and across other government agencies at the local, state, and national levels. The desire for comparability prompts some of the most intensive or imaginative cleaning.
In 2010, the “some other race” category proved the dirtiest. This selection included a write-in box where respondents were expected to provide the name of the race to which they felt they belonged. The vast majority of the more than 19 million people (6.2 percent of respondents) who made this selection also identified themselves as having “Hispanic, Latino, or Spanish” origins for the ethnicity question asked prior to their race. In its document 2010 Census Redistricting Data, the bureau states that it used “automated” and “expert” coding to recode write-in responses for compliance with the master files (or predetermined rules) of the database or system. For example, the document states that someone describing themselves as “Haitian” and “Moroccan” was recoded to “black” and “white.” The “some other race” category also includes people who preferred to write in responses like “multiracial” in lieu of ticking multiple boxes.
While these new measures might reduce costs, civil-rights groups like the Leadership Conference on Civil and Human Rights are concerned that they will continue to undercount or otherwise misrepresent vulnerable populations and communities of color whose members are less likely to have reliable internet access. That might make them vulnerable to inaccurate identification in administrative records.
* * *
The Census Bureau didn’t respond to a request for comment or clarification about its perception of dirty data. Nevertheless, the bureau likely finds itself in a cultural minefield, as it becomes a site where debates unfold about which individuals and groups are rendered invisible, as well as how finite public resources get allocated. The ongoing dispute over whether future censuses should or will include a question about sexual orientation or gender identity belies the simplicity of the current sex question, which only asks respondents if they are male or female. With more public pressure and social change, that data might also become disaggregated one day, and then recoded into categories like “cisgender male” or “female, not transgender.”
Some people bristle at being asked to reduce the complexity of their self-perceptions to a single choice. The “check-this-box” mentality of the census is at odds with the more fluid and ambiguous self-perceptions of the population: people originating from outside the country, for example, or those habituated to customizable digital profiles, like those on Facebook, which appear to revel in the uncertainty of multitudinous identity. If anything, these digital tools have helped accelerate citizens’ willingness to self-identify in categories broader than those provided by the government, and even to demand to be able to do so.
Even so, some of the choices haven’t changed. Since the first census in 1790, one category has remained stable, or has at least been modified less than any other on the national census and other official government forms: “white.”
Source: The Atlantic