Big data can render some as ‘low-resolution citizens’

In India, a government database where 1.25 billion residents are identified with fingerprints and photographs has created a bureaucratic infrastructure that, while meant to bring marginalized citizens into the fold, can sometimes have the opposite effect.

Using Aadhaar, India’s biometrics-based individual identification system, as a test case, Ranjit Singh, Ph.D. ’20, and his collaborator, Steven Jackson, associate professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science and associate professor in the Department of Science and Technology Studies in the College of Arts & Sciences, examined how the system works for the country’s nearly 1.4 billion people.

Their paper, “Seeing Like an Infrastructure: Low-resolution Citizens and the Aadhaar Identification Project,” was published on Oct. 18 in Proceedings of the Association for Computing Machinery on Human-Computer Interaction. The paper also won a Best Paper Award at this week’s ACM Computer-Supported Cooperative Work conference.

“There’s a deep relationship between the way of life of a person and what is recorded and registered about them on a database,” said Singh, now a postdoctoral scholar at Data & Society, an independent, nonprofit research organization focusing on the social implications of data and automation.

“When they are in alignment, you manifest in ‘high resolution’ and these systems work for you amazingly well,” Singh said. “But when they break down, they break down in really uneven ways. And capturing that unevenness requires us to see like an infrastructure.”

Aadhaar, introduced more than 10 years ago, is the world’s largest biometrics-based identification database. With more than 1.25 billion residents registered, Aadhaar is designed to provide a standardized legal identity to all Indian residents, including those who had no identity documents previously.

Each registrant has biometric information (10 fingerprints, two iris photographs, a facial photograph) and demographic information (name, age, gender, residential address) on file, and is given a 12-digit number, similar to the nine-digit Social Security number in the U.S.

For his field work, Singh returned to his native India three times – in 2015, 2016 and 2018 – for a total of 18 months and interviewed a wide range of stakeholders, including members of the Unique Identification Authority of India, which implements Aadhaar.

Singh was interested in the disconnect, if any, between the plans for implementing Aadhaar and the reality on the ground for the citizens participating in the program.

“That was part of the impetus for engaging with the project,” he said. “I always had a hunch that it would deeply shape the nature of Indian citizenship, although the number was never claimed to be an indicator of citizenship. It was promoted as a way of uniquely identifying people across government databases.”

Singh and Jackson came up with the terms “low-resolution” and “high-resolution” citizens while reading previous literature on the topic. One problem with Aadhaar was that people whose fingerprints were not distinct – not high enough in resolution – faced difficulties in getting registered in the database.

“So that, to a certain extent, allowed Steve and me to actually talk about how people need to be in ‘high resolution’ to become a part of the system,” Singh said.

It’s understandable, and generally accurate, to place “low-resolution citizens” at the lower end of the socioeconomic scale and “high-resolution citizens” at the other end, but the correspondence is not absolute, Singh said.

“It’s not just about social hierarchies,” he said, “it’s about hierarchy as it manifests through data.”

Still, Jackson said, marginalities tend to stack up. “It is often the case,” he said, “that if you’re rendered marginal in one system, that often becomes a way of becoming marginal in other systems.”

In closing, Singh writes that using big-data systems in state governance is a social and moral challenge: “Social, because making up and interpreting a population as data requires work, organization and discipline; moral, because using data to represent citizens inevitably raises practical and normative questions of fairness, accountability and inclusion.”

Jackson praised Singh’s work, which also formed the basis of Singh’s doctoral thesis.

“It’s a really powerful example of the differential impact of seemingly universal systems,” Jackson said, noting U.S. voter ID laws as another example. “Even systems that are offered in good faith may be universal or inclusive in ambition while being differentiating in outcome.

“Some of that is by friction,” he said. “If you make access a little bit more difficult for some people, that makes certain people’s lives a little bit harder. But for others, it just flows. The result is a differentiating impact, and a subtle but important contribution to inequality.”

The work was supported by a doctoral field research grant from the National Science Foundation.

Read this story in The Cornell Chronicle.
