Social scientists take on data-driven discrimination

By: Susan Kelley, Cornell Chronicle
Tue, 02/12/2019

Search engine rankings can glorify or torpedo a reputation in minutes. Scoring systems decide whether a person gets a loan. Online hiring platforms calculate which applicants get a job interview. And automated decision systems pick which students get into which schools.

Now that big data, machine learning and digital surveillance pervade all facets of life, these technologies have the potential to create racial and social inequalities – and make existing discrimination even worse, according to a team of Cornell scientists addressing the problem.

“Data-driven algorithms have a huge influence on our lives. And why do we believe their conclusions?” said Martin Wells, the Charles A. Alexander Professor of Statistical Sciences, who leads the team. “Sometimes the substantive conclusions from the algorithms are correct. Sometimes they are not. That’s something we want to analyze and understand.”

Through 2020, the Algorithms, Big Data and Inequality collaborative project will challenge data-driven discrimination by examining the design and use of algorithmic systems. The project, sponsored by Cornell’s Institute for the Social Sciences, includes faculty in disciplines across campus.

Bias can infiltrate algorithmic systems in a variety of ways, said Wells, who is also a professor of social statistics, biostatistics, clinical epidemiology, and law. Algorithms and the data they use – such as health records, advertising and retail activities, social media posts and public records – are almost always “black boxed,” or hidden by technical complexity and corporate secrecy, with potentially discriminatory or unethical uses.

Bias is often baked into the historical data that algorithms process.

“If we’re building a machine-learning model and we calibrate it on historical data, we’re just going to propagate the inherent biases in the data,” said Wells. “For example, in the criminal justice context, there are racial inequities in sentencing. So that data has inequities built in, and they’re built into the new models.”

The team will look at the problem through a variety of research projects.

Ifeoma Ajunwa, assistant professor of labor relations, law and history, and Brooke Erin Duffy, assistant professor of communication, will examine how independent creative workers such as journalists, photographers and video creators adapt to the algorithms of platforms like Google, Facebook and Instagram. These types of algorithmic systems make it increasingly difficult for creative workers to reach potential clients and audiences.

“No longer is their labor focused exclusively on content creation,” Duffy said. “Creative workers must now consider content promotion as they struggle to make sense of search engine optimization and the ever-shifting algorithms on sites like Instagram and Twitter, which threaten to render their content invisible.”

Ajunwa is writing a book about the role of law and technology in the workplace and their effects on management practices. For a separate project, she’ll interview the workers who design and develop hiring and productivity tools, to gain a deeper understanding of the ethical issues they face in their work.

“Algorithmic decision-making is the civil rights issue of the 21st century,” Ajunwa said. “As algorithmic systems can provide end runs against equal protection laws and may also be used to skirt laws ensuring equal opportunity on the labor market, it becomes even more crucial that leading universities like Cornell are involved in cutting-edge social scientific research to address any arising inequalities.”

Solon Barocas, assistant professor of information science, hopes to learn how companies that use statistical models for employment and credit decisions address concerns about bias and discrimination. He’ll also examine how those procedures compare to recent proposals from law and computer science focused on fairness, accountability and transparency in machine learning, and where practice and research could better inform each other.

Credit reports and credit scoring are the topic of a project by Malte Ziewitz, assistant professor of science and technology studies. He’ll look into the recourse available to everyday citizens who feel they’ve been misrepresented in credit assessments. Ziewitz will also research a New York City task force that is examining the city’s “automated decision systems,” which significantly impact New Yorkers’ lives by matching students with schools, assessing teacher performance and detecting Medicaid fraud. And he’ll examine what the process of passing regulations for algorithmic accountability can teach us.

While his colleagues look at the downstream consequences of data-driven discrimination, Wells, a statistician, is looking at how algorithms work. “With my training, I know how to peek under the hood of the algorithm and analyze the underlying mathematical and statistical architecture,” he said. He’ll also look at how computer scientists and policy makers can detect bias and identify whether it is real or perceived. “There’s a lot of basic statistical modeling and analysis to do,” he said.

Collectively, the team hopes to draw attention to Cornell’s unique skill sets in the arena. “We’re all working toward making sure Cornell gets recognized as a place where we take seriously not only issues of inequality,” Duffy said, “but also the role that technology may play in challenging or exacerbating social inequalities.”

This story also appeared in the Cornell Chronicle.

Project members: Solon Barocas, Brooke Erin Duffy, Malte Ziewitz, Ifeoma Ajunwa