Ada Lovelace probably didn’t foresee the impact of the mathematical formula she published in 1843, now considered the first computer algorithm.
Nor could she have anticipated today’s widespread use of algorithms, in applications as different as the 2016 U.S. presidential campaign and Mac’s first-year seminar registration. “Over the last decade algorithms have become embedded in every aspect of our lives,” says Shilad Sen, professor in Macalester’s Math, Statistics, and Computer Science (MSCS) Department.
How do algorithms shape our society? Why is it important to be aware of them? And for readers who don’t know, what is an algorithm, anyway?
What Is an Algorithm?
“An algorithm is just a set of step-by-step instructions, like a recipe,” explains Ruth Berman ’17, who majored in neuroscience and computer science and works as a software engineer at Airbnb.
A computer algorithm can process large quantities of data, which is why one proved to be the recipe for improving satisfaction among the roughly 500 incoming Mac students who need to be assigned to one of 35 first-year seminar courses each year. Before 2009, two staff members dedicated one week each summer to maximizing, by hand, the number of first-years placed in the students’ first- or second-choice seminar.
Enter MSCS professor Andrew Beveridge, professor emeritus Stan Wagon, and Sean Cooke ’09, who collaborated in 2008 to create an algorithm optimizing student placement in seminars. First used in 2009 and now standard procedure at Macalester, the algorithm places more students in their preferred seminars than the manual method did, while vastly reducing staff time spent on the task.
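For readers curious what such a recipe might look like in code, here is a minimal sketch in Python. It is not the algorithm Beveridge, Wagon, and Cooke built, whose details this article does not describe. It simply tries every possible placement for a tiny, made-up group of students and keeps the one with the best total preference score. The student names, seminar labels, seat counts, and the function name `assign_seminars` are all invented for illustration.

```python
from itertools import product


def assign_seminars(preferences, capacity):
    """Try every placement and return the one with the lowest total rank.

    preferences: dict mapping student -> list of all seminars, best first.
    capacity:    dict mapping seminar -> number of seats.
    A first choice costs 0, a second choice costs 1, and so on.
    """
    students = sorted(preferences)
    seminars = sorted(capacity)
    best_plan, best_cost = None, None
    for plan in product(seminars, repeat=len(students)):
        # Skip placements that overfill any seminar.
        if any(plan.count(s) > capacity[s] for s in seminars):
            continue
        cost = sum(preferences[stu].index(sem) for stu, sem in zip(students, plan))
        if best_cost is None or cost < best_cost:
            best_plan, best_cost = dict(zip(students, plan)), cost
    return best_plan, best_cost


# Hypothetical data: four students, three seminars, four seats in total.
preferences = {
    "ana": ["A", "B", "C"],
    "ben": ["A", "B", "C"],
    "cy": ["A", "C", "B"],
    "dee": ["B", "A", "C"],
}
capacity = {"A": 1, "B": 2, "C": 1}

plan, cost = assign_seminars(preferences, capacity)
print(plan, cost)
```

Checking every placement is only practical for a handful of students. With roughly 500 students and 35 seminars the number of possible placements is astronomically large, so a real system would need a smarter optimization method.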
Boon or Bane?
The algorithm was a boon to incoming students and Academic Programs and Advising. While its use hasn’t caused unanticipated problems for the office, that’s not the case for all algorithms. The trouble is, Berman points out, that “algorithms are built by people, and so we build all of the biases we carry into algorithms.”
A relatively innocuous example of personal bias was revealed when people’s photos were appearing upside down on their phones, she explains: “It turns out that people who are left-handed turn their phones the opposite way of people who are right-handed when they take a landscape photo, causing the final image to be stored upside down.” Since there are more right-handed people than left-handed ones, it’s likely that right-handed programmers wrote the algorithm telling a cell phone the steps to follow to take a picture. “This is a problem that right-handed programmers might not notice on their own, so only left-handed users experienced this problem,” she says.
Other biases, though, built into algorithms by programmers unaware of their own prejudices, can have far more serious implications.
Unintentional bias may have influenced earlier versions of facial recognition software, which historically has been produced in the United States by white males. This software is trained to recognize visual components of human faces, including those that signify gender, in a process called machine learning (ML): A computer scientist chooses photographs of faces (i.e., a training dataset), and tells the computer to scan them and to identify facial characteristics that the scientist specifies. The computer develops algorithms as it learns which pixels of an image denote a nose, for example.
Berman suggests how bias can enter this training and the algorithms that the software develops during ML: “When programmers, particularly those from dominant social groups, compile a training dataset of faces, they might not think to include a diverse array of races or genders in that dataset. It’s not deliberate; it’s simply a reflection of who they see around them.”
Software trained only with photographs of white male faces is most accurate when it’s presented with faces of white men. It’s less accurate at correctly identifying the gender and facial features of white females, even less accurate at recognizing faces of light-skinned females of color, and least accurate when presented with faces of dark-skinned females.
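One way to surface a gap like this is to measure accuracy separately for each group rather than reporting a single overall number. The Python sketch below shows the idea. The records are made up purely for illustration and do not come from any real facial recognition system, and the group labels and the function name `accuracy_by_group` are placeholders.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return the share of correct predictions for each group.

    records: iterable of (group, predicted_label, actual_label) tuples.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Made-up records: the software gets every group_a face right
# but only half of the group_b faces.
records = [
    ("group_a", "M", "M"),
    ("group_a", "M", "M"),
    ("group_a", "F", "F"),
    ("group_a", "M", "M"),
    ("group_b", "M", "F"),
    ("group_b", "F", "F"),
    ("group_b", "M", "F"),
    ("group_b", "M", "M"),
]

scores = accuracy_by_group(records)
print(scores)
```

In this toy data the overall accuracy is 75 percent, a figure that hides the fact that all of the errors fall on one group.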
This type of algorithmic bias can create problems that disproportionately affect women and people of color. Because facial recognition software trained on white male faces is less accurate at recognizing people with darker skin and incorrectly identifies women more often than men, people in those categories are more likely to be erroneously identified as not matching their photo ID. Consequently, they could be targeted for additional scrutiny more often than white men unless the scanning algorithm’s ML bias is corrected.
An example of how this bias could play out in real life was reported in July 2018 by the American Civil Liberties Union of Northern California, which tested facial recognition software using photographs of then-current members of Congress and mug shots of other people who had been arrested for crimes. The result, reports ACLU’s Free Future blog, was that 28 members of Congress were incorrectly matched with mug shots. The blog didn’t mention whether female representatives were mismatched more than their male counterparts or whether any mismatches were between genetic relatives, who might have similar facial features. Nonetheless, 39 percent of the false matches were people of color, despite that population composing 20 percent of Congress.
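The arithmetic behind that comparison can be checked in a few lines of Python. The sketch uses only the figures reported above. The variable names are illustrative, and the count of affected members is an approximation derived from the rounded 39 percent figure.

```python
false_matches = 28             # members of Congress incorrectly matched
share_of_false_matches = 0.39  # portion of false matches who were people of color
share_of_congress = 0.20       # portion of Congress who were people of color

# Approximate number of falsely matched members who were people of color.
approx_count = round(false_matches * share_of_false_matches)

# How overrepresented people of color were among the false matches.
overrepresentation = round(share_of_false_matches / share_of_congress, 2)

print(approx_count, overrepresentation)
```

In other words, about 11 of the 28 false matches were people of color, nearly twice the share that representation in Congress alone would predict.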
The first step toward solving any problem is to recognize that a problem exists. “I think one of the biggest reasons that we struggle with bias in algorithms and ML models is that the field of computer science lacks diversity,” Berman says. “If individuals from different backgrounds come together to build a model, they’re able to recognize oversights that a homogenous group might miss. If we can make sure we have a more diverse group of engineers building these systems, we’ll see a reduction in algorithmic bias.” There’s reason to hope that reduction is coming: Algorithms’ potential for perpetuating bias is a hot topic among computer scientists today, notes Sen.
Brent Hecht ’05 agrees. Hecht, a geography and computer science double major, is an assistant professor at Northwestern University and the director of the university’s People, Space, and Algorithms research group, which seeks to “identify and address societal problems that are created or exacerbated by advances in computer science.” In a lecture on algorithmic bias that Hecht presented at Macalester in 2018, he shared his belief that computer scientists should disclose potential negative societal repercussions of their research in any paper they publish about that research.
For example, wrote Hecht and co-authors in March 2018 in the ACM Future of Computing blog, imagine a hypothetical robot prototype that automates home-care chores for people who have a physical disability. The robot might reduce the cost of that care. But it also might contribute to large-scale job loss among home healthcare workers who formerly performed those chores.
To address that scenario, Hecht and co-authors argue for a more rigorous peer review process for computer science research papers, to ensure that researchers are aware of and transparent about potential negative impacts of their research. They believe a more stringent review process should, long-term, incentivize the computer science field to more deeply engage with and address potential downsides of its innovations.
In the home healthcare example, researchers would have to disclose job loss and other potential downsides of the robotic prototype, and suggest technical and policy refinements to alleviate those downsides and maximize the robot’s net benefit to society. This stringent review would also address algorithms built into the robot.
Can a cost to society arise from exposing personal data to algorithms by sharing it online? Sure, you can decide how much information to post on your Facebook page. But once it’s posted, data-mining companies can collect it. That’s what whistleblowers alleged happened in 2016, when a now-shuttered data-mining company harvested data from millions of Facebook profiles without the knowledge of the owners of that data. Harvested information was used by the Republican presidential campaign to construct and deploy a marketing strategy, say the whistleblowers.
According to The Guardian, which published the whistleblower report, social media sites were used to target voters with messages custom-tailored to those voters’ social media profiles. In the report, whistleblowers said the data-mining company monitored the effects of its messaging on different types of voters by tracking their online activity. That kept the company and the campaign aware of levels of voter engagement with various social media sites.
Information about targeted voters’ online activity allowed the company to constantly update its algorithms to more precisely target messages that the campaign sent to voters. This resulted in different categories of voters being presented with different messages based, in part, on geographical information about voters who visited YouTube’s home page, where the campaign bought ad space.
Voters whose geographical information suggested they were swing voters—uncommitted to the Republican nominee but open to persuasion—received ads showing some of the nominee’s high-profile supporters. Voters in geographical areas where the campaign believed people would support the Republican Party saw an image of the nominee looking triumphant, and information designed to help them find their nearest polling station.
Grappling with Data Privacy
Given the public’s increasing awareness of a lack of individual control over personal online data, are Mac students sharing less information on social media than they used to? Sen has taught at Macalester for a decade and doesn’t think so: “They share everything.”
In the past five years, though, “they’ve become more aware of labor and equity implications of algorithms,” he observes. “People are starting to think about the creation of online data as labor.” To understand data as labor, consider that, “People create data (reviews, ratings, searches, web clicks, etc.) that are a critical resource for a tech company’s algorithms and therefore its bottom line.”
In addition, personal data you post online may be sold to a data broker, which can sell it to marketing companies. Bottom line? Businesses can profit from your data, but you do not. To push back against use of personal data posted online, California passed the California Consumer Privacy Act of 2018, the first such law in the United States. Once it goes into effect in 2020, Californians will be able to opt out of having their personal online information sold. They’ll also have additional control over, and knowledge of, how their data is handled by companies doing business in California. If those companies don’t protect personal data, the law will allow them to be sued by the people who posted the data and by the state’s attorney general.
Grappling with Equity
Companies’ ability to derive revenue by using algorithms, in part, to collect personal online data has led to another prominent issue within the field of computer science today. That’s the observation that algorithms used by large companies to collect and manage data can lead to a serious redistribution of wealth, with the companies as beneficiaries. “Macalester students are watching this,” Sen says.
Algorithms’ potential to skew outcomes extends to news distribution. This especially affects people who acquire most of their information about current events online, says political science professor Adrienne E. Christiansen. As someone who studies technology’s role in shaping socio-political change, she cautions that algorithms “can mislead us into believing that more people agree with us than actually do. It warps our outlook.” That’s because our capacity to personalize what news gets sent to us electronically, whether it’s social media clickbait or formal news from reliable online sources like respected newspapers, “unquestionably shapes our perspectives unless we’ve developed a systematic practice of ‘casting the net widely’ to acquire information.”
Online news feeds, regardless of their source, decrease the likelihood you’ll encounter different perspectives. That’s because the feeds are personalized to your preferences, which a news provider’s algorithm has learned from your previous likes and clicks. Not encountering different ideas deprives you of opportunities to consider them.
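A toy example shows how this narrowing can happen. The Python sketch below is an assumption-laden illustration, not any news provider’s actual method: it ranks articles by how often the reader has clicked on each topic and shows only the top few. The article titles, topics, click history, and the function name `personalized_feed` are all invented.

```python
from collections import Counter


def personalized_feed(articles, click_history, size):
    """Return the `size` articles whose topics the reader clicked most.

    articles:      list of (title, topic) pairs.
    click_history: list of topics the reader clicked in the past.
    """
    topic_clicks = Counter(click_history)
    # Topics never clicked count as 0 and sink to the bottom.
    ranked = sorted(articles, key=lambda a: topic_clicks[a[1]], reverse=True)
    return ranked[:size]


# Hypothetical articles and reading history.
articles = [
    ("Election recap", "politics"),
    ("Transfer rumors", "sports"),
    ("Budget vote", "politics"),
    ("New exoplanet", "science"),
    ("Playoff preview", "sports"),
]
click_history = ["sports", "sports", "politics"]

feed = personalized_feed(articles, click_history, size=3)
print(feed)
```

Because the reader in this example has never clicked a science story, the science article never reaches the feed, so there is no chance to click on it, and the pattern reinforces itself.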
“This can lead to a divided society,” Christiansen says. “I value the clash of ideas. It’s a way to test my own ideas, [but] it’s harder and harder to do that. Our enemies in other parts of the world will take advantage of those schisms [in a divided society] by exploiting them to foment discontent. What I fear terribly is that people will shrug their shoulders at that.”
The Long View
Not all algorithms are problematic. Berman’s anti-discrimination team uses algorithms to replace subjective information on Airbnb’s website with objective information, in order to remove opportunities for the company’s hosts and guests to make biased decisions. Hecht and his colleagues are working to maximize algorithms’ net benefit to society. Algorithms help people study by undergirding self-paced computer teaching programs, like the one Christiansen used to learn Danish in preparation for her 2019 sabbatical in Denmark.
Sen suggests that everyone who uses the internet should consider the consequences of their online behavior. Next time you’re using your computer, he recommends, “think about what AI [artificial intelligence, a type of algorithm] is learning from the actions you’re taking online.”
The Macalester Lens
Hecht believes his combined geography and computer science education at Macalester “has allowed me to see subjects in a different light than most of my colleagues. That combination was a big factor in our early identification of bias in algorithms as a major risk factor, and our ability to raise early warning signs of it.”
Berman took a class in computer science “by accident” her first semester at Mac, but by the end of the semester had decided to major in it. “We learned how to solve problems and answer questions using programming as a tool,” she says. “The MSCS professors were incredible mentors and helped me understand the many opportunities to apply computer science to solve real problems.”
Acquiring a wide-angle view of the world via a liberal arts education, learning to use computer science to solve real problems: “There is no one better positioned to tackle the human effects of algorithms than Macalester students,” Sen says. “They are broad thinkers, fearless, and have a strong moral compass.”
Englishwoman Ada Lovelace (1815–52) grew up privately tutored in mathematics and, at age 17, began collaborating with University of Cambridge mathematics professor Charles Babbage to solve mathematical problems.
Babbage developed plans for the Analytical Engine, considered the first programmable computer. In 1842, Lovelace translated an article about the Engine from the French, adding notes, equations, and a mathematical formula she wrote. (The translation was published in 1843.)
This formula, intended to apply the Engine to calculate complicated numbers, is considered the first computer algorithm and Lovelace, the first computer scientist. Her notes propose using the Engine to translate music, images, and text into digital format.
Lovelace’s visionary understanding of computer algorithms’ potential was proven true by British code-breaking work during World War II, when an algorithm enabled coded text within German messages to be converted to numbers, analyzed, and converted back into human-readable text. Lovelace’s name lives on in a computer language the U.S. Department of Defense named after her in 1980, and in Ada Lovelace Day, established in 2009 as the second Tuesday of every October to celebrate women’s achievements in STEM (science, technology, engineering, and mathematics) careers.
By Janet Cass ’81 / Illustrations by Carl Wiens / i2iart.com
February 1 2019