Auditing Algorithms @ Northeastern

This site is the homepage for the Algorithm Auditing Research Group within the College of Computer and Information Science at Northeastern University. Here, you will find explanations of and links to our work, as well as open-source data and code from our research.

Why Audit Algorithms?

Today, we are surrounded by algorithmic systems in our everyday lives. Examples on the web include Google Search, which personalizes search results to try to surface more relevant content; Amazon and Netflix, which recommend products and media; and Facebook, which personalizes each user's news feed to highlight engaging content. Algorithms are also increasingly appearing in real-world contexts, like surge pricing for rides from Uber; predictive policing algorithms that attempt to infer where crimes will occur and who will commit them; and credit scoring systems that determine eligibility for loans and credit cards. The proliferation of algorithms is driven by the explosion of Big Data that is available about people's online and offline behavior.

Although there are many cases where algorithms are beneficial to users, scientists and regulators are concerned that they may also harm individuals. For example, sociologists and political scientists worry that online Filter Bubbles may create "echo chambers" that increase political polarization. Similarly, personalization on e-commerce sites can be used to implement price discrimination. Furthermore, algorithms may exhibit racial and gender discrimination if they are trained on biased datasets. As algorithmic systems proliferate, the potential for (unintentional) harmful consequences to users increases.

If we are going to live in a society permeated by sophisticated algorithms, then it is imperative that we understand how these algorithms are being implemented, the data they are using, and the effect that they have on individuals. Below, you will find links to specific research projects that our group has undertaken to address these issues.

Search Engines, Maps, and Filter Bubbles

Personalization on Google Search

Billions of people around the world rely on Google Search as their gateway to information on the Web. In this project, we examine how Google Search personalizes results for users, and what types of search queries are more heavily personalized.
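As a rough illustration of what "measuring personalization" can mean, the sketch below compares the result lists that two accounts receive for the same query using the Jaccard index (the share of URLs the two lists have in common). This is a generic similarity measure chosen for illustration, not necessarily the metric used in the project, and the function name and example URLs are hypothetical.

```python
def jaccard(results_a, results_b):
    """Return the Jaccard index of two result lists (1.0 = same URLs, 0.0 = disjoint)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        # Two empty result pages are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical results for the same query issued from two different accounts.
control = ["a.example", "b.example", "c.example", "d.example"]
treatment = ["a.example", "c.example", "e.example", "d.example"]

overlap = jaccard(control, treatment)
print(f"Jaccard overlap: {overlap:.2f}")
```

Because the Jaccard index ignores ordering, two pages that show the same links in a different order still score 1.0; a rank-aware measure would be needed to capture reordering.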

View Details

Geolocal Personalization

One of the most important features used by search engines to personalize content is the user's location. In this follow-up to our Google Search paper, we specifically focus on how location impacts search results.

View Details

International Borders on Maps

Online mapping services like Google Maps and Bing Maps often personalize the international borders of countries based on the location of the user viewing the map. In this study, we exhaustively catalog cases of border personalization around the world, including several instances that had never been documented before.

View Details

Suppressing SEME

Previous research has shown that politically biased search results can dramatically impact the voting preferences of undecided voters, the so-called Search Engine Manipulation Effect (SEME). In our work, we have studied the effectiveness of design interventions that reveal bias in search engine results to users.

View Details

Bias and Discrimination

Discrimination in the Gig-economy

Gig-economy websites often solicit ratings and reviews of workers from customers. This social feedback is critical for the success of workers, as it may influence the hiring decisions of future customers. However, as we show in this study, social feedback from customers can be gender biased against female workers, as well as racially biased against workers of color. We also observe that female and minority workers appear lower in search results, possibly due to the effects of biased social feedback.

Read the Paper

Gender and Academic Performance

Although a machine learning algorithm may achieve high accuracy overall, its performance may be significantly worse for specific subpopulations of individuals, especially if they are under-represented in the training data. In this study, we examine techniques for fairly and accurately predicting academic performance in gender-imbalanced environments (i.e., STEM courses).
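The gap between overall and per-group performance described above can be seen in a small sketch that breaks accuracy down by subpopulation. The labels, predictions, and group names below are made-up toy data, and `accuracy_by_group` is a hypothetical helper written for this illustration, not code from the study.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group label to the accuracy within that group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Toy data: eight examples from a well-represented group, two from an
# under-represented one. The predictions are wrong only for the latter.
y_true = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1, 0, 1]
groups = ["majority"] * 8 + ["minority"] * 2

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_group(y_true, y_pred, groups)

print(f"overall accuracy: {overall:.2f}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")
```

In this toy case the overall accuracy looks respectable even though every prediction for the smaller group is wrong, which is why reporting a single aggregate number can hide exactly the kind of disparity the study is concerned with.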

Read the Paper

E-commerce and Marketplaces

Price Discrimination

On the web, it is possible for e-commerce sites to personalize the prices of products for each person, a phenomenon known as price discrimination. In this project, we measure personalization on major e-commerce and travel sites to identify cases of price discrimination, as well as a related technique called price steering.

View Details

Surge Pricing on Uber

Ridesharing services have become extremely popular, but they also use a controversial surge pricing algorithm to dynamically adjust prices. In this study, we examine Uber's surge pricing algorithm to understand how it works, how customers can avoid it, and whether it actually incentivizes drivers to change their behavior.

View Details

Algorithmic Pricing on Amazon

Amazon Marketplace is a competitive environment that pits third-party sellers against Amazon itself. In this study, we examine Amazon's Buy Box algorithm and investigate the dynamic pricing algorithms that sellers adopt to adjust their prices in real time.

View Details

Online Advertising and Tracking

Tracing Information Flows

It is common knowledge that trackers and advertisers collect information about people as they browse the web. However, most people do not realize that these companies collaborate with each other to increase the amount of data they can collect. In this study, we develop a methodology that can causally infer the data sharing relationships between online ad companies. This gives us an unprecedented ability to understand the roles that different companies play in the tracking ecosystem, and to reason about the implications for users' privacy.

View Details

“Recommended for You”

Many news websites and blogs embed widgets from Content Recommendation Networks (CRNs) like Outbrain and Taboola. These are the boxes of links with headlines like “Around the Web”. In the past, CRNs have been criticized and fined for spreading spammy links and for not disclosing that many of their recommended links are actually paid advertisements. In this study, we survey the CRNs to determine whether they are properly disclosing advertisements, and what kinds of ads they are promoting.

View Details

Interested in getting involved in exciting research at Northeastern? Please visit Volunteer Science to find out how you can contribute!