Looking for our paper, our code, or the data we collected? Check out the researcher's page.
Today, many e-commerce websites personalize their content, including Netflix (movie recommendations), Amazon (product suggestions), and Yelp (business reviews). In many cases, personalization benefits users: for example, when a user searches for an ambiguous query such as "router", Amazon may be able to suggest the woodworking tool instead of a networking device. However, personalization on e-commerce sites may also work to the user's disadvantage by manipulating the set of products shown (price steering) or by customizing the prices of products (price discrimination). Unfortunately, today we lack the tools and techniques necessary to detect such behavior.
In this study, we examine the prevalence and impact of price discrimination and price steering on 16 major e-commerce sites. This includes 10 general retailers (Best Buy, Home Depot, Sears, Walmart, etc.) as well as 6 travel sites (Cheaptickets, Orbitz, Expedia, Hotels.com, Priceline, and Travelocity). We leverage the same basic measurement techniques pioneered in our study of personalization on Web search to examine how often these e-commerce sites personalize content for real users (experiment 1), as well as the specific user attributes that trigger personalization (experiment 2).
There are several key results from our study. First, based on our measurements from real people, we find that several e-commerce sites implement price discrimination and steering. Closer examination reveals that a small fraction of users receive personalized results across many sites, indicating that these users are being specifically targeted. Second, we identify the specific personalization strategies employed by seven websites, which we describe in detail below.
Below, we discuss our experimental methodology and specific results in more detail.
The goal of our first experiment was to measure how often real users receive personalized results (i.e., price steering or price discrimination) from e-commerce sites. We accomplished this by running an experiment on Amazon Mechanical Turk (AMT). During the experiment, real people recruited from AMT visited a webpage we built, which executed search queries selected by us on the 16 target e-commerce sites. All web traffic was forwarded from the users' browsers through a proxy server controlled by us. For each query that passed through our proxy, we sent two additional requests from "empty" accounts with no history or cookies, which served as experimental controls. Finally, the proxy saved all three sets of HTML results returned from the e-commerce sites.
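The control-duplication step in the proxy can be sketched roughly as follows (the function and parameter names here are hypothetical, used only to illustrate the design; the actual proxy is described in the paper):

```python
# Sketch of the proxy's control-duplication logic (hypothetical names).
def duplicate_with_controls(fetch, url, user_cookies):
    """Issue the real user's request plus two cookie-less control requests.

    `fetch` is any callable (url, cookies) -> HTML string. The two controls
    carry no cookies or account history, so any differences between them
    estimate noise in the underlying data rather than personalization.
    """
    user_html = fetch(url, user_cookies)  # real user's request
    control_a = fetch(url, {})            # control 1: empty account
    control_b = fetch(url, {})            # control 2: empty account
    return {"user": user_html, "control_a": control_a, "control_b": control_b}
```

All three responses are then stored for later comparison, with the two controls serving as the noise baseline.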
The goal of our second experiment was to determine which user attributes cause personalization. We achieved this by creating fake user accounts with features under our control. We measured the impact of specific features (cookies, operating system, browser, purchase history) by creating fake accounts that were identical except for the one feature under test. We used the PhantomJS web browser to automate searches from our fake accounts and save the results returned by the e-commerce sites.
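The one-feature-at-a-time design can be illustrated with a short sketch (the baseline attributes and helper below are hypothetical illustrations, not our actual crawler code):

```python
# Hypothetical baseline: every synthetic account matches this profile
# except for the single attribute under test.
BASELINE = {"os": "Windows 7", "browser": "Chrome", "cookies": "fresh",
            "purchase_history": "none"}

def accounts_for_feature(feature, values):
    """Return one account config per tested value of `feature`;
    all other attributes stay fixed at the baseline."""
    configs = []
    for v in values:
        cfg = dict(BASELINE)
        cfg[feature] = v
        configs.append(cfg)
    return configs
```

Because every account pair differs in exactly one attribute, any systematic difference in the results they receive can be attributed to that attribute.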
After analyzing the data we collected from real-world users, we noticed that most sites exhibited some form of personalization. The graph below shows, for each site we experimented on, the percentage of products with a price difference when comparing the two control accounts, and when comparing the controls to real users. The section on the left covers general retail sites, the middle covers hotel searches, and the right covers rental cars. The two control accounts are identical, so we expect them to receive the same prices; if they do not, it means there is noise in the underlying data (for example, a hotel room may have been booked by someone else, causing the price to go up). The graph shows that real users receive altered prices at much higher frequencies than the controls, which indicates price discrimination against the real users.
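The comparison behind this graph amounts to a simple overlap check between two sets of observed prices. A minimal sketch (illustrative only, not the paper's analysis code):

```python
def percent_differing(prices_a, prices_b):
    """Percentage of products, among those seen by both sides, whose
    prices disagree. `prices_a`/`prices_b` map product id -> price."""
    shared = set(prices_a) & set(prices_b)
    if not shared:
        return 0.0
    diffs = sum(1 for p in shared if prices_a[p] != prices_b[p])
    return 100.0 * diffs / len(shared)
```

Running this on control-vs-control results estimates the noise floor; a user-vs-control percentage well above that floor suggests personalization rather than noise.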
The images below show an example of price discrimination we observed in our experiments. The hotel on the left shows the price that the real user received, while the one on the right shows what our control and comparison accounts received. Notice that the real user is being shown a higher price in this case.
We noticed another difference on Cheaptickets, where users who were logged in to an account were given lower, "Members Only" prices that were not shown to non-members. The image below on the left shows the hotel presented to the member; on the right is the same hotel with a higher price for the non-member.
There is an interesting pattern that we noticed for many of the real users. Not every user in our tests received personalized results, but the ones that did were more likely to see personalization across many e-commerce sites. The graph below shows this pattern. The x-axis of each graph represents individual real users, and the dots mark instances where the corresponding user received personalized results from the site on the y-axis. The vertical bars show instances where a specific real user received personalized results across more than one site.
The first sites that we examine are Cheaptickets and Orbitz. These sites are owned by the same company, and appear to be implemented using the same HTML structure and server-side logic. As highlighted above, we observe both sites personalizing hotel results based on whether the user is logged in to an account.
Below, we show a pair of graphs illustrating price discrimination on Cheaptickets. The graph on the left shows the percentage of hotel rooms with different prices; logged-in users receive different prices on about 5% of hotels. The graph on the right shows the actual dollar differences in price: hotels with inconsistent prices are $12 cheaper on average for members.
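The dollar-difference measurement amounts to averaging the price gap over hotels whose prices disagree. A minimal sketch of that calculation (the helper name and sample numbers below are hypothetical):

```python
def mean_gap(member, nonmember):
    """Average (non-member price - member price) over hotels whose prices
    differ; a positive result means members pay less on average.
    Both arguments map hotel id -> price."""
    gaps = [nonmember[h] - member[h]
            for h in set(member) & set(nonmember)
            if member[h] != nonmember[h]]
    return sum(gaps) / len(gaps) if gaps else 0.0
```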
Hotels.com and Expedia are also owned by a single company, and our analysis reveals that they both implement the same personalization strategy: randomized A/B tests on users. A/B testing is a common practice among large websites, and is used to test specific features of a site (for example: do people click a blue button more often than a red button?). In this case, Hotels.com and Expedia appear to be randomly dividing users among three "buckets" based on their cookie. The graph below shows that users in different buckets see different hotel rooms (figure a) in a different order (figure b). Finally, figure (c) shows that users in two of the buckets are shown higher-priced hotels towards the top of the page, which is an example of price steering.
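Deterministic cookie-based bucketing of this kind is commonly implemented by hashing the cookie value; the sketch below is purely illustrative, since we do not know the sites' actual assignment function:

```python
import hashlib

def bucket_for_cookie(cookie_id, n_buckets=3):
    """Deterministically map a cookie value to one of n_buckets.

    Hashing makes the assignment stable (the same cookie always lands in
    the same bucket) while spreading cookies roughly evenly across buckets.
    """
    digest = hashlib.sha256(cookie_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets
```

Stability per cookie is what makes the behavior observable from outside: a crawler that replays the same cookie keeps seeing the same treatment, while fresh cookies land in random buckets.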
When analyzing Priceline, we discovered that they alter hotel search results based on the user's history of clicks and purchases. Users who clicked on or reserved low-priced hotel rooms received slightly different results in a substantially different order, compared to users who clicked on nothing, or clicked on or reserved expensive hotel rooms. We manually examined these search results but could not locate any clear reasons for this reordering. Although it is clear that account history impacts search results on Priceline, we cannot classify the changes as steering or price discrimination.
The image below shows how Priceline alters hotel search results based on a user's click and purchase history. Notice that the users who clicked on or reserved low-price hotel rooms diverge from the results shown to all the other experimental treatments.
For Travelocity, we discovered that they alter hotel search results for users who browse from iOS devices. The graphs below show that users browsing with Safari on iOS receive slightly different hotels, and in a much different order, than users browsing from Chrome on Android, Safari on OS X, or other desktop browsers. The takeaway from the graphs below is that we observe evidence consistent with price discrimination in favor of iOS users on Travelocity. Unlike Cheaptickets and Orbitz, which clearly mark "Members Only" sale prices, there is no visual cue in Travelocity's results indicating that prices have been changed for iOS users.
Among the 10 retailers we ran experiments on, we only discovered evidence of personalization on Home Depot. Similar to our findings on Travelocity, Home Depot personalizes results for users with mobile browsers. In fact, the Home Depot website serves HTML with different structure and CSS to desktop browsers, Safari on iOS, and Chrome on Android.
The graphs below depict the results of our browser experiment on Home Depot. Strangely, Home Depot serves 24 search results per page to desktop and Android browsers, but 48 products per page to iOS users. We discovered that the pool of results served to mobile browsers contains more expensive products overall, leading to higher nDCG scores (figure c) for mobile browsers. Thus, Home Depot is effectively steering users on mobile devices towards more expensive products.
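The nDCG metric here treats each product's price as its "relevance" score, so a page that places expensive products near the top scores closer to 1. A minimal sketch of that metric (illustrative only, not our analysis code):

```python
from math import log2

def ndcg(prices_in_page_order):
    """Price-weighted nDCG: discounted cumulative gain of prices in page
    order, normalized by the ideal (most-expensive-first) ordering.
    Returns a value in (0, 1]; higher means pricier products rank higher."""
    dcg = sum(p / log2(i + 2) for i, p in enumerate(prices_in_page_order))
    ideal = sum(p / log2(i + 2)
                for i, p in enumerate(sorted(prices_in_page_order, reverse=True)))
    return dcg / ideal if ideal else 0.0
```

With this formulation, serving mobile users a result pool skewed toward expensive products (or ranking those products higher) directly raises their nDCG score, which is the signature of steering.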
In addition to steering, Home Depot also discriminates against Android users. We discovered that Android users consistently see differences on about 6% of prices (1 or 2 of the 24 products per page). However, the practical impact of this discrimination is low: the average price difference is about $0.41.
Prior to the public release of our paper, we contacted all of the companies involved so that they were aware of our research and findings. The following is an example of the email we sent (in this case, to Orbitz):
Hello,

My name is Christo Wilson, and I am a professor at Northeastern University. Recently, my colleagues and I have been researching the algorithms used by major websites -- including Cheaptickets and Orbitz -- to personalize content for their users. In particular, we have just concluded a study examining the personalization algorithms used by e-commerce sites that determine what products and prices users receive when they search on these sites.

Our experiments demonstrate that Cheaptickets and Orbitz personalize search results for users. Specifically, the prices of some products are consistently altered for a subset of users on the site. This is commonly referred to as price discrimination. I have attached a pre-publication version of our paper.

Our paper has been accepted at the Internet Measurement Conference 2014 (http://conferences2.sigcomm.org/imc/2014/) and will be publicly released in early September. I am reaching out to you for any comments or concerns you may have about the draft of our paper. There is still a chance to edit the manuscript before it is publicly released, and we would appreciate any feedback you have on our results.

Thank you for your time.

Christo Wilson
Assistant Professor
College of Computer Science
Northeastern University
http://www.ccs.neu.edu/home/cbw/
Below are the responses we received from companies that we were allowed to make public. Note that companies were sent a pre-publication draft of our paper, and the final version of the paper addresses the concerns raised by the responses we received.
Chris Chiames, Vice President, Corporate Affairs at Orbitz was kind enough to read our paper and sent us the following thorough response:
Orbitz also sent the following supporting documents along with their response:
We spoke on the phone with John Kim, Chief Product Officer and Dan Friedman, Senior Director of Stats Optimization at Expedia. They confirmed our findings that Expedia and Hotels.com perform extensive A/B testing on users. However, Mr. Friedman stated that Expedia does not implement price discrimination on rental cars, which does not agree with the results of our study. After speaking with the representatives from Expedia, we went back and double checked our results from Expedia, and confirmed that real users see altered prices on rental cars at a significantly higher frequency than our experimental controls. Anyone wishing to confirm our results is welcome to download our data and analysis scripts from the research page.
On October 28, 2014, we spoke to Keith Nowak and Blake Clark from Travelocity. We determined that there was a problem with our testing methodology on Travelocity: although our crawlers only detected personalization for iOS, in actuality Travelocity offers "Mobile Exclusive" deals for users on all mobile devices. Thus, unfortunately, the results in our paper are only partially correct. Mr. Nowak and Mr. Clark graciously agreed to keep an open line of communication with our research team, so that we can verify any future experimental results directly with their team.
To summarize our findings: using data from real users, we find evidence for price steering and price discrimination on four general retailers and six travel sites. Overall, travel sites show price inconsistencies in a higher percentage of cases, relative to the controls, with prices increasing for AMT users by hundreds of dollars. Furthermore, we observe that many real users experience personalization across multiple sites.
Using data gathered from fake accounts, we are able to isolate specific user attributes that trigger personalization on seven e-commerce sites. This includes logging-in to an account on Cheaptickets and Orbitz, using a mobile device with Travelocity and Home Depot, purchase history on Priceline, and A/B testing on Expedia and Hotels.com.
Overall, it is difficult for us to give concrete advice about how users can obtain the lowest prices online. Each e-commerce site that we examined seems to implement different personalization techniques, and it is almost certain that these algorithms will change over time. The simple, although somewhat unsatisfying, answer is that users should try multiple strategies when shopping to see if they can get lower prices. Example strategies include: logging-in to an account on the site, visiting the site from an incognito browser window, and visiting the site from a mobile device.
Do you have any questions, comments, or concerns? Feel free to send an email to Professor Wilson at .