Recommender System: How Amazon gets money from regular and casual surfers

Dr. Mustansar Ali Ghazanfar (AKA Dr Musi)
MSc and PhD University of Southampton, UK
Assistant Professor, UET Pakistan
Research Associate
Email:
Amazon.com, Inc. (NASDAQ: AMZN), a Fortune 500 company based in Seattle, opened on the World Wide Web in July 1995 and today offers Earth’s biggest selection. Amazon.com, Inc. is perhaps Earth’s most customer-centric company, where customers can find and discover anything they might want to buy online (Books; Movies, Music & Games; Digital Downloads; Electronics & Computers; Home & Garden; Toys, Kids & Baby; Grocery; Apparel, Shoes & Jewelry; Health & Beauty; Sports & Outdoors; and Tools, Auto & Industrial, to name a few), and endeavors to offer its customers the lowest possible prices. Amazon’s operating cash flow increased 31% to $5.47 billion for the trailing twelve months in 2013, compared with $4.18 billion for the trailing twelve months ended December 31, 2012. In this article, we will go through the algorithm that Amazon’s product recommender engine uses to attract more customers.
A Recommender System (RS) consists of two basic entities: users and items, where users provide their opinions (ratings) about items. We denote these users by U = {u_1, u_2,…, u_M}, where the number of users using the system is |U| = M, and denote the set of items being recommended by I = {i_1, i_2,…, i_N}, with |I| = N. We can represent each element of user space U and item space I with a profile. We usually represent a user’s profile by defining their characteristics like age, gender, geographical location, etc.; however, in simple cases we represent it by a unique user Identifier (ID). Similarly, we represent each item by defining some characteristics; for example, in a book recommender system, each book can be represented by author, topic, year of release, etc.
Recommender systems store the history of the user’s interactions with the system; for example, the user’s purchase history, the types of items they purchase together, their ratings, etc. Most recommender systems require users to rate some items before they can recommend unknown items; for example, in the Netflix movie recommender system, a new user has to rate some movies at registration in order to get proper recommendations from the system. As a result, users will have rated some, but not all, of the items.
Typically, the ratings are defined on a subset of I x U and not on the whole space. The task of the recommender system then becomes to extrapolate the known ratings, through a utility function, to the whole space I x U in order to make recommendations. There are different ways to extrapolate the utility function over the whole I x U space: we can use data mining and machine learning algorithms, approximation theory, and various heuristics for prediction.
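The idea of ratings living on only part of the user-item space can be sketched in a few lines of Python. This is only an illustration: the user and item IDs, the rating values, and the helper names `unrated_pairs` and `predict_item_mean` are all hypothetical, and the item-mean predictor is just the simplest possible heuristic, not the method any particular system uses.

```python
# Hypothetical toy data: ratings exist only for some (user, item) pairs.
users = ["u1", "u2", "u3"]
items = ["i1", "i2", "i3", "i4"]

# Observed ratings: a partial mapping from (user, item) to a rating.
ratings = {
    ("u1", "i1"): 5, ("u1", "i2"): 3,
    ("u2", "i1"): 4, ("u2", "i3"): 2,
    ("u3", "i2"): 1, ("u3", "i4"): 5,
}


def unrated_pairs(users, items, ratings):
    """Return the (user, item) pairs whose rating still has to be predicted."""
    return [(u, i) for u in users for i in items if (u, i) not in ratings]


def predict_item_mean(item, ratings, default=3.0):
    """Naive heuristic: predict the mean of the item's observed ratings.

    Falls back to an assumed default when nobody has rated the item.
    """
    observed = [r for (_, i), r in ratings.items() if i == item]
    return sum(observed) / len(observed) if observed else default


missing = unrated_pairs(users, items, ratings)
# Fraction of the user-item space that has no rating yet.
sparsity = len(missing) / (len(users) * len(items))
```

Any real predictor replaces `predict_item_mean` with something smarter; the point is only that the recommender has to fill in the pairs returned by `unrated_pairs`.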
Defining Users’ and Items’ Profiles:
The main building elements of recommender systems, i.e. users and items, need to be modelled in such a way that recommendation algorithms can exploit them. Recommender systems usually get initial information about users when they first register with the system. The simplest way is to create an empty user profile, which is updated as the system gathers the user’s feedback. This method, however, cannot recommend any items until it has gathered some information about the user’s preferences. An alternative approach is for the user to create a profile manually. The user might need to give their interests (e.g. the types of domain they are interested in), demographic information (e.g. age, gender, etc.), and geographical information (e.g. country). Another approach, used by the MovieLens movie recommender system and the iLike and LastFm music recommender systems, requires the user to provide ratings on a predefined set of items. For example, when a new user registers with the iLike website, the system presents them with a list of artists they need to rate before getting recommendations.
After getting the initial information, the system maintains the user’s profile as they provide feedback. The feedback can be explicit or implicit. Explicit feedback, where the user provides their opinions about certain items, can be positive or negative and usually comes in the form of ratings. Rating scales can be continuous, although most recommender systems use discrete scales. Explicit feedback can also be gathered by allowing users to write comments and opinions about certain items. With implicit feedback, the user’s interaction with the item is observed; for example, web usage mining (e.g. time spent on a web page), analysing listening/watching habits in a media player (e.g. on YouTube the system might store how a user plays, re-plays, skips, and stops videos), and observing the history of transactions on an e-commerce website (e.g. items purchased or returned by a user).
An item’s profile can be defined in different ways: (1) by getting features (or metadata) about the item, (2) by using the ratings provided by users on that item, (3) by using domain-specific ontologies (categories), and (4) by using demographic information about items.
The Most Famous Recommendation Algorithm: Collaborative Filtering:
The most famous and simplest type of algorithm used to make recommendations is Collaborative Filtering. Collaborative Filtering recommends items by taking into account the taste of users (in terms of their preferences for items), under the assumption that users will be interested in items that users similar to them have rated highly. Examples of these systems include Amazon and Ringo. Collaborative filtering recommender systems are based on the assumption that people who agreed in the past will agree in the future too. There are three main steps to make a prediction (whether the user will like it or not) for an item a user has not seen/purchased/rated before, as follows:
In the first step, users rate some items they have experienced previously.
In the second step, the profile of the active user (the user for whom the recommendations are computed) is matched with the profiles of the other users in the system. A set of similar users, also called the neighbors of the active user, is found.
In the last step, predictions are made for items that the active user has not rated, based on the ratings provided by their nearest neighbors. Finally, these items are presented to the active user in a suitable order.
Example Based on Figure 1: User Musi has not seen the movie “The Godfather” and he is in a dilemma: whether or not to rent this movie. Only two users, Hamza and Adam, have already seen this movie. He knows that Hamza has the same taste in movies as he has, as both of them liked “Troy” and disliked “Forrest Gump”. Furthermore, he knows that Adam has quite the opposite taste, as Adam liked the movies he disliked (i.e. “Forrest Gump”) and vice versa. Considering this, he asks Hamza’s opinion, discards (or acts opposite to) Adam’s opinion, and makes his decision accordingly. It must be noted that Fahime has exactly the same taste as Musi; however, her opinion cannot be taken into account, as she has not rated “The Godfather”.
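The example can be turned into a small Python sketch. Since Figure 1 is not reproduced here, the concrete ratings below (+1 for liked, -1 for disliked, including what Hamza and Adam thought of “The Godfather”) are assumptions, and so are the function names and the simple agreement-based similarity; real systems typically use measures such as Pearson correlation over numeric ratings.

```python
# Assumed like/dislike ratings (+1 = liked, -1 = disliked); Figure 1 is not
# available, so these values are illustrative only.
ratings = {
    "Musi":   {"Troy": 1, "Forrest Gump": -1},
    "Hamza":  {"Troy": 1, "Forrest Gump": -1, "The Godfather": 1},
    "Adam":   {"Troy": -1, "Forrest Gump": 1, "The Godfather": -1},
    "Fahime": {"Troy": 1, "Forrest Gump": -1},
}


def similarity(a, b, ratings):
    """Average agreement on co-rated movies: +1 same taste, -1 opposite."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    return sum(ratings[a][m] * ratings[b][m] for m in common) / len(common)


def predict(active, item, ratings):
    """Similarity-weighted vote of the users who have rated `item`."""
    num = 0.0
    den = 0.0
    for other, their in ratings.items():
        # Skip the active user and anyone who has not rated the item
        # (this is why Fahime's opinion cannot be used).
        if other == active or item not in their:
            continue
        w = similarity(active, other, ratings)
        num += w * their[item]  # a negative weight flips the neighbor's opinion
        den += abs(w)
    return num / den if den else 0.0


score = predict("Musi", "The Godfather", ratings)
```

Under these assumed ratings, Hamza gets weight +1 and Adam gets weight -1, so Adam’s dislike counts as evidence in favour of the movie, which mirrors “acting opposite to” his opinion.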
Which Recommendation Algorithm Amazon Uses:
The above-mentioned approach is called User-Based Collaborative Filtering (UBCF). Amazon uses a slightly different approach, Item-Based Collaborative Filtering (IBCF), which selects the most similar items rather than the most similar users. The steps in IBCF are as follows:
In the first step, all items rated by an active user are retrieved.
In the second step, the target item’s similarity is computed with the set of retrieved items. A set of most similar items is selected.
In the last step, the prediction for the target item is made by computing the weighted average of the active user’s ratings on the most similar items, with the item similarities as weights.
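The three steps above can be sketched as follows. This is a minimal illustration, not Amazon’s actual implementation: the ratings are made up, the names `item_similarity` and `predict_item_based` are hypothetical, and cosine similarity over co-rating users is an assumed choice, since the text does not say which similarity measure is used.

```python
import math

# Hypothetical 1-5 star ratings; "T" is the target item u1 has not rated.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2, "T": 5},
    "u3": {"A": 1, "B": 2, "C": 5, "T": 1},
    "u4": {"A": 5, "B": 5, "T": 4},
}


def item_similarity(i, j, ratings):
    """Cosine similarity of two items over the users who rated both."""
    common = [u for u, r in ratings.items() if i in r and j in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in common)
    norm_i = math.sqrt(sum(ratings[u][i] ** 2 for u in common))
    norm_j = math.sqrt(sum(ratings[u][j] ** 2 for u in common))
    return dot / (norm_i * norm_j)


def predict_item_based(user, target, ratings, k=2):
    """Predict the user's rating of `target` from its k most similar items."""
    rated = ratings[user]  # step 1: items rated by the active user
    # Step 2: similarity of the target to each rated item, keep the top k.
    neighbors = sorted(
        ((item_similarity(target, j, ratings), j) for j in rated if j != target),
        reverse=True,
    )[:k]
    # Step 3: similarity-weighted average of the user's own ratings.
    den = sum(s for s, _ in neighbors)
    if den == 0:
        return None
    return sum(s * rated[j] for s, j in neighbors) / den


prediction = predict_item_based("u1", "T", ratings)
```

Here the items most similar to “T” are “B” and “A”, which u1 rated 4 and 5, so the prediction falls between those two values. Note that `item_similarity` depends only on item pairs, not on the active user, which is what makes it possible to compute these values ahead of time.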
The rationale behind using IBCF rather than UBCF is that the similarities between items can be calculated in an off-line stage, because the item set is relatively non-volatile: on most e-commerce websites, the item set changes far less often than the user set (new users keep registering with the passage of time).
In fact, Amazon uses a fairly simple approach to make huge money by giving personalized recommendations to users. Nonetheless, the truth is that it works very well, converting casual surfers into regular ones and regular ones into potential customers. Amazon also has its own Adware program, with a reach of more than 244 million customers worldwide.
In the next post I shall show what sort of algorithms the Google search engine, Google News, Netflix, and other web giants use to make huge money by giving users personalized recommendations. It is worth mentioning that about 90% of Google’s revenue comes from personalized advertisements, whether through AdWords or AdSense. Google’s total advertisement revenue (in millions) from Jan 2014 to June 2014 is $28,225 (90% of Google’s total revenue of $31,375), of which $21,404 (68% of total revenue) comes from Google websites and $6,821 (22% of total revenue) comes from Network Members’ websites.
