By EKbana on December 28, 2022
Recommender systems are one of the most successful and widespread applications of machine learning technologies in business. You can find large-scale recommender systems in retail, video-on-demand services, or music streaming services. To develop and maintain such systems, a company typically needs a group of experienced data scientists and engineers.
Building a recommendation system that predicts which products a grocery store customer will purchase next is quite complicated. The same customer can buy a different basket of products on each visit, so we cannot simply feed the data into a machine learning model and expect useful recommendations. In this respect, the system differs from typical recommendation systems.
We selected 4000 customers who usually visit every month. The more often a customer visits, the more baskets we have for that customer. The data used for the recommendation was the set of unique products in the basket for each transaction. There are more than 90 unique sections in the overall data. Association rules are applied in this system to identify the next section likely to appear in the basket. The main reason for choosing this approach over other machine learning algorithms is that it gives priority to the customer's own choices, as reflected in past transactions and baskets. If association rules fail to identify the next section from the given combination of two sections, 5 similar customers are selected and the next section is identified from their baskets and transactions. The K-Nearest Neighbour algorithm is used to select similar customers.
Association rule mining is a technique for identifying underlying relations between different items. Take the example of a supermarket where customers can buy a variety of items. Usually, there is a pattern in what customers buy. For instance, mothers with babies buy baby products such as milk and diapers. Young women may buy makeup, whereas bachelors may buy beer and chips. In short, transactions follow patterns. More profit can be generated if the relationships between the items purchased in different transactions can be identified.
If items A and B are frequently bought together, several steps can be taken to increase profit:
- A and B can be placed together so that a customer who buys one of the products does not have to go far to buy the other.
- People who buy one of the products can be targeted through an advertisement campaign to buy the other.
- Collective discounts can be offered on these products if the customer buys both of them.
- Both A and B can be packaged together.
The process of identifying an association between products is called association rule mining.
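To make the idea concrete, here is a minimal sketch of how pairwise rules could be scored. It uses the two standard measures: the support of a rule A -> B is the fraction of transactions that contain both A and B, and its confidence is the fraction of transactions containing A that also contain B. The function name `pair_rules` and the toy baskets are illustrative assumptions, not data or code from the project described here.

```python
from collections import Counter
from itertools import permutations


def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every rule a -> b.

    Hypothetical helper for illustration; only rules between single items
    are considered.
    """
    n = len(baskets)
    item_count = Counter()   # number of baskets containing each item
    pair_count = Counter()   # number of baskets containing the ordered pair
    for basket in baskets:
        items = set(basket)  # count each item once per basket
        item_count.update(items)
        pair_count.update(permutations(sorted(items), 2))
    return {
        (a, b): (count / n, count / item_count[a])
        for (a, b), count in pair_count.items()
    }


# Toy baskets, invented for this example.
baskets = [
    ["milk", "diapers"],
    ["milk", "diapers", "bread"],
    ["milk", "bread"],
    ["beer", "chips"],
]
rules = pair_rules(baskets)
support, confidence = rules[("diapers", "milk")]
print(f"diapers -> milk: support={support:.2f}, confidence={confidence:.2f}")
```

Note that confidence is directional: in this toy data every basket with diapers also has milk, but not every basket with milk has diapers.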
K-Nearest Neighbour Algorithm
In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:
- In k-NN classification, the output is a class membership. An object is classified by a plurality vote of its neighbours, with the object being assigned to the class most common among its k nearest neighbours (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbour.
- In k-NN regression, the output is the property value for the object. This value is the average of the values of k nearest neighbours.
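Both variants can be sketched in a few lines of plain Python. This is an illustrative sketch that assumes Euclidean distance and a brute-force search over all training examples; the function names and toy data are made up for the example.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k (point, target) pairs whose point is closest to `query`."""
    return sorted(train, key=lambda example: math.dist(example[0], query))[:k]


def knn_classify(train, query, k=3):
    """Plurality vote among the k nearest neighbours.

    Ties go to the label encountered first among the neighbours.
    """
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Average of the target values of the k nearest neighbours."""
    values = [value for _, value in k_nearest(train, query, k)]
    return sum(values) / len(values)


# Toy data, invented for this example.
labelled = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((5, 5), "b"), ((5, 6), "b")]
numeric = [((0,), 1.0), ((1,), 2.0), ((2,), 3.0), ((10,), 50.0)]

print(knn_classify(labelled, (0.2, 0.1), k=3))
print(knn_regress(numeric, (1,), k=3))
```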
The recommendation procedure works as follows:
- A combination of two sections is given for a specific customer.
- All of that customer's baskets are extracted.
- Using association rules, the two sections with the highest confidence are found.
- If such sections are found, products from those sections are recommended. Otherwise, 5 similar customers are selected, and 2 sections are suggested using the association rule confidence computed from their baskets.
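The steps above can be sketched as follows. This is one possible reading of the procedure, not the original implementation: it assumes each basket is a set of section names, treats the two given sections as the rule antecedent, and picks similar customers by Jaccard similarity over the sections each customer has ever bought from. All function names and the toy `history` data are hypothetical.

```python
from collections import Counter


def next_sections(baskets, given, top_n=2):
    """Rank sections by the confidence of the rule `given -> section`."""
    given = set(given)
    matching = [set(b) for b in baskets if given <= set(b)]
    if not matching:
        return []
    # Every candidate shares the denominator len(matching), so ranking by
    # raw count is the same as ranking by confidence.
    counts = Counter(s for b in matching for s in b - given)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [section for section, _ in ranked[:top_n]]


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_customers(history, customer, k=5):
    """Return up to k other customers with the most similar section profile."""
    profile = set().union(*history[customer])
    others = [c for c in history if c != customer]
    others.sort(key=lambda c: (-jaccard(profile, set().union(*history[c])), c))
    return others[:k]


def recommend(history, customer, given, k=5, top_n=2):
    """Use the customer's own baskets first, then fall back to neighbours."""
    own = next_sections(history[customer], given, top_n)
    if own:
        return own
    pooled = [b for c in similar_customers(history, customer, k) for b in history[c]]
    return next_sections(pooled, given, top_n)


# Toy purchase history, invented for this example.
history = {
    "c1": [{"dairy", "bakery", "produce"},
           {"dairy", "bakery", "produce", "snacks"},
           {"meat", "snacks"}],
    "c2": [{"dairy", "produce"}],
    "c3": [{"meat", "snacks", "drinks"}, {"meat", "snacks", "drinks"}, {"dairy"}],
}
print(recommend(history, "c1", ("dairy", "bakery")))  # own baskets are enough
print(recommend(history, "c2", ("meat", "snacks")))   # falls back to neighbours
```

Jaccard similarity is only one reasonable choice here; any distance over customer profiles would fit the same nearest-neighbour fallback.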
Among 3000 customers, almost 2000 scored more than 0, where the score means:
Score 0: No section matched
Score 1: 1 section matched
Score 3: 3 sections matched
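A score of this kind could be computed as in the sketch below, assuming the score counts how many recommended sections also appear in the customer's actual basket. That definition, the function names, and the sample values are assumptions made for illustration.

```python
def match_score(recommended, actual_basket):
    """Number of recommended sections found in the actual basket."""
    return len(set(recommended) & set(actual_basket))


def share_with_a_match(scores):
    """Fraction of customers whose score is greater than 0."""
    return sum(1 for s in scores if s > 0) / len(scores)


# Invented sample values, not results from the project.
print(match_score(["dairy", "produce"], ["produce", "meat"]))
print(share_with_a_match([0, 1, 2, 0]))
```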
This result is encouraging, as the recommender system guessed 70% of sections. It can be further optimized by analyzing the section recommendations in more depth.