#artificialintelligenceinaction

Market Basket Analysis: a use case on AI and marketing campaigns

Market Basket Analysis

Market Basket Analysis is a strategic data mining technique used by retailers to enhance sales through an understanding of customer purchasing patterns. This method involves examining datasets, such as historical purchase records, to identify items that tend to be bought together.

By recognizing these co-occurrence patterns, retailers can make informed decisions to:

  • Optimize inventory management;
  • Develop effective marketing strategies;
  • Utilize cross-selling tactics;
  • Improve store layout to enhance customer engagement.

The steps to perform Market Basket Analysis can be summarized as follows:

  • Gather customer transaction data, including items purchased in each transaction, transaction time and date, and any other relevant information;
  • Clean and preprocess the data by removing irrelevant information, handling missing values, and converting data into a suitable format for analysis;
  • Utilize association rule mining algorithms like Apriori or FP-Growth to identify frequent itemsets, which are sets of items frequently purchased together in a transaction;
  • Calculate the support of each frequent itemset (how often its items appear together across all transactions) and the confidence of the rules derived from it (the probability that an item is purchased given that another item was purchased);
  • Generate association rules based on frequent itemsets and their corresponding support and confidence values. Association rules express the likelihood of an item being purchased based on the purchase of another item;
  • Interpret the analysis results, identifying items frequently purchased together, the strength of association between items, and other relevant aspects of customer behavior and preferences;
  • Use the insights derived from the analysis to make business decisions, such as product recommendations, optimizing store layouts, and targeted marketing campaigns.

There are three main types of analyses:

  1. Identifying sets of items frequently purchased together and generating association rules expressing the likelihood of one item being purchased with another. This is used to identify relationships or associations between items in a transactional dataset;
  2. Identifying sequences of frequent items while considering the order of item purchases in a transaction, based on sequential association rules describing the probability of one item sequence being followed by another;
  3. Clustering items or similar transactions into segments based on their attributes. This helps identify customer segments with similar buying behaviors, which can inform product recommendations and marketing strategies.

Application of Market Basket Analysis

As mentioned earlier, Market Basket Analysis can provide targeted insights in various domains, including:

  • Retail: Identifying frequently purchased product combinations and creating promotions or cross-selling strategies;
  • E-commerce: Recommending complementary products to customers to enhance their experience;
  • Restaurants: Identifying menu items often ordered together and creating meal packages or menu recommendations;
  • Healthcare: Understanding which medications are frequently prescribed together and identifying patient behavior patterns or treatment outcomes;
  • Banking/Finance: Identifying products or services often used together by customers and creating targeted marketing campaigns or bundled product offers;
  • Telecommunications: Understanding which products or services are frequently purchased together and creating service bundles to increase revenue and improve the customer experience.

Analysis with Association Rules

In data mining, association rules are one of the methods used to extract hidden relationships in data and are used for Market Basket Analysis. They were initially introduced for discovering patterns within transactions recorded in supermarket sales. For example, the rule:

{onions, potatoes} ⇒ {hamburger}

found in the analysis of a supermarket’s receipts indicates that if a customer buys onions and potatoes together, they are likely to also purchase hamburger meat.

This information can be used as a basis for marketing activities, such as promotional offers or product placement on shelves. Association rules are also used in many other areas, such as web mining, anomaly detection, and bioinformatics.

In a broader sense, association rules describe event correlations and can be viewed as probabilistic rules. Two events are correlated when they are frequently observed together.

The problem of discovering association rules can be expressed as follows:

  • Let I = {i1, i2, …, im} be a set of literals called objects (or items).
  • A transaction T is a set of objects such that T ⊆ I. A transaction database D is a set of transactions and is usually stored in a table format:
    • [Transaction ID, Item ID].
  • An itemset X is a set of objects such that X ⊆ I. It is said that a transaction T contains an itemset X if X ⊆ T.
  • The support of an itemset X (support(X)) is the fraction of transactions in D that contain X:
    • support(X) = (number of transactions containing X) / (total number of transactions).
  • An association rule is an implication of the form X ⇒ Y, where X and Y are itemsets and X ∩ Y = ∅ (the antecedent and the consequent share no items).
  • X ⇒ Y has support s in the database D if and only if a fraction equal to s of transactions in D contain X ∪ Y: s = support(X ⇒ Y) = support(X ∪ Y).
  • X ⇒ Y has confidence c in the database D if and only if, among all transactions containing X, a fraction c also contains Y:
    • c = confidence(X ⇒ Y) = support(X ∪ Y) / support(X).
  • Confidence and support can also be expressed in percentage form.
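The definitions above can be checked on a small example. The following is a minimal Python sketch, assuming transactions are held in memory as sets of item names; the toy baskets and the function names are illustrative and not taken from the project described later.

```python
# Hypothetical toy database: each transaction is a set of item names.
transactions = [
    {"onions", "potatoes", "hamburger"},
    {"onions", "potatoes", "hamburger", "beer"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "hamburger"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(X ∪ Y) / support(X) for the rule X ⇒ Y."""
    x = set(antecedent)
    y = set(consequent)
    support_x = support(x, transactions)
    if support_x == 0:
        raise ValueError("the antecedent never occurs in the database")
    return support(x | y, transactions) / support_x


# Rule {onions, potatoes} ⇒ {hamburger}
rule_support = support({"onions", "potatoes", "hamburger"}, transactions)
rule_confidence = confidence({"onions", "potatoes"}, {"hamburger"}, transactions)
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
# prints: support = 0.40, confidence = 0.67
```

In this toy database, two of the five baskets contain all three items (support 0.40), and two of the three baskets containing onions and potatoes also contain hamburger (confidence 0.67).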

Given a database D, the task of discovering association rules with at least minimum support (referred to as minsup) and minimum confidence (referred to as minconf), where minsup and minconf are user-specified values, can be decomposed into two sub-problems:

  • Find all itemsets with support of at least minsup. These are called large (or frequent) itemsets. This sub-problem can be solved with algorithms such as Apriori.
  • Generate all association rules with at least the minimum confidence from the set of large itemsets.
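The two sub-problems can be sketched in a few lines of Python. This is a simplified, unoptimized Apriori-style search intended only to illustrate the decomposition; the function names, the toy baskets, and the threshold values are assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Sub-problem 1: level-wise (Apriori-style) search. Returns {itemset: support}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def supp(candidate):
        return sum(1 for t in transactions if candidate <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        s = supp(single)
        if s >= minsup:
            current[single] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = supp(c)
            if s >= minsup:
                current[c] = s
    return frequent


def association_rules(frequent, minconf):
    """Sub-problem 2: split each frequent itemset into X ⇒ Y and keep confident rules."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                x = frozenset(antecedent)
                conf = s / frequent[x]  # every subset of a frequent itemset is frequent
                if conf >= minconf:
                    rules.append((x, itemset - x, s, conf))
    return rules


# Hypothetical toy baskets and thresholds.
baskets = [
    {"onions", "potatoes", "hamburger"},
    {"onions", "potatoes", "hamburger", "beer"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "hamburger"},
]
frequent = frequent_itemsets(baskets, minsup=0.4)
rules = association_rules(frequent, minconf=0.6)
for x, y, s, c in sorted(rules, key=lambda r: (-r[3], sorted(r[0]), sorted(r[1]))):
    print(f"{sorted(x)} => {sorted(y)}  support={s:.2f}  confidence={c:.2f}")
```

The pruning step relies on the fact that an itemset can only be frequent if all of its subsets are frequent, which is what keeps the candidate set manageable at each level.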

Use Case

Revelis is working on a project for a well-known and historic agro-food company based in Calabria. Their high-quality products are easily identifiable thanks to targeted marketing campaigns and carefully designed packaging, achieved through substantial investments that have created a well-regarded brand nationally and internationally.

In the specific case study, a system has been implemented to conduct analyses to identify products to recommend to customers in marketing initiatives. These techniques are based on Market Basket Analysis, specifically implemented through Association Rules.

By combining the results of these analyses with an assessment of customer profiles based on unsupervised algorithms, marketing managers can determine, for example, which products a customer has never purchased but that have frequently been bought by other customers who generally purchase similar products. This allows them to recommend to loyal customers products that differ from their usual purchases but may appeal to them, based on the purchase history of similar customers.

The implementation of these procedures has been carried out using simple and robust technologies that enable efficient execution whenever necessary. In particular, the Rialto™ platform was used for data analysis, and the open-source Grafana system for data visualization in the form of dashboards.

To create purchase recommendations based on association rules, a series of dedicated tasks, models, and workflows have been implemented.

The first step was to acquire input from the database containing transactions made on the company’s e-commerce website and transform it into a suitable format for analysis.

Subsequently, orders placed by unregistered guest users on the site were filtered out.

Next, the orders were grouped to obtain, for each user, the list of product identifiers purchased. It should be noted that each row lists only the product identifiers purchased by a single user, who is identified anonymously.
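The preparation steps described so far (dropping guest orders, then grouping purchases by user) can be illustrated with a short Python sketch. The row layout `(order_id, user_id, product_id)`, the convention that guest orders carry no user identifier, and all sample values are assumptions made for the example; they do not describe the actual database schema or the Rialto™ workflow.

```python
from collections import defaultdict

# Hypothetical order lines: (order_id, user_id, product_id).
# user_id is None for orders placed by unregistered guest users.
order_lines = [
    (1001, "u1", "P01"),
    (1001, "u1", "P07"),
    (1002, None, "P03"),
    (1003, "u2", "P01"),
    (1004, "u1", "P09"),
    (1004, "u1", "P01"),
]


def baskets_by_user(order_lines):
    """Drop guest orders, then collect the distinct products bought by each user."""
    baskets = defaultdict(set)
    for _order_id, user_id, product_id in order_lines:
        if user_id is None:  # guest checkout: filtered out
            continue
        baskets[user_id].add(product_id)
    # One row per user, listing only product identifiers.
    return {user: sorted(products) for user, products in baskets.items()}


user_baskets = baskets_by_user(order_lines)
for user, products in user_baskets.items():
    print(user, products)
```

Each resulting row plays the role of one transaction in the rule-mining step that follows.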

At this point, it was possible to create association rules using various algorithms available in Rialto™:

  • FP-Growth: This algorithm discovers frequent patterns in three steps:
    1. Calculating item frequencies and identifying the most frequent ones;
    2. Using the FP-Tree to encode transactions;
    3. Extracting frequent itemsets from the FP-Tree.
  • MNR (Minimal Non Redundant): It extracts the minimal set of non-redundant association rules. A closed itemset is an itemset that is not strictly included in any other itemset with the same support. A generator Y of a closed itemset X is an itemset that (1) has the same support as X and (2) has no proper subset with the same support. The MNR set is defined as the set of association rules of the form:

P1 ⇒ P2 \ P1

where P1 is a generator of P2, P2 is a closed itemset, and the rule has support and confidence no lower than the specified minsup and minconf thresholds.

  • TOP K Rules: It produces the top K association rules with the highest confidence greater than or equal to the specified minconf parameter.
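To make the first two FP-Growth steps listed above more concrete, here is a minimal Python sketch that counts item frequencies and encodes transactions in a prefix tree. It is a simplified illustration, not the Rialto™ implementation: the class and function names are hypothetical, the header table with node links is omitted, and the third step (mining itemsets from the tree) is not shown.

```python
from collections import Counter


class FPNode:
    """One node of the tree: an item, its count, and its children by item name."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Step 1: count item frequencies and keep only the frequent items.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= min_count}

    # Step 2: insert each transaction with its items sorted by descending
    # frequency (ties broken alphabetically), so shared prefixes share nodes.
    root = FPNode(None)
    for t in transactions:
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def dump(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for item in sorted(node.children):
        child = node.children[item]
        lines.append("  " * depth + f"{item}:{child.count}")
        lines.extend(dump(child, depth + 1))
    return lines


# Hypothetical toy baskets.
baskets = [
    {"onions", "potatoes", "hamburger"},
    {"onions", "potatoes", "hamburger", "beer"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "hamburger"},
]
root, frequent_items = build_fp_tree(baskets, min_count=2)
print("\n".join(dump(root)))
```

Because the most frequent items sit near the root, baskets that share popular products share a path, which is what makes the tree a compact encoding of the transaction database.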

The rules produced have the format defined earlier:

X ⇒ Y, where X and Y are itemsets, and X ∩ Y = ∅.

The itemset X contains the products that must be purchased together to trigger the recommendation for the product(s) in itemset Y.

These rules are then applied to data from the transaction database. Whenever a user is found who has purchased the preceding products (itemset X) but not the consequent ones (itemset Y), the latter are suggested by the system.
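The matching logic just described can be sketched as follows. This is an illustrative Python example under stated assumptions: rules are represented as `(antecedent, consequent)` pairs of product identifiers, the sample rules and users are invented, and when a user already owns part of the consequent the sketch suggests only the missing products, which is one possible reading of the behavior described above.

```python
def recommend(purchased, rules):
    """Suggest consequent products for every rule whose antecedent the user fully bought."""
    purchased = set(purchased)
    suggestions = set()
    for antecedent, consequent in rules:
        if set(antecedent) <= purchased:
            # Assumption: only products not yet purchased are suggested.
            suggestions |= set(consequent) - purchased
    return sorted(suggestions)


# Hypothetical rules X ⇒ Y and per-user purchase histories.
rules = [
    ({"P01", "P07"}, {"P09"}),
    ({"P01"}, {"P03"}),
]
user_baskets = {
    "u1": {"P01", "P07"},
    "u2": {"P01", "P03"},
    "u3": {"P05"},
}

recommendations = {user: recommend(items, rules) for user, items in user_baskets.items()}
for user, products in recommendations.items():
    print(user, products)
```

Here user u1 triggers both rules, u2 triggers the second rule but already owns its consequent, and u3 matches no antecedent, so only u1 receives suggestions.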

The implementation of the described analyses is based on the Rialto™ platform, which allows Revelis to provide solutions that:

  • Support the development and execution of AI-based applications.
  • Enable the analysis and monitoring of Big Data, prediction of phenomena, and explanation of decision-making models.

In the solution designed for this use case, data visualization is achieved by integrating a Grafana-based dashboard into the system. Grafana is an interactive open-source platform for data visualization that allows users to view data through unified diagrams and charts on a single dashboard (or multiple dashboards) to facilitate understanding and interpretation.

Together with the client, two main use cases have been identified:

  • Integration of market basket analysis with user profiling techniques to generate personalized offers for a specific profiling category.
  • Creation of targeted offers for customers based on their most recent order.

The system implements these features, allowing the automatic generation of personalized marketing emails for customers using the results of Market Basket Analysis.

Conclusion

By using the system implemented by Revelis, the client has evolved its existing marketing campaign planning procedures. The ultimate goal is to increase the variety of products sold by expanding the range of products purchased by individual customers, diversifying the offering, and increasing customer loyalty.

Author Luigi Granata

Watch the video interview with Luigi Granata, published on the Revelis LinkedIn page, about the use case described in this article.