DEV Community

MustafaLSailor


Association rule extraction

Association rule extraction (also called association rule mining) is an unsupervised learning method used in data mining, most commonly for market basket analysis. It aims to find relationships between items in large data sets, especially items that tend to occur together.

Association rules are written as "X implies Y" or "X => Y", where X and Y are disjoint sets of items. The rule states that transactions containing the items in X usually also contain the items in Y.

Three main metrics are used in association rule extraction:


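Support: How often the items in the rule occur together, that is, the fraction of all transactions that contain both X and Y.
Confidence: The probability that Y is purchased given that X is purchased, that is, the share of transactions containing X that also contain Y.
Lift: How much more often X and Y occur together than would be expected if they were independent. A lift greater than 1 indicates a positive association between X and Y, a lift close to 1 indicates no association, and a lift below 1 indicates that they occur together less often than expected.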
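
The three metrics can be written as formulas. The notation below is an assumption introduced here for illustration: T is the set of all transactions, and supp(X) is the fraction of transactions that contain every item in X.

```latex
\begin{align*}
\operatorname{supp}(X \Rightarrow Y) &= \frac{\lvert \{\, t \in T : X \cup Y \subseteq t \,\} \rvert}{\lvert T \rvert} \\[4pt]
\operatorname{conf}(X \Rightarrow Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)} \\[4pt]
\operatorname{lift}(X \Rightarrow Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)\,\operatorname{supp}(Y)}
  = \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
\end{align*}
```

If X and Y were statistically independent, supp(X ∪ Y) would equal supp(X) · supp(Y), which is why a lift of 1 is the baseline for "no association".
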

Algorithms such as Apriori, Eclat, and FP-Growth are widely used to extract association rules. They first find the frequent itemsets in a data set, meaning the itemsets whose support meets a chosen minimum threshold, and then derive association rules from those itemsets.
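
The two-step process (frequent itemsets first, rules second) can be sketched in plain Python. This is a minimal, unoptimized Apriori-style illustration, not a production implementation; the function names `frequent_itemsets` and `association_rules`, the thresholds, and the toy transactions are all assumptions made up for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search. Returns {frozenset: support}."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep those that meet the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for b in baskets if c <= b) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Split each frequent itemset into X => Y and score the rule."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(itemset, r):
                x = frozenset(x)
                y = itemset - x
                # Subsets of a frequent itemset are frequent, so both
                # lookups below are guaranteed to succeed.
                confidence = support / frequent[x]
                if confidence >= min_confidence:
                    lift = confidence / frequent[y]
                    rules.append((set(x), set(y), support, confidence, lift))
    return rules


# Hypothetical toy data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]
frequent = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
for x, y, s, c, l in rules:
    print(f"{sorted(x)} => {sorted(y)}: "
          f"support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

In this toy data the rules between bread and milk pass the confidence threshold but have a lift slightly below 1, because both items are already very common on their own. This shows why confidence alone can be misleading. Eclat and FP-Growth aim to produce the same frequent itemsets with different search strategies.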


Explanation with an example

Association rule extraction can be explained with a supermarket example.

Let's imagine that a supermarket has data on millions of products purchased by thousands of customers. Supermarket management may want to understand which products are often purchased together so they can optimize product placement or create effective cross-selling strategies.

Association rule extraction can surface this information. For example, it might reveal rules such as:

Bread => Milk: This rule shows that when customers buy bread, they usually also buy milk.
Toothbrush => Toothpaste: This rule shows that customers who buy a toothbrush usually also buy toothpaste.

The support, confidence, and lift values of these rules indicate how strong each rule is and how well it is likely to hold in practice.

For example, if the support value for the rule "Bread => Milk" is high, bread and milk are frequently purchased together. If the confidence value is high, a customer who buys bread usually also buys milk. If the lift value is greater than 1, bread and milk are purchased together more often than would be expected if the two purchases were independent of each other.
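
To make this interpretation concrete, here is a small sketch that computes the three values for "Bread => Milk" and "Toothbrush => Toothpaste". The baskets are invented for illustration, and the helper names `support` and `describe` are hypothetical, not part of any library.

```python
# Hypothetical baskets, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"toothbrush", "toothpaste"},
    {"toothbrush", "toothpaste", "bread"},
    {"butter"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for b in baskets if items <= b) / len(baskets)


def describe(x, y):
    """Return (support, confidence, lift) for the rule x => y."""
    s_xy = support(x | y)
    confidence = s_xy / support(x)   # P(Y | X)
    lift = confidence / support(y)   # P(Y | X) / P(Y)
    return s_xy, confidence, lift


for x, y in [({"bread"}, {"milk"}), ({"toothbrush"}, {"toothpaste"})]:
    s, c, l = describe(x, y)
    print(f"{sorted(x)} => {sorted(y)}: "
          f"support={s:.3f}, confidence={c:.3f}, lift={l:.3f}")
```

With these made-up baskets, "Bread => Milk" has support 0.375, confidence 0.6, and lift 1.2, so bread buyers purchase milk somewhat more often than shoppers in general. "Toothbrush => Toothpaste" has lower support (0.25) but confidence 1.0 and lift 4.0: the pair is rarer, yet the association is much stronger.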

This information can help supermarket management optimize product placement, create cross-selling strategies, and even improve inventory management.
