
Here's what I did

The top 10 rules involve total = high and predict bread-and-cake, supported by 723 transactions.

  • They all have a consequent of “bread and cake”
  • They all indicate a high total transaction amount
  • “biscuits” and “frozen foods” appear in many of them.

You have to be careful when interpreting association rules. They are merely associations, not necessarily causal relations. If we are interested in total, for example, should we try to convince people who already buy biscuits, frozen foods and fruit to buy bread and cake as well, on the grounds that a rule associates this combination with a high total transaction amount? That would be flawed reasoning: the product combination does not cause a high total. Those 723 transactions probably include a vast assortment of other items in addition to those mentioned in the rule.

However, an interesting exercise might be to model the path through the store required to collect associated items and see whether changes to that path (shorter, longer, displayed offers, etc.) have an effect on transaction amount or basket size.

I continued as follows.

  1. Remove the total attribute. The top 10 rules still all predict bread-and-cake, so remove bread-and-cake
  2. No rules found, so reduce lowerBoundMinSupport to 0.05
  3. The top 10 rules now all predict vegetables
  4. The most interesting rule is

     beef=t fruit=t potatoes=t 287 ==> vegetables=t 273 conf:(0.95)
    
  5. Remove vegetables. Now everything predicts frozen foods, so remove it
  6. Only one rule resulted, so I reduced lowerBoundMinSupport further, to 0.025
  7. Now the top 10 rules predict biscuits, so remove it
  8. Now the top 10 rules predict baking needs, with one exception:

     laundry needs=t wrapping=t dental needs=t prepared meals=t 132 ==> tissues-paper prd=t 125 conf:(0.95)
    
  9. Remove baking needs. Now the top 10 rules predict tissues-paper prd, so remove it
  10. The top rules now predict sauces-gravy-pkle, margarine, fruit
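
The confidence figures quoted in the steps above can be checked by hand: a rule's confidence is the number of transactions that match the whole rule divided by the number that match its left-hand side. The short Python sketch below does that arithmetic for the two rules quoted, and also converts a lowerBoundMinSupport fraction into a transaction count. The function names are made up for illustration, and the figure of 4627 transactions is an assumption about the dataset size that is not stated in the text; substitute the real count for your data.

```python
import math


def confidence(premise_count, rule_count):
    """Fraction of premise-matching transactions that also match the consequent."""
    return rule_count / premise_count


def min_support_count(fraction, n_transactions):
    """Smallest whole number of transactions that reaches a support fraction."""
    return math.ceil(fraction * n_transactions)


# Counts copied from the two rules quoted in the steps above.
vegetables_conf = confidence(287, 273)
tissues_conf = confidence(132, 125)
print(f"vegetables rule:        {vegetables_conf:.2f}")
print(f"tissues-paper prd rule: {tissues_conf:.2f}")

# ASSUMPTION: 4627 transactions. This number does not appear in the text.
N_TRANSACTIONS = 4627
for fraction in (0.05, 0.025):
    count = min_support_count(fraction, N_TRANSACTIONS)
    print(f"minimum support {fraction} is at least {count} transactions")
```

Seen this way, each reduction of lowerBoundMinSupport simply admits rules that are backed by fewer transactions, which is why new rules appear after each reduction.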

Here are the resulting rules:

1. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t
      fruit=t 126 ==> sauces-gravy-pkle=t 116 conf:(0.92)
2. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t 
      margarine=t 134 ==> sauces-gravy-pkle=t 123 conf:(0.92)
3. juice-sat-cord-ms=t canned vegetables=t breakfast food=t 
      sauces-gravy-pkle=t jams-spreads=t cheese=t 131 ==> margarine=t 120 
      conf:(0.92)
4. juice-sat-cord-ms=t canned fruit=t canned vegetables=t milk-cream=t 
      department137=t 141 ==> fruit=t 129 conf:(0.91)
5. canned fruit=t canned vegetables=t sauces-gravy-pkle=t jams-spreads=t 
      cheese=t 129 ==> margarine=t 118 conf:(0.91)
6. canned vegetables=t confectionary=t party snack foods=t wrapping=t 
      cheese=t 128 ==> sauces-gravy-pkle=t 117 conf:(0.91)
7. juice-sat-cord-ms=t canned vegetables=t sauces-gravy-pkle=t 
      jams-spreads=t party snack foods=t cheese=t 148 ==> margarine=t 135 
      conf:(0.91)
8. juice-sat-cord-ms=t canned vegetables=t puddings-deserts=t 
      party snack foods=t cheese=t 133 ==> sauces-gravy-pkle=t 121 
      conf:(0.91)
9. canned vegetables=t breakfast food=t sauces-gravy-pkle=t jams-spreads=t 
      party snack foods=t cheese=t 133 ==> margarine=t 121 conf:(0.91)
10. juice-sat-cord-ms=t canned fruit=t canned vegetables=t jams-spreads=t 
      cheese=t 128 ==> margarine=t 116 conf:(0.91)
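
Reading ten rules by eye gets tedious, so here is a rough Python sketch that parses rule lines in the format shown above, recomputes each confidence from the two counts, and tallies which items appear as consequents. The regular expression and the names `RULE_PATTERN` and `parse_rule` are my own assumptions based only on the lines printed here; they are not part of Weka and may need adjusting for other output.

```python
import re
from collections import Counter

# The ten rules listed above, with the wrapped lines rejoined one per line.
RULES_TEXT = """\
1. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t fruit=t 126 ==> sauces-gravy-pkle=t 116 conf:(0.92)
2. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t margarine=t 134 ==> sauces-gravy-pkle=t 123 conf:(0.92)
3. juice-sat-cord-ms=t canned vegetables=t breakfast food=t sauces-gravy-pkle=t jams-spreads=t cheese=t 131 ==> margarine=t 120 conf:(0.92)
4. juice-sat-cord-ms=t canned fruit=t canned vegetables=t milk-cream=t department137=t 141 ==> fruit=t 129 conf:(0.91)
5. canned fruit=t canned vegetables=t sauces-gravy-pkle=t jams-spreads=t cheese=t 129 ==> margarine=t 118 conf:(0.91)
6. canned vegetables=t confectionary=t party snack foods=t wrapping=t cheese=t 128 ==> sauces-gravy-pkle=t 117 conf:(0.91)
7. juice-sat-cord-ms=t canned vegetables=t sauces-gravy-pkle=t jams-spreads=t party snack foods=t cheese=t 148 ==> margarine=t 135 conf:(0.91)
8. juice-sat-cord-ms=t canned vegetables=t puddings-deserts=t party snack foods=t cheese=t 133 ==> sauces-gravy-pkle=t 121 conf:(0.91)
9. canned vegetables=t breakfast food=t sauces-gravy-pkle=t jams-spreads=t party snack foods=t cheese=t 133 ==> margarine=t 121 conf:(0.91)
10. juice-sat-cord-ms=t canned fruit=t canned vegetables=t jams-spreads=t cheese=t 128 ==> margarine=t 116 conf:(0.91)
"""

# Assumed layout: "<n>. <premise items> <count> ==> <consequent> <count> conf:(<value>)"
RULE_PATTERN = re.compile(
    r"^\s*(?:\d+\.\s*)?"                          # optional rule number
    r"(?P<premise>.+?)\s+(?P<premise_count>\d+)"  # left-hand side and its count
    r"\s+==>\s+"
    r"(?P<consequent>.+?)\s+(?P<rule_count>\d+)"  # right-hand side and rule count
    r"\s+conf:\((?P<conf>[\d.]+)\)\s*$"
)


def parse_rule(line):
    """Split one rule line into premise items, consequent, counts and confidence."""
    match = RULE_PATTERN.match(line)
    if match is None:
        raise ValueError(f"not a rule line: {line!r}")
    # Item names may contain spaces, so split on the "=t" suffix, not on blanks.
    premise = [item.strip() for item in match["premise"].split("=t") if item.strip()]
    return {
        "premise": premise,
        "consequent": match["consequent"].rsplit("=", 1)[0],
        "premise_count": int(match["premise_count"]),
        "rule_count": int(match["rule_count"]),
        "conf": float(match["conf"]),
    }


rules = [parse_rule(line) for line in RULES_TEXT.splitlines() if line.strip()]

# A reported confidence should equal rule_count / premise_count to two decimals.
mismatches = [
    rule for rule in rules
    if abs(rule["rule_count"] / rule["premise_count"] - rule["conf"]) > 0.005
]
print("rules whose confidence does not match their counts:", len(mismatches))

# Which items do the rules predict, and how often?
consequent_counts = Counter(rule["consequent"] for rule in rules)
for item, count in consequent_counts.most_common():
    print(f"{item}: {count}")
```

A tally like `consequent_counts` is the same check I was doing by eye at each step: when one item dominates the consequents, that is the attribute to remove before mining again.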

At this point I became bored and gave up. I’m not very interested in supermarkets. And here in New Zealand I would be very unlikely to buy canned vegetables, so none of these rules would apply to me.

The following Discussion step invites you to share what you did with this data, along with your thoughts on market basket analysis in general.


This article is from the free online course:

More Data Mining with Weka

The University of Waikato
