
Here’s what I did

Ian Witten explains what he did with the supermarket data

The top 10 rules all involve total = high and predict bread-and-cake; the top rule is supported by 723 transactions.

  • They all have a consequent of “bread and cake”
  • They all indicate a high total transaction amount
  • “biscuits” and “frozen foods” appear in many of them.

You have to be careful about interpreting association rules. They are merely associations, not necessarily causal relations. If we are interested in total, for example, should we try to convince people who already buy biscuits, frozen foods and fruit to buy bread and cake as well, on the grounds that according to the top rule this combination tends to be associated with a high total transaction amount? This is flawed reasoning: the product combination does not cause a high total. Those 723 transactions probably include a vast assortment of other items, in addition to those mentioned in the rule.

However, an interesting exercise might be to model the path through the store required to collect associated items and see whether changes to that path (shorter, longer, displayed offers, etc.) have an effect on transaction amount or basket size.

I continued as follows.

  1. Remove the total attribute. The top 10 rules still all predict bread-and-cake, so remove bread-and-cake
  2. No rules found, so reduce lowerBoundMinSupport to 0.05
  3. The top 10 rules now all predict vegetables
  4. The most interesting rule is

     beef=t fruit=t potatoes=t 287 ==> vegetables=t 273 conf:(0.95)
  5. Remove vegetables. Now everything predicts frozen foods, so remove it
  6. Only one rule resulted, so I reduced lowerBoundMinSupport further, to 0.025
  7. Now the top 10 rules predict biscuits, so remove it
  8. Now the top 10 rules predict baking needs, with one exception:

     laundry needs=t wrapping=t dental needs=t prepared meals=t 132 ==> tissues-paper prd=t 125 conf:(0.95)
  9. Remove baking needs. Now the top 10 rules predict tissues-paper prd, so remove it
  10. The top rules now predict sauces-gravy-pkle, margarine, fruit
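
The procedure in the steps above (mine rules, remove whichever attribute dominates the consequents, lower the minimum support when too few rules appear) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not what Weka does internally: the `toy` baskets are invented, `mine_rules` is a hypothetical brute-force search rather than Apriori, the "most frequent consequent" test is a mechanical stand-in for the judgement calls made by hand above, and the starting support of 0.1 and minimum confidence of 0.9 are assumed rather than taken from the text.

```python
from collections import Counter
from itertools import combinations


def mine_rules(baskets, min_support, min_conf=0.9, max_len=3):
    """Brute-force search for rules 'antecedent ==> single item'.

    Returns (antecedent, consequent, antecedent_count, rule_count) tuples
    sorted by confidence. Hypothetical helper; not Apriori's pruned search.
    """
    n = len(baskets)
    items = sorted(set().union(*baskets))
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            covered = [b for b in baskets if set(antecedent) <= b]
            if not covered:
                continue
            for target in items:
                if target in antecedent:
                    continue
                both = sum(1 for b in covered if target in b)
                # Support is the fraction of all baskets matching the whole rule;
                # confidence is the fraction of antecedent matches that also match.
                if both / n >= min_support and both / len(covered) >= min_conf:
                    rules.append((antecedent, target, len(covered), both))
    rules.sort(key=lambda r: r[3] / r[2], reverse=True)
    return rules


def peel_dominant_consequents(baskets, supports=(0.1, 0.05, 0.025), top=10, rounds=5):
    """Repeatedly drop the item that most of the top rules predict."""
    baskets = [set(b) for b in baskets]
    removed = []
    level = 0  # index into the (assumed) schedule of minimum supports
    for _ in range(rounds):
        rules = mine_rules(baskets, supports[level])[:top]
        if not rules:
            if level + 1 == len(supports):
                break  # no lower support left to try
            level += 1  # "no rules found, so reduce the minimum support"
            continue
        dominant = Counter(r[1] for r in rules).most_common(1)[0][0]
        removed.append(dominant)
        baskets = [b - {dominant} for b in baskets]
    return removed


# Invented toy data: bread is in every basket, so it dominates at first.
toy = [
    {"bread", "milk", "fruit"},
    {"bread", "milk"},
    {"bread", "fruit"},
    {"bread", "milk", "fruit"},
    {"bread"},
    {"bread", "milk", "cheese"},
]

print(peel_dominant_consequents(toy))
```

Automating the choice like this loses the exceptions that were noticed by eye (such as the tissues-paper prd rule in step 8), so it is better read as a description of the routine than as a replacement for looking at the rules.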

Here are the resulting rules:

1. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t
fruit=t 126 ==> sauces-gravy-pkle=t 116 conf:(0.92)
2. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t
margarine=t 134 ==> sauces-gravy-pkle=t 123 conf:(0.92)
3. juice-sat-cord-ms=t canned vegetables=t breakfast food=t
sauces-gravy-pkle=t jams-spreads=t cheese=t 131 ==> margarine=t 120
conf:(0.92)
4. juice-sat-cord-ms=t canned fruit=t canned vegetables=t milk-cream=t
department137=t 141 ==> fruit=t 129 conf:(0.91)
5. canned fruit=t canned vegetables=t sauces-gravy-pkle=t jams-spreads=t
cheese=t 129 ==> margarine=t 118 conf:(0.91)
6. canned vegetables=t confectionary=t party snack foods=t wrapping=t
cheese=t 128 ==> sauces-gravy-pkle=t 117 conf:(0.91)
7. juice-sat-cord-ms=t canned vegetables=t sauces-gravy-pkle=t
jams-spreads=t party snack foods=t cheese=t 148 ==> margarine=t 135
conf:(0.91)
8. juice-sat-cord-ms=t canned vegetables=t puddings-deserts=t
party snack foods=t cheese=t 133 ==> sauces-gravy-pkle=t 121
conf:(0.91)
9. canned vegetables=t breakfast food=t sauces-gravy-pkle=t jams-spreads=t
party snack foods=t cheese=t 133 ==> margarine=t 121 conf:(0.91)
10. juice-sat-cord-ms=t canned fruit=t canned vegetables=t jams-spreads=t
cheese=t 128 ==> margarine=t 116 conf:(0.91)
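
Each rule line carries two counts: the number of transactions matching the left-hand side, and the number of those that also match the right-hand side. The confidence is the second divided by the first. A minimal sketch for checking this on pasted output follows; the regular expression and the `parse_rule` helper are my own assumptions about the line layout shown on this page, not part of Weka.

```python
import re

# Assumed layout: "[N.] <items> <count> ==> <items> <count> conf:(<value>)"
RULE = re.compile(
    r"^(?:\d+\.\s+)?(?P<lhs>.+?)\s+(?P<n_lhs>\d+)\s+==>\s+"
    r"(?P<rhs>.+?)\s+(?P<n_both>\d+)\s+conf:\s*\(?(?P<conf>[\d.]+)\)?$"
)


def parse_rule(text):
    """Parse one rule (possibly wrapped over several lines) into a dict."""
    line = " ".join(text.split())  # undo line wrapping and extra spaces
    m = RULE.match(line)
    if m is None:
        raise ValueError(f"not a recognised rule line: {line!r}")
    n_lhs, n_both = int(m["n_lhs"]), int(m["n_both"])
    return {
        "antecedent": m["lhs"],
        "consequent": m["rhs"],
        "antecedent_count": n_lhs,
        "rule_count": n_both,
        "reported_conf": float(m["conf"]),
        "recomputed_conf": n_both / n_lhs,
    }


rule = parse_rule(
    """1. canned vegetables=t puddings-deserts=t party snack foods=t cheese=t
    fruit=t 126 ==> sauces-gravy-pkle=t 116 conf:(0.92)"""
)
# The reported value is rounded to two decimals, so allow half a unit of slack.
assert abs(rule["recomputed_conf"] - rule["reported_conf"]) <= 0.005
print(rule["consequent"], round(rule["recomputed_conf"], 4))
```

For rule 1, for instance, 116 of the 126 transactions containing the five left-hand items also contain sauces-gravy-pkle, which rounds to the reported 0.92.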

At this point I became bored and gave up. I’m not very interested in supermarkets. And here in New Zealand I would be very unlikely to buy canned vegetables, so none of these rules would apply to me.

The following Discussion step invites you to share what you did with this data, along with your thoughts on market basket analysis in general.

This article is from the free online

More Data Mining with Weka

Created by
FutureLearn - Learning For Life
