Multivariable Regressions

The beauty of regressions is they can account for many variables to better mirror the real world. Watch Raj Venkatesan explain more.
That was a lot. The first time you see a regression output, you’re going to have sticker shock. It’s going to take you by surprise that this is a lot. But as you keep seeing it, you’re going to get used to it. So go back, watch the video on the regression output multiple times. You’ll get better at it. But let me remind you, squash is not going away this module. It’s staying right here, and maybe I’ll bring another squash and put it up here. But this is where you really work hard and you gain a lot, because this regression is used a lot. And a good understanding of this is going to pay a lot of dividends.
Now, we looked at one variable, promotion. Now you can be thinking, wait a minute, this guy must be kidding. Only promotion affects sales? There must be a lot of things that are affecting sales. How can it be only promotion? How will I know all of those variables? How can I put them all together? The good news is, yes, you can put all those variables that you think will affect sales into a regression function. That’s the beauty of a regression function. But when you do that, it is important to know what you put into the regression function, and also what you did not put in the regression function, right?
That is the key here, and we’re going to see what it means. So the way to understand that is to look at some Simulated Shopper Card Data.
What I mean by simulated is this is data that I made up, right? So I am making up this data on units purchased, but I’m making this data up methodically, right? I’m using an equation like this: a + b1 times price paid + b2 times feature + b3 times display. Feature and display are variables that are each either 1, if the product was on feature or on display, or 0 otherwise. Price is, you all know what price is, and we have units purchased here. The units purchased could be anything like three, two, one, zero. So in this first row, for customer 1, price was $1.50. No feature, no display.
We observed unit sales of 3 and you got that by plugging in this equation. I’m not going to give you the values I used for a, b1, b2, b3 yet. That’s what a regression function will tell you. And if I made up this data, if I run a regression on this data, I should be able to get a, b1, b2, b3, all right? So think of this as the god equation, okay? I played God. I said, here we go. Units purchased are from now on going to be a + b1 times price paid + b2 feature + b3 display. There’s a little bit of randomness. I want to keep people on their toes.
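The data-generating step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficient values, the price grid, the noise level, and the rounding to whole units are all hypothetical choices made for this sketch, not the values used in the lecture.

```python
import numpy as np

# Hypothetical "god equation" coefficients, chosen only for illustration.
# With these values, price 1.50 and no feature or display gives exactly 3 units
# before any noise is added.
A, B1, B2, B3 = 6.0, -2.0, 0.5, 0.5


def simulate_shoppers(n, seed=0):
    """Make up shopper card data from a + b1*price + b2*feature + b3*display."""
    rng = np.random.default_rng(seed)
    price = rng.choice([1.00, 1.25, 1.50, 1.75, 2.00], size=n)  # assumed price grid
    feature = rng.integers(0, 2, size=n)  # 1 if on feature, 0 otherwise
    display = rng.integers(0, 2, size=n)  # 1 if on display, 0 otherwise
    noise = rng.normal(0.0, 0.3, size=n)  # "a little bit of randomness"
    units = A + B1 * price + B2 * feature + B3 * display + noise
    # Assumption: units purchased are whole, non-negative numbers.
    units = np.clip(np.rint(units), 0, None).astype(int)
    return price, feature, display, units


price, feature, display, units = simulate_shoppers(5)
for row in zip(price, feature, display, units):
    print(row)  # (price paid, feature, display, units purchased)
```

Each printed row plays the role of one customer in the shopper card table: the marketer sets price, feature, and display, and the equation plus noise determines the units purchased.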
That’s it, and I played God, and I said, this is a, b1, b2, b3. Regression is now going to find out what that was. That’s the beauty of regression. In the real world, this is what is happening, right? Marketers set feature, display and price and then they go and see how people react. And then they have to put this equation back together and say, on average, this is how people react. This is the weight they give for price paid. This is the weight they give for feature. This is the weight they give for display. So that’s what we’re going to do here, okay? So let’s see what happened here.
So we take all this data, and by some freak of nature, you’re so smart, you said, the only way units purchased change is with three variables: price paid, feature and display. And you knew that and you ran the regression. What happens here? That’s here, where you are the smart guy who found the true model and you were able to find the god equation.
So, you ran this regression, you found these coefficients, my God! They’re all the same as what I put in there. Intercept was 6.28, price was -2.31, feature, display, they’re all good. Wow, you are awesome. You got an R-squared of 93%, that’s unimaginable! How did you do that? Well, you were just smart, you found out all of those things. Now assume you are not a genius. You are not a Nobel Prize winner, or someone like that. And then you said, well, I’ll make a guess. And you said units sold is a function of feature and display. I see these nice things in the store. That’s what is driving sales. And I’m going to put only those two.
I don’t know if price was in the model. I’m not going to put it. You forgot to put it and you ran this regression, and that is the estimated model. What’s the difference? Look at the intercept, it’s lower, whoa! Feature is higher. Display is higher. R-Squared is only 18%, which is okay, because price paid is not there. But why are the coefficients of feature and display different between the True Model and Estimated Model? You didn’t include price, that’s fine. But shouldn’t you get the same coefficient for feature and display if you did or did not include price? Why are the coefficients different?
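The comparison between the true model and the estimated model can be reproduced on made-up data. The sketch below is not the lecture's dataset: the intercept (6.28) and price coefficient (-2.31) are taken from the transcript, but the feature and display coefficients, the noise levels, and in particular the assumption that promoted products tend to carry a lower price are hypothetical choices made for illustration. Under those assumptions it fits the regression once with all three variables and once without price, so the two sets of coefficients can be set side by side.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

feature = rng.integers(0, 2, size=n)  # 1 if on feature, 0 otherwise
display = rng.integers(0, 2, size=n)  # 1 if on display, 0 otherwise
# Assumption of this sketch: featured or displayed products are priced lower.
price = 1.75 - 0.25 * feature - 0.25 * display + rng.normal(0.0, 0.3, size=n)

# a and b1 are from the transcript; b2 and b3 are hypothetical.
true_coef = np.array([6.28, -2.31, 0.50, 0.50])
X_full = np.column_stack([np.ones(n), price, feature, display])
# Units are left continuous here (no rounding) to keep the comparison clean.
units = X_full @ true_coef + rng.normal(0.0, 0.2, size=n)


def ols(X, y):
    """Least-squares coefficients and R-squared for a design matrix with intercept."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - resid.var() / y.var()
    return coef, r2


# "True model": price, feature and display are all included.
full_coef, full_r2 = ols(X_full, units)

# "Estimated model": price is left out.
X_short = np.column_stack([np.ones(n), feature, display])
short_coef, short_r2 = ols(X_short, units)

print("true model      [a, price, feature, display]:", full_coef.round(2), "R2:", round(full_r2, 2))
print("estimated model [a, feature, display]:       ", short_coef.round(2), "R2:", round(short_r2, 2))
```

With these assumed inputs the pattern has the same direction as in the lecture: the model without price has a lower intercept, larger feature and display coefficients, and a lower R-squared. The exact numbers depend entirely on the assumed values above.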
Think about it and I’ll be back.

Learn how inputting many variables into your regression equation gets you closer to the true model of the way your market responds.

This article is from the free online

Marketing Analytics

Created by
FutureLearn - Learning For Life
