3.7

## Purdue University

This course is about collecting, analyzing, and reporting data that we get from social media. Data collection is easy and fast using social media. However, analysis requires a specialized set of skills. In this class you will need to know at least a modicum of statistical analysis to be successful. I will give you an example of a statistical skill that you can use in this class. If you already have this type of skill, you should be very successful in the course as you are. If you do not recognize this skill as something you've learned in the past, you might have to look up some information about this type of analysis, or maybe take one of our intro courses.

Whichever the case, however, I would like to emphasize that the statistical skills you need in this class are not very advanced. They require that you understand one fundamental thing: statistical analysis is about stating the obvious with a certain degree of certainty. And this applies to just about any type of statistical analysis you want to do. I'll provide an example right here and now by reviewing, in the shortest possible way, the logic of T-tests. T-tests are a statistical procedure by which you can tell whether two groups are indeed different, that is, whether they are statistically different or not. Let's take the example of a tweet that I might have posted with a cat in it.

Cats are very popular on the Internet. Initially, let's just say all females like the tweet and no males like the tweet. Now, we got the impressions, people looked at it, and only the females liked the tweet. If that is the case, it's common sense to say that there is a significant difference between males and females. Males are different from females. Why? We don't know that from the statistics, but the statistics tell us there is a difference. Now, what happens if the difference is not as large as that? Let's just say that we have one male among the ten who likes the tweet. Is the group of males now different from the group of females?

The T-test, the statistical T-test, will be able to tell us to what degree our certainty has moved as the difference between the groups shrinks. I put that number, the statistical test number, in this cell. The number is generated by a formula, called the T-test formula, which basically looks at how spread out the values are and how large the difference in proportions is. It's no more, no less than that. And now we see that, by adding one male to the mix, our certainty has shifted. In the beginning, we could have said that there's zero chance that males and females are not different, right?

Now, by adding one male, we say there's a very tiny 0.00000 chance that maybe males and females are not different after all. Now, as we add more males who like the tweet, let's just say that we reissue the same tweet several times, and on each iteration we have more males liking the tweet. As you observe here at the bottom, the likelihood that the two groups are not different moves from zero to higher and higher numbers, until we get to a situation where the test will tell us: hey, you know, actually these two groups are so similar that I cannot run. I'm a test of difference, not a test of similarity.

I'm breaking down; these two groups are not that different. So the T-test, like most statistics, is meant to tell us what the significant differences between groups are, basically by looking at the characteristics of the groups through simple counts. In a future video, I will talk about a different type of analysis that is complementary to this one: correlation analysis, which looks at how similar the characteristics of individuals within a group are. Between T-tests and correlations, you basically have all the statistical skill you need to be very successful in this class.
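The transcript mentions a T-test formula without showing it. As a rough sketch, one common version is the pooled-variance two-sample statistic below. Coding each like as 1 and each non-like as 0, and using ten people per group, are assumptions made for illustration; the formula used in the video's spreadsheet may differ.

$$
t = \frac{\bar{x}_F - \bar{x}_M}{\sqrt{s_p^2\left(\frac{1}{n_F} + \frac{1}{n_M}\right)}},
\qquad
s_p^2 = \frac{(n_F - 1)\,s_F^2 + (n_M - 1)\,s_M^2}{n_F + n_M - 2}
$$

Here $\bar{x}_F$ and $\bar{x}_M$ are the proportions of females and males who liked the tweet, and $s_F^2$ and $s_M^2$ are the sample variances (the "spread"). For the case where all ten females and one of ten males like the tweet:

$$
\bar{x}_F = 1,\quad s_F^2 = 0,\quad \bar{x}_M = 0.1,\quad s_M^2 = 0.1,\quad
s_p^2 = \frac{9 \cdot 0 + 9 \cdot 0.1}{18} = 0.05,\quad
t = \frac{1 - 0.1}{\sqrt{0.05 \cdot 0.2}} = \frac{0.9}{0.1} = 9
$$

When both groups behave identically (everyone likes the tweet), the numerator and the denominator are both zero, so the statistic is undefined. That is the situation in which the test "cannot run".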

# Statistically Significant Differences Between Groups

In this video, we will review the underlying logic and inferences behind t-tests and correlations as they relate to social media. The particular example provides an illustrative way of thinking rather than a complete method to be applied immediately.
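To make the reissued-tweet scenario from the video concrete, here is a minimal Python sketch using only the standard library. It assumes ten females who all like the tweet and ten males of whom `male_likes` like it, with each like coded as 1 and each non-like as 0. The function names (`pooled_t`, `two_sided_p`, `p_value_for`) and the pooled-variance form of the test are illustrative choices, not taken from the video.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, degrees of freedom), or None when the test cannot run."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    if std_err == 0:
        return None  # no spread in either group, so t is undefined
    return (mean(a) - mean(b)) / std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density over [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability of landing in [0, |t|]
    return max(0.0, 1 - 2 * central)


def p_value_for(male_likes, group_size=10):
    """p-value when every female and `male_likes` males like the tweet (assumed setup)."""
    females = [1] * group_size
    males = [1] * male_likes + [0] * (group_size - male_likes)
    result = pooled_t(females, males)
    if result is None:
        return None
    return two_sided_p(*result)


# Reissue the tweet with more and more males liking it.
for k in range(11):
    p = p_value_for(k)
    print(k, "cannot run" if p is None else f"{p:.6f}")
```

With these assumptions, the p-value starts out vanishingly small and grows as more males like the tweet. When all ten males like it, both groups are identical and the function returns `None`, mirroring the test that "cannot run". Note that this sketch also returns `None` when no males like the tweet, because neither group has any spread; a spreadsheet may report that case differently.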

• How will you use t-tests in your social media analysis? Share with your peers what you will compare and what you think the results will be.