# Basic statistics

After we have pre-processed our pupillometry data, there are several statistical approaches we can take to answer our research questions.

We could, of course, be interested in purely describing our data, using summary statistics such as measures of central tendency (the mean or median) and of spread (the range or standard deviation). Typically, however, we would also want to test a hypothesis. Below, we briefly describe three such approaches, after first outlining some assumptions our data should meet.
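As a sketch of this descriptive step, the summary statistics above can be computed in a few lines. The dilation values below are hypothetical, invented purely for illustration:

```python
import numpy as np

# Hypothetical baseline-corrected pupil dilations (mm) for one condition
dilations = np.array([0.12, 0.08, 0.15, 0.10, 0.09, 0.22, 0.11, 0.14])

mean_dil = np.mean(dilations)       # central tendency: average
median_dil = np.median(dilations)   # central tendency: middle value, robust to outliers
sd_dil = np.std(dilations, ddof=1)  # spread: sample standard deviation
value_range = np.ptp(dilations)     # spread: range (max minus min)
```

Note that the median (0.115 mm here) is less affected by the single large value (0.22 mm) than the mean, which is one reason to report both.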

## Statistical Assumptions

Before employing any statistical test, we need to ensure that our data meets the test's assumptions. Otherwise, we may draw incorrect conclusions from our experimental manipulations. Every statistical test comes with assumptions about how our data is organized and distributed.

Some of the most common statistical tests require normality: that the data being compared is approximately normally distributed rather than heavily skewed toward the extremes. Additionally, t-tests come in independent and paired forms. An independent-samples test compares two groups of different individuals; a paired test compares the same subjects across two experimental conditions.
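One common way to check the normality assumption is the Shapiro-Wilk test, available in SciPy. The sketch below uses simulated data in place of real pupil measurements:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated stand-in for one condition's pupil dilation values
sample = rng.normal(loc=0.1, scale=0.05, size=30)

# Shapiro-Wilk test: the null hypothesis is that the data is normally distributed,
# so a p-value above .05 gives no evidence against normality
stat, p = stats.shapiro(sample)
normal_enough = p > 0.05
```

Visual checks such as histograms or Q-Q plots are a useful complement, since formal normality tests become very sensitive with large samples.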

To conduct an analysis of variance (ANOVA), the data for each experimental condition must have roughly equal variances. This is called homoscedasticity and means that no condition's data is more variable or noisy than another's.
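Homoscedasticity can be checked with Levene's test, for example. Again, the two condition samples below are simulated placeholders:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated dilations for two experimental conditions
cond_a = rng.normal(0.10, 0.05, size=25)
cond_b = rng.normal(0.15, 0.05, size=25)

# Levene's test: the null hypothesis is equal variances (homoscedasticity),
# so a p-value above .05 gives no evidence of unequal variances
stat, p = stats.levene(cond_a, cond_b)
equal_variances = p > 0.05
```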

Such limitations may give the impression that statistical tests are overly restrictive. However, many alternative tests and corrections can be applied if our data does not meet all these assumptions.

## t-test and ANOVA

ANOVA and t-tests are often used within the classic ‘null hypothesis significance testing’ (NHST) framework. A t-test examines whether there is a significant difference between two groups, while an ANOVA extends this to more than two groups using the F-distribution. In both cases, we test whether our data comes from the same population (this is our null hypothesis).
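Both tests are readily available in SciPy. The sketch below runs them on simulated per-participant dilation values for three hypothetical conditions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated mean pupil dilations per participant in three task conditions
easy   = rng.normal(0.05, 0.04, size=20)
medium = rng.normal(0.10, 0.04, size=20)
hard   = rng.normal(0.18, 0.04, size=20)

# Independent-samples t-test: compares two groups
t_stat, t_p = stats.ttest_ind(easy, hard)

# One-way ANOVA: compares three or more groups via the F-distribution
f_stat, f_p = stats.f_oneway(easy, medium, hard)
```

For paired designs (the same subjects in each condition), `stats.ttest_rel` would be used instead of `stats.ttest_ind`.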

In the human and social sciences, we typically reject the null hypothesis if the probability value (p-value) of a t-test or ANOVA is 0.05 or below. If there is less than a 5% chance of observing a result at least this extreme when the null hypothesis is true, we conclude that there is a significant difference between our conditions.

The NHST is widely used – but subject to controversy regarding the arbitrary cut-off point at p = 0.05. Crucially, NHST does not tell us about the magnitude of an effect, that is, how big the difference is in pupil dilation between conditions. It should, therefore, always be complemented with effect size reports, for example, Cohen’s d or partial eta-squared.
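Cohen's d for two independent samples can be computed directly from the group means and the pooled standard deviation. The function and data below are an illustrative sketch, not a prescribed implementation:

```python
import numpy as np

def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

# Hypothetical per-participant dilation means for two conditions
a = np.array([0.05, 0.07, 0.04, 0.06, 0.08, 0.05])
b = np.array([0.11, 0.13, 0.10, 0.12, 0.14, 0.12])
d = cohens_d(b, a)  # positive: condition b shows larger dilation
```

By convention, |d| values around 0.2, 0.5, and 0.8 are often labeled small, medium, and large, though what counts as meaningful depends on the research context.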

## Bayesian approaches

As an alternative to the NHST framework, the Bayesian approach uses Bayes’ theorem to allow us to update our beliefs based on new evidence. In other words, how likely is our hypothesis given what we know and what we have just learned with our newly acquired data? The Bayes factor (BF) tells us this, indexing the ratio of the likelihood of one hypothesis to the likelihood of another, given the data we observe. Most statistical software can be used to compute the BF.
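Dedicated software (e.g., JASP, or the `pingouin` Python package) computes default Bayes factors properly. As a rough illustration only, the BF can also be approximated from a t statistic via the BIC approximation (Wagenmakers, 2007; Jarosz & Wiley, 2014); the data below are simulated:

```python
import numpy as np
from scipy import stats

def bf10_bic(x, y):
    """Rough BIC-based approximation of BF10 for an independent-samples t-test.

    This is an approximation for illustration, not a default Bayes factor
    as reported by JASP."""
    t, _ = stats.ttest_ind(x, y)
    n = len(x) + len(y)
    df = n - 2
    bf01 = np.sqrt(n) * (1 + t**2 / df) ** (-n / 2)  # evidence for the null
    return 1 / bf01                                   # evidence for the alternative

rng = np.random.default_rng(7)
cond_a = rng.normal(0.05, 0.05, size=20)
cond_b = rng.normal(0.15, 0.05, size=20)
bf10 = bf10_bic(cond_a, cond_b)  # values well above 1 favor a difference
```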

Bayesian statistics can tell us the likelihood that there is a difference between our experimental conditions. It can also tell us the likelihood of no difference, rather than just telling us that we didn’t observe any conclusive evidence for or against a difference. The figure below (taken from JASP) illustrates how BF values can inform us about the strength of the evidence for or against our hypothesis.

Generally, BF values below 1 provide evidence against your hypothesis (and in favor of the null), and the smaller the value, the stronger that evidence. Values above 1 provide evidence for your hypothesis, growing more robust the larger they are. Values close to 1 indicate that your data does not contain evidence for or against your hypothesis, similar to a non-significant p-value from frequentist statistics.

## Time-series analyses

To this point, we have discussed approaches to statistically test our hypotheses using extracted features of pupil data: the average or peak pupil dilation within a preset time interval (the analysis window). However, this type of analysis might miss out on one of the key advantages of pupillometry, namely, its temporal resolution. With time-series analysis, instead of extracting average or peak values, we analyze how pupil dilation unfolds over time.

One measurement that can be calculated from the pupil time series is the drift rate, or the rate at which pupil size changes over a time window of your choice. This can be done simply by calculating the slope of the pupil size from the beginning to the end of the window. The drift rate of the pupil can be taken as a proxy for attentional maintenance, with more negative values indicating greater fatigue or disengagement. The figure below shows pupil traces that demonstrate significant drift for all conditions. This will be discussed more in Week 5.
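A minimal sketch of this slope calculation, using a simulated pupil trace with a built-in downward drift of about -0.05 mm/s (the sampling rate, window length, and noise level are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated pupil trace: 500 samples over a 5 s window (100 Hz),
# a slowly shrinking pupil plus measurement noise
time = np.linspace(0, 5, 500)                               # seconds
pupil = 4.0 - 0.05 * time + rng.normal(0, 0.02, size=500)   # mm

# Drift rate: slope of a straight-line fit across the analysis window
slope, intercept = np.polyfit(time, pupil, deg=1)
# A negative slope here would be read as disengagement or fatigue
```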

## Combining methods

Pupillometry data can be analyzed in several ways, and there is currently no consensus on a single correct approach. Importantly, we should let the nature of our research questions and experimental designs inform how we best test our hypotheses.

Although we have listed three different statistical approaches separately, they can also be combined. For example, we could start by analyzing differences in average dilation and then examine the time point at which dilation begins to differ between conditions.

Finally, we could use Bayes factors as a complement, providing further insight into the plausibility of an effect or, in the case of null results, testing whether this reflects an actual absence of an effect.