Guess the p-value and effect size

The aim is to guess the p-value and effect size for a two-tailed between-groups t-test for equal variances. You can gain up to 20 points per attempt.

By default, the graph will show mean +/- SD, SE, or 95% Margin of Error (MOE). Pay attention to the text above the graph to interpret the error bars correctly.

The nickname shown below will be used for the leaderboard if you complete more than 10 attempts (see Leaderboard tab).

Guess the p-value:

Guess Cohen's d:

Leaderboard showing average scores and total points for participants with at least 10 attempts.

Ranks are shown in parentheses next to the values.

Filter by Nickname:

Background

Data and statistical result visualisations are very frequently used in scientific publications. Ideally, a graph should provide us with easy-to-understand information on statistical significance as well as effect size. The aim of this Shiny app is to better understand to what degree different types of graphs achieve this aim.

Note that, in theory, every type of graph presented by the app (whichever error bar is displayed) provides the information needed to determine the p-value and Cohen’s d, because the underlying statistical measures can be derived from one another. However, the idea is not to measure the exact lengths of the error bars with a ruler and then enter them into formulas. Rather, the idea is to see how good we are at approximating the p-value and the effect size from a visual representation of the data and the statistical results.

To help you with the approximation, the next section provides some rules of thumb for converting between different types of error bars. The final two sections provide the exact formulas (but reading these sections is not necessary for having a go at guessing!).

Rules of thumb

Here are some rules of thumb for visually estimating the sizes of error bars not shown on the plot. These should hopefully be helpful in improving the accuracy of your guesses. Keep in mind that these rules are for a two-tailed between-groups t-test, assuming equal variances and equal sample sizes $n$.

Abbreviations:

SD: standard deviation
SE: standard error
MOE: margin of error (half the width of the confidence interval)

SD to SE to MOE

SD to SE:

Average the two SDs (as shown further below, this is not quite accurate, but probably good enough for a rough estimate).
Multiply the result by 1.5 (more precisely, by $\sqrt{2} \approx 1.41$).
Divide the result by $\sqrt{n}$.

SE to MOE:

Multiply SE by 2 (an approximation of the critical t-value) to obtain the margin of error.

An example

Suppose each group has a sample size of 25, and the first group has an SD of 35 and the second group has an SD of 45.

SD to SE

Average the two SDs: 40
Multiply by 1.5: 60
Divide by 5 (square root of $n$): 12 (for comparison: the exact formula, using the pooled SD, gives 11.40)

SE to MOE

Multiply SE by 2: 24 (for comparison: the exact formula, using the critical t-value for 48 degrees of freedom, gives 22.92)
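The example can also be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the critical t-value of 2.0106 (two-tailed, 5% level, 48 degrees of freedom) is an assumed table value rather than something computed here.

```python
import math

def se_diff_rule_of_thumb(sd_a, sd_b, n):
    """Rule of thumb: average the SDs, multiply by 1.5, divide by sqrt(n)."""
    return (sd_a + sd_b) / 2 * 1.5 / math.sqrt(n)

def se_diff_exact(sd_a, sd_b, n):
    """Exact equal-n formula: pooled (root mean square) SD times sqrt(2 / n)."""
    sd_pooled = math.sqrt((sd_a**2 + sd_b**2) / 2)
    return sd_pooled * math.sqrt(2 / n)

# Assumed table value: two-tailed 5% critical t for 25 + 25 - 2 = 48 df.
T_CRIT_48 = 2.0106

se_rough = se_diff_rule_of_thumb(35, 45, 25)
se_exact = se_diff_exact(35, 45, 25)
moe_rough = se_rough * 2          # rule of thumb: critical t is roughly 2
moe_exact = se_exact * T_CRIT_48  # exact: use the tabulated critical t

print(f"SE:  rough {se_rough:.2f}, exact {se_exact:.2f}")
print(f"MOE: rough {moe_rough:.2f}, exact {moe_exact:.2f}")
```

Both rules of thumb round upwards (1.5 instead of $\sqrt{2}$, 2 instead of the critical t-value), so the rough estimates come out somewhat larger than the exact values.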

MOE to SE to SD

MOE to SE:

Average the margins of error for both groups.
Divide the result by 2.

SE to SD:

Multiply SE by $\sqrt{n}$.
Take $\frac{2}{3}$ of the result.

Error bars displayed in the figure

As is common in publications, the error bars displayed are for the individual group means. This section describes how the relevant measures (SD, SE and MOE) are calculated for each group individually. The next section describes how to calculate the relevant measures for the difference between group means. Note that the calculations presented in the next section are what actually matters for the t-test!

Standard deviation

The standard deviation for each group is calculated as:

$$ SD = \sqrt{\frac{\sum_{i=1}^n (x_i - \overline{x})^2}{n-1}} $$ Where:

$x_i$: Individual data points (from $i = 1$ to $n$).
$\overline{x}$: Sample mean.
$(x_i - \overline{x})^2$: Squared deviation of each data point from the mean.
$\sum_{i=1}^n (x_i - \overline{x})^2$: Sum of squared deviations, also called the sum of squares (SS).
$n - 1$: Degrees of freedom.
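As a minimal sketch, the formula translates into Python as follows. The function name `sample_sd` and the data values are made up for illustration; the standard library's `statistics.stdev` computes the same quantity.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation with n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the mean (the sum of squares, SS).
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_of_squares / (n - 1))

# Made-up data, used only to illustrate the calculation.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sd = sample_sd(data)
print(sd, statistics.stdev(data))  # the two values should agree
```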

Standard error

The standard error (SE) for each group is calculated as: $$ SE = \frac{SD}{\sqrt{n}} $$ Where:

$SD$: Standard deviation.
$n$: Sample size.

Margin of error

The margin of error (MOE) for each group is calculated as: $$ \text{MOE} = SE \cdot t_{\text{crit}} $$

Where:

$SE$: Standard error.
$t_{\text{crit}}$: Critical value from the t-distribution with $n - 1$ degrees of freedom.

The confidence interval for each group is:

$$ CI = \bar{x} \pm \text{MOE} $$ Where:

$\bar{x}$: Sample mean.
$MOE$: Margin of error.
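The per-group measures can be chained together as in the sketch below. The helper `group_error_bars` and the data are hypothetical, and the critical t-value is passed in as an argument because Python's standard library has no t-distribution; 2.3646 is an assumed table value for a two-tailed 5% level with 7 degrees of freedom.

```python
import math
import statistics

def group_error_bars(values, t_crit):
    """Return mean, SD, SE, MOE and confidence interval for one group."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)          # standard error of the group mean
    moe = se * t_crit               # margin of error
    return {"mean": mean, "sd": sd, "se": se, "moe": moe,
            "ci": (mean - moe, mean + moe)}

# Made-up data with n = 8, hence 7 degrees of freedom.
bars = group_error_bars([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], t_crit=2.3646)
print(bars)
```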

Error bars for the mean difference

As mentioned above, what actually matters for our t-test is the variability of the difference between group means. The relevant measures are calculated as explained below. Note that all of the formulas assume equal variances for the two groups.

Pooled standard deviation

For a two-group between-subjects design with equal variances, the formula for computing the pooled standard deviation is:

$$ SD_{\text{pooled}} = \sqrt{ \frac{(n_A - 1)SD_A^2 + (n_B - 1)SD_B^2}{n_A + n_B - 2} } $$

Where:

$SD_A$ and $SD_B$: Sample standard deviations of group A and group B,
$n_A$ and $n_B$: Sample sizes of group A and group B.

When $n_A = n_B$, the pooled standard deviation simplifies to: $$ SD_{\text{pooled}} = \sqrt{ \frac{SD_A^2 + SD_B^2}{2} } $$

This is the root mean square (RMS) of the two standard deviations.

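If both standard deviations are equal, this further simplifies to the arithmetic mean of the two standard deviations (which is then simply their common value): $$ SD_{\text{pooled}} = \frac{SD_A + SD_B}{2} $$

When the two standard deviations differ, the root mean square is always larger than the arithmetic mean, which is why averaging the SDs is only an approximation.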
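A short sketch comparing the general formula, the equal-$n$ simplification and the plain average. The function names are illustrative, and the numbers (SDs of 35 and 45, 25 observations per group) are taken from the rules-of-thumb example.

```python
import math

def pooled_sd(sd_a, sd_b, n_a, n_b):
    """General pooled SD, weighting each variance by its degrees of freedom."""
    numerator = (n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2
    return math.sqrt(numerator / (n_a + n_b - 2))

def pooled_sd_equal_n(sd_a, sd_b):
    """Equal-n simplification: root mean square of the two SDs."""
    return math.sqrt((sd_a**2 + sd_b**2) / 2)

general = pooled_sd(35, 45, 25, 25)
rms = pooled_sd_equal_n(35, 45)
arithmetic_mean = (35 + 45) / 2

print(f"general {general:.2f}, RMS {rms:.2f}, mean {arithmetic_mean:.2f}")
```

With equal sample sizes the first two values coincide, and both are a little above the arithmetic mean of 40.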

Standard error

For a two-group between-subjects design with equal variances, the formula for computing the standard error of the mean difference is:

$$ SE_{\text{diff}} = \frac{ SD_{\text{pooled}} }{ \sqrt{ \frac{n_A \cdot n_B}{n_A + n_B} } } = SD_{\text{pooled}} \cdot \sqrt{ \frac{1}{n_A} + \frac{1}{n_B} } $$

Where:

$SD_{\text{pooled}}$: Pooled standard deviation.
$n_A$ and $n_B$: Sample sizes of group A and group B.

If $n_A = n_B$, this simplifies to: $$ SE_{\text{diff}} = SD_{\text{pooled}} \cdot \sqrt{ \frac{2}{n} } = \frac{ SD_{\text{pooled}} \cdot \sqrt{2} }{ \sqrt{n} } $$
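The equivalence of the two forms, and the equal-$n$ simplification, can be checked numerically. This is a sketch with made-up function names and arbitrary input values.

```python
import math

def se_diff(sd_pooled, n_a, n_b):
    """SE of the mean difference: SD_pooled * sqrt(1/n_A + 1/n_B)."""
    return sd_pooled * math.sqrt(1 / n_a + 1 / n_b)

def se_diff_alt(sd_pooled, n_a, n_b):
    """Equivalent form: SD_pooled / sqrt(n_A * n_B / (n_A + n_B))."""
    return sd_pooled / math.sqrt(n_a * n_b / (n_a + n_b))

# Unequal sample sizes: both forms should give the same result.
unequal_1 = se_diff(40.0, 20, 30)
unequal_2 = se_diff_alt(40.0, 20, 30)

# Equal sample sizes: compare with SD_pooled * sqrt(2) / sqrt(n).
equal_general = se_diff(40.0, 25, 25)
equal_simplified = 40.0 * math.sqrt(2) / math.sqrt(25)

print(unequal_1, unequal_2, equal_general, equal_simplified)
```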

Margin of error

For a two-group between-subjects design with equal variances, the margin of error (MOE) for the mean difference is calculated as:

$$ MOE_{\text{diff}} = SE_{\text{diff}} \cdot t_{\text{crit}} $$

Where:

$SE_{\text{diff}}$: Standard error of the difference between means.
$t_{\text{crit}}$: Critical value from a t-distribution with $n_A + n_B - 2$ degrees of freedom.

The confidence interval for the mean difference is:

$$ CI_{\text{diff}} = (\bar{x}_{A} - \bar{x}_{B}) \pm MOE_{\text{diff}} $$

Where:

$\bar{x}_A, \bar{x}_B$: Sample means of groups A and B.
$MOE_{\text{diff}}$: Margin of error for the mean difference.
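Putting the pieces of this section together, the sketch below computes the margin of error and confidence interval for the mean difference. It also returns the t statistic (mean difference divided by $SE_{\text{diff}}$) and Cohen's d (mean difference divided by $SD_{\text{pooled}}$); these are the standard textbook definitions, not formulas stated in this text. The function name, the group means of 100 and 125, and the critical t-value of 2.0106 (assumed table value for 48 degrees of freedom) are illustrative assumptions.

```python
import math

def mean_difference_summary(mean_a, mean_b, sd_a, sd_b, n_a, n_b, t_crit):
    """Summarise the difference between two group means (equal variances)."""
    sd_pooled = math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    )
    se_diff = sd_pooled * math.sqrt(1 / n_a + 1 / n_b)
    moe_diff = se_diff * t_crit
    diff = mean_a - mean_b
    return {
        "diff": diff,
        "moe_diff": moe_diff,
        "ci_diff": (diff - moe_diff, diff + moe_diff),
        "t": diff / se_diff,           # standard definition, see lead-in
        "cohens_d": diff / sd_pooled,  # standard definition, see lead-in
    }

# Hypothetical means; SDs and sample sizes as in the rules-of-thumb example.
result = mean_difference_summary(100, 125, 35, 45, 25, 25, t_crit=2.0106)
print(result)
```

In this made-up case the confidence interval for the difference does not include zero, which matches the absolute t statistic exceeding the critical value.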