Appendix S — SPSS Tutorial: Repeated Measures ANOVA
Conducting one-way repeated measures ANOVA in SPSS
S.1 Overview
This tutorial walks through a one-way repeated measures ANOVA using the core_session.csv dataset. The research question is: Did muscular strength change significantly across three time points (pre-, mid-, and post-training) in participants enrolled in a 12-week resistance training program?
The dataset contains one row per participant per time point (long format). Before running a repeated measures ANOVA in SPSS, the data must be in wide format — one row per participant, with separate columns for each time point. We will cover this restructuring step first.
Prerequisites: You should have completed the SPSS tutorials for independent samples t-test and one-way ANOVA before working through this tutorial.
S.2 Part 1: Preparing the data (wide format)
SPSS’s repeated measures ANOVA procedure (via General Linear Model) requires the data to be in wide format, with one row per participant and separate variables for each repeated measurement.
Step 1: Open core_session.csv in SPSS
Go to File → Open → Data, locate core_session.csv, and open it. Make sure column headers are read correctly.
Step 2: Filter to the training group only
Since our research question focuses on the training group:
- Go to Data → Select Cases
- Select If condition is satisfied, click If…
- Enter the condition: group = "training" (or use the variable’s numeric code if group is coded numerically)
- Click Continue → OK
SPSS will mark non-training cases with a diagonal line; they will be excluded from subsequent analyses.
Step 3: Restructure from long to wide format
- Go to Data → Restructure
- Select Restructure selected cases into variables (this converts long format to wide format; the first option, Restructure selected variables into cases, does the opposite)
- Click Next
- Set the Identifier variable to id (participant ID)
- Set the Index variable to time (pre, mid, post)
- Move strength_kg to the Variables to be transposed list
- Click Next through the remaining steps and click Finish
SPSS creates a new dataset with one row per participant and three new variables: strength_kg.1 (pre), strength_kg.2 (mid), and strength_kg.3 (post). Rename these to strength_pre, strength_mid, and strength_post for clarity using Variable View.
After restructuring, save the new wide-format dataset (File → Save As) before running the analysis. Keeping both the original long-format and the new wide-format files on hand makes it easy to run other analyses (e.g., descriptive statistics, paired comparisons) on the original file without re-restructuring.
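For readers who want to see what the restructuring step does to the data, the following is a minimal Python sketch using only the standard library. The two participants and their values are invented for illustration and are not taken from core_session.csv; the function name long_to_wide and the strength prefix are likewise assumptions chosen to match the renamed variables above.

```python
# Hypothetical long-format rows: one row per participant per time point.
long_rows = [
    {"id": 1, "time": "pre", "strength_kg": 70.0},
    {"id": 1, "time": "mid", "strength_kg": 72.5},
    {"id": 1, "time": "post", "strength_kg": 75.0},
    {"id": 2, "time": "pre", "strength_kg": 88.0},
    {"id": 2, "time": "mid", "strength_kg": 90.0},
    {"id": 2, "time": "post", "strength_kg": 93.0},
]


def long_to_wide(rows, id_key="id", index_key="time",
                 value_key="strength_kg", prefix="strength"):
    """Return one dict per participant with one column per time point."""
    wide = {}
    for row in rows:
        # Create the participant's record the first time the id is seen.
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        # The index value (pre, mid, post) becomes part of the column name.
        record[f"{prefix}_{row[index_key]}"] = row[value_key]
    return [wide[key] for key in sorted(wide)]


wide_rows = long_to_wide(long_rows)
for record in wide_rows:
    print(record)
```

Each output record corresponds to one row of the wide-format SPSS dataset, with strength_pre, strength_mid, and strength_post as separate columns.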
S.3 Part 2: Running the repeated measures ANOVA
Step 1: Open the General Linear Model dialog
Go to Analyze → General Linear Model → Repeated Measures
Step 2: Define the within-subject factor
In the Repeated Measures Define Factor(s) dialog:
- In the Within-Subject Factor Name field, type: Time
- In the Number of Levels field, type: 3
- Click Add
- Click Define
Step 3: Assign variables to the factor levels
In the Repeated Measures dialog:
- Move strength_pre to the box next to Time(1)
- Move strength_mid to the box next to Time(2)
- Move strength_post to the box next to Time(3)
Step 4: Request options
Click Options:
- Check Descriptive statistics
- Check Estimates of effect size
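- Optionally check Homogeneity tests (this requests Levene’s and Box’s M tests, which apply only when the design includes a between-subjects factor; Mauchly’s test of sphericity is printed automatically for repeated measures designs)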
- Under Display Means for, move Time to the right panel
- Check Compare main effects
- Under Confidence interval adjustment, select Bonferroni
- Click Continue
Step 5: Request a means plot
Click Plots:
- Move Time to the Horizontal Axis box
- Click Add
- Click Continue
Step 6: Run the analysis
Click OK. SPSS will produce several output tables.
S.4 Part 3: Interpreting the output
S.4.1 Table 1: Descriptive statistics
The first table shows the mean, standard deviation, and n for each time point:
| Time Point | M (kg) | SD | n |
|---|---|---|---|
| Pre-training | 79.67 | 12.26 | 30 |
| Mid-training (6-week) | 81.69 | 12.26 | 30 |
| Post-training (12-week) | 85.06 | 12.48 | 30 |
Strength increased progressively at each time point, with a total gain of approximately 5.4 kg from pre to post.
S.4.2 Table 2: Mauchly’s test of sphericity
| Effect | Mauchly’s W | χ² | df | p | ε (GG) | ε (HF) | ε (LB) |
|---|---|---|---|---|---|---|---|
| Time | .932 | 2.13 | 2 | .054 | .936 | 1.000 | .500 |
Interpreting this table:
Mauchly’s W = .932, χ²(2) = 2.13, p = .054. Because p > .05, the sphericity assumption is not violated. We will use the Sphericity Assumed row in the within-subjects effects table.
If the sphericity assumption had been violated (p < .05), we would inspect the GG epsilon:
- ε_GG ≥ .75 → use Huynh-Feldt corrected row
- ε_GG < .75 → use Greenhouse-Geisser corrected row
The within-subjects effects table will contain four rows for the Time effect: Sphericity Assumed, Greenhouse-Geisser, Huynh-Feldt, and Lower-bound. You must determine which row to report based on Mauchly’s test result — not by choosing the row with the most favorable p-value. Selecting a row post hoc based on significance constitutes p-hacking[1,2].
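The decision rule above can be written down before looking at any F-row, which is one way to guard against choosing a row after the fact. The sketch below is illustrative only: the function name row_to_report is hypothetical, and treating p exactly equal to .05 as "not violated" is an assumption, since the tutorial does not address that boundary case.

```python
def row_to_report(mauchly_p, epsilon_gg, alpha=0.05):
    """Pick the row of the within-subjects effects table to report.

    The choice uses only Mauchly's p-value and the Greenhouse-Geisser
    epsilon, never the p-values of the F-rows themselves.
    """
    # Assumption: p exactly equal to alpha counts as "not violated".
    if mauchly_p >= alpha:
        return "Sphericity Assumed"
    # Sphericity violated: mild departures use Huynh-Feldt,
    # stronger departures use Greenhouse-Geisser.
    if epsilon_gg >= 0.75:
        return "Huynh-Feldt"
    return "Greenhouse-Geisser"


# Values from the Mauchly table in this tutorial.
print(row_to_report(mauchly_p=0.054, epsilon_gg=0.936))
```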
S.4.3 Table 3: Tests of within-subjects effects
| Source | Correction | SS | df | MS | F | p | η²_p |
|---|---|---|---|---|---|---|---|
| Time | Sphericity Assumed | 443.73 | 2 | 221.87 | 116.0 | < .001 | .800 |
| | Greenhouse-Geisser | 443.73 | 1.872 | 237.18 | 116.0 | < .001 | .800 |
| | Huynh-Feldt | 443.73 | 2.000 | 221.87 | 116.0 | < .001 | .800 |
| Error (Time) | Sphericity Assumed | 110.94 | 58 | 1.91 | | | |
| | Greenhouse-Geisser | 110.94 | 54.28 | 2.04 | | | |
| | Huynh-Feldt | 110.94 | 58.00 | 1.91 | | | |
Interpreting this table:
Because Mauchly’s test was not significant, we read the Sphericity Assumed row: F(2, 58) = 116.0, p < .001, η²_p = .80.
The F-ratio is the same across all rows; only the degrees of freedom (and therefore the exact p-value) change with the corrections. Because ε_GG = .936 and ε_HF = 1.000, the corrected degrees of freedom are very close to the uncorrected values, and every row gives p < .001 here.
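To see why the F-ratio does not change, note that each correction multiplies both the numerator and the denominator degrees of freedom by the same ε, so the factor cancels when the two mean squares are divided. The short sketch below illustrates this with the SS and df values from the table; the function name corrected_row is hypothetical, and small differences from the printed table can arise because ε is rounded to three decimals.

```python
# Values from the Tests of Within-Subjects Effects table.
ss_time, ss_error = 443.73, 110.94
df_time, df_error = 2, 58

# Epsilon of 1.0 means no correction.
epsilons = {
    "Sphericity Assumed": 1.0,
    "Greenhouse-Geisser": 0.936,
    "Huynh-Feldt": 1.0,
}


def corrected_row(epsilon):
    """Return corrected (df_time, df_error, F) for a given epsilon."""
    df1 = epsilon * df_time
    df2 = epsilon * df_error
    # Epsilon cancels in the ratio of mean squares, so F is unchanged.
    f_ratio = (ss_time / df1) / (ss_error / df2)
    return df1, df2, f_ratio


for label, eps in epsilons.items():
    df1, df2, f_ratio = corrected_row(eps)
    print(f"{label}: df = ({df1:.3f}, {df2:.2f}), F = {f_ratio:.1f}")
```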
Partial eta-squared of .80 means that 80% of the variance in strength that is attributable to within-person sources (time + error, excluding between-person baseline differences) is explained by the time effect. This is a very large effect by Cohen’s (1988) benchmarks (.01 small, .06 medium, .14 large). The within-subjects design is efficient here because participants differ considerably from one another in overall strength, while each participant’s relative standing is highly stable across time points. Removing this between-subjects variability from the error term makes the time effect stand out very clearly.
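The partitioning described above can be made concrete with a small hand computation. The following sketch uses four invented participants (not the tutorial data) and a hypothetical function rm_anova to split the total sum of squares into time, subjects, and error components, assuming sphericity. It is meant to show where the between-subjects variability goes, not to replace the SPSS procedure.

```python
# Hypothetical wide-format scores: one row per participant,
# columns are pre, mid, post (kg). Not from core_session.csv.
scores = [
    [60.0, 62.0, 65.0],
    [75.0, 78.0, 80.0],
    [90.0, 91.0, 96.0],
    [82.0, 85.0, 87.0],
]


def rm_anova(data):
    """One-way repeated measures ANOVA (sphericity assumed)."""
    n = len(data)      # number of participants
    k = len(data[0])   # number of time points
    grand_mean = sum(x for row in data for x in row) / (n * k)
    time_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_time = n * sum((m - grand_mean) ** 2 for m in time_means)
    # Between-subjects variability is removed from the error term.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_time - ss_subjects

    df_time = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_time / df_time) / (ss_error / df_error)
    return {
        "ss_total": ss_total,
        "ss_time": ss_time,
        "ss_subjects": ss_subjects,
        "ss_error": ss_error,
        "df_time": df_time,
        "df_error": df_error,
        "F": f_ratio,
        "eta_p2": ss_time / (ss_time + ss_error),
    }


result = rm_anova(scores)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this toy example almost all of the total variability sits in ss_subjects, so the error term left over after removing it is small and the time effect is large by comparison.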
S.4.4 Table 4: Pairwise comparisons (Bonferroni-corrected)
| Comparison | Mean Difference (kg) | SE | p (adjusted) | 95% CI |
|---|---|---|---|---|
| Pre → Mid | −2.02 | 0.27 | < .001 | [−2.70, −1.34] |
| Pre → Post | −5.38 | 0.33 | < .001 | [−6.22, −4.54] |
| Mid → Post | −3.36 | 0.45 | < .001 | [−4.50, −2.22] |
Interpreting this table:
All three pairwise comparisons are statistically significant after Bonferroni correction. Strength increased significantly from pre to mid (2.02 kg gain), from mid to post (3.36 kg gain), and from pre to post (5.38 kg total gain). The confidence intervals are narrow, indicating high precision in the estimates of change. Negative signs in the mean difference column reflect the direction of subtraction (earlier time − later time); reverse the sign for reporting (mid − pre = +2.02 kg).
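Reversing the sign of a difference also reverses and swaps the confidence limits, which is easy to get wrong by hand. The sketch below applies that flip to the three rows of the pairwise table; the helper name as_gain is hypothetical.

```python
def as_gain(mean_diff, ci_lower, ci_upper):
    """Convert an (earlier - later) difference into a (later - earlier) gain.

    Negating the difference negates both limits and swaps their order.
    """
    return -mean_diff, -ci_upper, -ci_lower


# (mean difference, CI lower, CI upper) as printed by SPSS, relabeled
# here by the gain each row becomes after the flip.
spss_rows = {
    "mid - pre": (-2.02, -2.70, -1.34),
    "post - pre": (-5.38, -6.22, -4.54),
    "post - mid": (-3.36, -4.50, -2.22),
}

gains = {label: as_gain(*row) for label, row in spss_rows.items()}
for label, (gain, lower, upper) in gains.items():
    print(f"{label}: {gain:+.2f} kg, 95% CI [{lower:.2f}, {upper:.2f}]")
```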
S.5 Part 4: Computing effect sizes
S.5.1 Partial eta-squared (η²_p)
SPSS reports η²_p directly in the within-subjects effects table when you check Estimates of effect size in the Options dialog. Read the value straight from the Partial Eta Squared column — no additional calculation is needed. From our output: η²_p = .800.
The underlying formula, shown here for conceptual understanding only, is:
\[\eta^2_p = \frac{SS_{\text{time}}}{SS_{\text{time}} + SS_{\text{error}}}\]
where \(SS_{\text{error}}\) is the time × subjects error term from the Within-Subjects Effects table.
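As a check, substituting the values from the Tests of Within-Subjects Effects table (\(SS_{\text{time}} = 443.73\), \(SS_{\text{error}} = 110.94\)) reproduces the value SPSS prints:

\[\eta^2_p = \frac{443.73}{443.73 + 110.94} = \frac{443.73}{554.67} \approx .800\]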
S.5.2 Partial omega-squared (ω²_p)
SPSS does not report ω²_p directly, but you do not need to compute it by hand. Two convenient options are available:
- Statistical Calculators appendix — Use the interactive effect size calculator in the Statistical Calculators appendix. Enter the SS and MS values from the SPSS output table and it returns ω²_p instantly.
- SPSS 31 and later — The built-in power analysis module (Analyze → Power Analysis → General Linear Model) reports ω²_p alongside η²_p for repeated measures designs.
For reference, the formula requires three values from the Within-Subjects Effects table: \(SS_{\text{time}}\), \(df_{\text{time}}\), and \(MS_{\text{error}}\), plus the number of participants \(n\):
\[\omega^2_p = \frac{SS_{\text{time}} - df_{\text{time}} \times MS_{\text{error}}}{SS_{\text{time}} + (n \cdot k - df_{\text{time}}) \times MS_{\text{error}}}\]
where \(k\) is the number of time points. For our example (n = 30, k = 3), this yields ω²_p ≈ .72, which is still a very large effect and, as expected for a bias-corrected estimate, somewhat smaller than η²_p. Report ω²_p as the primary effect size in publications; include η²_p for comparability with other studies.
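If the calculator appendix is not at hand, the formula above is straightforward to evaluate directly. The sketch below plugs in the values from the Tests of Within-Subjects Effects table; the function name partial_omega_squared is hypothetical, and the code implements the formula exactly as written in this section rather than any alternative definition of ω²_p.

```python
def partial_omega_squared(ss_time, df_time, ms_error, n, k):
    """Evaluate the partial omega-squared formula given in this section."""
    numerator = ss_time - df_time * ms_error
    denominator = ss_time + (n * k - df_time) * ms_error
    return numerator / denominator


# Values from the Tests of Within-Subjects Effects table.
ss_time, ss_error = 443.73, 110.94
df_time, df_error = 2, 58
ms_error = ss_error / df_error

omega_p = partial_omega_squared(ss_time, df_time, ms_error, n=30, k=3)
eta_p = ss_time / (ss_time + ss_error)

print(f"partial eta-squared   = {eta_p:.3f}")
print(f"partial omega-squared = {omega_p:.3f}")
```

Because the formula subtracts from the numerator and adds to the denominator, the result is always smaller than η²_p computed from the same table, which is a useful sanity check on any hand calculation.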
S.5.3 Cohen’s d for pairwise comparisons
For each significant pairwise comparison, compute Cohen’s d from the Bonferroni output or from the difference scores directly:
| Comparison | Mean Diff | SD of Diff | Cohen’s d | Interpretation |
|---|---|---|---|---|
| Mid − Pre | 2.02 | 1.46 | 1.38 | Large |
| Post − Pre | 5.38 | 1.81 | 2.97 | Very large |
| Post − Mid | 3.36 | 2.47 | 1.36 | Large |
The SD of the difference scores is not shown in the Bonferroni output. To obtain it in SPSS, create the difference variable using Transform → Compute Variable (e.g., diff_mid_pre = strength_mid - strength_pre) and then run Analyze → Descriptive Statistics → Descriptives on that new variable. SPSS will report the SD directly in the output — no hand arithmetic needed. Repeat for each pairwise comparison, then divide the mean difference by the SD to obtain Cohen’s d.
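The same calculation can be sketched outside SPSS. The example below uses Python's standard statistics module on five invented participants (not the tutorial data) and then repeats the final division using the mean differences and SDs from the table above. The function name cohens_d_paired is hypothetical.

```python
from statistics import mean, stdev


def cohens_d_paired(earlier, later):
    """Cohen's d for paired data: mean difference / SD of the differences."""
    diffs = [b - a for a, b in zip(earlier, later)]
    # stdev uses the n - 1 denominator, matching the SD that SPSS reports.
    return mean(diffs) / stdev(diffs)


# Hypothetical scores for five participants (kg).
strength_pre = [70.0, 82.5, 91.0, 77.5, 85.0]
strength_mid = [72.5, 83.0, 94.0, 79.0, 88.5]
d_example = cohens_d_paired(strength_pre, strength_mid)
print(f"toy data: d = {d_example:.2f}")

# Final step with the tutorial's table values: (mean difference, SD of difference).
table_rows = {
    "Mid - Pre": (2.02, 1.46),
    "Post - Pre": (5.38, 1.81),
    "Post - Mid": (3.36, 2.47),
}
table_d = {label: round(md / sd, 2) for label, (md, sd) in table_rows.items()}
for label, d in table_d.items():
    print(f"{label}: d = {d:.2f}")
```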
S.6 Part 5: Creating a visualization
SPSS produces a basic estimated marginal means plot automatically when you request it in the Plots dialog. To improve it:
- Double-click the chart in the output to open it in the Chart Editor
- Add error bars: Elements → Error Bars → 95% CI or Standard Error
- Adjust axis labels and titles as needed
- Right-click → Export to save as PNG or PDF
For a more polished visualization, use the R code provided in Chapter 15 (the line plot and spaghetti plot figures).
S.7 Part 6: APA-style write-up
Using the output from this tutorial, a complete APA-style report reads:
“A one-way repeated measures ANOVA was conducted to examine the effect of training time on muscular strength (kg) in 30 participants enrolled in a 12-week resistance training program. Mauchly’s test indicated that the sphericity assumption was not violated, W = .932, χ²(2) = 2.13, p = .054. The within-subjects effect of time was statistically significant, F(2, 58) = 116.0, p < .001, η²_p = .80, ω²_p = .72. Descriptive statistics indicated progressive strength gains from pre-training (M = 79.7, SD = 12.3 kg) to mid-training at six weeks (M = 81.7, SD = 12.3 kg) and post-training at twelve weeks (M = 85.1, SD = 12.5 kg). Bonferroni-corrected pairwise comparisons confirmed that all three time points differed significantly: pre vs. mid, mean difference = 2.02 kg, p < .001, 95% CI [1.34, 2.70]; mid vs. post, mean difference = 3.36 kg, p < .001, 95% CI [2.22, 4.50]; and pre vs. post, mean difference = 5.38 kg, p < .001, 95% CI [4.54, 6.22].”
S.8 Part 7: Checking the normality assumption
Repeated measures ANOVA requires normality of the difference scores between each pair of time points, not of the raw scores.
Step 1: Create difference score variables
Go to Transform → Compute Variable:
- diff_mid_pre = strength_mid - strength_pre
- diff_post_pre = strength_post - strength_pre
- diff_post_mid = strength_post - strength_mid
Step 2: Run Shapiro-Wilk tests on each difference variable
Go to Analyze → Descriptive Statistics → Explore:
- Move all three difference variables to the Dependent List
- Click Plots, check Normality plots with tests
- Click Continue → OK
SPSS will produce Shapiro-Wilk W statistics, Q-Q plots, and histograms for each difference variable.
Interpreting the output:
- p > .05 for all three difference variables → normality is not violated; proceed with the repeated measures ANOVA
- p < .05 for one or more difference variables with n < 30 → consider the nonparametric Friedman’s ANOVA (Chapter 19)
- p < .05 with n ≥ 30 → with larger samples the Shapiro-Wilk test can flag trivial departures from normality; inspect the Q-Q plot visually and, if the departure looks minor, treat the ANOVA as robust given the larger sample
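The three branches above can be summarized as a small decision function. This is only a sketch of the rule of thumb stated in this section: the function name normality_recommendation is hypothetical, the p-values passed in are illustrative rather than real output, and the visual inspection of the Q-Q plots still has to be done by eye.

```python
def normality_recommendation(shapiro_p_values, n, alpha=0.05):
    """Map Shapiro-Wilk p-values for the difference scores to a next step."""
    if all(p > alpha for p in shapiro_p_values):
        return "Normality not violated: proceed with repeated measures ANOVA"
    if n < 30:
        return "Consider the nonparametric Friedman's ANOVA"
    return "Inspect the Q-Q plots; ANOVA is likely robust at this sample size"


# Illustrative p-values for the three difference variables (not real output).
print(normality_recommendation([0.41, 0.22, 0.09], n=30))
print(normality_recommendation([0.41, 0.03, 0.09], n=18))
print(normality_recommendation([0.41, 0.03, 0.09], n=30))
```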
S.9 Troubleshooting common issues
“SPSS won’t let me define the within-subject factor”
Make sure you clicked Add after typing the factor name and number of levels, before clicking Define. Both steps are required.
“I get an error about unequal group sizes”
The repeated measures GLM requires complete data for every participant across all time points. Check for missing values using Analyze → Descriptive Statistics → Frequencies with the Display frequency tables option. Participants with any missing time point will be excluded from the analysis by SPSS (listwise deletion). Address missing data using imputation or alternative software if the exclusions are substantial.
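To gauge how much data listwise deletion would remove before running the analysis, it can help to list the affected participants explicitly. The sketch below does this for three invented wide-format records, with None standing in for a missing value; the function name split_complete_cases and the data are hypothetical.

```python
# Hypothetical wide-format records; None marks a missing measurement.
wide_rows = [
    {"id": 1, "strength_pre": 70.0, "strength_mid": 72.5, "strength_post": 75.0},
    {"id": 2, "strength_pre": 88.0, "strength_mid": None, "strength_post": 93.0},
    {"id": 3, "strength_pre": 81.0, "strength_mid": 83.5, "strength_post": 86.0},
]

TIME_VARS = ("strength_pre", "strength_mid", "strength_post")


def split_complete_cases(rows, variables=TIME_VARS):
    """Separate complete cases from participants missing any time point."""
    complete, dropped_ids = [], []
    for row in rows:
        if any(row.get(var) is None for var in variables):
            dropped_ids.append(row["id"])
        else:
            complete.append(row)
    return complete, dropped_ids


complete_rows, dropped_ids = split_complete_cases(wide_rows)
print(f"Complete cases: {len(complete_rows)} of {len(wide_rows)}")
print(f"Excluded participant ids: {dropped_ids}")
```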
“Mauchly’s test cannot be computed”
If you have only two levels in your within-subjects factor, SPSS does not compute Mauchly’s test because a two-level factor automatically satisfies sphericity. This is expected — proceed with the sphericity-assumed F-row.
“The within-subjects effects table shows ‘Greenhouse-Geisser’ but I’m not sure which row to report”
You must base your choice on Mauchly’s p-value and ε_GG before inspecting the F-rows. If p > .05 → Sphericity Assumed. If p < .05 and ε_GG ≥ .75 → Huynh-Feldt. If p < .05 and ε_GG < .75 → Greenhouse-Geisser. Never choose based on which row gives the smallest p-value.
“The pairwise comparisons show different signs than I expected”
SPSS computes differences in the order the time points were defined (e.g., Time 1 − Time 2 = Pre − Mid). If the training group improves over time, these differences will be negative, because an earlier (lower) score minus a later (higher) score is negative. Reverse the sign when describing gains in your write-up.
S.10 Practice exercises
Use the core_session.csv dataset to complete the following exercises. Filter to the appropriate group before beginning each exercise.
1. Training group, VO₂max: Test whether VO₂max (mL·kg⁻¹·min⁻¹) changed significantly from pre to mid to post in the training group (n = 25 with complete data). Report Mauchly’s test, the appropriate F-row, Bonferroni comparisons, and η²_p.
2. Control group, strength: Run the same repeated measures ANOVA for muscular strength in the control group. Compare the pattern of results to the training group. What do the F-ratio and η²_p tell you about change in the control condition?
3. Training group, agility: Conduct a repeated measures ANOVA for agility T-test time (s) in the training group across pre, mid, and post. Note that lower scores indicate better agility. Report the direction of change (i.e., are times getting faster?) and compute Cohen’s d for each pairwise comparison.
4. Write-up practice: Using the results from Exercise 1 (VO₂max, training group), write a complete APA-style results paragraph including Mauchly’s test, the omnibus F with effect size, and all three pairwise comparisons with confidence intervals.