Appendix R — SPSS Tutorial: One-Way ANOVA

Conducting one-way between-subjects ANOVA in SPSS

Note: Learning Objectives

By the end of this tutorial, you will be able to:

  • Conduct a one-way between-subjects ANOVA in SPSS
  • Check the homogeneity of variance assumption using Levene’s test
  • Select Welch’s ANOVA or Brown-Forsythe ANOVA when variances are unequal
  • Run and interpret Tukey HSD and Games-Howell post hoc tests
  • Compute and interpret η² and ω² effect sizes
  • Create visualizations for multi-group comparisons
  • Report results following APA guidelines

R.1 Overview

One-way ANOVA is the standard procedure for comparing a continuous dependent variable across three or more independent groups. SPSS provides tools for running the analysis, checking homogeneity of variance, conducting post hoc tests, and (in version 27 and later) estimating effect sizes, most of them within the One-Way ANOVA dialog. This tutorial demonstrates:

  • How to set up data correctly for one-way ANOVA in SPSS
  • How to run the analysis and request assumption checks
  • How to read the Descriptives and ANOVA source tables
  • How to select and interpret post hoc tests
  • How to compute and report η² and ω²
  • How to create effective visualizations for group comparisons

Understanding SPSS output for ANOVA requires careful attention to multiple tables: the Descriptives table, Levene’s test, the ANOVA source table, and the post hoc comparisons. This tutorial walks through each in sequence.

Prerequisites: Familiarity with SPSS data entry, basic hypothesis testing (Chapter 10), and t-tests (Chapter 13).

R.2 Dataset for this tutorial

We will use the Core Dataset (core_session.csv). Download it here: core_session.csv

For this tutorial, we will conduct the following analyses:

  1. One-way ANOVA: Compare vo2_mlkgmin (VO₂max in mL·kg⁻¹·min⁻¹) across three group levels: control, resistance, and endurance, at post-training
  2. Assumption checks: Levene’s test for homogeneity of variance, normality screening
  3. Post hoc tests: Tukey HSD (equal variances) and Games-Howell (unequal variances)
  4. Effect sizes: η² and ω²

Open the file in SPSS using File → Open → Data, select the CSV file type, and import using the Text Import Wizard. Filter to post-training observations before proceeding: Data → Select Cases → If: time = 'post'
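
For readers who want to cross-check the filtering step outside SPSS, the following is a minimal Python sketch of the same idea: keep only the post-training rows, then collect the dependent variable by group. The column names (group, time, vo2_mlkgmin) and the 'post' value come from this tutorial; the sample rows are hypothetical stand-ins, not values from core_session.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows standing in for core_session.csv. Only the column
# names and the 'pre'/'post' labels are taken from the tutorial.
SAMPLE_CSV = """group,time,vo2_mlkgmin
control,pre,39.8
control,post,40.9
resistance,pre,41.2
resistance,post,45.0
endurance,pre,42.5
endurance,post,52.1
"""

def post_scores_by_group(csv_text, dv="vo2_mlkgmin"):
    """Keep rows where time == 'post' and collect the DV values per group."""
    scores = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["time"] == "post":
            scores[row["group"]].append(float(row[dv]))
    return dict(scores)

post = post_scores_by_group(SAMPLE_CSV)
print(post)
```

Each key in the result is one level of the grouping variable, which mirrors the one-row-per-participant layout that the One-Way ANOVA dialog expects after Select Cases has been applied.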

R.3 Part 1: Running one-way ANOVA

R.3.1 Data structure

For one-way ANOVA, your SPSS data file should have:

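  • One grouping variable (e.g., group) with three or more values (e.g., control, resistance, endurance). The One-Way ANOVA dialog accepts only numeric factors, so if group was imported as a string, recode it to numeric codes with value labels first (Transform → Automatic Recode…)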
  • One continuous dependent variable (e.g., vo2_mlkgmin)
  • Each row represents one participant

R.3.2 Procedure

  1. Analyze → Compare Means → One-Way ANOVA…
  2. Move vo2_mlkgmin to Dependent List
  3. Move group to Factor
  4. Click Options…
    • Check Descriptive (group means and SDs)
    • Check Homogeneity of variance test (Levene’s test)
    • Check Welch and Brown-Forsythe (robust tests for when variances are unequal)
    • Check Means plot
    • Click Continue
  5. Click Post Hoc…
    • Under Equal Variances Assumed, check Tukey
    • Under Equal Variances Not Assumed, check Games-Howell
    • Click Continue
  6. Click OK

Tip: Requesting effect sizes directly

In SPSS version 27 and later, you can request η² and ω² directly from the One-Way ANOVA dialog by selecting Estimate effect size for overall tests (the exact label may vary by version). If your version does not offer this option, calculate them from the source table (see Part 3 below).

R.3.3 Interpreting the output

SPSS produces several tables. Work through them in order.

R.3.3.1 Table 1: Descriptives

The Descriptives table reports the mean, standard deviation, standard error, and 95% confidence interval for each group. Review this table first to understand the direction of any differences.

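| Group | N | Mean | Std. Deviation | Std. Error | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|
| Control | 20 | 40.6 | 6.4 | 1.43 | 37.6 | 43.6 |
| Resistance | 20 | 45.2 | 6.2 | 1.39 | 42.3 | 48.1 |
| Endurance | 20 | 52.4 | 6.8 | 1.52 | 49.2 | 55.6 |
| Total | 60 | 46.1 | 7.9 | 1.02 | 44.0 | 48.1 |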

Note: Values are illustrative and based on simulated data matching the chapter example.

Reading this table: The endurance group has the highest mean VO₂max (52.4), followed by resistance (45.2) and control (40.6). The confidence intervals for endurance and control do not overlap, which points to a difference between these groups; the ANOVA and post hoc tables test it formally.
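
The Std. Error and confidence interval columns follow directly from each group's mean, SD, and n. The sketch below is one way to check them by hand, assuming the usual formulas SE = SD / √n and CI = mean ± t × SE. The critical value 2.093 (two-sided 95%, df = 19) is taken from a standard t table rather than computed, and the function name is illustrative.

```python
import math

# Two-sided 95% critical t for df = n - 1 = 19, read from a t table.
T_CRIT_DF19 = 2.093

def mean_ci(mean, sd, n, t_crit):
    """Return (standard error, CI lower, CI upper) for one group mean."""
    se = sd / math.sqrt(n)
    half_width = t_crit * se
    return se, mean - half_width, mean + half_width

# Group means and SDs as shown in the Descriptives table (n = 20 each).
descriptives = {
    "Control": (40.6, 6.4),
    "Resistance": (45.2, 6.2),
    "Endurance": (52.4, 6.8),
}

for name, (m, sd) in descriptives.items():
    se, lower, upper = mean_ci(m, sd, 20, T_CRIT_DF19)
    print(f"{name}: SE = {se:.2f}, 95% CI [{lower:.1f}, {upper:.1f}]")
```

For the control group this gives SE = 1.43 and a 95% CI of [37.6, 43.6], matching the table.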

R.3.3.2 Table 2: Test of Homogeneity of Variances (Levene’s Test)

| | Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|---|
| Based on Mean | 0.24 | 2 | 57 | .787 |

Reading this table: Levene’s test assesses H₀: all group variances are equal. Here, F(2, 57) = 0.24, p = .787. Because p > .05, we fail to reject H₀: there is no evidence that the group variances differ, so the homogeneity of variance assumption is treated as met. We proceed with the standard ANOVA and Tukey HSD post hoc tests.
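
To see what the "Based on Mean" statistic measures, it can help to compute it by hand on a small example. The sketch below assumes the standard mean-based definition of Levene's test: a one-way ANOVA on the absolute deviations of each score from its own group mean. The two data sets are hypothetical, and the function returns only the statistic and its degrees of freedom, not the p-value.

```python
from statistics import fmean

def levene_mean(groups):
    """Levene's statistic based on absolute deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = []
    for g in groups:
        m = fmean(g)
        z.append([abs(y - m) for y in g])
    z_means = [fmean(zi) for zi in z]
    z_grand = fmean([v for zi in z for v in zi])
    # One-way ANOVA carried out on the Z values
    ss_between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    ss_within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2

# Hypothetical scores, not taken from core_session.csv.
similar_spread = [[39.0, 41.0, 43.0], [44.0, 46.0, 48.0], [50.0, 52.0, 54.0]]
growing_spread = [[40.0, 41.0, 42.0], [42.0, 45.0, 48.0], [47.0, 52.0, 57.0]]

print(levene_mean(similar_spread))
print(levene_mean(growing_spread))
```

When every group has the same spread, the statistic is zero regardless of how far apart the means are; it grows as the spreads diverge. This is why a small Levene statistic is reassuring even when the group means clearly differ.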

Warning: When Levene’s test is significant (p < .05)

If Levene’s test yields p < .05, the equal variance assumption is violated. In this case:

  • Use the Welch or Brown-Forsythe robust test results (reported in a separate SPSS table) rather than the standard ANOVA F
  • Use Games-Howell post hoc tests instead of Tukey HSD
  • Report the Welch or Brown-Forsythe statistic and its corrected degrees of freedom

R.3.3.3 Table 3: ANOVA source table

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 1248.6 | 2 | 624.3 | 14.87 | < .001 |
| Within Groups | 2391.4 | 57 | 41.95 | | |
| Total | 3640.0 | 59 | | | |

Reading this table:

  • Between Groups F = 14.87, p < .001 → reject H₀. There is a statistically significant effect of training group on VO₂max.
  • df_between = k − 1 = 3 − 1 = 2
  • df_within = N − k = 60 − 3 = 57
  • Report as: F(2, 57) = 14.87, p < .001
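
The relationships among the columns of the source table can be checked with a few lines of arithmetic. This is a minimal sketch, assuming only the standard definitions (MS = SS / df, F = MS_between / MS_within); the function name is illustrative and the p-value is not computed here.

```python
def anova_source_table(ss_between, ss_within, k, n_total):
    """Rebuild df, mean squares, and F from the sums of squares."""
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Sums of squares as printed in the ANOVA source table (k = 3 groups, N = 60).
table = anova_source_table(1248.6, 2391.4, k=3, n_total=60)
for key, value in table.items():
    print(f"{key}: {value:.2f}")
```

With the sums of squares as printed, the mean squares match the table (624.3 and 41.95) and F comes out near 14.88, within about 0.01 of the 14.87 shown in the illustrative output.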

R.3.3.4 Table 4: Robust Tests of Equality of Means (Welch / Brown-Forsythe)

SPSS prints this table whenever Welch and Brown-Forsythe are checked under Options. Report these statistics in place of the standard F when Levene’s test is significant. In this example they are not needed, but the table appears as:

| | Statistic | df1 | df2 | Sig. |
|---|---|---|---|---|
| Welch | 14.93 | 2 | 38.2 | < .001 |
| Brown-Forsythe | 14.87 | 2 | 56.8 | < .001 |

When you use these statistics, report the corrected df₂ (shown above as 38.2 for Welch) rather than the standard df_within.
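
To show where the non-integer df₂ comes from, the sketch below computes Welch's F from summary statistics using the standard textbook formula, in which each group is weighted by n / SD². The input values are hypothetical and chosen to have clearly unequal spreads; they are not the tutorial's data, and the output is not meant to reproduce the table above.

```python
def welch_anova(ns, means, sds):
    """Welch's F, df1, and df2 from group sizes, means, and SDs."""
    k = len(ns)
    # Precision weights: groups with smaller variance count for more
    w = [n / sd ** 2 for n, sd in zip(ns, sds)]
    w_sum = sum(w)
    weighted_mean = sum(wi * m for wi, m in zip(w, means)) / w_sum
    numerator = sum(wi * (m - weighted_mean) ** 2 for wi, m in zip(w, means)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (n - 1) for wi, n in zip(w, ns))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, k - 1, df2

# Hypothetical summary statistics with unequal spreads (n = 20 per group).
f_welch, df1, df2 = welch_anova(
    ns=[20, 20, 20],
    means=[40.0, 45.0, 52.0],
    sds=[3.0, 6.0, 10.0],
)
print(f"Welch F({df1}, {df2:.1f}) = {f_welch:.2f}")
```

Because df₂ depends on how the variances and sample sizes are distributed across groups, it is usually fractional and smaller than N − k, which is why it must be reported as SPSS prints it.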

R.4 Part 2: Post hoc tests

Because the omnibus F-test is significant, post hoc comparisons identify which specific group pairs differ.

R.4.1 Tukey HSD output (equal variances assumed)

SPSS reports all pairwise comparisons. Focus on the Mean Difference, Sig., and 95% Confidence Interval columns.

| (I) group | (J) group | Mean Difference (I−J) | Std. Error | Sig. | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|
| Control | Resistance | −4.60 | 2.05 | .082 | −9.7 | 0.5 |
| | Endurance | −11.80* | 2.05 | < .001 | −16.9 | −6.7 |
| Resistance | Control | 4.60 | 2.05 | .082 | −0.5 | 9.7 |
| | Endurance | −7.20* | 2.05 | .003 | −12.3 | −2.1 |
| Endurance | Control | 11.80* | 2.05 | < .001 | 6.7 | 16.9 |
| | Resistance | 7.20* | 2.05 | .003 | 2.1 | 12.3 |

* Mean difference is significant at the .05 level.

Reading this table:

  • Endurance vs. Control: mean difference = 11.8 mL·kg⁻¹·min⁻¹, p < .001, CI [6.7, 16.9] → significant
  • Endurance vs. Resistance: mean difference = 7.2 mL·kg⁻¹·min⁻¹, p = .003, CI [2.1, 12.3] → significant
  • Resistance vs. Control: mean difference = 4.6 mL·kg⁻¹·min⁻¹, p = .082, CI [−0.5, 9.7] → not significant at α = .05
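
The Mean Difference and Std. Error columns of the Tukey table can be reproduced from the group means and MS_within. The sketch below assumes the usual pooled standard error for a pairwise difference, √(MS_within × (1/nᵢ + 1/nⱼ)). It deliberately stops there: the Sig. and confidence interval columns depend on the studentized range distribution, which is not computed here.

```python
import math
from itertools import combinations

MS_WITHIN = 41.95  # Mean Square, Within Groups, from the ANOVA source table
means = {"Control": 40.6, "Resistance": 45.2, "Endurance": 52.4}
n = {group: 20 for group in means}

def pairwise_differences(means, n, ms_within):
    """Mean difference (I - J) and pooled standard error for each pair."""
    rows = []
    for a, b in combinations(means, 2):
        diff = means[a] - means[b]
        se = math.sqrt(ms_within * (1 / n[a] + 1 / n[b]))
        rows.append((a, b, diff, se))
    return rows

for a, b, diff, se in pairwise_differences(means, n, MS_WITHIN):
    print(f"{a} - {b}: difference = {diff:.2f}, SE = {se:.2f}")
```

With equal group sizes every pair shares the same standard error (2.05 here), so the ordering of the p-values simply follows the size of the mean differences.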

Note: Homogeneous Subsets table

SPSS also produces a Homogeneous Subsets table that groups together means that do not differ significantly. Groups in the same subset column are statistically similar to one another. In this example, control and resistance would appear in Subset 1, while endurance appears in Subset 2 — confirming that endurance training produced distinctly higher VO₂max.

R.4.2 Games-Howell output (equal variances not assumed)

If Levene’s test had been significant, you would interpret the Games-Howell table using the same logic as above, but with adjusted standard errors and corrected degrees of freedom that account for unequal variances.

R.5 Part 3: Computing effect sizes

R.5.1 Eta-squared (η²)

SPSS may report η² directly, or you can compute it from the source table:

\[ \eta^2 = \frac{SS_{\text{between}}}{SS_{\text{total}}} = \frac{1248.6}{3640.0} = .343 \]

Approximately 34% of the total variance in VO₂max is accounted for by training group — a large effect by Cohen’s (1988) benchmarks (.01 small, .06 medium, .14 large).

R.5.2 Omega-squared (ω²)

ω² provides a less biased estimate and is the preferred effect size to report:

\[ \omega^2 = \frac{SS_{\text{between}} - (k-1) \cdot MS_{\text{within}}}{SS_{\text{total}} + MS_{\text{within}}} = \frac{1248.6 - 2 \times 41.95}{3640.0 + 41.95} = \frac{1164.7}{3681.95} = .316 \]

ω² = .32 is slightly lower than η² = .34, reflecting the expected downward correction for bias.
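
Both effect sizes are simple functions of the source table, so they are easy to script when SPSS does not report them. The following is a minimal sketch of the two formulas used in this part; the function names are illustrative.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variance attributed to the grouping factor."""
    return ss_between / ss_total

def omega_squared(ss_between, ss_total, ms_within, k):
    """Less biased estimate; subtracts the variance expected by chance."""
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)

# Values from the ANOVA source table (k = 3 groups).
eta2 = eta_squared(1248.6, 3640.0)
omega2 = omega_squared(1248.6, 3640.0, ms_within=41.95, k=3)
print(f"eta squared   = {eta2:.3f}")
print(f"omega squared = {omega2:.3f}")
```

The script returns .343 and .316, matching the hand calculations. ω² is always the smaller of the two for the same table, because its numerator removes (k − 1) × MS_within and its denominator adds MS_within.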

Tip: Getting effect sizes automatically in SPSS 27+

In newer SPSS versions, select Estimate effect size for overall tests in the One-Way ANOVA dialog (the exact label may vary by version) to request η² and ω² automatically. These appear in a separate effect sizes table in the output. Always prefer ω² for your primary report.

R.6 Part 4: Creating visualizations in SPSS

R.6.1 Means plot

The means plot generated automatically via Options → Means plot shows group means connected by a line. While useful for a quick overview, it does not show variability. For publication-quality figures, create error bar plots manually.

R.6.2 Error bar plot

  1. Graphs → Legacy Dialogs → Error Bar…
  2. Select Simple and Summaries for groups of cases, then click Define
  3. Move vo2_mlkgmin to Variable
  4. Move group to Category Axis
  5. Under Bars Represent, choose Confidence interval for mean (Level: 95%)
  6. Click OK

Error bar plots show each group mean together with its 95% confidence interval.

R.6.3 Box plot

  1. Graphs → Legacy Dialogs → Boxplot…
  2. Select Simple and Summaries for groups of cases, then click Define
  3. Move vo2_mlkgmin to Variable
  4. Move group to Category Axis
  5. Click OK

Box plots show the full distribution (median, IQR, outliers) for each group, complementing the error bar plot.
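
The quantities a box plot displays can also be computed directly. The sketch below assumes the common 1.5 × IQR rule for flagging outliers and uses Python's "inclusive" quartile method; SPSS may compute the box edges slightly differently, so treat small discrepancies as a difference in method rather than an error. The scores are hypothetical.

```python
from statistics import quantiles

def box_summary(values):
    """Quartiles, IQR fences, and points flagged as outliers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }

# Hypothetical VO2max scores for one group, with one unusually high value.
scores = [38.1, 40.2, 41.0, 42.3, 43.5, 44.8, 46.0, 47.2, 61.5]
summary = box_summary(scores)
print(summary)
```

Any value listed under outliers would be drawn as a separate point beyond the whiskers, which is a prompt to check that case for data entry errors before running the ANOVA.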

R.7 Part 5: APA-style write-up

Using the output above, a complete APA-style results section would read:

A one-way between-subjects ANOVA examined the effect of training group (control, resistance, endurance) on VO₂max (mL·kg⁻¹·min⁻¹) following a 12-week intervention. Levene’s test indicated that the assumption of equal variances was met, F(2, 57) = 0.24, p = .787. The ANOVA revealed a significant effect of training group on VO₂max, F(2, 57) = 14.87, p < .001, η² = .34, ω² = .32. Post hoc comparisons using Tukey HSD indicated that the endurance group (M = 52.4, SD = 6.8) had significantly higher VO₂max than both the resistance group (M = 45.2, SD = 6.2), p = .003, 95% CI [2.1, 12.3], and the control group (M = 40.6, SD = 6.4), p < .001, 95% CI [6.7, 16.9]. The resistance and control groups did not differ significantly, p = .082, 95% CI [−0.5, 9.7].

R.8 Part 6: Checking normality per group

R.8.1 Procedure

  1. Analyze → Descriptive Statistics → Explore…
  2. Move vo2_mlkgmin to Dependent List
  3. Move group to Factor List
  4. Click Plots…
    • Check Normality plots with tests
    • Check Histogram
    • Click Continue
  5. Click OK

R.8.2 Interpreting the output

SPSS produces histograms, Q-Q plots, and the Shapiro-Wilk test for each group. For the Shapiro-Wilk test, p > .05 indicates no significant departure from normality within that group. Visually, Q-Q plots should show points falling approximately on the diagonal reference line.
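
A Q-Q plot pairs each sorted observation with the value a normal distribution would be expected to produce at the same rank. The sketch below builds those pairs, assuming Blom's plotting position, (i − 3/8) / (n + 1/4); this is one common convention and is labeled here as an assumption, since other software and settings may use a different one. The scores are hypothetical.

```python
from statistics import NormalDist

def qq_points(values):
    """Pair each sorted score with its expected standard normal quantile."""
    n = len(values)
    std_normal = NormalDist()  # mean 0, SD 1
    points = []
    for i, y in enumerate(sorted(values), start=1):
        p = (i - 0.375) / (n + 0.25)  # Blom's plotting position (assumed)
        points.append((std_normal.inv_cdf(p), y))
    return points

# Hypothetical scores for a single group.
scores = [44.1, 39.7, 41.3, 46.8, 42.5, 40.6, 43.2]
for expected_z, observed in qq_points(scores):
    print(f"expected z = {expected_z:+.2f}, observed = {observed}")
```

If the group is approximately normal, plotting observed against expected z gives points close to a straight line; systematic curvature at the ends indicates skew or heavy tails.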

Note: ANOVA is robust to mild non-normality

With n ≥ 20 per group, one-way ANOVA is robust to moderate departures from normality, particularly when group sizes are approximately equal[1]. If Shapiro-Wilk is non-significant and Q-Q plots look reasonably straight, proceed with ANOVA. If normality is severely violated with small samples, consider the Kruskal-Wallis test (see the nonparametric chapter).

R.9 Troubleshooting common issues

Issue: Levene’s test is significant (p < .05)

Use the Welch or Brown-Forsythe row in the Robust Tests table instead of the standard ANOVA F. Use Games-Howell for post hoc comparisons.

Issue: One group has a very small n (< 10)

Small groups reduce power and make normality checks less reliable. Consider whether groups can be balanced through additional data collection, or use the Kruskal-Wallis test as a robust alternative.

Issue: Post hoc tests show significant differences not visible in the descriptives

Check whether the CI for the mean difference excludes zero — if it does, the difference is significant. Small but consistent mean differences can be statistically significant with larger samples.

Issue: The omnibus F is non-significant but group means look different in the plot

This usually reflects insufficient power (small sample size). Report the effect size and confidence intervals and note that the study may have been underpowered to detect this magnitude of difference.

Issue: All post hoc comparisons are non-significant even though F is significant

This occasionally occurs because the omnibus F and pairwise post hoc tests have different sensitivities. The omnibus test may detect a pattern that no single pairwise comparison captures on its own. Report the overall significant F alongside the non-significant pairwise results, and discuss which specific contrasts were examined.

R.10 Practice exercises

  1. Using core_session.csv, compare sprint_20m_s (20-m sprint time) across the three group levels at post-training. State your hypotheses, check Levene’s test, and run appropriate post hoc tests. Report the results in APA format.

  2. Run the same analysis but filter to the control group only and compare sprint performance across three time points (pre, mid, post). Note: this requires a different test — which one, and why?

  3. Suppose Levene’s test for the sprint analysis returns p = .018. Re-run the analysis using the Welch F-test and Games-Howell post hoc tests. How do the conclusions compare?

  4. Calculate η² and ω² from the source table for the sprint ANOVA. Which is larger, and why?