Appendix A — Core Dataset Overview

Mixed Neuromuscular Training Study in Recreationally Active Adults

A.1 Purpose of this page

This page is an orientation guide to the Core Dataset used throughout the book. It focuses on the study story, file structure, and how to navigate repeated measures. For precise variable definitions, units, allowed values, and missing-data conventions, use the Core Dataset Codebook.

Tip: Where to look for variable details

Use the Core Dataset Codebook whenever you need to confirm:

  • what a variable represents
  • its units
  • its type and measurement scale
  • allowed values or plausible ranges
  • missing-value conventions

A.2 Study story (high level)

The Core Dataset represents a Movement Science study of recreationally active adults completing a 6-week mixed neuromuscular training program.

A.2.1 Design

  • Between-subjects factor: group (training vs control)
  • Within-subjects factor: time (pre, mid, post)
  • Some outcomes are collected as repeated trials: 3 trials per time point

The dataset is intentionally designed to support most topics in an introductory statistics sequence, including repeated measures, relationships and prediction, and reliability concepts.

Note: Important clarification

This dataset is synthetic: it was created for teaching and demonstration. It is designed to behave like realistic Movement Science data, but it does not represent real people.

A.3 Files included and when to use each

The Core Dataset is stored as three CSV files.

Tip: Downloading CSV files

Some web browsers may open CSV files as plain text instead of downloading them. To avoid this, right-click the link and choose “Save link as…” (the exact wording depends on your browser).

A.3.1 1) Participant table: core_participants.csv

Click to download: core_participants.csv

Use this file for describing the sample at baseline (participant characteristics).

  • One row per participant
  • Key: id

A.3.2 2) Session table: core_session.csv

Click to download: core_session.csv

Use this file for most analyses that treat each time point as a single observation per participant (for example, pre vs post comparisons, repeated measures designs, or correlations using session-level measures).

  • Up to three rows per participant (pre, mid, post)
  • Key: id + time

A.3.3 3) Trial table: core_trials.csv

Click to download: core_trials.csv

Use this file when you need to work with repeated attempts within each time point (for example, reliability, within-session patterns, or comparing trial summaries such as mean of 3 vs best of 3).

  • Up to nine rows per participant (3 time points × 3 trials)
  • Key: id + time + trial

A.4 Keys, identifiers, and repeated measures

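A key is the set of columns that uniquely identifies each row. Keys matter because combining or summarizing files on the wrong columns can silently duplicate or collapse rows. Knowing the key of each file (id, id + time, or id + time + trial) helps you handle repeated measures correctly.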
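One way to confirm that a set of columns really is a key is to count how often each combination of values occurs. The sketch below uses only the Python standard library and a handful of made-up rows; the helper name duplicate_keys and the id values (P01, P02) are illustrative assumptions, not part of the Core Dataset.

```python
from collections import Counter


def duplicate_keys(rows, key_cols):
    """Return the key tuples that appear in more than one row."""
    counts = Counter(tuple(row[col] for col in key_cols) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Made-up session-level rows for illustration only (not real dataset values).
session_rows = [
    {"id": "P01", "time": "pre"},
    {"id": "P01", "time": "mid"},
    {"id": "P01", "time": "post"},
    {"id": "P02", "time": "pre"},
    {"id": "P02", "time": "post"},
]

# id alone is not a key for the session table: each participant repeats.
dupes_id_only = duplicate_keys(session_rows, ["id"])

# id + time identifies every row exactly once, so nothing is reported.
dupes_id_time = duplicate_keys(session_rows, ["id", "time"])

print(dupes_id_only)
print(dupes_id_time)
```

An empty result for the intended key is a quick sanity check to run after loading a file and again after combining files.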

A.5 Unit of analysis reminder

Many statistical methods in this book assume the participant is the independent unit. Trials are repeated observations nested within participants. Treating trials as independent cases inflates the apparent sample size, which can make results look more certain than they are.
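A quick way to see the difference between rows and independent units is to count both. This is a minimal sketch with made-up identifiers (P01, P02 are assumptions); it builds the full 3 time points × 3 trials layout for two hypothetical participants.

```python
# Made-up trial-level rows: 2 participants x 3 time points x 3 trials.
trial_rows = [
    {"id": pid, "time": time, "trial": trial}
    for pid in ("P01", "P02")
    for time in ("pre", "mid", "post")
    for trial in (1, 2, 3)
]

# Number of rows in the trial table.
n_rows = len(trial_rows)

# Number of independent units (distinct participants).
n_participants = len({row["id"] for row in trial_rows})

print(n_rows, n_participants)
```

Here the table has 18 rows but only 2 participants, so an analysis that treats every row as a separate case would work with nine times more "cases" than there are people.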

A.6 Working with trial data: two common strategies

You will see two defensible approaches in later chapters, depending on the research question.

A.7 Strategy A: Keep trial-level data

Use trial-level data when you care about:

  • within-session patterns (fatigue, learning, adaptation across attempts)
  • measurement consistency and reliability concepts
  • variability across trials as part of performance
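As a rough illustration of Strategy A, the sketch below keeps the trial-level rows and describes what happens within one session: the change from the first to the last trial and the standard deviation across trials. The values are made up, and treating jump_cm as a trial-level column is an assumption; check the Codebook for the actual trial-level outcomes.

```python
from collections import defaultdict
from statistics import stdev

# Made-up trial-level rows; jump_cm as a trial outcome is an assumption.
trial_rows = [
    {"id": "P01", "time": "pre", "trial": 1, "jump_cm": 30.0},
    {"id": "P01", "time": "pre", "trial": 2, "jump_cm": 32.0},
    {"id": "P01", "time": "pre", "trial": 3, "jump_cm": 31.0},
]

# Collect trial values per session, indexed by trial number.
by_session = defaultdict(dict)
for row in trial_rows:
    by_session[(row["id"], row["time"])][row["trial"]] = row["jump_cm"]

within_session = {}
for key, trials in by_session.items():
    values = [trials[t] for t in sorted(trials)]  # order by trial number
    within_session[key] = {
        # Positive values mean the last attempt was higher than the first.
        "first_to_last_change": values[-1] - values[0],
        # Sample SD needs at least two trials.
        "sd_across_trials": stdev(values) if len(values) > 1 else None,
    }

print(within_session)
```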

A.8 Strategy B: Summarize trials into a session value

Use trial summaries when you want one value per participant per time point.

Common summaries:

  • mean of 3 trials (typical performance)
  • best of 3 trials (capacity)
  • median of 3 trials (robust typical performance)

Tip: Book convention

Unless a chapter is explicitly about trial-to-trial patterns or reliability, we will usually summarize trial outcomes into one session value (typically the mean of the trials) so that the analysis aligns with participant-level inference.
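The three summaries listed under Strategy B can be sketched in a few lines: group the trial rows by id + time, then compute the mean, best, and median for each group. The values are made up, jump_cm as a trial-level column is an assumption, and "best" is taken to be the maximum, which only makes sense for outcomes where higher is better.

```python
from collections import defaultdict
from statistics import mean, median

# Made-up trial-level rows; the second session has only two trials,
# since the trial table holds "up to" three trials per time point.
trial_rows = [
    {"id": "P01", "time": "pre", "trial": 1, "jump_cm": 30.0},
    {"id": "P01", "time": "pre", "trial": 2, "jump_cm": 34.0},
    {"id": "P01", "time": "pre", "trial": 3, "jump_cm": 32.0},
    {"id": "P01", "time": "post", "trial": 1, "jump_cm": 35.0},
    {"id": "P01", "time": "post", "trial": 2, "jump_cm": 36.0},
]

# Group trial values by the session key (id + time).
values_by_session = defaultdict(list)
for row in trial_rows:
    values_by_session[(row["id"], row["time"])].append(row["jump_cm"])

# One summary record per participant per time point.
session_summary = {
    key: {
        "n_trials": len(values),
        "mean": mean(values),
        "best": max(values),  # assumes higher scores are better
        "median": median(values),
    }
    for key, values in values_by_session.items()
}

print(session_summary)
```

Keeping n_trials alongside each summary makes it easy to spot sessions where a value is based on fewer than three attempts.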

A.10 Suggested first activities (for students)

  1. Identify the key for each file and explain why each key is different.
  2. Write one sentence describing the study design (between-subjects factor and within-subjects factor).
  3. Explain why trial data are not automatically independent observations.
  4. Choose a trial-based outcome and argue for mean of trials vs best of trials, depending on a research question.

A.11 Where this dataset appears in the book

  • Chapters 2–3: dataset structure, tables, visualization, screening
  • Chapters 4–7: center, variability, normal curve, standard scores
  • Chapters 8–9: sampling error, confidence intervals
  • Chapter 11: correlation and bivariate regression (strength_kg and jump_cm at the pre-training time point; also used in the SPSS Tutorial: Correlation and Bivariate Regression)
  • Chapters 10–16: t tests and ANOVA family, repeated measures, factorial designs, ANCOVA
  • Chapter 18: reliability using trial data
  • Chapter 19: nonparametric methods using ordinal and count outcomes

Tip: Next step

Keep this overview bookmarked. Use it when you feel unsure about which file to use or how repeated measures are structured. Use the Codebook when you need precise variable definitions and units.