Appendix B — Core Dataset Codebook

Variable Dictionary for the Mixed Neuromuscular Training Study

B.1 Purpose of this page

This codebook is the variable dictionary for the Core Dataset used throughout the book. It provides precise definitions, units, measurement scales, and missing-data conventions. It intentionally avoids repeating the study story and file navigation details, which are covered in the Core Dataset Overview.

B.2 Missing data conventions

This book treats missing data as meaningful and expects missingness to be documented when possible.

System missing: value truly absent (not collected, device failure, dropout)
User-coded missing (optional): a numeric code that records why the value is missing, used only if it can be applied consistently

If user-coded missing values are used, a simple set is recommended:

Code	Meaning
-9	not assessed (planned missing or not applicable)
-8	participant unable to perform
-7	equipment or recording failure
-6	protocol deviation (invalid trial)

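Use user-coded missing values only if you can keep them consistent across variables and files, and convert them to system missing before any calculation: the codes are ordinary numbers and would otherwise distort means, ranges, and change scores. If you cannot keep them consistent, prefer system missing with notes in a separate log.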
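As an illustration of the convention above, the following sketch separates user-coded missing values into a clean numeric column and a parallel reason column. It assumes pandas and NumPy; the function name `split_missing_codes`, the shortened reason labels, and the example sprint times are hypothetical and are not part of the Core Dataset.

```python
import numpy as np
import pandas as pd

# Codes from the table in B.2; labels are shortened here for use as categories.
MISSING_CODES = {
    -9: "not_assessed",
    -8: "unable_to_perform",
    -7: "equipment_failure",
    -6: "protocol_deviation",
}

def split_missing_codes(values: pd.Series) -> pd.DataFrame:
    """Return a cleaned numeric column plus a parallel reason column."""
    is_coded = values.isin(list(MISSING_CODES))
    # Coded entries become system missing (NaN) so they cannot enter calculations.
    cleaned = values.astype("float64").mask(is_coded)
    # Reason is None for real values and for entries that were already system missing.
    reason = pd.Series([MISSING_CODES.get(v) for v in values], index=values.index)
    return pd.DataFrame({"value": cleaned, "missing_reason": reason})

# Hypothetical sprint times: two coded entries and one system-missing entry.
sprint = pd.Series([3.21, -7, 3.05, np.nan, -9], name="sprint_20m_s")
result = split_missing_codes(sprint)
print(result)
print("mean of valid values:", result["value"].mean())
```

Keeping the reason in its own column preserves the documentation of missingness while leaving the outcome column safe to summarize.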

B.3 Identifiers and design variables

Variable	Description	Units	Type	Scale	Allowed values
id	participant identifier	none	categorical	nominal	unique
group	intervention assignment	none	categorical	nominal	training, control
time	measurement time point	none	categorical	ordinal	pre, mid, post
trial	trial number	none	discrete	ordinal	1, 2, 3
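The allowed values in this table can be checked mechanically before analysis. The sketch below assumes the data are loaded into a pandas DataFrame with the column names listed above; the helper `find_invalid_design_values` and the demonstration rows are hypothetical.

```python
import pandas as pd

# Allowed values copied from the B.3 table.
ALLOWED = {
    "group": {"training", "control"},
    "time": {"pre", "mid", "post"},
    "trial": {1, 2, 3},
}

def find_invalid_design_values(df: pd.DataFrame) -> dict:
    """Map each design variable to the sorted list of unexpected values found."""
    problems = {}
    for column, allowed in ALLOWED.items():
        observed = set(df[column].dropna().unique().tolist())
        unexpected = observed - allowed
        if unexpected:
            problems[column] = sorted(unexpected, key=str)
    return problems

# Hypothetical rows; "Post" (capitalized) and trial 4 are deliberate errors.
demo = pd.DataFrame({
    "id": ["P01", "P01", "P02", "P02"],
    "group": ["training", "training", "control", "control"],
    "time": ["pre", "Post", "pre", "post"],
    "trial": [1, 2, 3, 4],
})
problems = find_invalid_design_values(demo)
print(problems)
```

Category labels are compared exactly, so differences in capitalization or stray whitespace are reported as unexpected values rather than silently accepted.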

B.4 Participant descriptors

Variable	Description	Units	Type	Scale	Allowed values / range
age_years	age at baseline	years	continuous	ratio	plausible adult range
sex_category	self-identified sex category	none	categorical	nominal	categories as collected
height_cm	standing height	cm	continuous	ratio	plausible adult range
mass_kg	body mass	kg	continuous	ratio	plausible adult range
training_age_years	training history	years	continuous	ratio	0 and up

B.5 Session-level outcomes

B.5.1 Performance

Variable	Description	Units	Type	Scale	Allowed values / range
sprint_20m_s	20 m sprint time	s	continuous	ratio	plausible range
strength_kg	lower-body strength (e.g., leg press 1RM)	kg	continuous	ratio	plausible range
agility_ttest_s	T-test agility time	s	continuous	ratio	plausible range

B.5.2 Physiology

Variable	Description	Units	Type	Scale	Allowed values / range
vo2_mlkgmin	aerobic capacity estimate	mL·kg⁻¹·min⁻¹	continuous	ratio	plausible range
hr_rest_bpm	resting heart rate	bpm	continuous	ratio	plausible range
rpe_6_20	perceived exertion	none	ordinal	ordinal	integers 6–20

B.5.3 Clinical and self-report

Variable	Description	Units	Type	Scale	Allowed values / range
pain_0_10	pain intensity rating	none	ordinal	ordinal	integers 0–10
function_0_100	function score	none	bounded	interval_like	0–100

B.5.4 Balance (count outcome)

Variable	Description	Units	Type	Scale	Allowed values / range
balance_errors_count	number of balance errors	count	discrete	ratio_like	0 and up
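For the session-level variables whose allowed ranges are stated explicitly (the two rating scales, the function score, and the error count), a range check might look like the sketch below. It assumes pandas; the `RULES` table restates the bounds from B.5, while the function name `flag_out_of_range` and the demonstration values are hypothetical. Variables listed only as "plausible range" are left out because the codebook gives no numeric limits for them.

```python
import pandas as pd

# (lower bound, upper bound, integers only), taken from the B.5 tables.
# None means the codebook states no upper bound.
RULES = {
    "rpe_6_20": (6, 20, True),
    "pain_0_10": (0, 10, True),
    "function_0_100": (0, 100, False),
    "balance_errors_count": (0, None, True),
}

def flag_out_of_range(df: pd.DataFrame) -> pd.DataFrame:
    """Return a boolean frame: True where a non-missing value violates the codebook."""
    flags = pd.DataFrame(False, index=df.index, columns=list(RULES))
    for column, (low, high, integers_only) in RULES.items():
        values = df[column]
        bad = values < low                      # comparisons with NaN are False
        if high is not None:
            bad = bad | (values > high)
        if integers_only:
            bad = bad | (values.notna() & (values % 1 != 0))
        flags[column] = bad
    return flags

# Hypothetical values with one deliberate violation per column.
demo = pd.DataFrame({
    "rpe_6_20": [13, 21, None],
    "pain_0_10": [0, 3.5, 10],
    "function_0_100": [87.5, 101, 0],
    "balance_errors_count": [0, 4, -1],
})
flags = flag_out_of_range(demo)
print(flags)
```

System-missing entries are not flagged. User-coded missing values such as -9 would be flagged as out of range, so a check like this is best run after those codes have been converted to system missing.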

B.6 Trial-level outcomes

Variable	Description	Units	Type	Scale	Allowed values / range
jump_height_cm	countermovement jump height	cm	continuous	ratio	plausible range
peak_force_n	peak force during task	N	continuous	ratio	plausible range
emg_rms_uv	EMG RMS amplitude	µV	continuous	ratio	plausible range
sway_area_cm2	sway area during balance task	cm²	continuous	ratio	plausible range

Expected distribution shapes

Some trial-level variables (for example, EMG amplitude and sway area) are often right-skewed in real datasets. Always plot the distribution before choosing a method that assumes symmetry.
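A quick numeric and text-based look at skew can accompany a proper plot. The sketch below uses simulated lognormal values as a stand-in for `emg_rms_uv`; the data, the seed, and the distribution parameters are assumptions chosen for illustration and are not study data. It assumes pandas and NumPy and does not replace a histogram or density plot drawn with a plotting library.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
# Simulated stand-in for emg_rms_uv; NOT study data.
emg = pd.Series(rng.lognormal(mean=4.0, sigma=0.6, size=500), name="emg_rms_uv")

summary = {
    "mean": emg.mean(),
    "median": emg.median(),
    "skewness": emg.skew(),                    # sample skewness; > 0 suggests a right tail
    "skewness_after_log": np.log(emg).skew(),  # a log transform often reduces right skew
}
for name, value in summary.items():
    print(f"{name:>20}: {value:.2f}")

# Crude text histogram: one row per bin, one '#' per 5 observations.
counts, edges = np.histogram(emg, bins=10)
for count, left in zip(counts, edges[:-1]):
    print(f"{left:8.1f} | {'#' * (int(count) // 5)}")
```

A mean noticeably above the median together with positive skewness is a common signature of a right-skewed variable.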

B.7 Derived variables (created during analysis)

These variables are typically created in later chapters rather than stored in the raw dataset.

Derived variable	Definition	Typical use
change_post_pre	post − pre for an outcome	change scores and effect sizes
percent_change	100 × (post − pre) / pre (undefined when pre = 0)	practical interpretation
mean_of_trials	mean within id and time across trials	session summaries
best_of_trials	maximum within id and time across trials (minimum for outcomes where lower is better)	capacity summaries
z_score	(value − reference group mean) / reference group SD	standard scores and percentiles
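The definitions in this table can be sketched in pandas as follows. The jump heights are invented for illustration, and the choice of reference group for the z-score (all participants at pre, using the sample SD) is an assumption rather than a rule stated by this codebook.

```python
import pandas as pd

# Hypothetical trial-level rows for two participants (values invented for illustration).
trials = pd.DataFrame({
    "id":    ["P01"] * 6 + ["P02"] * 6,
    "time":  (["pre"] * 3 + ["post"] * 3) * 2,
    "trial": [1, 2, 3] * 4,
    "jump_height_cm": [30.0, 31.0, 32.0, 33.0, 34.0, 35.0,
                       40.0, 41.0, 42.0, 40.0, 42.0, 44.0],
})

# mean_of_trials and best_of_trials: summarize within id and time across trials.
sessions = (
    trials.groupby(["id", "time"])["jump_height_cm"]
    .agg(mean_of_trials="mean", best_of_trials="max")
    .reset_index()
)

# Reshape to one row per participant so pre and post sit side by side.
wide = sessions.pivot(index="id", columns="time", values="mean_of_trials")
wide["change_post_pre"] = wide["post"] - wide["pre"]
wide["percent_change"] = 100 * wide["change_post_pre"] / wide["pre"]

# z_score: reference group assumed to be all participants at pre (sample SD, ddof=1).
ref_mean = wide["pre"].mean()
ref_sd = wide["pre"].std(ddof=1)
wide["z_score_pre"] = (wide["pre"] - ref_mean) / ref_sd

print(sessions)
print(wide)
```

Computing change scores from session summaries rather than from single trials keeps each participant's contribution to one value per time point.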