Chapter 11: Correlation and Bivariate Regression
Student Resources
I use the 4 “P’s” framework to help you learn the material in this chapter: Prepare, Practice, Participate, and Perform. To increase your chances of succeeding in this course, I strongly encourage you to complete all four “P’s” for each chapter.
1 Prepare
1.1 Chapter Overview
This chapter introduces correlation and bivariate regression—essential tools for quantifying and modeling relationships between variables in Movement Science. You’ll learn how to compute and interpret Pearson’s correlation coefficient, fit a linear regression model, and use these methods responsibly to examine associations and make predictions from movement data.
1.2 Multimedia Resources
The following table provides access to video and slide resources for this chapter. Click the links to open them in an overlay for better viewing on all devices.
| Resource | Description | Link |
|---|---|---|
| Long Video Overview | A detailed video explaining correlation and bivariate regression, interpreting Pearson’s r, and computing regression models in movement science research. | 🔗 Watch Video |
| Slide Overview PDF | PDF slides that serve as an overview of this chapter. Read these before the textbook to introduce the main concepts and vocabulary. | 🔗 Download PDF |
| Slide Deck HTML | Interactive HTML slides for class. During class, the instructor controls the presentation; after class, review at your own pace. | 🔗 Open Slides |
| Slide Deck PDF | PDF version of the slide deck for download and offline viewing. | 🔗 Download PDF |
1.3 Read the Chapter
Read Chapter 11 of Weir & Vincent (2021) and Chapter 11 of Furtado (2026) to understand how to quantify relationships between variables using correlation and bivariate regression.
To succeed in this course, you must read the textbook chapters assigned for each topic. This is the only way to learn the material in depth.
Once done, proceed to the next section to practice what you learned.
2 Practice
Practicing what you learned in the chapter is essential to mastering the material. The resources below will help you do that.
2.1 Frequently Asked Questions
**What is correlation?**

Correlation measures the strength and direction of the linear relationship between two continuous variables. It quantifies the degree to which two variables tend to change together systematically—that is, whether knowing the value of one variable helps predict the value of the other. Positive correlation means higher values of one variable tend to occur with higher values of the other (e.g., leg strength and vertical jump height). Negative correlation means higher values of one variable tend to occur with lower values of the other (e.g., body mass and endurance performance). A correlation of zero indicates no systematic linear relationship.
**How is Pearson’s correlation coefficient (\(r\)) interpreted?**

Pearson’s correlation coefficient (\(r\)) is the most common measure of correlation. It ranges from \(-1\) to \(+1\):

- \(r = +1\): Perfect positive linear relationship
- \(r = -1\): Perfect negative linear relationship
- \(r = 0\): No linear relationship
- \(|r| > 0.7\): Strong correlation (approximate)
- \(0.4 < |r| < 0.7\): Moderate correlation (approximate)
- \(|r| < 0.4\): Weak correlation (approximate)
These thresholds are approximate and context-dependent. Always interpret the magnitude of \(r\) relative to what is theoretically expected and practically meaningful in your specific research domain.
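As a concrete illustration, Pearson’s \(r\) can be computed in Python with `scipy.stats.pearsonr`. The leg-strength and jump-height values below are hypothetical, chosen only to show the mechanics:

```python
import numpy as np
from scipy import stats

# Hypothetical data: leg strength (kg) and vertical jump height (cm)
strength = np.array([120, 135, 150, 160, 175, 190, 205, 220])
jump = np.array([38.0, 41.0, 45.0, 46.0, 50.0, 53.0, 57.0, 60.0])

# pearsonr returns the correlation coefficient and its two-sided p-value
r, p = stats.pearsonr(strength, jump)
print(f"r = {r:.2f}, p = {p:.4f}")
```

Because these made-up values increase together almost perfectly, \(r\) comes out close to \(+1\); real movement data are noisier.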
**Does correlation imply causation?**

No. This is the most important limitation of correlation. A strong correlation between two variables does not mean that changes in one variable cause changes in the other. Possible explanations for a correlation include:

1. Confounding variables: A third, unmeasured variable influences both (e.g., hot weather causes both increased ice cream sales and more drownings).
2. Reverse causation: The assumed direction of causation may be backwards.
3. Spurious correlations: The association may be coincidental, with no meaningful connection.
Establishing causation requires experimental evidence: random assignment, manipulation of the independent variable, and control of confounders. Always use cautious language: “X is associated with Y,” not “X causes Y.”
**Can Pearson’s \(r\) detect non-linear relationships?**

No. Pearson’s \(r\) quantifies only linear associations. Two variables may have a strong, systematic non-linear relationship yet show a weak or near-zero \(r\). For example, the relationship between exercise intensity and lactate concentration is exponential, and the Yerkes-Dodson inverted-U relationship between arousal and performance would yield \(r \approx 0\) despite a clear pattern. This is why plotting your data in a scatterplot before computing \(r\) is essential: visual inspection reveals nonlinearity that \(r\) cannot detect.
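A small sketch makes this concrete: the synthetic inverted-U data below follow a perfect quadratic pattern, yet Pearson’s \(r\) is essentially zero.

```python
import numpy as np

# Synthetic inverted-U pattern: "performance" peaks at moderate "arousal"
arousal = np.linspace(-3, 3, 61)    # centered so the curve is symmetric
performance = 9 - arousal ** 2      # perfect quadratic relationship

r = np.corrcoef(arousal, performance)[0, 1]
print(f"r = {r:.3f}")  # near zero despite a perfectly systematic pattern
```

A scatterplot of these data would immediately reveal the curve that \(r\) misses.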
**What factors can distort the magnitude of \(r\)?**

Several factors can artificially influence the magnitude of \(r\):

- Outliers: A single extreme data point can dramatically inflate or deflate \(r\). Always inspect scatterplots for outliers.
- Restriction of range: If the range of values for one or both variables is artificially narrowed (e.g., studying only elite athletes), \(r\) will be attenuated.
- Measurement error: Unreliable measurements reduce the observed \(r\) relative to the true relationship.
- Sample size: Small samples produce unstable \(r\) estimates with wide confidence intervals.
**What is bivariate (simple) linear regression?**

Bivariate (simple) linear regression models the relationship between one predictor variable (\(X\)) and one outcome variable (\(Y\)) using a straight line:

\[\hat{Y} = b_0 + b_1 X\]

where \(b_0\) is the intercept (the predicted value of \(Y\) when \(X = 0\)) and \(b_1\) is the slope (the change in predicted \(Y\) for a one-unit increase in \(X\)). The regression line is determined by the least squares criterion: it minimizes the sum of squared differences between observed and predicted values. Regression is used when the goal is prediction, whereas correlation quantifies the strength of association.
**How do I interpret the slope and intercept?**

- Slope (\(b_1\)): For every one-unit increase in \(X\), the predicted \(Y\) changes by \(b_1\) units. A positive slope indicates a positive relationship; a negative slope indicates a negative relationship. The slope has units (units of \(Y\) per unit of \(X\)).
- Intercept (\(b_0\)): The predicted value of \(Y\) when \(X = 0\). The intercept is often not directly interpretable if \(X = 0\) is outside the range of the data.
Example: If the regression equation for predicting jump height (cm) from leg strength (kg) is \(\hat{Y} = 10.2 + 0.44X\), then for every 1 kg increase in leg strength, predicted jump height increases by 0.44 cm.
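A fit like this can be obtained with `scipy.stats.linregress`. The data below are hypothetical and were constructed to lie close to \(\hat{Y} = 10.2 + 0.44X\):

```python
import numpy as np
from scipy import stats

# Hypothetical data built to sit near jump = 10.2 + 0.44 * strength
strength = np.array([100, 120, 140, 160, 180, 200, 220, 240])
jump = np.array([54.0, 63.0, 72.0, 80.0, 89.0, 98.0, 107.0, 116.0])

fit = stats.linregress(strength, jump)
print(f"intercept b0 = {fit.intercept:.2f} cm")
print(f"slope     b1 = {fit.slope:.2f} cm per kg")

# Prediction: plug a new X value into the fitted equation
print(f"predicted jump at 150 kg: {fit.intercept + 150 * fit.slope:.1f} cm")
```

Note the units in the output: the slope is in cm of jump height per kg of leg strength, exactly as the interpretation above describes.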
**What is \(R^2\) and what does it tell me?**

\(R^2\) (the coefficient of determination) is the square of Pearson’s \(r\) and represents the proportion of variance in \(Y\) explained by \(X\). It ranges from 0 to 1:

- \(R^2 = 0.64\) means 64% of the variability in \(Y\) is accounted for by \(X\)
- The remaining \(1 - R^2\) is unexplained (residual) variance
\(R^2\) is a measure of effect size and practical importance, not just statistical significance. A statistically significant \(r\) can have a small \(R^2\) in large samples, meaning the predictor accounts for little practical variance.
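The identity \(R^2 = r^2\), and its variance-explained interpretation \(R^2 = 1 - SS_{res}/SS_{tot}\), can be verified numerically. The paired observations below are hypothetical:

```python
import numpy as np

# Hypothetical paired observations
x = np.array([2.0, 3.5, 4.1, 5.0, 6.2, 7.3])
y = np.array([1.1, 2.9, 3.0, 4.4, 5.1, 6.8])

r = np.corrcoef(x, y)[0, 1]

# R² from the least-squares line: 1 - SS_residual / SS_total
b1, b0 = np.polyfit(x, y, 1)               # slope, intercept
ss_res = np.sum((y - (b0 + b1 * x)) ** 2)  # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)       # total variation in Y
r_squared = 1 - ss_res / ss_tot

print(f"r² = {r**2:.3f}, 1 - SS_res/SS_tot = {r_squared:.3f}")  # identical
```

Both routes give the same number, which is why squaring a reported \(r\) immediately tells you the proportion of variance explained.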
**What assumptions do correlation and regression share?**

Both methods assume:

1. Linearity: The relationship between \(X\) and \(Y\) is linear (check with a scatterplot)
2. Independence: Observations are independent of one another
3. Homoscedasticity: The variance of the residuals is constant across all levels of \(X\) (check with a residual plot)
4. Normality of residuals: For inference (hypothesis tests, CIs), the residuals should be approximately normally distributed
Violating these assumptions—especially linearity and homoscedasticity—can produce misleading results. Always inspect plots before trusting numerical output.
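As a rough numeric sketch of assumption checking (normally done visually with a residual plot), the code below fits a line to hypothetical data and compares residual spread across the lower and upper halves of \(X\):

```python
import numpy as np

# Hypothetical data; values roughly follow a straight line
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1])

b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

# With an intercept in the model, OLS residuals always average to zero;
# homoscedasticity concerns their *spread* across the range of X.
low, high = residuals[:4], residuals[4:]
print(f"residual SD (low X):  {low.std(ddof=1):.3f}")
print(f"residual SD (high X): {high.std(ddof=1):.3f}")
```

Comparable spread in the two halves is consistent with homoscedasticity; a markedly larger spread at one end (a fan shape in a residual plot) would signal a violation.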
**How do I report correlation and regression results?**

Correlation: Report \(r\), the degrees of freedom (\(n - 2\)), and the p-value or confidence interval:

- “Leg strength was significantly correlated with vertical jump height, \(r(6) = .99\), \(p < .001\).”

Regression: Report the equation, \(R^2\), and the significance of the model:

- “Leg strength significantly predicted vertical jump height (\(b = 0.44\), \(\beta = .99\)), \(R^2 = .98\), \(F(1, 6) = 314.2\), \(p < .001\).”

Always include a scatterplot with the fitted regression line when reporting regression results.
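The quantities in such a report are linked: in bivariate regression \(\beta = r\), \(R^2 = r^2\), the degrees of freedom are \(n - 2\), and \(F(1,\, n-2) = \frac{R^2}{1 - R^2}(n - 2)\). The sketch below computes them together from a hypothetical \(n = 8\) dataset:

```python
import numpy as np
from scipy import stats

# Hypothetical n = 8 dataset, so df = n - 2 = 6 as in the example report
strength = np.array([110, 125, 140, 155, 170, 185, 200, 215])
jump = np.array([40.0, 44.0, 47.0, 52.0, 55.0, 60.0, 63.0, 68.0])

fit = stats.linregress(strength, jump)
df = len(strength) - 2
r2 = fit.rvalue ** 2
F = r2 / (1 - r2) * df  # F statistic for the overall bivariate model

print(f"r({df}) = {fit.rvalue:.2f}, p = {fit.pvalue:.4f}")
print(f"R² = {r2:.2f}, F(1, {df}) = {F:.1f}")
```

These printed values slot directly into the APA-style sentences above (the specific numbers here reflect the made-up data, not a real study).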
2.2 Test your Knowledge
Take this low-stakes quiz to test your knowledge of the material in this chapter. This quiz is for practice only and will help you identify areas where you may need additional review.
3 Participate
This section includes activities and discussions that will be completed during class time. Your active participation is essential for deepening your understanding of the material.
During class, we will:

- Construct and interpret scatterplots to visualize bivariate relationships
- Compute and interpret Pearson’s correlation coefficient for Movement Science datasets
- Distinguish between correlation and causation using real-world examples
- Fit a bivariate regression model and interpret the slope, intercept, and \(R^2\)
- Identify violations of assumptions (linearity, homoscedasticity) using residual plots
- Practice reporting correlation and regression results in APA format
4 Perform
4.1 Apply Your Learning
Now that you’ve prepared, practiced, and participated, it’s time to demonstrate your mastery of the material through assignments and assessments.
I strongly encourage you to complete the previous “Ps” (Prepare, Practice, Participate) before attempting any assignments or assessments associated with this chapter.