References

1. Moore, D. S., McCabe, G. P., & Craig, B. A. (2021). Introduction to the practice of statistics (10th ed.). W. H. Freeman and Company.
2. Agresti, A. (2003). Categorical data analysis.
3. Conover, W. J. (1999). Practical nonparametric statistics.
4. Good, P. I. (2005). Permutation tests: A practical guide to resampling methods for testing hypotheses.
5. Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. PLoS Biology, 13(3), e1002106. https://doi.org/10.1371/journal.pbio.1002106
6. Hoenig, J. M., & Heisey, D. M. (2001). ABCs of alpha, beta, delta, and epsilon. Ecology, 82(12), 3369–3372. https://doi.org/10.1890/0012-9658(2001)082[3369:AOABDE]2.0.CO;2
7. Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
8. Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Review of Social Psychology, 25(1), 60–75. https://doi.org/10.1080/10463283.2014.922662
9. Levene, H. (1960). Robust tests for equality of variances. Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, 278–292.
10. Osborne, J. (2002). Notes on the use of data transformations. Practical Assessment, Research & Evaluation, 8(6). https://scholarworks.umass.edu/pare/vol8/iss1/6
11. Portney, L. G., & Watkins, M. P. (2020). Foundations of clinical research: Applications to practice.
12. Senn, S. (2002). Letter to the editor: Cross-over trials in clinical research. Statistics in Medicine, 21(19), 2843–2844. https://doi.org/10.1002/sim.1097
13. Thomas, L. (2015). How to estimate power and sample size. Trauma Surgery & Acute Care Open, 1(1), e000005. https://doi.org/10.1136/tsaco-2015-000005
14. Vincent, W. J. (2005). Statistics in kinesiology.
15. Zimmerman, D. W. (2004). A note on preliminary tests of equality of variances. British Journal of Mathematical and Statistical Psychology, 57(1), 173–181. https://doi.org/10.1348/000711004849222
16. Austin, P. C. (2015). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 50(3), 399–424. https://doi.org/10.1080/00273171.2015.1128582
17. Babyak, M. A. (2004). What you see may not be what you get: A brief, nontechnical introduction to overfitting in regression-type models. Psychosomatic Medicine, 66(3), 411–421. https://doi.org/10.1097/01.psy.0000127692.23278.a9
18. Bahr, R., Andersen, T. E., Løken, S., Myklebust, G., & Engebretsen, L. (2005). Biomechanics of lumbar intervertebral disk injuries. Medicine & Science in Sports & Exercise, 37(2), 193–199. https://doi.org/10.1249/01.mss.0000152737.17598.0b
19. Bobbert, M. F. (2000). Why is the force–velocity relationship in leg press tasks quasi-linear rather than hyperbolic? Journal of Applied Biomechanics, 16(4), 304–315. https://doi.org/10.1123/jab.16.4.304
20. Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences.
21. Cormie, P., McGuigan, M. R., & Newton, R. U. (2011). Acute resistance training and changes in neuromuscular and morphological characteristics. Sports Medicine, 41(7), 557–575. https://doi.org/10.2165/11590380-000000000-00000
22. Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J., Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder, B., Skidmore, A. K., Zurell, D., & Lautenbach, S. (2013). Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27–46. https://doi.org/10.1111/j.1600-0587.2012.07348.x
23. Fox, J. (2015). Applied regression analysis and generalized linear models.
24. Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and other stories.
25. Harrell, F. E. (2015). Regression modeling strategies: With applications to linear models, logistic and ordinal regression, and survival analysis.
26. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction.
27. Jackson, D. L. (1990). Structural equation modeling: A multidisciplinary journal. Structural Equation Modeling: A Multidisciplinary Journal, 1(1), 1–2. https://doi.org/10.1080/10705519409539975
28. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning: With applications in R.
29. Jurca, G., Bootman, J. L., & Sokol, M. C. (2005). Assessing claims of treatment effectiveness: Is there a need for a new paradigm? Value in Health, 8(6), 727–734. https://doi.org/10.1111/j.1524-4733.2005.00058.x
30. Miles, S. (2014). A framework for understanding organizational ethics. Business Ethics: A European Review, 23(2), 154–167. https://doi.org/10.1111/beer.12044
31. Pearl, J. (2009). Causality: Models, reasoning, and inference.
32. Rohrer, J. M. (2018). Thinking clearly about correlations and causation: Graphical causal models for observational data. Advances in Methods and Practices in Psychological Science, 1(1), 27–42. https://doi.org/10.1177/2515245917745629
33. Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289–310. https://doi.org/10.1214/10-STS330
34. Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics.
35. White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48(4), 817–838. https://doi.org/10.2307/1912934
36. Whittingham, M. J., Stephens, P. A., Bradbury, R. B., & Freckleton, R. P. (2006). Why do we still use stepwise modelling in ecology and behaviour? Journal of Animal Ecology, 75(5), 1182–1189. https://doi.org/10.1111/j.1365-2656.2006.01141.x
37. Willy, R. W., & Meira, E. P. (2019). The ’best’ way to build strength: An evidence-based approach to building muscle and strength. International Journal of Sports Physical Therapy, 14(6), 839–850. https://doi.org/10.26603/ijspt20190839
38. Winter, D. A. (2009). Biomechanics and motor control of human movement.
39. Tukey, J. W. (1977). Exploratory data analysis. Addison-Wesley.
40. Wilcox, R. R. (2017). Introduction to robust estimation and hypothesis testing (4th ed.). Academic Press.
41. von Hippel, P. T. (2005). Mean, median, and skew: Correcting a textbook rule. Journal of Statistics Education, 13(2). https://doi.org/10.1080/10691898.2005.11910570
42. Bland, J. M., & Altman, D. G. (1996). Transformations, means, and confidence intervals. BMJ, 312(7038), 1079. https://doi.org/10.1136/bmj.312.7038.1079
43. Limpert, E., Stahel, W. A., & Abbt, M. (2001). Log-normal distributions across the sciences: Keys and clues. BioScience, 51(5), 341–352. https://doi.org/10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2
44. Stergiou, N., Harbourne, R. T., & Cavanaugh, J. T. (2006). Optimal movement variability: A new theoretical perspective for neurologic physical therapy. Journal of Neurologic Physical Therapy, 30(3), 120–129.
45. Stergiou, N., & Decker, L. M. (2011). Human movement variability, nonlinear dynamics, and pathology: Is there a connection? Human Movement Science, 30, 869–888. https://doi.org/10.1016/j.humov.2011.06.002
46. Hopkins, W. G. (2000). Measures of reliability in sports medicine and science. Sports Medicine, 30(1), 1–15. https://doi.org/10.2165/00007256-200030010-00001
47. Atkinson, G., & Nevill, A. M. (1998). Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Medicine, 26(4), 217–238. https://doi.org/10.2165/00007256-199826040-00002
48. Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). SAGE Publications.
49. Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105(1), 156–166. https://doi.org/10.1037/0033-2909.105.1.156
50. Blanca, M. J., Alarcón, R., Arnau, J., Bono, R., & Bendayan, R. (2013). Non-normal data: Is ANOVA still a valid option? Psicothema, 25(4), 552–557. https://doi.org/10.7334/psicothema2013.552
51. Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3-4), 591–611. https://doi.org/10.1093/biomet/52.3-4.591
52. Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21–33.
53. Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists should by default use Welch’s t-test instead of Student’s t-test. International Review of Social Psychology, 30(1), 92–101. https://doi.org/10.5334/irsp.82
54. Ghasemi, A., & Zahediasl, S. (2012). Normality tests for statistical analysis: A guide for non-statisticians. International Journal of Endocrinology and Metabolism, 10(2), 486–489. https://doi.org/10.5812/ijem.3505
55. Bulmer, M. G. (1979). Principles of statistics.
56. Joanes, D. N., & Gill, C. A. (1998). Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 47(1), 183–189. https://doi.org/10.1111/1467-9884.00122
57. Westfall, P. H. (2014). Kurtosis as peakedness, 1905–2014. R.I.P. The American Statistician, 68(3), 191–195. https://doi.org/10.1080/00031305.2014.917055
58. Ho, J., Tumkaya, T., Aryal, S., Choi, H., & Claridge-Chang, A. (2019). Moving beyond p values: Data analysis with estimation graphics. Nature Methods, 16, 565–566. https://doi.org/10.1038/s41592-019-0470-3
59. Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The importance of the normality assumption in large public health data sets. Annual Review of Public Health, 23, 151–169. https://doi.org/10.1146/annurev.publhealth.23.100901.140546
60. Vincent, W. J. (1999). Statistics in kinesiology.
61. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
62. Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7–29. https://doi.org/10.1177/0956797613504966
63. Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. Routledge.
64. Batterham, A. M., & Hopkins, W. G. (2006). Making meaningful inferences about magnitudes. International Journal of Sports Physiology and Performance, 1(1), 50–57. https://doi.org/10.1123/ijspp.1.1.50
65. Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: Implications for training of researchers. Psychological Methods, 1(2), 115–129. https://doi.org/10.1037/1082-989X.1.2.115
66. Kline, R. B. (2013). Beyond significance testing: Statistics reform in the behavioral sciences.
67. Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. https://doi.org/10.3389/fpsyg.2013.00863
68. Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365–376. https://doi.org/10.1038/nrn3475
69. Krzywinski, M., & Altman, N. (2013). Points of significance: Importance of being uncertain. Nature Methods, 10(9), 809–810. https://doi.org/10.1038/nmeth.2613
70. Gardner, M. J., & Altman, D. G. (1986). Confidence intervals rather than p values: Estimation rather than hypothesis testing. BMJ, 292(6522), 746–750. https://doi.org/10.1136/bmj.292.6522.746
71. Bland, J. M., & Altman, D. G. (1996). Statistics notes: The use of transformation when comparing two means. BMJ, 312(7039), 1153. https://doi.org/10.1136/bmj.312.7039.1153
72. Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594–604. https://doi.org/10.1037/0003-066X.54.8.594
73. Morey, R. D., Hoekstra, R., Rouder, J. N., Lee, M. D., & Wagenmakers, E.-J. (2016). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin & Review, 23, 103–123. https://doi.org/10.3758/s13423-015-0947-8
74. Hopkins, W. G., Marshall, S. W., Batterham, A. M., & Hanin, J. (2009). Progressive statistics for studies in sports medicine and exercise science. Medicine & Science in Sports & Exercise, 41(1), 3–13. https://doi.org/10.1249/MSS.0b013e31818cb278
75. Nakagawa, S., & Cuthill, I. C. (2007). Effect size, confidence interval and statistical significance: A practical guide for biologists. Biological Reviews, 82, 591–605. https://doi.org/10.1111/j.1469-185X.2007.00027.x
76. Kelley, K., & Preacher, K. J. (2012). On effect size. Psychological Methods, 17(2), 137–152. https://doi.org/10.1037/a0028086
77. Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons.
78. Newcombe, R. G. (1998). Two-sided confidence intervals for the single proportion: Comparison of seven methods. Statistics in Medicine, 17, 857–872. https://doi.org/10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E
79. Agresti, A., & Coull, B. A. (1998). Approximate is better than "exact" for interval estimation of binomial proportions. The American Statistician, 52(2), 119–126. https://doi.org/10.1080/00031305.1998.10480550
80. Schenker, N., & Gentleman, J. F. (2001). Judging statistical significance from confidence intervals. The American Statistician, 55(3), 182–186. https://doi.org/10.1198/000313001317098149
81. Cumming, G., & Finch, S. (2009). Inference by eye: Reading the overlap of independent confidence intervals. Statistics in Medicine, 28, 205–220. https://doi.org/10.1002/sim.3471
82. Maxwell, S. E., Delaney, H. D., & Kelley, K. (2018). Designing experiments and analyzing data: A model comparison perspective (3rd ed.). Routledge.
83. Kelley, K. (2007). Sample size planning for the coefficient of variation from the accuracy in parameter estimation approach. Behavior Research Methods, 39(4), 755–766. https://doi.org/10.3758/BF03192966
84. Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. https://doi.org/10.3758/BF03193146
85. American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). American Psychological Association.
86. Atkinson, G., & Nevill, A. M. (1998). Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Medicine, 26(4), 217–238. https://doi.org/10.2165/00007256-199826040-00002
87. Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003. https://doi.org/10.1037/0003-066X.49.12.997
88. Fisher, R. A. (1925). Statistical methods for research workers.
89. Neyman, J., & Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society A, 231, 289–337. https://doi.org/10.1098/rsta.1933.0009
90. Gigerenzer, G. (2004). Mindless statistics. Journal of Socio-Economics, 33, 587–606. https://doi.org/10.1016/j.socec.2004.09.033
91. Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133. https://doi.org/10.1080/00031305.2016.1154108
92. Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a world beyond "p < 0.05". The American Statistician, 73(sup1), 1–19. https://doi.org/10.1080/00031305.2019.1583913
93. Goodman, S. (2008). A dirty dozen: Twelve p-value misconceptions. Seminars in Hematology, 45(3), 135–140. https://doi.org/10.1053/j.seminhematol.2008.04.003
94. Student [Gosset, W. S.]. (1908). The probable error of a mean. Biometrika, 6(1), 1–25. https://doi.org/10.2307/2331554
95. Welch, B. L. (1947). The generalization of "Student’s" problem when several different population variances are involved. Biometrika, 34(1-2), 28–35. https://doi.org/10.1093/biomet/34.1-2.28
96. Altman, D. G., & Bland, J. M. (1995). Statistics notes: Absence of evidence is not evidence of absence. BMJ, 311, 485. https://doi.org/10.1136/bmj.311.7003.485
97. Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, p values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology, 31, 337–350. https://doi.org/10.1007/s10654-016-0149-3
98. Ruxton, G. D. (2006). The unequal variance t-test is an underused alternative to Student’s t-test and the Mann-Whitney U test. Behavioral Ecology, 17(4), 688–690. https://doi.org/10.1093/beheco/ark016
99. Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44, 701–710. https://doi.org/10.1002/ejsp.2023
100. Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14(5), 779–804. https://doi.org/10.3758/BF03194105
101. Kruschke, J. K. (2015). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan (2nd ed.). Academic Press.
102. van de Schoot, R., Depaoli, S., King, R., Kramer, B., Märtens, K., Tadesse, M. G., Vannucci, M., Gelman, A., Veen, D., Willemsen, J., & Yau, C. (2021). Bayesian statistics and modelling. Nature Reviews Methods Primers, 1, 1. https://doi.org/10.1038/s43586-020-00001-2
103. Wagenmakers, E.-J., Marsman, M., Jamil, T., Ly, A., Verhagen, J., Love, J., Selker, R., Gronau, Q. F., Šmíra, M., Epskamp, S., Matzke, D., Rouder, J. N., & Morey, R. D. (2018). Bayesian inference for psychology. Part I: Theoretical advantages and practical ramifications. Psychonomic Bulletin & Review, 25, 35–57. https://doi.org/10.3758/s13423-017-1343-3
104. Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. Nature, 567, 305–307. https://doi.org/10.1038/d41586-019-00857-9
105. Rosenthal, R. (1986). Meta-analytic procedures for social research.
106. Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
107. Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing for psychological research: A tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. https://doi.org/10.1177/2515245918770963
108. Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E.-J., Berk, R., Bollen, K. A., Brembs, B., Brown, L., Camerer, C., and 62 others. (2018). Redefine statistical significance. Nature Human Behaviour, 2, 6–10. https://doi.org/10.1038/s41562-017-0189-z
109. Cowles, M., & Davis, C. (1982). On the origins of the .05 level of statistical significance. American Psychologist, 37(5), 553–558. https://doi.org/10.1037/0003-066X.37.5.553
110. Lakens, D., Adolfi, F. G., Albers, C. J., Anvari, F., Apps, M. A. J., Argamon, S. E., Baguley, T., Becker, R. B., Benning, S. D., Bradford, D. E., and 76 others. (2018). Justify your alpha. Nature Human Behaviour, 2, 168–171. https://doi.org/10.1038/s41562-018-0311-x
111. Hoekstra, R., Morey, R. D., Rouder, J. N., & Wagenmakers, E.-J. (2014). Robust misinterpretation of confidence intervals. Psychonomic Bulletin & Review, 21, 1157–1164. https://doi.org/10.3758/s13423-013-0572-3
112. Matejka, J., & Fitzmaurice, G. (2017). Same stats, different graphs: Generating datasets with varied appearance and identical statistics through simulated annealing. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 1290–1294. https://doi.org/10.1145/3025453.3025912
113. Tukey, J. W. (1949). Comparing individual means in the analysis of variance. Biometrics, 5(2), 99–114. https://doi.org/10.2307/3001913
114. Games, P. A., & Howell, J. F. (1976). Pairwise multiple comparison procedures with unequal n’s and/or variances: A Monte Carlo study. Journal of Educational Statistics, 1(2), 113–125. https://doi.org/10.3102/10769986001002113
115. Olejnik, S., & Algina, J. (2003). Generalized eta and omega squared statistics: Measures of effect size for some common research designs. Psychological Methods, 8(4), 434–447. https://doi.org/10.1037/1082-989X.8.4.434
116. Mauchly, J. W. (1940). Significance test for sphericity of a normal n-variate distribution. Annals of Mathematical Statistics, 11(2), 204–209. https://doi.org/10.1214/aoms/1177731915
117. Greenhouse, S. W., & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24(2), 95–112. https://doi.org/10.1007/BF02289823
118. Huynh, H., & Feldt, L. S. (1976). Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs. Journal of Educational Statistics, 1(1), 69–82. https://doi.org/10.3102/10769986001001069
119. Girden, E. R. (1992). ANOVA: Repeated measures. Sage.
120. Miller, G. A., & Chapman, J. P. (2001). Misunderstanding analysis of covariance. Journal of Abnormal Psychology, 110(1), 40–48. https://doi.org/10.1037/0021-843X.110.1.40
121. Huitema, B. E. (2011). The analysis of covariance and alternatives: Statistical methods for experiments, quasi-experiments, and single-case studies (2nd ed.). Wiley.
122. Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68(5), 304–305. https://doi.org/10.1037/h0025105
123. Bland, J. M., & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 327(8476), 307–310. https://doi.org/10.1016/S0140-6736(86)90837-8
124. Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012
125. Weir, J. P. (2005). Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research, 19(1), 231–240. https://doi.org/10.1519/15184.1
126. Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420–428. https://doi.org/10.1037/0033-2909.86.2.420
127. Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). McGraw-Hill.
128. Hollander, M., Wolfe, D. A., & Chicken, E. (2013). Nonparametric statistical methods (3rd ed.). Wiley. https://doi.org/10.1002/9781119196037
129. Scheirer, C. J., Ray, W. S., & Hare, N. (1976). The analysis of ranked data derived from completely randomized factorial designs. Biometrics, 32(2), 429–434. https://doi.org/10.2307/2529511
130. Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. (2011). The aligned rank transform for nonparametric factorial analyses using only ANOVA procedures. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 143–146. https://doi.org/10.1145/1978942.1978963
131. Quade, D. (1967). Rank analysis of covariance. Journal of the American Statistical Association, 62(320), 1187–1200. https://doi.org/10.1080/01621459.1967.10500925
132. Jaeschke, R., Singer, J., & Guyatt, G. H. (1989). Measurement of health status: Ascertaining the minimal clinically important difference. Controlled Clinical Trials, 10(4), 407–415. https://doi.org/10.1016/0197-2456(89)90005-6
133. Cook, R. J., & Sackett, D. L. (1995). The number needed to treat: A clinically useful measure of treatment effect. BMJ, 310(6977), 452–454. https://doi.org/10.1136/bmj.310.6977.452
134. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
135. Norman, G. R., Sloan, J. A., & Wyrwich, K. W. (2003). Interpretation of changes in health-related quality of life: The remarkable universality of half a standard deviation. Medical Care, 41(5), 582–592. https://doi.org/10.1097/01.MLR.0000062554.74615.4C
136. R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
137. Wickham, H., & Grolemund, G. (2016). R for data science: Import, tidy, transform, visualize, and model data. O’Reilly Media. https://r4ds.had.co.nz
138. Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. https://doi.org/10.1038/s41562-016-0021