References
1. Moore, D. S., McCabe, G. P., & Craig, B. A. (2021).
Introduction to the practice of statistics (10th ed.). W. H.
Freeman; Company.
2. Agresti, A. (2003). Categorical data analysis.
3. Conover, W. J. (1999). Practical nonparametric statistics.
4. Good, P. I. (2005). Permutation tests: A practical guide to
resampling methods for testing hypotheses.
5. Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M.
D. (2015). The extent and consequences of p-hacking in science. PLoS
Biology, 13(3), e1002106. https://doi.org/10.1371/journal.pbio.1002106
6. Hoenig, J. M., & Heisey, D. M. (2001). ABCs of alpha, beta,
delta, and epsilon. Ecology, 82(12), 3369–3372. https://doi.org/10.1890/0012-9658(2001)082[3369:AOABDE]2.0.CO;2
7. Ioannidis, J. P. A. (2005). Why most published research findings are
false. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
8. Lakens, D. (2014). Performing high-powered studies efficiently with
sequential analyses. European Review of Social Psychology,
25(1), 60–75. https://doi.org/10.1080/10463283.2014.922662
9. Levene, H. (1960). Robust tests for equality of variances.
Contributions to Probability and Statistics: Essays in Honor of
Harold Hotelling, 278–292.
10. Osborne, J. (2002). Notes on the use of data transformations.
Practical Assessment, Research & Evaluation, 8(6).
https://scholarworks.umass.edu/pare/vol8/iss1/6
11. Portney, L. G., & Watkins, M. P. (2020). Foundations of
clinical research: Applications to practice.
12. Senn, S. (2002). Letter to the editor: Cross-over trials in clinical
research. Statistics in Medicine, 21(19), 2843–2844.
https://doi.org/10.1002/sim.1097
13. Thomas, L. (2015). How to estimate power and sample size. Trauma
Surgery & Acute Care Open, 1(1), e000005. https://doi.org/10.1136/tsaco-2015-000005
14. Vincent, W. J. (2005). Statistics in kinesiology.
15. Zimmerman, D. W. (2004). A note on preliminary tests of equality of
variances. British Journal of Mathematical and Statistical
Psychology, 57(1), 173–181. https://doi.org/10.1348/000711004849222
16. Austin, P. C. (2015). An introduction to propensity score methods
for reducing the effects of confounding in observational studies.
Multivariate Behavioral Research, 50(3), 399–424. https://doi.org/10.1080/00273171.2015.1128582
17. Babyak, M. A. (2004). What you see may not be what you get: A brief,
nontechnical introduction to overfitting in regression-type models.
Psychosomatic Medicine, 66(3), 411–421. https://doi.org/10.1097/01.psy.0000127692.23278.a9
18. Bahr, R., Andersen, T. E., Løken, S., Myklebust, G., &
Engebretsen, L. (2005). Biomechanics of lumbar intervertebral disk
injuries. Medicine & Science in Sports & Exercise,
37(2), 193–199. https://doi.org/10.1249/01.mss.0000152737.17598.0b
19. Bobbert, M. F. (2000). Why is the force–velocity relationship in leg
press tasks quasi-linear rather than hyperbolic? Journal of Applied
Biomechanics, 16(4), 304–315. https://doi.org/10.1123/jab.16.4.304
20. Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003).
Applied multiple regression/correlation analysis for the behavioral
sciences.
21. Cormie, P., McGuigan, M. R., & Newton, R. U. (2011). Acute
resistance training and changes in neuromuscular and morphological
characteristics. Sports Medicine, 41(7), 557–575. https://doi.org/10.2165/11590380-000000000-00000
22. Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G.,
Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J.,
Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder,
B., Skidmore, A. K., Zurell, D., & Lautenbach, S. (2013).
Collinearity: A review of methods to deal with it and a simulation study
evaluating their performance. Ecography, 36(1), 27–46.
https://doi.org/10.1111/j.1600-0587.2012.07348.x
23. Fox, J. (2015). Applied regression analysis and generalized
linear models.
24. Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and
other stories.
25. Harrell, F. E. (2015). Regression modeling strategies: With
applications to linear models, logistic and ordinal regression, and
survival analysis.
26. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The
elements of statistical learning: Data mining, inference, and
prediction.
27. Jackson, D. L. (1990). Structural equation modeling: A
multidisciplinary journal. Structural Equation Modeling: A
Multidisciplinary Journal, 1(1), 1–2. https://doi.org/10.1080/10705519409539975
28. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013).
An introduction to statistical learning: With applications in
r.
29. Jurca, G., Bootman, J. L., & Sokol, M. C. (2005). Assessing
claims of treatment effectiveness: Is there a need for a new paradigm?
Value in Health, 8(6), 727–734. https://doi.org/10.1111/j.1524-4733.2005.00058.x
30. Miles, S. (2014). A framework for understanding organizational
ethics. Business Ethics: A European Review, 23(2),
154–167. https://doi.org/10.1111/beer.12044
31. Pearl, J. (2009). Causality: Models, reasoning, and
inference.
32. Rohrer, J. M. (2018). Thinking clearly about correlations and
causation: Graphical causal models for observational data. Advances
in Methods and Practices in Psychological Science, 1(1),
27–42. https://doi.org/10.1177/2515245917745629
33. Shmueli, G. (2010). To explain or to predict? Statistical
Science, 25(3), 289–310. https://doi.org/10.1214/10-STS330
34. Tabachnick, B. G., & Fidell, L. S. (2019). Using
multivariate statistics.
35. White, H. (1980). A heteroskedasticity-consistent covariance matrix
estimator and a direct test for heteroskedasticity.
Econometrica, 48(4), 817–838. https://doi.org/10.2307/1912934
36. Whittingham, M. J., Stephens, P. A., Bradbury, R. B., &
Freckleton, R. P. (2006). Why do we still use stepwise modelling in
ecology and behaviour? Journal of Animal Ecology,
75(5), 1182–1189. https://doi.org/10.1111/j.1365-2656.2006.01141.x
37. Willy, R. W., & Meira, E. P. (2019). The ’best’ way to build
strength: An evidence-based approach to building muscle and strength.
International Journal of Sports Physical Therapy,
14(6), 839–850. https://doi.org/10.26603/ijspt20190839
38. Winter, D. A. (2009). Biomechanics and motor control of human
movement.
39. Tukey, J. W. (1977). Exploratory data analysis.
Addison-Wesley.
40. Wilcox, R. R. (2017). Introduction to robust estimation and
hypothesis testing (4th ed.). Academic Press.
41. Hippel, P. T. von. (2005). Mean, median, and skew: Correcting a
textbook rule. Journal of Statistics Education, 13(2).
https://doi.org/10.1080/10691898.2005.11910570
42. Bland, J. M., & Altman, D. G. (1996). Transformations, means,
and confidence intervals. BMJ, 312(7038), 1079. https://doi.org/10.1136/bmj.312.7038.1079
43. Limpert, E., Stahel, W. A., & Abbt, M. (2001). Log-normal
distributions across the sciences: Keys and clues. BioScience,
51(5), 341–352. https://doi.org/10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2
44. Stergiou, N., Harbourne, R. T., & Cavanaugh, J. T. (2006). Optimal movement
variability: A new theoretical perspective for neurologic physical
therapy. Journal of Neurologic Physical Therapy,
30(3), 120–129.
45. Stergiou, N., & Decker, L. M. (2011). Human movement
variability, nonlinear dynamics, and pathology: Is there a connection?
Human Movement Science, 30, 869–888. https://doi.org/10.1016/j.humov.2011.06.002
46. Hopkins, W. G. (2000). Measures of reliability in sports medicine
and science. Sports Medicine, 30(1), 1–15. https://doi.org/10.2165/00007256-200030010-00001
47. Atkinson, G., & Nevill, A. M. (1998). Statistical methods for
assessing measurement error (reliability) in variables relevant to
sports medicine. Sports Medicine, 26(4), 217–238. https://doi.org/10.2165/00007256-199826040-00002
48. Field, A. (2018). Discovering statistics using IBM SPSS
statistics (5th ed.). SAGE Publications.
49. Micceri, T. (1989). The unicorn, the normal curve, and other
improbable creatures. Psychological Bulletin, 105(1),
156–166. https://doi.org/10.1037/0033-2909.105.1.156
50. Blanca, M. J., Alarcón, R., Arnau, J., Bono, R., & Bendayan, R.
(2013). Non-normal data: Is ANOVA still a valid option?
Psicothema, 25(4), 552–557. https://doi.org/10.7334/psicothema2013.552
51. Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance
test for normality (complete samples). Biometrika,
52(3-4), 591–611. https://doi.org/10.1093/biomet/52.3-4.591
52. Razali, N. M., & Wah, Y. B. (2011). Power comparisons of
shapiro-wilk, kolmogorov-smirnov, lilliefors and anderson-darling tests.
Journal of Statistical Modeling and Analytics, 2(1),
21–33.
53. Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists
should by default use welch’s t-test instead of student’s t-test.
International Review of Social Psychology, 30(1),
92–101. https://doi.org/10.5334/irsp.82
54. Ghasemi, A., & Zahediasl, S. (2012). Normality tests for
statistical analysis: A guide for non-statisticians. International
Journal of Endocrinology and Metabolism, 10(2), 486–489.
https://doi.org/10.5812/ijem.3505
55. Bulmer, M. G. (1979). Principles of statistics.
56. Joanes, D. N., & Gill, C. A. (1998). Comparing measures of
sample skewness and kurtosis. Journal of the Royal Statistical
Society: Series D (The Statistician), 47(1), 183–189. https://doi.org/10.1111/1467-9884.00122
57. Westfall, P. H. (2014). Kurtosis as peakedness, 1905–2014. r.i.p.
The American Statistician, 68(3), 191–195. https://doi.org/10.1080/00031305.2014.917055
58. Ho, J., Tumkaya, T., Aryal, S., Choi, H., & Claridge-Chang, A.
(2019). Moving beyond p values: Data analysis with estimation graphics.
Nature Methods, 16, 565–566. https://doi.org/10.1038/s41592-019-0470-3
59. Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The
importance of the normality assumption in large public health data sets.
Annual Review of Public Health, 23, 151–169. https://doi.org/10.1146/annurev.publhealth.23.100901.140546
60. Vincent, W. J. (1999). Statistics in kinesiology.
61. Cohen, J. (1988). Statistical power analysis for the behavioral
sciences (2nd ed.). Lawrence Erlbaum Associates.
62. Cumming, G. (2014). The new statistics: Why and how.
Psychological Science, 25(1), 7–29. https://doi.org/10.1177/0956797613504966
63. Cumming, G. (2012). Understanding the new statistics: Effect
sizes, confidence intervals, and meta-analysis. Routledge.
64. Batterham, A. M., & Hopkins, W. G. (2006). Making meaningful
inferences about magnitudes. International Journal of Sports
Physiology and Performance, 1(1), 50–57. https://doi.org/10.1123/ijspp.1.1.50
65. Schmidt, F. L. (1996). Statistical significance testing and
cumulative knowledge in psychology: Implications for training of
researchers. Psychological Methods, 1(2), 115–129. https://doi.org/10.1037/1082-989X.1.2.115
66. Kline, R. B. (2013). Beyond significance testing: Statistics
reform in the behavioral sciences.
67. Lakens, D. (2013). Calculating and reporting effect sizes to
facilitate cumulative science: A practical primer for t-tests and
ANOVAs. Frontiers in Psychology, 4, 863. https://doi.org/10.3389/fpsyg.2013.00863
69. Krzywinski, M., & Altman, N. (2013). Points of significance:
Importance of being uncertain. Nature Methods, 10(9),
809–810. https://doi.org/10.1038/nmeth.2613
70. Gardner, M. J., & Altman, D. G. (1986). Confidence intervals
rather than p values: Estimation rather than hypothesis testing.
BMJ, 292(6522), 746–750. https://doi.org/10.1136/bmj.292.6522.746
71. Altman, D. G., & Bland, J. M. (2000). Statistics notes: The use
of transformation when comparing two means. BMJ, 312,
1153. https://doi.org/10.1136/bmj.312.7039.1153
72. Wilkinson, L., & Task Force on Statistical Inference. (1999).
Statistical methods in psychology journals: Guidelines and explanations.
American Psychologist, 54(8), 594–604. https://doi.org/10.1037/0003-066X.54.8.594
73. Morey, R. D., Hoekstra, R., Rouder, J. N., Lee, M. D., &
Wagenmakers, E.-J. (2016). The fallacy of placing confidence in
confidence intervals. Psychonomic Bulletin & Review,
23, 103–123. https://doi.org/10.3758/s13423-015-0947-8
74. Hopkins, W. G., Marshall, S. W., Batterham, A. M., & Hanin, J.
(2009). Progressive statistics for studies in sports medicine and
exercise science. Medicine & Science in Sports &
Exercise, 41(1), 3–13. https://doi.org/10.1249/MSS.0b013e31818cb278
75. Nakagawa, S., & Cuthill, I. C. (2007). Effect size, confidence
interval and statistical significance: A practical guide for biologists.
Biological Reviews, 82, 591–605. https://doi.org/10.1111/j.1469-185X.2007.00027.x
76. Kelley, K., & Preacher, K. J. (2012). On effect size.
Psychological Methods, 17(2), 137–152. https://doi.org/10.1037/a0028086
77. Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein,
H. R. (2009). Introduction to meta-analysis. John Wiley &
Sons.
78. Newcombe, R. G. (1998). Two-sided confidence intervals for the
single proportion: Comparison of seven methods. Statistics in
Medicine, 17, 857–872. https://doi.org/10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E
79. Agresti, A., & Coull, B. A. (1998). Approximate is better than
"exact" for interval estimation of binomial proportions. The
American Statistician, 52(2), 119–126. https://doi.org/10.1080/00031305.1998.10480550
80. Schenker, N., & Gentleman, J. F. (2001). Judging statistical
significance from confidence intervals. The American
Statistician, 55(3), 182–186. https://doi.org/10.1198/000313001317098149
81. Cumming, G., & Finch, S. (2009). Inference by eye: Reading the
overlap of independent confidence intervals. Statistics in
Medicine, 28, 205–220. https://doi.org/10.1002/sim.3471
82. Maxwell, S. E., Delaney, H. D., & Kelley, K. (2018).
Designing experiments and analyzing data: A model comparison
perspective (3rd ed.). Routledge.
83. Kelley, K. (2007). Sample size planning for the coefficient of
variation from the accuracy in parameter estimation approach.
Behavior Research Methods, 39(4), 755–766. https://doi.org/10.3758/BF03192966
84. Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007).
G*power 3: A flexible statistical power analysis program for the social,
behavioral, and biomedical sciences. Behavior Research Methods,
39(2), 175–191. https://doi.org/10.3758/BF03193146
85. American Psychological Association. (2020). Publication manual
of the american psychological association (7th ed.). American
Psychological Association.
86. Atkinson, G., & Nevill, A. M. (1998). Statistical methods for
assessing measurement error (reliability) in variables relevant to
sports medicine. Sports Medicine, 26(4), 217–238. https://doi.org/10.2165/00007256-199826040-00002
87. Cohen, J. (1994). The earth is round (p < .05). American
Psychologist, 49(12), 997–1003. https://doi.org/10.1037/0003-066X.49.12.997
88. Fisher, R. A. (1925). Statistical methods for research
workers.
89. Neyman, J., & Pearson, E. S. (1933). On the problem of the most
efficient tests of statistical hypotheses. Philosophical
Transactions of the Royal Society A, 231, 289–337. https://doi.org/10.1098/rsta.1933.0009
90. Gigerenzer, G. (2004). Mindless statistics. Journal of
Socio-Economics, 33, 587–606. https://doi.org/10.1016/j.socec.2004.09.033
91. Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on
p-values: Context, process, and purpose. The American
Statistician, 70(2), 129–133. https://doi.org/10.1080/00031305.2016.1154108
92. Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving
to a world beyond "p < 0.05". The American Statistician,
73(sup1), 1–19. https://doi.org/10.1080/00031305.2019.1583913
93. Goodman, S. (2008). A dirty dozen: Twelve p-value misconceptions.
Seminars in Hematology, 45(3), 135–140. https://doi.org/10.1053/j.seminhematol.2008.04.003
94. Student [Gosset, W. S. (1908). The probable error of a mean.
Biometrika, 6(1), 1–25. https://doi.org/10.2307/2331554
95. Welch, B. L. (1947). The generalization of "student’s" problem when
several different population variances are involved.
Biometrika, 34(1-2), 28–35. https://doi.org/10.1093/biomet/34.1-2.28
96. Altman, D. G., & Bland, J. M. (1995). Statistics notes: Absence
of evidence is not evidence of absence. BMJ, 311, 485.
https://doi.org/10.1136/bmj.311.7003.485
97. Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole,
C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, p
values, confidence intervals, and power: A guide to misinterpretations.
European Journal of Epidemiology, 31, 337–350. https://doi.org/10.1007/s10654-016-0149-3
98. Ruxton, G. D. (2006). The unequal variance t-test is an underused
alternative to student’s t-test and the mann-whitney u test.
Behavioral Ecology, 17(4), 688–690. https://doi.org/10.1093/beheco/ark016
99. Lakens, D. (2014). Performing high-powered studies efficiently with
sequential analyses. European Journal of Social Psychology,
44, 701–710. https://doi.org/10.1002/ejsp.2023
100. Wagenmakers, E.-J. (2007). A practical solution to the pervasive
problems of p values. Psychonomic Bulletin & Review,
14(5), 779–804. https://doi.org/10.3758/BF03194105
101. Kruschke, J. K. (2015). Doing bayesian data analysis: A
tutorial with r, JAGS, and stan (2nd ed.). Academic Press.
102. Schoot, R. van de, Depaoli, S., King, R., Kramer, B., Märtens, K.,
Tadesse, M. G., Vannucci, M., Gelman, A., Veen, D., Willemsen, J., &
Yau, C. (2021). Bayesian statistics and modelling. Nature Reviews
Methods Primers, 1, 1. https://doi.org/10.1038/s43586-020-00001-2
103. Wagenmakers, E.-J., Marsman, M., Jamil, T., Ly, A., Verhagen, J.,
Love, J., Selker, R., Gronau, Q. F., Šmíra, M., Epskamp, S., Matzke, D.,
Rouder, J. N., & Morey, R. D. (2018). Bayesian inference for
psychology. Part i: Theoretical advantages and practical ramifications.
Psychonomic Bulletin & Review, 25, 35–57. https://doi.org/10.3758/s13423-017-1343-3
104. Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists
rise up against statistical significance. Nature, 567,
305–307. https://doi.org/10.1038/d41586-019-00857-9
105. Rosenthal, R. (1986). Meta-analytic procedures for social
research.
106. Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011).
False-positive psychology: Undisclosed flexibility in data collection
and analysis allows presenting anything as significant.
Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
107. Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence
testing for psychological research: A tutorial. Advances in Methods
and Practices in Psychological Science, 1(2), 259–269. https://doi.org/10.1177/2515245918770963
108. Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A.,
Wagenmakers, E.-J., Berk, R., Bollen, K. A., Brembs, B., Brown, L.,
Camerer, C., & and 62 others. (2018). Redefine statistical
significance. Nature Human Behaviour, 2, 6–10. https://doi.org/10.1038/s41562-017-0189-z
109. Cowles, M., & Davis, C. (1982). On the origins of the .05 level
of statistical significance. American Psychologist,
37(5), 553–558. https://doi.org/10.1037/0003-066X.37.5.553
110. Lakens, D., Adolfi, F. G., Albers, C. J., Anvari, F., Apps, M. A.
J., Argamon, S. E., Baguley, T., Becker, R. B., Benning, S. D.,
Bradford, D. E., & and 76 others. (2018). Justify your alpha.
Nature Human Behaviour, 2, 168–171. https://doi.org/10.1038/s41562-018-0311-x
111. Hoekstra, R., Morey, R. D., Rouder, J. N., & Wagenmakers, E.-J.
(2014). Robust misinterpretation of confidence intervals.
Psychonomic Bulletin & Review, 21, 1157–1164. https://doi.org/10.3758/s13423-013-0572-3
112. Matejka, J., & Fitzmaurice, G. (2017). Same stats, different
graphs: Generating datasets with varied appearance and identical
statistics through simulated annealing. Proceedings of the 2017 CHI
Conference on Human Factors in Computing Systems, 1290–1294. https://doi.org/10.1145/3025453.3025912
113. Tukey, J. W. (1949). Comparing individual means in the analysis of
variance. Biometrics, 5(2), 99–114. https://doi.org/10.2307/3001913
114. Games, P. A., & Howell, J. F. (1976). Pairwise multiple
comparison procedures with unequal n’s and/or variances: A monte carlo
study. Journal of Educational Statistics, 1(2),
113–125. https://doi.org/10.3102/10769986001002113
115. Olejnik, S., & Algina, J. (2003). Generalized eta and omega
squared statistics: Measures of effect size for some common research
designs. Psychological Methods, 8(4), 434–447. https://doi.org/10.1037/1082-989X.8.4.434
116. Mauchly, J. W. (1940). Significance test for sphericity of a normal
n-variate distribution. Annals of Mathematical Statistics,
11(2), 204–209. https://doi.org/10.1214/aoms/1177731915
117. Greenhouse, S. W., & Geisser, S. (1959). On methods in the
analysis of profile data. Psychometrika, 24(2),
95–112. https://doi.org/10.1007/BF02289823
118. Huynh, H., & Feldt, L. S. (1976). Estimation of the box
correction for degrees of freedom from sample data in randomized block
and split-plot designs. Journal of Educational Statistics,
1(1), 69–82. https://doi.org/10.3102/10769986001001069
119. Girden, E. R. (1992). ANOVA: Repeated measures. Sage.
120. Miller, G. A., & Chapman, J. P. (2001). Misunderstanding
analysis of covariance. Journal of Abnormal Psychology,
110(1), 40–48. https://doi.org/10.1037/0021-843X.110.1.40
121. Huitema, B. E. (2011). The analysis of covariance and
alternatives: Statistical methods for experiments, quasi-experiments,
and single-case studies (2nd ed.). Wiley.
122. Lord, F. M. (1967). A paradox in the interpretation of group
comparisons. Psychological Bulletin, 68(5), 304–305.
https://doi.org/10.1037/h0025105
123. Bland, J. M., & Altman, D. G. (1986). Statistical methods for
assessing agreement between two methods of clinical measurement.
Lancet, 327(8476), 307–310. https://doi.org/10.1016/S0140-6736(86)90837-8
124. Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and
reporting intraclass correlation coefficients for reliability research.
Journal of Chiropractic Medicine, 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012
125. Weir, J. P. (2005). Quantifying test-retest reliability using the
intraclass correlation coefficient and the SEM. Journal
of Strength and Conditioning Research, 19(1), 231–240. https://doi.org/10.1519/15184.1
126. Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations:
Uses in assessing rater reliability. Psychological Bulletin,
86(2), 420–428. https://doi.org/10.1037/0033-2909.86.2.420
127. Siegel, S., & Castellan, N. J. (1988). Nonparametric
statistics for the behavioral sciences (2nd ed.). McGraw-Hill.
128. Hollander, M., Wolfe, D. A., & Chicken, E. (2013).
Nonparametric statistical methods (3rd ed.). Wiley. https://doi.org/10.1002/9781119196037
129. Scheirer, C. J., Ray, W. S., & Hare, N. (1976). The analysis of
ranked data derived from completely randomized factorial designs.
Biometrics, 32(2), 429–434. https://doi.org/10.2307/2529511
130. Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J.
(2011). The aligned rank transform for nonparametric factorial analyses
using only ANOVA procedures. Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems, 143–146. https://doi.org/10.1145/1978942.1978963
131. Quade, D. (1967). Rank analysis of covariance. Journal of the
American Statistical Association, 62(320), 1187–1200. https://doi.org/10.1080/01621459.1967.10500925
132. Jaeschke, R., Singer, J., & Guyatt, G. H. (1989). Measurement
of health status: Ascertaining the minimal clinically important
difference. Controlled Clinical Trials, 10(4),
407–415. https://doi.org/10.1016/0197-2456(89)90005-6
133. Cook, R. J., & Sackett, D. L. (1995). The number needed to
treat: A clinically useful measure of treatment effect. BMJ,
310(6977), 452–454. https://doi.org/10.1136/bmj.310.6977.452
134. Fawcett, T. (2006). An introduction to ROC analysis.
Pattern Recognition Letters, 27(8), 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
135. Norman, G. R., Sloan, J. A., & Wyrwich, K. W. (2003).
Interpretation of changes in health-related quality of life: The
remarkable universality of half a standard deviation. Medical
Care, 41(5), 582–592. https://doi.org/10.1097/01.MLR.0000062554.74615.4C
136. R Core Team. (2024). R: A language and environment for
statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
137. Wickham, H., & Grolemund, G. (2016). R for data science:
Import, tidy, transform, visualize, and model data. O’Reilly Media.
https://r4ds.had.co.nz
138. Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S.,
Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J.,
Ware, J. J., & Ioannidis, J. P. A. (2017). A manifesto for
reproducible science. Nature Human Behaviour, 1, 0021.
https://doi.org/10.1038/s41562-016-0021