Reporting of clustering techniques in sports sciences: a scoping review


Abstract


Multivariate statistical methods are among the most widely used in sports sciences, with clustering methods emerging as prominent unsupervised learning techniques. This study presents a scoping review of original articles utilizing clustering techniques in sports sciences, following the PRISMA-ScR guidelines. A comprehensive search across various databases using the Boolean “AND” combination of “clustering” and “sport” yielded 278 articles. Notably, 86.7% of these articles were published within the last 14 years, with a predominant focus (66.2%) on sports performance analysis. The majority of studies included professional athletes (56.4%), with football/soccer, basketball, and tennis being the most commonly studied sports, representing 12.2%, 7.5%, and 2.2% of the selected articles, respectively. Hierarchical clustering was the most frequently used method (31.6%), followed by the k-means algorithm for partitional clustering. However, the clustering method was not reported in 26.6% of the articles, and 55.0% did not specify the criterion used for determining the optimal number of clusters. Moreover, more than 85% of the articles lacked computational details related to data reproducibility. These findings underscore the urgent need for substantial improvement in reporting practices regarding the methodology, algorithms, criteria for cluster identification, and software usage in sports science literature.

Keywords: Cluster analysis, unsupervised learning, scoping review, sports sciences.




Creative Commons License
This work is licensed under a Creative Commons Attribution - NonCommercial - NoDerivatives 3.0 Italy License.