Hamstring Strain Injury Risk in Soccer: An Exploratory, Hypothesis-Generating Prediction Model


Abstract

Background: Hamstring strain injuries (HSI) are common in soccer and difficult to predict. Machine learning and regression approaches may identify novel strength-related predictors, but model development requires transparent reporting.
Objective: To develop and internally validate a prediction model for hamstring strain injuries in amateur soccer players using preseason strength and clinical measures.
Study Design: Prospective cohort study.
Methods: The study included 120 amateur male soccer players monitored across one competitive season (30 weeks). Baseline predictors were age, body mass index, prior injury, and bilateral isometric hip and knee strength variables measured with a handheld dynamometer. The outcome was player-level hamstring injury status (≥1 HSI vs none). Twenty candidate predictors were reduced to 10 via symmetrical uncertainty feature ranking. A logistic regression model was trained (n=83 players) with nested four-fold cross-validation and tested on an independent hold-out set (n=37 players). Model performance was evaluated using the area under the ROC curve (AUC), calibration slope and intercept, and confusion matrices.
Results: Twenty-one players sustained ≥1 HSI (32 events; 28% reinjuries). With 10 predictors and 21 events, the events-per-variable ratio was 2.1, below recommended thresholds, indicating a risk of overfitting. On the test set (5 injured, 32 uninjured), the model achieved an accuracy of 64.9%, an AUC of 0.68 (95% CI 0.52–0.84), a calibration slope of 0.85, and a calibration intercept of –0.12. Sensitivity was 60% and specificity was 65.6%. Dominant-leg hip abduction strength was the only statistically significant predictor (OR=0.82, 95% CI 0.70–0.96). Stability analyses, however, identified previous hamstring injury as the most consistent contributor, despite its lack of statistical significance in the regression, which was attributed to the limited number of events.
Conclusion: Previous hamstring injury remained the strongest predictor of future injury risk, while reduced dominant-leg hip abduction strength emerged as a candidate risk factor but was unstable under resampling. Neither age nor hamstring isometric strength was a significant predictor in this cohort. Model discrimination was modest, calibration indicated mild overfitting, and the overall risk of bias was high. This study represents a TRIPOD Category 2 prediction model development without external validation. The findings should therefore be considered exploratory and hypothesis-generating, and they require confirmation in larger, methodologically robust cohorts.
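
The reported test-set figures can be cross-checked with simple arithmetic. The sketch below is not the authors' code. The abstract reports only percentages, so the confusion-matrix counts used here (3 true positives, 2 false negatives, 21 true negatives, 11 false positives) are an assumption: they are the integer counts consistent with 5 injured and 32 uninjured players at 60% sensitivity and 65.6% specificity. The function names are hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = tp / (tp + fn)          # injured players correctly flagged
    specificity = tn / (tn + fp)          # uninjured players correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def events_per_variable(n_events, n_predictors):
    """Events-per-variable ratio: outcome events per candidate predictor."""
    return n_events / n_predictors


# ASSUMPTION: counts inferred from the reported percentages, not stated
# in the abstract. 5 injured = TP + FN, 32 uninjured = TN + FP.
TP, FN = 3, 2
TN, FP = 21, 11

sens, spec, acc = confusion_metrics(TP, FN, TN, FP)

# Reported in the abstract: 21 players with >=1 HSI, 10 retained predictors.
epv = events_per_variable(21, 10)

print(f"sensitivity={sens:.1%} specificity={spec:.1%} "
      f"accuracy={acc:.1%} EPV={epv:.1f}")
```

Under these assumed counts, the computed sensitivity, specificity, and accuracy round to the 60%, 65.6%, and 64.9% given in the abstract, and the events-per-variable ratio comes out at the stated 2.1.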