Prediction and Associated Factors of Fatigue in Hypothyroidism Using Explainable Machine Learning Models, 2026, Sundus

Dolphin


Clinical Epidemiology and Global Health

Available online 11 March 2026, 102335
In Press, Journal Pre-proof

Prediction and Associated Factors of Fatigue in Hypothyroidism Using Explainable Machine Learning Models


Habiba Sundus 1, Sajad Ul Islam 2, Sohrab Ahmad Khan 1, Noor Fatima 3
https://doi.org/10.1016/j.cegh.2026.102335
Under a Creative Commons license
Open access

Highlights


  • Fatigue prevalence in hypothyroidism was 47.9% among 315 adults.

  • Random forest achieved the best performance (AUC = 0.88, 95% CI 0.80–0.95).

  • Key predictors were physical activity, BMI, smoking, and device use.

  • SHAP explainability showed lifestyle factors outweighed biochemical markers.

  • Findings support integrative, lifestyle-focused fatigue-management strategies.

Abstract

Problem considered

Fatigue is a prevalent yet under-recognized symptom among individuals with hypothyroidism, influenced by multiple interacting biological and behavioral factors. Traditional statistical methods may overlook nonlinear and complex relationships between predictors and fatigue severity. Identifying key associated factors through advanced analytical approaches could support more comprehensive management strategies.

Methods

A cross-sectional study was conducted among 315 adults with hypothyroidism attending an endocrine outpatient clinic between April and August 2025. Fatigue was measured using the Fatigue Severity Scale (FSS; mean score ≥4). Potential predictors included demographic, clinical, and lifestyle characteristics. After data preprocessing (scaling, encoding, imputation), three models, logistic regression, support vector machine (SVM), and random forest (RF), were trained using stratified training–testing splits. Model performance was evaluated using AUC, accuracy, precision, recall, F1-score, Brier score, and calibration metrics. Feature importance and SHAP values were applied for model interpretability.
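
As a rough illustration of what some of the evaluation metrics named above measure, here is a minimal pure-Python sketch. The labels and probabilities are made-up toy values, not data from the study, and the function names (`roc_auc`, `brier_score`, `threshold_metrics`) and the 0.5 decision threshold are assumptions for the example, not the authors' code.

```python
# Toy illustration only: invented labels and probabilities, not study data.

def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall and F1 after cutting probabilities at a threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, y_pred) if y == 1 and h == 1)
    tn = sum(1 for y, h in zip(y_true, y_pred) if y == 0 and h == 0)
    fp = sum(1 for y, h in zip(y_true, y_pred) if y == 0 and h == 1)
    fn = sum(1 for y, h in zip(y_true, y_pred) if y == 1 and h == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 1 = fatigued (FSS mean score >= 4), 0 = not fatigued; hypothetical model outputs.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(roc_auc(y_true, y_prob))            # 0.9375
print(round(brier_score(y_true, y_prob), 3))  # 0.125
print(threshold_metrics(y_true, y_prob))
```

AUC only reflects how well the scores rank fatigued above non-fatigued participants, while the Brier score also penalizes probabilities that are far from the observed outcome. That difference is presumably why the abstract reports both discrimination and calibration.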

Results

Fatigue prevalence was 47.9%. The RF model showed the best discrimination (AUC 0.88, 95% CI 0.80–0.95; accuracy 0.81, 95% CI 0.71–0.90) with acceptable calibration. Major predictors included physical activity, age, electronic device use, BMI, diet, smoking status, and education, while TSH and hemodynamic measures contributed minimally.

Conclusion

Explainable machine learning effectively identified key behavioral and clinical factors associated with fatigue in hypothyroidism. Findings highlight the dominant role of modifiable lifestyle factors, suggesting that management should extend beyond thyroid hormone replacement. External validation is recommended before clinical integration.

Keywords

fatigue
hypothyroidism
machine learning
prediction model
TRIPOD+AI
SHAP
 
"Key predictors were physical activity, BMI, smoking, and device use"
I don't think the word 'predictor' is correct here, but what the hell do I know with my linear conception of time and my insistence that events that happen after cannot be their own cause?
"Findings support integrative, lifestyle-focused fatigue-management strategies."
Yeah this is what happens when you simply don't consider things like the linear concept of time and use quantitative math on low-quality high-bias qualitative data.
"Findings highlight the dominant role of modifiable lifestyle factors, suggesting that management should extend beyond thyroid hormone replacement."
This is just a medical application of "the beatings will continue until morale is reported to have improved". I don't understand the drive for fake evidence just because it gives answers that seem better, as long as you don't think about it. Clearly if we could just convince people suffering from rabies to drink, the whole rabies problem would be fixed, just like magic.
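
The point about causal direction can be shown with a toy simulation. The sketch below rests entirely on assumed, synthetic numbers: fatigue is assigned at random (about 48% of people), and fatigue then lowers physical activity by a made-up amount (`effect`). Activity has no influence on fatigue anywhere in the code, yet low activity still "predicts" fatigue well in a cross-sectional snapshot. The function names and parameter values are hypothetical and not taken from the paper.

```python
import random

# Synthetic illustration only: every number here is an assumption, not study data.

def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate(n=1000, effect=1.5, seed=0):
    """Fatigue is drawn first; activity is then reduced in fatigued people by `effect`."""
    rng = random.Random(seed)
    fatigue, activity = [], []
    for _ in range(n):
        f = 1 if rng.random() < 0.48 else 0     # fatigue does not depend on activity
        a = rng.gauss(0.0, 1.0) - effect * f     # activity depends on fatigue
        fatigue.append(f)
        activity.append(a)
    return fatigue, activity


# Case 1: fatigue reduces activity. Use "low activity" as the score for fatigue.
fatigue, activity = simulate(effect=1.5)
auc_reverse = roc_auc(fatigue, [-a for a in activity])

# Case 2: no link at all between fatigue and activity.
fatigue0, activity0 = simulate(effect=0.0)
auc_null = roc_auc(fatigue0, [-a for a in activity0])

print("AUC when fatigue causes low activity:", round(auc_reverse, 2))
print("AUC when there is no link:", round(auc_null, 2))
```

In this setup a high AUC for activity says nothing about whether raising activity would reduce fatigue, because the simulation only has causation running the other way. A cross-sectional design cannot tell the two directions apart on its own.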