On the question of severity definitions, you’re right that the way we evaluate “severe” using multiple case definitions doesn’t map cleanly onto how patients typically describe severity (e.g., housebound/bedbound), and I understand why that’s a sticking point. That’s an area where there’s still a lot of debate, and it’s helpful to hear how it’s being interpreted from your perspective.
In some of these studies, “severity” isn’t defined by functional status (like housebound or bedbound), but by how many case definitions a person meets at the same time. The idea is that people who meet multiple definitions tend to report a broader range and higher frequency/severity of symptoms, so they’re grouped as “more severe” within that framework.
But as you pointed out, that doesn’t necessarily line up with how patients experience severity in real life, where functional ability is usually the key distinction. So it’s a bit of a proxy measure rather than a direct one, and that mismatch is part of why it can be confusing or controversial.
It's good to hear acknowledgement that this is controversial, but the primary problem is not that it doesn't line up with patient experience. The primary problem is that it doesn't line up with how the term has been used for decades in research and clinical practice.
Further to the reference I provided above, here are some more study cohorts' SF36 PF scores to illustrate why a group with a mean SF36 PF of 77 cannot be described as severe. I'm mindful that (a) a prospective study would be expected to pick up more mild and marginal cases, which would raise the mean SF36 PF scores, and (b) the Jason cohort is much earlier in the illness than most other studies.
The median SF36 PF of 345 people with ME/CFS attending eleven specialist NHS clinics in the UK in 2014-2016 and providing follow-up data was 40 (interquartile range 25-60), and they had been sick for a median of 26 months (IQR 12-80); see Collin & Crawley 2017 Appendix.
Rekeland et al. 2022 has a nice table showing SF36 PF scores of various study cohorts. Now some of these deliberately excluded milder patients, so that will reduce average scores, but it still gives an idea of the physical function of people who want intervention:
And finally, this is table 3 from van Campen et al. 2020:
There's one other study I've come across that labelled their cohort as severe when that was not warranted:
Friedberg et al. 2016 "Efficacy of two delivery modes of behavioral self-management in severe chronic fatigue syndrome". In that study, baseline SF36 PF was 38, so pretty average, not severe, though some people with severe ME/CFS were included. 37% of the sample was working (21 were working full-time, 12 half-time and 17 part-time). Again, just not a severe batch of patients. But Friedberg et al. defended the label in the paper (perhaps in response to a reviewer?) as follows:
Illness-related fatigue and functional limitations in the study sample appeared to be severe based on the fact that the vast majority of participants presented baseline FSS scores equal to or higher than 5.0 (98.5%), considered to be ‘high fatigue’ severity, that is, two SDs above healthy controls.[41,42] In addition, the mean FSS score at baseline, 6.52 (SD = .49), was about two SDs higher than that found in a primary care sample of combined unexplained chronic fatigue and CFS patients that underwent a similar cognitive-behavioral self-management intervention.[20] Also, on the baseline web diary, the mean seven-day numerical fatigue rating (0–10) was 6.92 (SD = 1.28) which was about 1 SD higher than that reported in a previous CFS study sample recruited from the local community.[43] Finally, the mean SF-36 PF score of 37 was 2.0 SDs below the US population mean, [44] and 1.0 SD below SF-36 PF scores averaged over five published CFS self-management studies (Discussion). Based on population data, only 16% of individuals with SF-36 PF scores in the 30–39 range can walk one block or more.[23] These statistics suggest that our sample was severely ill.
If the goal is to get more funding and recognition, as indicated above, then severity labels should not have been invoked in this series of papers on the prospective IM study. It looks more like crying wolf, and could be counterproductive. A solution? Call the groups what they are. Label them by the criteria they fulfilled, e.g. Fukuda only, IOM only, CCC only, Fukuda/CCC, Fukuda/IOM, Fukuda/CCC/IOM. And then collapse them into fewer groups in whatever way the data leads, e.g. Fukuda only vs ≥2 criteria, or Fukuda + CCC/IOM. I would love to see an analysis where we see this at baseline, 6 months and 7 years, so we learn which criteria do a better job of picking out the people who remain ill.
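To make the labelling suggestion concrete, here is a rough sketch of how groups could be named by the criteria they fulfil and then collapsed. It is only an illustration: the participant IDs and their criteria sets are made up, and the function names (`criteria_label`, `collapsed_label`) are my own, not anything from the published papers.

```python
from collections import Counter

# Fixed order for building labels; the three criteria names come from the post.
CRITERIA = ("Fukuda", "CCC", "IOM")


def criteria_label(met):
    """Name a group by the criteria fulfilled, e.g. 'Fukuda only' or 'Fukuda/CCC'."""
    names = [c for c in CRITERIA if c in met]
    if not names:
        return "none"
    if len(names) == 1:
        return f"{names[0]} only"
    return "/".join(names)


def collapsed_label(met):
    """Collapse to fewer groups: single-criterion labels stay, the rest become '≥2 criteria'."""
    count = sum(c in met for c in CRITERIA)
    if count >= 2:
        return "≥2 criteria"
    return criteria_label(met)


# Hypothetical participants (invented for illustration, not study data).
participants = {
    "P01": {"Fukuda"},
    "P02": {"Fukuda", "CCC"},
    "P03": {"Fukuda", "CCC", "IOM"},
    "P04": {"IOM"},
    "P05": set(),
}

labels = {pid: criteria_label(met) for pid, met in participants.items()}
collapsed = {pid: collapsed_label(met) for pid, met in participants.items()}
group_sizes = Counter(collapsed.values())

for pid in participants:
    print(pid, labels[pid], "->", collapsed[pid])
```

Running the same labelling at each time point (baseline, 6 months, 7 years) would give the group sizes needed to compare which criteria track with remaining ill, without attaching a severity label to any of them.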
If future studies want to focus on more severe patients, then assess severity. Decode ME did a nice job of this (#27 in questionnaire 1):
