Discriminating Myalgic Encephalomyelitis/Chronic Fatigue Syndrome and comorbid conditions using metabolomics in UK Biobank, 2024, Huang et al

I am pretty sure that any diagnostic procedure that does not provide information on mechanism will only ever be circular. Its validity is based on correlation with the uncertain clinical classification you want to do better than, but it will only ever be second best to that.

Jobbing doctors and patients think they want diagnostics. Specialist physicians spend their time clearing up the errors generated by that.

It's true that this pathway may not make diagnosis more accurate than the current average (which includes GP and specialist diagnoses), and it would not be standalone for specialists. The idea is that it will be better than average accuracy for GPs. The true aim is speeding up diagnosis: take a blood test, and a trained AI algorithm sees a convoluted signature in the structure of markers with <5% CV that distinguishes defined patients from comorbid disease.

This paper was a proof of concept; the exciting thing was that the disease score we got for ME/CFS was better than we expected. We are planning to validate, but also improve on, the concept with a different dataset.

The current average is that it takes 4 years to get a diagnosis in Australia. The aim is to speed that up so that some treatment strategies, like pacing, can be employed early. We see plenty of benefit in getting a faster diagnosis of ME/CFS, even if it's just to be told earlier in the pipeline that over-exertion can be harmful, reducing exacerbations.

Yes. Identifying the pathomechanism is the gold; this is just another path worth exploring in the meantime.
 
It's true that this pathway isn't going to make diagnosis more accurate on the average, it would also not be standalone for specialists. Idea is that it will be better than average for GPs. It's only going to take who has been diagnosed and speed up the process.

But is this real medicine? Have you had experience of how this actually works in clinical practice? Because it doesn't work the way most doctors assume it does.

We don't even know what we mean by a 'more accurate diagnosis'. We think that a group of people have a common element to the mechanism of their illness but we have every reason to think that the process is sufficiently complex and multifactorial for that to be true of a range of overlapping populations any one of which we might choose to think of as 'ME/CFS'. That is how it is for RA. There are half a dozen plausible biological groupings we could call 'core RA'. In practice we don't care because we have prognostic data relating to a range of markers.

Why would a test be better than a GP knowing the Canadian Consensus Criteria and applying them? The GP can do that on the spot, rather than wait a fortnight for lab tests to get collated. I don't know of any real-life diagnostic test that depends on measuring a load of things and putting them through some sort of algorithm or nomogram; you would need a dedicated system in the lab to do that.

And I don't follow how if it just takes those diagnosed it speeds anything up - they are already diagnosed surely?

As I see it pacing is the sensible thing to do for anyone with symptoms vaguely suggestive of ME/CFS anyway. Diagnostic certainty isn't that relevant. The real problem is that idiot physicians are going around suggesting that people with 'fatigue' should do exercises. Nobody should be advised to do exercises to get better, so making a diagnosis isn't so that you can treat people differently. If someone has an adverse response to exercise they certainly should not push through whatever a blood test shows. You already know enough to advise.

People have asked this before but I am not aware of any precedent where data trawling has led to a useful diagnostic test based on combined results from a range of tests for which there is no mechanistic interpretation. I doubt this is real medicine - it seems more like a statistician's fantasy. Do you know of clear examples?
 
It's the only realistic path without understanding mechanism. Any test without understanding of mechanism has to be developed to distinguish against other diseases, especially common comorbid ones. That's the major bottleneck and time suck for ME/CFS diagnosis.
That doesn’t really answer my question:
Are there any examples of this path ever working out for any disease?

In other words: why is this a «realistic path»?
 

The distinguishing of ME/CFS from similar or comorbid conditions is the bottleneck for diagnosis. Meeting the symptoms of the CCC is easy; that's not why it takes so long to diagnose ME/CFS. The time goes on excluding all other reasons for persistent fatigue.

This pathway is new; it has only become possible with the advent of AI, technologies that can find patterns that we can't see. It's new, and I understand there is going to be some pushback from some, but actually probably less than we initially thought. Medicine is embracing AI tools faster than we expected.
 

Current examples of these panels are used in cancer, identifying subtypes and markers of prognosis.

This technology is still young; if it had already been around for decades we'd have seen this trickle beyond cancer. As it stands, cancer leads in all precision tech.
 
Chris, @MelbME, can you please identify why my reasoning set out in this post is wrong, why Figure 1 and the conclusions drawn from it help us identify ME/CFS-specific pathology?
Chris, there are at least two issues here.


One is the point I made about Figure 1, where the comparison between the UK Biobank 'ME/CFS' cohort and a super-healthy subset (C2) from the UK Biobank suggests a whole range of issues with lipids including cholesterol. The result of that comparison is that you have claimed that issues with lipids including cholesterol are a characteristic of ME/CFS. But the super-healthy subset are unusual for their age, in their lack of cholesterol issues. A more appropriate comparison would have been a subset of C1 (the general non-ME/CFS cohort), matched for sex and BMI.

You haven't addressed that issue yet. Do you understand the point I am making? Do you agree that the super-healthy cohort was not an appropriate comparison - therefore making some of the conclusions in that paper and your other paper that did the same thing inaccurate? If not, can you explain why a comparison with a super-healthy cohort was appropriate, why my reasoning is wrong?
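The matched comparison being argued for can be sketched as a simple procedure: for each case, pick the unused general-cohort control of the same sex with the nearest BMI. The cohort names and data below are illustrative, not from the paper:

```python
# Hypothetical sketch of sex-exact, nearest-BMI 1:1 matching against a
# general cohort (e.g. a C1 subset). IDs, sexes and BMIs are invented.

def match_controls(cases, controls):
    """Greedy 1:1 matching: exact on sex, nearest on BMI, without replacement."""
    available = list(controls)
    matches = {}
    for case_id, sex, bmi in cases:
        candidates = [c for c in available if c[1] == sex]
        if not candidates:
            continue  # no same-sex control left unmatched
        best = min(candidates, key=lambda c: abs(c[2] - bmi))
        matches[case_id] = best[0]
        available.remove(best)  # each control used at most once
    return matches

cases = [("me1", "F", 27.0), ("me2", "M", 31.5)]
controls = [("c1", "F", 26.8), ("c2", "F", 22.0), ("c3", "M", 31.0)]
print(match_controls(cases, controls))  # {'me1': 'c1', 'me2': 'c3'}
```

Greedy matching is the simplest choice here; real studies often use caliper or optimal matching, but the point is only that the comparison group shares the cases' sex and BMI distribution.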


A second point is the approach of a 'diagnostic test' made from looking for correlations between various factors that have been measured in the members of the UK Biobank. I can't recall what the comparison population was for the model development. What was it?

I'm not sure about the statistical approach used, perhaps it was marvellous and will be useful with some future dataset that does not have the ME/CFS labelling problems of the UK Biobank ME/CFS cohort.

But, we have seen quite a number of these efforts that take a whole lot of data, most of which will be irrelevant or due to lifestyle changes, to produce an algorithm for separating people with an ME/CFS label from others. Your model is far from the only one, and measures of accuracy are pretty meaningless in these cases, particularly so with the UK Biobank data where we have a low confidence about the accuracy of the ME/CFS diagnoses, and particularly when the model is not validated in a separate data set.
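The worry about internal accuracy measures can be illustrated with a toy example: a classifier that effectively memorises its training set looks perfect on the data it was fit to, even when the labels are pure noise, and only a separate dataset reveals that it is at chance. Everything below is random, invented data:

```python
# A minimal 1-nearest-neighbour classifier fit to pure noise: training
# accuracy is perfect by construction, held-out accuracy is near chance.
import random

random.seed(0)

def nearest_neighbour_predict(train, point):
    """Return the label of the training point closest to `point`."""
    best = min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], point)))
    return best[1]

def make_noise(n, dims=10):
    """n samples of random features with random 0/1 labels (no real signal)."""
    return [([random.random() for _ in range(dims)], random.randint(0, 1))
            for _ in range(n)]

train = make_noise(60)
test = make_noise(200)

train_acc = sum(nearest_neighbour_predict(train, x) == y for x, y in train) / len(train)
test_acc = sum(nearest_neighbour_predict(train, x) == y for x, y in test) / len(test)
print(train_acc)  # 1.0: every training point is its own nearest neighbour
print(test_acc)   # hovers around chance (~0.5) on fresh noise
```

This is why a model that is never validated on a separate dataset can report impressive accuracy that means very little.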

Here again are the inputs to the 'diagnostic model':
The 28 selected features for the LightGBM model, which was selected as the optimum model (from Fig 4). (Negative or positive association, effect size, individual p value, vs healthy controls, from Supp Data 14.)

Frequency of tiredness/lethargy (positive, 1.35, ***)
Sleep duration (positive, 0.47,***)
Whole body pain (positive, 3.82, ***)
Headache pain (positive, 0.96, ***)
Female (data not given for comparison with healthy controls)
Alcohol consumption (negative, -1.26, ***) (note: only options were 'never or previous' and 'current')
Smoker (negative, -0.21, *)
IPAQ (Physical Activity) High (negative,-0.76, ***)
Nucleated Red Blood Cell % (positive, 0.01, NS)
Nucleated Red Blood Cell Count (negative, -0.01, NS)
Facial pain (positive, 1.82, ***)
Stomach/abdominal pain (positive, 1.37, ***)
Hip pain (positive, 1.05, ***)

PUFA% (negative, -0.16, ***)
Total P (negative, -0.28, ***)
Leucine (positive, 0.03, NS)
Age at recruitment (data not given for comparison with healthy controls)
Acetone (negative, -0.10, **)
S-LDL-P (positive, 0.16, ***)
S-LDL-TG (positive, 0.27, ***)
Systolic Blood Pressure (negative, -0.07, *)
Acetoacetate (negative, -0.03, NS)
Frequency of depressed moods (positive, 0.73, ***)
Nap during Day (positive, 1.01, ***)
L-VLDL-Free Cholesterol (large Very Low Density Lipoproteins) (positive, 0.14, ***)
Sleeplessness/Insomnia (positive, 0.78, ***)
Immature Reticulocyte Fraction (positive, 0.20, ***)
M-VLDL-P (positive, 0.20, ***)
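The paper's model is LightGBM, but the general idea of collapsing a mixed feature list like the one above into a single disease score can be sketched with a toy linear scorer. The feature names follow the list; the weights and patient values are invented for illustration and are NOT the paper's:

```python
# Toy disease score: a weighted sum of features squashed through a logistic
# function. This is a sketch of the concept only, not the LightGBM model.
import math

def disease_score(features, weights):
    """Return a score in (0, 1) from weighted feature values."""
    z = sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

weights = {  # signs follow the reported direction of association; magnitudes invented
    "whole_body_pain": 3.82,
    "alcohol_current": -1.26,
    "ipaq_high": -0.76,
    "s_ldl_tg": 0.27,
}
patient = {"whole_body_pain": 1, "alcohol_current": 0, "ipaq_high": 0, "s_ldl_tg": 0.5}
print(round(disease_score(patient, weights), 3))  # 0.981
```

A gradient-boosted model differs in that it learns non-linear interactions between features rather than a fixed weight per feature, which is why its "signature" can be convoluted and hard to interpret.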
 
@MelbME, looking at this from a different perspective, is there any indication in the combination of test results the algorithm/AI identifies as very likely to indicate ME/CFS that does actually give useful clues to pathology? A bit like the slight genetic differences that point in useful directions in DecodeME, are there any useful directions you or the AI have spotted in your tests?
 

But hang on. Do you actually mean co-morbidities? Because something is only a co-morbidity if you have already diagnosed ME/CFS and are looking for something else as well. Hypertension is not a problem because you diagnose it by taking the blood pressure and always will be. Hypothyroidism is relevant because it might mimic ME/CFS and also perhaps because it might co-exist with ME/CFS. But that is sorted with a T4 and TSH, which all GPs know to do on someone fatigued. Depression covers all sorts of things. Most people with ME/CFS are depressed by having it. Depression might simulate ME/CFS (rather than being a co-morbidity) but the gold standard for sorting the two out will remain CCC, because there isn't any other definition of 'ME/CFS'. If there is a way of diagnosing some category of 'ME/CFS' other than CCC we cannot find it by looking for the best correlation with CCC, surely? If meeting CCC is easy then diagnosing ME/CFS is easy - by definition, since there isn't any other definition of a concept of 'ME/CFS'.

This pathway is new, it's only with the advent of AI, using technologies that can find patterns that we can't see.

But is it new? People have been going on about devising these diagnostic tests based on statistics for at least twenty years - and as far as I know got nowhere.

Medicine is embracing AI tools faster than we expected.

The medical Twitterati are always 'embracing' what sounds trendy. They were embracing this years ago. If you were saying this in 1999 I would have thought it plausible. Twenty-five years later I am less inclined to.
 

It's possible. I think the problem is that the signature is about finding ways to separate ME/CFS from many other conditions. It does this by finding factors specific to other conditions just as much as it does using factors for ME/CFS. For example, diabetes would probably contribute glucose and BCAAs to the algorithm, because they separate diabetes away from ME/CFS.

Genetic information is going to be limited to risk scores in ME/CFS, because there is very likely to be an environmental component to getting the disease. The proportion of the population with high genetic risk for ME/CFS could be large, perhaps 50% of the population.

I think the lipid changes are interesting. From this paper, the forest plots are interesting: they show markers that separate ME/CFS out from the other conditions. The lipids stand out as most interesting for ME/CFS. We have developed a hypothesis around lipids from this and other work.
 
That sounds like a much easier issue to solve than diagnosing ME/CFS, where we have no clue what to look at or whether we've got the relevant data at all.

How do those cancer models do when you take away the imaging of relevant tissue and the blood tests developed specifically for cancer?
 

Even 5 years ago, our discussions with the business groups at the University had pointed to problems they'd had in translating AI tools to GPs; that has largely fallen away over the past 5 years.

Application of the tech is still new; the potential was recognised decades ago, sure.

The comorbidities we used were the common ones that ME/CFS patients in UK Biobank actually had, ones enriched in ME/CFS against a general population background. We added hypertension to account for the lipid and lipoprotein elevations. The interesting thing is that the ME/CFS group (25% hypertension) had lipoprotein profiles that were close to equal to a group where 100% of people had hypertension.
 
Indeed, this is a niggling concern: you have a whole funding stream dependent on this finding that cholesterol and other lipids are unusual in ME/CFS (a finding based on a comparison of a group of people with a rather questionable ME/CFS diagnosis against a group of super-healthy people who have unusual lipid profiles for their age). And perhaps that funding stream is what is making it so hard to see the problems I am pointing out.

Can you please address my questions? Do you see the problem with comparing an ME/CFS cohort with a cohort that has been selected to not have issues such as high cholesterol? What was the comparison population used to make your model? If you think my arguments are wrong, why are they wrong?

I summarised the issue with the selection of a comparison group in my post upthread here.
 

We used non-diseased healthy controls because that is the clean baseline needed for our biomarker discovery. We ran logistic regression (odds ratios, adjusted for age/sex/meds) for every group against the same healthy cohort. Running each of the 7 pure comorbid cohorts against the exact same healthy cohort is what let us create the forest plots in Fig. 2 and quantify pleiotropy (overlap). Without a single shared healthy reference, you couldn't directly compare how strong a given VLDL signal is in ME/CFS vs hypertension vs migraine; with one, all effect sizes and directions become comparable on the same scale.
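A minimal sketch of this shared-reference idea, with invented counts (the paper's actual estimates come from covariate-adjusted logistic regression, not raw 2x2 tables): because every disease cohort is compared against the same healthy cohort, the resulting odds ratios land on one common scale.

```python
# Unadjusted odds ratio of being "marker-high" in each disease cohort versus
# one shared healthy cohort. All counts are hypothetical illustration only.

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 table: (a*d) / (b*c)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

healthy = (40, 960)  # marker-high, marker-normal counts in the shared healthy cohort
cohorts = {"ME/CFS": (90, 410), "hypertension": (150, 850)}

for name, (hi, lo) in cohorts.items():
    # same denominator cohort for every disease -> directly comparable ORs
    print(name, round(odds_ratio(hi, lo, healthy[0], healthy[1]), 2))
```

Swapping the healthy reference for a different comparison group would rescale every odds ratio at once, which is the sense in which the healthy cohort acts as a common anchor.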

The individual biomarker associations (vs healthy) establish the biological building blocks. The LightGBM model then takes those (plus baseline characteristics) and learns to discriminate ME/CFS from the realistic mixed population (C2 + all comorbidities). Healthy controls are the “negative anchor” the algorithm needs to learn what “normal” looks like before it can pick out ME/CFS in a noisy sea of overlapping diseases.
 

If you are concerned, then simply look at the forest plots. We have a 100% hypertension group there; compare ME/CFS to that group. The ME/CFS group had worse cholesterol and lipoprotein issues than a group made entirely of hypertensive patients.

The healthy control group was a baseline for us to scale all these other diseases against each other.
 
Can you elaborate on what those problems were and which use cases were discussed? And all AI isn't created equal, as you surely know, so it working for one GP application doesn't mean it will work for another. We need specifics to look at, and so far I haven't seen any situations that compare well to this one.
But surely you’re not trying to discriminate ME/CFS from healthy people? Why does «normal» even matter here? You’re trying to figure out if whatever is keeping this person ill is ME/CFS or X or Y.

And why is «normal» defined as super healthy?
 

The problems were around AI tools picking up patterns without explaining the why. Clinicians were wary of using tools if they themselves didn't understand the biology; I believe clinicians are taught this way. But as trust in AI grows, there is beginning to be trust in not needing to know the why, just that the AI analysis has seen a complex pattern and the outcome is beneficial.

Healthy was the scale used to compare the other diseases to ME/CFS. I think once you review the forest plots, and look at and understand them, you will understand why healthy was chosen as the scale.

Why do you guys keep saying super healthy? What is super healthy? These healthy controls are without diagnosed disease; that's how we usually use healthy controls. A non-ME/CFS control is someone who has disease but not ME/CFS.
 
I'm with you @MelbME on this.
I may be the only one here on S4ME who likes real healthy controls.
I understand that when doing CPET research you do not invite the Tour de France peloton as healthy controls.
But in this kind of research you want to find the differences between ME/CFS and healthy controls, not between the almost-sick and ME/CFS.
Are you trying to be holier than the Pope here @Hutan?
 

The wider population of the members of the UK Biobank can therefore also be expected to have issues with cholesterol. The wider population of the members of the UK Biobank is named C1. We know from Table 1 in this paper that around 12% of C1 have high total cholesterol. We know that the members of the UK Biobank tend to be a bit healthier than the average UK person of their age, but, even so, a significant number have those issues with high cholesterol.

Therefore, the members of the ME/CFS group in the UK Biobank are likely to also present with issues with cholesterol. And they do. We know from Table 1 that around 10% of the ME/CFS group have high cholesterol. That seems about right, considering the percentages of women in ME/CFS group and the C1 group (more women in the ME/CFS group).

We know that the C2 population of the UK Biobank consists of people who are super healthy. They have no co-morbidities. They are not your typical UK older person, they are not even typical for the typical UK Biobank member who tends to be a bit healthier than normal. The C2 members presumably have far fewer problems with cholesterol and other lipids. Table 2 in this paper confirms that - while 15.7% of the ME/CFS group are on statins to lower cholesterol, only 0.8% of the C2 super healthy group are. All of the literature I can find suggests that that level of statin use in the ME/CFS group is normal, perhaps even low, for older people in western countries including the UK. The level of statin use in the C2 super healthy people is definitely not normal for people of their age.
From your comments, it sounds as though you have not yet understood my point.

Your paper tells us that the general population in the biobank (C1) has 12% of people with high cholesterol. This is substantially lower than prevailing rates in the general population in the same age range. We know that the people in the UK Biobank tend to be a bit healthier than the general population.

Your paper tells us that the ME/CFS population in the biobank has 10% of people with high cholesterol. This suggests that the rate of high cholesterol is very similar in the ME/CFS group as in the general population. We know women tend to have lower rates of high cholesterol in the age range that most of the people in the Biobank would have been when the samples were taken, and there is a higher percentage of women in the ME/CFS group than in the general biobank population.

Compare that to the C2 population - yes, they really are super healthy. They are not just 'healthy controls'. And, if you are looking for biomarkers that can separate out people with ME/CFS from other illnesses, surely those other illnesses should be in your comparison population? The rate of high cholesterol in the super healthy group is 0.8%. In terms of blood lipid profiles, the average super healthy person in the biobank is not like the average person in the biobank, not at all. It makes no sense to say 'Ah! there are cholesterol issues in people with ME/CFS and that characterises the illness!' on the basis of a comparison with the super healthy group.

I think I addressed the issue of comparing with the hypertension group upthread, but I'll go find some data on that and come back to it.

Can you tell us what population group was used as the comparator for the production of the model?
 