Researchers propose deep trawl of DNA to help uncover the causes of ME/CFS (Simon McG blog)

Barry · Jun 28, 2019

Sly Saint said:
I would worry that they ended up recruiting from fatigue clinics.

I think it would be crucial that existing diagnoses not be blindly accepted, but that the researchers apply their own diagnostic assessments.

dreampop · Jun 28, 2019

It would be difficult, but I think this is a challenge with clear parameters and a clear objective, and people are much more likely to participate than in surveys and protests.

About getting the right patients, I was thinking that having stations in ME/CFS doctors' offices and asking patients to participate might provide some of the data, and those could be a reference against the mail-in data as a whole.

But, it would probably take at least a year to collect the data.

Would multiple countries introduce too much variability?

Cinders66 · Jun 28, 2019

http://www.megaresearch.me.uk/qanda/

I know it’s not being called MEGA anymore and is supposed to be different, but I would appreciate expansion on how it is going to be different. MEGA phase 1, which they tried to get funding for, was supposed to be collecting the Bioresource of 10,000. This time the biobank is involved but the numbers are larger. Previously it was decided the criteria should be broad and be NICE and that CFS clinics would be the source of patients. Phase one of MEGA was not going to include the severely affected; will the new study? Holgate is still involved; has this study really changed, or is it different but along the same lines, like the CMRC version 1 and 2?

There are explanations on the MEGA website about the process of getting funding, e.g. seeking approval for the idea first before submitting. I don’t know why patients were not told more of what was happening this time. I expect a similar website will be put up for the new project and a Q&A would get basic facts out. Maybe they are waiting for funding assurance this time.

Adrian · Jun 28, 2019

Simon M said:
This is the key question. Not all the details have been finalised yet. Certainly it will include self-report of a ME/CFS diagnosis (I'm not sure whether this will be simply by a doctor or by a consultant). There will also be some questionnaire screening (note that the UK ME/CFS biobank is involved).

There's also the possibility of a validation step: taking a subsample of those people who pass the questionnaire screen and giving them a thorough medical assessment/diagnosis.

Yes, with larger sample sizes you have more leeway (because non-cases should be easier to identify in a cluster) but I think it's important there is some quality control of who exactly is passing the standard screen.

I suspect that part of the issues will come with the algorithms used to look at the data.

A lot of data analytics techniques that use an L2 norm (least squares) are quite sensitive to mislabelled data (i.e. labelled ME but not actually ME). Other techniques (L1: median and MAD) are more robust. But I'm not sure if this is relevant here.
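
To make the L2 versus L1 point concrete, here is a minimal sketch in Python with made-up numbers (not study data). It treats a couple of wrongly labelled samples as extreme values mixed into an otherwise tight group and compares how far the mean (the least-squares estimate of location) and the median move. Whether this maps onto GWAS case labelling is an open question; the values and the `mad` helper are purely illustrative.

```python
import statistics


def mad(values):
    """Median absolute deviation: a robust (L1-style) measure of spread."""
    centre = statistics.median(values)
    return statistics.median([abs(v - centre) for v in values])


# Hypothetical measurements from correctly labelled samples.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
# Same data plus two hypothetical mislabelled samples with very different values.
contaminated = clean + [50.0, 55.0]

mean_shift = statistics.mean(contaminated) - statistics.mean(clean)
median_shift = statistics.median(contaminated) - statistics.median(clean)

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
print(f"stdev clean/contaminated: {statistics.stdev(clean):.2f} / {statistics.stdev(contaminated):.2f}")
print(f"MAD   clean/contaminated: {mad(clean):.2f} / {mad(contaminated):.2f}")
```

In this toy case the mean and standard deviation are dragged a long way by the two odd samples, while the median and MAD barely move.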

Looking at the Manhattan graph, I think it uses a p test to derive the importance of the features. I guess if the sample is from multiple different diseases then there may be lower thresholds that apply and also more things showing up above the threshold. I'm assuming that this produces a much smaller set of features that can then be clustered (or visualized with things like PCA or t-SNE) and compared with information from the patients around diagnostic info to see if there are subsets etc. The question here is around further analysis and whether this can be sufficient to pick out interesting data?
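
A minimal sketch of the kind of per-SNP test that sits behind each dot on a Manhattan plot, assuming a simple chi-square test on allele counts (real GWAS pipelines usually use regression with covariates, so this is illustrative only). The function name and the allele counts are invented; 5e-8 is the conventional genome-wide significance threshold.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional genome-wide significance level


def allele_chi2_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi2, p_value).
    """
    observed = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper tail of chi-square equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: 10,000 cases and 10,000 controls (two alleles each),
# with the variant at 22% in cases and 20% in controls.
chi2, p = allele_chi2_test(4400, 15600, 4000, 16000)
height_on_plot = -math.log10(p)  # Manhattan plots show -log10(p) per SNP

print(f"chi2 = {chi2:.2f}, p = {p:.2e}, -log10(p) = {height_on_plot:.2f}")
print("genome-wide significant:", p < GENOME_WIDE_THRESHOLD)
```

With these invented numbers, a two-percentage-point frequency difference in 10,000 cases falls short of the genome-wide threshold, which is one reason GWAS sample sizes are so large.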

Simon M · Jun 28, 2019

Amw66 said:
The difficulty, given history, may be targeting the patient group without alienating ( or being seen to denigrate) those who don't meet strict selective criteria.

Yes, handling people who don't/won't meet the selection criteria will be tricky. But it may be useful to have a group within the GWAS of people who have chronic fatigue but don't meet the criteria.

Amw66 said:
Could this be the project to drive wholesale cooperation?

And hopefully all the different patient groups and charities will come together to back this study: it will need the support of almost everybody in the online community. We will also need to reach beyond into the much bigger pool of patients who are not part of the online community, which will mean using traditional media as well as finding ways to reach people online beyond our bubble.

dreampop said:
Would multiple countries introduce too much variability?

I think it would be essential to recruit beyond the UK in order to reach the 20,000 target (even 10,000, I suspect).

Cinders66 said:
I don’t know why patients were not told more of what was happening this time. I expect a similar website will be put up for the new project and a Q&A would get basic facts out. Maybe they are waiting for funding assurance this time.

I think the researchers are indeed trying to secure funding before discussing this project in detail in public. They haven't hidden the fact that they are making an application, though, and Chris was very generous with his time helping me with this blog.

Adrian said:
I suspect that part of the issues will come with the algorithms used to look at the data.

A lot of data analytics techniques that use an L2 norm (least squares) are quite sensitive to mislabelled data (i.e. labelled ME but not actually ME). Other techniques (L1: median and MAD) are more robust. But I'm not sure if this is relevant here.

Looking at the Manhattan graph, I think it uses a p test to derive the importance of the features. I guess if the sample is from multiple different diseases then there may be lower thresholds that apply and also more things showing up above the threshold. I'm assuming that this produces a much smaller set of features that can then be clustered (or visualized with things like PCA or t-SNE) and compared with information from the patients around diagnostic info to see if there are subsets etc. The question here is around further analysis and whether this can be sufficient to pick out interesting data?

I have no idea of the answer to those statistical questions, but I do know they will be recruiting experts in GWAS analysis to be part of the team if it gets funded.

Robert 1973 · Jun 29, 2019

Many thanks for another excellent article @Simon M.

I’m not well enough to read all the replies at the moment so apologies if this point has already been made. My only concern about the proposed study (which is the same point I made in response to MEGA) is about the number of samples that would be needed to identify abnormalities if it transpired that ME/CFS is not one illness but a number of different illnesses with different causal pathways.

Simon writes:

GWAS need very large samples if they are to discover the very small effects that influence the likes of height and Type II diabetes. Even a study with 2,000 patients is now considered small and so studies typically use at least 10,000 patients to generate robust results. Studies of diseases include similar numbers of healthy patients to aid comparison.

If 10,000 patients would be required to generate robust results for very precisely defined conditions or characteristics such as diabetes or height, then would it not be necessary to use a far greater sample size if it transpired that ME/CFS included more than one illness? Say, for example, that 10% of people with ME have a different illness with a different cause and mechanism: could this be detected in a 10,000 ME/CFS sample size, or would it be necessary to have 10,000 samples of this subgroup and therefore a total ME/CFS sample size of 100,000 in order to identify the abnormal SNPs in this subgroup?

I don’t mean this to be a criticism of the proposal – I am very encouraged by it – I am just interested to understand, and to ensure that the possibility of there being subgroups with different illnesses has not been overlooked.

Hoopoe · Jun 29, 2019

A variation of the question above: if enough patients take part, would it be possible to recognize several subgroups from the data?

Andy · Jun 29, 2019

Robert 1973 said:
Many thanks for another excellent article @Simon M.

I’m not well enough to read all the replies at the moment so apologies if this point has already been made. My only concern about the proposed study (which is the same point I made in response to MEGA) is about the number of samples that would be needed to identify abnormalities if it transpired that ME/CFS is not one illness but a number of different illnesses with different causal pathways.

Simon writes:

If 10,000 patients would be required to generate robust results for very precisely defined conditions or characteristics such as diabetes or height, then would it not be necessary to use a far greater sample size if it transpired that ME/CFS included more than one illness? Say, for example, that 10% of people with ME have a different illness with a different cause and mechanism: could this be detected in a 10,000 ME/CFS sample size, or would it be necessary to have 10,000 samples of this subgroup and therefore a total ME/CFS sample size of 100,000 in order to identify the abnormal SNPs in this subgroup?

I don’t mean this to be a criticism of the proposal – I am very encouraged by it – I am just interested to understand, and to ensure that the possibility of there being subgroups with different illnesses has not been overlooked.

I think this is an issue that this study will possibly answer, and that doing it in the way proposed is the only sensible way to go at the moment. If/when we get 10k samples analysed in this way, the results surely would go a long way to either proving what we suspect, that what is currently called ME/CFS is a collection of illnesses or a collection of sub-groups, or showing that, despite the disparity of symptoms that many of us have, the majority seem to have the same illness/there aren't many sub-groups.

So we will have either fairly clear results, indicating that sample size was enough and there isn't actually much variation in us all, or the results are muddled, which then indicates the need for a follow-on study with a much larger sample size. As we've seen recently with Karl Morten's application for funds, the MRC aren't keen on what they consider to be overly ambitious projects, so I think that trying for a sample size which is generally reckoned to be normally sufficient is the best way to go.

Robert 1973 · Jun 29, 2019

Andy said:
If/when we get 10k samples analysed in this way, the results surely would go a long way to either proving what we suspect, that what is currently called ME/CFS is a collection of illnesses or a collection of sub-groups, or showing that, despite the disparity of symptoms that many of us have, the majority seem to have the same illness/there aren't many sub-groups.

Presumably another possibility would be that ME/CFS is one illness with a common causal pathway but that there are no significant predisposing genetic factors – ie some type of as yet unidentified pathogen. (Or that this may be the case for a subgroup.) Whatever the truth, it would seem that the proposed GWAS can only help to further our understanding – provided it is overseen by the right people, which seems probable with Chris Ponting behind it.

Yvonne · Jun 30, 2019

To compare GWAS for ME with the highly-studied diabetes*, which receives extensive clinical overview, is overoptimistic in my opinion. For any finding from GWAS to be meaningful, you need some basic understanding of the disease, and especially some biological markers, which ME lacks due to decades of clinical and scientific neglect. As others have pointed out, there is a high risk of confounding because of the lack of consensus on the case definition. And 20,000 is a small sample for GWAS.

Most genetic variants identified from GWAS studies have teeny-tiny influences on the disease, and their function is unknown. How would this help health professionals understand our symptoms, explain PEM, drug intolerance etc?

What we desperately need is basic research that ME has been deprived of for decades – basic clinical, epidemiological measuring, and biochemical and physiological research.

For example, how many of you have had a brain MRI or other scan? How many have had a tilt table test?

We urgently need existing clinical testing to be offered to all patients and studies which build on the small studies with interesting findings, eg impaired cardiac function.

*Just did a quick search on PubMed: diabetes has 674,364 articles and ME has 8,370 articles

Adrian · Jun 30, 2019

Robert 1973 said:
If 10,000 patients would be required to generate robust results for very precisely defined conditions or characteristics such as diabetes or height, then would it not be necessary to use a far greater sample size if it transpired that ME/CFS included more than one illness? Say, for example, that 10% of people with ME have a different illness with a different cause and mechanism: could this be detected in a 10,000 ME/CFS sample size, or would it be necessary to have 10,000 samples of this subgroup and therefore a total ME/CFS sample size of 100,000 in order to identify the abnormal SNPs in this subgroup?

I think it is worth saying that sample sizes are typically based on power calculations and by looking at the experience of other studies. I am sure that the numbers here will have been done in this way.
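
For anyone curious what such a power calculation looks like, here is a rough sketch using a normal approximation for comparing allele frequencies between cases and controls. It is not the calculation used for the proposal; the function name, the frequencies, the effect size and the `affected_fraction` parameter (the share of cases who actually carry the raised frequency) are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_power(n_cases, n_controls, control_freq, freq_increase,
                 affected_fraction=1.0, alpha=5e-8):
    """Approximate power of a two-sided z-test comparing allele frequencies.

    affected_fraction is the share of cases who actually carry the raised
    frequency; the rest are assumed to look like controls at this SNP.
    """
    case_freq = control_freq + affected_fraction * freq_increase
    # Each person contributes two alleles.
    se = math.sqrt(case_freq * (1 - case_freq) / (2 * n_cases)
                   + control_freq * (1 - control_freq) / (2 * n_controls))
    z_effect = (case_freq - control_freq) / se
    z_critical = -STD_NORMAL.inv_cdf(alpha / 2)
    # Ignores the negligible chance of crossing the threshold in the wrong direction.
    return STD_NORMAL.cdf(z_effect - z_critical)


for fraction in (1.0, 0.5, 0.1):
    for n in (10_000, 100_000):
        power = approx_power(n, n, control_freq=0.20, freq_increase=0.02,
                             affected_fraction=fraction)
        print(f"affected fraction {fraction:.1f}, n = {n:>7,}: power = {power:.3f}")
```

Under these assumptions the signal scales with the affected fraction while the noise shrinks only with the square root of the sample size. So if a subgroup making up 10% of cases cannot be analysed separately, matching the power of a study of that subgroup alone would take roughly 100 times as many cases, not 10 times.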

Yvonne · Jun 30, 2019

I'm sorry to pour cold water on this (and Simon M's excellent blog) but I'm imagining in five years' time if the results are disappointing and they say:
"We spent £millions studying thousands of patients on GWAS and found nothing, so we're not funding any more research into ME".

If you go fishing, you go to the spot where you're most likely to catch. IMO, all the people spending time on this could be putting their efforts into more fruitful areas of research.

Simon M · Jun 30, 2019

Subgroups

I think this is really one for @Chris Ponting but I don't think he'll be around for a bit so let me try to answer some of these points.

Robert 1973 said:
If 10,000 patients would be required to generate robust results for very precisely defined conditions or characteristics such as diabetes or height, then would it not be necessary to use a far greater sample size if it transpired that ME/CFS included more than one illness? Say, for example, that 10% of people with ME have a different illness with a different cause and mechanism: could this be detected in a 10,000 ME/CFS sample size, or would it be necessary to have 10,000 samples of this subgroup and therefore a total ME/CFS sample size of 100,000 in order to identify the abnormal SNPs in this subgroup?

There are not many "very precisely defined" diseases. Type II diabetes, for instance, is not a single entity but several conditions, and that is actually the norm.

However, if there are small subgroups then it is unlikely that the GWAS will find any significant results for those particular subgroups, but that doesn't stop it finding results for other, larger, subgroups.

What has happened with other illnesses is that sample size determines how many significant SNPs are found; it is not generally an all-or-nothing result. (It is, of course, possible that a GWAS will find nothing for ME/CFS.)

Andy said:
So we will have either fairly clear results, indicating that sample size was enough and there isn't actually much variation in us all, or the results are muddled, which then indicates the need for a follow-on study with a much larger sample size. As we've seen recently with Karl Morten's application for funds, the MRC aren't keen on what they consider to be overly ambitious projects, so I think that trying for a sample size which is generally reckoned to be normally sufficient is the best way to go.

That's a good point. As I said earlier, no one knows the correct size until the first GWAS has been done, and that is where we are here. Bigger studies will find more, but the MRC (and NIHR, who might co-fund this) won't be keen to fund too big a study for an initial result.

strategist said:
A variation of the question above: if enough patients take part, would it be possible to recognize several subgroups from the data?

I think that is the idea. Or rather than subgroups, it might deliver signs of very different causes: perhaps some to do with the immune system and others to do with the nervous system. Or maybe even something else. And it will be possible to see if the results map to different case definitions.

Another possibility is that even different subgroups will share a causal pathway (as well as having distinct features). That would increase the chances of finding genetic differences in the common causal pathway.

Robert 1973 said:
Presumably another possibility would be that ME/CFS is one illness with a common causal pathway but that there are no significant predisposing genetic factors – ie some type of as yet unidentified pathogen. (Or that this may be the case for a subgroup.)

This is worth exploring. Let's assume that the single cause is a specific pathogen. Even here, you would expect to find some genetic signals, e.g. relating to antiviral parts of the immune system if it was a specific virus. In fact, such a finding would point towards a pathogen as a/the cause.

That is how a GWAS works: if there is an environmental factor like an infection, it is likely to throw up genetic differences that affect the ability of the body to respond to that infection.

Robert 1973 · Jun 30, 2019

Thanks again for answering all these questions, Simon.

Simon M said:
This is worth exploring. Let's assume that the single cause is a specific pathogen. Even here, you would expect to find some genetic signals, e.g. relating to antiviral parts of the immune system if it was a specific virus. In fact, such a finding would point towards a pathogen as a/the cause.

That is how a GWAS works: if there is an environmental factor like an infection, it is likely to throw up genetic differences that affect the ability of the body to respond to that infection.

This is very interesting and not something I had appreciated or understood. If GWAS had been available before the discovery of, say, HIV, how would GWAS have helped to identify it? My understanding is that, with a tiny number of exceptions, without treatment, anyone who is infected with HIV will eventually develop AIDS, regardless of their genes. Presumably there may be SNPs which might be helpful in identifying genes which affect behavioural characteristics which might increase the chances of being infected with HIV, but I don’t understand how they could have helped to identify HIV.

I would be very interested to understand this if anyone can explain.

Simon M · Jun 30, 2019

@Yvonne, I am glad you liked the blog, even if you don't like the idea of the GWAS. I will try to address your main points.

Yvonne said:
What we desperately need is basic research that ME has been deprived of for decades

I would agree with that because that's what's needed to make progress and to develop treatments to help people with ME. I would also argue that is exactly why we need a GWAS, because it is well designed to help identify the causes of the illness. @Jonathan Edwards put it very well here (taken from my blog):

“The success of research into causes of disease hinges on someone suddenly having a brilliant idea… yet for ME/CFS nobody has a strong enough lead to show everyone the way forward. So it makes sense to set up a comprehensive fishing trip to see if we can trawl up some clues. Genetic screening [a GWAS] is probably the best bet for finding such clues.”

The good thing about GWAS is that it doesn't require a basic understanding of the disease to find something.

Yvonne said:
If you go fishing, you go to the spot where you're most likely to catch. IMO, all the people spending time on this could be putting their efforts into more fruitful areas of research.

I would say over the last decade the standout findings are rituximab, T cell clonal expansion and the Nanoneedle salt stress test.

Of these, rituximab has already been explored. Mark Davis is continuing to work on T cell clonal expansion, and Chris Ponting is running a PhD aiming to replicate the findings (with support from the Wellcome Sanger Institute in Cambridge). Ron Davis continues to work on the Nanoneedle: he has applied to the NIH for funding and is also backed by the awesome fundraising power of OMF.

So I would argue that the most striking findings in the field are already being followed up. When it comes to other findings, I think that inevitably different people have different opinions on which is most important. But for the reasons outlined in this post and thread, I think that GWAS is a strong candidate and well worth pursuing.

[missed this para out originally]
"Most genetic variants identified from GWAS studies have teeny-tiny influences on the disease, and their function is unknown."
As you say, the individual genetic variants have a small effect on disease, but as in the example of a pathogen I gave above, those small effects can identify important causes.

And yes, most genes are poorly understood and I love that some authors coined the term "ignorome" to describe how little we know about most genes. However, if a poorly understood gene is identified by research into a poorly understood illness, that is surely a strong candidate for exploration. There are ways to track down function, including gene expression databases and lab work.

I wouldn't be at all surprised if at least some of the causes of ME/CFS turned out to be down to mechanisms that we don't even know about at the moment. A GWAS is a good way to find those kinds of things.

Hoopoe · Jun 30, 2019

Simon M said:
Mark Davis is continuing to work on T cell clonal expansion

I remember hearing during one of the recent conferences that there wasn't actually a difference between patients and controls. As far as interesting leads go, we seem to have: findings indicative of brain inflammation or abnormal brain metabolism, the observation that something in the blood is disrupting cells (I put the nanoneedle here), and findings that suggest that metabolism is affected. Then there's also the metabolic trap and a line of research suggesting there are blood flow regulation issues / dysautonomia.

wigglethemouse · Jun 30, 2019

Simon M said:
That is how a GWAS works: if there is an environmental factor like an infection, it is likely to throw up genetic differences that affect the ability of the body to respond to that infection.

Here is an example to describe what @Simon M is saying. Some folks believe mutations in the gene MBL2 play a role in making folks susceptible to ME. However this is quite a common mutation. Without a good quality study with enough participants to rule out statistical effects we do not know if this gene mutation really is more prevalent in the ME population.

The change of a single amino acid in the mannose-binding lectin subunit eliminates its ability to assemble into the functional mannose-binding lectin. Similarly, certain mutations in the promoter region of the MBL2 gene reduce production of the mannose-binding lectin subunit, leading to a decreased number of subunits available for protein assembly and a reduction in the amount of functional protein. With decreased levels of mannose-binding lectin, the body does not recognize and fight foreign invaders efficiently. Consequently, infections can be more common in people with this condition.

https://ghr.nlm.nih.gov/gene/MBL2#conditions

This is just one example. There are many gene mutations that can affect the body's ability to fight infection.

Yvonne · Jun 30, 2019

Thanks for your response, Simon, it's much appreciated.

Sasha · Jun 30, 2019

I think this is a hugely exciting project. If we'd had this done twenty years ago, who knows where we'd be now? This basic science is exactly what we need, given the lack of a strong lead in any particular direction.

I understand people's frustration at the time-scale but the faster patients sign up (and spit!), the faster it will get done. If we all pull hard together and do everything we can to spread the word about the study, I think we could recruit on a large scale, pretty quickly. Even five years ago I don't think we could have, but in these days of #MEAction and the huge amount of online networking that we now have, I think we're ready for this.

I have no MEGA qualms about this at all, with Chris Ponting leading. I really hope it gets funded and I'll be first in line to spit in a tube when and if that time comes...

What I especially like is that getting spit from even the most severe patients should be possible and that they will presumably be able to be included.

Thanks for an excellent blog, @Simon M.

Adrian · Jul 1, 2019

Simon M said:
[missed this para out originally]
"Most genetic variants identified from GWAS studies have teeny-tiny influences on the disease, and their function is unknown."
As you say, the individual genetic variants have a small effect on disease, but as in the example of a pathogen I gave above, those small effects can identify important causes.

Is it that each variant has a small effect or could it be that combinations of variants have an effect but not necessarily single ones? I think you talked about variants leading to being tall but each adding a mm but I could see that certain combinations add much more and individually a variant adds little - the correlations would still be there but an analysis would need to be done to look for clusters of variants working together. I guess from a biological perspective the variants would control different protein production(?) so perhaps combinations of these have an effect rather than a single one?
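
A toy illustration of the question, assuming a pure interaction between two hypothetical variants A and B (the names, frequencies and effect size are made up, not taken from any study). It shows how a single-variant analysis would see only a small average effect for A even when the combination of A and B has a large one.

```python
def mean_trait(carries_a, freq_b, joint_effect):
    """Expected trait shift for people with a given A status.

    Assumes variant B is carried independently of A with probability freq_b,
    and that the trait shifts by joint_effect only when both are carried.
    """
    total = 0.0
    for carries_b, probability in ((True, freq_b), (False, 1 - freq_b)):
        shift = joint_effect if (carries_a and carries_b) else 0.0
        total += probability * shift
    return total


freq_b = 0.10        # hypothetical: 10% of people carry variant B
joint_effect = 10.0  # hypothetical: +10 units, but only with both variants

marginal_effect_a = (mean_trait(True, freq_b, joint_effect)
                     - mean_trait(False, freq_b, joint_effect))

print(f"effect of A averaged over everyone: {marginal_effect_a:.1f}")
print(f"effect of A in people who also carry B: {joint_effect:.1f}")
```

So in this made-up case the correlation with A alone is still there, as suggested, but it understates what the pair does together; picking that up would need an analysis that tests combinations explicitly.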
