Evidence for a Causal Association Between Human [CMV] Infection and Chronic Back Pain: A One‐Sample Mendelian Randomization Study, 2025, Naeini et al

Discussion in 'Other health news and research' started by forestglip, Apr 12, 2025.

  1. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    Evidence for a Causal Association Between Human Cytomegalovirus Infection and Chronic Back Pain: A One‐Sample Mendelian Randomization Study

    Maryam Kazemi Naeini, Maxim B Freidin, Isabelle Granville Smith, Stephen Ward, Frances M K Williams

    Background
    Chronic back pain (CBP) is a major cause of disability globally. While its etiology is multifactorial, specific contributing genetic and environmental factors remain to be discovered. Paraspinal muscle fat has been shown in human and preclinical studies to be related to CBP. One potential risk factor is infection by cytomegalovirus (CMV) because CMV is trophic for fat. CMV may reside in the paraspinal muscle adipose tissue. We set out to test the hypothesis that previous CMV infection is linked to CBP using a one‐sample Mendelian randomization (MR).

    Method
    The sample comprised 5140 UK Biobank participants with information about CMV serology and CBP status. A one‐sample MR based on independent genetic variants predicting CMV positivity was conducted in Northern European participants. To validate the association further, the MR study was repeated using a CMV polygenic risk score (PRS). As a negative control for confounding and spurious causal inference, we used Epstein–Barr virus (EBV) serology, because EBV is another common viral infection but is not trophic for adipose tissue.

    Results
    A genome‐wide association study for CMV seropositivity revealed 86 independent SNPs having p‐value < 2×10⁻⁴ that have been used to define genetically‐predicted categories of CMV infection risk. The CMV predicted categories were found statistically significantly associated with CBP (OR = 1.150; 95% CI: 1.005–1.317, p‐value = 0.043). Stronger significant results were obtained using the PRS for CMV seropositivity (OR = 1.290; 95% CI: 1.133–1.469, p‐value = 12E‐4). No such association was seen between EBV and CBP.

    Conclusion
    Our results provide evidence for a causal relationship between CMV infection and CBP. Further investigation is warranted to get insight into the mechanism by which CMV might contribute to the pathogenesis of CBP.
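
As a rough consistency check on the figures quoted in the Results, a p-value can be back-calculated from an odds ratio and its 95% confidence interval. This is only a sketch: it assumes the intervals are Wald intervals (symmetric on the log-odds scale), which the abstract does not state, and the function name `p_from_odds_ratio_ci` is made up for illustration.

```python
import math

def p_from_odds_ratio_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate a two-sided p-value from an odds ratio and its 95% CI.

    Assumption: the interval is a Wald interval, symmetric on the log scale.
    """
    # Standard error of log(OR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Figures as quoted in the abstract above.
p_snp_categories = p_from_odds_ratio_ci(1.150, 1.005, 1.317)
p_prs = p_from_odds_ratio_ci(1.290, 1.133, 1.469)

print(f"SNP-based categories: p ~ {p_snp_categories:.3f}")
print(f"Polygenic risk score: p ~ {p_prs:.1e}")
```

Under those assumptions the first value lands close to the reported 0.043, and the value for the PRS interval comes out on the order of 10⁻⁴.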

    Link | PDF (Spine) [Open Access]
     
    Turtle, Peter Trewhitt and ukxmrv like this.
  2. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,013
    Location:
    London, UK
    I have only looked at the abstract but I am not sure how they conclude that there is a causal relationship, unless they mean that a certain gene make-up seems to confer risk for both CMV infection and back pain.

    It sounds a bit odd to blame back pain on paraspinal fat infection since the pain localises to a well documented area of disc deterioration.

    Maybe I have not understood how the causal path is supposed to work.
     
    Peter Trewhitt likes this.
  3. Utsikt

    Utsikt Senior Member (Voting Rights)

    Messages:
    2,445
    Location:
    Norway
    I’ve often heard that people without back pain also often have things ‘wrong’ with their spine - and that that means that the ‘damage’ likely isn’t the cause of the pain.

    Would you say that’s wrong?
     
    Peter Trewhitt likes this.
  4. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    Yeah, I'm still trying to wrap my head around MR but I don't see how this shows CMV is causal for CBP as opposed to the reverse or the gene being causal for both.

    They speculate a bit here. It seems basically they think it's possible CMV infects adipose tissue in the back or intervertebral discs.
     
    Turtle, Peter Trewhitt and Utsikt like this.
  5. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,013
    Location:
    London, UK
    I think that is misleading. You might say that a lot of people have tiles missing on the roof who don't have water coming in. That doesn't mean that some people don't have water coming in because tiles are missing (as I have direct proof of).

    Almost everyone has degeneration of the cartilage of the joint at the base of the big toe (bunion joint) by sixty. Only a small proportion have pain. But it is probably the commonest joint in the foot to hurt as you get older. Equally, the joint gets a bit thicker in almost everyone. In a small proportion it rubs on a shoe and develops a 'bunion' bursa. But the fact that a lot of people don't have the bunion bursa doesn't mean it isn't due to the thickened joint rubbing.

    To me the argument quoted just doesn't follow.
     
    Peter Trewhitt and Utsikt like this.
  6. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,013
    Location:
    London, UK
    I read your summary, which is very helpful.

    I think maybe they have good reason to think that some genes create risk for a process that contributes both to CMV infection and back pain. In a sense that is a causal relation but an epiphenomenal 'steam whistle' one.

    As a back pain sufferer I have noted that whenever I have any viral infection my back pain (due to severe degeneration of the L4-5 disc) is worse and sometimes worse enough to stop me walking upright. In between times there are times when I forget I have a back problem. My guess is that during viral infection there are systemic changes to nerve sensitivity mediated either by cytokines or purely neural signals possibly. It might be that CMV tricks the immune system (every common organism tricks the immune system one way or another) in some way that depends on your settings for cytokine production. The same settings may mean you have more nerve sensitisation with any old virus.
     
    Peter Trewhitt, forestglip and Utsikt like this.
  7. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,120
    You need to be careful how you use the genetic database. For example there are a number of genes associated with the number of cars in a household and certain variants that have p < 10⁻⁸.
    Just a completely random example.
    http://geneatlas.roslin.ed.ac.uk/trait/?traits=662
     
    Turtle, Peter Trewhitt and Utsikt like this.
  8. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,013
    Location:
    London, UK
    Yes, that's our next-door neighbours' pedigree.
    Do you have a source for that @wigglethemouse? It was something that worried me when DecodeME was being designed and I hassled Chris Ponting about it - genes that encode for answering internet requests for patient DNA for instance (different alleles encoding for answering internet requests for healthy DNA).

    I have been impressed how well they have handled those sorts of issues in Edinburgh. King's ought to know how to handle them but you never can be sure.
     
    Peter Trewhitt likes this.
  9. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,120
    Peter Trewhitt likes this.
  10. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    Do you know if there's any way for the general public to see which specific genotypes are significant for traits on there?
     
    Peter Trewhitt likes this.
  11. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,120
    Search=>Region of gene.
    e.g. Numbers of cars in household for region 6 shown as a plot (another option is table)
    http://geneatlas.roslin.ed.ac.uk/re...10000&minregion=0&chrom=6&representation=plot

    There is a way to get access to the raw data on the UK Biobank website, which a user here did to disprove a genetic association that Chris Ponting (or a colleague) found for ME/CFS. In that case the variant turned out to be very rare and carried by only one person, something that proper QA should exclude, for example by dismissing variants with an allele frequency below a certain cutoff.

    EDIT: To clarify/correct: the mistake was in a Chris Ponting guest blog identifying P4HA1.

    Chris admitted the error and the gene was listed in the final paper but shown as not relevant due to a MAF of 0.00029 despite a p of 2x10E-12. Paulo Maccallini was the one that found the error and Chris acknowledged the error here. Explanation of how the error was found was in the comments of either that blog, or Paulo's blog, but relevant blog pages no longer have comments showing. I seem to remember that it was the comments that provided a link to the raw data.
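
The kind of QA filter described here can be sketched in a few lines. This is a minimal illustration, not the pipeline used by any of the groups mentioned: the 1% allele-frequency cutoff, the variant names, and the two "common" variants are assumptions, and the quoted p-value is read as 2e-12.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str
    maf: float      # minor allele frequency
    p_value: float  # association p-value from the GWAS

def passes_qc(variant, min_maf=0.01, max_p=5e-8):
    """Keep a variant only if it is common enough AND genome-wide significant.

    min_maf=0.01 is an assumed cutoff for illustration; real studies choose their own.
    """
    return variant.maf >= min_maf and variant.p_value < max_p

variants = [
    Variant("rare_hit", maf=0.00029, p_value=2e-12),  # MAF and p as quoted in the post above
    Variant("common_hit", maf=0.20, p_value=1e-9),    # made-up example
    Variant("common_null", maf=0.35, p_value=0.4),    # made-up example
]

kept = [v.name for v in variants if passes_qc(v)]
print(kept)
```

The point of the sketch is that a very small p-value alone is not enough: the rare variant is dropped by the frequency filter even though it clears the significance threshold.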
     
    Last edited: Apr 12, 2025
  12. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    Thanks for that. I looked around and found that Search > By Significance was more what I wanted. Just a list of the SNPs. I don't really understand what the imputed ones are so I clicked to sort by "imp. score" twice and the 6 genotyped SNPs for vehicle count are listed first.

    One SNP is related to alcohol metabolism which is kind of interesting.
     
  13. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    514
    Location:
    USA
    From my understanding (remembering one lecture several months ago), the idea of MR is to find alleles that are so strongly correlated to your “risk exposure” that they are basically a proxy for it.

    For example, if you wanted to know whether there’s a causal relation between a certain protein level with some health outcome, you might zero in on an allele with a premature stop codon that renders the protein inert. In that way, it’s like doing an experiment where you managed to knock down the protein in the person.

    Since an allele present from birth is very unlikely to be affected by environmental/lifestyle factors, you’re minimizing confounding variables as much as you possibly can in an observational study—making it approach the level of control of confounders you’d have in an empirical setup (which is what allows causal inference).

    So for this study, they basically zeroed in on alleles that are so highly correlated with CMV susceptibility that having them basically guaranteed CMV infection at some point in life. Added: Then they check how much their “guarantee” for CMV infection is also associated with back pain.

    The ability to call it a causal relationship depends on how much their “proxy” genes fulfill a set of assumptions, including their degree of proxy-ness.

    The use of EBV as negative control also further accounts for confounders that would originate just from having a viral infection. The idea being that if any similar infection confers the same risk for back pain, you’d see it come up with confirmed EBV infection as well. So by filtering out the ones that were also associated with EBV, they can be more certain that the relationship is CMV-specific.

    Hopefully I’m getting the explanation right here—it’s been a while since I learned about it. They specifically noted that they had to raise their p-value threshold in this study to find their strongly-related-alleles, which they seem to attribute to smaller sample size. I don’t think I have the expertise to determine if they overcame that limitation successfully, though.
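
The core idea in this post (an allele acting as a proxy that is untouched by lifestyle confounders) can be illustrated with a small simulation. This is a toy sketch under invented assumptions: a continuous exposure and outcome rather than CMV serostatus and back pain, a single allele, and made-up effect sizes. None of it is taken from the paper.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, true_effect=0.3, seed=1):
    """Toy cohort: allele -> exposure -> outcome, plus a hidden confounder."""
    rng = random.Random(seed)
    alleles, exposure, outcome = [], [], []
    for _ in range(n):
        g = sum(rng.random() < 0.3 for _ in range(2))  # 0, 1 or 2 risk alleles
        u = rng.gauss(0, 1)                            # unmeasured confounder
        x = 1.0 * g + u + rng.gauss(0, 1)              # exposure: allele + confounder + noise
        y = true_effect * x + u + rng.gauss(0, 1)      # outcome: exposure + confounder + noise
        alleles.append(g)
        exposure.append(x)
        outcome.append(y)
    return alleles, exposure, outcome

alleles, exposure, outcome = simulate()

# Ordinary regression slope of outcome on exposure: biased by the confounder.
naive_slope = covariance(exposure, outcome) / covariance(exposure, exposure)

# Allele-based estimate: allele-outcome association divided by allele-exposure association.
mr_estimate = covariance(alleles, outcome) / covariance(alleles, exposure)

print(f"true effect 0.30, naive {naive_slope:.2f}, allele-based {mr_estimate:.2f}")
```

With these made-up parameters the naive slope is pulled well above 0.3 by the confounder, while the allele-based ratio should land near the true value, up to sampling noise. That only holds because the simulated allele affects the outcome solely through the exposure, which is exactly the assumption being debated in this thread.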
     
    Last edited: Apr 13, 2025
  14. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    This part I'm not sure is right. I don't think the initial alleles necessarily have to be nearly perfectly associated with the variable in question. At least in this paper, it seems like they attempted to just pick out any alleles that match the standard GWAS significance threshold (and then chose a lower threshold when they didn't find enough alleles).
    And then for example there are MR studies looking for whether something like depression causes some other condition. I think it'd be pretty big news if there were SNPs that basically guaranteed getting depression.
     
  15. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    I'm only just learning the basics, but I have questions about this study. I'm looking at a paper about Mendelian randomization that makes me think they might not have fulfilled the required assumptions:

    The first assumption is basically just finding alleles that are associated with CMV in the GWAS. They lowered the threshold for inclusion quite a bit from the standard GWAS threshold (p<5x10⁻⁴ instead of p<5x10⁻⁸), so I'm not sure how associated they really are.

    The second assumption is that there are no common causes of both the alleles in question and back pain. For example if people from Europe happen to more often have these alleles, and people from Europe also happen to more often have back pain.

    The third assumption is that the alleles do not cause the outcome (back pain) independently of the risk factor (CMV infection). For example if an allele both suppressed the immune system to increase risk of CMV infection and also directly affected the structure of a protein in the spine that led to pain, then you couldn't say that the study shows that CMV infection increases risk of back pain, since it might just be the gene directly increasing the risk of both. This would be called horizontal pleiotropy, and is the main thing I'm wondering about with this study. How do we know these alleles don't cause back pain independently of CMV?
    They give an example of a method to test the last assumption:
    This would be doing something like only looking at the subset of people who are seronegative for CMV. If the alleles in question are still associated with back pain in this subset of people, then it's more likely that the main hypothesis - allele causes increased CMV which causes back pain - is not correct, since the alleles are associated with back pain in the absence of CMV. I don't see that they did something like this.
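
The seronegative-subset check described above can be sketched with simulated data. Everything here is hypothetical: one allele, a binary "infected" flag, a continuous "pain" score, and invented effect sizes. The sketch compares a scenario where the allele acts only through infection with one where infection does nothing and the allele affects pain directly (horizontal pleiotropy).

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

def simulate(direct_effect, causal_effect, n=50000, seed=7):
    """Return rows of (allele count, infected flag, pain score)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g = sum(rng.random() < 0.3 for _ in range(2))          # 0, 1 or 2 risk alleles
        infected = 1 if rng.random() < 0.4 + 0.1 * g else 0    # allele raises infection risk
        pain = causal_effect * infected + direct_effect * g + rng.gauss(0, 1)
        rows.append((g, infected, pain))
    return rows

def wald_ratio(rows):
    """Allele-outcome association divided by allele-exposure association."""
    g, x, y = zip(*rows)
    return covariance(g, y) / covariance(g, x)

def slope_in_uninfected(rows):
    """Slope of pain on allele count among people who were never infected."""
    g = [r[0] for r in rows if r[1] == 0]
    y = [r[2] for r in rows if r[1] == 0]
    return covariance(g, y) / covariance(g, g)

clean = simulate(direct_effect=0.0, causal_effect=0.5)        # allele acts only via infection
pleiotropic = simulate(direct_effect=0.2, causal_effect=0.0)  # infection is harmless here

for label, rows in (("clean", clean), ("pleiotropic", pleiotropic)):
    print(label, round(wald_ratio(rows), 2), round(slope_in_uninfected(rows), 3))
```

In the pleiotropic scenario the ratio estimate suggests a sizeable "causal" effect even though infection does nothing in the simulation, and the allele remains associated with pain among the uninfected. In the clean scenario that second association should be close to zero, which is the signal the subset check looks for.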


    They did do some other kind of control, where they also looked for alleles associated with EBV seropositivity, then checked if those alleles were associated with back pain as well, which they were not. I guess this was to check that it is CMV specifically, and not infections in general, that is the risk factor for back pain.
    They say they used some tools to do the MR. Maybe these take care of doing tests to check that these assumptions hold?
    Anyways, I'm wondering if this study is rigorous enough to say this:
    I know very little about MR, so please no one take my ramblings as meaningful. They may very well have done everything the right way, and I just don't understand it.
     
  16. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    514
    Location:
    USA
    Yes, sorry, I was exaggerating for the purposes of brief explanation. Individually an allele is not likely to be a perfect predictor for anything other than a Mendelian disease. It’s likely to be a polygenic trait, which is why they use GWA to generate a polygenic risk score.

    And then the polygenic risk score won’t be a perfect predictor numerically because one individual will likely not have every single risk variant associated with the exposure—they’ll only have a couple. But the point of this step is to find a strong enough allelic predictor score that it can serve as a reasonable proxy for CMV infection. That’s what assumption one in your linked paper is getting towards
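
For readers unfamiliar with polygenic risk scores, the usual construction is a weighted sum of risk-allele counts. The sketch below uses three invented SNPs with invented weights; it is not the score from the paper, and `polygenic_risk_score` is a hypothetical helper name.

```python
def polygenic_risk_score(genotype, weights):
    """Sum of risk-allele counts (0, 1 or 2) weighted by per-SNP effect sizes."""
    return sum(weights[snp] * genotype[snp] for snp in weights)

# Hypothetical per-SNP weights (for example log odds ratios from a GWAS).
weights = {"snp_a": 0.12, "snp_b": 0.05, "snp_c": -0.08}

# Two made-up people: number of copies of each SNP's effect allele.
person_1 = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
person_2 = {"snp_a": 0, "snp_b": 1, "snp_c": 2}

score_1 = polygenic_risk_score(person_1, weights)
score_2 = polygenic_risk_score(person_2, weights)
print(score_1, score_2)
```

Nobody needs to carry every risk variant for the score to be informative: person 1 scores higher simply because they carry more copies of the alleles with positive weights.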
     
    Last edited: Apr 13, 2025
    Peter Trewhitt and forestglip like this.
  17. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    514
    Location:
    USA
    That would be part of the screening for allelic traits to include/exclude from the polygenic risk score. Alleles would be included if they were associated with back pain only when CMV infection was also confirmed (assumption 3 in your linked paper). If an allele was also associated with back pain in individuals who were seronegative, it’s filtered out.

    Added: generating an MR proxy is the process of filtering down to a list of predictors that don’t violate any of the MR assumptions. That doesn’t completely guarantee that the predictors don’t violate the assumptions in a way that the authors couldn’t think of, it just means they’ve excluded anything that clearly violates them.

    They might then do additional filtering to address things that might be less obvious, which is why they’re doing the EBV negative control. So that’s a major limitation of these types of studies, but in that way it is like an experimental setup where you just might not be aware of confounders despite your best efforts to predict and mitigate them.

    The main difference is that a laboratory setting can usually limit the amount of uncontrolled confounders—in a population-wide study like this, the potential for confounders is increased by virtue of variability in humans and human life.

    Added: focusing on alleles in the first place is meant to limit that potential for confounders substantially for reasons I already mentioned (lifestyle factors that make you more likely to catch CMV won’t actually change someone’s alleles), but it’s not a guarantee. The hope is that the alleles are randomly distributed amongst the population to control additional confounders like you would in a randomized clinical trial.
     
    Last edited: Apr 13, 2025
  18. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,065
    Ah, are you saying some of that filtering is implied in an MR study and was probably done here? I was assuming there'd be explicit details about their methods for that.

    I'm looking at the depression MR study I linked above. It's in a Nature journal so I assume there's a better chance everything is done properly.

    I notice they also relax the threshold:
    But then they also say they run a test, which was mentioned in the MR guidelines I quoted above, to make sure these SNPs are not "weak instruments". The thread's paper doesn't mention doing this.
    For the other assumptions, they list the tests they ran to check that. I don't know the details of any of these tests, but that's the kind of thing I was looking for in the thread's paper, which seems kind of low on detail.
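
One common check for weak instruments is the first-stage F statistic, with F > 10 as a widely used rule of thumb; whether that is the exact test used in the depression paper is not stated in this thread. Below is a minimal sketch for a single instrument with simulated data. The effect sizes and sample size are invented.

```python
import random

def first_stage_f(instrument, exposure):
    """F statistic for a regression of the exposure on one instrument."""
    n = len(instrument)
    mean_g = sum(instrument) / n
    mean_x = sum(exposure) / n
    s_gg = sum((g - mean_g) ** 2 for g in instrument)
    s_xx = sum((x - mean_x) ** 2 for x in exposure)
    s_gx = sum((g - mean_g) * (x - mean_x) for g, x in zip(instrument, exposure))
    r_squared = s_gx ** 2 / (s_gg * s_xx)
    # With one regressor, F = R^2 / (1 - R^2) * (n - 2).
    return r_squared * (n - 2) / (1 - r_squared)

def simulate(effect_of_allele, n=5000, seed=3):
    """Allele counts and an exposure that depends on them by the given amount."""
    rng = random.Random(seed)
    g = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]
    x = [effect_of_allele * gi + rng.gauss(0, 1) for gi in g]
    return g, x

f_strong = first_stage_f(*simulate(effect_of_allele=0.5))
f_weak = first_stage_f(*simulate(effect_of_allele=0.01))
print(f"strong instrument F = {f_strong:.1f}, weak instrument F = {f_weak:.1f}")
```

An allele that barely shifts the exposure will typically give an F statistic in the low single digits, which is the situation a relaxed p-value threshold risks letting in.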
     
  19. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    514
    Location:
    USA
    I believe that process is what “two-stage least squares regression” refers to in their methods, though I may be wrong about that. If they didn’t do that step, it just simply wouldn’t be an MR study. I couldn’t imagine how that would get published, though I suppose it’s always possible.

    Added: though as you note in the comparison to the depression MR study, it seems like they didn’t do additional tests beyond that to double check assumptions. This thread’s paper seems to be meeting the bare minimum for that assumption.
     
    Last edited: Apr 13, 2025
  20. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    514
    Location:
    USA
    Okay yes, two-stage least squares (2SLS) regression does work by estimating the causal effect as a ratio: the effect of the allele on the outcome divided by its effect on the exposure.

    So in effect, you're starting out by looking at all the alleles which are associated with CMV from the GWAS, and then regressing out the degree to which those same alleles are predictive of back pain on their own. What you're left with is the degree to which the allele is predictive of back pain only when there was CMV infection. [Added:] If that degree is 0, then the allele is effectively weighted out of the prediction (though not explicitly filtered out).
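
For reference, the textbook form of 2SLS with a single instrument can be written out explicitly. The data below are made up for eight imaginary people and use continuous variables; a binary outcome such as CBP would normally need a logistic second stage, so treat this only as a sketch of the mechanics.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope

# Made-up data: risk-allele count, exposure level, outcome level.
allele_count = [0, 0, 1, 1, 2, 2, 0, 1]
exposure = [0.1, 0.4, 0.9, 1.2, 2.1, 1.8, 0.3, 1.0]
outcome = [1.0, 1.3, 1.4, 1.9, 2.4, 2.2, 0.9, 1.6]

# Stage 1: predict the exposure from the allele count alone.
stage1_intercept, stage1_slope = fit_line(allele_count, exposure)
predicted_exposure = [stage1_intercept + stage1_slope * g for g in allele_count]

# Stage 2: regress the outcome on the genetically predicted exposure.
_, two_stage_slope = fit_line(predicted_exposure, outcome)

# Ratio form: allele-outcome slope divided by allele-exposure slope.
_, allele_outcome_slope = fit_line(allele_count, outcome)
ratio_estimate = allele_outcome_slope / stage1_slope

print(two_stage_slope, ratio_estimate)
```

With one instrument the two-stage slope and the ratio are algebraically the same number, so the two printed values agree up to floating-point rounding.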

    It looks like linkage disequilibrium and horizontal pleiotropy are the two biggest possible confounders here [added:] even after you do 2SLS--they tested for LD, but didn't seem to do anything additional for the latter. I'm definitely not enough of an expert to determine if there was justification to skip that in this case.

    Either way, it seems like examples of more robust analyses typically include the MR-Egger intercept test for horizontal pleiotropy, as @forestglip brings up.
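
As a rough sketch of what the MR-Egger intercept measures: per-SNP effects on the outcome are regressed on per-SNP effects on the exposure, weighted by the precision of the outcome estimates, and an intercept away from zero hints at directional pleiotropy. The summary statistics below are invented and noise-free, and a real analysis would also test the intercept against its standard error. MR-Egger is usually applied to summary-level data, so how it would fit this one-sample study is left open here.

```python
def egger_regression(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted regression of SNP-outcome on SNP-exposure effects.

    Returns (intercept, slope). SNPs are first oriented so that every
    exposure effect is positive, as MR-Egger requires.
    """
    oriented = [
        (-bx, -by, se) if bx < 0 else (bx, by, se)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    ]
    weights = [1 / se ** 2 for _, _, se in oriented]
    total = sum(weights)
    mean_x = sum(w * bx for w, (bx, _, _) in zip(weights, oriented)) / total
    mean_y = sum(w * by for w, (_, by, _) in zip(weights, oriented)) / total
    s_xy = sum(w * (bx - mean_x) * (by - mean_y) for w, (bx, by, _) in zip(weights, oriented))
    s_xx = sum(w * (bx - mean_x) ** 2 for w, (bx, _, _) in zip(weights, oriented))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope

# Invented summary statistics for five SNPs.
causal_slope = 0.5
beta_exposure = [0.10, 0.15, 0.20, 0.25, 0.30]
se_outcome = [0.02, 0.03, 0.02, 0.04, 0.03]

# Scenario A: SNPs affect the outcome only through the exposure.
no_pleiotropy = [causal_slope * bx for bx in beta_exposure]
# Scenario B: every SNP also adds 0.02 to the outcome directly.
directional = [0.02 + causal_slope * bx for bx in beta_exposure]

intercept_a, slope_a = egger_regression(beta_exposure, no_pleiotropy, se_outcome)
intercept_b, slope_b = egger_regression(beta_exposure, directional, se_outcome)
print(intercept_a, slope_a)
print(intercept_b, slope_b)
```

In scenario B the slope still recovers the simulated causal effect and the extra direct effect shows up in the intercept, which is why the intercept is used as a pleiotropy check.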

    Here are some additional sources I looked at to understand the 2SLS regression method better:
    Mendelian Randomization as an Approach to Assess Causality Using Observational Data
    Basic Concepts of a Mendelian Randomization Approach
     
    Last edited: Apr 13, 2025
