Preprint Transfer of IgG from Long COVID patients induces symptomology in mice, 2024, Vidarsson+

Discussion in 'Long Covid research' started by SNT Gatchaman, Jun 1, 2024.

  1. butter.

    butter. Senior Member (Voting Rights)

    Messages:
    226
    Could someone with more spoons summarise Dr. Edwards' and Dr. Armstrong's points in a few paragraphs?

    I will then use this info and run it past two more experts in mouse modelling, ask for improvements and then send it to the authors of the study.
     
    Last edited: Jun 2, 2024
  2. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,646
    Location:
    Belgium
    Can anyone explain why they pooled the IgG per subgroup? I would think that you get the most information if each participant's IgG was given to a separate mouse (if this is possible). That is the interaction you want to test, with as many independent observations as possible so that you can see their spread.

    As I understand it, giving the same pooled IgG to many different mice doesn't increase the information much. It's a bit like doing another blood draw in the same patients and controls or using another hometrainer for exercise tests in the same group of participants. If they would test their pooled IgG from 8 patients in 1000 mice, it wouldn't provide much more evidence that there is something wrong with the antibodies of LC patients.

    I wonder if the statistical tests they performed took this into account (or whether they accidentally treated the same pooled IgG given to different mice as independent observations).
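
    A rough sketch of the point above, under a simple assumed model: the mean response of a group of mice is the sum of a pool-specific effect (which donors happened to be in the pool) and averaged mouse-to-mouse noise. The function name and both variance values are made up for illustration and are not taken from the preprint.

```python
def variance_of_group_mean(pool_var: float, mouse_var: float, n_mice: int) -> float:
    """Variance of the mean response of n_mice that all received the same pool.

    pool_var  : hypothetical variance between pools made from different donor sets
    mouse_var : hypothetical variance between mice given an identical pool
    """
    # Mouse noise averages out with more mice; the pool term does not.
    return pool_var + mouse_var / n_mice


# Illustrative, made-up variances (not estimated from any data).
POOL_VAR = 1.0
MOUSE_VAR = 4.0

for n in (8, 80, 1000):
    print(n, round(variance_of_group_mean(POOL_VAR, MOUSE_VAR, n), 3))
```

    However many mice receive the same pool, the uncertainty never drops below the pool term, which is why 1000 mice on one pool would say little more about LC patients in general than 8 mice do.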
     
    Peter Trewhitt, Wonko, EndME and 2 others like this.
  3. MelbME

    MelbME Senior Member (Voting Rights)

    Messages:
    108
    @butter.

    This is actually all you need as a summary. I didn't actually add much more. Just that an acute infection control would have been helpful.
     
  4. MelbME

    MelbME Senior Member (Voting Rights)

    Messages:
    108
    Mice vary a lot, like people, especially in behavior. So if you are going to use behavioral measurements then you really need 10 or so mice per IgG sample, as you can see in their behaviour data (each dot on the boxplot is a mouse).

    If they wanted to see each person's IgG impact then they have to get enough IgG from the patient serum to give to 8 mice. I'm guessing they didn't have enough serum per patient. Or didn't plan to have 250 mice involved in the study.

    In fact, given they have 8 people give IgG to 8 mice, I'm guessing they planned to give 1 patient's IgG to 1 mouse in the original planning phase but suddenly realized that mice have variance in behavioural measures. I only say that because it's quite coincidental that they had 8 patients pool IgG for 8 mice, it could have been 5 patients for 10 mice or 9 patients for 7 mice or any combination.
     
    ukxmrv, Hutan, Binkie4 and 7 others like this.
  5. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,646
    Location:
    Belgium
    Thanks for the explanation @MelbME although I must admit that I still don't fully get it.

    If the problem is the large variation in mice behaviour, then I don't see how pooling the patient IgG into 3 groups would help to solve that. One might need a lot of mice and measurements to balance out their variation and get a significant result. But the number of mice in the experiment and thus the variation in their behavioral measurements stays the same whether you pool or not.

    As I see it, pooling only throws away variance and information from the patient samples, and does not help to reduce the variation in mouse behavior. I have no background in statistics or biomedicine though, so don't take my comments too seriously, just trying to understand.
     
    Peter Trewhitt, Wonko, MelbME and 2 others like this.
  6. MelbME

    MelbME Senior Member (Voting Rights)

    Messages:
    108
    You would get much wilder variation in the groups if you did it that way and little chance of statistical significance. Though technically you are right, at the end of the day you couldn't point at 1 mouse with 1 person's IgG and know if it was the IgG or the mouse that made it behave that way. So you'd be back to viewing it as 8 mice in a group and you'd have the same takeaway.

    Just spitballing, but what they could have done is put the mice through the tests, then give them IgG, and then put them through the tests again. Then you'd be looking at the drop-off in performance (the delta) per mouse. Giving the IgG of 1 individual to 1 mouse may then have been possible, though they still might have wanted 3 mice at minimum with this setup.
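
    A minimal simulation sketch of why the before/after delta helps, assuming each mouse has a stable personal activity level plus small day-to-day noise, and that repeating the test has no effect of its own (a control group would still be needed to check that). The function name and all numbers are hypothetical.

```python
import random
import statistics


def simulate(n_mice=1000, trait_sd=10.0, noise_sd=2.0, igg_effect=-3.0, seed=0):
    """Return (SD of post-IgG scores, SD of per-mouse deltas, mean delta)."""
    rng = random.Random(seed)
    before, after = [], []
    for _ in range(n_mice):
        trait = rng.gauss(50.0, trait_sd)  # stable per-mouse activity level
        before.append(trait + rng.gauss(0.0, noise_sd))
        after.append(trait + igg_effect + rng.gauss(0.0, noise_sd))
    # Subtracting each mouse's own baseline cancels its stable trait.
    deltas = [a - b for a, b in zip(after, before)]
    return statistics.stdev(after), statistics.stdev(deltas), statistics.mean(deltas)


sd_after, sd_delta, mean_delta = simulate()
print(f"SD of raw post-IgG scores: {sd_after:.2f}")
print(f"SD of per-mouse deltas:    {sd_delta:.2f}")
print(f"mean delta:                {mean_delta:.2f}")
```

    In this sketch the deltas are far less spread out than the raw scores, because the between-mouse differences drop out and only the day-to-day noise remains.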
     
    ukxmrv, Hutan, EndME and 4 others like this.
  7. SNT Gatchaman

    SNT Gatchaman Senior Member (Voting Rights)

    Messages:
    4,806
    Location:
    Aotearoa New Zealand
    Hutan, Peter Trewhitt, EndME and 2 others like this.
  8. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,936
    Location:
    London, UK
    I think they say they used 800uL of serum from each patient. It shouldn't have been hard to get 6.4ml. Pooling serum actually degrades the information so I cannot see another reason. If they didn't have enough mice they could have studied fewer patients but mixing all the patients together just gives us a nonsense. This is particularly so for autoantibodies because the activity of the antibody can vary five hundredfold between normal and a positive case.
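
    The arithmetic behind both points, as a small sketch. It assumes each of 8 mice would need the same 800 uL equivalent, that pooling is by equal volume, and that antibody activity adds linearly in a pool. The activity units and the 7-to-1 split of donors are invented for illustration.

```python
SERUM_PER_MOUSE_UL = 800  # volume per patient as recalled in the post above
MICE_PER_DONOR = 8        # group size used in the study

# Serum needed from one patient to treat a full group of mice individually.
needed_ml = SERUM_PER_MOUSE_UL * MICE_PER_DONOR / 1000
print(f"serum needed per patient: {needed_ml} ml")

# Hypothetical pool: 7 donors with normal activity (1 unit) and 1 positive
# donor with five-hundredfold activity.
activities = [1.0] * 7 + [500.0]
pooled = sum(activities) / len(activities)
share_of_positive = 500.0 / sum(activities)
print(f"pooled activity: {pooled}")
print(f"share contributed by the single positive donor: {share_of_positive:.1%}")
```

    Under these assumptions nearly all of the pool's activity comes from one donor, so the pooled result cannot say whether the effect is typical of the group or driven by a single case.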

    So, yes, with individual case samples you would get wide variation in results, but that is the actual biology you are looking for, not noise.
     
  9. MelbME

    MelbME Senior Member (Voting Rights)

    Messages:
    108

    They had a mouse for every sample. Makes me think they planned to do one person per mouse but then realised they forgot to account for inter-mouse variability and would've struggled to get it published. So they decided to pool IgG.

    But yeah, the right thing to do would have been to use 5 times as many mice. Cost was possibly a factor. They should get funding to follow up.
     
  10. Paraprosdokian

    Paraprosdokian Established Member

    Messages:
    20
    I was quite excited when I saw this paper because if it is correct it would be a big step towards understanding the cause of LC (and ME/CFS). They identified groups of patients classified by biomarkers and linked them to symptomology (as expressed in mice).

    Unfortunately, I have concluded they didn't bring enough evidence to the table. They only had 8 mice in each group so they struggled to meet statistical significance. And without that link to symptomology, what is the point of their classification? It looks like they just cherry-picked whatever results got statistical significance. I'm not an expert, so please take the below with a grain of salt.

    My first issue is that they grouped people by IgG status. These groups do not bear a strong resemblance to their definitional criteria.

    LC Group 1 is composed of people with high GFAP, NFL and TAU levels. Yet their chart shows some members had no GFAP levels and some had no NFL. The distribution of most TAU levels in group 1 doesn't seem that different from group 2 (or even group 3). See Figure 2.

    LC Group 2 "consists of 10 patients that had higher levels of IFN-a2a and IFN-B compared to group LC-3". This seems like a pointless comparison when the three highest IFN-a2a levels are in LC1 patients. The same is true for the three highest levels of IFN-B. There's also a very high overlap between LC2 and LC3 levels of IFN-a2a.

    Group LC-3 have lower TAU, type 1 IFN, IL-1b, IL-6 "compared to LC-2" (but not LC-1?). Yet there are some members with similar levels of each of these molecules to the other two groups. Look at the whisker charts. They are not distinct groups!

    The proteomic groupings are more distinct. But they essentially say some of us have above average levels of complement and some of us have low levels. That is literally always true - half are above average and half below average by definition. It's only important if you can show clinical significance to that fact. They attempt to do this by showing that it correlates with the symptoms in mice.

    But they only had 8 mice in each group. I do not think that's enough to have confidence in any results based on statistical significance or not. Yet the authors place a great deal of emphasis on the fact some results are significant and some are not.

    The authors say that only LC2 mice reduced their activity levels and others did not. The mice injected with healthy IgG increased their locomotor activity by around 40% after injection. The mice injected with LC3 IgG also increased their activity levels by a similar amount. I assume that this wasn't caused by the injection but just normal variance in activity. So there's a lot of variability in how mice move. Mice injected with LC2 IgG reduced their activity by 40% on day 1. But is that because the injection caused the change or because mice activity levels vary a lot? Their activity levels returned to normal after that. They actually rested more (were more immobile) on day 5 than the healthy mice.

    If you think about it, it doesn't make sense that only some LC mice reduced their activity levels. 29 out of 34 humans had PEM. Why was only the PEM of the LC2 humans contagious?

    Similarly, the authors say that LC1 and LC3 mice had increased pain responses. But if you look at the chart, the LC2 mice also have increased pain response. It just failed to reach statistical significance. Maybe because n=8. Can we really be confident LC1 and LC3 IgG caused pain but LC2 didn't?

    The authors also say that only LC1 and LC3 mice had increased temperature dysregulation. But again, you can see that all the mice (including the mice with healthy IgG) have increased temperature dysregulation. But only the measured effect of two groups was large enough to reach statistical significance.

    So it looks like all three LC groups might have had increased pain, increased temperature dysregulation and lower activity. But only some reached statistical significance because the test was underpowered. So what's the point of splitting these groups up?
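
    To put a rough number on "underpowered", here is a minimal simulation sketch. It assumes a plain two-group comparison with a pooled-variance t-test (the preprint's actual analysis may differ), normally distributed responses, and a hypothetical true effect of 0.8 standard deviations. Function names and parameters are illustrative only.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 14 degrees of freedom,
# i.e. valid only for two groups of 8.
T_CRIT_DF14 = 2.145


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for equal group sizes."""
    n = len(a)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    return (statistics.mean(a) - statistics.mean(b)) / (2 * pooled_var / n) ** 0.5


def rejection_rate(effect_size, n_per_group=8, n_sims=4000, seed=1):
    """Fraction of simulated experiments with |t| above the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        if abs(t_statistic(treated, control)) > T_CRIT_DF14:
            hits += 1
    return hits / n_sims


print("power with a true effect of 0.8 SD:", rejection_rate(0.8))
print("false-positive rate with no effect:", rejection_rate(0.0))
```

    Under these assumptions a real and fairly large effect is detected in well under half of the simulated experiments, so a group that "failed to reach significance" with n=8 is weak evidence that it has no effect.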

    It's also curious that the authors didn't line up these LC groups with human symptomology. Did the LC1 and LC3 humans have increased pain and temperature issues? Did LC2 group have worse PEM? That would have given a bit more confidence in these findings.

    I think the underlying theory holds some promise. But they really need to redo this study with more mice.
     

    Last edited: Jun 3, 2024
  11. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,646
    Location:
    Belgium
    I think that would be the correct way to do this experiment.

    I believe the authors' significance testing is somewhat misleading as it only takes into account the variability (in the measured behavioural response) of the mice, not the variability (in IgG) of human participants. In the calculation of the standard deviation shown in figure 5, the human participants and their IgG remains the same in each group, it is only the mice that change.

    So it seems like they are treating the IgG of participants as a constant, as if it was a fixed intervention and the mice were participants in a treatment trial, getting either healthy IgG or patient IgG (in three different variants).

    But the IgG they got from the 15 healthy controls is not representative of IgG of all healthy people, it is a small sample taken from it so it likely differs substantially from the true population mean. If you would repeat this experiment multiple times, take IgG from another 15 healthy controls, pool them and transfer it into 8 different mice, you will probably get a lot of different looking points and lines than the one on figure 5. Perhaps many will look more like the 3 LC measurements than the HC line.

    It is this variability that we need to know if the measurements of the LC groups are unusual or could have happened by chance. Normally you would estimate this by looking at the variance that you have within your sample (the 15 healthy controls) but because they pooled the IgG before transfer that information is lost.
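
    A small simulation sketch of that thought experiment: draw 15 healthy donors again and again, pool them, and look at how much the pools differ from each other. It assumes, purely for illustration, that some relevant antibody "activity" is lognormally distributed among healthy donors and that an equal-volume pool has the mean activity of its donors. None of the numbers come from the preprint.

```python
import random
import statistics


def pool_value(rng, n_donors=15, sigma=1.0):
    """Mean 'activity' of an equal-volume pool from n_donors healthy donors."""
    donors = [rng.lognormvariate(0.0, sigma) for _ in range(n_donors)]
    return statistics.mean(donors)


rng = random.Random(42)
# Repeat the hypothetical experiment many times with fresh donors each time.
pools = [pool_value(rng) for _ in range(2000)]

print(f"lowest pool:   {min(pools):.2f}")
print(f"highest pool:  {max(pools):.2f}")
print(f"SD over pools: {statistics.stdev(pools):.2f}")
```

    The spread across pools is the variability that matters for judging whether an LC pool is unusual, and it is exactly what cannot be estimated when only one healthy pool was ever made.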

    There is a bit of a conflict between their actual significance testing where they ignored this variability and their statements that suggest they did test for this. For example, the abstract says: "These findings demonstrate that transfer of IgG from Long COVID patients to mice replicates disease symptoms." But how can they know this as they only did 4 independent measurements of IgG transfers? Their test assumes that the IgG from 15 HC is representative of all HC and ignores potential variability among them. Same for the LC samples.

    The lines of figure 5 do indicate that there is something odd with the LC groups: while the HC returns to baseline values after 15 days those in 2 LC groups remain lowered. But the pooled IgG is not a mean of the group and is likely mostly influenced by extreme values. So it could just have been 3 outliers that caused the difference. If it had been tested in a normal way (taking individual variability into account: 1 mouse and 1 transfer for each participant's IgG) it might not have been a significant difference.
     
    Last edited: Jun 3, 2024
  12. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,710
    Location:
    Aotearoa New Zealand
    I'm interested in the low levels of IFN-y and high levels of GFAP found in some people with LC. Those analyses do not have the same problems of no individual data and no convalescent controls that the autoantibody studies have.
    Note that pro-inflammatory cytokines did not differ between the LC patients and the controls. They say that IFN-y, the interferon type produced mainly by T cells and NK cells, was low in most of the LC patients compared to the controls. In a lot of the LC group, IFN-B, which they say is mostly produced by nucleated cells when infected by a virus, was higher than in the controls.

    GFAP is associated with astrocyte activity.
    I found the following paper useful as background:
    Serum glial fibrillary acidic protein and disability progression in progressive multiple sclerosis, 2023
    GFAP in your blood does not seem to be a good thing. None of the controls in the LC study had any. The levels in the LC study aren't super high though (i.e. less than 100 pg/ml, most of the patients in the study of progressive MS had higher levels than that).

    In the MS study, higher levels of GFAP were associated with worse disease severity and subsequent disease progression. (I note in passing, in the MS study, brain lesions were only identified in 11 out of 176 MRIs done in the 120 days after the blood analysis, even though about a third of people had documented functional worsening.)

    Back to the LC study, GFAP levels above zero weren't found in all the LC group. But, the fact that they were found in a significant proportion is interesting. Could the participants with high GFAP actually have MS or some other brain pathology not necessarily related to Covid-19? I hope there will be followup on those GFAP-positive individuals to determine what is causing the GFAP levels. And more studies looking for GFAP in LC - prospective studies tracking levels from infection onset would be great.

    The MS study commented that NFL levels tend to change a lot more in an individual than GFAP. Here they are trying to explain why high GFAP and low NFL was appearing to be predictive of disease progression in their study:
     
  13. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,000
    This doesn't seem to be the case or possibly I'm a bit confused. But what I'm reading is that there are 32 mice, 8 for each of the 4 groups. And the patient sample sizes are 12, 10, 12 for the 3 different LC groups and 34 for the HCs (details are provided in the section "Human IgG intraperitoneal injection leads to tissue human IgG accumulation in mice" and the supplementary tables section). I believe the n=8 next to the graphs only refers to the number of mice.

    So I'm somewhat less inclined to believe that this was just a messy mistake.

    I still don't understand why they once used plasma and once used serum. I suppose the pre-pandemic stored samples would have always been plasma automatically but then why take serum for the patients? I do wonder, though, since this group of researchers is largely the same as in the Wüst et al muscle biopsy study, why they didn't control for things better, independently of the pooling issues, especially since they also had access to a post-Covid control group in the first set of experiments. It would also be interesting to know whether there is an overlap between these patients and those from the Wüst et al biopsy study, which I expect there will be, as in both studies patients were recruited from the Amsterdam UMC Long Covid clinic and the researchers and time of experimentation are largely the same.
     
    Hutan and Peter Trewhitt like this.
  14. butter.

    butter. Senior Member (Voting Rights)

    Messages:
    226

    The main issue at hand, in terms of commonly used biomarkers to determine CNS damage (such as GFAP), is that LC patients were not controlled for illness duration or onset (in relation to GFAP). There is a kinetic time dynamic to these biomarkers; in classical neurodegenerative disorders, elevated levels are expected, but that is probably not what LC is.

    I am surprised they found GFAP to be elevated in ten patients, as GFAP is usually elevated for only a few weeks at most (probably only days) after what we could call an ‘acute injury.’ LC and ME/CFS are, from a neurological point of view, much more likely an acute injury than a neurodegenerative disease. In fact it's probably something in the middle of this spectrum. People with IACCs are probably neurologically injured while being more prone to 're-injury' (baseline shifting crashes).

    I would bet that most of these patients positive for GFAP had a more recent onset or 're-injury'.
     
    Last edited: Jun 7, 2024
    Hutan and Peter Trewhitt like this.
  15. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,000
    The GFAP/NFL/Tau LC group is LC-1 group. The median time from infection to sampling in days in that group is 232 [min=160, max=316]. On the basis of that I don't think it's sensible to suggest that the elevated levels are primarily due to the initial infection. Of course reinfections could be influencing or even driving the results, in which case there should be enough data to compare the data of these patients with GFAP data from acute Covid-19 infections (notably such a suggestion would mean that in this experiment the controls here hadn't been reinfected as their GFAP levels are consistently low or that their GFAP levels didn't rise after reinfections in which case an elevated GFAP level would still somehow be a "LC property").

    But this does raise the question whether subgrouping different LC groups somewhat artificially on the basis of some marker rather than phenotypically, always makes sense, especially if your subgroups accidentally just end up representing groups of LC patients who had been recently reinfected vs those that hadn't or something else that you cannot control for.
     
    Last edited: Jun 7, 2024
    Hutan and Peter Trewhitt like this.
  16. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,000
    In den Dunnen's talks/presentation the patients seem to have been subgrouped by symptoms, whilst in the preprint it is done via markers. I wonder where this discrepancy might come from. Potentially the sample sizes are too small to do either and in particular to do both, so maybe they just decided to go with what might be more publishable.
     
    Hutan, butter. and Peter Trewhitt like this.
  17. butter.

    butter. Senior Member (Voting Rights)

    Messages:
    226
    I find it likely that it is a property of LC and also ME/CFS, but very unlikely that GFAP remains elevated for such a long period without reinfection or re-injury. Also, potentially there is a 'sub-acute neurodegenerative' subgroup. There is a similar phenomenon in mild TBI.
     
    Last edited: Jun 7, 2024
    Hutan, Peter Trewhitt and EndME like this.
  18. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,000
    Perhaps my initial comment on the study could be interesting to you. It seems that for whatever reason this is not the first time that GFAP has appeared elevated in LC patients

     
    Hutan, Peter Trewhitt and butter. like this.
  19. butter.

    butter. Senior Member (Voting Rights)

    Messages:
    226
    Thank you, I have read this study! I think it's highly underrated how significant these findings could be, we need much larger samples though. If replicated in big cohorts this would largely destroy neurology's resistance to taking the issue seriously. Timing and cohort size are the main issues.
     
    Hutan and Peter Trewhitt like this.
  20. MelbME

    MelbME Senior Member (Voting Rights)

    Messages:
    108
    Have previously measured NFL in ME/CFS patient serum vs controls. Found no difference. Didn't measure GFAP though.
     
