Preprint Comparing DNA Methylation Landscapes in Peripheral Blood from [ME/CFS] and Long COVID Patients, 2025, Peppercorn et al

Thanks for the description of the PCA of the 70k fragments, jnmaciuch.
Dr. Tate mentioned that he would be modifying the results section to avoid giving the wrong impression.
That's good news and will go some way to making things clearer, although inclusion of that PCA is still a problem. Many of us went 'wow! that looks amazing' when we looked at the chart - and not everyone has the interest or time to think 'it's too good to be true' and look closer at things.

I think the people who are saying 'oh, it's what is always done' are missing the extreme level of selection that happened here. If you only use 4.6% of 70k data points for the PCA, specifically the ones that separate out your pre-defined groups, then of course the pre-defined groups will be separated. It's not really more complicated than that. I want the ME/CFS researchers we support, and are relying on to find answers, to do better than that.

I don't think PCA is a useful tool for this particular analysis. It doesn't need to be in the paper, it's circular and is a distraction. The manuscript would be better if it just acknowledged the very small sample sizes and gave more space to the consideration of whether the identified DMFs, especially those ones common to both disease groups, might tell us something about ME/CFS and ME/CFS-like LC.
 
A previous study by this group was Changes in DNA methylation profiles of myalgic encephalomyelitis/chronic fatigue syndrome patients reflect systemic dysfunctions (Helliwell et al., 2020). From a quick glance: very similar methodology (although DMAP plus single-CpG MethylKit, versus DMAP2 in the more recent paper); n=20 (10 pwME, 10 controls); RRBS (146,575 fragments); DMAP found 76 DMFs (52% hypomethylated), MethylKit 349 DMCs (56% hypomethylated), with the highest representation in intergenic (40%) and intronic (25%) regions (the abstract states 394 DMCs, but the results section states 349); fragment-level statistics reported without FDR correction.

This group's RRBS pipeline was documented in Chapter 9 of the new Springer Protocol (link).
 
I don't think PCA is a useful tool for this particular analysis. It doesn't need to be in the paper, it's circular and is a distraction. The manuscript would be better if it just acknowledged the very small sample sizes and gave more space to the consideration of whether the identified DMFs, especially those ones common to both disease groups, might tell us something about ME/CFS and ME/CFS-like LC.
For what it’s worth, the full PCA does actually tell an interesting story along PC2—the fact that there are distinct “Neapolitan stripes” between the three groups, despite some messy outliers, is impressive, since it is actually taking into consideration all 70K sites (even though the 5/5 selection down to 70K will introduce some skewing to begin with; it’s been a very lively debate in epigenomic sequencing for years as to whether you can pre-select sites/peaks based on any a priori knowledge).

In my opinion, the story that the full PC2 tells is that people with a longer disease duration are trending back towards the healthy controls in terms of DNA methylation. If it was simply a difference of “LC or ME/CFS is different from everything else,” you wouldn’t see Neapolitan stripes, you’d see one stripe amidst a soup of the other two groups.

I’m certainly not going to call that a definitive finding on the basis of n=15, but it could absolutely be the basis of a future study comparing people who got LC at the beginning of the pandemic to those recently afflicted to assess impact of disease duration with the same trigger.

So I’m absolutely in agreement with you @Hutan that as it is used, the PCA ends up being circular. Which is a shame in my opinion, since there is an interesting story here despite the tiny sample size. They do go into it in the later analysis, but it almost loses the punch.

I do understand why they didn’t show it like that though. If you don’t know to look along each axis separately, it just all looks like soup. But I think there were possible ways to finesse this—using only the PC2 score as its own latent variable and showing it as a box plot, for example.
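Something along those lines might look like the minimal R sketch below. The data here are made-up stand-ins (15 samples of random values plus an invented group factor), not the paper's matrix; the point is just the shape of the analysis: PCA on everything, then PC2 scores summarised per group.

library(ggplot2)

# Stand-in data only: 15 samples x 1,000 random methylation-like values,
# plus a three-level group factor. These are placeholders, not the paper's data;
# swap in the real fragment matrix (samples in rows) to use this for real.
set.seed(1)
meth <- matrix(rnorm(15 * 1000, mean = 0.5, sd = 0.1), nrow = 15)
group <- factor(rep(c("HC", "ME/CFS", "LC"), each = 5))

# PCA on all fragments, then treat the PC2 score as a single summary variable
pca <- prcomp(meth, scale. = TRUE)
scores <- data.frame(PC2 = pca$x[, 2], group = group)

# Box plot of PC2 scores per group, with the individual points overlaid
ggplot(scores, aes(x = group, y = PC2)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1) +
  theme_minimal()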
 
Hope you had a good dinner, jnmaciuch. :)

For what it’s worth, the full PCA does actually tell an interesting story along PC2—the fact that there are distinct “Neapolitan stripes” between the three groups, despite some messy outliers, is impressive, since it is actually taking into consideration all 70K sites (even though the 5/5 selection down to 70K will introduce some skewing to begin with; it’s been a very lively debate in epigenomic sequencing for years as to whether you can pre-select sites/peaks based on any a priori knowledge).
What was the percentage variation explained by the PC1 and PC2 axes?

Yes, the 5/5 selection is worth noting here. Only fragments that were present in every participant were included. So, that means that fragments that were only present in men, or that weren't present at all in the healthy controls, etc., were excluded. That's interesting information to explore when there are bigger cohorts. It also sounds as though there was an element of subjectivity here in what presence requirement was used. I imagine the difficulties that missing data presents were a factor in the decision to only use fragments with 100% presence.
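Just to make the presence requirement concrete, here's a toy R illustration (invented numbers, not the paper's pipeline) of a 100%-presence filter, where NA marks a fragment with no coverage in that participant:

# Toy matrix: 3 fragments (rows) x 3 participants (columns); NA = no coverage
frag <- matrix(c(0.80, 0.70,   NA,
                 0.60, 0.50, 0.60,
                   NA, 0.90, 0.80),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("frag", 1:3), paste0("p", 1:3)))

# A 100%-presence filter keeps only fragments covered in every participant
# (here, just frag2). Relaxing it to, say, 2/3 presence keeps more fragments
# but brings missing values into the downstream statistics.
kept <- frag[complete.cases(frag), , drop = FALSE]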

I’m certainly not going to call that a definitive finding on the basis of n=15, but it could absolutely be the basis of a future study comparing people who got LC at the beginning of the pandemic to those recently afflicted to assess impact of disease duration with the same trigger.
Sure. There might be other sources of variation between cohorts too, such as whether a sample was taken in the afternoon or morning or how the specimen was stored.

I do understand why they didn’t show it like that though. If you don’t know to look along each axis separately, it just all looks like soup. But I think there were possible ways to finesse this—using only the PC2 score as its own latent variable and showing it as a box plot, for example.
I think it's reasonable to assume that readers of a paper like this will understand how to read a PCA chart, and, in any case, it's easy enough for the text to explain that it's the PC2 that separates the groups so that people without prior knowledge will understand. But, I don't think it's reasonable to expect a PCA to separate out people on the basis of disease specific DNA methylation when you only have 5 people per group and 70k fragments resulting from all sorts of biological processes.

Yes, definitely, there are better statistical analysis and presentation tools, and I don't think they have to be related to PCA at all. What we want to know from a study like this is 'what associated genes were found to be differentially methylated between the groups?', so that information can give us ideas about the disease mechanism.
 
I would give the authors the benefit of the doubt and let the manuscript go through peer review, to see whether the reviewers pick up on the issues we have been discussing and whether they deem the paper fit for publication.

I would agree with Utsikt. Peer review is a lottery and generally far less fair and rigorous than the attention of members here. If the analysis seems misleading that needs to be understood by all.
 
Yes, definitely, there are better statistical analysis and presentation tools, and I don't think they have to be related to PCA at all. What we want to know from a study like this is 'what associated genes were found to be differentially methylated between the groups?', so that information can give us ideas about the disease mechanism.

I think you're right, but it's not just the PCA; it's also the fact that they don't correct for multiple testing, and find basically exactly the number of positive results you'd expect by chance. It's cool that there are genes that could make sense, but as it stands I don't feel I can trust any of it.
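For anyone who wants to see the arithmetic, here's a minimal R sketch (simulated null p-values, not the paper's data) of what an uncorrected p < 0.05 threshold does across ~70k fragment-level tests, compared with a Benjamini-Hochberg FDR correction:

# With ~70,000 tests and no true effects, p-values are uniform, so an
# uncorrected 0.05 threshold passes about 0.05 * 70000 = 3,500 fragments
# by chance alone.
set.seed(1)
p_raw <- runif(70000)                    # null p-values: no real differences
sum(p_raw < 0.05)                        # roughly 3,500 "hits"

p_adj <- p.adjust(p_raw, method = "BH")  # Benjamini-Hochberg FDR correction
sum(p_adj < 0.05)                        # essentially none survive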

However, if what @jnmaciuch says about the PCA done on all 70k+ fragments is true, then that's a different story. If the groups really look like they're separating out on that PCA, then that would be encouraging. Unlike the PCA on only the significant fragments, you wouldn't expect to see group differences there from noise alone, I think.
 
Here is how it looks with simulated random data. From 72k features measured in each of 15 individuals split into three groups of 5, sure enough you get ~3.6k (3,575) positive hits with ANOVAs, and a PCA on the significant fragments that looks very similar to theirs.
[Image: PCA of significant features, simulated data]

Note that doing the PCA on all the simulated results doesn't separate the groups; if Peppercorn/Tate's does that would be of interest. I hope we can get to see that PCA soon.
[Image: PCA of all features, simulated data]

R code said:
library(tidyverse)

# Parameters
n_features <- 72000
n_individuals <- 15

# Create group labels
groups <- factor(rep(c("group1", "group2", "group3"), each = n_individuals / 3))
individuals <- paste0("ind", 1:n_individuals)
feature_names <- paste0("feature", 1:n_features)

# Simulate pure-noise data: every feature is drawn from the same distribution,
# so there are no true differences between the groups
data_matrix <- matrix(nrow = n_features, ncol = n_individuals)
rownames(data_matrix) <- feature_names
colnames(data_matrix) <- individuals
for (i in 1:n_features) {
  data_matrix[i, ] <- rnorm(n_individuals, mean = 10, sd = 2)
}

# Transpose for ANOVA (individuals in rows, features in columns)
data_df <- as.data.frame(t(data_matrix))
data_df$group <- groups

# Perform a one-way ANOVA for one feature and return its p-value
perform_anova <- function(feature) {
  aov_result <- aov(data_df[[feature]] ~ group, data = data_df)
  summary(aov_result)[[1]]$"Pr(>F)"[1]
}

# Perform the ANOVAs across all features
p_values <- sapply(feature_names, perform_anova)

# Select "significant" features (uncorrected p < 0.05)
significant_features <- names(p_values)[p_values < 0.05]
n_significant <- length(significant_features)
significant_data <- data_matrix[significant_features, ]

# PCA on the selected features versus PCA on all features
pca_result_significant <- prcomp(t(significant_data), scale. = TRUE)
pca_result_all <- prcomp(t(data_matrix), scale. = TRUE)

# Plot PCA of the selected features
pca_df_sig <- as.data.frame(pca_result_significant$x)
pca_df_sig$group <- groups

plot_PCA_sig <- ggplot(pca_df_sig, aes(x = PC1, y = PC2, color = group)) +
  geom_point(size = 3) +
  stat_ellipse(level = 0.95) +
  ggtitle("PCA of Significant Features") +
  theme_minimal()

# Plot PCA of all features
pca_df_all <- as.data.frame(pca_result_all$x)
pca_df_all$group <- groups

plot_PCA_all <- ggplot(pca_df_all, aes(x = PC1, y = PC2, color = group)) +
  geom_point(size = 3) +
  stat_ellipse(level = 0.95) +
  ggtitle("PCA of All Features") +
  theme_minimal()

summary(pca_result_significant)
summary(pca_result_all)
plot_PCA_sig
plot_PCA_all
 
Thank you @chillier! Yes, those charts are exactly what I was expecting.

Note that doing the PCA on all the simulated results doesn't separate the groups; if Peppercorn/Tate's does that would be of interest. I hope we can get to see that PCA soon.
@jnmaciuch described the PCA on the 72k data as Neapolitan layers with messy outliers on the PC2 axis. That chart from random variables is not far off that - in fact I can see a bit of separation of the "groups" on both axes. With so few data points, it isn't hard to find a pattern of some sort.
 
Thank you @chillier! Yes, those charts are exactly what I was expecting.
@jnmaciuch described the PCA on the 72k data as Neapolitan layers with messy outliers on the PC2 axis. That chart from random variables is not far off that - in fact I can see a bit of separation of the "groups" on both axes. With so few data points, it isn't hard to find a pattern of some sort.
[edit: sorry, hit post too soon] Yes, if you pulled more of the red dots into the middle and pushed the green and blue further apart, it wouldn’t look too far off. It’s a weak signal, so it would probably come down to an ANOVA between groups on the PC2 scores. Definitely would like to see replication in a larger cohort.
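Assuming chillier's simulation objects above have been run, that check is a one-liner (a sketch only, on the simulated data rather than the paper's):

# Test for a group effect along PC2 alone, using pca_df_all
# (PC scores plus group) from chillier's simulation above
summary(aov(PC2 ~ group, data = pca_df_all))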
 
I'm continuing an email conversation with Warren. I appreciate that he has replied, but it's clear that I'm exasperating him and he does not agree with what I have been saying. I've invited him to play with the data, shuffling the members of the cohorts around so that they are random and see what happens to the PCA, and iteratively changing the selection intensity, so that he could see the issue for himself. But, he doesn't seem to have any doubt at all that the PCA is saying something useful.
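For anyone who wants to try that shuffling experiment themselves on chillier's simulation (or on real data arranged the same way), a rough sketch reusing the objects from the code above:

# Permute the group labels, redo the per-fragment ANOVAs, re-select the
# "significant" fragments, and a PCA on that selection will separate the
# shuffled (meaningless) groups just as cleanly, because the selection step
# guarantees it.
set.seed(42)
shuffled <- data_df
shuffled$group <- sample(shuffled$group)

p_shuffled <- sapply(feature_names, function(f) {
  summary(aov(shuffled[[f]] ~ group, data = shuffled))[[1]]$"Pr(>F)"[1]
})

hits <- names(p_shuffled)[p_shuffled < 0.05]
length(hits)  # still roughly 5% of all fragments

pca_shuffled <- prcomp(t(data_matrix[hits, ]), scale. = TRUE)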

I really don't think this issue is difficult to understand; my son understood the problem immediately when I explained it, and was able to replicate and confirm @chillier's code in minutes.

I thought all it would take was to point it out and these researchers would go 'oh, yes' and remove the PCA. And then, when I tried again with chillier's random PCA chart, I thought that would make the problem obvious, and that would be the 'oh, yes' moment. I'm assuming I'm just not a valid messenger. Do we have any eminent statisticians who could communicate with Warren and his team about this?
 
I thought all it would take was to point it out and these researchers would go 'oh, yes' and remove the PCA.
It looks like Katie Peppercorn and Sayan Sharma did the actual analysis and might have a better understanding of what is being asked. I like to remember that many of our researchers are from an older generation with different skills to the younger ones.
Author Contributions
Conceptualization, W.P.T. and A.C.; methodology, A.C., E.J.R., P.A.S., K.P.; software, P.A.S.; validation, E.J.R.; formal analysis, K.P., S.S.; investigation, K.P., S.S., C.D.E.; resources, W.P.T., A.C., P.A.S.; data curation, E.J.R., S.S., K.P.; writing—original draft/final preparation, W.P.T.; writing—editing, K.P., S.S., E.J.R., A.C.; visualization, S.S.; supervision, W.P.T., A.C.; project administration, W.P.T., K.P.; funding acquisition, W.P.T., A.C. K.P. and S.S. have made equal contributions and are joint first authors. All authors have read and agreed to the published version of the manuscript.
 
It looks like Katie Peppercorn and Sayan Sharma did the actual analysis and might have a better understanding of what is being asked. I like to remember that many of our researchers are from an older generation with different skills to the younger ones.
The other last author is the BSS analysis expert Tate referenced in the email to @Hutan. It seems like they received their PhD more recently, and probably directed/advised the first authors on the analysis. Even if the first authors saw the logic in Hutan’s point, ultimately issues like this (and whether or not to heed them as issues) would be up to the last authors.
 
The paper and the discussion are over my head, but just wondering: would it be worth the effort for someone to submit a comment raising the points discussed here, to be published beside the preprint? It might then catch the eyes of the reviewers.
 
Reduced Representation Bisulphite Sequencing (RRBS) was applied to the DNA of age- and sex-matched cohorts: ME/CFS (n=5), LC (n=5) and HC (n=5).
I appreciate all the thoughtful comments on this paper, but with only 5 ME/CFS patients and 5 healthy controls, I don't think it's possible to get useful results, no matter how you analyse the data.
 
I appreciate all the thoughtful comments on this paper, but with only 5 ME/CFS patients and 5 healthy controls, I don't think it's possible to get useful results, no matter how you analyse the data.
I think for similar big data-driven methods, such as ATAC-seq and RNA-seq, it is possible to get worthwhile results even from such a small cohort. Information at an individual gene level is almost certainly going to be highly variable, and frankly is not expected to generalize even from larger cohorts than this (unless you're talking several hundred or thousand participants). However, from my experience with the other two methods I mentioned, it is entirely possible to get surprisingly robust results at the meta-analysis level (e.g. pathway hits, or general trends such as "overall trend towards hypermethylation rather than hypomethylation") even with such a small sample size.

You'd of course need validation before any of those results are considered trustworthy--but that validation may just as well come from other similarly small studies, rather than one big study. However, even serving as one point of evidence at that meta-analysis level requires the data to be analyzed well to begin with.
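As a trivial example of the kind of meta-level question that can survive a small n, the "overall hyper- vs hypomethylation trend" claim reduces to something like a sign test on the direction counts. The numbers below are invented purely for illustration, not taken from the paper:

# Invented counts, for illustration only: how many DMFs were hyper- vs
# hypomethylated in patients relative to controls?
n_hyper <- 120
n_hypo  <- 80

# Two-sided binomial (sign) test against a 50:50 split
binom.test(n_hyper, n_hyper + n_hypo, p = 0.5)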
 
The paper and the discussion are over my head, but just wondering: would it be worth the effort for someone to submit a comment raising the points discussed here, to be published beside the preprint? It might then catch the eyes of the reviewers.
That's a good idea. There is a comment facility there.

I had hoped that the emails would result in a recognition by the author team that there was a problem that needed to be addressed. It's Monday morning here, so I think I'll wait a day or two for the author team to have a chance to think about the issue and respond before making a formal comment.
 