Evidence of White Matter Neuroinflammation in [ME/CFS]: A Diffusion-Based Neuroinflammation Imaging Study (2026, Yu et al.)

I'm not at a PC currently, so can't test it.

But I think what's happening is that by increasing the influence of random noise, you decrease not only the correlation of age with niiXX but also the correlation of mecfs_status with niiXX. In that case it's not surprising that mecfs_status comes out significant less often.

To test decreasing the correlation with age without obscuring the real mecfs_status correlation with niiXX that was coded in, the true model can be altered by decreasing the influence of age on niiXX.
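The noise intuition can be sketched quickly. Here is a Python version (hypothetical coefficients, roughly on the scale of the simulations later in the thread): as the noise sd grows, the correlations of both age and mecfs_status with niiXX shrink together.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000  # large n so the sample correlations are stable

age = rng.normal(45, 20, n)
mecfs_status = rng.integers(0, 2, n)

age_cors, mecfs_cors = [], []
for noise_sd in (1, 2, 4):
    # Hypothetical model: small age coefficient, unit ME/CFS effect
    niiXX = 0.05 * age + mecfs_status + rng.normal(0, noise_sd, n)
    age_cors.append(np.corrcoef(age, niiXX)[0, 1])
    mecfs_cors.append(np.corrcoef(mecfs_status, niiXX)[0, 1])

print([round(c, 2) for c in age_cors])    # shrinks as noise grows
print([round(c, 2) for c in mecfs_cors])  # shrinks as well
```

Both correlations fall monotonically as the noise sd increases, which would explain mecfs_status losing significance along with age.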
Maybe, though that seems unlikely to be the issue if in my last example I changed the sd from just 1 to 2 and the incidence rate of 0,1 plummeted already.

Plus, if you decrease the influence of age, you increase the relative influence of ME/CFS, which increases the frequency of 1,1 in that table rather than just 0,1. You'd need to maintain a high percentage of specifically 0,1 to be in line with so many NII measurements falling away. I think I need to finish coding for the day, but maybe I'll get around to it later this week to confirm.
 
Do we actually know that it increased differences? I don't see effect sizes or coefficients in the text. The predicted effect of ME/CFS status on the brain metric can decrease (which, I agree, is what I would expect from controlling for HADS) while the p-value still becomes even more significant, for example if age is very highly correlated with the brain metric relative to other variables, so that controlling for it removes a lot of noise and increases significance.
Yes, I was speaking in terms of significance, significant difference. Without the adjustment they made for the confounding variable, hardly anything was judged to be significantly different, just RF I think. After the adjustment for age, sex, BMI, anxiety and depression, and MET, a whole lot of measures became significant.
 
Maybe, though that seems unlikely to be the issue if in my last example I changed the sd from just 1 to 2 and the incidence rate of 0,1 plummeted already.
I had to multiply by a much smaller number than 0.5 to decrease the correlation meaningfully. But here is the result if multiplying age by 0.05:
Code:
library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {
 
  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  # Create an outcome that is more associated with age than with ME/CFS status
  niiXX <- 0.05 * age + mecfs_status + rnorm(50, mean = 0, sd = 1)
 
  # Sanity check (value is discarded inside the loop; the two are independent by construction)
  cor(mecfs_status, age)
 
  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05
 
  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05
 
  # Check correlation of age with outcome variable
  # (named cor_age to avoid masking base::cor)
  cor_age <- cor(age, niiXX)
 
  pvals <- c(pval1, pval2, pval3, cor_age)
 
  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

results$age_cor_outcome %>% mean()
[attached image: output table of significance counts]

The correlation is now 0.65, and still about a quarter of the time, it's only significant after adding covariates.

Granted, 0.65 is still a high correlation to expect. But if you include multiple covariates, all with low correlations, controlling for all of them can still reduce noise to the point where mecfs_status becomes significant, whereas without correction there's too much noise for significance.

Here is similar code, but with four covariates instead of just age, all with relatively low correlations with the outcome (~0.25–0.4).
Code:
library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 7, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome",
                       "bmi_corr_outcome",
                       "met_corr_outcome",
                       "hads_corr_outcome")

for (i in 1:total_trials) {
 
  age <- rnorm(50, mean = 45, sd = 20)
  bmi <- rnorm(50, mean = 10, sd = 3)
  met <- rnorm(50, mean = 10, sd = 3)
  hads <- rnorm(50, mean = 10, sd = 3)
 
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  # Create model dependent on mecfs_status and four other variables
  niiXX <- (0.02 * age) + (0.2 * bmi) + (0.2 * met) + (0.2 * hads) +
    mecfs_status +
    rnorm(50, mean = 0, sd = 1)
 
  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05
 
  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
  # Check association of ME/CFS with all four covariates included
  model3 <- lm(niiXX ~ mecfs_status + age + bmi + met + hads)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05
 
  # Check correlation of each covariate with the outcome variable
  cor_age <- cor(age, niiXX)
  cor_bmi <- cor(bmi, niiXX)
  cor_met <- cor(met, niiXX)
  cor_hads <- cor(hads, niiXX)
 
 
  pvals <- c(pval1, pval2, pval3, cor_age, cor_bmi, cor_met, cor_hads)
 
  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

colMeans(results[, 4:7])

[attached image: output table of significance counts]

Around a third of the time, we see the 0,1 scenario. And the study had even more covariates than this, so the correlations could have been lower and we'd still expect this to happen a large portion of the time.
 
Yes, I was speaking in terms of significance, significant difference. Without the adjustment they made for the confounding variable, hardly anything was judged to be significantly different, just RF I think. After the adjustment for age, sex, BMI, anxiety and depression, and MET, a whole lot of measures became significant.
Yes, so I think if the strongest correlation is with age, controlling for it can remove so much noise that the increase in significance from better precision outweighs the decrease in significance due to decreased effect size you might expect from controlling for HADS.
 
Yes, so I think if the strongest correlation is with age, controlling for it can remove so much noise that the increase in significance from better precision outweighs the decrease in significance due to decreased effect size you might expect from controlling for HADS.
Yes, but my concern was that I thought the study had controls that were already well matched for age and sex, I thought they claimed that. So, I'd be surprised if controlling for those things made much difference. But, I'd have to go look at the table with demographics to be sure.

I thought that most of the things that might make a difference, BMI, MET, "depression" and "anxiety" actually were probably somewhat well aligned with the physical effects of having ME/CFS. So, controlling for them should reduce difference between ME/CFS and the controls.
 
Yes, but my concern was that I thought the study had controls that were already well matched for age and sex, I thought they claimed that. So, I'd be surprised if controlling for those things made much difference. But, I'd have to go look at the table with demographics to be sure.
Hmm. I'd need to think about it more, but I think if they were well-matched for age, that would mean that without controlling for age, we can expect the predicted coefficient to be accurate (or at least not biased by age).

But the variance due to differing age among the cohort (e.g. if some individuals were 20 and some were 70) still adds noise to the model, making it more likely to not reach significance, while controlling for that variance can make it more significant.
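That scenario can be sketched in Python (hypothetical numbers; the `group_pvalue` helper is my own, computing the OLS p-value by hand): the groups are drawn from the same age distribution, so omitting age doesn't bias the group coefficient, but adjusting for age still removes residual variance and tends to shrink the p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def group_pvalue(X, y):
    """Two-sided t-test p-value for column 1 of the design matrix X."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    se = np.sqrt(sigma2 * XtX_inv[1, 1])
    return 2 * stats.t.sf(abs(beta[1] / se), df=n - p)

n, sims = 50, 400
p_unadj, p_adj = [], []
for _ in range(sims):
    # Groups are "matched" for age: both draw from the same distribution,
    # so omitting age does not bias the group coefficient...
    age = rng.normal(45, 20, n)
    group = rng.integers(0, 2, n)
    y = 0.05 * age + 1.0 * group + rng.normal(0, 1, n)
    ones = np.ones(n)
    p_unadj.append(group_pvalue(np.column_stack([ones, group]), y))
    p_adj.append(group_pvalue(np.column_stack([ones, group, age]), y))

# ...but age still adds residual noise, so adjusting for it
# typically sharpens the group p-value
print(np.mean(p_unadj), np.mean(p_adj))
```

On average the adjusted p-value comes out smaller than the unadjusted one, even though the group coefficient itself is unbiased in both models.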
 
Around a third of the time, we see that 0,1 scenario. And the study had even more covariates than this, so the correlations could have been lower where we'd still expect to see this happen a large portion of the time
All your changes still result in the largest frequency being 1,1, though. To be reasonably likely to get the results from this study, where only one association was significant in the univariate analysis but nearly all were significant in the multivariate analysis, you'd need 0,1 to generally dominate.
 
To be reasonably likely to get the results from this study, where only one association was significant in the univariate analysis but nearly all were significant in the multivariate analysis, you'd need 0,1 to generally dominate.
If all the different niiXX outcomes are correlated with each other (which we might expect if they're just different effects of a common pathology), then they will tend to have similar results. If the mecfs_status association with NII-RF has that 0,1 significance pattern, then we can expect other measures with similar relationships to show the same pattern.

For example, imagine testing the ME/CFS association with leukocyte count in four regions of the brain that are only 1 mm apart. Whatever p-value one region has, the others probably will too, because they are markers of the same underlying process.
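That idea can be checked with a small Python simulation (hypothetical noise scales): four outcomes sharing one latent signal end up with p-values that track each other closely across datasets, so whatever significance pattern one shows, the others tend to show too.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n, sims, k = 50, 300, 4  # k correlated "regions"/outcomes per dataset
pvals = np.empty((sims, k))
for i in range(sims):
    group = rng.integers(0, 2, n)
    latent = group + rng.normal(0, 1.5, n)   # shared underlying process
    for j in range(k):
        y = latent + rng.normal(0, 0.3, n)   # small measure-specific noise
        pvals[i, j] = stats.ttest_ind(y[group == 1], y[group == 0]).pvalue

# log p-values of the four outcomes move together across datasets
r = np.corrcoef(np.log(pvals.T))
print(r.round(2))
```

The off-diagonal correlations come out high, i.e. the four measures mostly agree on significance within any given dataset, which is what we'd expect if the NII metrics reflect one common pathology.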
 