I expect @SNT Gatchaman is busy on other things right now, but I hope he will comment more on the paper at some point.
"The improvement in precision gained by adding age would be present in both models, so it doesn't matter."
In the case of models 3 and 4, we can be more confident about the amount that ME/CFS status improves the model, since the age variable is included and can explain some of the variance.
It should be relatively straightforward to test what happens in this scenario when NII-RF is correlated to both age and ME/CFS, but where age and ME/CFS are not correlated to each other.
That’s what I’ve been trying to explain this whole time. It simply does not matter. The way the comparison is structured, it’s only looking at the difference between two models where the only change is whether ME/CFS is added to the model.
Yes, I had the same concern about the addition of controlling for the 'confounders' producing significant results that would not have been there without. I think the age and sex probably are valid, but I thought that the groups were matched on those anyway, in which case there is less reason to do that.
The math couldn’t work out like that anyway; that’s what I’m trying to explain in my conversation with forestglip. You’re supposed to start with the univariate analysis, seeing if ME/CFS on its own is associated. If it is, then you control for confounders. The outcome will be that either the association is still significant (meaning that the confounders don’t matter) or it’s no longer significant (meaning that the first association is explained by the confounder). To get no association in the univariate analysis but a significant association in the multivariate means that something very funky happened in one of the calculations (as in human error).
For some reason the chart I attached upthread isn't showing (fixed now) - here it is again, from the Keri paper:
It would be nice to see charts like that one in the Keri paper of the actual signals, with some indication of variability. I wonder if it is possible to average the signals for specific tissues for each cohort and present them as a chart like the Keri chart?
TY, I'm hoping this weekend will give sufficient time to look properly through the foundational techniques papers to try and make sense of the problems being discussed above. On my initial read of the paper, one thing that always raises an eyebrow for me with these types of findings is the asymmetry. E.g.:
Similarly, a negative association between NII-FF and disease severity suggests that reduced apparent axonal density in the right superior longitudinal fasciculus and right posterior corona radiata may accompany worsening clinical presentation.
on average, patients show higher NII-FF, but the most severely affected show NII-FF loss in specific tracts. One tract (right posterior corona radiata) also showed group level NII-FF increases, implying that NII-FF may initially rise (e.g., reduced dispersion or compartmental re-weighting) and later fall with more severe disease (axon or packing loss), a pattern compatible with stage dependent pathology.
I'm not sure about that.
Say you have a feature that increases a lot by age but is reliably lower in ME/CFS at each age. If you had a badly matched control group, a lot younger, then there might not be a significant difference between the ME/CFS group and the control group for that feature. So, there would be no obvious association to start with. But then, if you did control for age, the difference would appear.
It would not, because of what I explained in post #69. When you report the p-value for a variable in a model with added covariates, you are reporting how well a model with ME/CFS plus all the covariates predicts the outcome variable compared to a model with all the covariates. That's what it means to "control for confounders" in a regression analysis.
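The scenario described above (a feature that rises with age but is lower in ME/CFS at each age, measured against a younger control group) can be simulated directly. This is only a sketch under invented assumptions: the group sizes, the 10-year age gap, the effect size of -10 and the noise level are all hypothetical values chosen so that the age difference roughly cancels the group difference. Nothing here is taken from the paper's data.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_per_group <- 50
mecfs_status <- rep(c(0, 1), each = n_per_group)

# Hypothetical badly matched cohorts: controls about 10 years younger on average
age <- rnorm(2 * n_per_group, mean = ifelse(mecfs_status == 1, 45, 35), sd = 5)

# Feature rises 1 unit per year of age and is 10 units lower in ME/CFS at any age
feature <- age - 10 * mecfs_status + rnorm(2 * n_per_group, mean = 0, sd = 3)

unadjusted <- lm(feature ~ mecfs_status)
adjusted <- lm(feature ~ mecfs_status + age)

# p-value of the ME/CFS term in each model
p_unadjusted <- summary(unadjusted)$coefficients["mecfs_status", "Pr(>|t|)"]
p_adjusted <- summary(adjusted)$coefficients["mecfs_status", "Pr(>|t|)"]

print(c(unadjusted = p_unadjusted, adjusted = p_adjusted))
```

In this setup the raw group means are expected to be about equal, so the unadjusted test has essentially nothing to detect, while the age-adjusted test sees the full within-age difference. Whether the cohorts in the paper look anything like this depends on how well they were matched on age.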
(Int = fitted model intercept)
Testing the association of NII-RF and ME/CFS without covariates:
p-value of ME/CFS derived by comparing difference in model performance between:
model1: NII-RF ~ Int
model2: NII-RF ~ Int + Beta1 * ME/CFS (0/1 binary)
model1 and model2 each have their own precision. If a model containing ME/CFS is better at predicting NII-RF scores than a model containing just the intercept, we say there is a significant association.
Testing the association of NII-RF and ME/CFS with covariates:
p-value of ME/CFS derived by comparing difference in model performance between:
model3: NII-RF ~ Int + Beta1 * age
model4: NII-RF ~ Int + Beta1 * age + Beta2 * ME/CFS (0/1 binary)
If the model performance improves when the ME/CFS variable is added, we say that ME/CFS is significantly associated with NII-RF accounting for age.
If ME/CFS has any predictive power for NII-RF, you will get p < 0.05 comparing models 1 and 2. If you want to ask whether that association is confounded by age, you compare models 3 and 4. The improvement in precision gained by adding age would be present in both models, so it doesn't matter.
[Edit: and for ME/CFS to be significant in the model3 vs. model4 comparison, it by definition has to have good predictive power for NII-RF. So you see why it would be weird to get significance comparing models 3 and 4, but no significance comparing models 1 and 2 (for all the measurements except NII-RF, which was significant in both comparisons)]
Here is some R code simulating this with random data where NII-RF is correlated to age and to ME/CFS status:
Code:
total_trials <- 1000
more_significant_count <- 0

for (i in 1:total_trials) {
  # Simulated data: NII-RF depends on both age and ME/CFS status
  age <- rnorm(50, mean = 45, sd = 10)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
  niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  # Without covariates: does adding ME/CFS improve on the intercept alone?
  model1 <- lm(niirf ~ 1) # Intercept only
  model2 <- lm(niirf ~ mecfs_status)
  anova_1_2 <- anova(model1, model2)
  anova_1_2_pval <- anova_1_2[['Pr(>F)']][[2]]

  # With age: does adding ME/CFS improve on the age-only model?
  model3 <- lm(niirf ~ age)
  model4 <- lm(niirf ~ mecfs_status + age)
  anova_3_4 <- anova(model3, model4)
  anova_3_4_pval <- anova_3_4[['Pr(>F)']][[2]]

  if (anova_1_2_pval > anova_3_4_pval) {
    more_significant_count <- more_significant_count + 1
  }
}

print(paste0("In ", more_significant_count, " out of ", total_trials, " total trials (", 100*more_significant_count/total_trials, "%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."))

# ANOVA tables from the last trial
anova_1_2
anova_3_4
Output:
In 978 out of 1000 total trials (97.8%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate.
> anova_1_2
Analysis of Variance Table

Model 1: niirf ~ 1
Model 2: niirf ~ mecfs_status
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     49 3947.8
2     48 3906.9  1    40.879 0.5022 0.4819
> anova_3_4
Analysis of Variance Table

Model 1: niirf ~ age
Model 2: niirf ~ mecfs_status + age
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     48 52.230
2     47 47.129  1    5.1011 5.0872 0.0288 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
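It may help to connect the nested-model comparison used above to the p-value that a regression table normally reports. When exactly one term is added, the F-test from anova() and the t-test on that term's coefficient in the larger model are the same test (F equals t squared), so they give the same p-value. A minimal check on simulated data of the same form as the code above (the seed is arbitrary):

```r
set.seed(42)  # arbitrary seed, only for reproducibility

age <- rnorm(50, mean = 45, sd = 10)
mecfs_status <- rbinom(50, 1, prob = 0.5)
niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

model3 <- lm(niirf ~ age)
model4 <- lm(niirf ~ mecfs_status + age)

# p-value from the nested-model F-test (model3 vs. model4)
p_anova <- anova(model3, model4)[["Pr(>F)"]][2]

# p-value from the t-test on the ME/CFS coefficient inside model4
p_coef <- summary(model4)$coefficients["mecfs_status", "Pr(>|t|)"]

print(c(anova = p_anova, coefficient = p_coef))
all.equal(p_anova, p_coef)
```

So "the p-value for ME/CFS controlling for age" and "the p-value for adding ME/CFS to a model that already has age" are two descriptions of the same number.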
The restricted fraction is the water inside small confined structures such as cells. It is water that is isotropic - it can move in all different directions, just not very far.
The hindered fraction is also isotropic, but I think excludes the restricted fraction and the FF. The NII-HR, the hindered water ratio, is the ratio of that to the rest of the signal. I think it might be to the rest of the isotropic signal.
This explanation of restricted fraction and hindered fraction is exactly my understanding as well.
I would expect water both inside and outside cells in white matter to have a degree of anisotropy to its diffusivity, certainly for outside and for nerve fibres. FF refers to fibres I think.
Is it that the anisotropy measures stratify across the 'hindered' and 'restricted' components?
I think the anisotropy is supposed to have been "removed" from the hindered and restricted components.

I don't see our pathology ending up being lateralised like this.
"To get an idea of what the p-values are, I looked at the ANOVA results for the last trial. When comparing the models without age, the p-value was 0.48. With age, it was 0.03."
Your intercept only comparison is not correct:
Code:
total_trials <- 1000
more_significant_count <- 0

for (i in 1:total_trials) {
  age <- rnorm(50, mean = 45, sd = 10)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
  niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  model1 <- lm(niirf ~ 0) # Intercept only
  model2 <- lm(niirf ~ 0 + mecfs_status)
  anova_1_2 <- anova(model1, model2)
  anova_1_2_pval <- anova_1_2[['Pr(>F)']][[2]]

  model3 <- lm(niirf ~ age)
  model4 <- lm(niirf ~ mecfs_status + age)
  anova_3_4 <- anova(model3, model4)
  anova_3_4_pval <- anova_3_4[['Pr(>F)']][[2]]

  if (anova_1_2_pval > anova_3_4_pval) {
    more_significant_count <- more_significant_count + 1
  }
}

print(paste0("In ", more_significant_count, " out of ", total_trials, " total trials (", 100*more_significant_count/total_trials, "%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."))

anova_1_2
anova_3_4
Result:
"In 13 out of 1000 total trials (1.3%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."
Code:
for (i in 1:total_trials) {
  age <- rnorm(50, mean = 45, sd = 10)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
  niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  model1 <- lm(niirf ~ 0) # Intercept only
  model2 <- lm(niirf ~ mecfs_status)
  anova_1_2 <- anova(model1, model2)
  anova_1_2_pval <- anova_1_2[['Pr(>F)']][[2]]

  model3 <- lm(niirf ~ age)
  model4 <- lm(niirf ~ mecfs_status + age)
  anova_3_4 <- anova(model3, model4)
  anova_3_4_pval <- anova_3_4[['Pr(>F)']][[2]]

  if (anova_1_2_pval > anova_3_4_pval) {
    more_significant_count <- more_significant_count + 1
  }
}
Result:
"In 0 out of 1000 total trials (0%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."
So the upshot is, I believe NII-RF and NII-HR are defined just using the parts of the signal they think are isotropic, according to their model.
You replaced model1 <- lm(niirf ~ 1) with model1 <- lm(niirf ~ 0). 1 is the formula for an intercept. 0 means no intercept.
From the R docs:
"A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x."
So in the first updated code you posted, in the comparison of model1 and model2, it's comparing a model that just predicts 0 for every point with a model that predicts based on mecfs_status, but is forced through the point (0,0).
In the second code, it's comparing a model which predicts 0 for every point to a model that includes mecfs_status and an intercept.
Sorry, short on time today, otherwise would write more. [Edit] The important thing to check is that the case I'm talking about is where the original NII-XX ~ ME/CFS association wasn't significant initially. Even if you take mecfs_status out of the outcome variable, you might still generate a spurious association randomly. The issue I've been talking about is the measurements besides NII-RF that dropped out in the univariate analysis.
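The difference between the `~ 1` and `~ 0` formulas discussed above can be checked by looking at which coefficients each model actually fits. This is a small sketch on made-up data (the variable names `x` and `y` and the seed are arbitrary):

```r
set.seed(7)  # arbitrary seed, only for reproducibility

x <- rbinom(50, 1, prob = 0.5)
y <- 45 + x + rnorm(50)

fit_intercept_only <- lm(y ~ 1)      # one coefficient: the mean of y
fit_no_terms       <- lm(y ~ 0)      # no coefficients at all
fit_through_origin <- lm(y ~ 0 + x)  # slope only, no intercept
fit_default        <- lm(y ~ x)      # implied intercept plus slope

names(coef(fit_intercept_only))  # "(Intercept)"
length(coef(fit_no_terms))       # 0
names(coef(fit_through_origin))  # "x"
names(coef(fit_default))         # "(Intercept)" "x"

# The empty model predicts 0 for every observation
all(fitted(fit_no_terms) == 0)

# The intercept-only model predicts the sample mean
all.equal(unname(coef(fit_intercept_only)), mean(y))
```

Because the simulated outcome sits around 45, a model that predicts 0 everywhere is a very poor baseline, and almost any added term will look like a large improvement over it.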
I randomly generated values for an outcome variable (niiXX) that was deliberately coded to correlate with age. I then checked the association between niiXX and mecfs_status. If that univariate association was not significant (p > 0.05), I tested a second model including mecfs_status and age.
[1] "In 243 out of 4721 total trials (5.1%), the ME/CFS variable with significant with the covariate of age but not on its own."
The situation I flagged, where a variable was not significant in a univariate association but is significant with covariates, only occurred 5% of the time, so exactly what was expected by chance.
This is because in the model you made, there is no relationship between ME/CFS and niiXX, so it is all due to chance. niiXX is just a function of age, so there should only be about 5% significant as false positives when testing the association with mecfs_status.
With mecfs_status included in the outcome variable:
niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)
In 4345 out of 4731 total trials (92%), the ME/CFS variable with significant with the covariate of age but not on its own.
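The code behind the two results above was not posted in full, so the following is only a reconstruction under assumptions: it reuses the age distribution and sample size from the earlier simulation, and the function name `run_sim` and its `effect` parameter are hypothetical. It counts, among trials where ME/CFS is not significant on its own, how often it becomes significant once age is added, first with no true ME/CFS effect and then with an effect of 1.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

# effect: hypothetical true ME/CFS effect on the outcome (0 = none)
run_sim <- function(effect, n_trials = 2000, n = 50) {
  eligible <- 0  # trials where the univariate test was not significant
  flipped <- 0   # of those, trials significant once age is added
  for (i in seq_len(n_trials)) {
    age <- rnorm(n, mean = 45, sd = 10)
    mecfs_status <- rbinom(n, 1, prob = 0.5)
    niiXX <- age + effect * mecfs_status + rnorm(n, mean = 0, sd = 1)

    p_uni <- summary(lm(niiXX ~ mecfs_status))$coefficients["mecfs_status", "Pr(>|t|)"]
    if (p_uni > 0.05) {
      eligible <- eligible + 1
      p_adj <- summary(lm(niiXX ~ mecfs_status + age))$coefficients["mecfs_status", "Pr(>|t|)"]
      if (p_adj < 0.05) flipped <- flipped + 1
    }
  }
  flipped / eligible
}

rate_null <- run_sim(effect = 0)
rate_effect <- run_sim(effect = 1)
print(c(no_effect = rate_null, with_effect = rate_effect))
```

With no true effect the rate should sit near the 5% false-positive level, and with a real effect it should be much higher, which is the contrast between the 5.1% and 92% figures reported above. Exact values will differ from those posts because the seed and trial count here are not the originals.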