A crumb of a clue on epidemiology

As @Yann mentioned, the graphs sometimes seem to be driven to a large extent by things related to public awareness, which might work in all sorts of unusual ways.
I agree with you and Yann, and I think the most likely reason for the correlation is that, for some reason, there is more awareness of ME/CFS among people with British ancestry. But I think it's a fun idea that has a small chance of actually being about prevalence.
 
Maybe it's all the Guardian's and George Monbiot's fault after all! I agree nothing is lost by playing around with this, as @Murph suggested.
 
The census data includes 108 ancestries. I ran the regression on all of them against the same Google Trends values. Here are the top 20 highest R^2 values:
1774240791839.png
This showed a lot of UK ancestries at the top, and I realized that this may be because people could report multiple ancestries.
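
For anyone who wants to try the same kind of ranking, here is a minimal sketch of the idea: regress the Google Trends value on each ancestry's share by state and sort by R^2. It is not the code used for the table above. The function names, the dictionary layout, and all the numbers are made up for illustration, and it uses the fact that R^2 of a one-variable linear regression equals the squared Pearson correlation.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson r).

    Returns None when either variable is constant, since R^2 is undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return (sxy * sxy) / (sxx * syy)


def rank_ancestries(ancestry_share, trends, top_n=20):
    """Return (ancestry, R^2) pairs, highest R^2 first.

    ancestry_share: dict of ancestry -> dict of state -> population share
    trends:         dict of state -> Google Trends value
    States missing from either source are skipped for that ancestry.
    """
    results = []
    for ancestry, by_state in ancestry_share.items():
        states = sorted(s for s in by_state if s in trends)
        if len(states) < 3:
            continue  # too few states to say anything
        xs = [by_state[s] for s in states]
        ys = [trends[s] for s in states]
        r2 = r_squared(xs, ys)
        if r2 is not None:
            results.append((ancestry, r2))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results[:top_n]


# Hypothetical toy data, not real census or Google Trends values.
trends = {"A": 10, "B": 20, "C": 30, "D": 40}
ancestry_share = {
    "Anc1": {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0},
    "Anc2": {"A": 2.0, "B": 1.0, "C": 4.0, "D": 3.0},
}
ranking = rank_ancestries(ancestry_share, trends)
for name, r2 in ranking:
    print(f"{name}: R^2 = {r2:.2f}")
```

With 108 ancestries tested against the same values, some high R^2 values are expected by chance alone, so a ranking like this is best read as exploratory.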

From an overview of these datasets:
The ACS asks each respondent to write their ancestry or ethnic origin, and records up to two ancestries per person (the first two ancestries written by the respondent).
The table for People Reporting Single Ancestry (B04004) shows data for those who reported only one ancestry, while People Reporting Multiple Ancestry (B04005) shows data for those who reported more than one ancestry. People Reporting Ancestry (B04006) shows data for those who reported any ancestry, regardless of whether it was the only ancestry or part of multiple ancestries they reported.

Note: this means that values in B04005 and B04006 will not necessarily add up to match totals, because one person may be represented under more than one ancestry.

I (and I think Murph) used B04006, which counts up to two ancestries per person. So, for example, the correlations for Scottish and English might both be high because the same people reported both ancestries.
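
To make the double-counting point concrete, here is a toy sketch. The respondents are invented and the two counting functions are only a simplified reading of the table descriptions quoted above, not how the Census Bureau actually builds the tables.

```python
from collections import Counter

# Hypothetical respondents, each listing one or two ancestries.
respondents = [
    ("English", "Scottish"),
    ("English", "Scottish"),
    ("English",),
    ("Swedish",),
    ("Scottish", "Irish"),
]


def count_any(responses):
    """B04006-style tally: a person counts under every ancestry they reported."""
    counts = Counter()
    for ancestries in responses:
        counts.update(set(ancestries))
    return counts


def count_single(responses):
    """B04004-style tally: only people who reported exactly one ancestry."""
    return Counter(a[0] for a in responses if len(a) == 1)


any_counts = count_any(respondents)
single_counts = count_single(respondents)

print("any ancestry:   ", dict(any_counts))
print("single ancestry:", dict(single_counts))
print(sum(any_counts.values()), "tallies for", len(respondents), "people")
```

In the "any ancestry" tally, English and Scottish rise and fall together because the same people sit in both columns, which is how two ancestries can both end up with a high correlation. The single-ancestry tally avoids that, at the cost of a much smaller count.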

I tried again with B04004, which only includes people who reported one ancestry, to avoid double counting and allow better comparison between ancestries.

In this case, the correlation with English is pretty much gone (the sample size is also much smaller for this dataset, so the ancestry values may be less precise):
1774304511897.png

There are still some large correlations in this analysis, and British is still near the top at #9, though less significant, with R^2 = 0.26 and p = 0.0018.
Screenshot from 2026-03-23 18-24-19.png

The top 5 correlations with Google Searches for ME/CFS are Northern European, European, Swedish, Icelander, and New Zealander.

Here for example is the plot of the top correlation, Northern European ancestry vs Google Trends for ME/CFS:
1774306030544.png

(Note that only 35 states were included as the rest had missing data.)

However, it looks like the correlation might be mainly driven by just the five states in the upper right. If I exclude those, the correlation is much less apparent:
1774306272899.png
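
A quick way to run this kind of check is to drop the few highest points and recompute R^2. The sketch below does that on invented numbers. Dropping the k states with the largest ancestry share is only a stand-in for "the states in the upper right", and the function names are hypothetical.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def drop_highest_x(points, k):
    """Return the points with the k largest x values removed."""
    ordered = sorted(points, key=lambda p: p[0])
    return ordered[: len(ordered) - k]


def r_squared_of(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return r_squared(xs, ys)


# Hypothetical (ancestry share, Google Trends value) pairs:
# a loose cluster at low values plus two points far up and to the right.
points = [(1, 5), (2, 3), (3, 6), (4, 2), (5, 4), (20, 30), (22, 28)]

full_r2 = r_squared_of(points)
trimmed_r2 = r_squared_of(drop_highest_x(points, 2))

print(f"all points:         R^2 = {full_r2:.2f}")
print(f"without the top 2:  R^2 = {trimmed_r2:.2f}")
```

If R^2 collapses once a handful of points are removed, as it does in this toy example, the overall fit is mostly describing the gap between two groups of states rather than a trend across all of them.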
 