A crumb of a clue on epidemiology

As @Yann mentioned, the graphs sometimes seem to be driven to a large extent by things related to public awareness, which might work in all sorts of unusual ways.
I agree with you and Yann, and I think the most likely reason for the correlation is that, for some reason, there is more awareness of ME/CFS among people with British ancestry. But I think it's a fun idea that has a small chance of actually being about prevalence.
 
Maybe it's all the Guardian's and George Monbiot's fault after all! I agree nothing is lost by playing around with this, as @Murph suggested.
 
The census data includes 108 ancestries. I ran the regression on all of them against the same Google Trends values. Here are the top 20 highest R^2 values:
1774240791839.png
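For anyone who wants to try the same kind of scan, here is a rough sketch of the loop. It is not the exact script behind the table above. It assumes the data is already loaded into plain dictionaries, and the names `ancestry_share`, `trends`, and `rank_by_r2` are made up for illustration. States with a missing value for a given ancestry are skipped for that ancestry.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear fit."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    return sxy * sxy / (sxx * syy)


def rank_by_r2(ancestry_share, trends):
    """ancestry_share: {ancestry: {state: value or None}} (hypothetical layout)
    trends: {state: Google Trends value}
    Returns [(ancestry, r2, n_states)] sorted from highest to lowest R^2."""
    results = []
    for ancestry, by_state in ancestry_share.items():
        # Keep only states that have both an ancestry value and a trend value.
        states = [s for s in trends if by_state.get(s) is not None]
        if len(states) < 3:
            continue
        xs = [by_state[s] for s in states]
        ys = [trends[s] for s in states]
        results.append((ancestry, r_squared(xs, ys), len(states)))
    return sorted(results, key=lambda row: row[1], reverse=True)
```

Taking the first 20 entries of the returned list gives the same kind of top-20 table.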
This was showing lots of UK countries at the top, and I realized that it may be because people could report multiple ancestries.

From an overview of these datasets:
The ACS asks each respondent to write their ancestry or ethnic origin, and records up to two ancestries per person (the first two ancestries written by the respondent).
The table for People Reporting Single Ancestry (B04004) shows data for those who reported only one ancestry, while People Reporting Multiple Ancestry (B04005) shows data for those who reported more than one ancestry. People Reporting Ancestry (B04006) shows data for those who reported any ancestry, regardless of whether it was the only ancestry or part of multiple ancestries they reported.

Note: this means that values in B04005 and B04006 will not necessarily add up to match totals, because one person may be represented under more than one ancestry.

I (and I think Murph) used B04006, which counts up to two ancestries per person. So, for example, the correlations for Scottish and English might both be high because the same people reported both ancestries.

I tried again with B04004, which only includes people who reported one ancestry, to avoid double counting and allow better comparison between ancestries.
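To make the double-counting point concrete, here is a toy sketch of the two ways of counting. The function name `tally` and the input layout are my own invention, and it only mimics the idea behind the tables, not how the Census Bureau actually builds them.

```python
from collections import Counter


def tally(respondents):
    """respondents: list of tuples of ancestries as written by each person.
    Returns (any_ancestry, single_only) counters:
      any_ancestry ~ B04006-style: each of the first two ancestries counts once
      single_only  ~ B04004-style: only people who reported exactly one ancestry
    """
    any_ancestry = Counter()
    single_only = Counter()
    for reported in respondents:
        recorded = reported[:2]  # only the first two ancestries are recorded
        any_ancestry.update(recorded)
        if len(reported) == 1:
            single_only[reported[0]] += 1
    return any_ancestry, single_only
```

With four made-up respondents, `[("English", "Scottish"), ("English",), ("Scottish", "English"), ("Swedish",)]`, the "any ancestry" counts add up to 6 even though there are only 4 people, and English and Scottish rise together because the same two people are in both. The single-ancestry counts only include the second and fourth person.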

In this case, the correlation with English is pretty much gone (the sample size is also much smaller for this dataset, so the ancestry values may be less precise):
1774308417525.png

There are still some large correlations in this analysis, and British is still near the top at #9, though the association is weaker than before, with R^2 = 0.26 and p = 0.0018.
Screenshot from 2026-03-23 18-24-19.png

The top 5 correlations with Google searches for ME/CFS are Northern European, European, Swedish, Icelander, and New Zealander.

Here for example is the plot of the top correlation, Northern European ancestry vs Google Trends for ME/CFS:
1774308463026.png

(Note that only 35 states were included as the rest had missing data.)

However, it looks like the correlation might be mainly driven by just the five states in the upper right. If I exclude those, the correlation is much less apparent:
1774308486210.png
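A quick way to repeat this kind of check is to compute R^2 with and without the most extreme states. This is only a sketch under my own assumptions: it drops the `k` states with the highest ancestry value, which is one way of picking out the "upper right" points, and the function names are made up.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear fit."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def r2_with_and_without_top(points, k=5):
    """points: list of (state, ancestry_value, trend_value) tuples.
    Returns (R^2 on all states, R^2 after dropping the k highest-ancestry states)."""
    ordered = sorted(points, key=lambda p: p[1])
    kept = ordered[: len(ordered) - k]
    full = r_squared([p[1] for p in points], [p[2] for p in points])
    trimmed = r_squared([p[1] for p in kept], [p[2] for p in kept])
    return full, trimmed
```

If the second number drops a lot compared to the first, the fit is leaning heavily on those few states.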

Edit: Added confidence bars to plots.

Edit: Maybe worth checking if the association is due to income level.
 
Are these searches in English only?
This page seems to suggest that if you are looking at a simple search term, then it will show trends just for that specific term in that language. But if you search by a "Topic", then it tracks trends across different translated versions of that topic.

- https://support.google.com/trends/answer/4359550?hl=en#zippy=,compare-terms-across-languages
if you enter ねこ, the Japanese characters for "cat," you won’t find much data for the US since many people in the US search for "cat."
Topics are a group of terms that share the same concept in any language. You can find topics below your search term.

For example, if you search London, and choose the corresponding topic, your search includes results for topics like "Capital of the UK" and "Londres," which is "London" in Spanish.

So I think the "Myalgic encephalomyelitis/chronic fatigue syndrome" I linked before is a "Topic" that should transcend language.
 
Hm. I really thought it might be that people with higher income search for ME/CFS more, as they might have better access to resources that would make them aware of what ME/CFS is. And maybe states with larger proportions of Northern European or British ancestry would have higher average income.

But the correlation is very weak between search trends for ME/CFS and median household income in each state.

This is again the Trends data I used in post #4: https://trends.google.com/trends/explore?q=/m/0dctd&geo=US&legacy&hl=en

And I used income data from the following dataset, using the "Households" row, and the "Median income (dollars)" column for each state: https://data.census.gov/table/ACSST...ag-Grid-AutoColumn~(Margin+of+Errorundefined)

Screenshot from 2026-03-23 22-26-50.png

When I control for income in the regression of search trend vs. English ancestry, ancestry still significantly predicts the trend value.
1774322202317.png

Edit: Replaced model summary with the regression results using the original unscaled values. I had originally used scaled values because I thought the difference in scale was causing issues with the regression, since income and ancestry are around 7 orders of magnitude apart, but the results are the same.
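For illustration, here is a small sketch of what "controlling for income" measures, using a partial correlation instead of the regression software that produced the summary above. The function names are my own. As far as I understand it, the t-statistic for ancestry in a two-predictor linear regression is a direct function of this partial correlation, so the two approaches should agree on whether ancestry still matters.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z.
    Here x = ancestry, y = search trend, z = median household income."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def t_from_partial(r, n):
    """t-statistic for the x coefficient in a regression of y on x and z,
    with n observations (n - 3 residual degrees of freedom)."""
    return r * sqrt((n - 3) / (1 - r * r))
```

Because a Pearson correlation does not change when a variable is multiplied by a positive constant or shifted, `partial_r` gives the same answer for scaled and unscaled inputs, which fits with the scaled and unscaled regressions giving the same results.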

"Households" is the income variable.
 