A crumb of a clue on epidemiology

mariovitali · Apr 7, 2026

A counterpoint would be that ME/CFS hits adolescents, who are presumably not drinking and have no liver function risk factors.

Please have a look at the following thread. Alcohol has been on my radar for quite some time (since May 2018). Please also make sure you read the comments:

https://forums.phoenixrising.me/threads/alcohol-tolerance-before-me-cfs.60004/

@Murph I will try to explain what may be happening (a hypothesis). Interestingly, one of my key questions with patients is whether they liked drinking when they were young.

1) I believe that patients who got ME/CFS early on (e.g. before 16 years of age) may have the clearest genetic signature. (Does such a genetic study exist, btw?)

2) For the cohort of adolescents: I believe that a significant number of them have compensated functioning. So there are no issues with drinking until a number of "hepatic hits" take place (e.g. medication, EBV/COVID-19/HHV-6 infection, toxin exposure) that disrupt this compensated functioning. It is then that certain genetic combinations make it difficult to restore proper metabolic and immune function, and as a result we get ME/CFS and, most likely, no tolerance to alcohol.

I also do not know whether a temporal aspect exists. For example, what if, given each person's genetic profile, each "hepatic hit" gradually impairs important metabolic functions? What if growing up lowers the number of "hepatic hits" that can be tolerated?

forestglip · Apr 7, 2026

A few more alcohol-related stats.

First, I looked at car crash deaths in which the driver had a blood alcohol content (BAC) greater than 0.01 or 0.08, as a proportion of all car crash deaths, from NHTSA. I picked the year 2015, since that seems like it would be the best year to compare to the Google Trends data, which spans 2004 to 2026, and 2015 is right in the middle.

There looks to be no positive relationship, maybe a small negative relationship, when correlating ME/CFS searches with either alcohol car crash stat.

BAC = 0.01+

BAC = 0.08+

Since it's a bit hard to interpret alcohol car crash deaths out of all car crash deaths, I also did it with alcohol-impaired driving fatalities per 100,000 population, using 2015 data from the Foundation for Advancing Alcohol Responsibility.

No correlation.

I also looked at the number of arrests for driving under the influence in each state per 100,000 population, calculated using arrest numbers and population from FBI 2015 data. Also no relationship.
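
The per-capita comparisons above come down to two steps: turn each state's raw count into a rate per 100,000 residents, then correlate that rate with the state's Google Trends score. A minimal Python sketch of those two steps, assuming made-up placeholder numbers rather than the actual NHTSA, FBI, or Trends values (the function names are illustrative and not taken from any package used in this thread):

```python
import math

def rate_per_100k(count, population):
    """Convert a raw count into a rate per 100,000 residents."""
    return count / population * 100_000

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

# Placeholder rows: (state label, DUI arrests, population, Trends score).
rows = [
    ("A", 5_000, 1_000_000, 80),
    ("B", 30_000, 5_000_000, 55),
    ("C", 9_000, 3_000_000, 62),
    ("D", 40_000, 10_000_000, 70),
]
rates = [rate_per_100k(arrests, pop) for _, arrests, pop, _ in rows]
scores = [score for _, _, _, score in rows]
r = pearson_r(rates, scores)
print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
```

Using rates rather than raw counts matters here because raw counts mostly track state population.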

---

And since I already had the arrests dataset, I checked the correlation of the arrest rate for each arrest type vs ME/CFS searches:

Many crime types are negatively correlated. So it is probably a general pattern in which more "crime" is associated with fewer searches, rather than anything specific to one class of crime. There's a small negative correlation with drunkenness too, which runs somewhat counter to the alcohol stats, but this might be confounded by the state's general crime rate.
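
One way to probe that confounding worry is a first-order partial correlation: the correlation between drunkenness arrests and ME/CFS searches after removing what both share with the overall arrest rate. The sketch below is only an illustration. It takes the three pairwise Pearson correlations as inputs, and the numbers in the example call are invented, not results from this thread.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    r_xy, r_xz and r_yz are the ordinary pairwise Pearson correlations.
    """
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom

# Invented example values:
#   x = drunkenness arrest rate, y = ME/CFS search score,
#   z = overall arrest rate
print(round(partial_r(r_xy=-0.20, r_xz=0.60, r_yz=-0.35), 3))
```

If the partial correlation moves toward zero compared with the raw one, that would be consistent with the general crime rate doing most of the work.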

---

So prescriptions for alcohol craving drugs, deaths due to alcohol, and government regulation of alcohol sales are positively correlated with ME/CFS searches.

But alcohol consumption, DUI arrests, drunkenness arrests, and DUI fatalities are negatively correlated or not correlated with ME/CFS searches.

Nightsong · Apr 7, 2026

Random thought of the day: it occurs to me that collecting search trends data over time might also reveal some interesting patterns. For example there was some evidence (link) of seasonal patterns to referrals to ME/CFS paediatric clinics in the UK - that might just be artefactual but I wonder if searches for ME/CFS exhibit seasonality too. Also if ME/CFS searches spike a few months after a particularly bad winter 'flu season or localised outbreaks of infective illness...

forestglip · Apr 7, 2026

Nightsong said:
Random thought of the day: it occurs to me that collecting search trends data over time might also reveal some interesting patterns. For example there was some evidence (link) of seasonal patterns to referrals to ME/CFS paediatric clinics in the UK - that might just be artefactual but I wonder if searches for ME/CFS exhibit seasonality too. Also if ME/CFS searches spike a few months after a particularly bad winter 'flu season or localised outbreaks of infective illness...

Interesting idea. I just did a few quick plots using the time version of the Trends data, one each for worldwide, United Kingdom, and USA.

I downloaded the whole timespan at once for each region, which gives one value per month. Downloading one year at a time gives one value per week, if higher resolution is desired.

Each plot shows the data separately per year, stacked on top of each other to see if there might be a common pattern across years. I don't see any obvious yearly pattern.
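
For anyone wanting to repeat the per-year stacking without a plotting tool, here is a small Python sketch. It assumes the monthly export has already been read into ("YYYY-MM", value) pairs; the values shown are placeholders, and the function names are made up for this example.

```python
from collections import defaultdict

def stack_by_year(monthly):
    """Group ("YYYY-MM", value) pairs into {year: {month: value}}."""
    by_year = defaultdict(dict)
    for stamp, value in monthly:
        year, month = stamp.split("-")
        by_year[int(year)][int(month)] = value
    return dict(by_year)

def mean_by_month(by_year):
    """Average each calendar month across all years that contain it."""
    buckets = defaultdict(list)
    for months in by_year.values():
        for month, value in months.items():
            buckets[month].append(value)
    return {m: sum(v) / len(v) for m, v in sorted(buckets.items())}

# Placeholder values, not real Trends data.
monthly = [("2004-01", 40), ("2004-02", 35), ("2005-01", 60), ("2005-02", 50)]
stacked = stack_by_year(monthly)
print(stacked)
print(mean_by_month(stacked))
```

Since Trends scales each download so that its peak is 100, years with more overall interest will sit higher in the stack; dividing each year by its own mean before comparing is one option.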

MrMagoo · Apr 7, 2026

ScoutB said:
Didn’t DecodeME only use people with European ancestry (or something like that)?

Too foggy to think it all through, but it feels like “group X” not being in the study might make interpreting “this variant is associated with ME/CFS and group X has it less often” more complicated.

They used people with majority of something white ancestry, I forget how they put it, but they used mine and mine is majority European but also includes black African.

jnmaciuch · Apr 7, 2026

MrMagoo said:
They used people with majority of something white ancestry, I forget how they put it, but they used mine and mine is majority European but also includes black African.

It would’ve been by degree of similarity to reference populations. “European” from the 1000 Genomes Project includes samples from several different countries, including Britain.

forestglip · Apr 7, 2026

I think what could be interesting is looking for the best correlations of ME/CFS searches with other searches. For example, do the states that search most for ME/CFS also search most for "mono"?

There's no publicly available API for Google Trends, though, so data would have to be downloaded one at a time through the browser. The idea I had was to start by looking at the correlation of ME/CFS searches with searches for 10 random words. For the most correlated search term, find 10 concepts related to that search term, with something like a thesaurus, and test correlation with those to see if any are even better. And keep iterating.
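
The iterate-on-the-best-term idea can be sketched as a small greedy search. This is a rough illustration in Python, not a tested pipeline: `get_interest`, `related_terms`, and `score` are hypothetical callables the caller would have to supply (for example a Trends download, a thesaurus lookup, and a correlation function), and the toy data at the bottom only exists to show the control flow.

```python
def greedy_term_search(target, seeds, get_interest, related_terms, score, rounds=3):
    """Greedy search for the term whose interest best matches `target`.

    get_interest(term) -> data for one term          (hypothetical)
    related_terms(term) -> list of related terms     (hypothetical)
    score(target, data) -> number, higher is better  (hypothetical)
    """
    best_term, best_score = None, float("-inf")
    candidates = list(seeds)
    tested = set()
    for _ in range(rounds):
        improved = False
        for term in candidates:
            if term in tested:
                continue  # never download the same term twice
            tested.add(term)
            s = score(target, get_interest(term))
            if s > best_score:
                best_term, best_score, improved = term, s, True
        if not improved:
            break  # no candidate beat the current best, so stop early
        candidates = related_terms(best_term)
    return best_term, best_score

# Toy stand-ins: each "term" maps to a number, and closer to 10 scores higher.
values = {"a": 1, "b": 5, "c": 8, "d": 10, "e": 2}
related = {"b": ["c", "e"], "c": ["d"], "d": []}
result = greedy_term_search(
    target=10,
    seeds=["a", "b"],
    get_interest=lambda term: values[term],
    related_terms=lambda term: related.get(term, []),
    score=lambda target, data: -abs(target - data),
)
print(result)
```

A greedy search like this can get stuck on a locally good term, so restarting from several different seed lists would be a reasonable precaution.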

But I think without being able to automate doing that for thousands of search terms, it might be too slow to be fruitful. But maybe still worth testing with some hand selected terms.

Murph · Apr 8, 2026

API substitute:

Code:

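library(gtrendsR)
library(readr)  # write_csv() comes from readr
# Download state-level interest for two terms and save it as CSV
write_csv(gtrends(keyword = c("mono", "flu"), geo = "US", time = "all")$interest_by_region, "monotrends.csv")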

Not sure if python has similar.

forestglip · Apr 8, 2026

Murph said:
API substitute:

Code:

library(gtrendsR)
library(readr)
write_csv(gtrends(keyword = c("mono", "flu"), geo = "US", time = "all")$interest_by_region, "monotrends.csv")

Not sure if python has similar.

Nice, I'll take a look.

Python has (had) pytrends, another unofficial API, but it stopped working/being maintained a couple years ago, based on GitHub issues.

This note from the maintainer of pytrends is a bit worrying:

over time I learned that the data returned by this Google Trends "API" is just fake: when they detect that you're a bot, something that Google is pretty good at it, it gives you slightly altered data different from the one you can see in their website.
At first the trends found in the data more or less correlated between the website and Pytrends, unfortunately this wasn't always the case, and thus I've learned that the conclusions you can reach driven by this data can be very wrong.

If Google gives fake data when downloading with a bot, it might affect the R package too.

Though, they didn't provide any evidence of fake data, so I think it's possible they may have been seeing the Google Trends data simply changing over time, like I mentioned before. I guess I could confirm that a small sample of results from gtrendsR matches the website.

Murph · Apr 8, 2026

forestglip said:
Nice, I'll take a look.

Python has (had) pytrends, another unofficial API, but it stopped working/being maintained a couple years ago, based on GitHub issues.

This note from the maintainer of pytrends is a bit worrying:

If Google gives fake data when downloading with a bot, it might affect the R package too.

Though, they didn't provide any evidence of fake data, so I think it's possible they may have been seeing the Google Trends data simply changing over time, like I mentioned before. I guess I could confirm that a small sample of results from gtrendsR matches the website.

Google salts the data slightly, i.e. adds randomisation, to prevent it revealing too much about their business. Pulling the data a few times might help average that out.

forestglip · Apr 8, 2026

Murph said:
Google salts the data slightly, i.e. adds randomisation, to prevent it revealing too much about their business. Pulling the data a few times might help average that out.

I just did some testing and the function works (most of the time - sometimes it seems to just hang.)

I was getting worried about the R function getting different data than what I saw in the browser even though I was looking at both at the same time. Then I realized that my browser wasn't going through a VPN, but R was. So I got worried that it might be serving fake data if on a VPN (which I've used for most of my downloads of the data so far in this thread).

After some experimentation, I saw that if I set the VPN location to somewhere near my actual location, then the data was the same whether on VPN or not. On the other hand, if I set it to any city in Sweden, for example, it gives a different version of the data, but the same between cities. Similarly, if I set it to any IP in Vancouver, Canada, it's again different data from the other locations, but the same between different IPs within that city.

So I think Google Trends serves data from local data centers, close to wherever the IP requesting data is located, with each data center having a slightly different randomized version of the data (with each one probably getting re-randomized every so often as well). So at this point, I think there probably isn't any funny business related to being on a VPN or requesting from R, but rather just their regular slight randomization.

And yeah, I think I will re-download the ME/CFS data a few times and create an average version which will hopefully be more accurate than any one version.
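
Averaging several downloads could look roughly like the Python sketch below. It assumes each pull has already been loaded into a {region: score} dictionary; the scores are placeholders, and `average_pulls` is a name made up for this example.

```python
def average_pulls(pulls):
    """Average several {region: score} downloads region by region.

    A region missing from some pulls is averaged over the pulls
    that do include it.
    """
    regions = set().union(*pulls)
    averaged = {}
    for region in sorted(regions):
        values = [pull[region] for pull in pulls if region in pull]
        averaged[region] = sum(values) / len(values)
    return averaged

# Placeholder scores from three separate downloads.
pulls = [
    {"Utah": 100, "Maine": 90},
    {"Utah": 96, "Maine": 94},
    {"Utah": 98},
]
print(average_pulls(pulls))
```

Going by the VPN test above, repeat downloads from the same location at the same time appear to return identical numbers, so the pulls would need to come from different locations or different days for the averaging to reduce any noise.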

MrMagoo · Apr 8, 2026

I was looking for that famous “outbreak” of ME in America and came across this list.

Iceland seems over represented

Outbreaks – American ME and CFS Society

ammes.org

forestglip · Apr 8, 2026

Just to check how well a Google Trends search for the ME/CFS "Topic" (which should be a combination of related terms, including other languages) compares to the data just for the specific term "ME/CFS", here is the data for both:

R^2 = 0.52. That's basically only as good as the correlation of British ancestry with the ME/CFS topic! I wonder what other terms they're including in the ME/CFS Topic. Maybe terms that aren't actually directly related, like "fatigue".

Edit: Well, it's substantially worse (R^2 = 0.21) if comparing the ME/CFS topic to "fatigue":

Murph · Apr 8, 2026

forestglip said:
Just to check how well a Google Trends search for the ME/CFS "Topic" (which should be a combination of related terms, including other languages) compares to the data just for the specific term "ME/CFS", here is the data for both:

R^2 = 0.52. That's basically only as good as the correlation of British ancestry with the ME/CFS topic! I wonder what other terms they're including in the ME/CFS Topic. Maybe terms that aren't actually directly related, like "fatigue".

Edit: Well, it's substantially worse (R^2 = 0.21) if comparing the ME/CFS topic to "fatigue":

I believe the list of related searches on the site is the bulk of what's included in the "topic". Includes terms like fatigue, CFS. Certainly captures plenty of noise.

Especially in Montana, I'm not sure it isn't capturing searches for river flow rates in cubic feet per second (CFS) and their football competition (FCS, but mistyped).

forestglip · Apr 8, 2026

I've been using the Google Trends results for the ME/CFS Topic. Since Topics are more of a black box in terms of what search terms they contain, I wanted to check how the specific search term "ME/CFS", as well as other terms, correlates with ancestry.

This whole thing started with English ancestry, but since Scottish ancestry seems even more correlated than English, I'm focusing on Scottish for now.

I downloaded the Google Trends state results for a variety of terms that might be related to ME/CFS, like "fatigue", "exhausted", and "long covid", and some that are only vaguely related, like "multiple sclerosis" and "disability". I then tested the correlation of search interest in each term with Scottish ancestry.

Interestingly, the specific search term "ME/CFS" was a little better than the ME/CFS Topic. Even better than both of these was "chronic fatigue". The term without quotes was number one, while the term with quotes was number two.

Correlations of state search term interest with Scottish ancestry (all searches cover 1/1/04 - 3/24/26)

#	Search term	Pearson R	Pearson R^2	Spearman R^2
1	chronic fatigue	0.77	0.60	0.48
2	"chronic fatigue"	0.72	0.53	0.45
3	ME/CFS	0.72	0.51	0.30
4	Myalgic encephalomyelitis/chronic fatigue syndrome [TOPIC]	0.70	0.50	0.32
5	myalgic encephalomyelitis	0.68	0.46	0.35
6	chronic fatigue syndrome	0.67	0.45	0.37
7	"long covid"	0.61	0.37	0.11
8	orthostatic intolerance	0.53	0.28	0.19
9	fatigue	0.51	0.26	0.25
10	brain fog	0.46	0.21	0.23
11	burnout	0.44	0.19	0.12
12	sleep	0.42	0.17	0.23
13	"burn out"	0.38	0.14	0.14
14	burn out	0.20	0.041	0.11
15	neuropathy	0.20	0.038	0.057
16	exhausted	0.15	0.023	0.032
17	tired	0.06	0.004	0.006
18	exercise	0.04	0.0014	0.011
19	multiple sclerosis	-0.04	0.0014	0.0002
20	disability	-0.02	0.0006	0.027
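
For reference, the three columns can be reproduced with a few lines of Python. This is a sketch under the assumption that the Spearman column is simply the Pearson correlation of the ranks, squared; the five data points are placeholders, not the state data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_r(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Placeholder values: ancestry share (%) and search interest for five states.
ancestry = [1.2, 1.5, 2.0, 2.4, 6.0]
interest = [40, 38, 55, 60, 100]
r = pearson_r(ancestry, interest)
rho = spearman_r(ancestry, interest)
print(f"Pearson R = {r:.2f}, Pearson R^2 = {r * r:.2f}, Spearman R^2 = {rho * rho:.2f}")
```

A large gap between the Pearson and Spearman values, as for "long covid" in the table, can be a sign that a few states with extreme values are carrying much of the Pearson correlation.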

Here is the top correlation of Scottish ancestry with search interest in "chronic fatigue" with no quotes:

Google Trends allows comparing search interest between multiple terms at once, so here is "chronic fatigue" (top/blue line) and "ME/CFS" (bottom/red line) over time:

A lot more people are searching for chronic fatigue than ME/CFS. It's possible that lower searches for ME/CFS make the results for this term less precise, potentially explaining why it is less correlated to ancestry than "chronic fatigue". Or, of course, it's possible that the underlying correlation is truly higher for "chronic fatigue".

Anyway, it's good to see that the specific search term "ME/CFS" correlates even better than the ME/CFS Topic with Scottish ancestry (though just barely, probably within margin of error), suggesting that it's not just searches for things like "fatigue" driving the correlation.

----

If testing with English ancestry, the ME/CFS Topic ranks higher than the ME/CFS specific term.

Interestingly, I haven't been able to get quite as high of a correlation of ME/CFS with English ancestry as Murph in the initial post. (R = 0.52)

@Murph, do you know which specific time span and search term you used? Was it the Topic version?

Weirdly, the proportion of English ancestry I calculated is not exactly the same as what is on the website that Murph initially got data from, even though I used the same source they cited: https://data.census.gov/table/ACSDT5Y2024.B04006?g=010XX00US$0400000&moe=false

For example, for Utah, the worldpopulationreview.com website says English population was 958,812, and this matches the census. However, if dividing that value by the total Utah population given in the census (3,392,331) I get 28.26% English ancestry, while the website shows 26.82%.

For the percentage they have, they had to have divided by a Utah population somewhere between 3,574,323 - 3,575,655. I'm not sure where they got that population.

(I wondered whether it was just accidentally swapping digits for Utah (28.26 vs 26.82), but the percentage differs for other states as well, e.g. Alabama: calculated from census: 12.13%, worldpopulationreview.com: 11.81%.)
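
The Utah arithmetic above can be checked with a short Python sketch. It assumes the website rounds its percentage to two decimals, which is what makes the implied population a range rather than a single number; `implied_population_range` is a name made up for this example.

```python
import math

english = 958_812              # English ancestry count for Utah (census)
census_population = 3_392_331  # total Utah population (census)

pct_from_census = english / census_population * 100
print(f"from census: {pct_from_census:.2f}%")

def implied_population_range(count, shown_pct, decimals=2):
    """Whole-number populations for which count / population rounds to shown_pct."""
    half_step = 0.5 * 10 ** -decimals
    low = count / ((shown_pct + half_step) / 100)   # larger share, smaller population
    high = count / ((shown_pct - half_step) / 100)  # smaller share, larger population
    return math.ceil(low), math.floor(high)

low, high = implied_population_range(english, 26.82)
print(f"population implied by 26.82%: {low:,} to {high:,}")
```

The same function can be pointed at other states to see whether the implied denominators follow a consistent pattern, such as a different population estimate year.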

It doesn't really change things much, but I'm curious if the reason the correlation was even higher with that website's percentages is because they have some more accurate population estimate somehow.

forestglip · Apr 8, 2026

MrMagoo said:
Is there any other geographical way to slice and dice it? Counties or cities?

I came back to this. I found out that the 210 "Metro" divisions for the US in Google Trends aren't just arbitrary shapes. They match up with Nielsen Designated Market Area (DMA) boundaries. (These DMAs are "a proprietary geography defined by Nielsen. They are non-overlapping geographic regions that group counties based on television viewing areas.")

The US census has data split up by county. So if we can map the counties to the DMA they belong to, we can get data on a Google Trends Metro level. They don't release any such mappings of counties to DMA publicly, but I found a dataset someone made that maps counties to DMAs, and used that to sum up the ancestry totals for each county within a DMA.
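
The county-to-DMA roll-up can be sketched in Python as below. This is an illustration only: the FIPS codes, DMA names, and counts are invented, the mapping dictionary stands in for the third-party dataset, and a county that straddles two DMAs would need extra handling that this sketch does not attempt.

```python
from collections import defaultdict

def ancestry_share_by_dma(county_rows, county_to_dma):
    """Sum county counts within each DMA, then compute the ancestry share.

    county_rows: iterable of (county_fips, ancestry_count, total_population)
    county_to_dma: {county_fips: dma_name}
    Returns (shares, unmapped), where unmapped lists counties with no DMA.
    """
    counts = defaultdict(int)
    populations = defaultdict(int)
    unmapped = []
    for fips, count, population in county_rows:
        dma = county_to_dma.get(fips)
        if dma is None:
            unmapped.append(fips)
            continue
        counts[dma] += count
        populations[dma] += population
    shares = {dma: counts[dma] / populations[dma]
              for dma in populations if populations[dma] > 0}
    return shares, unmapped

# Invented rows and mapping, for illustration only.
rows = [
    ("01001", 100, 1_000),
    ("01003", 300, 3_000),
    ("02001", 50, 200),
    ("99999", 1, 1),  # not present in the mapping
]
mapping = {"01001": "DMA A", "01003": "DMA A", "02001": "DMA B"}
shares, unmapped = ancestry_share_by_dma(rows, mapping)
print(shares, unmapped)
```

Summing counts and populations before dividing weights each county by its size, which is not the same as averaging the county percentages.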

I don't know exactly how accurate the mapping dataset is. I know it has at least some inaccuracies, because as one blog points out, around 12 out of the 210 DMAs Nielsen defined include only parts of counties, instead of the whole counties, so the mapping isn't straightforward for these. It's probably good enough just to look at, though.

So here is the mapping of Scottish ancestry within 208 metro regions to Google Trends scores for ME/CFS for those regions. There's maybe a slightly smaller correlation, but it still looks like there's a relationship at the smaller Metro level.

Edit: Realized my mouse pointer was in the plot screenshot, so replaced the image.

MrMagoo · Apr 8, 2026

How interesting…thank you!
I do have quite a strong Irn Bru habit which I’m eyeing suspiciously…apart from that I’m out of ideas. Crazy to think Scotland had no ME services until recently given it’s the birthplace of it (joking!).

forestglip · Apr 8, 2026

MrMagoo said:
I do have quite a strong Irn Bru habit which I’m eyeing suspiciously…

That's funny. I had to look it up:

- https://en.wikipedia.org/wiki/Irn-Bru

Irn-Bru is a Scottish carbonated soft drink, often described as "Scotland's other national drink" after Scotch whisky.

MrMagoo · Apr 8, 2026

forestglip said:
That's funny. I had to look it up:

- https://en.wikipedia.org/wiki/Irn-Bru

They generally have humorous, cheeky adverts.
This one is a classic “made in Scotland, from girders” (apparently that’s where they extract the iron from lol)

Murph · Apr 9, 2026

forestglip said:
Interestingly, I haven't been able to get quite as high of a correlation of ME/CFS with English ancestry as Murph in the initial post. (R = 0.52)

@Murph, do you know which specific time span and search term you used? Was it the Topic version?

I just reran it with freshly downloaded search data and my r^2 is different. Actually higher! 0.56.

I'm surprised and encouraged. If you access some data and it amazes you but you know it contains noise, you have to expect reversion to the mean (less amazement) if you access it again. I got the reverse.

The catch-all term Google Trends is showing me now is CFS/me and the subscript the browser shows under that term is not Topic, it now says Disability.

I've been trying to stress test the idea. Certainly if you include only America's most populous few states the pattern is relatively weak. It does depend on a few smaller states. They're not all independent as you say.

But they are unlikely to be all wholly confounded. Vermont and Maine maybe, Oregon and Washington maybe. But Oregon, Utah, Montana and Vermont hopefully vary enough.
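
One simple stress test for dependence on a few states is leave-one-out: recompute the correlation with each state dropped in turn and see how far it moves. A minimal Python sketch, using placeholder numbers in which one state is deliberately an outlier (none of these are the real ancestry or Trends values):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def leave_one_out_r(labels, xs, ys):
    """Correlation recomputed with each labelled point removed in turn."""
    xs, ys = list(xs), list(ys)
    return {
        label: pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i, label in enumerate(labels)
    }

# Placeholder data: state "E" is an outlier on both axes.
labels = ["A", "B", "C", "D", "E"]
ancestry = [5.0, 6.5, 7.0, 9.0, 21.0]
searches = [40, 52, 45, 60, 100]

print(f"all states: r = {pearson_r(ancestry, searches):.2f}")
for label, r in leave_one_out_r(labels, ancestry, searches).items():
    print(f"without {label}: r = {r:.2f}")
```

If dropping any single state leaves the correlation roughly where it was, the pattern is less likely to hinge on one or two small states.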
