Preprint Percutaneous Auricular Nerve Stimulation for Treating Post-COVID Fatigue (PAuSing-pCF), 2026, Germann et al

forestglip

Percutaneous Auricular Nerve Stimulation for Treating Post-COVID Fatigue (PAuSing-pCF)

Germann, Maria; Maffitt, Natalie J.; Burton, Olivia A.; Ashhad, Amn; Baker, Anne M.E.; Cherlin, Svetlana; Shahmandi, Marzieh; Charlton, Norman; Baker, Aidan S.; Zaaimi, Boubker; Ng, Wan-Fai; Soteropoulos, Demetris S.; Baker, Stuart N.; Wason, James M.S.; Baker, Mark R.

[Line breaks added]

Abstract
Even mild SARS-CoV-2 infection can lead to post-COVID syndrome, and 70% of such patients have post-COVID fatigue (pCF).

Many physiological abnormalities observed in pCF could be explained by reduced vagus nerve activity. The vagus nerve, central to metabolic and inflammatory homeostasis, can be activated non-invasively by transcutaneous auricular vagus nerve stimulation (taVNS). Can taVNS improve symptoms in pCF?

Data were collected from a randomized study including 114 individuals with pCF. They completed 16 weeks of daily home-based active, sham, or placebo taVNS. Data on subjective fatigue, captured by a Visual Analogue Scale (VAS), and objective measures of cortical excitability, muscle fatigue and autonomic function were collected.

In participants meeting minimum adherence (≥1 h/day on ≥50% of days), VAS and peripheral fatigue improved significantly after 8 weeks of active (but not sham or placebo) taVNS (11.9 ± 17.8 points improvement, p=0.003, N=24). These results support taVNS as a potential therapy for pCF.

Web | DOI | PDF | medRxiv | Preprint
 
In participants meeting minimum adherence (≥1 h/day on ≥50% of days), VAS and peripheral fatigue improved significantly after 8 weeks of active (but not sham or placebo) taVNS (11.9 ± 17.8 points improvement, p=0.003, N=24).
It seems like the main result is based on testing significance within each group separately and comparing p-values. This isn't how you demonstrate that one group differs significantly from another group.
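To make this concrete, here's a minimal Python sketch. Only the active arm's numbers (11.9 ± 17.8 points, n=24) come from the abstract; the placebo arm's 7-point improvement is invented for illustration. The within-group tests reproduce the paper's "significant vs. not significant" pattern, yet a direct between-group test finds no difference:

```python
# Sketch: "significant in one group, not in the other" != "the groups differ".
# Active arm stats (11.9 +/- 17.8, n=24) are from the abstract; the placebo
# arm's 7-point improvement is a hypothetical value for illustration.
import numpy as np
from scipy import stats

def sample_with_stats(mean, sd, n, rng):
    """Draw n values, then rescale so the sample mean/SD match exactly."""
    x = rng.normal(size=n)
    x = (x - x.mean()) / x.std(ddof=1)
    return mean + sd * x

rng = np.random.default_rng(1)
active = sample_with_stats(11.9, 17.8, 24, rng)
placebo = sample_with_stats(7.0, 17.8, 16, rng)

print(stats.ttest_1samp(active, 0.0))    # p ~ 0.003: "active improved"
print(stats.ttest_1samp(placebo, 0.0))   # p ~ 0.14:  "placebo didn't"
print(stats.ttest_ind(active, placebo))  # p ~ 0.40:  the groups don't differ
```

This is the classic "the difference between 'significant' and 'not significant' is not itself statistically significant" problem (Gelman & Stern, 2006).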

Also, there are different numbers of people in each group:
24 were randomized to active nVNS, 19 received sham stimulation and 16 were allocated to the placebo group.
If we assume each group might have similar amounts of improvement due to placebo, the group with the most participants (the active group) will tend to have the lowest p-values just due to increased statistical power.
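A quick simulation bears this out (arm sizes are from the paper; the true improvement and SD are assumed): if every arm has the same true placebo response, the within-group test still comes out "significant" most often in the largest arm:

```python
# Simulate an identical true placebo improvement in all three arms and count
# how often each arm's within-group test reaches p < 0.05. Arm sizes are
# from the paper; the 8-point improvement and SD 17.8 are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_improvement, sd, n_sims = 8.0, 17.8, 10_000

for name, n in {"active": 24, "sham": 19, "placebo": 16}.items():
    sims = rng.normal(true_improvement, sd, size=(n_sims, n))
    pvals = stats.ttest_1samp(sims, 0.0, axis=1).pvalue
    print(f"{name} (n={n}): 'significant' in {np.mean(pvals < 0.05):.0%} of runs")
# Despite identical true effects, larger arms cross p < 0.05 more often.
```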

It looks like they did compare the groups directly to each other, and found that the difference was not significant:
A repeated measures ANOVA showed no significant interaction between group and session for the visual analogue scale (VAS F(4,110)=1.516, p=0.203), FIS scores (FIS_Total F(4,112)=0.727, p=0.575, FIS_Social F(4,112)=0.770, p=0.548, FIS_Physical F(4,112)=0.680, p=0.607, FIS_Cognitive F(4,112)=0.691, p=0.600), or peripheral fatigue (TI_PeriphFatigue F(4,104)=2.884, p=0.026; not significant after correcting for multiple comparisons).
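The quote doesn't say which correction was applied, but even the crudest one, Bonferroni across the six reported interaction tests, reproduces the conclusion for peripheral fatigue:

```python
# Bonferroni check (the paper's exact correction method isn't stated in the
# quote above) across the six reported group-by-session interaction p-values.
p_values = {
    "VAS": 0.203,
    "FIS_Total": 0.575,
    "FIS_Social": 0.548,
    "FIS_Physical": 0.607,
    "FIS_Cognitive": 0.600,
    "TI_PeriphFatigue": 0.026,
}
m = len(p_values)
for name, p in p_values.items():
    print(f"{name}: raw p = {p:.3f}, adjusted p = {min(1.0, p * m):.3f}")
# TI_PeriphFatigue: 0.026 * 6 = 0.156 -- well above 0.05 after adjustment.
```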
 
It seems like the main result is based on testing significance within each group separately and comparing p-values. This isn't how you demonstrate that one group differs significantly from another group.
Yet they claim that's exactly what they're assessing:
The main between-group results of the randomised study following the Statistical Analysis Plan are to be reported elsewhere. This paper reports a subgroup analysis which only included participants meeting minimum adherence (≥1 h/day on ≥50% of days), to explore causality and efficacy of the intervention.
The protocol also said they would include activity measurements:

Home monitoring

In addition to participant reported levels of fatigue, we will collect data on objective digital biomarkers of fatigue using CE marked wearables technology. This will involve a wrist-worn accelerometer (AX6, Axivity, UK) and a sticky patch on the chest (Vitalpatch, MediBioSense, UK; measures ECG, skin temperature, respiratory rate and energy expenditure among others). Accelerometer data will be collected for 14 days (week 0 & week 1) and 7 days (week 8). Vitalpatch data will be collected for 7 days during week 0 and week 8 respectively.

Android smartphones (Samsung Galaxy A32 5G Enterprise) will be given to participants, which will capture and store Vitalpatch data through the MediBioSense Health Mobile Application. The smartphones can also be used to answer the online questionnaires, should participants not own a smartphone or not wish to use their personal phone.
It sounds like they know they got null results, are delaying the publication of those, and are trying to find a positive spin on something.
 

If we assume each group might have similar amounts of improvement due to placebo, the group with the most participants (the active group) will tend to have the lowest p-values just due to increased statistical power.
I read further, and they make literally the same argument, except to explain why the placebo group didn't improve after crossing over to receive the intervention:
The placebo group did not show any significant changes in fatigue PROs after crossing over to receive active nVNS (between week 8 and week 16). This is presumably because, after adjusting for dropouts and participants not meeting minimum usage, the placebo group was left with fewer participants (n=16) than the active nVNS (n=24) or the sham (n=19) groups
So in the first phase, the active group significantly improved and the placebo group didn't - apparently evidence the intervention works. Then the placebo group receives the intervention and doesn't improve - well, apparently the placebo group is just too small to show significant improvements...
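The "too few participants" excuse is also checkable. A rough power calculation (assuming the crossed-over placebo arm should show the same effect the paper reports for the active arm, an 11.9-point improvement with SD 17.8) suggests n=16 isn't dramatically underpowered:

```python
# Rough power of a within-group (paired) t-test at the effect size the paper
# reports for the active arm: d = 11.9 / 17.8 ~ 0.67. Assumes the crossover
# effect would match the active arm's -- an assumption, not a result.
import numpy as np
from scipy import stats

d = 11.9 / 17.8  # standardized effect size

for n in (16, 19, 24):
    df = n - 1
    t_crit = stats.t.ppf(0.975, df)  # two-sided alpha = 0.05
    ncp = d * np.sqrt(n)             # noncentrality parameter
    power = 1 - stats.nct.cdf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)
    print(f"n={n}: power ~ {power:.2f}")
# n=16 still yields ~70% power, so arm size alone is a weak explanation
# for the null crossover result.
```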
 
It sounds like they know they got null results, are delaying the publication of those, and are trying to find a positive spin on something.
It's so obvious that if this discipline is to achieve anything, it will need to separate the people developing interventions from those testing them. No amount of internal checks can account for the extreme biases at play. The idea that people will properly test something they themselves developed, for the very function they are testing, is truly absurd. Quality assurance is always a separate department in any serious organization.

But it won't ever happen, because it would almost always lead to null results. The trialists would also need training that does not currently exist; they would basically need to be inquisitor-level skeptical. They would need to be explicitly adversarial and incentivized to be so, like red/blue teams in security testing. The current system is basically one where the defence, the prosecution and the judge all work together toward the same goal. It's completely unfit for purpose.
 