Trial Report: Plasma cell targeting with the anti-CD38 antibody daratumumab in ME/CFS - a clinical pilot study, 2025, Fluge et al

If we're talking about averages, it has to be about the whole group. Otherwise, it would be cherrypicking - the same as if we found one person in the Rituximab study with a 5000 step count increase, called them a responder, and were comparing to that.

From Dara trial:

So average increase of 2503 steps at 8-9 months.

@forestglip


Pooling response by survey/SF36 data:

Responders in P3/Dara:

Average responder P3 (n=46): +1k steps (2.2k to 3.2k).

Average responder Dara (n=6): +4k steps (3k to 7k).

Non responders in P3/Dara:

Average stable symptoms P3 (n=90): +1.5k steps (2k to 3.5k).

Average worsening symptoms P3 (n=15): +0.6k steps (1.8k to 2.4k).

Average non-responder Dara (n=4): no change (3k to 3k).

------------------------------------

In ritux P3, step effect does not correlate with SF36 response. In Dara, step effect correlates with SF36 response quite well.

Surveys like SF36 are breeding grounds for placebo effects; step counts are not.
 
There are good reasons to be cautious, I think.

In the dara pilot, the SF36 physical function scores of people given daratumumab increased by a mean 29 points (from a mean 26 to a mean 55) over [timeframe unclear to me] months.

In the phase II rituximab study (Fluge et al. 2015), the SF36 physical function scores of people given rituximab increased by around 28 points by 20 months (from a mean 40 to a mean 68), and stayed there to 36 months (see last line of table below):

[Attached image 1770383053577.png: table from Fluge et al. 2015]
 
Average responder step count increase in P3 n=46: 1k steps.

Average responder step count increase in Dara n=6: 4k steps.
Ah, sorry. I didn't realize you were comparing to responders in the Rituximab study.

I think there might be some statistical issues with comparing averages of responders only. One way to think of it is, imagine both studies have a small number of individuals with a huge step count increase of 4K due to natural recovery. These people go into the responder category. And imagine Rituximab is the drug that actually works and increases step count an average of 750 steps, while Dara does nothing.

We imagine there are many "responders" in the ritux study who improved by about 750 steps, and maybe when we average with the natural recovery people, it comes out to 1000 steps.

With dara, we imagine there was no real benefit, so the only "responders" are the ones with natural recovery of 4000 steps.

So we'd be seeing 4000 in dara vs. 1000 in ritux, even though ritux is where it worked. It's better to look at the overall group for statistical inference, before any such selection bias can occur.
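
To put rough numbers on that scenario, here is a minimal sketch. The group sizes are made up purely for illustration (12 people with a real 750-step drug effect plus one natural recovery in ritux, one natural recovery in dara); none of these counts come from either study.

```python
from statistics import mean

# Hypothetical step-count gains, chosen only to match the thought experiment.
NATURAL_RECOVERY_GAIN = 4000  # spontaneous improvement, unrelated to any drug
RITUX_DRUG_GAIN = 750         # modest real drug effect in the imagined scenario

# "Responder" groups as a symptom-based definition might assemble them.
# Ritux: 12 people helped a little by the drug, plus 1 natural recovery.
ritux_responders = [RITUX_DRUG_GAIN] * 12 + [NATURAL_RECOVERY_GAIN]
# Dara: the drug does nothing here, so only the natural recovery qualifies.
dara_responders = [NATURAL_RECOVERY_GAIN]

ritux_mean = mean(ritux_responders)
dara_mean = mean(dara_responders)

print(f"ritux responder mean: {ritux_mean:.0f} steps")
print(f"dara responder mean:  {dara_mean:.0f} steps")
```

With these made-up inputs the responder-only average is far higher for dara, even though in this scenario only ritux has a real effect.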
 
There are good reasons to be cautious, I think.

In the dara pilot, the SF36 physical function scores of people given daratumumab increased by a mean 29 points (from a mean 26 to a mean 55) over [timeframe unclear to me] months.

In the phase II rituximab study (Fluge et al. 2015), the SF36 physical function scores of people given rituximab increased by around 28 points by 20 months (from a mean 40 to a mean 68), and stayed there to 36 months (see last line of table below):

[Attachment 30449]

I think phase 3 proves that survey data is meaningless and step counts are the true revealer.

Survey data is only reliable if it correlates with step count data. In Ritux P3 it did not; in Dara it did.

Again, self-reported outcomes vs observed outcomes.
 
Ah, sorry. I didn't realize you were comparing to responders in the Rituximab study.

I think there might be some statistical issues with comparing averages of responders only. One way to think of it is, imagine both studies have a small number of individuals with a huge step count increase of 4K due to natural recovery. These people go into the responder category. And imagine Rituximab is the drug that actually works and increases step count an average of 750 steps, while Dara does nothing.

We imagine there are many "responders" in the ritux study who improved by about 750 steps, and maybe when we average with the natural recovery people, it comes out to 1000 steps.

If a drug increases step counts by a mean of 750, then by definition it cannot work. If it cures ME, over time, people return to normal life, and their step counts shoot up.

I don't see any evidence that natural recovery happens at all. Even if it did, and a small % of people recovered naturally, the odds of selecting 6 natural recovery patients in Dara would simply be too low for this to be true, as I previously said.
 
If a drug increases step counts by a mean of 750, then it can't be defined as working.
Ok, but this doesn't change the issue outlined in the scenario above.

If a study puts the people who have a small increase in step count due to a (very minor) benefit from Rituximab, along with naturally recovered individuals, into a responder group, while it doesn't put anyone taking Dara, apart from those with natural recovery, into a responder group, then Dara will look better.
 
Ok, but this doesn't change the issue outlined in the scenario above.

If a study puts the people who have a small increase in step count due to a (very minor) benefit from Rituximab, along with naturally recovered individuals, into a responder group, while it doesn't put anyone taking Dara, apart from those with natural recovery, into a responder group, then Dara will look better.

I already addressed this issue. If we assume the odds of picking a "natural recovery" patient from a population are low, then the odds of picking 6 natural recovery patients from a sample of 10 in Dara are impossibly low. To which the counterargument was that we don't know the parametric distribution of natural recoveries.

The statistic "5% of people naturally recover" literally assumes a binomial/Bernoulli distribution.

Also, where is the evidence for natural recoveries? Many arguments are built on this. To me, natural recovery stories come from Long Covid patients who had some post viral covid fatigue, not ME, and it went away naturally. That's not ME.
 
I already addressed this issue. If we assume the odds of picking a "natural recovery" patient from a population are low, then the odds of picking 6 natural recovery patients in Dara are impossibly low.
Is it 6? Looking at the 6 "responders" in Supplementary Table 1, these are the changes in step count over a year:
-155, 1307, 3205, 6132, 6634, 6727

And the non-responders:
-360, -227, 215, 710
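
For reference, a quick sketch that averages the changes listed above by group and over all 10 participants. The 3000-step cutoff used to count "large" gains is an arbitrary threshold picked for illustration, not something defined in the study.

```python
from statistics import mean

# Change in step count over a year, as listed above from Supplementary Table 1.
responders = [-155, 1307, 3205, 6132, 6634, 6727]
non_responders = [-360, -227, 215, 710]
everyone = responders + non_responders

responder_mean = mean(responders)
non_responder_mean = mean(non_responders)
overall_mean = mean(everyone)

# Hypothetical cutoff: count participants who gained more than 3000 steps.
LARGE_GAIN_CUTOFF = 3000
large_gainers = sum(1 for change in everyone if change > LARGE_GAIN_CUTOFF)

print(f"responders:     {responder_mean:.1f}")
print(f"non-responders: {non_responder_mean:.1f}")
print(f"all 10:         {overall_mean:.1f}")
print(f"gains above {LARGE_GAIN_CUTOFF}: {large_gainers} of {len(everyone)}")
```

The whole-group figure is the one that avoids any selection by responder status.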

Also, are the criteria for being classified as a responder and the time period for step count measurement the same when you talk about 1000 step count improvement in responders in Rituximab?
 
Also, are the criteria for being classified as a responder and the time period for step count measurement the same when you talk about 1000 step count improvement in responders in Rituximab?
Yes this is in the table I posted.

Well, by max SF36 difference it's 6, but by step count it's really 4 or 5. There is one patient (the green line) who relapsed.

Assume 5% naturally recover. By the binomial distribution, the probability of exactly 4 natural recoveries out of 10 is well under 1%: about 1 in 1,000.

If 1% naturally recover, the odds are about 1 in 500k. At those odds one should buy lottery tickets. I hope this helps debunk the natural recovery argument. You can work out the number yourself, @forestglip:

10C4 x 0.01^4 x 0.99^6
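
The expression above can be checked with a short script. This is only a sketch of the arithmetic: it assumes each of the 10 participants recovers independently with the same probability (the 5% and 1% rates are the ones assumed in this post), which is exactly the modelling assumption under dispute. The helper names are made up for illustration.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

N_PATIENTS = 10
N_RECOVERED = 4

for rate in (0.05, 0.01):  # assumed natural recovery rates
    exact = binom_pmf(N_RECOVERED, N_PATIENTS, rate)
    at_least = binom_tail(N_RECOVERED, N_PATIENTS, rate)
    print(f"rate {rate:.0%}: exactly 4 -> 1 in {1 / exact:,.0f}, "
          f"4 or more -> 1 in {1 / at_least:,.0f}")
```

The "4 or more" figure is slightly less extreme than the "exactly 4" figure, but of the same order.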
 
Yes this is in the table I posted.
In this post? https://www.s4me.info/threads/plasm...ilot-study-2025-fluge-et-al.44736/post-672520

That table is showing baseline steps. I see Table 2 in the study has step count at 17–21 mo, but on a skim, I don't see a step count value for 12 months like Dara.

For "response" criteria, in the Phase 3 Ritux paper, I see this:
The scale for symptom change was adapted from a Clinical Global Impression scale previously used in CFS (18). The relative scale for each symptom was 0 to 6, in which 3 denoted no change from baseline; 4, 5, and 6 slight, moderate, and major improvement, respectively; and 2, 1, and 0 slight, moderate, and major worsening, respectively. The primary variable, fatigue score (scale, 0 to 6), was calculated as the mean of the following 4 items, which correspond to the 4 fatigue-related symptoms: “fatigue,” “postexertional exhaustion,” “need for rest,” and “daily functioning.”
The median time to first clinical response (that is, fatigue score ≥4.5 for ≥8 consecutive weeks) was 41 weeks among 46 patients who fulfilled the response criteria for the fatigue measure, and time lags were comparable for responders in both treatment groups.

For Dara, I see this:
The characterization of patients with clinical improvement or no improvement during follow-up, was based on the clinical assessment at investigator visits in addition to patient-reported measures and Fitbit data.

It seems like it may have been easier to be labeled a responder in the Ritux trial, since it was just based on a questionnaire, while in the Dara trial, it was based on several factors, including step count. If the criteria made it easier to become a responder in the Ritux trial, then the average step count improvement for responders in the Ritux trial will be lower.
 
I don't see any evidence that natural recovery happens at all. Even if it did, and a small % of people recovered naturally, the odds of selecting 6 natural recovery patients in Dara would simply be too low for this to be true, as I previously said.
There is evidence that people with MECFS sometimes improve a little or a lot. It is not the norm but it is common enough, especially early on. This is probably why so many people think they recovered from brain retraining and stuff.

The odds of 6/10 patients with MECFS as we understand it naturally improving, 5 of whom improve to a remission-like state, during a drug trial, do seem vanishingly small. But it's just not enough data to be confident dara works.

If the higher NK cell severe patients they are treating currently respond in a similar manner I will start to get excited, but even then it won't be proof until we have the placebo controlled study results.
 
The odds of 6/10 patients with MECFS as we understand it naturally improving, 5 of whom improve to a remission-like state, during a drug trial, do seem vanishingly small. But it's just not enough data to be confident dara works.
With a binomial distribution and a 1% recovery rate, the odds (for 4 recoveries out of 10) are literally about 1 in 500,000. That is basic probability. At 5% it's about 1 in 1,000. Still impossibly low.
 
With a binomial distribution of 1% odds the odds are literally 1 out of 500,000. That is basic probability.
As Jonathan already said, you can never safely assume that a clinical trial is randomly sampling out of a natural distribution. Things like selection criteria can and do introduce a lot of bias that even the investigators are unaware of; the intramural study is proof of that too. That’s the whole point of doing a placebo control arm.
 
Can you give some examples of biased selection criteria then?

They put out an advert for a study in Bergen. People respond.

If picking 10 patients from a pool of ME patients is not randomly sampling out of a natural distribution, then what is it? You are picking out of people who want to apply for the study. I assume you would only want to apply for the study if you were desperate enough to want treatment.

So the distribution is people who are aware of these studies and want to get in. If you are aware, it means you have probably been following ME research for a while, and also you wish to improve so much that you want to take a shot at a clinical trial. It also means you have tried a lot of things, nothing has worked, and you are kind of desperate.

There were also no mild patients. This actually helps the argument.
 
I think phase 3 proves that survey data is meaningless and step counts are the true revealer.

Survey data is only reliable if it correlates with step count data. In Ritux P3 it did not; in Dara it did.

Again, self-reported outcomes vs observed outcomes.
I know where you're coming from, but I don't think the data we have to date support that position. It is an assumption.

In the 2015 phase II study, the people who were thought to have responded particularly well to rituximab had step counts that are the stuff of dreams for most of us:
After 15–20 months follow-up, we had available Sensewear electronic armbands that continuously measured physical activity in the home setting. No data from baseline before intervention were available. The analyses were not preplanned, and were performed only in some patients (mainly in responders). They were performed in order to gain experience with the armbands for design of the protocol for the now ongoing randomized phase III-study. However, 12 out of 14 major responders in this study measured physical activity for 4–6 consecutive days in the time interval 15–20 months follow-up, with a mean value for “mean number of steps per 24h” 9829 (range 5794–18177), and a mean value for “maximum number of steps per 24h” 14623 (range 9310–23407).

That step count sounds about right for people who are at about 80 on the SF36 PF scale (at 15-20 months when their step count was measured):

[Attached image 1770386718549.png: SF36 PF scores from the phase II rituximab study]
 
I know where you're coming from, but I don't think the data we have to date support that position. It is an assumption.

In the 2015 phase II study, the people who were thought to have responded particularly well to rituximab had step counts that are the stuff of dreams for most of us:


That step count sounds about right for people who are at about 80 on the SF36 PF scale:

[Attachment 30452]
I saw that, but because they did not measure baseline steps over a run-in period, for either responders or non-responders, it is hard to tell whether this was a measurement error. In the Dara study, the baseline steps were measured.

Also, they had people wear the armband for 4-6 days. In Dara I believe they wore the device for a year.

I am quite suspicious of this number.
 
I saw that, but because they did not measure baseline steps over a run-in period, for either responders or non-responders, it is hard to tell whether this was a measurement error. In the Dara study, the baseline steps were measured.

Also, they had people wear the armband for 4-6 days. In Dara I believe they wore the device for a year.
Mm, that's too big of an assumption for me. Am I understanding correctly that you're saying that the step count (for major responders in phase II ritux above) must be wrong, even though it tallies well with the SF36 physical function score, because you think step count only increases if someone's getting a real effect from a drug/intervention?

I think step count goes up with both real and placebo effects, though not to the same extent. Alas, I've never had the pleasure of experiencing either!

Without a placebo group, we just don't know.

I am eternally grateful to @Jonathan Edwards for reducing my expectations of the phase III rituximab trial before its publication. It was like getting advance warning that someone's going to dump you/fire you. Not pleasant, but helpful.

I wouldn't be surprised at all if Fluge & Mella are part of a research breakthrough in ME/CFS in the future. And their studies are a pleasure to read.
 
Mm, that's too big of an assumption for me. Am I understanding correctly that you're saying that the step count (for major responders in phase II ritux above) must be wrong, even though it tallies well with the SF36 physical function score, because you think step count only increases if someone's getting a real effect from a drug/intervention?

No, I am saying I am suspicious because:

1. There is no baseline run-in measurement for either group, whereas in Dara we know the run-in baselines over 90 days.
2. There is no data on the severity of these patients pre-treatment, whereas in Dara we know severity pre-treatment. By severity, I mean step count, not survey score.
3. The step counts were recorded over a period of 4-6 days, in contrast to a period of 9 months pre and post treatment in Dara.

If I saw a recorded pre-treatment value of, say, 3k over a 90-day run-in, and a gradual increase to 9k post Ritux, I would instantly be a non-believer in Dara.
 
Can you give some examples of biased selection criteria then?
Sure. For example, having stricter diagnostic criteria to make sure the people you’re studying actually have ME/CFS can end up selecting for people who are temporarily at the worst point of their illness. We know plenty of people can float in between severities.

Also inadvertently selecting for people who have the ability to travel for treatment. That can mean they have a caretaker with a lot of availability, allowing them to pace much more at baseline but have more leeway to test out higher levels of activity later on. And the fact that this is a risky immunosuppressant treatment—people who have tried a bunch of things with not even mild symptom relief may have meaningfully different biology, or people who don’t mind limiting their social contact to reduce risk of infection might also be different wrt baseline pacing. Duration of illness can be a factor as well, as it seemed to be in the intramural study.

That’s only a few potential issues, all of which can be very hard to predict the exact effect of in advance, which is why you set up a placebo arm with the same conditions so you don’t have to guess the cumulative effect.

Also you can’t assume that natural recovery from ME/CFS follows a normal or binomial distribution either. The oft-cited 5% figure is a very limited and context-specific estimate. The number of people who recover with different duration of illness, or who just partially recover, or who actually have a different underlying biology despite all being under the label of ME/CFS might be quite different.

I can definitely understand your frustration coming up against people who seem to ignore the common sense logic of these results. I think what many people are trying to explain is that phase 2 results are quite notorious for being deceptive, precisely because of all the things that common sense doesn’t account for. We all would love for Dara to work and see these results as encouraging, but know from experience to hold back from drawing conclusions.
 
Phase 2 is generally the hardest phase to get through. Drugs with a lot of funding behind them and preliminary results that look like a slam dunk frequently don’t make the cut. From AI:

The overall likelihood of approval (LOA) for a drug entering Phase I clinical trials is low, generally ranging from 6.7% to 13.8%. Success rates vary by phase: Phase I (approx. 47–63% success), Phase II (approx. 28–31% success), and Phase III (approx. 55–58% success), with Phase II representing the highest hurdle (lowest success rate).
Clinical Phase Success Rates & Transition Probabilities
Key Factors Influencing Success
  • Overall Probability of Success (PoS): Data from 2014-2023 shows the average likelihood of approval for a new Phase I drug is 6.7%, a decrease from previous, higher estimates.
  • Disease Area: Oncology drugs often have lower success rates (3.4% overall) compared to other areas, while vaccines can have higher success rates (33.4%).
  • Trial Design: The use of biomarkers to terminate ineffective programs early in Phase II has contributed to lower overall success rates but higher efficiency in weeding out failures.
Summary of Transition Probability (Example Data)
  • Phase I to II: 47%
  • Phase II to III: 28%
  • Phase III to Approval: 55%
Once a drug passes all three phases, the final regulatory review is usually successful (about 92%).
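
As a rough consistency check on the quoted figures, multiplying the example transition probabilities together should land near the overall likelihood of approval. This is a minimal sketch that assumes the stages are independent and reads the 55% figure as the Phase III success rate before the 92% regulatory review; both are assumptions, and the variable names are just for illustration.

```python
# Example transition probabilities quoted in the AI summary above.
phase_1_to_2 = 0.47
phase_2_to_3 = 0.28
phase_3_success = 0.55
regulatory_review = 0.92

# Chance that a drug entering Phase I is eventually approved.
overall_from_phase_1 = (
    phase_1_to_2 * phase_2_to_3 * phase_3_success * regulatory_review
)

# Chance that a drug entering Phase II is eventually approved.
overall_from_phase_2 = phase_2_to_3 * phase_3_success * regulatory_review

print(f"approval from Phase I entry:  {overall_from_phase_1:.1%}")
print(f"approval from Phase II entry: {overall_from_phase_2:.1%}")
```

The Phase I figure comes out close to the 6.7% quoted above, so the numbers are at least internally consistent.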
 