Patient led measure of outcomes

Trish · Jul 26, 2025

Sasha said:
Why do you think it would only be suitable as a short-term approach?

I think the approach of each patient picking 3 activities they currently can't do and want to be able to do may be OK for a trial of a treatment. With the follow-up running for, say, a year, it might give a clear indication of whether the treatment has had a clinically significant effect. Also, for pwME, the effects of activities are cumulative, and PEM is delayed, so that needs to be factored in.

I think for something like a long-term study of pwME's fluctuations over time that would not be useful. It doesn't take into account worsening and, depending on the activities picked, may turn out to be too limited, with further improvements or worsening not registering. Something like FUNCAP, which covers a wider range of activities across all severity levels, would be better for that.

bobbler · Jul 26, 2025

I'm of the opinion that certain test things like eg a shower (and that would need to include as well as frequency: time, an activity monitor but also issues that might change like if people get a seat etc) are going to be harder ones for the BPS to cheat. Because it's the one thing I could never cheat other than picking the only window over x time period my body wouldn't faint and loading up on meds, caffeine or whatever I used at that level of severity to help adrenaline-up if needed for it. My hair length makes a difference now because I have to brush it before, sometimes just before, sometimes I pace that the day before. But hey that hasn't happened by accident either.

I also know that as I've got worse over the years I've been in denial particularly whilst I was working, putting on my best performance for work as the priority. But that if someone looked closely then first the grocery shops went online, then work from home, shower at different times and frequency. But also big things like having to be driven places, get stairlift, how much I can even use my downstairs, ability to sit in a chair or different types of chair.

Well... it all seems pretty significant stuff when you look at it over those longer spaces of time, and me not really being surrounded by anyone being any much different or kinder or more accepting or being a different personality myself.

So sometimes when I look at that I wonder whether I'm 'overthinking it' with the forensic nature of what my calendar has been like/how punishing it has been in the preceding weeks and months being so pertinent. But it's true. I'm having an awful week (which of course those near me still hint maybe I caught a bug even decades in to avoid 'getting it about ME/CFS timescales') because of a load that finished 3 weeks ago. And I'm worse than last week, even though I've had a week more 'rest' vs that hectic time before. Even if you were using an app I'm not sure it would make sense. Or have picked up the differences from this week to last in how awful I feel and debilitated I am.

So it has reminded me that even the 'home experiment' format, if you could trust people to put aside eg a week to not have other things and test if they could get a shower or toothbrushing in and compare it to the year before, will be affected by that. And I don't know how much of that will come out in the wash over matched pairs that had a comparatively less threshold vs commitment week a year ago vs today, or if for most of us it will tend to just go that one way of the tightening ratchet.

I know that things like the teaching survey format wouldn't work (they give each person a date in the calendar and they have to fill in exactly what % of time they spent on tasks related to teaching, research, admin - with the idea across all the people the dates will 'even out' because some will be term time, exams, some out of term etc) ie it has to be the individual compared to themself, not relying on size of sample to even out over a population. For multitude reasons (including some things might work for certain types).

I'm keeping on thinking because as a saddo this does fascinate me professionally as conundrums to how to get a research method to tackle something thanks to my background, as well as personally, hence the questions I ask myself. And then checking I'm not doing overkill/putting in excess that doesn't matter. ie remembering what are we looking at 'differentiating on' and what does or doesn't matter too. Those different angles. And not getting caught in the trap of just using what's available as a measure vs picking something there isn't even a good proxy for.

I do think that somehow having better 'frames of reference' to those measuring points in time is the clever bit and needs something far better than a survey question, certainly of the standard format (it would need to be more experiment type). I wonder whether seeing a video of myself a year ago would help for example.

It depends who we are needing to prove it to as well I guess. I think we can sort of get there but it might need to be specific things for specific things.

But yes if you were monitoring a whole population over very long periods of time then things like the shower and 5 items. And FUNCAP even in the basic form: I'm surprised by how at first you think you are teetering over whether the 'can't do at all' or '3 days impact' vs 'can't do on same day' makes sense, but then as the very different grades of items come through you realise how much less that matters in the big picture anyway, because I'm so disabled that half the questions later on are just dreamland stuff, so I can see how it separates pretty well even with that. And how much of a change on how many of the features are we talking about. Given it should be long-term and it should be 'enough' I guess on at least something.

bobbler · Jul 26, 2025

Utsikt said:
Wouldn’t free text answers also be impacted by mood, etc?

I'd warn against free text type stuff. The iller I am, the worse my ability to find words (anomia) will be, or I'll just want/need to lie down and not be able to do it at all. I might be more likely to underdescribe or say the equivalent of the in-person 'I'm fine + small talk' when I'm so ill I'm having to dictate what would be 'top of head' stuff, rather than when I'm less ill but can at least access my words and the 'meta' how-do-I-feel part (which won't happen until I get a better moment when I'm most ill).

There is also the ambiguity of different people meaning very different things by the same words another person uses, if we ever needed to compare one person with another. And misinterpretation.

Sasha · Jul 26, 2025

Thanks again to @Utsikt for giving me the link to Jo's paper. I've copied the relevant bits out below, reformatted for ease of reading.

***

On entry, a set of criteria was laid down for each patient on the basis of their clinical state at entry, indicating what would be considered 'ideal' improvement, 'useful' improvement, no change, and deterioration.

Ideal improvement was intended to indicate the best possible outcome which might be expected in the face of any irreversible problems such as joint deformity or chronic uraemia, sustained for at least three months.
Useful improvement was intended to indicate an improvement short of ideal which justified the cost, inconvenience, and potential hazard of high dose steroid infusion, and which was sustained for at least three months.
A static state was intended to indicate the absence of either useful improvement or significant deterioration, assessed at three months, or earlier if withdrawn for alternative dosage treatment.
Significant deterioration was intended to imply that clinical problems had worsened or that new problems had developed which were of greater importance than any coexisting improvement. The appearance of renal disease in the face of improved arthralgia would be considered deterioration, but the development of arthralgia in the face of significant improvement in renal function would not.

Assessment was made at three months, or earlier if withdrawn for alternative dosage treatment. The criteria for outcome were different for each patient, based on the problems of relevance to that individual.

Criteria were often quite complex, being derived from a range of baseline clinical and laboratory data. An example of a set of criteria is given in Table 2.

Table 2 Criteria for patient 10 (first arm)

Clinical features on entry: pyrexia, leucopenia, anaemia, pleural effusion, pleuritic pain, and proteinuria.

Deterioration = Death, or cerebral disease, or proteinuria more than 8 g/day, or increase in pleural effusion on radiography, or neutrophil count below 10^9/l.
Static = Neither 'deterioration' nor 'useful improvement'.
Useful improvement = At least two of the following at three months: no pyrexia, proteinuria less than 1 g/day, 80% resolution of pleural effusion on radiography, neutrophil count above 2.5 × 10^9/l on two separate occasions.
Ideal = No features of SLE, and specifically, all criteria for useful improvement fulfilled and haemoglobin greater than 110 g/l at three months without transfusion.

[...]

In trying to answer a question of outcome, the power of the statistical analysis is reduced in proportion to the number of analyses made which relate to the question. When dealing with very small numbers the only option is to use one outcome measure.

A point scoring system using a range of clinical data which provided a sensitive reflection of each patient's problems would have to be based on an unmanageable set of rules involving many inter-related contingencies, involving time relationships and subtle grades of severity. As an alternative we used a system of individualised criteria.

This is equally valid statistically and can be much more closely tailored to events of importance to each patient. It suffers from the disadvantage that one physician's assessment of important outcome events may differ from another's.

It became clear during the trial, however, that the two or three physicians drawing up each set of criteria agreed very closely on what constituted ideal improvement and useful improvement as originally defined.

The study shows that it is feasible to conduct double blind trials using individualised outcome criteria. Even with the use of individualised outcome criteria the power of the statistical analysis is weak because of small numbers.

The study may not have detected a modest difference in effect between the two dosages, demonstrating the almost insuperable problems of studying uncommon heterogeneous disease. Nevertheless, we consider that individualisation of outcome criteria goes part of the way to solving this problem and can be a very valuable technique.

bobbler · Jul 26, 2025

Hutan said:
Five seems like quite a lot - would three be good enough?

I think it should just be 'did you do them?' That cuts out the interpretation. I think any study still needs to be long enough to pick up deterioration due to overexertion. So, that covers part of the negative knock ons. I think the measure needs to be completed daily - that way there is an aspect of frequency, so that also covers part of the negative knock ons.

I like the idea of picking activities from Funcap.

There aren't many measures I can think of I'd complete daily if I was in a crash. Today on the app, lying flat on my back (admittedly looking at my phone for much of it, but TV mostly off etc), I've been in 'activity' all day. Until today I've found it interesting where the switch-point of the angle I'm sitting up at makes it rest vs activity; laying on my back put me in rest mostly. So clearly today something is different. And this isn't the worst day I've had this week. So I like the idea of measuring over a month, and something you mightn't do every day.

There are ones like toothbrushing where that is my aim, so if it doesn't happen it says something. Particularly over a month, if I could pass a month even if I had silly amounts of commitment vs threshold. Because the odd difference will iron out.

and I have one app recording where I wasn't sat down the whole time and the intensity is so hugely different vs the rest of them. plus even with sitting down it's enough of an exertion it always registers as a time peak - trouble is that when I'm least unwell I'll be more efficient, when I'm most unwell I'll be least efficient and give up sooner than when I'm middle unwell. So it doesn't capture everything.

vs the ones that are daily like drinking or toilet or moving in bed which are impacted by other symptoms as well as indicators of how well I am. eg I need to wee a lot at certain stages of PEM and drink more if I can. and the staggering and quality and method of doing these says more on the debility etc.

hotblack said:
Lots of really useful and insightful feedback. Thank you everyone.

So maybe something like…

- Pick 3 activity descriptors that you feel best describe your current limitations and level of activity, you can choose from FUNCAP55 or write your own.

Try to pick a range which represents you best, with one you can usually do without much difficulty, one less often, and one you’re rarely able to do

- Each day (or week) record how many times you do each activity (without significant difficulty?)

- Each day (or week) record if you consider it a good, average or bad day/week for you

In this way you will get both weekly and monthly totals of ‘activities’ and ‘good/average/bad’ periods

You can use a spreadsheet, a piece of paper, or whatever method works for you (I/we could provide some ideas and templates to use, copy or print)

Sasha · Jul 26, 2025

I'm not concentrating very well at the moment, so I hope I'm not just repeating someone else's idea that I've read and then forgotten. But, having just read Jo’s paper, I think that the approach it takes gets around a lot of our issues about only occasionally being able to do particular things, and struggling to imagine what would happen if we did certain things (a major problem with FUNCAP, IMO).

In his paper, the clinicians described each patient at entry, and for each of them, said what they would consider to be deterioration, what would constitute useful improvement, and what would be the best possible outcome, allowing for the permanent damage that that patient had from their illness.

As JemPD says, if you don’t have a reasonably stable baseline, this may not be a feasible approach. Maybe any clinical trials would need to be done only on stable patients.

But for people with reasonable stability, we could first define what our basic day looks like. It’s likely to be our comfortable maximum activity, because when we are so limited, I think we generally live up to whatever energy we’ve got.

So that baseline description might be something like, ‘Have to lie in bed for 10 hours a day, can just about prepare food, have a basic flannel wash, be on the computer for a couple of hours, can’t talk, difficult to walk from one room to another, zero house work.’

Then deterioration and useful improvement would need to be defined as a change from that, but I’m not sure what the best approach is to being specific about it. Someone living like that would consider losing any of those abilities to be deterioration, but how much loss would look meaningful in a trial?

Similarly, a useful improvement to a PwME existing at such a low level of function could be something as simple as, ‘Can talk for 10 minutes’ or ‘Can unload the dishwasher’. But if you were running a trial and wanted to demonstrate that your side-effect-laden drug was worth taking, wouldn’t you want something bigger? How would you determine that?

Any thoughts, @Jonathan Edwards? How did you determine this in your trial, for the patients? Or were there already well-established clinical criteria that you could use?

bobbler · Jul 26, 2025

Sasha said:
Maybe any clinical trials would need to be done only on stable patients.

But for people with reasonable stability, we could first define what our basic day looks like. It’s likely to be our comfortable maximum activity, because when we are so limited, I think we generally live up to whatever energy we’ve got.

So that baseline description might be something like, ‘Have to lie in bed for 10 hours a day, can just about prepare food, have a basic flannel wash, be on the computer for a couple of hours, can’t talk, difficult to walk from one room to another, zero house work.’

Then deterioration and useful improvement would need to be defined as a change from that, but I’m not sure what the best approach is to being specific about it. Someone living like that would consider losing any of those abilities to be deterioration, but how much loss would look meaningful in a trial?

Similarly, a useful improvement to a PwME existing at such a low level of function could be something as simple as, ‘Can talk for 10 minutes’ or ‘Can unload the dishwasher’. But if you were running a trial and wanted to demonstrate that your side-effect-laden drug was worth taking, wouldn’t you want something bigger? How would you determine that?

Any thoughts, @Jonathan Edwards? How did you determine this in your trial, for the patients? Or were there already well-established clinical criteria that you could use?

I don't think we can have something that relies on that assumption to that level - I obviously don't mind if there is a much smaller caveat of 'those in the early days or in a particularly unusual situation'.

I think the issue is that no one can predict for sure that they will be stable. We are all a noisy neighbour/building work/virus/injury/new or no carer/boss change/family emergency/powercut/broken down car away from significant change if our threshold then goes beneath what is constantly achievable/needed over time.

And it's only over the space of decades that I know what someone might think is stable actually isn't because of deterioration or slight differences in hindsight.

But further than that anyone who says or thinks they are to get into a trial isn't necessarily actually going to be more stable than the one who doesn't get in because they take that assessment too seriously and have the more cautious/conservative knowledge.

I also have a huge problem with those who think they can control their PEM dominating the research because it will exclude the more severe and not really be testing out any treatment to the full extent. Only on those who have the type and situation which is pretty rare to be able to stay out of PEM.

Which will likely also come with all sorts of related sociodemographic backfires on representativeness and so on which in today's healthcare climate will bite us in the bum.

But also not be testing whether it actually works by reducing the impact of activities that would induce PEM. If you don't have the people who are regularly in PEM to different levels.

poetinsf · Jul 26, 2025

I just measure the time I spend lying down. All my functionings, including brain, are inversely proportional to the amount of time I spend lying down.

One advantage of TSLD (time spent lying down) is that you can compare it across patients. Questionnaires and VAS are subjective and therefore more difficult to compare. It also lets you compare longitudinally. I used to spend 7-8 hours lying down. Now my range is 1-4 with occasional 0 or 5. Everything is so much more difficult on 4-hour days.

Trish · Jul 27, 2025

A big difference between ME/CFS and the example given for SLE is we don't have any lab tests to include in the outcome criteria. So we're left with the fairly objective measuring via wearables of steps, time upright and heart rate, HRV etc, whether we are able to do specified activities, and symptom presence and severity.

It might be interesting to each try designing our own descriptors of deterioration, no change, significant improvement and recovery.

hotblack · Jul 27, 2025

JemPD said:
Sorry don’t know the answer to the problem but just throwing it into the pot

This is really important Jem and what I am trying to address here. But I understand I may not have got things right yet.

Is there a way I can distill or explain what I’m aiming for, so I can get your feedback on what may or may not work better? Please feel free to reply here or send a private message. Or indeed ignore me

But rest assured I have heard what you say and really get (and live) what you describe.

Sasha · Jul 27, 2025

I'd like some more input on what would be meaningful deterioration and improvement, and it might vary from trial to trial. Trials have what they define as a 'clinically significant difference', which is (Google AI) 'a noticeable and meaningful change in a patient's condition that is considered important enough to potentially alter treatment or management decisions... and focuses on whether the observed change has a practical, real-world impact on the patient's well-being.'

In most conditions that have treatments, it may be that patients are starting from a much higher baseline. If you're already functioning at 70%, maybe it takes a bigger change to feel that you've meaningfully improved or deteriorated, but if you're at 15%, then 2% more or less could be life-changing. 2% less and you could be off to the care home. 2% more and you'd be able to have a short conversation every day with loved ones.
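As a rough illustration of why the same change weighs differently at different baselines, here is a minimal sketch. It assumes the '2%' means two percentage points on a 0-100 function scale, which is one reading of the post rather than something it states, and `relative_change` is a name made up for the example.

```python
def relative_change(baseline_pct: float, change_points: float) -> float:
    """Change expressed as a fraction of the person's current function."""
    return change_points / baseline_pct


# The two baselines mentioned in the post: 70% and 15% function.
for baseline in (70, 15):
    share = relative_change(baseline, 2)
    print(f"baseline {baseline}%: 2 points is {share:.1%} of current function")
```

On this reading, two points is a much larger share of what someone at 15% has to work with than of what someone at 70% has, which is the asymmetry the post describes.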

So if you were reading about a drug trial of PwME, and you read that 100% had 'meaningful improvement', would it really be that informative? What if the drug had potential nasty long-term side effects (such as cancer, or death)? You'd also want some more accurate measure of how big it was.

In @Jonathan Edwards's lupus trial, that data would have been available, because lupus has lab measures, but the reason for the worse/same/better/recovered approach was to provide maximum statistical power on an unavoidably small group by having a single measure.

So maybe this approach would only be suitable in similar ME/CFS trials? Or maybe as a primary measure? But my worry is that it could produce quite noisy data if a 'meaningful' difference to a PwME could also be a small one, because small changes might happen quite often in a fluctuating condition such as ours, especially over long time periods.

hotblack · Jul 27, 2025

Trish said:
I think the approach of each patient picking 3 activities they currently can't do and want to be able to do may be OK for a trial of a treatment,

That wasn’t quite what I was proposing, but rather

Pick 3 activity descriptors that you feel best describe your current limitations and level of activity, you can choose from FUNCAP55 or write your own.

Try to pick a range which represents you best, with one you can usually do without much difficulty, one less often, and one you’re rarely able to do

Each day (or week) record how many times you do each activity without significant difficulty

So for me I may record
- Getting to the toilet in the morning (something I struggle with when bad but manage ok when better)
- Sitting by the window on a chair for a short time (something I like to do but don’t often because of the impact or need to do other things)
- Having a short phone call or face to face visit with a friend (something I rarely do and whenever I have there is a significant impact)

This would capture ups and downs quite well. If I’m having difficulty with the first then that shows I’m in a bad way. How often I do the second is probably a good indicator of how well I am within my normal to good range. And if I’m doing the third then that indicates a significant improvement.

Any dips or PEM directly influence what I do and therefore the counts. What ‘difficulty’ means is subjective. That’s fine, these are subjective.

Granted it takes an understanding of yourself. But that’s the point, it would capture my unique experience and limitations and how I balance things in a way I don’t see other methods doing.

Add on a simple good/bad/average day count and I think there’s a lot of useful information.
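For anyone who would rather tally this with a script than a spreadsheet, a minimal sketch of the weekly totals might look like the following. The activity names follow the three examples above, but the log entries are invented, and `LOG` and `weekly_totals` are hypothetical names, not part of any existing tool.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical daily log: (date, times each activity was done, day rating).
LOG = [
    (date(2025, 7, 21), {"toilet_morning": 1, "window_chair": 1, "call_or_visit": 0}, "average"),
    (date(2025, 7, 22), {"toilet_morning": 1, "window_chair": 0, "call_or_visit": 0}, "bad"),
    (date(2025, 7, 23), {"toilet_morning": 0, "window_chair": 0, "call_or_visit": 0}, "bad"),
    (date(2025, 7, 28), {"toilet_morning": 1, "window_chair": 2, "call_or_visit": 1}, "good"),
]


def weekly_totals(log):
    """Sum activity counts and good/average/bad day counts per ISO week."""
    activities = defaultdict(Counter)
    ratings = defaultdict(Counter)
    for day, counts, rating in log:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        activities[week].update(counts)  # adds the day's counts to the week
        ratings[week][rating] += 1
    return dict(activities), dict(ratings)


activities, ratings = weekly_totals(LOG)
for week in sorted(activities):
    print(week, dict(activities[week]), dict(ratings[week]))
```

Monthly totals would work the same way with `(day.year, day.month)` as the key. Nothing here interprets 'difficulty'; it only adds up what the person chose to record.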

hotblack · Jul 27, 2025

Trish said:
Something like FUNCAP which covers a wider range of activities across all severity level would be better for that.

I see what you mean Trish. FUNCAP is probably better for longer term tracking and any significant changes. I like its structure, it’s just a lot to be doing even monthly and certainly more regularly. I can see it being a once or twice a year job or something for some significant longitudinal studies.

I’m partially basing my ideas on things I’ve been involved in in the past. DecodeME was great and detailed but a huge undertaking. The recent HHV6 study Jackie and co are doing had relatively short (half a dozen or so points) weekly symptom tracking, which was work but a routine once I got used to it.

Like many I’ve also tried tracking things myself. Subjective descriptions, giving scores or counts of activities and how I feel, steps, HRV, etc etc. The draw of objective measures is great but I think a mirage until we have something objective to measure.

Trish · Jul 27, 2025

Trish said:
It might be interesting to each try designing our own descriptors of deterioration, no change, significant improvement and recovery.

SLE paper said:
On entry, a set of criteria was laid down for each patient on the basis of their clinical state at entry, indicating what would be considered 'ideal' improvement, 'useful' improvement, no change, and deterioration.

Ideal improvement was intended to indicate the best possible outcome which might be expected in the face of any irreversible problems such as joint deformity or chronic uraemia, sustained for at least three months.

Useful improvement was intended to indicate an improvement short of ideal which justified the cost, inconvenience, and potential hazard of high dose steroid infusion, and which was sustained for at least three months.

A static state was intended to indicate the absence of either useful improvement or significant deterioration, assessed at three months, or earlier if withdrawn for alternative dosage treatment.

Significant deterioration was intended to imply that clinical problems had worsened or that new problems had developed which were of greater importance than any coexisting improvement. The appearance of renal disease in the face of improved arthralgia would be considered deterioration, but the development of arthralgia in the face of significant improvement in renal function would not.

Assessment was made at three months, or earlier if withdrawn for alternative dosage treatment. The criteria for outcome were different for each patient, based on the problems of relevance to that individual.

Criteria were often quite complex, being derived from a range of baseline clinical and laboratory data. An example of a set of criteria is given in Table 2.

Table 2 Criteria for patient 10 (first arm)

Clinical features on entry: pyrexia, leucopenia, anaemia, pleural effusion, pleuritic pain, and proteinuria.

Deterioration = Death, or cerebral disease, or proteinuria more than 8 g/day, or increase in pleural effusion on radiography, or neutrophil count below 10^9/l.

Static = Neither 'deterioration' nor 'useful improvement'.

Useful improvement = At least two of the following at three months: no pyrexia, proteinuria less than 1 g/day, 80% resolution of pleural effusion on radiography, neutrophil count above 2.5 × 10^9/l on two separate occasions.

Ideal = No features of SLE, and specifically, all criteria for useful improvement fulfilled and haemoglobin greater than 110 g/l at three months without transfusion.

[...]

My version for me in my current state:
On entry on a better day: Upright (sitting, standing, walking) time no more than 5 minutes at a time and total 30 minutes per day, steps measured by fitbit as non dominant forearm movement 400 to 1000 per day, no more than 20 steps at a time, mild nausea, OI, muscle pain, rapid muscle fatiguability. Active concentration, eg forum, reading, max 1 hour at a time, max 4 hours per day. Mild sensory sensitivity.
Able to shower about once a week.
On a bad day (PEM): bedbound, nausea, headache, loss of appetite, unable to concentrate, worse pain, OI, severe sensory sensitivity.

Deterioration: Any one or more of: Completely bedbound, unable to shower without assistance or at all, upright less than 10 minutes per day and less than 2 minutes at a time, concentration less than 1 hour per day in short bursts. Daily steps less than 200 per day, PEM symptoms most or all the time.
Static: neither deteriorated nor improved
Useful improvement: No episodes of PEM and at least 2 of the following 3: Steps per day consistently more than 3000, upright time more than 4 hours per day including at least 1 hour at a time. Significantly reduced muscle pain and fatigability.
Ideal: No episodes of PEM and: Steps averaging 5000 per day, upright time (sitting, standing, walking) 8 hours per day, without needing to lie down, no symptoms of ME/CFS.
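To show how criteria like these could be applied mechanically at follow-up, here is a minimal sketch. It is a simplified reading of the descriptors above, not a validated scoring rule: the field names, the `FollowUp` and `classify` names, the choice to check deterioration first, and the conversion of hours to minutes are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical follow-up record for one person."""
    bedbound: bool
    can_shower_unaided: bool
    upright_min_per_day: float
    longest_upright_min: float
    concentration_hours_per_day: float
    steps_per_day: float
    pem_episodes: int
    pem_most_of_time: bool
    pain_fatigability_much_reduced: bool
    symptom_free: bool


def classify(f: FollowUp) -> str:
    # Deterioration: any one of the listed problems is enough.
    if any([
        f.bedbound,
        not f.can_shower_unaided,
        f.upright_min_per_day < 10 and f.longest_upright_min < 2,
        f.concentration_hours_per_day < 1,
        f.steps_per_day < 200,
        f.pem_most_of_time,
    ]):
        return "deterioration"
    no_pem = f.pem_episodes == 0
    # Ideal: no PEM, 5000 steps, 8 hours (480 min) upright, no symptoms.
    if no_pem and f.steps_per_day >= 5000 and f.upright_min_per_day >= 480 and f.symptom_free:
        return "ideal"
    # Useful improvement: no PEM and at least 2 of the 3 listed gains.
    gains = sum([
        f.steps_per_day > 3000,
        f.upright_min_per_day > 240 and f.longest_upright_min >= 60,
        f.pain_fatigability_much_reduced,
    ])
    if no_pem and gains >= 2:
        return "useful improvement"
    return "static"


# Invented example roughly matching the 'better day' entry state.
entry_like = FollowUp(False, True, 30, 5, 4, 700, 3, False, False, False)
print(classify(entry_like))
```

Writing it out this way also exposes the judgment calls the prose leaves open, such as what counts as 'consistently' or 'significantly reduced', which would still have to be agreed before anything like this was used in a trial.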

Sasha · Jul 27, 2025

hotblack said:
Like many I’ve also tried tracking things myself. Subjective descriptions, giving scores or counts of activities and how I feel, steps, HRV, etc etc. The draw of objective measures is great but I think a mirage until we have something objective to measure.

Same here - I've tried to track time lying flat and time with my feet in contact with the ground (i.e. sitting properly, standing or walking), but it's very burdensome and I've given up. I now track something that I used to be able to do daily but now can't do so often, but would be the first thing I'd start again daily if I improved (whether I go downstairs). That kind of thing makes a good indicator of improvement but is hardly major by a healthy person's standards (or a trial's, probably).

I like the idea of wearables measuring key activities such as steps and time spent upright (calves horizontal and torso horizontal would be useful). Straightforward and objective, and would indicate the size of any improvement very clearly. But I don't have a smartphone and so have never used any of these things (except for step counters outside my home). They wouldn't be much use for the very severe (floor effect) but for those who can be upright sometimes and mobilise a bit, are they very burdensome? Is there any reason not to use these over FUNCAP or some simplified/easier version of it?

Or is it horses for courses and we're interested in tracking outside of trials, on zero budget?

bobbler · Jul 27, 2025

Trish said:
A big difference between ME/CFS and the example given for SLE is we don't have any lab tests to include in the outcome criteria. So we're left with the fairly objective measuring via wearables of steps, time upright and heart rate, HRV etc, whether we are able to do specified activities, and symptom presence and severity.

It might be interesting to each try designing our own descriptors of deterioration, no change, significant improvement and recovery.

It’s tricky because I thought that this was the bit that is hard to unbundle and so the idea of finding a treatment or two that works is, in our case, going to be one of the key clues in helping us find out what is going on.

So by combining the two - the issue of which of these hr or hrv or being upright or steps show what, and the issue of ‘X treatment works’ - are we risking dragging one angle down with the problems of the other (given where we are right now with knowing what the combo of these measures would look like if something good vs bad happened)?

By this I mean I feel like there has been talk about history of eg other diseases where discovery of X pill helping have a clue of validity from which it being treated and investigated went on from

And whilst I 100% agree we need objective measures to ‘stop the con’ of just behaviourally coercing us to tick a box short-term and actually making us iller long-term having been the rinse-repeat cycle we have all been harmed by.

I think the worry is that we need to focus on that ‘long term’ being the shift in method. And moving away from the short-term.

If something cures or does a lot for us then it might be obvious quickly to us, but really make an impact once we are a few months in, when having had a few months of that x fixed, whatever was damaged has started to properly ‘recover’. Ie it’s after that healing process that we get our feet under us, for our body to operate that bit differently.

How do we predict, on measures we’ve developed to try and show our disability and how it works (when eg I don’t know atm when it hits my HR or my HRV, after doing what, or whether it’s cumulative or one-off), what ‘better’ will look like using those same measures?

I think it would be in the simplest stuff that laypersons can understand, like being able to shower three days in a row. When I started steroids I could talk by phone for longer than the short time I’d been able to previously, without the regular collapsing. And yes that probably shifted what else I did in my day, as it didn’t cure my ME/CFS, but talking was important to me.

So that reminds me of what @Kitty has reminded us of in the past: that for other illnesses, where they have access to good clinicians, those clinical assessments pick up those things. But the issue we have, I know, is we can’t trust those external assessors due to their ideology warping their vision and thinking.

I just think it’s sad this bigotry has seemingly put onto our shoulders the idea we have to find a way and a formula to prove things that shouldn’t be up for distortion. And to prove them in terms like this.

I get the 2-day CPET idea. I also get the issue we have that someone can influence whether we feel better by gerrymandering the exertion involved in the run-up to the second assessment of something.

I get it, because I’m always feeling I have to explain (not that anyone would actually listen to it - it’s rhetorical) that ‘it’s cumulative, not just I have 2 days’ PEM from that one activity then a break’.

I just can’t predict what my app would look like in six months’ or a year’s time if I felt 40% better vs what it looks like now.

I do know it all really matters with us. Because much as we like to politely say it’s ‘because we don’t have a biomarker’ etc that we are treated the way we are, I think sadly it’s just human nature and about vulnerability and ‘picking on/targeting/what some do to the most vulnerable’. And even those who aren’t doing it directly rely on keeping us here, because changing relationships that work on this inaccurate power imbalance isn’t something they fancy (it’s why all those friends chose not to treat us like a human once the CPET or NICE guideline came out, once they realised it was ‘their choice’ and not a ‘social norm’ they’d have to follow to treat us better).

So yes, I think for us specifically, just because someone has a cure one day won’t mean that what we are surrounded by will change, or that those people won’t do all they can to stop us accessing it. They aren’t logical about getting us out of hospital faster, and they won’t heed the logic of ‘their lives will be better too if we aren’t debilitated’ either. Because we aren’t heeding the secondary benefits of their ‘being better than us’ hierarchy shift, which stays as long as we are kept disabled or begging for access to what means we aren’t disabled. Too many see life as a zero-sum game.

Blood tests in lupus work because it’s a simple ‘take y, x goes down’.

Fixing a heart valve is believed and funded because of mortality stats and maybe blood pressure. They don’t get into ‘can people do x more in their week’ analysis.

I think it’s worth us also thinking one thing might help a lot but not cure - like lots of the treatments in these other diseases. And we have to be prepared for selling that they are still worth having even if they don’t get everyone ‘back to work’ but eg stop progression or make important quality of life differences.

But they might come with side effects that for many are worth weathering but not all and have impacts on these measures over time. Eg someone taking steroid for IBD is having to trade quality of life vs longer term steroid effects.

I don’t think we will have the oversight of any general public laypersons if we start using things too complex (they hate stats when we talk about that to debunk PACE) and others manipulate the order of things on us. But they’d understand eg ‘now not bedbound’, and can see for themselves eg in videos of Parkinson’s tremor disappearing. I guess that could be termed the ‘who cares test’ on picking measures and whether anyone would care to notice if it got misused.

In essence we struggle with that playing off of the ‘feel better’ vs the ‘functionality’ and the short-medium term vs the medium-long term outcome/harms. It isn’t unreasonable that this means we have to reiterate (as NICE did in their analysis) that those ‘long term measures’ have to be required and have to be a deal breaker if harm comes longer term, ie supersede any short term claims.

I’m also conscious I couldn’t have predicted specifically how B12 would have made me feel better. And access to that shouldn’t depend on that foresight, just on it being a genuinely big significant difference.

Which is different to claiming external validity to a population any bigger or different than those who are the same as me - another big trick/problem used in ME/CFS land. But of course ‘lowest common denominator’-ing that out from being applicable to those who it did make a significant difference to would suffer from that exact same error too.

bobbler · Jul 27, 2025

Sasha said:
Same here - I've tried to track time lying flat and time with my feet in contact with the ground (i.e. sitting properly, standing or walking), but it's very burdensome and I've given up. I now track something that I used to be able to do daily but now can't do so often, but would be the first thing I'd start again daily if I improved (whether I go downstairs). That kind of thing makes a good indicator of improvement but is hardly major by a healthy person's standards (or a trial's, probably).

I like the idea of wearables measuring key activities such as steps and time spent upright (calves horizontal and torso horizontal would be useful). Straightforward and objective, and would indicate the size of any improvement very clearly. But I don't have a smartphone and so have never used any of these things (except for step counters outside my home). They wouldn't be much use for the very severe (floor effect) but for those who can be upright sometimes and mobilise a bit, are they very burdensome? Is there any reason not to use these over FUNCAP or some simplified/easier version of it?

Or is it horses for courses and we're interested in tracking outside of trials, on zero budget?

This. We don’t sound too far apart in level of certain things. I’ve tried an app. I have a stair lift. Being downstairs is a huge indicator over a longer time period of being more well (short time it might be lack of choice due to having to do something).

The app doesn’t capture any difference between the activity of being downstairs vs not - if I made a tea or talked on the phone it’s the same to it. But they’d be indicators if done regularly of very different things - because getting back up to bed means I’d have to be that more well as a starter.

I think whilst some orrible (I’m always astounded how many/how high a % of people it is when it comes to that moment of truth) people will pretend they can’t see that difference, when it’s us, there are some who will ‘get it’ on that

but no app would. And no measure would. And no test would as even if I wanted to collapse on stair lift or be stuck downstairs without a loo too ill to move that’s not what any of those would describe.

In fact the app probably thinks downstairs is a low exertion activity vs many things I have to do because of what it does measure and when I can do it.

Getting car sick and the impact of motion from a journey etc is another one. So for the huge exertion of an appointment the app might only be measuring my footsteps or HR (which seems to vary with the same situation on different days in an ‘after’ way, but I can’t predict it yet), not holding my body up and trying to answer questions. So it could look less arduous than the day where I walked into the garden x steps because I’d rested lots and felt unusually well for me, but also knew I could collapse into all sorts of furniture, had help around and could stop any moment.

Peter T · Jul 27, 2025

If there was a list of preferred terms, would a voice activated recording device work? Then you could just say things like ‘bed, feel very bad’, ‘chair, feel OK’, for later computer analysis, in conjunction with physical activity monitoring.

This would mean the person recording would just have to say one or two words and not have to enter any other data.
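As a rough illustration of the ‘later computer analysis’ step, here is a minimal Python sketch. It assumes the recordings have already been transcribed to text, and the term lists, the 1 to 4 scoring and the names (`POSITIONS`, `FEELINGS`, `parse_note`) are all made up for the example, not an agreed vocabulary.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical preferred terms; a real list would need agreeing with the people using it.
POSITIONS = {"bed", "chair", "standing", "downstairs"}
# Hypothetical scoring: 1 = worst, 4 = best.
FEELINGS = {"feel very bad": 1, "feel bad": 2, "feel ok": 3, "feel good": 4}


@dataclass
class Note:
    time: datetime
    position: str
    feeling: int


def parse_note(time: datetime, transcript: str) -> Optional[Note]:
    """Turn a transcript such as 'bed, feel very bad' into a Note.

    Returns None when the transcript does not match the preferred terms,
    so unrecognised notes can be set aside rather than guessed at.
    """
    parts = [p.strip().lower() for p in transcript.split(",")]
    if len(parts) != 2:
        return None
    position, feeling = parts
    if position not in POSITIONS or feeling not in FEELINGS:
        return None
    return Note(time, position, FEELINGS[feeling])
```

Each parsed note carries a timestamp, so it could later be lined up against the activity monitor's readings for the same period.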

Sasha · Jul 27, 2025

Peter T said:
If there was a list of preferred terms, would a voice activated recording device work? Then you could just say things like ‘bed, feel very bad’, ‘chair, feel OK’, for later computer analysis, in conjunction with physical activity monitoring.

This would mean the person recording would just have to say one or two words and not have to enter any other data.

That would mean remembering to do it, and constant effort of monitoring, though - I think that would be quite burdensome.

bobbler · Jul 27, 2025

Sasha said:
Same here - I've tried to track time lying flat and time with my feet in contact with the ground (i.e. sitting properly, standing or walking), but it's very burdensome and I've given up. I now track something that I used to be able to do daily but now can't do so often, but would be the first thing I'd start again daily if I improved (whether I go downstairs). That kind of thing makes a good indicator of improvement but is hardly major by a healthy person's standards (or a trial's, probably).

I like the idea of wearables measuring key activities such as steps and time spent upright (calves horizontal and torso horizontal would be useful). Straightforward and objective, and would indicate the size of any improvement very clearly. But I don't have a smartphone and so have never used any of these things (except for step counters outside my home). They wouldn't be much use for the very severe (floor effect) but for those who can be upright sometimes and mobilise a bit, are they very burdensome? Is there any reason not to use these over FUNCAP or some simplified/easier version of it?

Or is it horses for courses and we're interested in tracking outside of trials, on zero budget?

PS weirdly yesterday I felt rotten and had the app on.

So rotten that lying flat on my back I was still in ‘activity zone’ all day. So if you just looked at the overall points for exertion it was 50% over limit. Yet I did nothing. Days where I’ve done more have been 40% under.

This makes sense to me, but in a finger-in-the-wind, decades of experience with my own body and this illness, kind of way. I’ve some approx guesses where it’s from, eg cumulative from all sorts of time lags. But not an ‘I did a marathon 24hrs before’ or even ‘the same thing happened x hrs after I’d done that y last time too’.

I can imagine the gaslighting (‘you must have done more yesterday than you realised you were doing’) crud I would get. Or the slapdash people who just take the total and my heart rate pattern and assume yesterday was an active day where I did lots because it was 50% over - without caring for my one little note trying to say ‘no, all of this swathe of hours where the colour on the graph looks like sitting-up level, today I was lying flat on my back’.

It’s interesting at my micro level to keep me a bit occupied with some curiosities that bang me over the head.

But I worry about trying to use the data without it going through the conduit of the actual user. And then it would need to be a user with a lot of experience too. To not take it all literally. Or should I?
