Use of EEfRT in the NIH study: Deep phenotyping of PI-ME/CFS, 2024, Walitt et al

Discussion in 'ME/CFS research' started by Andy, Feb 21, 2024.

  1. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
Something we haven't analysed yet is whether pwME have to take breaks to be able to do hard tasks, i.e. whether there are more prominent alternating patterns of high and low numbers of clicks, and whether these drive the lower mean behaviour more than the ME/CFS vs HV distinction does. I'll try to explain why it might be worth a look (even if it reveals nothing). Let's look at the plot for the first 35 rounds, and only at the hard rounds, i.e.
    plot_hard_rounds_35_wo_f.png

On the HV side the larger mean, i.e. "higher effort preference", is driven by HV H, HV P and HV O; these people simply don't exist on the ME/CFS side. The highest performer (in terms of opting for hard games) on the ME/CFS side is ME/CFS D, and you might not expect him to get close to those high performers in the HV group because he's one of the 2 people with the least ability to complete the rounds, so it's possible that he has to take breaks. The question then is why there aren't pwME who can go hard and repetitively go hard, i.e. why ME/CFS C, K & F choose far less often to go for hard than HV H, HV P and HV O, and what about the other ME/CFS patients with a 100% success rate on hard trials? Is that due to a necessity of having to take breaks in between rounds, or is it a different strategy (I don't think all of these people are trying to optimise their pay-out)? If you look at the comparison of first half vs second half, you will see that this difference only starts increasing in the second half, so it could be fatigue-driven.
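The "breaks" idea could be checked mechanically once per-trial choice sequences are extracted. A minimal sketch, assuming choices are available as lists of "hard"/"easy" strings per participant (the sequences below are invented for illustration, not study data):

```python
# Sketch: quantify whether a participant alternates hard choices with
# "breaks" (easy choices), using the longest run of consecutive hard
# choices. The data format here is hypothetical -- real per-trial
# choices would come from the study's supplementary data.

def longest_hard_run(choices):
    """Length of the longest streak of consecutive 'hard' choices."""
    best = run = 0
    for c in choices:
        run = run + 1 if c == "hard" else 0
        best = max(best, run)
    return best

# Illustrative sequences (invented, not study data):
hv_like = ["hard"] * 6 + ["easy"] + ["hard"] * 5   # goes hard back-to-back
pwme_like = ["hard", "easy"] * 6                   # a "break" after every hard

print(longest_hard_run(hv_like))    # 6
print(longest_hard_run(pwme_like))  # 1
```

Two participants with the same overall % of hard choices can then still be separated by whether their hard choices cluster or alternate with easy "rest" rounds.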

I think it also makes sense to look at the literature of EEfRT studies to see how common people such as HV H, HV P and HV O usually are amongst healthy people. I'd guess they are very common (especially because in some trials people are given different instructions to ensure they don't follow strategies that end up minimising play time on hard rounds), because I think many people will go hard if they can and if they haven't really understood what the better choice is in terms of end rewards. But you never know, perhaps these are outliers (though I don't think so).
     
    Last edited: Mar 6, 2024
    Hutan, bobbler and Peter Trewhitt like this.
  2. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    363
    I see it as data-driven.

Alternatively, you could say =100% vs <100%, in which case the percentages would be 45% and 35%, and check that. I suggested above and below 90% because it seemed more reasonable clinically to allow healthy volunteers to not be perfect (but still not lose much confidence, and thus not have their choice between hard and easy affected much), since 3/16 were in the 90–95% range (one patient was also in that range).

    [Removed a part here where I think I had errors in my data.]

    I thought dividing into 2 groups with as much of a mix of HVs and patients as possible was the best option.

The argument would be that when people find they're failing some hard tasks, it factors into their choice between hard and easy tasks, after the probability of the reward and alongside the value of the reward (reading from @andrewkq 's analysis). And there might be a small amount of failure that they consider reasonable, without it factoring into their choice much or at all. And then, to explain why some people hurl themselves at it repeatedly despite all the failure, I would say: they're playing a game, they know what they've been told to do and they're doing it, and/or they remain reward-driven, whether that's the small monetary win or the approval of testers. Resilience, innit. Or some such.

    I don't have anyone on 90%. Closest I have are HV N at 92% and MECFS N at 87%. So it is less than and greater than 90, no equals.

    Edited to correct.
     
    Last edited: Mar 10, 2024
  3. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
I was using a cut-off (at 35 rounds); are you using the full game?
     
    Last edited: Mar 6, 2024
  4. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    363
So my explanation for why there's no further reduction in the proportion of hard tasks chosen below 90% successful completion would be that the rewards of money, and perhaps the approval of testers and the knowledge that you did what you were supposed to do, are stronger factors in people's choices.
     
    Peter Trewhitt likes this.
  5. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    363
    Full game, no practice rounds. They're looking at every choice between hard and easy, so it has to be the full game.
     
    bobbler, EndME and Peter Trewhitt like this.
  6. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
Ah ok, I've been cutting off after 35 games (no practice rounds), which seemed the slightly better choice to me (it accounts for fatiguability, ensures everyone plays the same number of rounds, and is slightly more consistent with previous literature as far as I can tell, but it is certainly a choice rather than anything else). Given how unrobust this method is, these choices seem to make quite a big difference, as I've previously discovered.

    And at 35 rounds the 90% doesn't seem too robust to me, but arguably that's different for the whole game.
     
    bobbler, Peter Trewhitt and Evergreen like this.
  7. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    363
    Yeah, and I think Treadway would agree with you given that following discussions with him, Reddy et al. modified the EEfRT in their schizophrenia study to have everyone do 50 trials. But since Walitt & co include them all, I did too. I wonder if they considered the modifications others have introduced for clinical populations and rejected the idea, or did not consider modifications.
     
    bobbler, Peter Trewhitt and EndME like this.
  8. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
    They had a sufficiently large team, time and resources to familiarise themselves with the material. I think we can be certain they looked into this (especially since they did make some modifications).
     
    Peter Trewhitt likes this.
  9. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    interesting on ME-CFS H
     
    Murph, Hutan and Peter Trewhitt like this.
  10. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
I'm guessing that the highlighting of ME-CFS H means that button-press rate isn't correlated to e.g. SF-36 physical function. As effort preference is defined by choice, if that button-press rate is already there in the warm-up, then it "isn't that" causing the button-press rate there?

    Good questions, trying to get my head around it
     
    Peter Trewhitt likes this.
  11. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    363
    The relevant modifications made - reducing the number of button presses required for the hard task from 100 to 98 and reducing the time from 20 mins to 15 mins - were clearly not enough to make the hard task doable for pwME. I haven't read enough of the other studies to know if those modifications appear in the literature, or came from the NIH. I would have expected them to look into modifications others had made for clinical groups, but I don't think we can be certain of anything. Maybe they did and had reasons not to do the same, maybe they didn't.
     
    bobbler and Peter Trewhitt like this.
  12. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
The modification from 100 to 98 seems to be new in the literature, or at least I haven't come across it. The 20-minute to 15-minute reduction has already appeared in other papers; see for instance Effort-based decision-making impairment in patients with clinically-stabilized first-episode psychosis and its relationship with amotivation and psychosocial functioning (a study with a similar flavour to the one Walitt did). See also the thread
Worth the ‘EEfRT'? The Effort Expenditure for Rewards Task as an Objective Measure of Motivation and Anhedonia, 2009, Treadway et al.
     
    bobbler, Evergreen and Peter Trewhitt like this.
  13. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    363
    I hear you. I'm in the same range as you. But to be honest it would be my cognition, sensitivities to noise/movement and orthostatic intolerance that would stop me participating, rather than anything the SF36PF would pick up. Imagine all the talking involved in this study? All the things you're expected to sit for and not wilt?

The SF36PF is a pretty blunt instrument, very focused on walking/lower limbs and functional things like walking up the stairs. I think the hard task entails quite different skills: a completely non-functional rapid repetitive movement with a finger you would likely never choose if you had to repeatedly press a button. I don't know how much manual dexterity and speed really correlate with the ability to walk a few blocks or play golf.
     
    Amw66, bobbler, shak8 and 3 others like this.
  14. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
OK, so I've done the button-press rate for the warm-up rounds, sorted by button-press rate for the hard warm-up trials.

Only 4 HVs had a rate below 4.6446 (EDITED to update this to the more specific rate needed to complete a hard task), but 10 ME-CFS were beneath this rate.
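That cut-off behaves like a simple required-rate threshold: presses needed divided by the time window. A rough sketch; the 98 presses figure is from the paper's modified hard task, but the window length below is an assumption chosen so the ratio lands near the ~4.64 cut-off discussed here, and the rates fed in are illustrative, not study data:

```python
# Hypothetical required-rate check: a participant whose sustained press
# rate falls below presses_required / window_seconds had little chance
# of ever completing a hard round. The 21.1 s window is an assumption,
# picked only so that 98 / 21.1 is close to the ~4.6446 cut-off above.

def can_complete_hard(press_rate, presses_required=98, window_seconds=21.1):
    """True if a sustained press rate would finish the hard task in time."""
    return press_rate >= presses_required / window_seconds

# Illustrative warm-up rates (presses per second), not study data:
for rate in (5.2, 4.65, 4.1):
    print(rate, can_complete_hard(rate))
```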

    But other than obvious slight connections you might expect, it doesn't seem to explain anything necessarily?

    There seems to be some sort of connection with the number of hard ME-CFS can go onto complete, but with outliers thrown in.

    I sense there are too many overlapping phenomena, but also issues, going on here.

    EDITED table as spotted an error

    walitt warm up button press table.png
     
    Last edited: Mar 7, 2024
    Evergreen and Peter Trewhitt like this.
  15. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
I'm guessing the "problem" will once again be that both ME/CFS D & H are anomalies, in the sense that they keep on failing hard but keep on trying hard, and that that will ruin any explanations because there are too many different overlaps in the pwME group. One simply can't do something like a retrospective calibration, because the calibration itself could have impacted the hard vs easy decision in pwME.
     
    bobbler and Peter Trewhitt like this.
  16. Hutan

    Hutan Moderator Staff Member

    Messages:
    29,377
    Location:
    Aotearoa New Zealand
    A lot of the time, I can't believe that we actually have to take this nonsense of a study seriously.

    Yes, we need to know what the participants were told, and when it was decided that game play like HVF's was not acceptable. I still think, reading between the lines of the information we have, that the investigators probably made it up as they went along.

    On the misleading confidence limits on the Figure 3 charts that @bobbler notes above
    Bobbler, I agree. I mentioned upthread somewhere that those confidence limit lines are misleading (e.g. there is hardly any data in the later trial numbers) and that it would be a lot more enlightening to see the actual data points.



    You mentioned that we might be able to come up with a better explanation of the data but our explanation might not falsify the Walitt et al explanation (that people with ME/CFS prefer to expend less effort).

    I'm not sure that we would ever be able to falsify their hypothesis with this experiment. We might be able to show that the half of the people with ME/CFS who could tap at a healthy rate chose hard tasks at the same rate as most of the HVs, but, still, the half of the people with ME/CFS who couldn't tap that fast chose hard tasks at a slightly lower rate.

But why are those ME/CFS people with the slow tapping rate choosing slightly fewer of the hard tasks? I think it's partly because, when your ability to complete a hard task is uncertain, a sensible strategy is to ensure that you will win at least something from the easy tasks. And, if your finger is easily fatigued, it will be more important to rest it and allow it to recover, so, spacing out the hard tasks. For example, ME/CFS participants were not able to do as many hard tasks in such a concentrated way as HV-P, H and K. I think that alone might account for the difference in % of hard choices. (edit: I wrote this before reading EndME's comments about the HVs who did many hard tasks back to back; it's the same point.)
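The "win at least something from the easy tasks" point can be made concrete with a back-of-envelope calculation. In the EEfRT the stated win probability applies to both options within a trial, so it cancels out of the comparison; what's left is reward times your own chance of finishing. The $1.00 easy and $1.24–$4.30 hard rewards are from the published task design; the completion probabilities below are illustrative:

```python
# Per-trial expected value comparison under uncertain hard-task
# completion. The stated win probability (12/50/88%) applies to both
# choices of a trial, so it divides out. Completion probabilities
# here are illustrative assumptions, not measured values.

def hard_beats_easy(hard_reward, p_complete_hard,
                    easy_reward=1.00, p_complete_easy=1.0):
    """True if choosing hard has higher expected value than easy."""
    return hard_reward * p_complete_hard > easy_reward * p_complete_easy

# A participant who completes hard tasks only half the time:
print(hard_beats_easy(4.30, 0.5))  # True  -- big rewards still worth attempting
print(hard_beats_easy(1.24, 0.5))  # False -- small rewards are not
```

On this logic, a slow tapper behaving perfectly rationally would still skip low-reward hard trials, which alone pulls down their overall % of hard choices.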

    But, then, why are they tapping slow? Is it because there is a physical problem? Or because they think they have a physical problem? If they started on a graded finger tapping therapy, would they in fact find that their brain has been misinterpreting signals or pre-judging the world wrong, and they can in fact tap with the best of them? Who knows?

And the normal-tapping ME/CFS participants are not let off the hook either. If they can in fact tap normally and choose the normal number of hard tasks, what is their problem? Perhaps they only think that they are sick? (Sarcasm, based on the usual approach of applying irrelevant tests and then announcing 'good news, everything is fine!')

    I don't think we can falsify Walitt's hypothesis with this data, other than pointing out that their abnormal effort preference idea only applies to some of the participants. But, I don't think that they can prove their hypothesis with this data either.

Might be worth checking which participant accounts for which point in the 'time to failure for the grip test' vs % of hard tasks chosen. It might tell us more about ME/CFS H's physical capability.
     
    Last edited: Mar 7, 2024
  17. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
His strategy dominates everyone else's. Losing easy tasks is the simplest way not to dilute whatever hard tasks you've won (or those you hope to win later) in terms of prizes "in the basket". Winning easy tasks is an almost* pure bad strategy.

Even if you're not sure you can win a hard task, losing easy tasks, resting and trying to win a hard task make sense. *The only person who should win an easy task is a person who has won no tasks of either sort yet, has formed the view they can't win a hard task, and has roughly two tasks left.
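The dilution logic can be sketched numerically. In the original Treadway design, payment comes from two win trials drawn at random from all the trials you won; assuming the same applies here (the NIH variant may differ), every $1 easy win added to a basket of larger hard wins lowers the expected draw:

```python
# Sketch of the "basket dilution" argument: expected payout when a
# fixed number of prizes is drawn at random from your winning trials.
# The two-draw rule is assumed from the original Treadway design; the
# prize values below are illustrative hard-trial rewards.
import itertools
import statistics

def expected_payout(basket, draws=2):
    """Mean total over all equally likely ways of drawing prizes."""
    if len(basket) < draws:
        return sum(basket)  # fewer wins than draws: you just get what's there
    return statistics.mean(sum(combo)
                           for combo in itertools.combinations(basket, draws))

hard_wins = [4.30, 3.50, 2.80]                     # illustrative prizes
print(expected_payout(hard_wins))                  # ~7.07
print(expected_payout(hard_wins + [1.00, 1.00]))   # lower: easy wins dilute
```

So "winning easy tasks is an almost pure bad strategy" falls straight out of the draw mechanics, with the footnoted exception of someone who otherwise risks an empty basket.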

HVF doesn't execute the strategy in a totally optimal way (he wins one easy task, and also one hard task with a lower prize than those he has already banked). It is pretty close though!

    Is the question of whether HVF's strategy is "better" than anyone else's a distraction or important? I'm not sure. It is probably only somewhat relevant to the question of whether to exclude him. It is however illuminating with respect to the question of whether the EEfRT is a bad test that makes all downstream decisions about data analysis dubious.

    The really big point is that a test that requires you to put yourself in the head of a player and think about their motivations and guess their strategy and think about how tired they are and the interplay of hard and easy tasks - that's a bad test.
     
    Last edited: Mar 6, 2024
    Evergreen, Lilas, Amw66 and 4 others like this.
  18. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
And, just in case the average wasn't fair, I've done another check using the 'best of' any hard trials they did in the warm-up round (which might equate to the idea of a 'max button-press' sort of test).

It isn't revelatory either in explaining the data on its own. I had some sort of hypothesis that looking at button-press rates in the warm-ups would at least indicate those who stood no chance of completing many hards (and so had to choose what to do with that).

    Having said that, there is 'some' link with the number of hard completed (4th column from left) for ME-CFS (or the ones who didn't complete many), apart from ME-CFS E being an outlier.

    EDIT: OK I've now updated to reflect what I think is the more specific cut-off button-press for completion of hard being 4.6446

EDIT: if you can ignore ME-CFS E and M (:rofl: there is the catch on all these things), then falling much below the 4.6446 in the best-of seems to be predictive of not completing many hards. But it's imperfect, and the relationship with whether that meant they chose less hard isn't straightforward.

This is sorted by this 'best hard warm-up button-press rate', which is the third column from the right, bordered in light blue; the lavender border is around the first of each group to drop below the 4.7 button-press rate needed to complete a hard.

    edited table as spotted an error

    EDIT: updated to reflect new cut-off for completion of hard being 4.6446

    walitt warm up button press hard best of.png
     
    Last edited: Mar 7, 2024
    Evergreen and Peter Trewhitt like this.
  19. Eddie

    Eddie Senior Member (Voting Rights)

    Messages:
    145
    Location:
    Australia
    Exactly. This whole discussion makes me so angry. Not because this conversation isn't great, but because this is a nonsense test which should have had no impact on the conclusions of the paper. Instead, because of the beliefs of one researcher, we have this terrible test shaping the entire paper's narrative. Without this we would have no "effort preference" and no way to tie this to the questionable fmri findings. It is really such an embarrassment to those involved. I hope they come to realize that.
     
    Evergreen, Lilas, rvallee and 12 others like this.
  20. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    Ahh

I've updated it all slightly, as I first made an error in my table sorting and have since updated the cut-off needed for completion of hard to be 4.6446.


My concern ended up being whether it explained 'number of hard completed'. Having been staring at the magic-eye table for a few days now, it seems that there are a variety of approaches that people who are struggling take.

But the two obvious ones are 'picking your best chances' vs 'carrying on regardless, on the off-chance you nail one or two, or because you wonder what it is the experiment is measuring [given how far off you are]'. And the former seemed more likely in those whose click rate was closer to success in their initial rounds.


I did, however, think it was an interesting point raised by @Murph, when he had a go at the test, that there was no countdown clock. I don't know for sure whether this was the case with the Walitt version, or whether there were any watches or other ways of measuring time. So whilst I imagine that the latter group might have sensed things were tight, if it just shuts off when you complete, then how are you to know unless you are missing regularly?


Either way, I think it is reasonable that there would be different strategies, both of which relate to issues with calibration/how achievable the hard task is, even before fatigue within the task. That of course will make it difficult to just run a single test and get something out of it in the way we might normally manage, and the numbers are really too small for that; to start trying cluster analysis or anything clever would be even worse.


There seems to be something around a rate of 4.7-something on this one for a 90% completion rate, but that is approximate for both ME-CFS and HVs, because yes, there are a good few outliers/participants defying predictions a bit, like ME-CFS M and E on this occasion.

I'm being wary about jumping to the % chosen, but I keep seeing the pattern, and am pretty sure I'm not imagining it, that it is those who are 'in the middle' who are the most likely to 'be more choosy'. This makes logical sense to me, given this is the group where 'it matters'/they can make a difference, vs those who are going to lose or win all the hard ones that they do.


It is interesting to look at the SF-36 data as the best we have; it's a shame we don't have something more precise to the condition. But at least it gives us a sense of the range we have even for such a small group, and explains why for 15 individuals something like this becomes difficult to analyse. I suspect that with larger numbers covering the same range you would have options, or, vice versa, in an imaginary world where we knew what was going on, you could get a sample tighter to 'whatever is lying underneath'.

    Brain having a ponder to try and take all this in and think through, but I get what/why you are looking at with the 'progression through rounds' approach. It's interesting, just above my capability atm to think-it-out!
     
    Evergreen and Peter Trewhitt like this.
