Use of EEfRT in the NIH study: Deep phenotyping of PI-ME/CFS, 2024, Walitt et al

Discussion in 'ME/CFS research' started by Andy, Feb 21, 2024.

  1. Keela Too

    Keela Too Senior Member (Voting Rights)

    Maybe also ask for the instructions themselves at the same time.
     
    Last edited by a moderator: Sep 27, 2024
  2. Kitty

    Kitty Senior Member (Voting Rights)

    Messages:
    6,803
    Location:
    UK
    Great work, everyone. I've struggled to follow the thread because there are so many figures and graphs in it and I don't really understand them.

    I agree that the throwing out of one participant's results because he was smarter than they expected looks shonky, but I wonder if it's a distraction? The nub of the argument that the results are invalid seems to go something like this:


    The Treadway EEfRT test was developed for use on a healthy population. It was designed to fatigue but not to exhaust, validated by the original cohort's ability to achieve a 98% completion rate in the easy tests and 96% in the hard ones.

    The healthy volunteers in the current study achieved rates of 96% in the easy tests and 99% in the hard ones. The ME/CFS patients were able to complete 98% of the easy tasks but only 65% of the hard ones, despite trying again and again, indicating that they did become exhausted.

    Walitt et al have thus demonstrated that the Treadway EEfRT test is invalid for use on sick patients.
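
    For anyone who wants to check how stark that hard-task gap is, here is a minimal two-proportion sketch. The trial counts are assumed round numbers (the real totals are in the published data), so treat the output as illustrative only:

    ```python
    # Minimal sketch: two-proportion z-test on hard-task completion rates.
    # The 65% vs 96% figures come from the posts above; the counts of 200
    # hard-task attempts per group are ASSUMED for illustration.
    from math import sqrt, erf

    def two_prop_ztest(x1, n1, x2, n2):
        """Pooled two-proportion z-test; returns (z, two-sided p)."""
        p1, p2 = x1 / n1, x2 / n2
        p = (x1 + x2) / (n1 + n2)                      # pooled proportion
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se
        p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
        return z, p_two_sided

    # 130/200 = 65% (ME/CFS) vs 192/200 = 96% (HV, original cohort figure)
    z, p = two_prop_ztest(130, 200, 192, 200)
    print(f"z = {z:.2f}, p = {p:.2g}")   # z is around -7.8: a huge gap
    ```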
     
    Last edited: Mar 3, 2024
    rvallee, Mij, Peter Trewhitt and 16 others like this.
  3. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    I hope I'm not taking things on too much of a sidetrack here but the more I look into this the more this exact type of point strikes me. Particularly given the MRI claims.

    Even if there were limits on how long after initial infection the people with ME/CFS in the study were, I think we forget how long even 10 months is in the life of a normal person who then suddenly has to drastically change the limits of what they can do.

    It is like doing an 'immersion course' in 'making a silk purse out of a sow's ear' (old phrase: CAN'T MAKE A SILK PURSE OUT OF A SOW'S EAR Definition & Usage Examples | Dictionary.com; you'll note the full version is 'you can't'), i.e. operating when you don't have enough energy for things to actually 'add up'. My point is that it's like the cross-purposes/cross-communication over 'pacing' with most healthy people, where the penny never drops: they miss the issue that you don't have enough energy (whatever or however you do things; even if that 'helps', it doesn't magic up 'extra') and that there are unexpecteds. So it isn't making an itinerary for yourself, going slow and building in tasks you like, that works, even if doing that detail was 'worth the energy'.

    Basically it gives us experience, over and over, of looking at a situation in front of us and guesstimating what broad-brush approach to take, given that our limitations mean we will be 'short' and just have to chart the best course possible. Which isn't dissimilar to the task itself, for healthy people. Then of course there is the complication that, for ME/CFS, the task isn't really 'the task' but a side-show within our real task (getting through the day and the other priorities) when you actually calculate rewards and paybacks.

    My point is that if they are inferring from comparisons with healthy people of certain 'parts of the brain being lit up', then of course it isn't like for like.

    It's like putting a healthy person in a wheelchair, as a surprise condition, for a test (vs a comparator group who already use wheelchairs), then focusing only on how their brain lit up during, say, a 'treasure hunt with clues'. That overlooks the additional shock and lack of experience the healthy people face with these limitations (e.g. knowing that the thing on the shelf will be hard to reach, or that 'I'll need to use the other entrance due to the narrow doorway', knowledge you'd embed to automatic over time), vs it being run-of-the-mill 'brain activity' for those who've lived with these limitations for years (or at least something directly 'translatable').

    But then also not adapting the test for the wheelchair users, who are wheelchair users because of a condition that might exhaust them, or mean they can't stretch to pick up an object like a clue, etc.

    And then thinking, because your test looks at motivation or effort (when it was validated under other circumstances with different cohorts), that the differences 'you've found' must be due to those, rather than to a lack of suitable control of bigger factors.
     
    Last edited: Mar 3, 2024
    Peter Trewhitt, EndME, Ash and 6 others like this.
  4. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    I did email Treadway, back on the 22nd of February before I'd even dug in much.

    Hi Michael

    This new Nature Communications paper from a big NIH working group uses your effort metric and it ends up being a part of their conclusions.

    https://www.nature.com/articles/s41467-024-45107-3#Abs1

    Does the way they used it look legit to you? Is it appropriate and validated to use in a group with fatiguing illness?

    Thanks for any response you're able to provide!


    No reply yet.
     
  5. AuroraNY

    AuroraNY New Member

    Messages:
    1
    I’m a lurker, not a poster. But I’m making an exception here to say that reading along with these posts (and the ones on the other thread) has been the most interesting, intellectually satisfying, and FUN thing I’ve done in quite a while. You all are incredibly amazing, and I wanted to thank everyone who has participated in this for their hard work and analysis. As a non-scientist, I’ve found it all mind-blowing. Hats off to all of you!
     
  6. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    And on that note, there was a paper on the EEfRT (I think it was an Ohmann reference, but I'd need to check) where they asked participants about the 'value of money' to get a sense of the measure.

    In this instance, though, it would be interesting to think of the 'consequences for pwME' of doing these tasks in different ways, e.g. if they ended up not being able to use their arms for 3 hours (even to drink) or literally lost 5 days to being unable to get out of bed.

    And then asking the HVs 'what value, i.e. monetary worth, would you put on e.g. not being able to move your arms for 3 hours / what would someone need to pay you for you to sign up for that', and the same for losing 2, 3, 4 or 5 days (and you could equate it to severe flu or whatever as a frame of reference).

    Which, given this is somewhat consumer behaviour, could numerically contextualise the task vs the real incentives apparently operating as reward vs effort/payback/cons etc. This was also within a trial involving lots of tasks, i.e. where people might have thought there would be a test in a day or two's time that they would be unable to participate in if they 'blew it', i.e. exerted too much.
     
    RedFox, Peter Trewhitt, EndME and 6 others like this.
  7. Eddie

    Eddie Senior Member (Voting Rights)

    Messages:
    145
    Location:
    Australia
    They were certainly told this beforehand: "Participants were told at the beginning of the game that they would win the dollar amount they received from two of their winning tasks, chosen at random by the computer program (range of total winnings is $2.00–$8.42)."

    Of course, there is the possibility that participants didn't fully understand how the game works. This seems pretty likely given how overly complex the game comes across. The researchers themselves don't seem to understand the optimal strategy for maximizing winnings. My guess is that in the ME/CFS cohort, fatigue, the desire to limit PEM, and the complex nature of the game led them to choose the easier options as time went on, as they probably had little idea what they should be doing anyway. Given it took us several days to figure it out, there is no way I could have figured out what should be done on the spot while fatigued and having PEM.
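
    To make the pool mechanics concrete, here's a minimal sketch of the payout rule as quoted above: two tasks are drawn at random from your pool of completed, winning tasks, and failed tasks (on the reading discussed in this thread) never enter the pool. The dollar values are illustrative, not from the study data:

    ```python
    # Sketch of the payout rule: pay the sum of two tasks drawn at random
    # from the pool of completed, winning tasks. Values are ILLUSTRATIVE.
    import itertools, statistics

    def expected_payout(pool):
        """Mean of the sum of two distinct tasks drawn at random from the pool."""
        pairs = list(itertools.combinations(pool, 2))
        return statistics.mean(a + b for a, b in pairs)

    bank = [4.12, 3.85, 3.50]              # three high-value wins already banked
    print(expected_payout(bank))           # about 7.65
    print(expected_payout(bank + [1.00]))  # completing a $1 easy win DROPS it to about 6.24
    ```

    So once a few high-value wins are banked, completing a low-value task actively dilutes the pool, which is why deliberately failing cheap tasks can pay.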
     
    RedFox, cfsandmore, rvallee and 9 others like this.
  8. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    This looks really useful :) and it really gives a sense of the difference between the two 'things' (words are starting to fail me a bit)

    Is it possible, on the ME-CFS chart, to have it 'ordered by' the exact same criteria, but 'ascending' (as it looks like the HV data is in ascending order of % hard tasks completed)?
     
    Last edited: Mar 3, 2024
    Peter Trewhitt and Kitty like this.
  9. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    Am I correct in thinking that technically @Murph's point could apply to anyone who chose a hard task despite knowing they might not complete it?

    As long as an uncompleted task 'didn't count' (rather than counting as $0), then from a purely monetary point of view, once there are a few 'wins' in the bag worth e.g. $3.50 each, it is better not to complete low-value tasks; i.e. the worst tactic would be to select the easier, lower-value option of the two and complete it, thereby adding e.g. $1 to the pot.

    If it is a low-probability one, there is less chance of that amount being added to the pot anyway; it is likely to be a 'doesn't count' trial. Which is where, I guess, if you aren't suffering from a condition that could cause fatiguability or be energy-limiting, you might choose easy for either or both of low value/low probability, because 'the shorter time' is perhaps preferable, giving you more chance of getting to see more of the trials with higher value/higher probability.

    BUT if you have a condition that causes fatiguability that will affect your performance, or causes 'PEM' that puts the reward value into perspective against the payback consequences (the differential benefit of an extra $0.50), then you might consider that the hard task gives you a longer rest. Or that it doesn't matter, as long as you pace your button-pressing to health rather than to rewards, except where value/probability is high enough to be worth having.

    SO yes, I think it was @Hutan's and others' point that the irony is the optimal strategy of the game mightn't be far off the strategy those who have ME need to think of anyway (getting to the heart of what needs to be done to get 'enough' while avoiding 'waste'), due to capability factors too.
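
    To put rough numbers on that intuition, here's a Monte Carlo sketch. It simulates only hard trials, with made-up parameters (values in a $1.24–$4.30 range, win probabilities of 12/50/88%); treat these as illustrative rather than the study's exact set-up:

    ```python
    # Monte Carlo sketch: deliberately failing low-value trials so they never
    # enter the payout pool, vs completing everything. Parameters ILLUSTRATIVE.
    import random, statistics

    def run_session(threshold, n_trials=50, rng=random):
        pool = []
        for _ in range(n_trials):
            hard_value = rng.uniform(1.24, 4.30)       # this trial's prize
            p_win = rng.choice([0.12, 0.50, 0.88])     # this trial's win chance
            # Only bother completing trials whose value clears `threshold`;
            # anything below it is failed on purpose and never enters the pool.
            if hard_value >= threshold and rng.random() < p_win:
                pool.append(hard_value)
        if len(pool) < 2:
            return sum(pool)                   # degenerate case: fewer than 2 wins
        return sum(rng.sample(pool, 2))        # two winning tasks drawn at random

    rng = random.Random(0)
    for threshold in (0.0, 3.00, 4.00):        # 0.0 = complete everything
        mean_pay = statistics.mean(run_session(t, rng=rng) for t in [threshold] * 20000)
        print(f"fail below ${threshold:.2f}: mean payout ${mean_pay:.2f}")
    ```

    A moderate threshold should beat completing everything, while too aggressive a threshold risks an empty pool, which matches the observation that HVF's version of the strategy wasn't even optimal.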
     
  10. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    this
     
  11. Hutan

    Hutan Moderator Staff Member

    Messages:
    29,377
    Location:
    Aotearoa New Zealand
    I've participated in a trial and then read in the resulting paper what was said about what the participants were told, and there was a significant difference from what I was told. What happened in practice in terms of what the participants were told, and also, as you say, what the participants understood from the way that they were told, could easily be different from what the paper says.

    It took a while for me to even start to understand what the experiment was; for example, to be sure that the easy option value was different from the hard option value, and constant, when the spreadsheet suggests the opposite.

    But, I think once you understand the scenario, it is reasonably obvious that it would be good to restrict the pool of winning tasks to just a handful of the potentially highest value ones. Try setting up the scenario for a friend and see what happens. I really do think that more of the participants would have worked out how to do that, especially with the four trial runs. Maybe not the exhausted people, but I think we should have seen more of the HV and some of the ME/CFS participants do it more of the time. And surely, the investigators, who had lots of time to think about it and test the experiment would have worked it out ahead of time and realised that that incentivised approach would mess up their experiment?

    I forget what @bobbler found - has the exact same set-up with respect to calculating the winnings been used in other EEfRT investigations?

    Even if the investigators didn't work out in advance that people would use the HVF strategy, surely it would have occurred to them when HVF actually applied it. I can easily imagine HVF chatting and laughing about it with the investigators as he collected his winnings afterwards. I doubt that all of the EEfRT experiments were run on the same day. So it's even possible that the investigators changed what they said to people part of the way through the experiment, maybe telling people that it was important to try hard on all of the tasks they selected, or something.

    Of course, I don't know. It's just that it's easy to assume that investigations are all nice and consistent, with no messy human interactions biasing the results. I think it's probably often not the case.
     
    Last edited: Mar 3, 2024
    Lilas, Mij, Simon M and 11 others like this.
  12. Eddie

    Eddie Senior Member (Voting Rights)

    Messages:
    145
    Location:
    Australia
    I agree with everything you said. Especially given that it was a 21-year-old guy who figured it out, there is no way he didn't at least mention his strategy to the investigators once he realized he could beat the system. There is clearly bias throughout this test, and the fact that Walitt threw out one of the results because it didn't fit with his understanding of the game only further demonstrates this bias.
     
    Lilas, Peter Trewhitt, Kitty and 5 others like this.
  13. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734

    This is the paper that you want. It sort of covers all of the key bits you note: Examining the reliability and validity of two versions of the Effort-Expenditure for Rewards Task (EEfRT) | PLOS ONE

    There is a lot of discussion about how it can be as much down to individual strategy as motivation throughout.


    ALSO: certainly in this paper they used a different incentive, and I don't know how many of the 'validating versions' might have used something different, or whether they all stuck to the same thing.

    In this paper they note that a limitation of their original version (for the purposes of their paper, which compares it with a modified version) was that they had already changed it from the validated version: instead of basing the incentive on 'two of the trials won', they used an average.


    And your idea of interviewing participants afterwards is also used in this paper, in the discussion:

    and in the 3.5 secondary analyses section (there is more on this there):

    EDIT: sorry, I didn't realise I'd left the start of this bit in, so I'll correct it. One of the main purposes of their paper is developing a modified version that, instead of using a defined number of clicks for 'hard', uses 'how many can you do in x time' multiplied by e.g. 2, 3, 4 or 5 as a 'reward' number.

    To account for the likelihood of motoric ability affecting this, they produced a Max figure by doing tests at the start.
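
    If I've read their design right, the reward arithmetic works something like the sketch below. The function and variable names are hypothetical, and normalising against Max is my guess at how the calibration feeds in, so check the PLOS ONE paper itself:

    ```python
    # Hypothetical sketch of the modified scheme described above: reward scales
    # with clicks achieved in a time window, with a per-person Max measured in
    # calibration runs at the start. Names and details are my ASSUMPTIONS.
    def calibrate_max(practice_runs):
        """Personal Max clicks, taken from the calibration tests at the start."""
        return max(practice_runs)

    def trial_reward(clicks, multiplier):
        """'Reward' number = clicks achieved in the window times the multiplier."""
        return clicks * multiplier

    personal_max = calibrate_max([48, 52, 50])     # e.g. three practice windows
    clicks = 45                                    # clicks managed on this trial
    print(trial_reward(clicks, multiplier=3))      # 135 'reward' points
    print(clicks / personal_max)                   # about 0.87 of personal capacity
    ```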

     
    Last edited: Mar 3, 2024
  14. bobbler

    bobbler Senior Member (Voting Rights)

    Messages:
    3,734
    ❤️
     
    Yann04, Hutan, Evergreen and 2 others like this.
  15. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    995
    Location:
    UK
    I think it might be, based on how authors responded to letters about the PACE trial and similar. Those criticised ignore all the strong points and focus on more marginal ones. If there are a few marginal points across several letters, or in a paper, that can make their reply look stronger than it really is, particularly to neutrals, who will then be more likely to walk away from a contentious area.

    The high non-completion rate fatally undermines this use of the EEfRT. I don't think pointing out that they excluded an individual who tried to game the study makes the case any stronger, but it does give the authors a chance to mount a defence of sorts.

    The analysis that shows what is happening with the healthy volunteers is brilliant, and the fact that it flips a marginally significant result into a clearly non-significant one is striking. But I wonder if including that point in the letter would be productive.
     
    Last edited: Mar 3, 2024
    horton6, rvallee, Evergreen and 10 others like this.
  16. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
    I have contacted Nath and Walitt and asked them to supply additional details that other EEfRT studies have supplied. These details are crucial to the understanding of the trial. I have also contacted Ohmann, and asked @andrewkq how one could coordinate things. I think we should take our time (certainly not months, but at least a couple of days, until we've made sure that every angle has been looked at); we don't have much reason for a rushed response.

    Regarding "figuring things out" or trying to strategize within the trial: there's even a study where the experiment is repeated 4 times and participants had a week's break between the first 2 runs and the last 2, and it seems "strategising" wasn't a problem there. Focusing a response on the fact that one can "strategize", based on looking at HV-F alone, wouldn't make sense to me, especially when it is abundantly clear that his strategy is not even optimal and he makes non-optimal decisions multiple times, which makes it clear that he is in fact not beating the game at all, rather than just being an outlier who is gaming differently. Based on what I've seen, some other studies might have excluded him as well.

    I also find it interesting that in several studies the authors would tell the participants different things about what the pay-out would be, to control for motivation. I think we have to know how exactly these things went in the intramural study, and I think @Hutan's point about getting this information from a participant as well is crucial. Were they all chatting in a room, waiting in line, or what was going on? Is there a slight difference from what is reported in the paper?

    I think it could be valuable to have a closer look at this thread I made:
    Worth the ‘EEfRT’? The Effort Expenditure for Rewards Task as an Objective Measure of Motivation and Anhedonia, 2009, Treadway et al and look at some of those studies a bit closer.

    I don't think it makes sense to focus too much just on the original 2009 paper, as the EEfRT has been used in a tremendous number of different studies. The results of the different EEfRT studies differ vastly, and so do the interpretations of their results. For example, people not using a "good strategy" is sometimes even argued to be a property of an illness. Furthermore, multiple studies have excluded some participants; I haven't seen what reasons were given, but typically an analysis was provided both with and without these people, and the results never drastically changed. I believe it makes sense to see whether standard exclusion criteria were specified anywhere, and whether this was a priori or a posteriori. Multiple studies also had a between-group difference in people being able to complete tasks; I still have to have a closer look at that. I haven't found a study where hard-task completion was even close to as low as in the pwME group in the intramural study. I think it might make sense to go through some of these studies and see what the authors said when they had slightly lower completion rates in one group, and what the lowest completion rate on hard tasks was. Perhaps there is a study somewhere with a lower completion rate on hard tasks that is statistically significant (I haven't seen one yet); we could then see what the authors' response to this was, since this could be the line of defence taken by Walitt et al.

    Most trials made adaptations to the original design, very often to account for fatiguability or some other deficit of the participants (for example, people with cognitive problems not having a time limit on making the decision between hard and easy). Not having a calibration phase would be problematic in the intramural study if fatiguability has any influence on the results (which might not seem to be the case, but I don't think anyone has fully looked at this yet). Often they also adapted their analysis accordingly; I have started looking into what this might mean for the results of the intramural study.

    Looking at this has made me crash, but I hope to present some graphs in the next few days and once I've gotten some responses via email.
     
    Last edited: Mar 3, 2024
  17. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    While the exclusion of HVF's data is an outrage (he isn't even an outlier in terms of hard tasks chosen, all players played hard more often when the prize was high so his strategy isn't odd, and losing the easy tasks doesn't affect the primary endpoint), I agree that choosing that battle is like meeting the study on its own terms.

    Perhaps there was an instruction onscreen that said "push the button to fill up the bar", for example, and they can argue he didn't follow that instruction (even though others didn't either).

    The fact PWME couldn't complete hard tasks seems like a stronger argument to present to a hostile audience (or an audience that doesn't want the embarrassment of errata).

    Both of them together make a good general narrative for a more general audience.
     
  18. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    Here's a chart of hard tasks chosen (% terms) vs expected prize money (2x the mean of the prize awarded for tasks completed). We can see HVF is an outlier in these terms (top left in blue). PWME shown in red.

    [Attached chart: prize vs percent hard.jpeg]

    If this test were really well designed, you'd expect the points to form a tighter upward-sloping line: a tight link between the desired behaviour and the reward.
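
    For anyone who wants to rebuild this chart from the raw trial data, something like the sketch below would do it. The file name and column names (participant, group, chose_hard, prize_won) are hypothetical placeholders for whatever the released spreadsheet actually uses:

    ```python
    # Sketch: per-participant % hard tasks chosen vs expected prize money
    # (2x the mean prize for completed tasks, as described above).
    # Column names are HYPOTHETICAL placeholders for the real data's labels.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("eefrt_trials.csv")   # hypothetical export of the raw data

    per_person = df.groupby(["participant", "group"]).agg(
        pct_hard=("chose_hard", "mean"),                       # share of hard choices
        expected_prize=("prize_won", lambda s: 2 * s.mean()),  # prize_won assumed NaN
    ).reset_index()                                            # on uncompleted trials

    colors = per_person["group"].map({"HV": "tab:blue", "ME/CFS": "tab:red"})
    plt.scatter(per_person["expected_prize"], per_person["pct_hard"] * 100, c=colors)
    plt.xlabel("Expected prize money ($, 2 x mean prize for completed tasks)")
    plt.ylabel("Hard tasks chosen (%)")
    plt.title("Hard-task choice vs expected prize")
    plt.show()
    ```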
     
  19. Eddie

    Eddie Senior Member (Voting Rights)

    Messages:
    145
    Location:
    Australia
    From the perspective of a healthy control, the aim of the game has to be to maximize the reward they receive. From this perspective it makes complete sense to fail any task with a lower reward and thus remove it from the pool of possible rewards. The true optimal strategy would depend on how the rewards are generated (are they randomly selected? are they on a bell curve?), but it certainly involves failing the lower-reward tasks on purpose. Does anyone know if they included the payouts (or expected payouts) in the raw data, as that would give a general idea of how good different strategies were? I also think this isn't particularly important. As others have mentioned, I agree that it makes sense to avoid focusing on this in any response, other than recognizing that without this exclusion the case for "effort preference" would have been weakened.

    Edit: Thanks Murph for answering my question!
     
    bobbler, cfsandmore, Hutan and 4 others like this.
  20. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    There's history and precedent for booting out the data of people who try to maximise their payout, as shown in the next two screengrabs. This should be evidence that the EEfRT is a mess. But in terms of a fight over whether HVF's data should have been excluded, it's likely to weigh on Walitt's side.

    It is further evidence the best approach is to focus on rates of hard task non-completion by fatigued participants.

    1.

    [Attached screenshot: Screenshot 2024-03-03 at 9.04.20 pm.png]

    2. This is where footnote 37 in the above screenshot leads:
    Neuropsychopharmacology. 2021 May; 46(6): 1078–1085.

    Dose-response effects of d-amphetamine on effort-based decision-making and reinforcement learning
    [Attached screenshot: Screenshot 2024-03-03 at 9.07.36 pm.png]
     
    JoClaire, Yann04, Hutan and 7 others like this.
