Cardiopulmonary and metabolic responses during a 2-day CPET in [ME/CFS]: translating reduced oxygen consumption [...], Keller et al, 2024

Discussion in 'ME/CFS research' started by Nightsong, Jul 5, 2024.

  1. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    110
    So, this was an attempt to replicate the two-day CPET findings with a really big sample, and the differences persisted but don't look very big.

    Is that a fair summary? Is that why most of this thread is discussion of the inclusion criteria? People trying to explain why the differences don't look big?

    I tried to read this paper myself but holy heck, is it long and dense.

    Another question: are these the same patients as in Hanson's huge metabolomics studies? She did two-day exercise challenges there too.
     
  2. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    I think the inclusion criteria line of thought was mostly just speculation by me. I wouldn't put too much stock in it. But yes, it was an attempt to explain why the differences seem smaller with deconditioned controls. Although, even though they are smaller, this is yet another study showing significant differences between groups, even in the (somewhat controversial) fitness matched pairing.

    Side question: Have any of the controlled 2-day CPET studies shown greater reductions in the control group for VO2 or work at VAT, even non-significantly greater, or have pwME universally shown the greater reduction (including non-significant ones)?
     
    Last edited: Jul 12, 2024
    Kitty, Sean, Robert 1973 and 2 others like this.
  3. SNT Gatchaman

    SNT Gatchaman Senior Member (Voting Rights)

    Messages:
    5,305
    Location:
    Aotearoa New Zealand
    It would have been much better served as two papers, I think.

     
  4. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    Just wanted to highlight that this paper, in contrast to previous exercise studies in ME/CFS, found no evidence of chronotropic incompetence or the inability to increase HR during the exercise test. They used 3 measures for this (%predicted HRmax, %HRRadjusted, and CTIpeak) and none seem to differ between patients and controls.
     
    Kitty, SNT Gatchaman, Mij and 3 others like this.
  5. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    The raw data is now on mapmecfs.org.
     
  6. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    Already had a quick peek and it seems that there is quite a lot of overlap between the two groups if you plot the difference between day 1 and day 2. Here's, for example, the workload at the ventilatory threshold for the total sample, which in the past showed the biggest differences.

    upload_2024-8-29_20-9-58.png
    (EDIT: The previous plot I posted had the same data, but each datapoint was accidentally shown multiple times.)

    Perhaps this is not the best statistical approach but if I do a t-test on the difference between day1 and day2, I get a p-value of 0.17 and a standardized effect size of 0.12.
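
    The comparison described above (a t-test on each participant's day 2 minus day 1 change, compared between groups) can be sketched with the standard library alone. The numbers below are made up for illustration and are not taken from the mapmecfs.org data. Welch's unequal-variance form is an assumption, since the post does not say which t-test variant was used.

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two independent samples."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

# Hypothetical day 2 minus day 1 changes in workload at the ventilatory threshold (watts)
mecfs_change = [-12, -5, 3, -8, 0, -15, 6, -4]
hc_change = [-6, 2, 4, -9, 1, -3, 7, -2]

t_stat, dof = welch_t(mecfs_change, hc_change)
d = cohens_d(mecfs_change, hc_change)
print(f"t = {t_stat:.2f}, df = {dof:.1f}, Cohen's d = {d:.2f}")
```

    Turning the t statistic into a p-value needs the t distribution, which the standard library does not provide; a statistics package such as SciPy would be the usual route.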

    Plan to take a closer look in the coming days. Kudos to the authors for putting the data online.
     
    Last edited: Aug 29, 2024 at 8:31 PM
    Kitty, Michelle, chillier and 4 others like this.
  7. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    I wanted to visualize what the differences in the matched pairs were individually, to see how many people in the ME group had a larger difference in the metric than their matched control. If pretty much all pwME had a larger drop in VO2 or another metric than their matched counterpart, that'd be pretty impressive and promising as a biomarker. Now, I did this really quickly and may have made some mistakes, but I manually checked a couple points and they matched the raw data. On first glance, it doesn't look like it's that cut and dry.

    Each x value is a matched pair. So x=1 will have two dots, one for the HC (blue) and one for the MECFS (red) in the first VO2peak matched pair. The y value is the difference in the chosen metric at VAT between the two CPETs. I ordered by the HC's difference just to make it a little easier to look at, but this order isn't otherwise meaningful.

    Lots of MECFS (red dots) both below and above their corresponding blue dot in all metrics that the study showed were significant for pwME but not for HC.
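
    To put a number on "how many red dots sit below their blue dot", one option is to count the pairs and run a sign test. This is only a sketch on made-up pairs, not the study data; the sign test is a swapped-in technique that neither the post nor the paper is known to have used.

```python
from math import comb

def sign_test_p(k, n):
    """Two-sided exact binomial (sign test) p-value for k of n pairs, assuming p = 0.5."""
    tail = sum(comb(n, i) for i in range(0, min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical (ME/CFS change, matched HC change) per pair,
# for example the percent change in VO2 at VAT between the two CPETs
pairs = [(-10.0, -2.0), (4.0, -3.0), (-7.5, 1.0), (-1.0, -6.0), (2.0, 5.0), (-12.0, -4.0)]

# Ties carry no sign information, so they are dropped before counting
informative = [(m, h) for m, h in pairs if m != h]

# A pair counts when the ME/CFS participant changed more negatively than the control
n_lower = sum(1 for m, h in informative if m < h)
p_value = sign_test_p(n_lower, len(informative))

print(f"{n_lower}/{len(informative)} pairs with ME/CFS below HC, sign test p = {p_value:.4f}")
```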

    VO2plot.png Ve_VCO2plot.png VO2_HRplot.png PETCO2plot.png

    Edit: Here are the graphs using percentage difference between days instead of absolute difference.

    VO2_pct_diff_plot.png Ve_VCO2_pct_diff_plot.png VO2_HR_pct_diff_plot.png PETCO2_pct_diff_plot.png

    Also, for 23.8% of ME/CFS patients (using full cohort), their VO2 at max increased on the second day. For 33.3% their VO2 at VAT increased.

    Here is the Python code that can be run in Jupyter in case anyone wants to verify the code to make these graphs is correct:
    Code:
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    # Load the data from a TSV file
    file_path = 'cpet_clinical_data.tsv'
    df = pd.read_csv(file_path, sep='\t')
    
    df_matched = df.dropna(subset=['matched_pair']).copy()
    
    metrics = [
        'DBP',        # Diastolic Blood Pressure
        'HR',         # Heart Rate
        'OUES',       # Oxygen Uptake Efficiency Slope
        'PETCO2',     # Partial Pressure of End-Tidal CO2
        'PETO2',      # Partial Pressure of End-Tidal O2
        'PP',         # Pulse Pressure
        'RER',        # Respiratory Exchange Ratio
        'RPE',        # Rating of Perceived Exertion
        'RPM',        # Revolutions per Minute (pedal cadence, presumably)
        'RPP',        # Rate Pressure Product
        'RR',         # Respiratory Rate
        'SBP',        # Systolic Blood Pressure
        'VCO2',       # Carbon Dioxide Production
        'VO2',        # Oxygen Consumption
        'VO2_HR',     # Oxygen Consumption per Heart Rate
        'VO2_t',      # Time for VO2 Measurement
        'Ve_BTPS',    # Ventilation (BTPS)
        'Ve_VCO2',    # Ventilatory Equivalent for CO2
        'Ve_VO2',     # Ventilatory Equivalent for O2
        'Vt_BTPS_L',  # Tidal Volume (BTPS in Liters)
        'wkld'        # Workload
    ]
    
    # Convert all metric columns to numeric, coercing errors
    for metric in metrics:
        df_matched[metric] = pd.to_numeric(df_matched[metric], errors='coerce')
    
    # Split the data by time points
    time_points = ['AT', 'max', 'rest']
    
    # Initialize an empty list to store differences
    diff_data = []
    
    for time_point in time_points:
        # Filter data for the current time point
        df_tp = df_matched[df_matched['Time_Point'] == time_point]
      
        # Split the data into Day 1 and Day 2
        day1 = df_tp[df_tp['Study_Visit'] == 'D1']
        day2 = df_tp[df_tp['Study_Visit'] == 'D2']
      
        # Merge day1 and day2 on ParticipantID
        merged = pd.merge(day1, day2, on=['ParticipantID', 'matched_pair', 'sex', 'Time_Point', 'race', 'phenotype'], suffixes=('_D1', '_D2'))
      
        # Calculate the percentage difference for each metric
        for metric in metrics:
            merged[metric + '_pct_diff'] = ((merged[metric + '_D2'] - merged[metric + '_D1']) / merged[metric + '_D1']) * 100
      
        # Keep only relevant columns and add time point information
        diff_columns = ['ParticipantID', 'matched_pair', 'sex', 'Time_Point', 'phenotype'] + [metric + '_pct_diff' for metric in metrics]
        merged['Time_Point'] = time_point
        diff_data.append(merged[diff_columns])
    
    # Combine the differences for all time points
    cpet_diff = pd.concat(diff_data)
    
    
    # Define the metric you want to plot
    metric_to_plot = 'PETCO2'  # Change this to the metric you want to plot
    
    timepoint_to_plot = 'max' # Change this to the time point you want to plot [max, AT, rest]
    
    # Filter the data for the chosen time point
    timepoint_data = cpet_diff[cpet_diff['Time_Point'] == timepoint_to_plot].copy()
    
    # Separate 'HC' phenotype data and sort it by the selected metric
    hc_data = timepoint_data[timepoint_data['phenotype'] == 'HC']
    hc_data_sorted = hc_data.sort_values(by=metric_to_plot + '_pct_diff')
    
    # Create a mapping from matched_pair to its sorted index
    matched_pair_order = {mp: i for i, mp in enumerate(hc_data_sorted['matched_pair'])}
    
    # Reorder the 'matched_pair' column based on the sorted indices
    timepoint_data['matched_pair_order'] = timepoint_data['matched_pair'].map(matched_pair_order)
    
    # Set up the plot
    plt.figure(figsize=(12, 8))
    
    # Create a scatter plot for the chosen time point, ordered by matched_pair_order
    sns.scatterplot(
        data=timepoint_data,                 
        x='matched_pair_order',       
        y=metric_to_plot + '_pct_diff',   
        hue='phenotype',             
        palette={'MECFS': 'red', 'HC': 'blue'},
        s=100                         
    )
    
    # Customizing the plot
    plt.xlabel('Matched Pair')
    plt.ylabel(f'Percentage Difference in {metric_to_plot}')
    plt.title(f'Percentage Difference in {metric_to_plot} for Time Point: {timepoint_to_plot}')
    
    
    # Rotate x-axis labels to prevent overlap
    plt.xticks(ticks=range(len(matched_pair_order)), labels=[int(mp) for mp in matched_pair_order.keys()], rotation=45, ha='right')
    
    plt.legend(title='Phenotype', bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.grid(True)
    plt.tight_layout()
    
    plt.savefig(f'{metric_to_plot}_{timepoint_to_plot}_pct_diff_plot.png', bbox_inches='tight')
    
    # Show the plot
    plt.show()
    
    
    
    #####
    ##### The following just computes the percentage of MECFS whose VO2 at max increased.
    #####
    
    # Filter the original df for ME/CFS participants at the max time point
    mecfs_at_data = df[(df['phenotype'] == 'MECFS') & (df['Time_Point'] == 'max')]
    
    # Split the data into Day 1 (D1) and Day 2 (D2)
    day1 = mecfs_at_data[mecfs_at_data['Study_Visit'] == 'D1']
    day2 = mecfs_at_data[mecfs_at_data['Study_Visit'] == 'D2']
    
    # Merge Day 1 and Day 2 data on ParticipantID
    merged_mecfs = pd.merge(day1, day2, on=['ParticipantID', 'Time_Point'], suffixes=('_D1', '_D2'))
    
    # Calculate the percentage of ME/CFS participants where VO2 increased from D1 to D2
    total_mecfs = merged_mecfs.shape[0]
    mecfs_increased = merged_mecfs[merged_mecfs['VO2_D2'] > merged_mecfs['VO2_D1']].shape[0]
    
    # Calculate the percentage
    if total_mecfs > 0:
        percent_increased = (mecfs_increased / total_mecfs) * 100
    else:
        percent_increased = 0
    
    # Print the result
    print(f"Percentage of ME/CFS participants with increased VO2 at max from D1 to D2: {percent_increased:.2f}%")
    
     
    Last edited: Aug 29, 2024 at 3:25 PM
    Kitty, Michelle, Nightsong and 3 others like this.
  8. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    Interesting graphs @forestglip, the red ME/CFS dots seem scattered around the blue HC ones without a clear pattern.

    Has anyone been able to replicate their calculation of effect sizes?

    For example for VO2 (ml.kg−1.min−1) at maximal exercise, they report mean values for the ME/CFS group of 20.8 on day1 and 19.7 on day 2. They report an effect size for this of 0.33.

    But when I tried to calculate this I got 0.218 using different methods described here:
    https://real-statistics.com/students-t-distribution/paired-sample-t-test/cohens-d-paired-samples/

    EDIT: or 0.448 when dividing the mean of the differences by the sd of the differences as described here:
    https://stats.stackexchange.com/questions/598615/effect-size-for-paired-t-test
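
    Part of the spread between these numbers may simply come from which standard deviation goes in the denominator. Below is a sketch of two common conventions on made-up data: d_z divides the mean difference by the SD of the differences (the second method above), while d_av divides by the average of the two days' SDs. The names d_z and d_av are labels chosen here, and nothing in the thread confirms which convention the paper used.

```python
from statistics import mean, stdev

def paired_effect_sizes(day1, day2):
    """Return (d_z, d_av) for paired measurements taken on two days."""
    diffs = [b - a for a, b in zip(day1, day2)]
    mean_diff = mean(diffs)
    d_z = mean_diff / stdev(diffs)                        # SD of the paired differences
    d_av = mean_diff / ((stdev(day1) + stdev(day2)) / 2)  # average of the two SDs
    return d_z, d_av

# Hypothetical VO2 values (ml.kg-1.min-1) for six participants on day 1 and day 2
day1 = [20.0, 25.0, 18.0, 30.0, 22.0, 27.0]
day2 = [19.0, 23.5, 18.5, 28.0, 21.0, 26.5]

d_z, d_av = paired_effect_sizes(day1, day2)
print(f"d_z = {d_z:.3f}, d_av = {d_av:.3f}")
```

    When day 1 and day 2 values are strongly correlated within participants, the SD of the differences is much smaller than either day's SD, so d_z comes out larger in magnitude than d_av.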
     
    Last edited: Aug 29, 2024 at 6:08 PM
  9. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    I've tried to use JASP, the statistical program Keller et al. used and it gave the same result as the method explained in the quote above.

    upload_2024-8-29_17-34-4.png

    If I calculate the mean and std I get the same results as in the paper so I don't think I've made an error in data extraction. Anyone who can explain the difference?
     
    Kitty, forestglip and Peter Trewhitt like this.
  10. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    The authors wrote:
    I see that Heart Rate (HR) is included as a variable in the data, but not the percentage of age-predicted maximum heart rate. Did anyone find something about this in the paper or data, or how they might have calculated this?
     
    Kitty and Peter Trewhitt like this.
  11. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    Someone also pointed out to me that the effect sizes reported for VO2 (ml/min) are sometimes quite different from those for VO2 (ml.kg−1.min−1).

    For example, in Table 3, for the matched pairs at the anaerobic threshold, ME/CFS patients had an effect size of 0.16 for VO2 (ml/min) but an effect size of 0.21 for VO2 (ml.kg−1.min−1), an increase of more than 30% that is unlikely to be due to rounding error. These are the same measures, but the latter is standardised for weight. In the dataset provided, the weight entered for each participant is the same on day 1 as on day 2, so this cannot explain the difference.
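
    One thing that may be worth checking: even with identical weights on both days, dividing by weight rescales each participant by a different factor, so a standardized effect size does not have to match between the two units. A small sketch on made-up numbers (not the study data), using the mean difference divided by the SD of the differences as an assumed effect size definition:

```python
from statistics import mean, stdev

def paired_d(day1, day2):
    """Mean of the paired differences divided by their standard deviation."""
    diffs = [b - a for a, b in zip(day1, day2)]
    return mean(diffs) / stdev(diffs)

# Hypothetical participants; each weight is the same on day 1 and day 2
weight_kg = [55.0, 70.0, 90.0, 110.0]
vo2_day1 = [1100.0, 1300.0, 1500.0, 1600.0]  # ml/min
vo2_day2 = [1050.0, 1290.0, 1400.0, 1590.0]  # ml/min

# Weight-standardized values (ml/kg/min) derived from the same measurements
rel_day1 = [v / w for v, w in zip(vo2_day1, weight_kg)]
rel_day2 = [v / w for v, w in zip(vo2_day2, weight_kg)]

d_abs = paired_d(vo2_day1, vo2_day2)
d_rel = paired_d(rel_day1, rel_day2)
print(f"d (ml/min) = {d_abs:.3f}, d (ml/kg/min) = {d_rel:.3f}")
```

    Whether this accounts for a gap as large as 0.16 versus 0.21 in the real data is a separate question that would need the actual values.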
     
    Kitty and Peter Trewhitt like this.
  12. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    Are the reported mean metric values correct? Looking at Table 3 in the study, the four values for the full cohort of controls on those two metrics across both days are 12.7 and 11.8 (standardized to weight) and 960.0 and 934.5 (not standardized). The percentage decrease for the weight-standardized metric is 7.1% and for the non-standardized one 2.7%. These percentages should be identical, right?
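
    The arithmetic behind this check is short enough to write down. The first part uses the four Table 3 means exactly as quoted in this post; the second part uses made-up participants to show that the two percentages need not agree exactly when body weights differ, because a mean of per-kg values is not the same as the mean VO2 divided by the mean weight.

```python
from statistics import mean

def pct_decrease(day1, day2):
    """Percentage decrease from day 1 to day 2, relative to day 1."""
    return (day1 - day2) / day1 * 100

# Group means for the full control cohort as quoted in the post
pct_per_kg = pct_decrease(12.7, 11.8)      # weight-standardized VO2
pct_absolute = pct_decrease(960.0, 934.5)  # absolute VO2 (ml/min)
print(f"per kg: {pct_per_kg:.1f}%, absolute: {pct_absolute:.1f}%")

# Hypothetical two-person cohort with different body weights
weights = [50.0, 100.0]
vo2_d1 = [1000.0, 1000.0]
vo2_d2 = [900.0, 1000.0]

abs_change = pct_decrease(mean(vo2_d1), mean(vo2_d2))
rel_change = pct_decrease(
    mean(v / w for v, w in zip(vo2_d1, weights)),
    mean(v / w for v, w in zip(vo2_d2, weights)),
)
print(f"toy cohort: absolute {abs_change:.2f}%, per kg {rel_change:.2f}%")
```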
     
  13. Nightsong

    Nightsong Senior Member (Voting Rights)

    Messages:
    417
    Haven't been following this thread, but it was my understanding that age-predicted maximum HR is usually calculated based on standard formulae (often just 220 - age). Below are some relevant snippets from exercise physiology/testing texts (left: Wasserman's Principles of Exercise Testing and Interpretation; right: ACSM's Guidelines for Exercise Testing):

    Wasserman_Principles_Exercise_Testing_and_Interpretation__PeakHR.jpg ACSM_Guidelines_HR_max.jpg
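
    As a minimal sketch of the rule of thumb mentioned above, assuming the plain 220 - age formula (nothing in this thread confirms which formula the paper actually used) and a hypothetical participant:

```python
def predicted_hr_max(age_years):
    """Age-predicted maximum heart rate using the 220 - age rule of thumb."""
    return 220 - age_years

def pct_predicted_hr_max(peak_hr, age_years):
    """Peak heart rate reached in the test as a percentage of the age-predicted maximum."""
    return peak_hr / predicted_hr_max(age_years) * 100

# Hypothetical participant: 40 years old, peak HR of 162 bpm during the CPET
pct = pct_predicted_hr_max(peak_hr=162, age_years=40)
print(f"{pct:.1f}% of age-predicted HRmax")
```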
     
  14. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    I just checked all means for those two metrics (VO2 and VO2/kg). The only one that didn't match up with my calculation was the full cohort of controls on day 1. I got 12.18 instead of 12.7. This shouldn't affect the effect sizes you were talking about, though, @ME/CFS Skeptic.
     
  15. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    I made some charts to see if there is any obvious indicator that decreases in CPET metrics correlate to deconditioning/fitness. I charted using VO2peak as the deconditioning metric, as well as percentage of 24 hours lying or sitting, number of hours in bed during 24 hr day, and BMI. Greater drops in performance on VO2 or workload are lower on the chart.


    VO2peak on day 1 vs. change in VO2 at VAT
    D1_max_VO2__AT_VO2_diff_percentage.png

    VO2peak on day 1 vs. change in workload at VAT
    D1_max_VO2__AT_wkld_diff_percentage.png


    Hours in bed vs. change in VO2 at VAT
    hrs_vo2.png

    Hours in bed vs. change in workload at VAT
    hrs_wkld.png


    Time reclined vs. change in VO2 at VAT
    reclined_vo2.png

    Time reclined vs. change in workload at VAT
    reclined_wkld.png


    BMI vs. change in VO2 at VAT
    bmi_vo2.png

    BMI vs. change in workload at VAT
    bmi_wkld.png


    Correlations between these variables for the full cohort of MECFS and HC:
    upload_2024-8-30_18-16-38.png

    Edit: I'm not completely sure what the "percentage of 24 hours lying or sitting" (q_reclined) metric means. How can it be 0% for so many people if humans normally require sleep? If it doesn't include time spent sleeping, how can it be 100% of the day in others? Maybe it's percentage of waking hours, not percentage of 24 hours.
     
    Last edited: Aug 31, 2024 at 12:46 AM
    Kitty, Michelle and Peter Trewhitt like this.
  16. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    515
    I checked a couple other potential fitness metrics (VO2 peak normalized to weight on day 1 and VO2 at VAT on day 1) against the difference metrics, but nothing stood out to me there either.

    I don't see anything here that makes me think deconditioning is associated with a decrease on second day CPET. (Edit: Well, within this one study. Maybe compared to non-deconditioned control groups from other 2-day CPET studies, the small mean effect in the controls here might provide some evidence of that.) (Edit2: Maybe a small correlation here on some of the deconditioning vs. difference metrics, but with a high p/q value, so should be looked at with a larger, more varied sample.)

    Though I also don't see good evidence here that 2-day CPET is useful for classifying these two groups of ME/CFS and sedentary controls.

    The fact that so many people in both groups had increases in VO2 at VAT all the way up to 20% higher on day 2 (and a few even higher) makes me think there is a lot of natural day to day variation, and it would probably take at least more than a 20% decrease to have good specificity.

    Looking at the charts above, it does look like most of the people who decreased more than 20% are ME/CFS (looks like about 11/13 people). Maybe these are the most severe. I'll have to check how Bell Activity Score correlates with the difference in days.
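
    The idea of a cutoff can be made concrete by computing sensitivity and specificity for a given threshold. This is a sketch on made-up percent changes, not the study data; the -20% cutoff is taken from the discussion above and the function name is hypothetical.

```python
def threshold_performance(mecfs_changes, hc_changes, cutoff=-20.0):
    """Treat a day 1 to day 2 percent change below `cutoff` as a positive test."""
    tp = sum(1 for x in mecfs_changes if x < cutoff)  # ME/CFS flagged
    fp = sum(1 for x in hc_changes if x < cutoff)     # controls flagged
    sensitivity = tp / len(mecfs_changes)
    specificity = 1 - fp / len(hc_changes)
    # Share of flagged participants who are ME/CFS (undefined if nobody is flagged)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, specificity, ppv

# Hypothetical percent changes in VO2 at VAT from day 1 to day 2
mecfs = [-35.0, -22.0, -5.0, 10.0, -28.0, 3.0, -12.0, 18.0, -21.0, -1.0]
hc = [-8.0, 4.0, -25.0, 12.0, -3.0, 20.0, -15.0, 6.0, 1.0, -10.0]

sens, spec, ppv = threshold_performance(mecfs, hc)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, PPV = {ppv:.2f}")
```

    A figure like "11 of 13 below the cutoff are ME/CFS" corresponds to the last of these three numbers, and it depends on how many people are in each group as well as on the test itself.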

    ---

    Maybe a cohort that includes patients mild through severe would show a clearer effect, though that would be difficult to do for obvious reasons.

    Would also be good to have maybe at least a week of actigraphy data to get more accurate data on deconditioning in HC and disease severity in ME/CFS.

    Also, as I think I said elsewhere, maybe the second test should be done after 48 or 72 hours, instead of 24, to be sure most patients are experiencing PEM.
     
    Last edited: Aug 31, 2024 at 12:31 PM
  17. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,118
    The easiest thing for such studies to do would in my eyes be to ask patients "are you currently experiencing PEM" and rate it on a scale of 1-10 (the scale is a bit arbitrary and someone probably has a better idea, but the idea would be to get some notion for where the patients were during the 1st vs 2nd time) during the first and second CPET. I'm not sure why people aren't doing this.

    Wouldn't this be the most straightforward and obvious thing to do?
     
  18. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    I think the problem might be that people would interpret PEM very differently.
     
    Last edited: Aug 31, 2024 at 9:35 AM
  19. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,842
    Location:
    Belgium
    Yes, I suspect the data refutes rather than validates previous findings on 2-day CPET.

    Haven't been able to analyse everything but it looks like the authors did two types of significance tests:
    • They looked at changes over time (from day 1 to day 2) in each group separately and calculated an effect size for this. Then if the effect was significant in the ME/CFS but not or less so in the control group, they highlighted this in the paper and abstract.
    • In tables 2 and 3 they also tested between group differences. The legend of tables 2 and 3 for example say: 'a p ≤ 0.05, aa p ≤ 0.01 between groups for CPET-1, b p ≤ 0.05, bb p ≤ 0.01 between groups for CPET-2.' But it seems that these test the difference for each day separately rather than the difference between days.
    So in my view the test that really matters is one that compares the differences (from day 1 to day 2) between groups (ME/CFS versus controls). I don't think they have done this, and an independent t-test of these differences showed a really small effect that was not significant for VO2 and workload.

    I'm not sure if a t-test of the differences would be the best approach. One alternative would be an ANCOVA of day 2 CPET values that controls for day 1 CPET as a covariate.
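
    A rough sketch of what that ANCOVA estimates, assuming the model day2 = intercept + b1 * day1 + b2 * group with group coded 0 for HC and 1 for ME/CFS. The group coefficient b2 is obtained here by first removing the day 1 trend from both day 2 and the group indicator, then regressing one set of residuals on the other (this gives the same b2 as fitting the full model). The data are made up, and this only returns the point estimate, not a p-value.

```python
from statistics import mean

def residuals(y, x):
    """Residuals of y after a simple least-squares regression on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y)]

def ancova_group_effect(day1, day2, group):
    """Group coefficient in day2 ~ intercept + day1 + group (group coded 0/1)."""
    ry = residuals(day2, day1)   # day 2 with the day 1 trend removed
    rg = residuals(group, day1)  # group indicator with the day 1 trend removed
    return sum(g * y for g, y in zip(rg, ry)) / sum(g * g for g in rg)

# Hypothetical workload at VAT (watts); group 0 = HC, 1 = ME/CFS
day1 = [60.0, 75.0, 90.0, 110.0, 55.0, 70.0, 85.0, 100.0]
day2 = [58.0, 77.0, 88.0, 111.0, 50.0, 66.0, 84.0, 93.0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

effect = ancova_group_effect(day1, day2, group)
print(f"adjusted day 2 difference (ME/CFS minus HC): {effect:.2f} W")
```

    For standard errors and p-values, a regression package would be the usual route, for example a formula such as `day2 ~ day1 + C(group)` in statsmodels.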
     
    Last edited: Aug 31, 2024 at 10:37 AM
  20. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,118
    Sure and there will be problems related to "cognitive function/cognitive PEM" that for example wouldn't be picked up, but at least for each individual there should be some consistency right? That is to say "are you feeling like you are experiencing more PEM on the first test vs second test" should tell us what this person is feeling to some degree. It could also help us understand whether the whole undertaking of going there, possibly by plane etc had a similarly exhausting effect as the 1st CPET or not.

    I'm no statistician, but surely there would be a way to analyse the data in a somewhat meaningful way (comparing average decline in pwME vs HC in workload at the ventilatory threshold and also looking at the average of "experience of PEM on 1st day vs 2nd day" seems unsuitable if people are interpreting PEM very differently and group differences would just arbitrarily average out such effects, but an analysis looking at 1 pwME vs 1 HC by taking into account the "experience of PEM on 1st day vs 2nd day" could possibly be sensible)?

    It could also be helpful to know that we're getting useful data on HCs. Because unless you are using deconditioned controls, where muscle pain on the second day should be expected, you would expect HCs to feel good on both attempts, right?

    On a more philosophical note, I am wondering how sensible this argument is. These people are participating in this procedure essentially because they have described themselves as people who experience PEM in general (the problem might be that they have described experiencing PEM in general rather than PEM following a CPET, which this procedure hopes to measure in some related form) and because the procedure is supposed to measure exactly that (more precisely, it measures the effects of physical exercise in the hope of capturing something related to PEM). If people's interpretations of what they experience during the two rounds are so different that asking about their experience is uninterpretable or doesn't yield usable data, it appears to me one has a dilemma: "we believe this measures something related to PEM because this is what the people have said" versus "we can't ask them whether it measures something related to PEM because we cannot rely on their different interpretations of the PEM experience". How sensible is the procedure in the first place?

    Essentially I’m not able to see how both statements below could make sense at the same time:
    Person says he experiences PEM in day to day life-> Hope to measure effects of experiencing PEM via CPETs
    but also
    Cannot ask whether person experiences PEM at CPETs because interpretation of PEM is different for everyone
     
    Last edited: Aug 31, 2024 at 10:35 AM
    forestglip, Kitty and Peter Trewhitt like this.
