Cardiopulmonary and metabolic responses during a 2-day CPET in [ME/CFS]: translating reduced oxygen consumption [...], Keller et al, 2024

Discussion in 'ME/CFS research' started by Nightsong, Jul 5, 2024.

  1. EndME

    EndME Senior Member (Voting Rights)

    Messages:
    1,204
    Would be good for someone with a background in exercise physiology to weigh in on this.
    I have had a look at “Cardiopulmonary Exercise Test Methodology for Assessing Exertion Intolerance in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome” https://www.frontiersin.org/journals/pediatrics/articles/10.3389/fped.2018.00242/full which is a guideline for performing 2-day CPETs in ME/CFS by van Ness (the person who first published on this subject in ME/CFS) and others. The guideline is very much focused on “this test provokes PEM” and the authors state that “CPET also elicits a robust post-exertional symptom flare (termed, post-exertional malaise)”.

    Unfortunately I didn't find any information on how this had been ensured or was somehow quantified (I may have simply missed it).

    Other than that, the authors do stress the importance of, and give ways to ensure, that patients are at their “usual rested levels” before the first CPET procedure, which, if always adhered to, would reduce my worries about people being more exhausted than usual going into the first test.

     
    forestglip and Peter Trewhitt like this.
  2. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    Later in the paper they say:
     
    Last edited: Aug 31, 2024
  3. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    Not a lot of correlation using the Bell scale either. Lower scores indicate more severe disability.

    BAS vs change in VO2 at VAT:
    bas_vo2diff.png

    BAS vs change in workload at VAT:
    bas__wkld_diff.png

    Correlations (only MECFS):
    upload_2024-8-31_10-7-35.png

    Correlations (full cohort):
    upload_2024-8-31_10-10-39.png
     
    ME/CFS Skeptic likes this.
  4. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    Seems that you are right: the legend under table 4 says:

    "%pred HRmax actual peak HR/(220-age)"​
     
    Kitty, forestglip and Nightsong like this.
  5. Amw66

    Amw66 Senior Member (Voting Rights)

    Messages:
    6,769
     
    Kitty and forestglip like this.
  6. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    I've tried to calculate the differences between CPET1 and CPET2 for each group and then compared them using a t-test. Here's what I got for the peak values:

    Most of the effect sizes are quite small and none would be statistically significant if one corrected for multiple tests (a sketch of such a correction is at the end of this post).
    upload_2024-9-8_12-28-40.png

    Here are the values at the ventilatory threshold. The differences seem even smaller here, especially for Work (wkld) and VO2.

    upload_2024-9-8_12-31-29.png
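    For anyone who wants to check, here is a minimal sketch (my own illustration, not the authors' code) of running Welch t-tests on the D2 − D1 differences and then applying a Holm correction across the tested metrics. It assumes the column names in the shared TSV and a single 'max' row per participant per visit; the metric list is just an illustrative subset.
    Code:
    import pandas as pd
    from scipy import stats
    from statsmodels.stats.multitest import multipletests
    
    cpet = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    
    metrics = ['VO2', 'wkld', 'HR', 'RER']  # illustrative subset of the outcomes
    for m in metrics:
        cpet[m] = pd.to_numeric(cpet[m], errors='coerce')
    
    peak = cpet[cpet['Time_Point'] == 'max']
    d1 = peak[peak['Study_Visit'] == 'D1'].set_index('ParticipantID')
    d2 = peak[peak['Study_Visit'] == 'D2'].set_index('ParticipantID')
    
    pvals = []
    for m in metrics:
        diff = (d2[m] - d1[m]).dropna()              # CPET2 minus CPET1 per participant
        grp = d1.loc[diff.index, 'phenotype']
        _, p = stats.ttest_ind(diff[grp == 'MECFS'], diff[grp == 'HC'], equal_var=False)
        pvals.append(p)
    
    # Holm correction across the metrics tested above
    reject, p_adj, _, _ = multipletests(pvals, method='holm')
    print(dict(zip(metrics, p_adj)))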
     
    Kitty and forestglip like this.
  7. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    Should this maybe be using percent decrease to account for different body sizes?

    I didn't even really look at day 2 peak when I was analyzing because if they used scores even when people quit before hitting max, it's no longer a very objective measurement.
     
    Kitty and ME/CFS Skeptic like this.
  8. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    Yes good point. I still have to look at these criteria so the values I reported above used all the data and will probably be quite different once I restrict the analysis to those who met the required thresholds.
     
    Kitty, MEMarge and forestglip like this.
  9. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    Had a look, but surprisingly it seems that only 10 participants did not meet the maximum effort criteria (at least two of: RER ≥ 1.10, peak HR ≥ 85% of the age-predicted maximum of 220 − age, and RPE ≥ 17).
    Here's how I implemented this in my code (using Python) - hopefully somebody can check and try to replicate.
    Code:
    import pandas as pd
    
    # Load the shared dataset and make HR numeric (a missing HR then simply fails the HR criterion)
    df_original = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    df_original['HR'] = pd.to_numeric(df_original['HR'], errors='coerce')
    
    # Peak HR as a fraction of the age-predicted maximum (220 - age)
    df_original['HR_predicted'] = df_original['HR'] / (220 - df_original['age'])
    
    # Keep only the peak-exercise rows
    df = df_original[df_original['Time_Point'] == 'max']
    
    # Maximum effort = at least two of the three criteria met
    criteria = (
        (df['RER'] >= 1.10).astype(int) +
        (df['HR_predicted'] >= 0.85).astype(int) +
        (df['RPE'] >= 17).astype(int)
    ) >= 2
    
    excluded_participants = df[~criteria]['ParticipantID'].unique()
    The results look like this. These are the 10 excluded participants, of which 8 were ME/CFS patients:
    upload_2024-9-8_22-14-7.png

    And here's an overview of how many participants did not reach one of the three criteria:

    upload_2024-9-8_22-14-38.png

    Because these were only 10 patients, excluding them did not have a large effect on the effect sizes:

    upload_2024-9-8_22-58-17.png

    The authors probably excluded the 2 HC and the 2 MECFS that did not reach the criteria on day 1 (thus not excluding the 6 MECFS patients who only failed the criteria on day 2). I've compared those results with those where all 10 patients who did not meet the threshold were excluded:

    upload_2024-9-8_22-58-24.png

    So all in all, I think these are minor differences that do not really matter in this paper. Strange that they did not mention this in the paper, though.
     
    RedFox, Kitty and forestglip like this.
  10. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    Just checking the excluded participants, I got a couple of things different. Number of observations below .85 of predicted HR = 88. That was 38 MECFS and 22 HC. This was probably just missing PI-057 on D2, the only participant to have no HR value for day 2.

    PI-043 was excluded for day 1, not day 2.

    Exclusions for day 1 and day 2:
    Screenshot from 2024-09-08 19-51-40.png Screenshot from 2024-09-08 19-51-55.png

    Code I used:
    Code:
    import pandas as pd
    
    cpet_data = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    
    # Only max timepoint
    cpet_data_max = cpet_data[cpet_data['Time_Point'] == 'max'].copy()
    
    # Convert to numbers
    cpet_data_max['HR'] = pd.to_numeric(cpet_data_max['HR'], errors='coerce')
    
    # Get age-predicted max HR
    cpet_data_max['Predicted_HR'] = 220 - cpet_data_max['age']
    
    # Create columns for satisfying criteria
    cpet_data_max['RER_include'] = cpet_data_max['RER'] >= 1.10
    cpet_data_max['HR_include'] = cpet_data_max['HR']/cpet_data_max['Predicted_HR'] >= 0.85
    cpet_data_max['RPE_include'] = cpet_data_max['RPE'] >= 17
    
    # Create column for whether they satisfied at least two criteria
    cpet_data_max['at_least_two_true'] = (cpet_data_max[['RER_include', 'HR_include', 'RPE_include']].sum(axis=1) >= 2)
    
    # Dataframe for all rows that did not satisfy at least two criteria
    exclusions_df = cpet_data_max[cpet_data_max['at_least_two_true'] == False][['ParticipantID', 'phenotype', 'Study_Visit', 'Time_Point', 'RER_include', 'HR_include', 'RPE_include']]
    
    print(exclusions_df[exclusions_df['Study_Visit'] == 'D1'])
    print(exclusions_df[exclusions_df['Study_Visit'] == 'D2'])
     
    Kitty and ME/CFS Skeptic like this.
  11. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    I get the same Cohen's d values, but slightly different p-values from you. Most are the same, but for example, for wkld, I got .09 instead of .08.

    For max:
    upload_2024-9-8_21-51-10.png

    It's kind of a mess, but if you want to look at my code:

    Code:
    from numpy import var, mean, sqrt
    from scipy import stats
    from pandas import Series
    import pandas as pd
    
    def cohend(d1: Series, d2: Series) -> float:
    
        # calculate the size of samples
        n1, n2 = len(d1), len(d2)
    
        # calculate the variance of the samples
        s1, s2 = var(d1, ddof=1), var(d2, ddof=1)
    
        # calculate the pooled standard deviation
        s = sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    
        # calculate the means of the samples
        u1, u2 = mean(d1), mean(d2)
    
        # return the effect size
        return (u1 - u2) / s
    
    cpet_data = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    
    metrics = [
        'DBP',        # Diastolic Blood Pressure
        'HR',         # Heart Rate
        #'OUES',       # Oxygen Uptake Equivalent Slope
        'PETCO2',     # End-Tidal CO2
        'PETO2',      # End-Tidal O2
        'PP',         # Pulse Pressure
        'RER',        # Respiratory Exchange Ratio
        'RPE',        # Rating of Perceived Exertion
        'RPM',        # Pedal RPM
        'RPP',        # Rate Pressure Product
        'RR',         # Respiratory Rate
        'SBP',        # Systolic Blood Pressure
        'VCO2',       # Carbon Dioxide Production
        'VO2',        # Oxygen Consumption
        'VO2_HR',     # Oxygen Consumption per Heart Rate
        'VO2_t',      # Oxygen consumption (ml/min)
        'Ve_BTPS',    # Ventilation (BTPS)
        'Ve_VCO2',    # Ventilatory Equivalent for CO2
        'Ve_VO2',     # Ventilatory Equivalent for O2
        'Vt_BTPS_L',  # Tidal Volume (BTPS in Liters)
        'wkld',        # Workload
        'time_sec'
    ]
    
    # Convert all metric columns to numeric, coercing errors
    for metric in metrics:
        cpet_data[metric] = pd.to_numeric(cpet_data[metric], errors='coerce')
    
    # Split the data by time points
    time_points = ['AT', 'max', 'rest']
    
    # Initialize an empty list to store differences
    diff_data = []
    
    for time_point in time_points:
        # Filter data for the current time point
        df_tp = cpet_data[cpet_data['Time_Point'] == time_point]
     
        # Split the data into Day 1 and Day 2
        day1 = df_tp[df_tp['Study_Visit'] == 'D1']
        day2 = df_tp[df_tp['Study_Visit'] == 'D2']
     
        # Merge day1 and day2 on ParticipantID
        merged = pd.merge(day1, day2, on=['ParticipantID', 'matched_pair', 'sex', 'Time_Point', 'race', 'phenotype'], suffixes=('_D1', '_D2'))
     
        # Calculate the absolute difference for each metric
        for metric in metrics:
            merged[metric + '_abs_diff'] = merged[metric + '_D2'] - merged[metric + '_D1']
    
        # Calculate the percentage difference for each metric
        for metric in metrics:
            merged[metric + '_pct_diff'] = ((merged[metric + '_D2'] - merged[metric + '_D1']) / merged[metric + '_D1']) * 100
     
     
        # Tag the merged rows with the current time point and collect them
        merged['Time_Point'] = time_point
        diff_data.append(merged)
    
    # Combine the differences for all time points
    cpet_diff = pd.concat(diff_data)
    
    
    
     
    # List of absolute difference metrics
    abs_diff_metrics = [metric + '_abs_diff' for metric in metrics]
    
    # Dictionary to store DataFrames for each time point
    time_point_dfs = {}
    
    # Calculate differences, Cohen's d, and p-values for each time point
    for time_point in time_points:
        # Filter data for the current time point
        tp_data = cpet_diff[cpet_diff['Time_Point'] == time_point]
     
        # Initialize lists to store results
        results = []
     
        for metric in abs_diff_metrics:
            # Get data for each group
            mecfs_data = tp_data[tp_data['phenotype'] == 'MECFS'][metric].dropna()
            hc_data = tp_data[tp_data['phenotype'] == 'HC'][metric].dropna()
     
            # Calculate means
            mecfs_mean = mecfs_data.mean()
            hc_mean = hc_data.mean()
            mean_diff = mecfs_mean - hc_mean
     
            cohens_d = cohend(mecfs_data, hc_data)
     
            # Perform t-test
            t_stat, p_value = stats.ttest_ind(mecfs_data, hc_data)
     
            # Store results
            results.append({
                'Metric': metric,
                'Mean_Difference': mean_diff,
                'Cohens_d': cohens_d,
                'p_value': p_value
            })
     
        # Convert results to DataFrame
        df = pd.DataFrame(results)
     
        # Sort by p-value
        df = df.sort_values('p_value')
     
        # Store the DataFrame in the dictionary
        time_point_dfs[time_point] = df
    
    at_diff = time_point_dfs['AT']
    max_diff = time_point_dfs['max']
    rest_diff = time_point_dfs['rest']
    Edit: I'm confused about VO2 vs VO2_t. The code book says they are "Oxygen consumption (L/min)" and "Oxygen consumption (ml/min)", so shouldn't one just be 1000 times the other? Is VO2_t something else?
     
    Last edited: Sep 9, 2024
    Kitty and ME/CFS Skeptic like this.
  12. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    Yes that was the difference. I forgot to include him because he had no valid data for HR.

    Yes I got the same result but made an error in writing it down in my table/overview.

    The different p-values might be due to me using:

    t_value, p_value = stats.ttest_ind(difference_MECFS, difference_HC, equal_var=False)​

    Setting equal_var = False is called a Welch test and is often considered a better default (it is the default in R) because it does not assume that the variance in both groups is the same. But I don't think it makes a big difference.
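    To make the difference concrete, here is a toy illustration with simulated numbers (not the study data): when the two groups have unequal spread and unequal size, Student's and Welch's t-tests give slightly different p-values.
    Code:
    import numpy as np
    from scipy import stats
    
    rng = np.random.default_rng(0)
    a = rng.normal(0.0, 1.0, 30)   # e.g. per-participant D2-D1 differences, group 1
    b = rng.normal(0.5, 2.0, 25)   # group 2, with a larger spread and different n
    
    print(stats.ttest_ind(a, b).pvalue)                   # Student's t-test (assumes equal variances)
    print(stats.ttest_ind(a, b, equal_var=False).pvalue)  # Welch's t-test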

    Thanks so much for checking and replicating. This is really helpful. I think we got pretty much the same results.

    I plan to look at the correlations with the Bell scale today, will check if I get the same results as you.
     
    Kitty and forestglip like this.
  13. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    Can confirm changing equal_var to False makes the p values the same.

    I just wanted to check percent difference too, but it's not much more impressive. Workload at AT drops to last place. For max, I didn't remove the people who don't meet the criteria above.

    Percent decrease at max:
    upload_2024-9-9_8-26-13.png

    Percent decrease at AT:
    upload_2024-9-9_8-27-9.png

    For the correlations with the Bell score, I used Pearson, though that might not have been best as there are a couple of outliers. Unfortunately, I didn't do absolute differences for this analysis, and don't have time right now, but here is Spearman, with the same features from before (see the sketch after the plots below):

    MECFS only:
    upload_2024-9-9_8-56-40.png

    Full cohort:
    upload_2024-9-9_8-56-52.png
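    Here's a small sketch of the two correlation methods side by side on the day-to-day workload change at the ventilatory threshold. I'm assuming the Bell score sits in a column called 'BAS' (as in the plot labels above) — I haven't double-checked the exact column name in the TSV.
    Code:
    import pandas as pd
    from scipy import stats
    
    cpet = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    for col in ['wkld', 'BAS']:           # 'BAS' = Bell score column (assumed name)
        cpet[col] = pd.to_numeric(cpet[col], errors='coerce')
    
    at = cpet[cpet['Time_Point'] == 'AT']
    d1 = at[at['Study_Visit'] == 'D1'].set_index('ParticipantID')
    d2 = at[at['Study_Visit'] == 'D2'].set_index('ParticipantID')
    
    pair = pd.DataFrame({'wkld_diff': d2['wkld'] - d1['wkld'], 'bell': d1['BAS']}).dropna()
    print(stats.pearsonr(pair['wkld_diff'], pair['bell']))    # sensitive to outliers
    print(stats.spearmanr(pair['wkld_diff'], pair['bell']))   # rank-based, more robust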
     
    Last edited: Sep 9, 2024
    Kitty and ME/CFS Skeptic like this.
  14. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    @ME/CFS Skeptic Do you know what VO2_t is? It looks like it's a different ratio to VO2 for every person. For example VO2_t/VO2 for PI-002 is 75.00 at all timepoints and days. 100.91 for PI-003.

    Edit: Oh, VO2_t is ml/min. And VO2 is that but normalized to weight, using kilograms. The codebook just doesn't have the correct label.
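    A quick sanity check along those lines (my own sketch, same assumptions about the TSV as above): if VO2 is weight-normalised (ml/kg/min) and VO2_t is absolute (ml/min), then VO2_t / VO2 should be constant within each participant and equal to their body mass in kg.
    Code:
    import pandas as pd
    
    cpet = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    for col in ['VO2', 'VO2_t']:
        cpet[col] = pd.to_numeric(cpet[col], errors='coerce')
    
    # Per-participant ratio across all rows; std should be ~0 if the ratio is just body mass
    ratio = (cpet['VO2_t'] / cpet['VO2']).groupby(cpet['ParticipantID']).agg(['mean', 'std'])
    print(ratio.head())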
     
    Last edited: Sep 9, 2024
    Kitty and ME/CFS Skeptic like this.
  15. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    Thanks, got the same values as you.

    EDIT: this is an error see: https://www.s4me.info/threads/cardi...on-keller-et-al-2024.39219/page-4#post-552976
    I noticed that these average percentage changes are quite a bit smaller than if you calculate the percentage change based on the reported means. This is what we did for previous 2-day CPET studies because these summary statistics were all that was reported.

    So take for example Work at AT for the MECFS group: it goes from 51.2 on day 1 to 46.4 on day 2. In percentage terms that is a decrease of 9.4%. Based on the means you would expect that MECFS patients decrease by an average of 9.4%.

    But if you calculate the percentage change for each participant and then take the mean, the change is only 0.45%. That is a surprisingly big difference. I suppose it means that large changes were seen in those who had large baseline score differences?

    Got the same results as you. I don't know which one is more appropriate (Spearman or Pearson), but they are both really small and non-significant, so I suppose the message is clear enough.
     
    Last edited: Sep 9, 2024
    Kitty and forestglip like this.
  16. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    I think that VO2 is VO2 divided by the weight of the participant (so ml kg−1 min−1), while VO2_t is just the VO2 (ml/min); probably an error in the codebook.

    There's still something about these two values that doesn't add up, because they should result in exactly the same effect sizes but they often don't, with some small differences. I do not have an explanation for this because it seems that they used the same weight for each participant on day 1 and day 2.
     
    Kitty likes this.
  17. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    874
    Thanks for checking that. And yes, I was confused about the effect sizes, and my brain is kind of short circuiting trying to visualize it so I might be wrong, but I think effect sizes can be different for absolute difference, while for percentage difference they should be the same, which they are.

    That is a large difference. Regression to the mean, maybe?
     
    Kitty and ME/CFS Skeptic like this.
  18. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    My bad, I was comparing the difference within ME/CFS to the difference between groups - will adjust my comment above.
     
    forestglip and Kitty like this.
  19. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    I've now recalculated with the correct comparison of ME/CFS patients but it is still the same large difference:

    This calculation first takes the means, then expresses the change in means as a percentage
    (day2_MECFS.mean() - day1_MECFS.mean()) / day1_MECFS.mean() * 100
    Result: 9.4%

    This one takes the percentage change for each participant first, then takes the mean
    ((day2_MECFS - day1_MECFS) / day1_MECFS).mean() * 100
    Result: 0.08%
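    For anyone who wants to reproduce this, a minimal runnable version of the two calculations (assuming one AT row per participant per visit in the shared TSV):
    Code:
    import pandas as pd
    
    cpet = pd.read_csv('cpet_clinical_data.tsv', sep='\t')
    cpet['wkld'] = pd.to_numeric(cpet['wkld'], errors='coerce')
    
    # Workload at the ventilatory threshold, ME/CFS group only
    at = cpet[(cpet['Time_Point'] == 'AT') & (cpet['phenotype'] == 'MECFS')]
    day1 = at[at['Study_Visit'] == 'D1'].set_index('ParticipantID')['wkld']
    day2 = at[at['Study_Visit'] == 'D2'].set_index('ParticipantID')['wkld']
    day1, day2 = day1.align(day2, join='inner')
    
    # 1) change of the group means, expressed as a percentage
    print((day2.mean() - day1.mean()) / day1.mean() * 100)
    
    # 2) mean of the per-participant percentage changes
    print(((day2 - day1) / day1).mean() * 100)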
     
    forestglip likes this.
  20. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,002
    Location:
    Belgium
    Anyway, this is a bit besides the point.

    I plan to write a blog post about this because what the data show is quite different from what the paper reports and focuses on.

    - I think the data show that there is no significant effect for any of the outcomes, whether you look at AT or max values, matched pairs or not, and whether or not you exclude patients who failed to meet the maximum effort criteria.

    - VO2 and workload differences do not correlate with severity as measured with the Bell scale (they call it the Bell activity scale, but it is really a measure of disability rather than activity).

    - As the authors point out, the differences in the ME/CFS group are often slightly larger than in the control group but this was not statistically significant and possibly due to chance. If there is an effect it will likely be a very small one, one that can only be detected reliably with even larger sample sizes.

    - Surprisingly, the largest effect sizes were seen at peak values, not at the ventilatory threshold. This could be due to the criteria used to determine peak effort, which have been questioned in the literature. Some argue for using lactate measurements instead of relying on %predicted HR.

    - The large overlap between ME/CFS patients and controls means that a 2-day CPET is not a useful measurement for ME/CFS disease activity or PEM.
     
    RedFox, MEMarge, Mij and 2 others like this.
