Treating patients suffering from myalgic encephalopathy/chronic fatigue syndrome (ME/CFS) with sodium dichloroacetate, Comhaire 2018

Thank you @Woolie for this fantastic post! I agree an extra discussion would be interesting (new thread). You put into words so clearly what I have been thinking for a long time: "In a word, it's a mess."

In a word, it's a mess. And that's not even considering the broad generalisations we make from animal "stress" studies (which generally involve very physical stressors, like being starved or tortured or having to swim for your life for protracted periods).
 
This needs to be checked. I never checked it, and I guess I would have to dig into logistic regression. I don't think it's trivial - a problem might be axiom 3. After seeing the definition of the logit and reading a bit (and not entirely understanding it!), e.g.


(https://www.quora.com/Why-is-the-output-of-logistic-regression-interpreted-as-a-probability)

I take it that the logit itself is not a probability, but that the output needs to be mapped to obtain a conditional probability. It also seems that the output of logistic regression is often interpreted as a (conditional) probability - which, it seems, would only be possible if the mapping from output to conditional probability is linear (-> axiom 3?). That's a big constraint (one that's often ignored or forgotten).

I would say there are infinitely many mappings from a space to the interval [0,1]. Not every mapping will be a probability, of course.
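For what it's worth, the standard mapping in logistic regression is the inverse-logit (sigmoid) function, which is monotone but not linear. A minimal Python sketch (the names are mine, just for illustration):

```python
import math

def sigmoid(z):
    """Inverse-logit: map a real-valued logit z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# The linear predictor (the "logit") ranges over all real numbers;
# only after the sigmoid does the output look like a probability.
for z in (-4.0, 0.0, 2.5):
    print(f"logit {z:+.1f} -> p = {sigmoid(z):.3f}")
```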



Quotation from the manual of the MedCalc programme (Schoonjans, MedCalc Ltd., Ostend, Belgium):
"The goal of logistic regression is to find the best fitting (yet biologically reasonable) model to describe the relationship between the dichotomous characteristic of interest (dependent variable = response or outcome variable) and a set of independent (predictor or explanatory) variables. Logistic regression generates the coefficients (and their standard errors and significance levels) of a formula to predict a logit transformation of the probability of presence of the characteristic of interest."
 

I think it's something to do with the maximum likelihood optimization over what are assumed to be probabilities, using the cost function

J(\theta) = -\sum_i \left[ y^{(i)} \log h_\theta\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta\left(x^{(i)}\right)\right) \right]

with y^{(i)} being the target of the i-th training example, x^{(i)} the input for the i-th training example, and h_\theta the function being optimized (with \theta the weights). Given that a class is being fitted, with y^{(i)} either 0 or 1, the idea is that h_\theta(x) is the conditional probability of y given x. But I haven't really understood it yet. There are lecture notes from Andrew Ng at Stanford (regression, then logistic regression), who does some excellent tutorials (and Coursera courses) on machine learning.
http://cs229.stanford.edu/notes/cs229-notes1.pdf
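A minimal NumPy sketch of that cost function (the array names and shapes are my own assumptions, not taken from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Cross-entropy J(theta): X is an (m, n) matrix of inputs, y an (m,)
    vector of 0/1 targets, theta an (n,) weight vector, and
    h_theta(x) = sigmoid(x . theta)."""
    h = sigmoid(X @ theta)
    return -np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

# Tiny check on made-up data (first column acts as an intercept):
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
print(cost(np.zeros(2), X, y))  # = 3 * log(2), since all h = 0.5 at theta = 0
```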

It feels like there needs to be something about sampling theory as well, which is my objection to this work. If you take a small sample and use it to derive a model, you will basically pick up on random features within your training set. You need to know that the data you train on is in some way representative of the whole population; otherwise the model can latch onto features of the predictor variables that minimize error only for that particular data set. Normally a model would be tested against a separate data set, or the training set would be split into portions and a cross-validation technique used (averaging the error over all the combinations of, say, 4/5 used for training and 1/5 used for testing; see the sketch below). But I don't think this has been done in this case (though I'm happy to be corrected - this is one of the methodological issues I have with this work).
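To illustrate the point on purely hypothetical data: with random labels, a logistic regression fitted on the full data set can look reasonably accurate in-sample, while 5-fold cross-validation reveals there is no real signal. A sketch using scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))        # e.g. 40 patients, 6 pre-treatment features
y = rng.integers(0, 2, size=40)     # responder labels, random by construction

model = LogisticRegression()
model.fit(X, y)
print("in-sample accuracy:", model.score(X, y))  # optimistic, well above chance
print("5-fold CV accuracy:",
      cross_val_score(model, X, y, cv=5).mean())  # hovers around 0.5
```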

I think that when doing significance tests on a trained model against the data it was trained on, you will just learn how well the model fits the data it was trained on, which should be good (unless the function being learned is more complex than the model can represent, and the training data sufficiently captures that relationship).

[Edit]

It looks like, given the optimization performed, the outputs are probabilities conditional on the training set. I still don't get the intuition, but I may eventually.
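One piece of intuition (my own illustration, not from the thread): with an intercept and no regularisation, the intercept's likelihood equation is sum(y_i - p_i) = 0, so the fitted probabilities are forced to average out to the observed class frequency on the training set. A quick check:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# A very large C effectively switches off sklearn's default regularisation,
# so the fit approximates the plain maximum-likelihood solution.
model = LogisticRegression(C=1e6).fit(X, y)
p_hat = model.predict_proba(X)[:, 1]
print(p_hat.mean(), y.mean())  # near-identical: mean fitted prob = base rate
```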
 
Thank you to those participant(s) who consider my scientific work and publications "shoddy". Nonetheless, my work ranks at the 97th percentile of scientific publications according to ResearchGate. Clearly, I work with patients, and ethical reasons limit the kind of "clinical experiments" I can do. In addition, pragmatic clinical research is as valuable as so-called evidence-based research (see the recent publications and comments on this subject on the internet).

So the impression I get from your paper (which could be wrong - I only read it quickly) is that you gave a supplement to a number of patients with a disease that varies over time, and then sorted the patients into those who improved in a given time period and those who didn't. It seems hard to draw a conclusion from that given there is no control arm. It could be that this just represents the expected number of improvers with no treatment; without a control we don't know.

But then you take a set of features that describe the patients, such as answers to the fatigue scale, and fit a model on all the data you have in order to claim that you can classify which patients will respond and which won't. This doesn't test a hypothesis beyond showing that you can fit a logistic regression model to a given data set. It says nothing about how the model generalizes to unseen cases. To me (as someone who has done, and is getting back to doing, research in machine learning) this doesn't say anything about your data, and it seems methodologically strange.

Then you seem to be making very positive claims about the supplements and the model. If you were saying that you think the results were positive and worth a blinded trial, then I would agree.

That is why I'm concerned. But have I read the paper wrongly? Did you have a control group? Did you test your model on unseen data?

[added]
Currently it feels like hype without sufficient justification for the hype.
 




The hype is not created by me. This is a preliminary, proof-of-principle, pilot pragmatic trial, as the title says, and it is published in a journal called Medical Hypotheses. The finding that pre-treatment characteristics seemingly can differentiate between patients who did and those who did not respond to sodium dichloroacetate is an interesting observation. It may suggest that the mechanism involved in the pathogenesis of the disease is different. Please stop speculating on things that are not in the paper, and please read the paper attentively and completely before criticising!
 
(see the recent publications and comments on this subject on the internet)
That's not a very specific reference - "please see the internet"!

The semantic discussion about "what is stress" is futile. It is the reaction of the body to "stress" that is important.
The reaction of the body to what?

please read the paper attentively and completely before criticising!
That's exactly what the PACE authors say - have you read the paper? If you have and you still have any questions, you obviously didn't read it properly, so read it again.
 

Most Downloaded Medical Hypotheses Articles

The most downloaded articles from Medical Hypotheses in the last 90 days.

Treating patients suffering from myalgic encephalopathy/chronic fatigue syndrome (ME/CFS) with sodium dichloroacetate: An open-label, proof-of-principle pilot trial
May 2018
Frank Comhaire

ETA: :emoji_face_palm:
 