@jnmaciuch Thank you so much for your work. One of the problematic areas in metabolomics is interpreting the results: trying to understand why certain metabolites differ from healthy controls. I would like to give an example of how a different type of analysis can (hopefully) help in generating some hypotheses.
I took the following concepts from the study and submitted them to an information retrieval system: FGF23 AND FGF21 AND CXCL9 AND TNFRSF11B AND TNFRSF9 AND MMP10 AND CSF1
Here are the ranked results (the second snapshot on the right is actually the second page of the results):
[Attachments 25672 and 25673: screenshots of the ranked results]
I believe I have seen you mention macrophages. My question here, as an example: why are proteoglycans the top-ranked concept? And why do glycoproteins also appear so high (are they tightly associated with proteoglycans)?
Feel free to PM me if such tools might be of interest for your work.
Thanks for your engagement! I actually did a similar analysis when I was trying to characterize some of our findings for the text. Unfortunately, it didn’t provide much of anything useful—at least nothing that some Google searching or our pathway analysis didn’t already reveal.
I suspect that your results here are primarily due to the fact that all the analytes in your short list come from the serum Olink Target 96 Inflammation assay. Cytokine interaction with proteoglycans was (and is) a hot topic in extracellular matrix studies, so the literature is pretty saturated with text that links lists of cytokines to proteoglycans.
Macrophages are likely cropping up for the exact same reason: macrophages are important secretors of cytokines, so a lot of the literature links macrophages and cytokines together. Given that this was a serum assay, where macrophages aren't present, it's more likely that this particular signature comes from other cell types.
Either that, or the proteoglycan and macrophage hits are both being driven by references to MMP10, which is a proteoglycan-degrading enzyme that is highly expressed in macrophages and has been extensively characterized in the literature.
I'm not sure what system you're using, but you could confirm this by feeding it different randomized lists of features from that assay and seeing whether the results are unique to your original list.
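I don't know what input format your retrieval system expects, so here is just a minimal Python sketch of that control, assuming a plain boolean query string like the one you posted. The analyte names are a partial placeholder for the 92-marker panel; the idea is simply to generate several size-matched random lists and check whether proteoglycans and macrophages still dominate the top-ranked concepts.

```python
import random

# Partial placeholder list: the analytes from this thread plus a few other markers.
# Substitute the full 92-analyte list from the Olink Target 96 Inflammation
# panel documentation for a proper control.
panel_analytes = [
    "FGF23", "FGF21", "CXCL9", "TNFRSF11B", "TNFRSF9", "MMP10", "CSF1",
    "IL6", "CXCL10", "TNF", "CCL2", "HGF", "OSM", "VEGFA",
]

original_hits = ["FGF23", "FGF21", "CXCL9", "TNFRSF11B", "TNFRSF9", "MMP10", "CSF1"]

random.seed(0)  # reproducible control draws

# Build several size-matched random queries; submit each to the retrieval system
# and note whether the same top-ranked concepts keep coming back.
for i in range(5):
    control_list = random.sample(panel_analytes, k=len(original_hits))
    print(f"Control query {i + 1}: " + " AND ".join(control_list))
```

If proteoglycans and macrophages still come out on top for most of the random draws, the signature is probably a property of the panel (or of cytokine literature in general) rather than of your specific hits.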
As it happens, I just wrapped up a rotation in a lab that does a lot of meta-analysis of scientific literature, often using AI tools. One of the big concerns in the lab was the issue of existing bias in the literature: an analyte gets studied extensively in one context, others cite that finding, methods get developed to measure that analyte and its effects very well, and then those methods prompt people to see if the same analyte crops up in other contexts. It’s sort of a positive feedback loop of literature prevalence—it doesn’t mean the analyte isn’t important, but any text scraping tool is going to conflate literature prevalence with relevance in most cases.
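As a toy illustration of that conflation (all numbers below are invented): a heavily studied cytokine can rack up far more co-mentions with a concept like "proteoglycan" than a rarely studied analyte, even when the rare analyte is more strongly enriched relative to its own literature footprint. A simple independence-based expectation makes the difference visible.

```python
# Toy example of literature-prevalence bias: raw co-mention counts vs. a simple
# fold-enrichment score that normalizes by each term's own literature footprint.
# All counts are made up for illustration.

total_docs = 1_000_000       # hypothetical corpus size
proteoglycan_docs = 20_000   # hypothetical documents mentioning "proteoglycan"

# analyte -> (documents mentioning the analyte, documents mentioning both terms)
counts = {
    "heavily_studied_cytokine": (50_000, 5_000),
    "rarely_studied_analyte": (500, 100),
}

for name, (analyte_docs, co_docs) in counts.items():
    # Expected co-mentions if the two terms appeared independently
    expected = analyte_docs * proteoglycan_docs / total_docs
    enrichment = co_docs / expected
    print(f"{name}: raw co-mentions = {co_docs}, fold enrichment = {enrichment:.1f}")

# The heavily studied cytokine wins on raw counts (5000 vs. 100), but the rarely
# studied analyte is actually twice as enriched (10.0 vs. 5.0) -- a ranking based
# on raw counts alone would miss that.
```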
I think those sorts of tools might be useful in certain cases, but in my experience they overwhelmingly tend to recapitulate whatever has already been extensively discussed, regardless of whether it's relevant to the topic at hand.
But I still find them very useful as a first step for getting a sense of where results might intersect with the existing literature!
Thanks again for the discussion!