Multifaceted evolution focused on maximal exploitation of domain knowledge for the consensus inference of Gene Regulatory Networks, 2025, Segura-Ortiz+

forestglip

Multifaceted evolution focused on maximal exploitation of domain knowledge for the consensus inference of Gene Regulatory Networks

Adrián Segura-Ortiz, Karen Giménez-Orenga, José García-Nieto, Elisa Oltra, José F. Aldana-Montes
[Line breaks added]


Highlights
• BIO-INSIGHT optimizes GRN consensus inference via biologically guided functions.
• Expands the objective space to achieve high biological coverage during inference.
• Novel architecture amortizes the cost of optimization in high-dimensional spaces.
• Outperforms MO-GENECI and other methods in AUROC and AUPR on 106 benchmarks.
• Reveals disease-specific GRN patterns in ME/CFS and FM with clinical potential.

Abstract
The inference of gene regulatory networks (GRNs) is a fundamental challenge in systems biology, aiming to decipher gene interactions from expression data. However, traditional inference techniques exhibit disparities in their results and a clear preference for specific datasets. To address this issue, we present BIO-INSIGHT (Biologically Informed Optimizer - INtegrating Software to Infer GRNs by Holistic Thinking), a parallel asynchronous many-objective evolutionary algorithm that optimizes the consensus among multiple inference methods guided by biologically relevant objectives.

BIO-INSIGHT has been evaluated on an academic benchmark of 106 GRNs, comparing its performance against MO-GENECI and other consensus strategies. The results show a statistically significant improvement in AUROC and AUPR, demonstrating that biologically guided optimization outperforms primarily mathematical approaches.

Additionally, BIO-INSIGHT was applied to gene expression data from patients with fibromyalgia, myalgic encephalomyelitis, and co-diagnosis of both diseases. The inferred networks revealed regulatory interactions specific to each condition, suggesting its clinical utility in biomarker identification and potential therapeutic targets.

The robustness and ingenuity of BIO-INSIGHT consolidate its potential as an innovative tool for GRN inference, enabling the generation of more accurate and biologically feasible networks.

The source code is hosted in a public GitHub repository under the MIT license: https://github.com/AdrianSeguraOrtiz/BIO-INSIGHT. Moreover, to facilitate its reproducibility and usage, the software associated with this implementation has been packaged into a Python library available on PyPI: https://pypi.org/project/GENECI/3.0.1/.

Link | PDF (Computers in Biology and Medicine) [Open access]
 
I think I understood about 1% of what I skimmed. A shame they took the time to analyse data on HC, FM, ME/CFS, and coexisting diseases, yet the sample gene expression dataset they used was relatively small.

I actually don't know if there are any large gene expression datasets in ME/CFS. I guess there is the Grimson/Cornell PBMC exercise single-cell gene expression dataset.

ME/CFS: 8
FM: 10
Both: 16
HC: 9

The performance of BIO-INSIGHT was also evaluated using non-simulated gene expression data from 43 female subjects: 8 diagnosed with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS), 10 with Fibromyalgia (FM), 16 with both ME/CFS and FM (co-diagnosed from now on), and 9 healthy controls (GSE269048 dataset).

These seem to be the findings (details in the Supplementary file 1-s2.0-S0010482525009837-mmc2.xlsx, Tables S1, S2 and S3):

A set of 25 gene-gene interactions are predicted to be missing across all disease groups (Table S3 in Supplementary Material, control unique intersections), including gene pairs such as IL6-CCL8, TNF-TAB2, or RPS10-RPL19, involved in critical pathways like cytokine signalling, immune activation, and protein synthesis, respectively (Fig. 10B-C). Conversely, 27 gene-gene interactions related to programmed cell death are uniquely present in ME/CFS (Table S3 in Supplementary Material and Fig. 10B-C), suggesting distinct regulatory alterations specific for this condition.

Among them the CD74-EIF4G2 interaction, involving a cell surface protein that participates in several immune processes, including inflammatory or autoimmune diseases [77] and a protein involved in the regulation of protein synthesis, leading to immune dysregulation when its function is impaired [74], [75], [76] should be highlighted for the available validating information.

In addition to CD74, a human endogenous retrovirus (HERV) (the MLT1_5q32 element) encoded in one of its introns, were found differentially expressed in ME/CFS, supporting its potential implication in this disease [69]. Furthermore, CD74 gene’s potential role linking monocyte functioning and neurological symptoms in ME/CFS, with biomarker value, had been previously described by [78]. On another hand, the physical interaction between CD74 and EIF4G2 is experimentally confirmed by [79] using Affinity Capture-MS, further validating BIO-INSIGHT’s predictive capacity and underscoring its utility in identifying clinically relevant gene predictions.

These results demonstrate that BIO-INSIGHT provides a valuable tool for understanding the molecular underpinnings of ME/CFS and FM, offering insights into gene regulatory networks that could inform future therapeutic strategies.
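
As far as I can tell, the "control unique" and "uniquely present in ME/CFS" lists boil down to set operations on the edges inferred for each group. A minimal sketch of that idea follows. The group memberships below are toy placeholders, not the contents of Table S3, edges are treated as undirected for simplicity (an assumption; the paper's networks may be directed), and `unique_to` is a hypothetical helper name.

```python
from typing import Dict, FrozenSet, Set

Edge = FrozenSet[str]  # undirected gene pair; a simplifying assumption


def edge(a: str, b: str) -> Edge:
    """Build an order-independent gene pair."""
    return frozenset((a, b))


def unique_to(group: str, networks: Dict[str, Set[Edge]]) -> Set[Edge]:
    """Edges inferred for `group` that appear in no other group's network."""
    others: Set[Edge] = set()
    for name, edges in networks.items():
        if name != group:
            others |= edges
    return networks[group] - others


# Toy networks: G1-G2 is a placeholder edge shared by every group.
shared = edge("G1", "G2")
networks = {
    "HC": {shared, edge("IL6", "CCL8")},
    "ME/CFS": {shared, edge("CD74", "EIF4G2")},
    "FM": {shared},
    "Both": {shared},
}

control_only = unique_to("HC", networks)     # "missing across all disease groups"
mecfs_only = unique_to("ME/CFS", networks)   # "uniquely present in ME/CFS"
```

With only 8 to 16 subjects per group, whether an edge lands in one of these sets could easily depend on a handful of samples, which is worth keeping in mind when reading the lists.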
 
Conversely, 27 gene-gene interactions related to programmed cell death are uniquely present in ME/CFS (Table S3 in Supplementary Material and Fig. 10B-C), suggesting distinct regulatory alterations specific for this condition.
What would happen if cells did the opposite of cancer - destroyed themselves when they shouldn’t?
 
This is a computational methods paper first and foremost—there are a million and one of these papers which propose some new way of applying algorithms to high throughput biological data (transcriptomics, usually, because it’s the most widely available).

The basic structure is just 1) claim that existing computational methods to do X are insufficient and explain your new method for fixing that problem 2) “benchmark” your method against other methods for doing the same thing 3) use the method to do some cursory analysis on some publicly available data set.

A gene regulatory network in this context is just an application of particular algorithms to transcriptomic data to infer networks of genes that are co-regulated by the same transcription factors. It’s a mathematically fancy way to say “across 10000 cells in this dataset, expression levels of genes X, Y and Z all seemed to go up and down in tandem, and these levels correlated with transcription factor F, which implies that X, Y and Z are all under the control of F”.
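
To make that concrete, here is a deliberately naive sketch of the idea: compute the Pearson correlation between a transcription factor and every other gene across samples, and call anything above a cutoff a candidate target. The expression values, the gene names and the 0.8 threshold are all invented for illustration, and real inference methods are considerably more elaborate than plain correlation.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def candidate_targets(expr: Dict[str, List[float]], tf: str,
                      threshold: float = 0.8) -> List[str]:
    """Genes whose expression tracks the transcription factor, up or down."""
    return sorted(
        gene for gene, values in expr.items()
        if gene != tf and abs(pearson(expr[tf], values)) >= threshold
    )


# Toy expression matrix: one row per gene, one column per sample (made-up numbers).
expression = {
    "F": [1.0, 2.0, 3.0, 4.0, 5.0],   # the transcription factor
    "X": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with F
    "Y": [1.1, 2.0, 2.9, 4.2, 5.0],   # rises with F, with noise
    "Z": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as F rises (possible repression)
    "W": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related to F
}

targets = candidate_targets(expression, tf="F")
```

Correlation alone cannot tell whether F controls X, X controls F, or both follow some third factor, which is one reason these methods can return uninterpretable results.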

It can give interesting results sometimes, but is also prone to give nonsensical or uninterpretable results.

I’ll give the paper a read through when I have a chance, though I’ll be honest, so many of these papers come out that I usually don’t pay attention unless people in the field actually start using it. I’d just read the analysis part of the paper as more of an attempt to show off a product than to derive actual understanding of ME.
 
In an effort to understand this paper, I note their reference #69 is a thread here:
Now published:

HERV activation segregates ME/CFS from fibromyalgia while defining a novel nosologic entity

which I commented on just now:

I understand little of this study (still trying, though...), but it prompts questions:
a) is the HERV profile a potential ME/CFS diagnostic?
b) do these T-cell profiles provide disease pathway clues?
 