Combining Topic Modeling, Sentiment Analysis, and Corpus Linguistics to Analyze Unstructured Web-Based Patient Experience Data... Modafinil, 2024, Walsh

Discussion in 'ME/CFS research' started by Dolphin, Dec 12, 2024.

  1. Dolphin

    Dolphin Senior Member (Voting Rights)

    Messages:
    6,103
    https://www.jmir.org/2024/1/e54321

    Walsh J, Cave J, Griffiths F. Combining Topic Modeling, Sentiment Analysis, and Corpus Linguistics to Analyze Unstructured Web-Based Patient Experience Data: Case Study of Modafinil Experiences. J Med Internet Res 2024;26:e54321

    Abstract

    Background:

    Patient experience data from social media offer patient-centered perspectives on disease, treatments, and health service delivery. Current guidelines typically rely on systematic reviews, while qualitative health studies are often seen as anecdotal and nongeneralizable. This study explores combining personal health experiences from multiple sources to create generalizable evidence.

    Objective:

    The study aims to (1) investigate how combining unsupervised natural language processing (NLP) and corpus linguistics can explore patient perspectives from a large unstructured dataset of modafinil experiences, (2) compare findings with Cochrane meta-analyses on modafinil’s effectiveness, and (3) develop a methodology for analyzing such data.

    Methods:

    Using 69,022 posts from 790 sources, we used a variety of NLP and corpus techniques to analyze the data, including data cleaning techniques to maximize post context, Python for NLP techniques, and Sketch Engine for linguistic analysis. We used multiple topic mining approaches, such as latent Dirichlet allocation, nonnegative matrix factorization, and word-embedding methods. Sentiment analysis used TextBlob and Valence Aware Dictionary and Sentiment Reasoner, while corpus methods included collocation, concordance, and n-gram generation. Previous work had mapped topic mining to themes, such as health conditions, reasons for taking modafinil, symptom impacts, dosage, side effects, effectiveness, and treatment comparisons.
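
For readers unfamiliar with the corpus methods named above, n-gram generation and collocation can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' pipeline (the abstract says they used Sketch Engine for this step). The sample posts, the function names, and the choice of pointwise mutual information (PMI) as the association score are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_pmi(posts, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ), estimated from raw counts.
    Pairs seen fewer than min_count times are dropped as unreliable.
    """
    unigrams, bigrams = Counter(), Counter()
    for post in posts:
        tokens = tokenize(post)
        unigrams.update(tokens)
        bigrams.update(ngrams(tokens, 2))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Hypothetical posts, invented for this example (not from the study's data).
posts = [
    "Modafinil helped my brain fog a lot",
    "Brain fog came back when I stopped modafinil",
    "No side effects so far, brain fog is better",
]
scores = bigram_pmi(posts)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A high PMI means two words appear together more often than their individual frequencies would predict, which is one common way to surface collocations; dedicated corpus tools offer other association measures.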

    Results:

    Key findings of the study included modafinil use across 166 health conditions, most frequently narcolepsy, multiple sclerosis, attention-deficit disorder, anxiety, sleep apnea, depression, bipolar disorder, chronic fatigue syndrome, fibromyalgia, and chronic disease. Word-embedding topic modeling mapped 70% of posts to predefined themes, while sentiment analysis revealed 65% positive responses, 6% neutral responses, and 28% negative responses. Notably, the perceived effectiveness of modafinil for various conditions strongly contrasts with the findings of existing randomized controlled trials and systematic reviews, which conclude insufficient or low-quality evidence of effectiveness.
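
To make the sentiment percentages concrete, here is a rough sketch of how per-post polarity scores might be binned into positive, neutral, and negative shares. The abstract does not state the cut-offs the authors used; the ±0.05 threshold below is the convention suggested in the VADER documentation for its compound score and is only an assumption here, as are the function names and the sample scores.

```python
from collections import Counter


def label_polarity(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is an assumed cut-off, not one taken from the study.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(scores, threshold=0.05):
    """Return the percentage of scores in each label, rounded to whole numbers."""
    if not scores:
        raise ValueError("need at least one score")
    labels = Counter(label_polarity(s, threshold) for s in scores)
    total = len(scores)
    return {
        name: round(100 * labels[name] / total)
        for name in ("positive", "neutral", "negative")
    }


# Hypothetical polarity scores, invented for this example.
example_scores = [0.6, 0.2, 0.0, -0.4]
print(sentiment_shares(example_scores))
```

Because each share is rounded independently, the three percentages from a calculation like this need not add up to exactly 100.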

    Conclusions:

    This study demonstrated the value of combining NLP with linguistic techniques for analyzing large unstructured text datasets. Despite varying opinions, findings were methodologically consistent and challenged existing clinical evidence. This suggests that patient-generated data could potentially provide valuable insights into treatment outcomes, potentially improving clinical understanding and patient care.

     
    Last edited: Dec 12, 2024
    RainbowCloud and forestglip like this.
  2. Jaybee00

    Jaybee00 Senior Member (Voting Rights)

    Messages:
    2,267
    Is there an English language version?
     
    alktipping and European7 like this.
  3. Turtle

    Turtle Senior Member (Voting Rights)

    Messages:
    199

    Yes.
    "Existing randomized controlled trials and systematic reviews, which conclude insufficient or low-quality evidence of effectiveness". Last sentence of "Results" in the abstract.

    This research: someone on Facebook, X and TikTok says it helps.
     
