'AAAS: Machine learning 'causing science crisis'', BBC article Feb. 2019

Discussion in 'Research methodology news and research' started by Trish, Feb 17, 2019.

  1. Trish

    Trish Moderator Staff Member

    Messages:
    55,414
    Location:
    UK
    AAAS: Machine learning 'causing science crisis'
    By Pallab Ghosh Science correspondent, BBC News, Washington

     
    DokaGirl, BruceInOz, Simbindi and 4 others like this.
  2. Adrian

    Adrian Administrator Staff Member

    Messages:
    6,563
    Location:
    UK
    I think one thing we have seen here is people training an ML model and then quoting how well it does on the training set. Or, with regression, pointing out which variables have the biggest effects, but without testing the model on independent data.

    There is also quite a lot of work within the ML community looking at how to understand a model and the features it is looking at.

    ML is good at finding relationships in data, but it will also find things that just happen to be there, and things that are indicative rather than causal. Care needs to be taken over the quality and coverage of the data sets used for training, validation and test. That care is often lacking, and there is also a lack of tools to help. Old-style ML used to have a big feature-extraction stage in which meaningful features were chosen in terms of the data. But these days, with deep learning, a neural network is presented with raw data and is expected to learn feature extraction as well (with hints through the network shape). This can lead to strange features (from a human perspective) being developed, which means mis-classifications don't always make sense.

    So caution is necessary, but this is a developing area and a useful one; it just needs to be done with care, and a conversation needs to happen between subject-matter experts and machine-learning experts to really understand what is happening. That can be challenging.
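    The training-set trap described above can be sketched in a few lines of Python. This is a made-up toy example, not from any study under discussion: a 1-nearest-neighbour "model" that simply memorises pure noise looks perfect when scored on its own training data, and falls to around chance level on held-out data.

```python
import random

random.seed(0)

# Toy dataset: 40 points, 5 random features each, and a random binary
# label -- pure noise, so there is genuinely nothing to learn.
data = [([random.random() for _ in range(5)], random.randint(0, 1))
        for _ in range(40)]
train, test = data[:20], data[20:]

def predict(x, memory):
    """1-nearest-neighbour: return the label of the closest memorised point."""
    nearest = min(memory,
                  key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))
    return nearest[1]

def accuracy(dataset, memory):
    return sum(predict(x, memory) == y for x, y in dataset) / len(dataset)

print(accuracy(train, train))  # 1.0 -- "perfect", because it memorised the answers
print(accuracy(test, train))   # near chance on independent data
```

    The first number is impressive and meaningless; only the second one says anything about the model, which is why quoting training-set performance alone is so misleading.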
     
  3. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    I can't read the article, but as I have said in other threads, machine learning is today's shiny new object.
    The data out is only as good as the data in and how good the algorithm is.
    One can get lucky, and sometimes it's good enough to lead to something useful, but I would never rely on it, nor would I burn money on it that I don't have.
     
    DokaGirl, MEMarge and andypants like this.
  4. Adrian

    Adrian Administrator Staff Member

    Messages:
    6,563
    Location:
    UK
    With the coming of self-driving cars you may need to rely on it (even as a pedestrian or other road user).

    More generally, ML is being used in many solutions (even though it is not really robust enough), and that should worry us.
     
    DokaGirl, MEMarge, Snowdrop and 3 others like this.
  5. Trish

    Trish Moderator Staff Member

    Messages:
    55,414
    Location:
    UK
    The bit that particularly bothered me was this:
    With ME research heading in the direction of metabolomics, genomics, microbiomics and other omics that rely on searching big data sets for patterns, there are surely going to be lots of apparent patterns that appear in one study, only to vanish when someone else tries to replicate it. I have no idea whether this is the sort of thing being referred to in this article.
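    That worry about patterns vanishing on replication is essentially the multiple-comparisons problem, and it is easy to simulate. This is a toy sketch with made-up numbers, not real omics data: scan enough purely random "markers" for an association with a random outcome and the best one will look striking, then shrink in fresh data.

```python
import random

random.seed(1)

N_SUBJECTS = 50

def marker_gap(outcome):
    """Mean difference between cases and controls for one random marker."""
    marker = [random.gauss(0, 1) for _ in outcome]
    cases = [v for v, o in zip(marker, outcome) if o]
    controls = [v for v, o in zip(marker, outcome) if not o]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

# Discovery study: one set of subjects, 200 candidate markers;
# report the most striking association found.
outcome = [random.randint(0, 1) for _ in range(N_SUBJECTS)]
discovery_best = max(marker_gap(outcome) for _ in range(200))

# Replication: re-test that single finding in fresh subjects. Because every
# marker here is pure noise, a fresh draw stands in for re-measuring the
# winner; averaging ten draws shows the typical replication result.
fresh = [random.randint(0, 1) for _ in range(N_SUBJECTS)]
replications = [marker_gap(fresh) for _ in range(10)]
typical_replication = sum(replications) / len(replications)

print(round(discovery_best, 2), round(typical_replication, 2))
```

    The best-of-200 "discovery" is much larger than the typical replication even though nothing real was ever there, which is exactly the pattern of a finding that appears once and then disappears.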
     
  6. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    13,662
    Location:
    Canada
    A valid concern, but scientists doing science the "old way" are not connected to some deeper level of reality; they use data with the same flaws and issues that come with reducing reality into data, just on a much smaller scale, because they have to reduce dataset size to be able to handle it. This problem has nothing to do with ML and everything to do with the fundamental problem of properly describing reality, one that has always existed.

    This is a bizarre issue to raise in this context; it's like saying that relying on electronic computers is potentially misleading because they make so many calculations that nobody can check them. True enough, but that's also true when it's a wholly human process. With a large enough dataset it becomes so much work that no one can afford the luxury of allocating the budget to double-check that the calculations were done right in the first place.

    Meanwhile some "scientists" are basically promoting the idea that you can treat serious chronic illness with clay modeling, interpretative dancing or jumping on a mat, using statistical sleights of hand, cherry-picking and maximally induced bias. There's always some good and some bad, but the idea that scientists have a deeper intuitive understanding of reality that transcends data collection is just disconnected from reality. If anything, the history of science shows that intuitive understanding is almost always wrong, hence the need for a process that promotes conclusions as objective and as free from human bias as possible.

    When humans were involved in PACE, they basically distorted the data at every single step. That's a wholly human problem and they could have done exactly the same thing decades before computers were invented.
     
    Last edited: Feb 17, 2019
    DokaGirl, andypants, MEMarge and 3 others like this.
  7. wdb

    wdb Senior Member (Voting Rights)

    Messages:
    320
    Location:
    UK
    I don't know the stats, but I suspect that machine learning was used in only a tiny portion (if any) of the studies that led to awareness of the replication crisis, especially as the term was only coined in 2012 and many of the studies failing replication predate it by years. Researchers were using poor methodology, cherry-picking, biasing results and exaggerating conclusions long before AI ever came along.
     
    DokaGirl, andypants, rvallee and 2 others like this.
  8. Snowdrop

    Snowdrop Senior Member (Voting Rights)

    Messages:
    2,134
    Location:
    Canada
    I like that the evolution of machine learning is highlighting the huge, real issues around medical research: that so much of it is wrong.

    I'm worried here that a conversation with subject experts can wind up with the experts providing input skewed to a political agenda, in the hope of getting data that is confirmatory but wrong. I don't know; are there safeguards in place for this?
     
    DokaGirl and andypants like this.
  9. roller*

    roller* Senior Member (Voting Rights)

    Messages:
    249
    Assumption:

    - there was a global urn assigned to me
    - containing the 10,000 most popular meds (by patient choice)
    - and at every visit to the doctor's office
    - I had drawn 3 meds

    I'm wondering whether this would have been more successful than the shaman's decision.

    Something like "hypergeometric prescription"... perhaps they could test whether it's more likely to produce a "hit" this way.
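    For what it's worth, the "hypergeometric prescription" can be computed directly. In this sketch, the number of meds in the urn that would actually help a given patient (k) is a made-up figure purely for illustration: the chance that a random draw of 3 out of 10,000 contains at least one helpful med follows the hypergeometric distribution.

```python
from math import comb

def p_hit(N=10_000, draw=3, k=10):
    """P(at least one of the k helpful meds appears in a draw of `draw` meds
    from an urn of N) = 1 - P(the draw misses all k of them)."""
    return 1 - comb(N - k, draw) / comb(N, draw)

print(p_hit())       # about 0.003 if only 10 of the 10,000 meds would help
print(p_hit(k=500))  # the odds improve as the pool of helpful meds grows
```

    So with 10 helpful meds in the urn, roughly 1 visit in 300 produces a hit, which gives a concrete baseline to compare the "shaman's decision" against.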
     
    DokaGirl and andypants like this.
  10. Adrian

    Adrian Administrator Staff Member

    Messages:
    6,563
    Location:
    UK
    The conversation is normally about the quality and reliability of the data, along with the quality of the labelling. It could be biased, but the point of having a conversation between the two groups is that those doing the ML can help to pull out the biases, and of course they look carefully at the data.
     
    DokaGirl, andypants and Snowdrop like this.
  11. Alvin

    Alvin Senior Member (Voting Rights)

    Messages:
    3,309
    I didn't say it's completely useless, but I do say it's not a panacea.
    And there is a huge difference between the heuristics of reading the colour of traffic lights or determining whether something is a dog or a bump in the road, versus medical research, where you're trying to find patterns and figure out how biology and physics work.
    I am also not saying don't use it; I am saying it's at best a supplement and should not take priority over humans, who have far superior heuristic abilities and a far better ability to figure out how things work.
    Since it's a supplement, fund it at that level. If I had to choose between a machine-learning rig and its programming or a good human scientist, I would pick the human. If I had the funding to do both, then that's fine. But in ME research we are very money-constrained; we need every expert cortex we can get.
    Also, Waymo and Tesla have huge amounts of money to spend on developing their dog/traffic light/traffic cone sensing algorithms. They also have lots of human capital and the ability to hire as many people as they want.
     
