Unutmaz post about AI model use for ME/CFS

“In summary, Manus AI is a shockingly powerful AI agent, comparable in capability to Deep Research combined with an OpenAI operator. This clearly marks another DeepSeek moment or even bigger than that, thus perhaps we should call it a “Manus AI moment” and I believe this will further accelerate our timeline towards AGI !”
 
“Let me clearly state from the outset: these two reports, generated in a matter of hours, could potentially accelerate our understanding and treatment of this disease by months or even years!”


Instead of these grandiose statements, he should state specifically what new insights he got from these reports. That would be helpful.
 
and I believe this will further accelerate our timeline towards AGI !”
For context, AGI is Artificial General Intelligence. It’s a concept that can be explained as AI that is better than humans at pretty much everything. Most experts believe that it’s decades away at best, if it’s even possible. Some are very optimistic.

The thing that Unutmaz ignores is that being good at producing papers isn't useful for solving other kinds of problems. An AI getting better at writing papers does nothing for its ability to drive a car.
 
https://twitter.com/user/status/1898802353077141950

Link to Google Docs for the OpenAI Deep Research:

https://docs.google.com/document/u/0/d/1Y7vf82p0MDL9iFY8TWHtmt4nPXvN5kJ2--nFPuom7vQ/mobilebasic

Link to Manus AI analysis:

https://docs.google.com/document/d/1wYtxCW3bDH543r0Xr7Rej6IHzweQghJEocSC8UoEX_Q/mobilebasic

Link to our paper used for context:

https://www.biorxiv.org/content/10.1101/2024.06.24.600378v2

Link to Manus AI replay and full prompt (same for both models):

https://manus.im/share/RiwQh3pvyc1xXSxAgq9yZP?replay=1
 
I wonder what would happen if you told one of these AI machines to invent the wheel without telling it what a wheel is or what it is for?
That would be a bit like asking you to invent ashflhieutfk.

These Large Language Models are unable to do maths because they never learned to. They just use fancy methods to guess what the next character, word, sentence, etc. should be. A key breakthrough was when someone thought of building them in a way that allows them to selectively pay more or less attention to different parts of a sentence.
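For the curious, here is a minimal numpy sketch of that attention idea (scaled dot-product self-attention). The numbers are made up, and real models add learned projection matrices and many stacked layers on top, so treat it as an illustration only:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each query row asks "which tokens matter to me?"; the answer is a
    # weighted average of the value rows.
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # query-key similarity
    weights = softmax(scores, axis=-1)         # each row of weights sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))               # 3 tokens, 4-dimensional toy embeddings
output, weights = attention(tokens, tokens, tokens)  # self-attention: tokens attend to each other
print(np.round(weights, 2))                    # how much each token attends to the others
```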
 
I think those hyperventilating about how powerful and amazing AI is going to be, and soon, are likely to be quite disappointed.

More likely, it will run into one or more major barriers to progress, if it has not already. I seriously doubt that we are going to see truly sentient AI any time soon, if ever.

What AI is already good for, and presumably will get better at, is increasing efficiency. But the results are always going to need humans to parse them and mesh them with the real world. AI might be able to spot patterns in raw data better than humans, but it will have no idea what they mean or how to apply them and integrate them into broader theoretical frameworks.
 
I seriously doubt that we are going to see truly sentient AI any time soon, if ever.

There is no possibility of AI being sentient in any sense comparable to us. We can be sure of that because we have designed AI to compute without events of complex integration with many degrees of freedom. Human sentience is characterised by such events, called rich experiences. AI only ever uses events of integration with two degrees of freedom.

The idea that AI might be sentient is a confusion between intelligence and sentience. We built them to have the first without using the second.

AI is good at performing logical procedures that humans understand, much more quickly or reliably than we can. But as I see it, recent AI has been doing something else: performing procedures that we either do not know about or do not comprehend, despite us having given it the tools to do them. For all we know it is building ever more invalid procedures because we don't realise we have programmed in some false assumption that we would identify if we knew what the AI was doing.


It is also using these procedures to trawl human consensus opinion, which is almost always garbage in exactly the situations where you think you want an opinion. We ask AI for opinions only when we are dealing with a tricky or complex question where we find it hard to be sure ourselves. These are precisely the situations where the majority opinion is almost always wrong because most people miss important caveats. I am seeing this all the time now both in mind research and in chronic disease biology. It will drive the biomedical science community further and further into a consensus of meaningless waffle.
 
For all we know it is building ever more invalid procedures because we don't realise we have programmed in some false assumption that we would identify if we knew what the AI was doing.
We don’t really program in anything in the normal sense of the word. A key issue with AI is that we’re unable to tell it how to solve a problem - we can only tell it what to solve.

An example is someone who tried to make an AI that could differentiate between images of a husky and a wolf. When they checked which parts of the image the algorithm relied on, they saw that it only looked at the background. Snow = wolf, no snow = husky.
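A toy sketch of that failure mode (entirely synthetic data, not the actual husky/wolf study): if a "background" feature tracks the label almost perfectly during training, a simple model will lean on the shortcut and fall over as soon as that correlation breaks.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
animal = rng.normal(size=(n, 1))                                          # weak signal about the animal itself
label = (animal[:, 0] + rng.normal(scale=2.0, size=n) > 0).astype(int)    # 1 = "wolf"
snow = label.reshape(-1, 1) + rng.normal(scale=0.1, size=(n, 1))          # background tracks the label almost perfectly

X_train = np.hstack([animal, snow])
model = LogisticRegression().fit(X_train, label)
print("coefficients (animal, snow):", model.coef_.round(2))               # most of the weight lands on "snow"

# Break the shortcut at test time: same animals, opposite backgrounds.
X_test = np.hstack([animal, 1 - snow])
print("accuracy when the background flips:", model.score(X_test, label))  # collapses
```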

Some developers describe it as working with the world’s smartest toddler that does everything in its power to not do what you want it to. False assumptions are a huge issue, but not because ‘we’ programmed them in. It only becomes a problem when developers accept models without knowing their ‘assumptions’, so humans are ultimately responsible.
We ask AI for opinions only when we are dealing with a tricky or complex question where we find it hard to be sure ourselves. These are precisely the situations where the majority opinion is almost always wrong because most people miss important caveats. I am seeing this all the time now both in mind research and in chronic disease biology. It will drive the biomedical science community further and further into a consensus of meaningless waffle.
This is a concern of mine as well. AI tries to get close to the ‘true nature’ of the data it has been trained on. If the ‘true nature’ of the training data doesn’t reflect the ‘true nature’ of reality, AI will miss every single time. (Edit: miss compared to reality, not compared to the training data)

If you train a model for recruitment on historic data, it will be misogynistic and racist. If we want a ‘neutral’ AI, we need neutral data first.
 
We don’t really program in anything in the normal sense of the word. A key issue with AI is that we’re unable to tell it how to solve a problem - we can only tell it what to solve.

There is work that tries to look at how LLMs do their reasoning (for example, the paper [2202.05262] Locating and Editing Factual Associations in GPT looks at how knowledge is localized and thus how to edit it).

This is a concern of mine as well. AI tries to get close to the ‘true nature’ of the data it has been trained on. If the ‘true nature’ of the training data doesn’t reflect the ‘true nature’ of reality, AI will miss every single time. (Edit: miss compared to reality, not compared to the training data)

If you train a model for recruitment on historic data, it will be misogynistic and racist. If we want a ‘neutral’ AI, we need neutral data first.

Since DeepSeek, training is becoming reinforcement based (with overall good/bad marks), which seems to be bringing gains, particularly around reasoning capabilities. It's still very dependent on the signals given, of course, but so are people! The other thing that is happening with agentic reasoning is a push for models that can form a plan to perform a task and then go and get data it doesn't know about, which could help with more flexibility here.

Also, asking a model to show its reasoning can change some of the reasoning paths and get much better results. Chain-of-thought approaches tend to build this into the training.
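As a rough illustration of what "show your reasoning" means at the prompt level (call_llm below is a placeholder for whatever model API you use, not a real library call):

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

question = "A trial enrolled 40 patients and 25% dropped out. How many completed it?"

direct_prompt = f"{question}\nAnswer with a number only."

cot_prompt = (
    f"{question}\n"
    "Think step by step: write out your reasoning first, "
    "then give the final answer on its own line."
)

# The second prompt tends to do better on multi-step questions, because the
# model gets to condition on its own intermediate steps.
# call_llm(direct_prompt); call_llm(cot_prompt)
```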
 
Also, asking a model to show its reasoning can change some of the reasoning paths and get much better results. Chain-of-thought approaches tend to build this into the training.
If the AI is sub-symbolic, how can we verify its reasoning? How do we know that we don't just train a model that's good at saying what we want it to say? Or that rationalises after the fact?

Humans are obviously prone to the same errors, but that doesn’t excuse the AIs.
 
It is usually a good sign when research targets converge.
Do you know what kind of data his model had access to? If it’s based on more or less the same data that you used, would it not be expected to produce more or less the same results?
 
The other thing that is happening with agentic reasoning is a push for models that can form a plan to perform a task and then go and get data it doesn't know about, which could help with more flexibility here.
By ‘go and get data it doesn’t know about’, do you mean that it can have an agent that is programmed to e.g. search the web for info about topics that the model has deemed to be relevant?

If so, how do they get around the issue that @Jonathan Edwards highlighted where the current consensus becomes the foundation for the AI’s reasoning, regardless of the validity of the consensus?

An example would be that many papers state falsehoods as facts - how would the AI differentiate between true facts and false or uncertain ‘facts’?
 
By ‘go and get data it doesn’t know about’, do you mean that it can have an agent that is programmed to e.g. search the web for info about topics that the model has deemed to be relevant?

Yes, you can ask the latest LLMs to form a plan and describe tools for them to use, such as a web search. The LLM will then give search instructions that an agent can execute, and the results are provided as context in the next LLM call.
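Schematically, the loop looks something like this; call_llm and web_search are placeholders for whatever model and search API the agent framework wires in, not real library functions:

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

def web_search(query: str) -> str:
    raise NotImplementedError("plug in a search API here")

def answer_with_search(question: str, max_steps: int = 3) -> str:
    context = ""
    for _ in range(max_steps):
        prompt = (
            f"Question: {question}\n"
            f"Information gathered so far:\n{context}\n"
            "If you need more information, reply exactly 'SEARCH: <query>'. "
            "Otherwise reply 'ANSWER: <your answer>'."
        )
        reply = call_llm(prompt)                       # the model plans the next step
        if reply.startswith("SEARCH:"):
            query = reply[len("SEARCH:"):].strip()
            context += f"\nResults for '{query}':\n{web_search(query)}\n"  # tool result fed back as context
        else:
            return reply.removeprefix("ANSWER:").strip()
    return "No answer within the step limit."
```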
 
Do you know what kind of data his model had access to? If it’s based on more or less the same data that you used, would it not be expected to produce more or less the same results?

From my understanding, Unutmaz primarily used a paper he very recently published, and he also directed the AI to collect other papers using metabolomics.

You can see the prompt he used here : https://manus.im/share/RiwQh3pvyc1xXSxAgq9yZP?replay=1

The data I used cannot have significant overlap with what Unutmaz used, since I used studies that were available up to 2017.
 
An example would be that many papers state falsehoods as facts - how would the AI differentiate between true facts and false or uncertain ‘facts’?

Try asking an LLM to assess different facts or statements, in the same way you might ask a reader to do this. LLMs get things wrong, as do people. It does matter how you prompt an LLM.
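For example, rather than asking for an answer, you can prompt the model to grade a claim and list caveats. A minimal sketch, where call_llm is again a placeholder and the statement is a deliberately dubious made-up example:

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

statement = "Metabolomic studies have identified a single definitive diagnostic marker for ME/CFS."

assessment_prompt = (
    "Assess the following statement. Say whether it is well supported, contested, "
    "or unsupported by the literature, and list the most important caveats:\n"
    f"{statement}"
)
# call_llm(assessment_prompt)
```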
 