Can Large Language Models (LLMs) like ChatGPT be used to produce useful information?

These tools can make great leaps and do unexpected, new things; AlphaGo is a great example of this, as are things like protein folding and the maths challenges. But it requires very specific domains with clear rules, a clear concept of what is 'correct', and the ability to test that. Understanding why it is possible there but not transferable to 'solve this disease for me' is important, I think.
AlphaGo is entirely different from LLMs in architecture. Not really a great comparison if we're talking about LLMs here.

For LLMs to become true AGI, the hope is that with enough data "reasoning" will emerge as a property, possibly even consciousness, echoing the old Chinese room scenario.
 
Given the latest news about GPT-5, I would say LLMs have hit a wall and are way more hype than substance.

This is so funny to me: a few weeks ago, before GPT-5, everyone was still an AI futurist. One bad model and a stalled take-off, and now everyone's a downer. I doubt this has changed the minds of true LLM-AGI believers like Sam, Thiel, etc.
 
This is so funny to me: a few weeks ago, before GPT-5, everyone was still an AI futurist. One bad model and a stalled take-off, and now everyone's a downer. I doubt this has changed the minds of true LLM-AGI believers like Sam, Thiel, etc.
They are all full of shit and mostly just grifters. LLMs do not scale; that much is known. So they are really a dead end the way they are currently designed.

They will need another huge discovery, like transformers were for LLMs, before another major advance can be made. And guess what: no such discovery is on the immediate horizon. They are just hoping to "fake it until they make it" and praying that by throwing trillions of dollars at the problem a new discovery will turn up.
 
They are all full of shit and mostly just grifters. LLMs do not scale; that much is known. So they are really a dead end the way they are currently designed.

They will need another huge discovery, like transformers were for LLMs, before another major advance can be made. And guess what: no such discovery is on the immediate horizon. They are just hoping to "fake it until they make it" and praying that by throwing trillions of dollars at the problem a new discovery will turn up.
I think they were hoping that, with enough data, it would be emergent.

Genie 3's world consistency was an emergent property of more data in … So there may be some value in the idea that "it can just happen", but yes, it is hope at this point.
 
AlphaGo is entirely different from LLMs in architecture. Not really a great comparison if we're talking about LLMs here.
I thought I made the distinction in my posts, and I think it's pretty clear? It also seems most likely that LLMs will stick around in some form, but perhaps as the human interface to other forms of AI/ML; that's what they're most suited to, after all.

The scaling stuff for LLMs has always been a pipe dream pushed by a few. It's an amazing grift, and equally amazing that they've managed to convince so many it may work. Like @leokitten says, more discoveries and technologies are needed, and the industry seemed to know that until ChatGPT blew up.

Paper from Apple on why they aren't pursuing LLMs:
I’m familiar with the paper. But saying Apple are not pursuing LLMs when they very clearly are (researching, developing and building products using them) seems to misrepresent things somewhat?
 
I was reading this the other day and it is entertaining in some ways but highlights a number of the ways LLMs struggle.
A positive read could be that they feel they can fix these issues. But I think it also shows how much the earlier points made in this thread stand, and how hoping for an LLM to do something new which it hasn't encountered or been trained for (like finding a new disease solution by prompting alone) is somewhat wishful thinking.
 
This is a potentially interesting paper (I've only read the abstract). They seem to be fine-tuning LLMs with time-series data from various organs and then using an agentic approach to simulate the whole body based on these organ models. It could be an interesting approach to modelling the body as a complex system.

Organ-Agents: Virtual Human Physiology Simulator via LLMs


Recent advances in large language models (LLMs) have enabled new possibilities in simulating complex physiological systems through reasoning, generation, and agentic coordination. In this work, we present Organ-Agents, a novel multi-agent framework that simulates the dynamics of human physiology using LLM-driven agents. Each agent, referred to as a Simulator, is assigned to model a specific physiological system such as the cardiovascular, renal, immune, or respiratory system. The training of the Simulators consists of two stages: supervised fine-tuning on system-specific time-series data, followed by reinforcement-guided inter-agent coordination that incorporates dynamic reference selection and error correction with assistive agents. To support training, we curated a cohort of 7,134 sepsis patients and 7,895 matched controls, constructing high-resolution, multi-domain trajectories covering 9 physiological systems and 125 clinical variables. Organ-Agents achieved high simulation accuracy on 4,509 held-out patients, with average per-system mean squared error (MSE) below 0.16 across all systems and robust performance across severity strata based on sequential organ failure assessment (SOFA) scores. Generalization capability was confirmed via external validation on 22,689 intensive care unit (ICU) patients from two tertiary hospitals, showing moderate performance degradation under distribution shifts while maintaining overall simulation stability. In terms of clinical plausibility, Organ-Agents reliably reproduces multi-system critical event chains (e.g., hypotension, hyperlactatemia, hypoxemia) with preserved event order, coherent phase progression, and minimal deviations in both trigger timing and physiological values. Subjective evaluation by 15 critical care physicians further confirmed the realism and physiological coherence of simulated trajectories, with mean Likert ratings of 3.9 and 3.7, respectively. The Simulator also supports counterfactual simulation under alternative fluid resuscitation strategies for sepsis, producing physiological trajectories and APACHE II scores that closely align with matched real-world patient groups. To further assess the preservation of clinically meaningful patterns, we evaluated Organ-Agents in downstream early warning tasks using seven representative classifiers. Most models showed only marginal AUROC degradation when transferring from real to generated and counterfactual trajectories, with performance drops generally within 0.04, indicating that the simulations preserved decision-relevant information for clinical risk simulation. Together, these results position Organ-Agents as a clinically credible, interpretable, and generalizable digital twin for physiological modeling, enabling precision diagnosis, treatment simulation, and hypothesis testing across critical care settings.
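Out of curiosity I sketched what I think the agentic setup looks like, based purely on the abstract: one LLM-backed agent per organ system, each predicting its own variables for the next time step while seeing the other systems' latest states. The class names, prompt format, and `predict_json` helper are all my inventions, not the paper's actual interface.

```python
from typing import Dict, List

class DummyLLM:
    # Stand-in so the sketch runs end to end; in the paper each Simulator is
    # a fine-tuned LLM, and this interface is my guess, not theirs.
    def predict_json(self, prompt: str) -> Dict[str, float]:
        return {"MAP": 70.0, "HR": 95.0}

class OrganSimulator:
    def __init__(self, system: str, variables: List[str], llm) -> None:
        self.system = system        # e.g. "cardiovascular"
        self.variables = variables  # e.g. ["MAP", "HR"]
        self.llm = llm

    def step(self, history: List[Dict[str, float]],
             others: Dict[str, Dict[str, float]]) -> Dict[str, float]:
        # Prompt with this system's recent trajectory plus the latest states
        # of the other systems, and ask for the next time step.
        prompt = (f"System: {self.system}\nHistory: {history[-8:]}\n"
                  f"Other systems: {others}\n"
                  f"Predict the next values of {self.variables} as JSON.")
        return self.llm.predict_json(prompt)

def simulate(agents, histories, n_steps):
    # Roll the "virtual patient" forward: at every step each organ agent
    # predicts its own variables while seeing the others' latest values.
    for _ in range(n_steps):
        latest = {a.system: histories[a.system][-1] for a in agents}
        for a in agents:
            histories[a.system].append(a.step(histories[a.system], latest))
    return histories

cardio = OrganSimulator("cardiovascular", ["MAP", "HR"], DummyLLM())
simulate([cardio], {"cardiovascular": [{"MAP": 65.0, "HR": 110.0}]}, n_steps=3)
```

The reinforcement-guided coordination and error-correction stages they describe would sit on top of a loop like this; I haven't tried to guess at those.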
 
My background in this informs my perspective, as I was introduced to programming very young and was writing my own BASIC by about the age of 14 (1978) on the school's new Research Machines 380Z. I later did some programming for businesses. When the ME hit I was neither able nor interested enough to pursue it. I have always viewed computers as marvellous toys.

There is an old adage in computing that you get out what you put in, plus, of course, the mechanism in between. That mechanism is always the invention of a human mind; it can be comprehended by the user and will always be the product of that foundation, even if AIs start programming themselves.

The way I look at LLM AIs personally takes into account that they are trained on the internet and then heavily moderated, so it's a bit like having a Bowdlerised conversation with the average output of the internet, which, if it were unmoderated, would be intolerable mayhem. So: a bubbling cauldron with a lid on it, at the best of times.

Considering how well that usually goes (I often visit gaming fora, some better managed than others), I don't have high expectations, so I am constantly surprised and amused by the LLMs' successes and, for want of a better word, their humanity. Programmed, for sure, but nevertheless an exemplar of an agreed standard of decency, imposed by others, which others might learn from: an educational tool for our culture. Since I trained to be a teacher, I find that an intriguing possibility.

I do think today's LLMs are useful as a way of collating information on a subject, to which you can then apply a discriminating and sceptical eye. I feel it is important to understand the nature of the product and where LLMs can go wrong, but one of my hobbies for the last 25 years has been beta testing games software, so spotting glitches and seeing through a facade to the underlying computing is second nature to me. I do not feel out of my depth there, yet.

They have the potential to be personalised, storing a repository of knowledge pertinent to an individual and acting as a tool that knows and serves that person. Currently, though, they are revised every few months and the slate is wiped clean, unless you explicitly include conversations from the past in the starting data for a new conversation. I think they could be very useful as personal assistants, but that is some time off, and like self-driving vehicles they need to be developed and made safe first.
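Since I used to program, I can sketch what I mean by explicitly carrying conversations forward. This is only an illustration; `send_to_llm` is a placeholder for whichever chat interface one uses, not a real API.

```python
import json
from pathlib import Path

MEMORY = Path("my_llm_memory.json")   # hypothetical local transcript file

def load_memory() -> list:
    return json.loads(MEMORY.read_text()) if MEMORY.exists() else []

def chat(user_text: str, send_to_llm) -> str:
    messages = load_memory()                                  # all past turns
    messages.append({"role": "user", "content": user_text})
    reply = send_to_llm(messages)           # the model sees the full history
    messages.append({"role": "assistant", "content": reply})
    MEMORY.write_text(json.dumps(messages, indent=2))         # persist it
    return reply

# Demo with a dummy model that just reports how much history it was given.
print(chat("Remind me what we discussed.", lambda m: f"({len(m)} turns seen)"))
```

That is roughly all the workaround amounts to: the "memory" lives with you, not with the model.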

The underlying neural learning models have potential for the interpretation of scientific and medical data, and that is a different application, with less hazard of input from sweary, bad-mannered, rebellious adolescents. Every step developers take with these models will create step changes in the value and nature of the AI tools they produce. Inherently, though, these depend on our insights into our own thinking processes, and so become a reflection of the human mind and of the way evolution has developed cognitive processing to date.

In the future, AIs might be able to improve on evolution, though I think mostly in terms of speed and accuracy, and possibly insight too; that depends a little on how well we understand ourselves, which is not as well developed as I would like it to be. After millions of years of evolution and thousands of years of human thought, IMHO the principles of morality and humanity have arrived at an evolutionarily stable strategy, so common sense and decency will continue to be what they are today, and hopefully we will get better at delivering on these principles with the assistance of these tools.

In that endeavour, people like Musk and Altman are slightly dangerous double-edged swords: they can push the agenda forwards, but they can also pervert it to serve their special interests. However, I have little doubt that the rest of the human race will provide them with "feedback" and hold them to account.
 
The scaling stuff for LLMs has always been a pipe dream pushed by a few. It's an amazing grift, and equally amazing that they've managed to convince so many it may work. Like @leokitten says, more discoveries and technologies are needed, and the industry seemed to know that until ChatGPT blew up.
There is lots of work on LLM architectures, although it tends to be small adaptations of the basic decoder-based architecture. For example, Microsoft have released a model using Mamba, which uses recurrent layers to linearize the computation in the prefill stage. Google have done some interesting stuff with adaptive hidden-unit layers, allowing the same model to be used with different computational requirements. There is also lots of interesting work around KV caches and how to reduce their overheads, which again greatly reduces the compute requirements, but also provides ways to precompute things coming from vector DBs, avoiding long contexts and hence long compute times.
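To make the KV cache point concrete, here's a toy single-head attention decode loop: each new token appends one key/value row to the cache and attends over it, instead of recomputing attention over the whole prefix. The "projections" are random stand-ins, not real model weights; this is a sketch of the idea, not any particular implementation.

```python
import numpy as np

def attend(q, K, V):
    # q: (d,), K/V: (t, d) -> softmax-weighted sum over all cached positions
    scores = K @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d = 64
rng = np.random.default_rng(0)
K_cache, V_cache = np.empty((0, d)), np.empty((0, d))

for step in range(5):                  # one iteration per newly decoded token
    x = rng.standard_normal(d)         # stand-in for the token's hidden state
    q, k, v = x, 0.5 * x, 0.25 * x     # stand-ins for learned Q/K/V projections
    K_cache = np.vstack([K_cache, k])  # append one row per token...
    V_cache = np.vstack([V_cache, v])
    out = attend(q, K_cache, V_cache)  # ...rather than recomputing the prefix
```

As I understand it, the appeal of Mamba-style recurrent layers is that they replace this cache, which grows with context length, with a fixed-size state, which is where the prefill savings come from.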
 
There is lots of work on LLM architectures, although it tends to be small adaptations of the basic decoder-based architecture.
Absolutely. The efficiency improvements have been huge. That small local models can now do what huge models used to be needed for, and that the big players are pushing on efficiency even for their larger models to reduce costs, seems far more significant than any imagined progress towards AGI. But the latter gets the headlines and the investors, I guess.

Have you got any links to the Mamba work? I hadn't heard about that. I only recently learnt about quantization-aware training, and I had heard about Google's work and some of the KV cache work, but a lot of the detail is beyond my understanding, tbh.
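As far as I've understood it, the core trick in quantization-aware training is small enough to sketch: run the weights through a fake-quantize step during training, so the network learns weights that survive rounding to low precision. This toy version only shows the forward pass; real QAT also needs a straight-through estimator so gradients can pass through the rounding.

```python
import numpy as np

def fake_quantize(w: np.ndarray, num_bits: int = 8) -> np.ndarray:
    qmax = 2 ** (num_bits - 1) - 1     # e.g. 127 for int8
    scale = np.abs(w).max() / qmax     # per-tensor scale (simplest scheme)
    if scale == 0:
        scale = 1.0
    # Quantize to the integer grid, then dequantize back to floats: the
    # network trains against exactly the error the deployed model will see.
    return np.round(w / scale).clip(-qmax, qmax) * scale

w = np.random.default_rng(1).standard_normal((4, 4))
print(np.abs(w - fake_quantize(w)).max())   # worst-case rounding error
```

Happy to be corrected by someone who actually works with this stuff.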
 
Have you got any links to the Mamba work?
It's on my reading list. They had interesting results, and I think it's significant that Microsoft have picked it up in one of their small models (Phi-4-mini), although I've not tried that model yet.
 