The compression argument (drtgh) is underappreciated here. When you train on outputs of previous models — or even on human work that was influenced by previous models — you get a narrowing of the distribution. Not wrong results, just fewer surprising ones.
The analogy I keep coming back to is translation. Machine translation didn't make translations worse — it made them more average. The quirky, context-sensitive choices a skilled translator makes got flattened into the statistical mean. Perfectly adequate. Never revelatory.
If the same thing happens to scientific hypothesis generation, the cost isn't bad science. It's the disappearance of the weird hypotheses that turn out to be right once a decade.
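The narrowing can be sketched with a toy model. This is only an illustration under loud assumptions: a 1-D Gaussian stands in for a model's output distribution, and a "keep only typical samples" filter stands in for training that favors average outputs. Refitting each generation on the previous generation's typical outputs collapses the spread, while the mean stays roughly right: results that aren't wrong, just never surprising.

```python
# Toy model-collapse sketch (assumptions: Gaussian = model's output
# distribution, a one-sigma filter = training pressure toward typical output).
import random
import statistics

random.seed(0)

mu, sigma = 0.0, 1.0               # generation-0 "model"
history = [sigma]                  # track the spread across generations

for generation in range(10):
    samples = [random.gauss(mu, sigma) for _ in range(1000)]
    # The hypernormalizing step: keep only the most "typical" samples.
    typical = [x for x in samples if abs(x - mu) <= sigma]
    # Refit the next generation's model on the filtered outputs.
    mu = statistics.fmean(typical)
    sigma = statistics.stdev(typical)
    history.append(sigma)

print(history[0], history[-1])     # the spread shrinks generation over generation
```

None of the parameter choices here are load-bearing; any selection pressure toward the center produces the same monotone loss of tails.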
It's been shown in other fields that training models on the output of other models produces subtly broken models, not a flattening to the statistical mean. Why would science be different?
I don't think this argument is wrong, but it's debatable. At the end of the day, we're talking about the manifold of reality (as compressed by the LLM through language abstraction). It remains to be seen whether supervised fine-tuning on the best work humans can produce would nudge the model enough to generate surprising findings.
We know pre-trained models do tend to revert to the mean, but I don't think that's enough to say SFT / RL models will do the same. Some might argue RL only sharpens the distribution, but even there I'm skeptical of that paper.
"Designing AI for Disruptive Science" is a bit marketing-y, but "AI Risks 'Hypernormal' Science" is just a trimmed version of the section heading "Current AI Training Risks Hypernormal Science".
which contains Heathrow Terminals 1, 2, 3, 4 & 5 on the Piccadilly line. For about 15 seconds I imagined a world where Heathrow had had 5 terminals since 1933, then I read the map itself: "Recreated by Arthurs D". Phew.
Awesome example of improving information conveyance through abstractions though!
> AI could repeat this pattern at a larger scale — generating faster results within the existing paradigm, while the structural conditions for disruptive science remain unchanged or worsen.
Worsen. LLMs discard, lose, and mix data in their statistical "compression" when building their vector-space model. Over time, successive feedback will be like making a JPEG from a JPEG that was itself made from another JPEG: a lossy loop.
Those faster (but worse) results will degrade genuinely valuable data and science, systematically discarding well-done science at a steady statistical rate.
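The JPEG-of-a-JPEG loop can be sketched directly. As an assumed stand-in for one lossy re-encode, each pass blurs every value with its neighbor; any single pass still looks plausible, but iterating the passes drives all the detail toward the overall mean.

```python
# Generation-loss sketch for the JPEG-of-a-JPEG analogy (assumption: a
# circular pairwise blur stands in for one lossy re-encode).
def reencode(xs):
    # One lossy pass: each value is averaged with its neighbor (wrapping around).
    n = len(xs)
    return [(xs[i] + xs[(i + 1) % n]) / 2 for i in range(n)]

signal = [0, 9, 1, 8, 2, 7, 3, 6]   # a detailed "original"
gen = signal
for _ in range(50):                  # fifty re-encodes of re-encodes
    gen = reencode(gen)

print(gen)  # every entry is now close to the mean (4.5); the detail is gone
```

The blur preserves the mean exactly, which is the point: nothing in the loop looks "broken" at any step, yet the information that distinguished the original is irrecoverably gone.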
The article presumes that the models we have today describing everything could still be subject to a major paradigm shift.
Maybe they could be, but it seems pretty unlikely. The edges of a lot of scientific understanding are now past practical applicability. The edges are essentially models of things impossible to test. In fact, relativity was only recently fully backed up with experimental data.
I don't think paradigm shifts have to be 'better' in some march-toward-progress sense, they can be lateral or even regressive in that way and still lead to longer-horizon improvements.
I think also what's practically applicable changes constantly. Perhaps we're truly at the End of Science, but empirically we've been wrong every other time we've said that. My money is that there's more race to run.
> I don't think paradigm shifts have to be 'better'
But they do. Paradigm shifts happen because the new paradigm explains the unexplained and, importantly, also covers the old model. If a paradigm shift leaves prior data unexplained, the shift will never be adopted.
> Perhaps we're truly at the End of Science
Who said that? Just because the core of our current models seems pretty rock steady doesn't mean there's no more science. It simply means that we can mostly just expect refining rather than radical discovery.
There will be sub-paradigm shifts, but there's likely not going to be major "relativity" moments from here on out.
> It simply means that we can mostly just expect refining rather
The practical issue is whether there will be enough funding for just "refining", as opposed to "paradigm shifts", which I understand as new and "exciting" discoveries. I'm not a scientist, of course; this is just my layman's understanding.
Physics is a bit of a special case. This certainly doesn't apply to, say, biology, medicine, cognition, not to mention any of the social sciences—i.e. most research.
I'm also a little skeptical about the practical value of the bleeding edge of both experimental and theoretical physics. Interesting? Sure.
Cognition is just a special case of medicine, which is a special case of biology, which is a special case of chemistry, which is a special case of physics.
And the closer you get to physics, the less likely any sort of major paradigm shift will be discovered (though the article focuses pretty heavily on physics which is why I do as well).
But even in those fields, there are core parts that aren't likely to ever see any sort of paradigm shift. For example, in biology I doubt we'll see a shift away from evolution, as it would be nearly impossible for a new model to also explain everything evolution does.
I agree that at the edges you'll possibly see more paradigm shifts and discovery, but those are all going to be working from things that will not see paradigm shifts. For example, biology can't escape things like single-celled organisms made up of atoms and chemical compounds.
But ultimately, what I disagree with in the article is the notion that discovery won't ultimately be a process of hypernormalization. In medicine, we are unlikely to see a new paradigm that isn't germ theory. When it comes to the research, it'll mostly be focused on finding new compounds and delivery mechanisms for treatment rather than finding a new paradigm for how to treat a disease.
The softer sciences are the only place where you might find new paradigms, but that's simply because the data itself is so squishy and poor anyways that it's easy to shift around. There it's less a question of the science and more of the utility of the model (regardless of whether or not it aligns with reality).
> article presumes ... everything could still be subject to a major paradigm shift. ...seems pretty unlikely
Alternatively: there's plenty of mainstream, accepted science that's plain, flat out, provably wrong. Yet, it is against good taste (job security, people's feelings, status quo bias, etc.) to point this out.
Hence, it can actually be tricky to catch wind of, or get a grasp on, such issues to begin with, much less pursue such issues toward meaningful, published, recognized change in understanding (that is to say: paradigm shift).
I'd name some examples, but you wouldn't believe me.
With respect to the article, it seems the current LLMs can (though, obviously, do not necessarily have to) return text that appears to reason (pretty reasonably!) about paradigm shifts, when given the context required and nudged quite forcefully toward particular directions. But, as the article seems to indicate, the LLMs seem to not tend toward finding, investigating, and reporting on paradigm shifts all on their own very much. (But maybe part of that is intrinsic to how they are programmed and/or their context?)
> there's plenty of mainstream, accepted science that's plain, flat out, provably wrong. Yet, it is against good taste (read: job security, people's feelings, etc.) to point this out.
I highly doubt that.
There are a lot of people who think they've proven the mainstream wrong. But more often than not, it's cranks using bad, unreplicated tests. These bad tests are propped up, ironically, by people's feelings and job security more than by any built-up body of evidence.
They also almost always have to ignore the mainstream body of evidence and just say it's wrong and bad because of a conspiracy.
For example, plenty of creationists believe they have irrefutable evidence that evolution is provably wrong. It's usually a few cherry-picked or poorly interpreted results, or sometimes just flat-out lying. And often they simply lie about the existing body of evidence that supports evolution.
Another example is the antivaxx movement. Wakefield and RFK both built careers that made them a lot of money talking about how the mainstream was wrong. Even when the industry adopted some of the recommendations (abandoning Thimerosal), they simply ignored the fact that further data didn't support their claims.
What's more alarming isn't that AI is limited to existing domain data, it's that when people push it to deviate outside those known data points it confidently hallucinates nonsense.
If I had a nickel for every AI-poisoned "researcher" I'd seen with a preprint full of nonsense buzzwords like "quantum fractal holographic resonance matrix"... well, I wouldn't be rich, but I'd probably at least have enough to buy a coffee.
And many of today's publicly-accessible platforms are designed to steer results away from nonsense back to... too much sense. "Hypernormal" is a good word for it. I've spent a lot of time prompt-yelling, "Please make something as weird as I'm imagining, stop veering back to what normal people want to see/read."
My hot take is that mathematical and scientific 'soundness' is ultimately more of an aesthetic preference than an objective quality of reality. Good science makes sense to humans, and 'what makes sense' is ultimately what fits satisfyingly in your brain. There's nothing inherently wrong with an enormous epicycle model of reality from the perspective of the God of Math; so long as your formal system is consistent and expressive enough to represent everything then meh, it's a model. But the model that humans want to elevate to canonical status has far stricter requirements, and ultimately it's the one which the majority of sufficiently credentialed tastemakers decide is 'best'. Parsimony works well in physics where you have closed form expressions for all your stuff, but the biology cases are so much messier because it turns out that sometimes reality isn't parsimonious. All this to say that good science is a matter of taste, and while AI can gist the broad strokes of taste I've yet to see it take on the role of genuine tastemaker.
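The epicycle point can be made concrete, since epicycles are just a complex Fourier series: two uniform circular motions (a deferent plus one counter-rotating epicycle) reproduce an ellipse exactly. The sketch below checks that identity numerically; the semi-axis values are arbitrary illustration choices.

```python
# "Epicycles are a Fourier basis": two circular motions trace an exact ellipse.
import cmath
import math

a, b = 1.0, 0.5                      # semi-axes of the target ellipse (arbitrary)

def ellipse(t):
    # Standard parametrization of the ellipse in the complex plane.
    return complex(a * math.cos(t), b * math.sin(t))

def epicycles(t):
    # Deferent plus one counter-rotating epicycle:
    # z(t) = ((a+b)/2) e^{it} + ((a-b)/2) e^{-it}
    return (a + b) / 2 * cmath.exp(1j * t) + (a - b) / 2 * cmath.exp(-1j * t)

max_err = max(abs(ellipse(t) - epicycles(t))
              for t in [2 * math.pi * k / 360 for k in range(360)])
print(max_err)  # ~0 up to floating-point error: the two descriptions coincide
```

So the God of Math really is indifferent here: the epicycle description is not wrong, just a basis we no longer find tasteful for general orbits.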
If biology, or some other subject area, is inherently, irredeemably hard to explain, and always will be, then I don't care about it much, because it doesn't mean very much. I care about explanations, not "reality" in the sense of every arbitrary muddle of knotted nerve fibers and confused flour beetles. If all the world's messy, inexplicable things were to gang up and cause us trouble such that we have to pay them attention, we can still ultimately deal with them in the ways that matter by using clarity and the things we can explain well.
I find it funny how people are so concerned that AI cannot innovate, that AI coding agents only give the most bland solutions to any problem etc. when the next step in OpenAI's 5 stages to AGI is literally called "Innovators".
Do you mean to say my current AI workflow doesn't involve secret agents running around Bond-style sabotaging those that'd impede my efforts to build a super secret RSS forwarder that pig-latinifies the text before sending it to my client?
My two step plan is to go to sleep and then wake up the next day and be a billionaire. Surely because that's my stated next step that means when I wake up tomorrow I'll be rich.
> I'd name some examples, but you wouldn't believe me.

I probably would not, and you would probably be wrong.
Taking away some complexity comes at a price, and for some people, it’s hard to see that it outweighs the practicality.
That would be pretty hopeless for launching satellites and the like.