Breaking news from famed machine learning researcher Ilya Sutskever:
Below is another, somewhat more technical summary of a just-released interview of his that is making waves. Basically, Sutskever is saying that scaling (achieving improvements in AI through more chips and more data) is flattening out and that we need new techniques; he is even open to neurosymbolic techniques and to innateness. He is clearly not forecasting a bright future for pure large language models.
Sutskever also said that “The thing which I think is the most fundamental is that these models somehow just generalize dramatically worse than people. And it’s super obvious. That seems like a very fundamental thing.”
Some of this may come as news to much of the machine learning community, and it might be surprising coming from Sutskever, who is an icon of deep learning, having worked, inter alia, on the critical 2012 paper that showed in practice how much GPUs could improve deep learning, the foundation of LLMs. He is also a co-founder of OpenAI, considered by many to have been its leading researcher until he departed after a failed effort to oust Sam Altman.
But none of what Sutskever said should actually come as a surprise, especially not to readers of this Substack, or to anyone who has followed me over the years. Essentially all of it was in my pre-GPT 2018 article “Deep Learning: A Critical Appraisal”, which argued for neurosymbolic approaches to complement neural networks (as Sutskever now does) and for more innate (i.e., built-in, rather than learned) constraints (what Sutskever calls “new inductive constraints”), and in my 2022 evaluation of LLMs, “Deep Learning Is Hitting a Wall”, which explicitly argued that the Kaplan scaling laws would eventually reach a point of diminishing returns (as Sutskever just did), and that problems with hallucinations, truth, generalization and reasoning would persist even as models scaled, much of which Sutskever has just acknowledged.
Subbarao Kambhampati, meanwhile, has been arguing for years about the limits of planning with LLMs. Emily Bender has been saying for ages that an excess focus on LLMs has been “sucking the oxygen from the room” relative to other research approaches. The unfairly dismissed Apple reasoning paper laid bare the generalization issues; another paper, “Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens”, put a further nail in the LLM reasoning and generalization coffin.
Again, none of this should come as a surprise. Alexia Jolicoeur-Martineau, a machine learning researcher at Samsung, summed the situation up well on X on Tuesday, following the release of Sutskever’s interview:
§
Of course, it ain’t over til it’s over. Maybe pure scaling (adding more data and compute without fundamental architectural changes) will yet somehow magically solve the problems that researchers such as Sutskever, LeCun, Sutton, Chollet and I no longer think it can.
And investors may be loath to kick the habit. As Phil Libin put it presciently last year, scaling, not the generation of new ideas, is what investors know best.
And it’s not just that venture capitalists know more about scaling businesses than about inventing new ideas. For the venture capitalists who have driven so much of the field, scaling, even if it fails, has been a great run: it has been a way to collect their 2% management fee while investing other people’s money in plausible-ish-sounding bets that were truly massive, which makes them rich no matter how things turn out. To be sure, the VCs get richer still if the investments pan out, but they are covered either way; even if it all falls apart, the venture capitalists themselves will become wealthy from the management fees alone. (It is their clients, such as pension funds, that will take the hit.) So venture capitalists may continue to support LLM mania, at least for a while.
But let’s suppose for the sake of argument that Sutskever and the rest of us are correct: that AGI will never emerge straight from LLMs, that to a certain extent they have run their course, and that we do in fact need new ideas.
The question then becomes: what did it cost the field and society that it took so long for the machine learning mainstream to figure out what some of us, including virtually the entire neurosymbolic AI community, had been saying for years?
§
The first and most obvious answer is money, which I estimate, back of the envelope, at (roughly) a trillion dollars, much of it spent on Nvidia chips and massive salaries. (Zuckerberg has apparently hired some machine learning experts at salaries of $100,000,000 a year.)
According to Ed Zitron’s calculations, “Big Tech Needs $2 Trillion In AI Revenue By 2030 or They Wasted Their Capex”. If Sutskever and I are right about the limits of LLMs, the only way to get to that $2T is to invent new ideas.
If the definition of insanity is doing the same thing over and over and expecting different results, trillion-dollar investments in ever more expensive experiments aiming to reach AGI may be delusional to the highest degree.
To a first approximation, all the big tech companies, from OpenAI to Google to Meta to xAI to Anthropic to several Chinese companies, keep doing the same experiment over and over: building ever larger LLMs in hopes of reaching AGI.
It has never worked. Each new, bigger, more expensive model ekes out measurable improvements, but returns appear to be diminishing (that is what Sutskever is saying about the Kaplan scaling laws), and none of these experiments has solved the core issues around hallucinations, generalization, planning, and reasoning, as Sutskever too now recognizes.
But it’s not just that a trillion dollars or more might go down the drain; there might also be considerable collateral damage to the rest of society, both economic and otherwise (e.g., in terms of how LLMs have undermined college education). As Rogé Karma put it in a recent article in The Atlantic, “The entire U.S. economy is being propped up by the promise of productivity gains that seem very far from materializing.”
To be fair, nobody knows for sure what the blast radius would be. If LLM-powered AI didn’t meet expectations and came to be valued less, who would take the hit? Would it just be the “limited partners,” like pension funds, who entrusted their money to VC firms? Or might the consequences be much broader? Might banks go down with the ship in a 2008-style liquidity crisis, possibly forcing taxpayers to bail them out? In the worst case, the impact of a deflated AI bubble could be immense. (Consumer spending, much of it fueled by wealthy people who could take a hit in the stock market, might also drop, a recipe for recession.)
Even the White House has acknowledged concerns about this. As the White House AI and Crypto Czar David Sacks himself put it earlier this week, referring to a Wall Street Journal analysis, “AI-related investment accounts for half of GDP growth. A reversal [in that] would risk recession.”
Quoting from Karma’s article in The Atlantic:
That prosperity [that GenAI was supposed to deliver] has largely yet to materialize anywhere other than their share prices. (The exception is Nvidia, which provides the crucial inputs—advanced chips—that the rest of the Magnificent Seven are buying.) As The Wall Street Journal reports, Alphabet, Amazon, Meta, and Microsoft have seen their free cash flow decline by 30 percent over the past two years. By one estimate, Meta, Amazon, Microsoft, Google, and Tesla will by the end of this year have collectively spent $560 billion on AI-related capital expenditures since the beginning of 2024 and have brought in just $35 billion in AI-related revenue. OpenAI and Anthropic are bringing in lots of revenue and are growing fast, but they are still nowhere near profitable. Their valuations—roughly $300 billion and $183 billion, respectively, and rising—are many multiples higher than their current revenues. (OpenAI projects about $13 billion in revenues this year; Anthropic, $2 billion to $4 billion.) Investors are betting heavily on the prospect that all of this spending will soon generate record-breaking profits. If that belief collapses, however, investors might start to sell en masse, causing the market to experience a large and painful correction.
…
The dot-com crash was bad, but it did not trigger a crisis. An AI-bubble crash could be different. AI-related investments have already surpassed the level that telecom hit at the peak of the dot-com boom as a share of the economy. In the first half of this year, business spending on AI added more to GDP growth than all consumer spending combined. Many experts believe that a major reason the U.S. economy has been able to weather tariffs and mass deportations without a recession is because all of this AI spending is acting, in the words of one economist, as a “massive private sector stimulus program.” An AI crash could lead broadly to less spending, fewer jobs, and slower growth, potentially dragging the economy into a recession. The economist Noah Smith argues that it could even lead to a financial crisis if the unregulated “private credit” loans funding much of the industry’s expansion all go bust at once.
The whole thing looks incredibly fragile.
§
To put it bluntly, the world has gone “all in” on LLMs, but, as Sutskever’s interview highlights, there are many reasons to doubt that LLMs will ever deliver the rewards that many people expected.
The sad part is that most of the reasons have been known – though not widely accepted – for a very long time. It all could have been avoided. But the machine learning community has arrogantly excluded other voices, and indeed whole other fields like the cognitive sciences. And we may all now be about to pay the price.
An old saying about such follies is that “six months in the lab can save you an afternoon in the library”; here we may have wasted a trillion dollars and several years to rediscover what cognitive science already knew.
A trillion dollars is a terrible amount of money to have perhaps wasted. If the blast radius is wider, it could be a lot more. It is all starting to feel like a tale straight out of Greek tragedy, an avoidable mixture of arrogance and power that just might wind up taking down the economy.