“…all these [LLM] copies can learn separately but share their knowledge instantly. So it’s as if you had 10,000 people and whenever one person learnt something, everybody automatically knew it. And that’s how these chatbots can know so much more than any one person.”
As you’ve no doubt heard, there was a potentially historic revelation last week. Apparently, Google was sitting on it for a month before it leaked, but Googler Luke Sernau’s brilliant essay, “We have no moat, and neither does OpenAI”, spells out a radical change that appears to be in progress.
He’s sounding the alarm on the performance trajectory of the OSS LLaMA virus strain, shown here in this diagram.
Before last week’s news, we already knew that Google’s Transformer architecture, which is the basis for all LLMs, is truly an invention for the ages. It’s a simple mathematical tool, much like the Fourier transform, except instead of picking out frequencies from noise, it picks out meaning from language.
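To make that "simple mathematical tool" claim concrete: the core operation inside a Transformer is scaled dot-product attention. Here's a minimal, illustrative sketch (single head, no learned projections — a toy, not a real implementation):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each output row is a weighted
    blend of the value rows, weighted by query/key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how much each query "resonates" with each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # mix the values according to those weights

# Toy example: 3 tokens, 4-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
out = attention(X, X, X)  # self-attention: every token attends to every other
print(out.shape)          # (3, 4)
```

Like the Fourier transform, it's just a handful of matrix operations — the magic is entirely in what emerges when you stack and scale it.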
When you make Transformers big enough, in so-called “Trillion-parameter space”, they begin to develop surprising higher-order capabilities, such as being able to visualize concepts, perform multi-step reasoning, and demonstrate a theory of mind.
St. Andrej Karpathy, patron saint of LLMs, has even tweeted about how some computation happens outside the normal hidden-state layers, triggered simply by starting a prompt with “Let’s think step by step”.
And given the way SaaS has evolved this year, the assumption has been that this will be a showdown among a handful of players, an oligopoly like cable or utilities: OpenAI/GPT, Google/Bard, Meta/LLaMA, Anthropic/Claude, Musk/Kampf, and maybe a few others.
But Transformers turn out to have another capability: they can also learn from each other, via a set of new DLCs dropped by modders. The biggest mod (Sernau highlights many of them in his leaked essay, but this one is the doozy) is Low-Rank Adaptation (LoRA).
LoRA makes LLMs composable, piecewise, mathematically, so that if there are 10,000 LLMs in the wild, they will all eventually converge on having the same knowledge.
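The idea behind LoRA is easy to sketch: leave the giant pretrained weight matrix frozen, and train only a tiny low-rank correction that rides alongside it. The sketch below is illustrative numpy, not anyone's actual training code; the names (`W`, `A`, `B`, `r`) follow the LoRA paper's notation:

```python
import numpy as np

d, r = 1024, 8  # model dimension, LoRA rank (r << d)
rng = np.random.default_rng(42)

W = rng.standard_normal((d, d))        # frozen pretrained weight matrix
A = rng.standard_normal((r, d)) * 0.01 # trainable low-rank factor
B = np.zeros((d, r))                   # B starts at zero, so the update begins as a no-op

def forward(x, scale=1.0):
    # Full fine-tuning would rewrite W; LoRA leaves W frozen and
    # adds the rank-r correction (B @ A) on the side.
    return x @ (W + scale * (B @ A)).T

full_params = W.size           # 1,048,576 params if you tuned W directly
lora_params = A.size + B.size  # 16,384 — a tiny, shippable "patch"
print(full_params // lora_params)  # 64
```

Because the whole update is just an additive matrix, an adapter can be merged straight into the base weights (`W += B @ A`) or swapped in and out piecewise — which is exactly what makes the mods composable and shareable.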
For a while, GPT itself was a moat: it made ChatGPT really hard to compete with, and only a few companies managed it.
Right around ten weeks ago, a chain of events kicked off a roughly thousandfold reduction in LLM training and serving costs. In Moore’s Law terms, with a doubling/halving happening every 24 months, that’s ten doublings, or 20 years of progress, that just took place in the last 10 weeks.
We are now moving over 100 times faster along the exponential technology curve than we were just 15–20 years ago.
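The back-of-the-envelope arithmetic, assuming the canonical 24-month doubling period:

```python
# Moore's-Law envelope math, assuming one doubling every 24 months.
years = 20
doubling_period_years = 2

doublings = years / doubling_period_years  # 10 doublings in 20 years
cost_factor = 2 ** doublings               # ~1024x: about three orders of magnitude

weeks_elapsed = 10                         # the time it actually took
speedup = (years * 52) / weeks_elapsed     # 104: "over 100 times faster"

print(doublings, cost_factor, speedup)     # 10.0 1024.0 104.0
```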
So nothing really changed in February, except that now every tinkerer on earth with a GPU laptop and PyTorch suddenly knew how the ChatGPT sausage was made.
Meta, by Sernau’s reasoning, came out the clear winner among the Big Fish. Why? Because LLaMA and all its derivative strains are Meta’s architecture: now that everyone is using LLaMA, Meta is the company best positioned to scale up OSS LLMs and absorb every OSS improvement along the way.
The upshot for the industry at large is: the LLM-as-Moat model has begun to disappear, and may even be gone by the end of this year. “We have no moat, and neither does OpenAI” was an adieu to the LLM moat at the center of a SaaS ecosystem. AI is being commoditized practically overnight.
It’s sort of like the decades-long gradual miniaturization of computers from mainframes to embedded chips that run full OSes. Except it happened in ten weeks.
If you’re relying on LLMs for your moat, well… I hope you also have a data moat. You’re going to need it.
More Stuff I Like