Ezra Klein interviews Demis Hassabis, head of Google DeepMind & architect behind AlphaFold.
"GPT-3, had been around for a while... What ChatGPT did was it allowed you to talk to GPT-3 like you ... and it was a human... created this huge land rush for A.I.s that functionally mimic human beings".
But ask: "what if, instead of building A.I. systems that mimic humans, we built those systems to solve some of the most vexing problems facing humanity... clean energy, and drug discovery... create an era of innovation like nothing humanity has ever experienced."
One of very few examples: Google DeepMind's AlphaFold, unveiled in 2020, "uses deep learning to solve ... protein-folding problem ... essential for addressing ... challenges, from vaccine and drug development to curing genetic diseases... By 2022, the system had identified 200 million protein shapes, nearly all the proteins known to humans. DeepMind is also currently building similar systems to accelerate nuclear fusion".
"describe the difference between some of the more logic-based, or rules-based, or knowledge-encoded systems and deep learning: ...
think of A.I. as being this overarching field of making machines smart... there are... two approaches:
Wrote primitive A.I. for games as a teenager, got his computer science degree, "ran my own games company ... went back to academia to do a Ph.D. in ... cognitive neuroscience ... I’ve always been fascinated by the brain". Also, he wanted to understand the big questions, but instead of doing philosophy or physics "I thought building A.I. would be the fastest route... I still love physics... [but] these incredible people like Feynman and Weinberg... I wasn’t that convinced they’d got that far... perhaps ... we need ... extra intellectual horsepower... that’s when I decided I was going to work on A.I".
Instead of "working on robotics and embodied intelligence ... fiddling around with the hardware ... better to work in simulation:
"the key breakthrough of ... deep reinforcement learning, combining neural networks with the reward-seeking algorithms, is that we played these games directly from the pixels ... not tell the system anything ... It had to figure out for itself ... just like a human". These are general systems - the G in A.G.I. - because they can learn and play all the games: "those two elements, the generality and the learning, are the key differences... [they] learn for themselves, so you just give it the high-level objective" and "the digital equivalent of a doggy treat" whenever it scores a point.
However, it "still may not know it’s playing Pong... you’re ... coaxing the system... you have a little bit less direct control", hence the alignment problem: "sometimes you may care not just about the overall outcome. You may care about how it got there... you have to add extra constraints ... [hence] reinforcement learning with human feedback".
"deep learning and reinforcement learning ... very complementary:
Combine them into "deep reinforcement learning: the model... [plus] reward-seeking planning system on top that uses that model to reach its objectives."
"AlphaGo ... created new strategies ... never seen before, even though we’ve played Go for ... a couple of thousand years... using human games as training data ... a pretty general system, but ... some specific things about Go."
Then they removed those specific things to create AlphaZero, which can "play any two-player game ... Go, or chess, or backgammon..." without human data. It starts as a blank slate and "plays itself millions and millions of times, different versions of itself ... learns from its own data and experience... starts literally from random and explores the space... comes up with incredible new strategies not constrained by what humans have played in the past".
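A minimal sketch of the self-play idea - tabular value learning on the game of Nim rather than Go, with no neural network and no tree search, so everything here is a toy stand-in for AlphaZero, not its actual algorithm. It still "starts literally from random", plays itself, and rediscovers the classic winning rule (always leave your opponent a multiple of 4 stones):

```python
import random

random.seed(0)                            # reproducible toy run

# Nim: 21 stones, take 1-3 per turn, taking the last stone wins.
Q = {}                                    # (stones_left, take) -> value estimate
alpha, eps = 0.3, 0.2                     # learning rate, exploration rate

def best(stones):
    moves = [t for t in (1, 2, 3) if t <= stones]
    return max(moves, key=lambda t: Q.get((stones, t), 0.0))

for game in range(20000):
    stones, history = 21, []
    while stones > 0:
        legal = [t for t in (1, 2, 3) if t <= stones]
        take = random.choice(legal) if random.random() < eps else best(stones)
        history.append((stones, take))
        stones -= take
    # Whoever took the last stone won; walk back up the game, alternating the
    # outcome sign between the two players, and update each move's value.
    reward = 1.0
    for state, move in reversed(history):
        old = Q.get((state, move), 0.0)
        Q[(state, move)] = old + alpha * (reward - old)
        reward = -reward

# The learned policy rediscovers the classic rule: leave a multiple of 4.
print(best(5))                            # takes 1, leaving 4
```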
Hence it's wrong to say that AI is "constrained by what we know". We humans are - "we’re prone to faddishness... follow in the footsteps ... taught by others ... everybody who’s done it before did it this way" - but AI "was being held back by what we knew": training it on our data "means huge swaths of useful strategy information, ideas have been cut off the board because we just don’t do that."
Exploring science, AI could "help us discover new knowledge but also inspire the human experts to explore more as well in tandem."
How to tell if the system got it right? "one of the hardest things with learning systems ... is formulating the right objectives ... a simple-to-optimize objective function". They used a database of 100,000-150,000 proteins, developed over 50 years, "as a training corpus... [and to] test our predictions".
But with 100-200 million proteins out there, that's not a lot of training data. So "the system begins training itself on its predictions... basically inbreeding the A.I.". Does that risk model collapse (cf. "The Curse of Recursion")? Their method:
(*) A confidence score is quite rare in these AIs. AlphaFold colour-coded its confidence onto its predicted 3D structures so that biologists knowing nothing of AI could "understand which parts of the prediction could they trust".
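The colour-coding can be sketched like this - the 90/70/50 bands match AlphaFold's published pLDDT colour scheme, but the scores and output format here are invented:

```python
# Bucket per-residue confidence scores (pLDDT, on a 0-100 scale) into the
# colour bands AlphaFold's structure viewer uses, so a biologist can see at
# a glance which parts of a prediction to trust. The scores are invented.
BANDS = [
    (90, "very high (dark blue)"),
    (70, "confident (light blue)"),
    (50, "low (yellow)"),
    (0,  "very low (orange)"),
]

def band(plddt):
    for threshold, label in BANDS:
        if plddt >= threshold:
            return label

fake_scores = [96.2, 88.1, 71.5, 43.0]    # hypothetical per-residue pLDDT
for i, score in enumerate(fake_scores, 1):
    print(f"residue {i}: pLDDT {score:5.1f} -> {band(score)}")
```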
ChatGPT et al "don’t know what they’re doing in the same way that your system didn’t know it was playing Pong. They just know that on the internet... this is the word that would be most likely to come next".
If they had a confidence score, maybe they'd simply say "I don't know" at some point, or at least signal caveats. Moreover, a low confidence score could prompt them to doublecheck their own predictions. Even better, the user could set the confidence threshold.
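A sketch of that idea, assuming a chat model that exposed per-token probabilities; the tokens, probabilities, and threshold below are all invented, and real model APIs differ:

```python
import math

# A wrapper that abstains ("I don't know") below a user-chosen confidence
# threshold instead of answering anyway.
def answer_with_confidence(tokens, probs, threshold=0.5):
    # Geometric mean of token probabilities: a crude whole-answer confidence.
    confidence = math.exp(sum(math.log(p) for p in probs) / len(probs))
    if confidence < threshold:
        return confidence, "I don't know."
    return confidence, "".join(tokens)

print(answer_with_confidence(["Par", "is"], [0.98, 0.95]))  # confident -> answers
print(answer_with_confidence(["42"], [0.12]))               # abstains
```

The `threshold` parameter is the user-set dial suggested above; a low score could equally well trigger a second, double-checking pass rather than an outright refusal.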
Scraping the internet doesn't capture that sense of correctness: "Reddit is not about if you get the right answer. People are just talking... language doesn’t... have an input and an output, where you can see that some outputs were correct and some outputs weren’t".
AlphaFold, in contrast, had training data which tells it what's right and what's not, and that feedback loop can be automated.
"With language and ... human knowledge, it’s much more nuanced ... subjective... you need human feedback ... why everyone’s reinforcement learning with human feedback to train these systems... a noisy process and very time consuming", difficult to automate.
AlphaFold's already released the proteome of humans, "all the important research organisms... some important crops ... as a free-access database", so that's not the business model. But "knowing the structure of proteins [is] only one small bit of the whole drug discovery process": you also need to identify the proteins to target, and can even design the molecule that binds to it and nothing else: "A.I. is the perfect tool to accelerate the time scales", from big pharma's 5-6 years of incredibly slow and expensive experiments to an order of magnitude less.
Isomorphic will therefore venture into "chemistry... designing small molecules, predicting their properties", minimising toxicity.
"from a certain perspective, the stock market has the structure of a game... There’s a lot of training data out there". You could it "as a time series of numbers... predict the next numbers... analogous to predicting the next word... [but] probably a bit more complex [as] ... those numbers describe real things... real people running those companies, and having ideas... also macroeconomic forces... to understand the full context ... you’d somehow have to encapsulate all of that knowledge".
Perhaps that's true to "fully win the game, but ... local strategies could be very profitable and/or very destructive... short this company, destroying this competitor... very weird strategies by a system that has the power to move money around and a lot of data in it and is just getting reinforcement learning from making money." On the other hand, the stock market is already dominated by top-end algorithms, so "it’s not clear that a more general learning system would be better".
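The naive "time series of numbers" framing can be sketched as a one-lag linear predictor - nothing like a real trading model, and the price series is invented:

```python
# Treat an (invented) price series as "predict the next number": fit
# p[t+1] ~ slope * p[t] + intercept by ordinary least squares.
prices = [100.0, 101.0, 102.1, 103.1, 104.2, 105.2, 106.3]

xs, ys = prices[:-1], prices[1:]          # pairs (p[t], p[t+1])
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

next_price = slope * prices[-1] + intercept
print(round(next_price, 2))               # a little above the last price
```

This captures only the surface pattern in the numbers; the quote's point is that the numbers describe companies, people, and macroeconomic forces the model never sees.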
On the one hand there are general LLMs, on the other very application-specific AIs like AlphaFold, which "can do something amazing in the protein space, and it cannot help me write a college essay".
One theory: many "different specialized systems ... tuned to do different things... legal contracts ... proteins ... radiology results".
Another: "GPT-12, or whatever ... attains a kind of general intelligence ... can do everything" - this is Deepmind's actual final goal: "the way the brain works... one system... can do many things". But we don't have to wait for AGI before we can benefit from domain-specific AI.
Moreover, we can combine them: "increasingly more powerful general system that you basically interact with through language but has other capabilities... math and coding, perhaps some reasoning and planning... [and] can use tools", from calculators to Photoshop to other AIs. Eventually those external tools' abilities "will be folded back into the general system".
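The tool-use pattern can be sketched as a dispatcher; the routing rule and the stubbed model below are invented for illustration:

```python
# A toy dispatcher standing in for "a general system that can use tools":
# arithmetic goes to a calculator tool, everything else to a stubbed model.
def calculator(expr):
    # A real system would sandbox this; eval on untrusted input is unsafe.
    return str(eval(expr, {"__builtins__": {}}))

def language_model(prompt):
    return f"[stubbed model answer to: {prompt!r}]"

TOOLS = {"calc": calculator}

def respond(prompt):
    # Crude intent detection standing in for the model's own tool choice.
    if any(c.isdigit() for c in prompt) and any(op in prompt for op in "+-*/"):
        return TOOLS["calc"](prompt)
    return language_model(prompt)

print(respond("12*7"))                    # routed to the calculator tool
print(respond("why is the sky blue?"))    # falls through to the model stub
```

"Folding the tools back in" would mean the general model eventually answering both prompts itself, with no dispatcher.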
Between here and there we'll need both more data and processors, as well as "a handful of innovations [currently] missing ... factuality, robustness... planning and reasoning and memory".
More Stuff I Like