Curated Resource

Opinion | A.I. Could Solve Some of Humanity’s Hardest Problems. It Already Has

my notes

Ezra Klein interviews Demis Hassabis, head of Google DeepMind & architect behind AlphaFold.

"GPT-3 had been around for a while... What ChatGPT did was it allowed you to talk to GPT-3 like... it was a human... [it] created this huge land rush for A.I.s that functionally mimic human beings".

But ask: "what if, instead of building A.I. systems that mimic humans, we built those systems to solve some of the most vexing problems facing humanity... clean energy, and drug discovery... create an era of innovation like nothing humanity has ever experienced."

One of very few examples: Google DeepMind's AlphaFold, unveiled in 2020, "uses deep learning to solve ... protein-folding problem ... essential for addressing ... challenges, from vaccine and drug development to curing genetic diseases... By 2022, the system had identified 200 million protein shapes, nearly all the proteins known to humans. DeepMind is also currently building similar systems to accelerate nuclear fusion".

Some definitions of AI

"describe the difference between some of the more logic-based, or rules-based, or knowledge-encoded systems and deep learning: ...

think of A.I. as being this overarching field of making machines smart... there are... two approaches:

  • logic systems or expert systems: "programmers ... effectively solve the problem, be that playing chess ... program up these routines and heuristics... very brittle... can’t deal with the unexpected ... can’t learn anything new."
  • machine learning systems "learn for themselves... the structure, ... heuristics and rules ... directly from data or experience", e.g. AlphaGo

Hassabis' path

Writing primitive AI-based games as a teenager, he studied computer science, "ran my own games company ... went back to academia to do a Ph.D. in ... cognitive neuroscience ... I’ve always been fascinated by the brain". Also, he wanted to understand the big questions, but instead of doing philosophy or physics "I thought building A.I. would be the fastest route... I still love physics... [but] these incredible people like Feynman and Weinberg... I wasn’t that convinced they’d got that far... perhaps ... we need ... extra intellectual horsepower... that’s when I decided I was going to work on A.I.".

G is for games, and generality

Instead of "working on robotics and embodied intelligence ... fiddling around with the hardware", it was "better to work in simulation:

  • run millions of experiments in the cloud ... much faster learning rates
  • games are built to be challenging to humans... top human players a great benchmark
  • clear objectives... win the game ... very useful if you want to train reinforcement learning systems... reward seeking and goal directed
  • you can go up the ladder of complexity" from Pong to StarCraft.

"the key breakthrough of ... deep reinforcement learning, combining neural networks with the reward-seeking algorithms, is that we played these games directly from the pixels ... not tell the system anything ... It had to figure out for itself ... just like a human". These are general systems - the G in A.G.I. - because they can learn and play all the games: "those two elements, the generality and the learning, are the key differences... [they] learn for themselves, so you just give it the high-level objective" and "the digital equivalent of a doggy treat" whenever it scores a point.
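The reward-seeking, goal-directed learning described above can be sketched with tabular Q-learning on a toy "game" — a five-cell corridor where the only feedback is the "digital doggy treat" for reaching the last cell. This is a minimal illustration of the idea, not DeepMind's actual pixel-based deep RL:

```python
import random

# Toy stand-in for learning from reward alone: a 5-cell corridor.
# The agent only ever sees its cell index and a reward of +1 (the
# "digital doggy treat") for reaching cell 4 - no rules are encoded.
N_STATES, ACTIONS = 5, (-1, +1)  # move left / move right

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # value of (state, action)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy: mostly exploit, sometimes explore
            a = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda i: q[s][i])
            s2, r, done = step(s, ACTIONS[a])
            # Q-learning update: nudge toward reward + discounted future value
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
# The greedy policy it discovers: always head right, toward the treat.
policy = [max((0, 1), key=lambda i: q[s][i]) for s in range(N_STATES)]
```

AlphaGo-scale systems replace the table with deep neural networks, but the reward-driven update is the same idea.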

Alignment problem

However, it "still may not know it’s playing Pong... you’re ... coaxing the system... you have a little bit less direct control", hence the alignment problem: "sometimes you may care not just about the overall outcome. You may care about how it got there... you have to add extra constraints ... [hence] reinforcement learning with human feedback".

Reinforcement learning

"deep learning and reinforcement learning ... very complementary:

  • deep learning ... really complex stacks of neural networks ... loosely modeled on brain neural networks ... learn the statistics of the environment ... or data stream they’re given ... building a model
  • reinforcement learning ... does the planning and the reward learning ... a reward-seeking system ... humans learn with reinforcement learning."

Combine them into "deep reinforcement learning: the model... [plus] reward-seeking planning system on top that uses that model to reach its objectives."
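A minimal sketch of how the two halves combine: a model of the environment (here just a table learned from logged transitions, standing in for deep networks) plus a reward-seeking planner that searches over that model. The 3-cell world and all names are illustrative, not anything DeepMind shipped:

```python
import itertools

GOAL = 2  # reward lives at cell 2 of a 3-cell line world

def env_step(state, action):            # ground truth, hidden from the planner
    nxt = max(0, min(2, state + action))
    return nxt, 1.0 if nxt == GOAL else 0.0

# 1. The "deep learning" half: learn a model purely from logged experience
#    (here, a table of observed (next_state, reward) outcomes).
model = {}
for s, a in itertools.product(range(3), (-1, +1)):
    model[(s, a)] = env_step(s, a)

# 2. The "reinforcement learning" half: a planner that uses the learned
#    model, picking the action sequence with the highest total reward.
def plan(state, depth=3):
    if depth == 0:
        return 0.0, None
    best = (float("-inf"), None)
    for a in (-1, +1):
        nxt, r = model[(state, a)]
        future, _ = plan(nxt, depth - 1)
        best = max(best, (r + future, a))
    return best

value, action = plan(0)  # from cell 0: best total reward and first action
```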

From AlphaGo to AlphaZero: creative AI

"AlphaGo ... created new strategies ... never seen before, even though we’ve played Go for ... a couple of thousand years... using human games as training data ... a pretty general system, but ... some specific things about Go."

Then they removed those specific things to create AlphaZero, which can "play any two-player game ... Go, or chess, or backgammon..." without human data. It starts as a blank slate and "plays itself millions and millions of times, different versions of itself ... learns from its own data and experience... starts literally from random and explores the space... comes up with incredible new strategies not constrained by what humans have played in the past".

Hence it's wrong to say that AI is "constrained by what we know". We humans are - "we’re prone to faddishness... follow in the footsteps ... taught by others ... everybody who’s done it before did it this way" - but AI "was being held back by what we knew": training it on our data "means huge swaths of useful strategy information, ideas have been cut off the board because we just don’t do that."

In science, AI could "help us discover new knowledge but also inspire the human experts to explore more as well in tandem."


AlphaFold's training data

How to tell if the system got it right? "one of the hardest things with learning systems ... is formulating the right objectives ... a simple-to-optimize objective function". They used a database of 100,000-150,000 proteins, developed over 50 years, "as a training corpus... [and to] test our predictions".

But with 100-200 million proteins out there, that's not a lot of training data. So "the system begins training itself on its predictions... basically inbreeding the A.I.". Does that risk model collapse (cf. "The Curse of Recursion")? Their method:

  • build a first version of AlphaFold on the training data
  • that version was "just about good enough" to generate "about 1 million predictions of new proteins"
  • got it to assess itself, how confident it was on those predictions (*)
  • "[put the] best 30-35%, around 300,000 predictions... back in the training set along with the real data"
  • that gave them "about half a million structures... to train the final system ... good enough to reach this atomic accuracy threshold"
  • there were also "lots of very good independent tests of how good these predictions" were, as biologists added new data to the database after the training cutoff date.

(*) A confidence score is quite rare in these AIs. AlphaFold colour-codes its confidence onto its predicted 3D structures, so biologists knowing nothing of AI can "understand which parts of the prediction could they trust".
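The bootstrapping loop above can be sketched schematically. Everything here is a stand-in — `train_model` and its pseudo-confidence are hypothetical stubs, not AlphaFold's real interfaces; only the data flow (train v1, predict, keep the most confident ~30%, retrain on the mix) follows the notes:

```python
def train_model(examples):
    # Stub for real training: returns a "model" that attaches a
    # deterministic pseudo-confidence in [0, 1) to each target.
    def predict(target):
        return {"target": target, "confidence": (target * 37 % 100) / 100.0}
    return predict

def bootstrap(real_structures, new_targets, keep_fraction=0.3):
    v1 = train_model(real_structures)                 # first "good enough" model
    preds = [v1(t) for t in new_targets]              # ~1M predictions in real life
    preds.sort(key=lambda p: p["confidence"], reverse=True)
    kept = preds[: int(len(preds) * keep_fraction)]   # keep only the best ~30%
    corpus = list(real_structures) + kept             # mix with the real data
    return train_model(corpus), len(corpus)

final_model, corpus_size = bootstrap(range(150), range(1000))
```

With the real numbers from the notes (~150,000 real structures plus the best ~300,000 of 1 million predictions), the same flow yields the "about half a million structures" used to train the final system.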

Why doesn't it hallucinate?

ChatGPT et al "don’t know what they’re doing in the same way that your system didn’t know it was playing Pong. They just know that on the internet... this is the word that would be most likely to come next".

If they had a confidence score, maybe they'd simply say "I don't know" at some point, or at least signal caveats. Moreover, a low confidence score could prompt them to double-check their own predictions. Even better, the user could set the confidence threshold.
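A sketch of how such a threshold might work. The model interface and the numbers are hypothetical — no current chat model exposes a calibrated per-answer confidence like this:

```python
def answer_with_caveats(question, model, threshold=0.7):
    # `model` is assumed to return (answer_text, confidence in [0, 1]).
    text, confidence = model(question)
    if confidence >= threshold:
        return text                                   # confident: answer plainly
    if confidence >= threshold / 2:                   # shaky: answer with a caveat
        return f"{text} (low confidence: {confidence:.0%} - please verify)"
    return "I don't know."                            # below the floor: admit it

# Toy model for demonstration only:
fake_model = lambda q: ("42", 0.9) if "life" in q else ("not sure", 0.2)
```

The `threshold` parameter is the user-settable knob suggested above; a production system could also use the low-confidence branch to trigger a self-checking pass.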

Scraping the internet doesn't capture that sense of correctness: "Reddit is not about if you get the right answer. People are just talking... language doesn’t... have an input and an output, where you can see that some outputs were correct and some outputs weren’t".

AlphaFold, in contrast, had training data which tells it what's right and what's not, and that feedback loop can be automated.

"With language and ... human knowledge, it’s much more nuanced ... subjective... you need human feedback ... why everyone’s reinforcement learning with human feedback to train these systems... a noisy process and very time consuming", difficult to automate.

Isomorphic: the spinoff

AlphaFold has already released the proteome of humans, "all the important research organisms... some important crops ... as a free-access database", so that's not the business model. But "knowing the structure of proteins [is] only one small bit of the whole drug discovery process": you also need to identify the proteins to target, and can even design the molecule that binds to it and nothing else: "A.I. is the perfect tool to accelerate the time scales", from big pharma's 5-6 years of incredibly slow and expensive experiments to an order of magnitude less.

Isomorphic will therefore venture into "chemistry... designing small molecules, predicting their properties", minimising toxicity.

Bring it to (stock) market

"from a certain perspective, the stock market has the structure of a game... There’s a lot of training data out there". You could treat it "as a time series of numbers... predict the next numbers... analogous to predicting the next word... [but] probably a bit more complex [as] ... those numbers describe real things... real people running those companies, and having ideas... also macroeconomic forces... to understand the full context ... you’d somehow have to encapsulate all of that knowledge".

Perhaps that's true to "fully win the game, but ... local strategies could be very profitable and/or very destructive... short this company, destroying this competitor... very weird strategies by a system that has the power to move money around and a lot of data in it and is just getting reinforcement learning from making money." On the other hand the stock market is already dominated by top-end algorithms so "it’s not clear that a more general learning system would be better".
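The "time series of numbers ... analogous to predicting the next word" framing can be sketched as an n-gram model over discretized price moves — purely illustrative, with made-up data, and no claim that this resembles any real trading system:

```python
from collections import Counter, defaultdict

def moves(prices):
    # "Tokenize" a price series into up / down / flat symbols.
    return ["U" if b > a else "D" if b < a else "F" for a, b in zip(prices, prices[1:])]

def fit(prices, context=2):
    # Count which move follows each context of recent moves,
    # exactly like a next-word n-gram table.
    table = defaultdict(Counter)
    toks = moves(prices)
    for i in range(len(toks) - context):
        table[tuple(toks[i:i + context])][toks[i + context]] += 1
    return table

def predict_next(table, recent, context=2):
    ctx = tuple(moves(recent)[-context:])
    dist = table.get(ctx)
    return dist.most_common(1)[0][0] if dist else "F"  # unseen context: guess flat

history = [10, 11, 12, 11, 12, 13, 12, 13, 14, 13]  # toy price series
table = fit(history)
```

Hassabis' caveat applies directly: the tokens here carry none of the real-world context (companies, people, macroeconomics) that the numbers actually describe.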

Future evolutions

On the one hand there are general LLMs, on the other very application-specific AIs like AlphaFold, which "can do something amazing in the protein space, and it cannot help me write a college essay".

One theory: many "different specialized systems ... tuned to do different things... legal contracts ... proteins ... radiology results".

Another: "GPT-12, or whatever ... attains a kind of general intelligence ... can do everything" - this is DeepMind's actual final goal: "the way the brain works... one system... can do many things". But we don't have to wait for A.G.I. before we can benefit from domain-specific AI.

Moreover, we can combine them: "increasingly more powerful general system that you basically interact with through language but has other capabilities... math and coding, perhaps some reasoning and planning... [and] can use tools", from calculators to Photoshop to other AIs. Eventually those external tools' abilities "will be folded back into the general system".
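The tool-using pattern — a general system routing sub-tasks to specialists — can be sketched as a simple keyword dispatcher. The names and the routing rule are hypothetical; real tool-use APIs are far richer:

```python
def route(request, tools, fallback):
    # Send the request to the first matching specialist tool,
    # otherwise let the general language model handle it.
    for keyword, tool in tools.items():
        if keyword in request:
            return tool(request)
    return fallback(request)

# Toy specialists: a calculator and a stand-in "protein" tool.
tools = {
    "fold": lambda r: "structure prediction (AlphaFold-style specialist)",
    "sum": lambda r: str(sum(int(x) for x in r.split() if x.isdigit())),
}
```

Hassabis' point is that abilities first proven out in such external tools eventually get "folded back into the general system".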

Between here and there we'll need both more data and processors, as well as "a handful of innovations [currently] missing ... factuality, robustness... planning and reasoning and memory".
