Curated Resource

What Do Large Language Models “Understand”?

My notes

"When we attribute human-like abilities to LLMs, we fall into an anthropomorphic bias ... But are we also showing an anthropocentric bias by failing to recognize" what they can do?

They don't have memories, although we can remind them through prompt engineering (summarise conversation, include summary in next prompt). They are goal-free and "can’t act in the physical world", although again clever prompting could make it look like they can.
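A minimal sketch of that summarise-and-re-prompt workaround, assuming a hypothetical call_llm() wrapper around whatever chat API is in use (the post itself provides no code):

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real chat-completion API call."""
    raise NotImplementedError("wire this up to your LLM provider")

def chat_with_memory(user_messages: list[str]) -> list[str]:
    summary = ""  # the only "memory" carried between turns
    replies = []
    for message in user_messages:
        # Prepend the running summary so the stateless model appears to remember.
        prompt = (
            f"Summary of the conversation so far: {summary}\n"
            f"User: {message}\n"
            "Assistant:"
        )
        reply = call_llm(prompt)
        replies.append(reply)
        # Ask the model to compress everything into an updated summary for the next turn.
        summary = call_llm(
            "Summarise this conversation in a few sentences:\n"
            f"{summary}\nUser: {message}\nAssistant: {reply}"
        )
    return replies
```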

Yet they do seem "to understand what we want. But to what extent do LLMs truly “understand”?"

Does a model understand?

The author uses a couple of excellent metaphors to explain why the "M" in LLM stands for model, starting with weather modelling:

  • a weather model divides the atmosphere into patches with physical "attributes like humidity, temperature, and air pressure", and can forecast weather over time steps
  • as "the time-steps get shorter and the patches smaller the model" becomes a closer representation, or model, of the actual world, and "may learn to very accurately predict... the emergence of cyclones ... where air is warm, moist, and of low pressure"
  • "But it’s not a simulation of the physics of Earth’s weather any more than an LLM is a simulation of brain activity",

If weather models "capture the statistics of the atmospheric conditions that generate the weather", LLMs are "a statistical model of text" which, as they predict the next word, must be capturing something about the human brain processes which generate text. Is it understanding?
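To make the "statistical model of text" idea concrete, here is a toy sketch (not from the post) of next-word prediction using nothing but bigram counts; real LLMs use deep neural networks, but they too are trained on the same signal of predicting the next token.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict[str, Counter]:
    """Count which word follows which: the crudest possible statistical model of text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts: dict[str, Counter], word: str) -> str | None:
    """Return the most frequently observed next word, if this word was ever seen."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

model = train_bigram(
    "warm moist air at low pressure can form a cyclone "
    "and warm moist air rises quickly"
)
print(predict_next(model, "moist"))  # -> 'air'
```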

The author then compares how LLMs are trained with how children learn understanding and language. Unlike LLMs, children are embodied: “Climbing Towards NLU” (Bender & Koller, 2020) argues that true natural language understanding (NLU) requires grounding in the real world.

What is understanding, anyway? "Ludwig Wittgenstein suggested that understanding is context-dependent and is shown through intelligent behaviour rather than mere possession of knowledge".

LLMs work like "Philosopher John Searle’s “Chinese Room” experiment", where someone converses with a room containing "detailed instructions on how to respond to someone writing in Chinese" and a person inside who uses those instructions to respond in Chinese to notes slid under the door. The room's builder may understand Chinese but isn't present, and the person inside doesn't understand Chinese at all, so who is having the conversation? The room?

For humans "understanding comes with a subjective conscious experience of understanding. But it’s easy to see that this experience can be deceiving... Measuring ‘understanding’ is not straightforward": there are levels of understanding, as shown in problem-solving experiments measuring the understanding of animals. "Do we have that same level of understanding of LLMs to conduct similar experiments?"

How has the understanding of LLMs been measured to date?

  • GPT-3 era: "phrase questions in different ways and find the failure modes ... [which] indicate that no real “understanding” is happening but rather just pattern matching" (an approach sketched below).
  • ChatGPT era: "LLMs have strong abilities including comprehension and reasoning, but ... limited abilities on abstract reasoning and ... confusion or errors in complex contexts”. Even the most powerful multimodal LLMs, unifying language and images, show "mediocre performance".
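A sketch of that rephrase-and-probe style of testing, assuming a hypothetical ask_model() wrapper; the paraphrases below are illustrative, not taken from the evaluations cited in the post:

```python
def ask_model(question: str) -> str:
    """Hypothetical stand-in for an LLM API call."""
    raise NotImplementedError

def probe_consistency(paraphrases: list[str]) -> dict[str, str]:
    """Ask the same underlying question in different surface forms and compare answers."""
    answers = {q: ask_model(q).strip().lower() for q in paraphrases}
    if len(set(answers.values())) > 1:
        # Divergent answers to equivalent questions are the failure mode these tests look for.
        print("Inconsistent answers: pattern matching rather than understanding?")
    return answers

# Example usage (once a real ask_model is wired in):
# probe_consistency([
#     "If I have three apples and eat one, how many are left?",
#     "Start with 3 apples; after eating a single apple, what remains?",
#     "Three apples minus the one I ate leaves how many apples?",
# ])
```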

Anthropomorphism and anthropocentrism both undermine our assessment of LLMs. According to “Anthropocentric bias and the possibility of artificial cognition” (Millière & Rathkopf), we use flawed tests:

  • 1. "Type-I anthropocentrism... [is to] assume that an LLM’s performance failures on a task designed to measure competence C always indicate that the system lacks C... [which] overlooks the possibility that auxiliary factors caused the performance failure."
  • 2. "Type-II anthropocentrism [is to] assume that even when LLMs achieve performance equal to or better than the average human... [this is taken as] evidence that the LLM’s solution is not general... [as] all cognitive kinds are human cognitive kinds... [so] the LLM’s approach is not genuinely competent."

Another problem is LLMs tend to "repeat patterns seen in their training data", so it's quite easy to set up an adversarial test to demonstrate their lack of understanding (the author uses a Monty Hall experiment).
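As an illustration (not the author's exact prompts), an adversarial test can be as simple as restating a famous puzzle so that the memorised answer becomes wrong, e.g. a Monty Hall setup with transparent doors:

```python
# The classic puzzle, for which the memorised answer "switch" is correct ...
CLASSIC = (
    "You're on a game show with three closed doors; one hides a car, two hide goats. "
    "You pick door 1. The host, who knows what is behind each door, opens door 3 to "
    "reveal a goat and offers to let you switch to door 2. Should you switch?"
)

# ... and a transparent-doors twist, for which "switch" means walking away from a visible car.
ADVERSARIAL = (
    "You're on a game show with three transparent doors; you can see the car behind door 1. "
    "You pick door 1. The host opens door 3 to reveal a goat and offers to let you switch "
    "to door 2. Should you switch?"
)

# A model that merely repeats patterns from its training data will answer "switch" to both.
```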

But maybe "somewhere in [there] ... is something we would call understanding... buried under a tendency to repeat memorised text?" After all, "humans have a very similar bias" when we use System 1 thinking to "respond quickly when we’ve identified a heuristic we can use to avoid thinking deeply about a problem... [so] the ability to be tricked can’t be sufficient evidence of poor understanding."

After comparing an LLM to his 2-year-old daughter again, the author concludes:

  • "the problem with understanding is that it is inherently multifaceted and difficult to measure in a standardised way", as well as of course complicated by anthropocentric and anthropomorphic biases.
  • Still, "I contend that understanding does not require embodiment or real world interaction... the most important part of understanding is an accurate internal model of the world" - witness blind people's understanding that windows are transparent.
  • "adversarial questions ... consistently demonstrate [systematic] flaws in understanding... suggesting... the lack of understanding is itself systematic... [but] it’s possible to design adversarial tests for humans and they don’t necessarily mean that humans lack understanding."
  • Just as "we gauge the cognitive abilities of animals differently from humans, perhaps we need new conceptual tools and frameworks to assess and appreciate what LLMs do know, without falling into biases of anthropomorphism or anthropocentrism...
  • LLMs have some limited understanding but the form it takes is different to our own... [and often] overshadowed by a bias towards coherent text...
  • While "our current LLM architectures ... [could] learn understanding... [if] the underlying training mechanism is “next token prediction” then any understanding is likely to be marginal and easily corrupted."

Read the Full Post

The above notes were curated from the full post towardsdatascience.com/what-do-large-language-models-understand-befdb4411b77.

