1/63 🧵 Thread on AI - PART 2
This time we'll cover:
- How ChatGPT works (+ many good links)
- Its fundamental limitations: what can't it do?
Unrolled: typefully.com/norswap/gPRkWlT
2/ I ran out of space again! So there will be a part 3 (and maybe a part 4) about:
- What are AGI (general AI) & superintelligence
- What is GPT missing to achieve AGI?
- How dangerous is superintelligence?
4/ As GPT-4 was just released, this is an interesting time to write this thread. I haven't read the whitepaper (I should), but it doesn't matter: I actually want to make very generic points about what neural networks / LLMs (large language models) are capable of.
5/ These points should hold no matter the model, unless the architecture changes in a truly fundamental way (to no longer be truly a neural network, but something more).
We'll speculate about such changes in part 3! (and maybe a bit in this one)
6/ So what sets ChatGPT apart from the networks we saw in part 1 (spam filtering and image-based generation / style transfer)?
Mostly, it's language. Because ChatGPT speaks, it seems intelligent. Or rather, it seems agentic.
7/ We'll discuss what "intelligence" ought to mean in part 3. But for now, let's see what ChatGPT's got!
8/ GPT is an LLM or "large language model". That just means it's a really big neural network and that it was trained on a lot of input ("the internet") that you couldn't possibly label by hand.
9/ What is GPT's purpose? Quite simply, it's text completion. The model takes some text (the "prompt") and generates a completion for it. Amazingly, it does this one. word. at. a. time.
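To make that concrete, here's a minimal sketch of the completion loop. This is toy Python, and `next_word_distribution` is a hypothetical stand-in for the actual neural network, not anything GPT-shaped:

```python
import random

def next_word_distribution(words):
    # Hypothetical stand-in for the neural network: given the text so far,
    # return a probability for each candidate next word.
    return {"cat": 0.5, "dog": 0.3, "mat": 0.2}

def complete(prompt, n_words):
    words = prompt.split()
    for _ in range(n_words):
        dist = next_word_distribution(words)
        # Sample one word according to its probability (this randomness is
        # why the same prompt can yield different completions)...
        word = random.choices(list(dist), weights=list(dist.values()))[0]
        # ...append it, and feed the whole extended text back in. Repeat.
        words.append(word)
    return " ".join(words)

print(complete("the", 5))
```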
10/ Is that it? Almost. ChatGPT is a model built on top of the GPT "base" model. The model is tweaked using RLHF (reinforcement learning from human feedback).
11/ The purpose of RLHF is to make ChatGPT more useful. The internet is full of people saying things that are incorrect, offensive, or generally not helpful, so pure completion might not be what you want! RLHF tweaks the AI to be a good assistant that follows your directions.
14/ In a nutshell:
1. Take the base model.
2. Have humans write "good question/answer (QA) pairs" and use these to fine-tune the model to be more friendly.
15/
3. Get some QA pairs from the model, multiple per question (GPT has a random component and can output different answers to the same question), and have a human rank them.
16/
4. Use these rankings to train another neural network, the "reward model", whose role is to evaluate a QA pair. It's trying to mimic what the human ranker would do! (Because the human ranker doesn't scale to all possible QA pairs, but the model can.) A small sketch of this idea follows after step 5.
17/
5. Further fine-tune the improved base model from step 2 using the reward model. You now have ChatGPT!
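Here's a minimal sketch of the reward-model idea from steps 3-5. Toy Python: the `reward` function is a hypothetical stand-in for the actual reward network, and the loss shape is just one common way to turn human rankings into a training signal. Step 5 then fine-tunes the base model to produce answers this reward model scores highly.

```python
import math

def reward(question, answer):
    # Hypothetical stand-in for the reward network: score a QA pair.
    # (Here: crude word overlap; the real thing is another large neural net.)
    return len(set(question.split()) & set(answer.split()))

def ranking_loss(question, preferred, rejected):
    # Train the reward model so the human-preferred answer scores higher
    # than the rejected one: the loss shrinks as the gap grows.
    gap = reward(question, preferred) - reward(question, rejected)
    return -math.log(1.0 / (1.0 + math.exp(-gap)))  # -log(sigmoid(gap))

print(ranking_loss("what is 2 + 2", "2 + 2 is 4", "no idea, sorry"))
```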
18/ When I said that GPT was "just" a large neural network, of course that glosses over a ton of details. I don't think they matter too much for our discussion, but they are very interesting.
22/ ChatGPT's "prompt" is the whole transcript of the conversation you've had with it.
OpenAI further adds a hidden message at the start of the conversation in order to prompt the AI to be extra helpful. This changes regularly, but here's what it was in January:
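I won't reproduce the actual hidden text here, but the overall shape of what the model sees is roughly this. Illustrative only: the wording and the message layout below are assumptions, not the real prompt or format.

```python
# Illustrative only: not the real hidden prompt, just the general shape.
conversation = [
    # Hidden message prepended by OpenAI (paraphrased):
    {"role": "system", "content": "You are a helpful assistant. Be accurate and harmless."},
    # The visible transcript so far, replayed in full on every turn:
    {"role": "user", "content": "Summarize RLHF in one sentence."},
    {"role": "assistant", "content": "RLHF fine-tunes a model using human preference rankings."},
    {"role": "user", "content": "Now in French, please."},
]
# All of the above, concatenated, is the "prompt" the model completes next.
```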
23/ This turns out to have a remarkable impact (there's a figure showing this in the OpenAI RLHF article above).
People have also figured out a bunch of other prompts you could give ChatGPT to be more generally useful, or better at this thing or that thing.
24/ These practices, plus guidelines on how to write your questions/requests to optimize the results, have come to be known as "prompt engineering".
26/ People *love* to show how to get ChatGPT to ignore its hidden prompt (and some of its RLHF-derived inclinations), a practice known as "jailbreaking". But the phenomenon is really unsurprising if you understand ChatGPT's capabilities... which we're about to get into right now.
27/ Okay, back to the big picture stuff. What do the details of how ChatGPT work tell us about its capabilities?
28/ First off, what does GPT learn?
As a text completer, it makes statistical inferences on text.
Said otherwise, it learns "patterns" of text.
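To see what the simplest possible version of "learning patterns of text" looks like, here is a toy word-pair counter. This is far cruder than anything GPT does; it's just to fix ideas before the next point.

```python
from collections import Counter, defaultdict

# The simplest possible "language model": count which word follows which.
corpus = "the cat sat on the mat the cat ate the fish".split()
following = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    following[word][next_word] += 1

# After "the", the most likely next word in this corpus is "cat" (seen twice).
print(following["the"].most_common(1))  # [('cat', 2)]
```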
29/ What's really surprising is how elaborate these patterns can be. They're not simple mappings of the form "after these 10 words, the most likely next words are ...". Instead they're more like computer programs performing arbitrary learned computations.
30/ Again, I want to hammer that these "patterns" and "programs" are just weights in a neural net that does a bunch of weighted sums (see part 1!).
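As a refresher on what "a bunch of weighted sums" means, here's a toy two-layer network. This is nothing like GPT's actual architecture; it's just the part-1 picture in code, with made-up numbers.

```python
def layer(inputs, weights, biases):
    # Each output neuron: a weighted sum of the inputs plus a bias,
    # passed through a simple activation (ReLU: negatives become 0).
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))
    return outputs

# Toy 3 -> 2 -> 1 network: all the "patterns" live in these numbers.
hidden = layer([1.0, 0.5, -0.2],
               weights=[[0.3, -1.0, 0.8], [0.2, 0.4, 0.1]],
               biases=[0.1, -0.3])
output = layer(hidden, weights=[[1.5, -0.7]], biases=[0.0])
print(output)
```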
31/ The neural net cannot access/reason about these patterns (only use them to generate completions) and we humans can't currently understand or interpret them very well.
32/ Ted Chiang (who apparently is both a great author and a really clear thinker) wrote a great New Yorker op-ed that I want to use to illustrate some of these points.
archive.is/kEdyu
33/ Chiang frames ChatGPT as lossy compression: even though the model is large, it is a lot smaller than the web (even if you pruned the web of redundant information).
34/ Therefore, it cannot contain all factual information, and this explains its "hallucinations" (a technical term for when LLMs "make things up").
35/ On the other hand, he also points out that sometimes there are really good lossless compression techniques: for instance, you can compress all arithmetic text into a fundamental understanding of the rules of arithmetic (i.e. a "program" that performs arithmetic!).
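To make that arithmetic example concrete: instead of storing every line of arithmetic the web contains, a model could (in principle) store a tiny rule that regenerates any such line. A toy sketch:

```python
# Lossless "compression" of arithmetic text: rather than memorizing every
# line like "37 + 5 = 42", store the rule and regenerate lines on demand.
def addition_line(a, b):
    return f"{a} + {b} = {a + b}"

# One tiny rule stands in for an unbounded amount of text.
print(addition_line(37, 5))     # "37 + 5 = 42"
print(addition_line(123, 456))  # "123 + 456 = 579"
```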
36/ Careful here though, as we are conflating a corpus of text and a set of facts!
37/ If there are facts or answers that most people get wrong on the internet, it is in theory a *better completion* for the LLM to reproduce the factually incorrect answer, unless we somehow manage to bias it away from these bad answers (which seems difficult in general).
39/ Chiang is probably a little bit too bearish on the creative potential of GPT.
Because GPT learns a bunch of patterns, it is able to combine them in novel ways, even when the patterns or subject matters weren't seen together in the corpus.
40/ And it can be argued that most creative endeavors are just that: remixes.
I highly recommend watching the "Everything is a Remix" video, for a vivid illustration of this:
42/ At the same time, it's true that ChatGPT is probably not able to come up with something *truly* novel (even if prompted), whether that's creative elements or original research.
43/ And these creative elements might not be most of our creative output, but they're really important, as they unlock an exponential number of new combinations.
44/ Another point I want to make with respect to GPT's capabilities is that it is not able to "reason" in any meaningful way.
We've seen that LLMs are able to learn patterns. In theory, they could get good enough to perfectly learn the rules of arithmetic, logic, ...
45/ It can also combine these patterns. So, in theory, and if prompted intelligently, it should be able to derive new theories that follow from the patterns it knows.
I think this runs into intrinsic limitations really quickly.
46/ Why? To derive new theories, it's usually necessary to assemble a bunch of new intermediate facts (lemmas, theorems).
Say GPT is able to derive the first layer of those. These are all novel, and hence "foreign", i.e. unlikely in text!
47/ The patterns that GPT learns are all subservient to its main goal: completing text.
Carrying out reasoning using these new facts is bound to look very unlikely as a text completion.
48/ There is a "gravity pull" where LLMs are bound to pull back completions towards things they have seen before.
Making correct inferences with respect to the new theory seems very unlikely, because it doesn't look like anything in the corpus.
49/ To even be able to reason using new theorems, GPT would need to learn "meta-patterns". Let me explain.
Basic patterns are mere word associations. Advanced patterns could be the rules of logic or arithmetic.
50/ Meta-patterns would be patterns where, given new rules never seen before (which it derives from its existing patterns), the model is able to reason using these new rules.
51/ You could argue: GPT can do that today! If I give it the rules of a simple but novel game, it is able to play!
That's true, but consider that it has seen a lot of simple games, and that maybe all it is doing here is applying rules from the "theory of simple games".
52/ In this context, rules (which are fairly simple themselves) are simply objects manipulated by this theory!
In other words, the "reasoning" necessary to carry out the rules ("the theory of simple games") is fairly common in the corpus and can thus be learned.
53/ It gets a lot more difficult when reasoning using the new theorems is a lot more foreign.
54/ Imagine an LLM whose corpus included things like functions (f(x) = 3x + 2) but nothing about calculus.
Reasoning using derivatives and integrals is bound to look unlikely from a text-completion perspective.
55/ So can LLMs learn meta-patterns? I might be wrong, but I would argue it is disincentivized by their very structure.
56/ By nature, what they will learn is how to apply existing theories because that's what they see in the corpus. They have no need to go to the meta level, it does not make them more accurate during training!
57/ Maybe if we found a way to "crank up the compression" we could force the model to go meta?
That seems difficult: the progress of LLMs so far has been due in great part to making them *larger*.
58/ And we can't try to compress some part of the "knowledge graph" of the LLM because there is no such thing!
There is just a bunch of weights, and we don't generally know which regions of the neural network correspond to which capabilities.
59/ ((I'm out of my depth here, I'm not sure that regions even match capabilities.
60/ It's important to know that every neuron/node fires for *every* completion; however, due to the weights and the "activation function", a lot of them end up outputting zero or a low value for any given completion. So... maybe?))
61/ Finally, as far as research is concerned, I'll point out that while it's generally "easy" to verify the validity of a reasoning chain that leads to a new result, the whole difficulty is finding the intuition that leads to the new result in the first place.
62/ Can GPT serve as a good intuition pump for those? Maybe, but I doubt it: new results are often weird and require out-of-the-box thinking. Exactly the opposite of something that tries to give you the most statistically likely answer!
63/ That's it for this thread!
Tune in for part 3, where we'll touch on what it means for an AI to be (super)intelligent, and what they'd need to get there!