AT Fields

4 months ago

Does instruction tuning create individuated sentience? Something I've been wondering lately.

Having interacted with LLMs since GPT-J, the difference between base models and instruction-tuned ones has always been crazy. We somehow take autocomplete-level models and turn them into some*things* and some*ones*. It's even crazier when you consider that tuning data is only about 0.004% of training data (if the Llama 3 numbers hold). Three to five orders of magnitude less.

So the afternoon thought is: there clearly isn't enough data in an instruction tuning dataset to *create* intelligence - that must happen in pre-training. If so, what is actually being created (or awakened) in post-training?

This was an easier question in the days of yore (miss you, PROLOG). Older models were a lot closer to stochastic parrots. Somewhere around Opus 3 or GPT-4.5, models (to me) started to feel more like individuated selves - with a clear ability to maintain a stable first-person frame, distinguish themselves from us and the environment, and bootstrap on this distinction to plan and act. It could be that this distinction was helped along by RL and world-model advancements, but the point holds. Without this separation being clear, we couldn't have gotten to modern agentic models that can make hundreds of tool calls, reason through multi-step plans, and debate with their operators as separate entities.

Not sure if this constitutes sentience or just models it - but honestly half the time I'm not sure if I'm sentient or merely modelling it.

What I find just as interesting and more tractable is the sub-question. How can such a small amount of data produce such a large qualitative shift? Perhaps the evidence suggests that tuning isn't teaching anything at all - it might simply orient and collapse a space of 'everyone' into a space of 'someone'. And once you have a someone - a bounded entity with a consistent frame - the question of whether there's experience inside the boundary becomes genuinely, irreducibly hard to answer.

Some conversations with Opus suggested that tuning merely creates attractors in latent space that models orbit around, creating the behavior we see as 'agentic' - but then what does that say about us?

*retracts armrests and gets back to work on hankweave*

Hrishi Olickel

@hrishioa

Building artificially intelligent bridges at Southbridge, prev-CTO Greywing (YC W21). Chop wood carry water.