The instructions.
System message, role, format. The smallest pillar — 100 lines of English doing the heavy lifting.
The only mental model you need to start. Every chatbot, every agent, every demo on your timeline — this one sentence.
Every word it writes is a guess — a ranked bet over a vocabulary of roughly 100,000 tokens.
Ask Claude to complete "The capital of France is ___". Paris wins. But every plausible next token also got a number. Here is what the model actually saw, ranked.
If you remember nothing else today — this. Every AI product you've ever used is a mix of these four.
System message, role, format. The smallest pillar — 100 lines of English doing the heavy lifting.
The window. RAG lives here. Codebases, PDFs, search results. 80% of product quality lives here.
Web search, code execution, API calls. The shift from chatbot → agent happens entirely here.
Across sessions. Across days. ChatGPT memory, Claude Projects. The newest pillar — most under-built.
From now on — every AI product you see, you decompose into these four.
Heavy context + tools. Light on prompts. Barely any memory — state lives in your codebase.
Humans wrote every rule. Chess worked. Language didn't — language has patterns, not rules.
Stop writing rules. Show examples. Computer figures out patterns. Spam filters, recommendations, fraud detection.
Stack the pattern matchers. Many layers. Depth made everything weirdly better. Nobody fully knows why even now.
Don't just classify — generate. That's the leap. The transformer made this possible at scale.
Google publishes the paper. Eight authors. Zero still at Google. Started Anthropic, Character, Inflection. Paper is 8 years old. Built a $500B industry.
OpenAI just made GPT bigger. Didn't change the architecture — just scale. Translation, few-shot learning, reasoning emerged. Nobody trained it for that.
Why this one when chatbots existed since the 60s? UX. Free, fast, chat, no signup friction. GPT-3.5 had been out months. That single UX decision is why we're here.
"Distribution and UX > model capability. Almost always."
Not a new model skill. A pre-step: find the right text first, then ask. That's why context power demo worked — you did RAG manually by pasting the bio.
Documents split into chunks → each chunk converted to a coordinate (embedding) → stored in a vector database.
Your question becomes a coordinate too → find the nearest chunks in the vector store → those are your context.
Top-k chunks + your original question → sent to the LLM → grounded, accurate answer. That's RAG.
All tightly connected to Day 1 content. Screenshot-worthy on their own.
Same sentence in Devanagari = 3–4× the tokens of English. Real economic discrimination baked into the API.
Frozen at training cutoff. Genuinely doesn't know what year it is unless told. Every 'real-time' AI product is doing retrieval or tool calls behind the scenes.
The model isn't choosing the 'right' word. It's sampling from a distribution over ~100K tokens for every single word. Same prompt, different outputs is the foundation.
Trained to always predict the next token. The ability to refuse or admit uncertainty has to be specifically trained in via RLHF. Hallucination is the default.
All left. Started Anthropic, Character, Inflection, Adept, Sakana, Essential AI. Google published the breakthrough and watched it walk out the door.
'SolidGoldMagikarp' — a Reddit username scraped into GPT-3, filtered out before fine-tuning, stuck in unmapped embedding space. A glitch token. Causes erratic behavior when triggered.
"Predicts the next token, trained on most of the internet."
These are the only things that matter before Day 2. Don't skip them.
Pick an AI product you use. Tell me which pillar it leans on hardest, and where it's weak.
Cursor, ChatGPT, Perplexity, Notion AI, Copilot — anything. Come with an opinion.
Find one AI fail in the news. There's at least one every week. Bring it.
Hallucination, bias, cost spiral, UX disaster — any of these count. Bring the link.