AI Homework
It happened to be Wednesday night when my daughter, in the midst of preparing for “The Trial of Napoleon” for her European history class, asked for help in her role as Thomas Hobbes, witness for the defense. I put the question to ChatGPT, which had just been announced by OpenAI a few hours earlier:
This is a confident answer, complete with supporting evidence and a citation to Hobbes work, and it is completely wrong. Hobbes was a proponent of absolutism, the belief that the only workable alternative to anarchy — the natural state of human affairs — was to vest absolute power in a monarch; checks and balances was the argument put forth by Hobbes’ younger contemporary John Locke, who believed that power should be split between an executive and legislative branch. James Madison, while writing the U.S. Constitution, adopted an evolved proposal from Charles Montesquieu that added a judicial branch as a check on the other two.
The ChatGPT Product
It was dumb luck that my first ChatGPT query ended up being something the service got wrong, but you can see how it might have happened: Hobbes and Locke are almost always mentioned together, so Locke’s articulation of the importance of the separation of powers is likely adjacent to mentions of Hobbes and Leviathan in the homework assignments you can find scattered across the Internet. Those assignments — by virtue of being on the Internet — are probably some of the grist of the GPT-3 language model that undergirds ChatGPT; ChatGPT applies a layer of Reinforcement Learning from Human Feedback (RLHF) to create a new model that is presented in an intuitive chat interface with some degree of memory (which is achieved by resending previous chat interactions along with the new prompt).
What has been fascinating to watch over the weekend is how those refinements have led to an explosion of interest in OpenAI’s capabilities and a burgeoning awareness of AI’s impending impact on society, despite the fact that the underlying model is the two-year old GPT-3. The critical factor is, I suspect, that ChatGPT is easy to use, and it’s free: it is one thing to read examples of AI output, like we saw when GPT-3 was first released; it’s another to generate those outputs yourself; indeed, there was a similar explosion of interest and awareness when Midjourney made AI-generated art easy and free (and that interest has taken another leap this week with an update to Lensa AI to include Stable Diffusion-driven magic avatars).
More broadly, this is a concrete example of the point former GitHub CEO Nat Friedman made to me in a Stratechery interview about the paucity of real-world AI applications beyond Github Copilot:
I left GitHub thinking, “Well, the AI revolution’s here and there’s now going to be an immediate wave of other people tinkering with these models and developing products”, and then there kind of wasn’t and I thought that was really surprising. So the situation that we’re in now is the researchers have just raced ahead and they’ve delivered this bounty of new capabilities to the world in an accelerating way, they’re doing it every day. So we now have this capability overhang that’s just hanging out over the world and, bizarrely, entrepreneurs and product people have only just begun to digest these new capabilities and to ask the question, “What’s the product you can now build that you couldn’t build before that people really want to use?” I think we actually have a shortage.
Interestingly, I think one of the reasons for this is because people are mimicking OpenAI, which is somewhere between the startup and a research lab. So there’s been a generation of these AI startups that style themselves like research labs where the currency of status and prestige is publishing and citations, not customers and products. We’re just trying to, I think, tell the story and encourage other people who are interested in doing this to build these AI products, because we think it’ll actually feed back to the research world in a useful way.
OpenAI has an API that startups could build products on; a fundamental limiting factor, though, is cost: generating around 750 words using Davinci, OpenAI’s most powerful language model, costs 2 cents; fine-tuning the model, with RLHF or anything else, costs a lot of money, and producing results from that fine-tuned model is 12 cents for ~750 words. Perhaps it is no surprise, then, that it was OpenAI itself that came out with the first widely accessible and free (for now) product using its latest technology; the company is certainly getting a lot of feedback for its research!
ChatGPT launched on wednesday. today it crossed 1 million users!
— Sam Altman (@sama) December 5, 2022