Capturing human knowledge

There are a lot of parallels between the release of Wikipedia onto the web and the release of LLMs. Both seek to be a consultative authority on the whole of human knowledge, although we can argue about how well each achieves that goal.

Wikipedia is “trained” by humans in a continuous, open process. It stores knowledge in a form understandable by humans – text, hyperlinks, categories and so on. Consequently, the knowledge is revealed by searching and browsing.

LLMs are trained by humans in an upfront, closed process. They store knowledge in a form alien to humans – weights, parameters, matrices and other numerical formats. The only way for the knowledge to reveal itself is through a mathematical calculation applied to a prompt.

For the purpose of this article, I constructed a simple exercise: Name three world leaders in 1996. A search revealed this Wikipedia page where an answer can be found. A prompt into ChatGPT threw up the answer “Bill Clinton, John Major, Boris Yeltsin”.

Of course, many other questions can be posed, and the accuracy or satisfaction of the result will vary with complexity. The broader point, though, is that both Wikipedia and LLMs are designed to answer any question you care to throw at them, to the best of their training.

Since Wikipedia has a twenty-year head start on LLMs, it’s worth reviewing how our relationship with it has altered over that time. When Wikipedia first emerged, there was a considerable amount of hysteria. Many people were extremely sceptical that its training model would ever produce anything reliable.

However, fast forward to today, and there are very few people who would not consult a Wikipedia page to get an overview of a topic. Of course, everyone knows that inaccuracies exist within the knowledge pool, but the utility of having an instant, reasonable answer is what draws us to rely on it without thinking.

In twenty years or so, I believe that Wikipedia will be extinct and all our curiosity will be satisfied by a descendant of LLMs. And it won’t even bother us that much, at least from a knowledge perspective. That task will be delivered so flawlessly that it will cause no concern.

However, LLMs, and AI more generally, are capable of far more than just communicating knowledge. Other applications such as decision-making, research, genetic mutation, robot reproduction and all manner of other uses will be vexing humanity instead.

We’ll be looking back with misty eyes on the “olden days” of the 2020s when the worst thing AI did was recommend eating rocks as part of a daily diet.
