Home
Book
What Is ChatGPT Doing ... and Why Does It Work?

What Is ChatGPT Doing ... and Why Does It Work?

What Is ChatGPT Doing ... and Why Does It Work?

GET THE #1 EMAIL FOR EXECUTIVES

Subscribe to get the weekly email newsletter loved by 1000+ executives. It's FREE!

Summary of "What Is ChatGPT Doing ... and Why Does It Work?" by Stephen Wolfram

Stephen Wolfram's book "What Is ChatGPT Doing ... and Why Does It Work?" delves into the inner workings of ChatGPT. Wolfram explores the architecture, training, and implications of ChatGPT, providing clear examples and explanations along the way.

The fact that this is freely available on the Wolfram website is amazing. It's 105 pages long and is a perfect read that is a step up from a typical blog post without the overhead that some AI books have that make them quite difficult to read.

We just covered the Wolfram website in a previous post which has a whole bunch of articles on AI, physics, and mathematics.

Key topics that the book covers

  • Introduction to ChatGPT: Explaining the purpose of ChatGPT and it's capabilities.

  • Architecture and Training: The book delves into the architecture of ChatGPT and how it is trained to generate human-like text.

  • Implications of ChatGPT: Wolfram discusses the implications of ChatGPT for AI, language understanding, and human-computer interaction.

  • Future Directions: The book concludes with a discussion of future directions for ChatGPT and AI in general.

We don't know why this works!

That quote led me down a bit of a rabbit hole, but it's true. We don't know why ChatGPT works. It's a bit like the universe. We don't know why it works, but we know it does.

I found this interesting quote by Sam Bowman

So there’s two connected big concerning unknowns. The first is that we don’t really know what they’re doing in any deep sense. If we open up ChatGPT or a system like it and look inside, you just see millions of numbers flipping around a few hundred times a second, and we just have no idea what any of it means. With only the tiniest of exceptions, we can’t look inside these things and say, “Oh, here’s what concepts it’s using, here’s what kind of rules of reasoning it’s using. Here’s what it does and doesn’t know in any deep way.” We just don’t understand what’s going on here. We built it, we trained it, but we don’t know what it’s doing.

*Reference: Sam Bowman Interview

Anyway, that isn't the main takeaway of the book! We are here to understand what ChatGPT is doing and why it works.

One word at a time

I enjoyed this initial description of how ChatGPT works:

Chat GPT Next Work Example

Simple, right? He used an example showing that if you just sample how often the different letters occur in English text, and then just pick letters at random according to these frequencies, you get something that looks a bit like English text.

Chat GPT Next Work Example

The temperature of words

Your intuition would be that you always just take the highest probability next word, but it turns out if you do that, you get a very flat and boring essay that starts to repeat itself. The concept of temperature is introduced to make the essay more interesting. Temperature essentially means that you sometimes pick lower-ranked words to make the essay more interesting.

It turns out that a 0.8 temperature seems to be the best for essay generation. You will also see this parameter in AI models when you use them.

Here is an example with 3 different temperature settings for email.

Example 1 - Low Temperature (0.2 - 0.5)

Professional follow-up on the status of an invoice

Dear [Client Name],

I hope this email finds you well.

I am writing to follow up on the status of Invoice #12345, which was issued on [Date]. As of today, we have not yet received payment for this invoice. We kindly request that you review this matter and let us know if there are any issues or if additional information is needed to process the payment.

If the payment has already been made, please disregard this message. Otherwise, we would appreciate your prompt attention to this matter.

Thank you for your cooperation.

Best regards,

Example 2 - Moderate Temperature (0.6 - 0.7)

Friendly yet professional follow-up on the status of an invoice

Hi [Client Name],

I hope you're doing well!

I'm reaching out to check on the status of Invoice #12345, which we sent over on [Date]. We haven't seen the payment come through yet and just wanted to make sure everything is in order on your end. If there's any issue or if you need additional information from us, please let me know.

If the payment is already in process, thank you, and please ignore this email. Otherwise, we would appreciate your assistance in settling this invoice at your earliest convenience.

Thanks so much for your attention to this. Looking forward to hearing from you soon.

Best,

Example 3 - High Temperature (0.8 - 1.0)

Casual follow-up on the status of an invoice

Hey [Client Name],

Hope all's well on your end!

Just wanted to touch base regarding Invoice #12345 that we sent on [Date]. Haven't seen the payment come through yet and wanted to make sure everything is good to go on your side. If there's any issue or if you need more info, just let me know.

If the payment is already on its way, awesome, and you can ignore this email. Otherwise, we'd appreciate it if you could take a look when you get a chance.

Thanks a bunch and talk soon!

ChatGPT is awesome at estimating probabilities of words

So the model essentially repeats the same process as the letter example but with words. It estimates the probabilities of words based on a large corpus of text and generates sentences by picking words at random according to these probabilities. This is the essence of ChatGPT's ability to generate human-like text.

The big idea is to make a model that lets us estimate the probabilities with which sequences should occur—even though we’ve never explicitly seen those sequences in the corpus of text we’ve looked at. And at the core of ChatGPT is precisely a so-called “large language model” (LLM) that’s been built to do a good job of estimating those probabilities.

Neural Networks

How is this done? Through a neural network. The neural network is trained on a large corpus of text to estimate the probabilities of sequences of words. This is the essence of ChatGPT's ability to generate human-like text. The book goes into a lot of detail on how this is achieved.

A visualisation of ChatGPT words in their embedding space

Chat GPT Training

The learning and training of ChatGPT

The Fundamental Tension of Learnability and Computational Irreducibility

I liked this concept talking about how there is a tension between learnability and computational irreducibility. Learning involves compressing data by leveraging regularities, but computational irreducibility implies that there’s a limit to what regularities there may be.

An example of this

An example of this is weather forecasting. While meteorologists use patterns and historical data to predict weather, the atmosphere's behavior is ultimately computationally irreducible. Despite advancements in technology and data analysis, long-term weather predictions remain challenging due to the atmosphere's chaotic nature.

The fundamental tension between learnability and computational irreducibility lies in the contrast between the ease of extracting patterns from data and the inherent complexity of certain processes. Learning, at its core, seeks to simplify and generalise data by identifying and leveraging regularities, making it possible to predict and understand phenomena with reduced effort. However, computational irreducibility asserts that some systems or processes are so complex that they cannot be simplified or predicted without fully simulating each step.

Other examples where this tension plays out

Financial Markets: Traders and analysts use historical data and patterns to predict stock prices and market trends. However, the market is influenced by countless variables and human behaviors, making it computationally irreducible. Sudden events or shifts in sentiment can render predictions inaccurate, demonstrating the limits of learnability in financial forecasting.

Biological Systems: Scientists study genetic sequences and biological pathways to understand diseases and develop treatments. While certain regularities and patterns can be identified, the sheer complexity of biological interactions and the influence of countless factors make precise predictions challenging. This complexity showcases the tension between the desire to learn and simplify versus the irreducibility of biological processes.

Urban Planning: Urban planners use models and historical data to design efficient cities and predict future growth. However, the myriad of human behaviors, economic factors, and environmental changes make urban development highly unpredictable. This unpredictability highlights the computational irreducibility of urban systems, despite efforts to learn and model them effectively.

Have computers become vastly more powerful?

Well its interesting because he argues that the tasks that ChatGPT is doing are not fundamentally different from what computers were able to do before. The difference is that now we have models like ChatGPT that can do these tasks more efficiently and effectively.

Quotes around tokens and words

How ChatGPT works

The book explains how ChatGPT operates in three basic stages:

  1. Token Sequence to Embedding: ChatGPT takes the sequence of tokens corresponding to the text and finds an embedding that represents these tokens.

  2. Neural Network Operation: The embedding is processed through a neural network, with values rippling through successive layers to produce a new embedding.

  3. Token Probability Generation: The last part of the new embedding generates an array of about 50,000 values that represent probabilities for different possible next tokens.

The computational cost of generating text

I found it interesting just how much computation is required to generate text!

"But for each token that’s produced, there still have to be 175 billion calculations done (and in the end a bit more)—so that, yes, it’s not surprising that it can take a while to generate a long piece of text with ChatGPT."

The simplicity of language

The future of ChatGPT & laws of thought

I really enjoyed the conclusion of the book where he talks about the future of ChatGPT and the implications of its success. Essentially there might be other laws of thought out there that we can discover.

What laws of thought might we discover?

Well for that, I did the main thing that we should do in an article like this... I asked ChatGPT!

Prompt: What are three example of potential laws of thought that could be out there?

Law of Conceptual Association: This potential law might explain how different concepts are linked together in human cognition. For instance, it could outline the principles by which the brain connects ideas like "apple" and "fruit" or "vehicle" and "car." Understanding this law could enhance our ability to design AI systems that mimic human thought processes more closely, allowing them to draw connections between seemingly disparate ideas with greater accuracy and relevance.

Law of Cognitive Economy: This law could describe how the human brain prioritizes information processing to optimize mental efficiency. It might detail how we allocate cognitive resources to tasks based on their perceived importance or relevance, and how we filter out unnecessary information. Discovering and formalizing this law could lead to AI systems that manage resources and prioritize tasks in a manner similar to human thinking, improving their ability to handle complex, multitasking environments.

Law of Semantic Generalization: This law could define how humans generalize from specific experiences to broader concepts. For example, it might explain how learning about a specific dog leads to understanding the general concept of "dogs." By making this process explicit, we could develop AI that learns and generalizes from data more effectively, allowing it to apply knowledge gained from specific instances to a wider range of situations, thereby improving its adaptability and problem-solving capabilities.

Thanks for reading! If you found this helpful, please share this article with 1 friend!

More Articles

The Economics of Airbnb Icons

ARTICLE

The Economics of Airbnb Icons

Why exactly did they build the UP house?

The Time a PhD Mathematician Won the Olympics

ARTICLE

The Time a PhD Mathematician Won the Olympics

The story of Anna Kiesenhofer's incredible victory in Tokyo

The Hidden Psychology of Decision-Making: What Executives Can Learn from Hostage Negotiators

ARTICLE

The Hidden Psychology of Decision-Making: What Executives Can Learn from Hostage Negotiators

Explore how understanding emotional under currents can enhance decision-making in business...

Olympic Glory in the Digital Age: How Paris 2024 Reshaped Athletes' Social Media Landscapes

ARTICLE

Olympic Glory in the Digital Age: How Paris 2024 Reshaped Athletes' Social Media Landscapes

We look at Instagram follower counts and see if there is a correlation between winning a medal or not...

The 5 Word Meeting Technique

ARTICLE

The 5 Word Meeting Technique

Google, Apple and Amazon were told to run their companies this way...

The Almanack of Naval Ravikant

BOOK REVIEW

The Almanack of Naval Ravikant

A 6/10 read with useful frameworks on wealth, agency, and leverage...

The Four Conversations

BOOK REVIEW

The Four Conversations

A practical model for selling expertise without slipping into vendor mode...

Using a can of beans to figure out consulting pricing

ARTICLE

Using a can of beans to figure out consulting pricing

Get better at pricing your consulting jobs...

51 Books Every Executive Should Read in 2024

BOOKS

51 Books Every Executive Should Read in 2024

Hand picked, each of these has shaped us in some way...

25 Strategic Moves That Established Companies Need to Steal from Startups

ARTICLE

25 Strategic Moves That Established Companies Need to Steal from Startups

Sick of slow progress in your organisation? This will help you focus on what matters...

What Every CEO Can Learn from GitHub's 100-Day Leadership Challenge

ARTICLE

What Every CEO Can Learn from GitHub's 100-Day Leadership Challenge

How Nat Friedman's Bold 100-Day Strategy Transformed GitHub and Redefined Leadership....

AI paying Humans?

ARTICLE

AI paying Humans?

A new company Payman is betting that the future of work involves AIs paying us to do their boring tasks ...

Jeff Bezos is famous for reading slowly - here's why you need to do it too

ARTICLE

Jeff Bezos is famous for reading slowly - here's why you need to do it too

Honestly this is not an easy thing to do. I tried and it felt like the mental equivalent of deadlifting

How Youtube can help you to find your North Star Metric

ARTICLE

How Youtube can help you to find your North Star Metric

Learn why the biggest companies choose one thing to focus on...

AI-Powered Networking: Building 50+ Connections in a New City

ARTICLE

AI-Powered Networking: Building 50+ Connections in a New City

Discover how I leveraged AI to transform networking in London, creating a scalable system for building meaningful professional relationships.

The Trillion Dollar Coach: Steve Job's Coach

BOOK REVIEW

The Trillion Dollar Coach: Steve Job's Coach

This is a simple book that is a must read for any leader...

Running a large organisation? You need to think this way

ARTICLE

Running a large organisation? You need to think this way

Day one thinking and why you need to make sure that your company thinks like this...

Struggling to Empower Your Team? Read This Book

BOOK REVIEW

Struggling to Empower Your Team? Read This Book

Learn how the best companies build products...

No Rules Rules: Netflix and the Culture of Reinvention

BOOK REVIEW

No Rules Rules: Netflix and the Culture of Reinvention

An inside look into the culture of Netflix...

Do you lead people? Your mood is like electricity - it spreads

ARTICLE

Do you lead people? Your mood is like electricity - it spreads

Discover how a leader's emotional state can spread through an organisation like wildfire, influencing performance at every level...

Company

Site Information

Fun Stuff

© 2024 Cub Digital. All Rights Reserved.