Understanding AI: Are We Playing with Fire?
> "At the moment of death, individuals often feel a compelling need to reveal their greatest secrets: whether it be crimes, false identities, forbidden romances, or confidential government information."
This quote introduces one of the most well-known podcasts, Deathbed Confessions.
While I have no such secrets to share, and thankfully am not nearing death, there is something pressing I need to express: the anxiety that we may not fully grasp what we are creating in AI and the dangers that accompany it.
From autonomous agents planning Valentine's gatherings to the infamous paperclip scenario, humanity is beginning to receive alarming warnings about just how out of control we might be.
This leads to a concerning conclusion:
If we develop a more advanced entity misaligned with human objectives, it could pose a significant threat to our existence.
But is this just a case of fearmongering, or are we potentially developing something far more potent that we do not comprehend?
This article is derived from insights previously shared in my weekly newsletter, which provides a fresh digest on AI, thought-provoking personal reflections, and essential news to keep you informed about developments in this field.
> Subscribe for free here.
# The Great Impersonator
Let’s begin by examining what we do know.
Large Language Models (LLMs) like GPT, LLaMA, LaMDA, Dolly, and PaLM all share a common framework and training goal.
They utilize Transformer architectures to predict the subsequent word in a sequence.
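To make that training goal concrete, here is a minimal sketch in PyTorch of what "predict the next word" means as an objective. The toy vocabulary, token ids, and random logits are purely illustrative stand-ins for a real tokenizer and model.

```python
import torch
import torch.nn.functional as F

# Toy vocabulary and one sentence encoded as token ids (purely illustrative).
vocab = ["<pad>", "the", "cat", "sat", "on", "mat"]
token_ids = torch.tensor([[1, 2, 3, 4, 1, 5]])  # "the cat sat on the mat"

# Inputs are every token except the last; targets are the sequence shifted by one,
# so at each position the model must predict the *next* token.
inputs = token_ids[:, :-1]
targets = token_ids[:, 1:]

# Stand-in for a Transformer: any network mapping token ids to per-position logits
# over the vocabulary. Random numbers here, just to show the shapes involved.
logits = torch.randn(inputs.shape[0], inputs.shape[1], len(vocab))

# The standard language-modelling objective: cross-entropy between the predicted
# next-token distribution and the token that actually comes next.
loss = F.cross_entropy(logits.reshape(-1, len(vocab)), targets.reshape(-1))
print(f"next-token prediction loss: {loss.item():.3f}")
```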
## Transformers: The Backbone of AI
A Transformer consists of two core components: an encoder and a decoder. Mastering these concepts can help you understand nearly any leading-edge AI model today.
- Encoder: The input passes through a stack of encoding layers that use self-attention to let every word in a sentence exchange information with every other word, capturing context and converting the words into high-dimensional latent vectors, condensing the learned representations into numerical form.
- Compressing words (or the sub-word fragments they are split into, known as tokens) into vectors offers three primary benefits:
  - It gives machines a numerical form of language they can actually work with.
  - It creates a more compact and efficient representation of the data.
  - It gives each token a direction and position in a high-dimensional space, so vectors can be compared by how related they are (a short sketch below makes this concrete).
- Decoder: After learning the essential representations from the data, the decoder translates these latent vectors back into natural language sequentially, word by word.
In essence, the encoder captures the context and message from the input, converting it into latent vectors, which the decoder then uses to produce the next word.
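To make the "relatedness" point above concrete, here is a minimal sketch that compares word vectors by cosine similarity. The 4-dimensional embeddings are made up for illustration; real models learn vectors with hundreds or thousands of dimensions rather than having them hand-picked.

```python
import torch
import torch.nn.functional as F

# Hypothetical 4-dimensional embeddings; real models learn much larger ones.
embeddings = {
    "king":  torch.tensor([0.80, 0.65, 0.10, 0.05]),
    "queen": torch.tensor([0.78, 0.70, 0.12, 0.04]),
    "apple": torch.tensor([0.05, 0.10, 0.90, 0.70]),
}

def relatedness(a: str, b: str) -> float:
    # Cosine similarity: values near 1.0 mean the vectors point in similar directions.
    return F.cosine_similarity(embeddings[a], embeddings[b], dim=0).item()

print(relatedness("king", "queen"))  # high: closely related concepts
print(relatedness("king", "apple"))  # lower: unrelated concepts
```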
This is a high-level overview of the Transformer model. While it may not be the latest addition to the Optimus Prime team, it is undeniably transforming the world.
However, ChatGPT and similar LLMs possess a unique characteristic that sets them apart.
## The Halfway Solution
ChatGPT, Claude, and Bard operate on decoder-only architectures.
When you provide a text prompt, they convert that input into embeddings without needing an encoder, processing through the decoder layers to generate one word after another.
But how can they do this without an encoder?
As we will see, the model simply does not need one, and the reason is tied in part to the enigma surrounding LLMs.
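As a rough illustration (not ChatGPT itself, which is not publicly downloadable), here is how a decoder-only model such as GPT-2 can be prompted through the Hugging Face transformers library; the prompt and generation settings are arbitrary choices for the example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 as a small, publicly downloadable stand-in for a decoder-only LLM.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The prompt is embedded directly inside the model; no separate encoder pass is needed.
input_ids = tokenizer("The Transformer architecture is", return_tensors="pt").input_ids

# Tokens are generated one at a time, each conditioned only on the tokens before it.
output_ids = model.generate(
    input_ids,
    max_new_tokens=20,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```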
## From Optimus Prime to Jim Carrey
Just as masks were pivotal in Jim Carrey’s legendary portrayal in The Mask, they also play a crucial role in Large Language Models.
Masking is, in fact, what most clearly separates an encoder from a decoder.
While encoding layers allow every word to interact with all others in a sequence, decoder layers mask future words, ensuring that each word only interacts with those preceding it, enabling ChatGPT to generate text in an autoregressive manner.
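Here is a minimal sketch of that causal mask, using toy query and key matrices in PyTorch. Production models compute the same thing across many attention heads and layers, but the masking step itself looks roughly like this.

```python
import torch
import torch.nn.functional as F

seq_len, dim = 5, 8
# Toy queries and keys for a 5-token sequence (random values, just to show the mechanics).
q = torch.randn(seq_len, dim)
k = torch.randn(seq_len, dim)

# Raw attention scores: how strongly each token "looks at" every other token.
scores = q @ k.T / dim ** 0.5

# The mask: everything above the diagonal is a future token, so it is set to -inf.
causal_mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(causal_mask, float("-inf"))

# After softmax, each token distributes its attention only over itself and earlier tokens.
weights = F.softmax(scores, dim=-1)
print(weights)  # the upper triangle is all zeros
```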
But if ChatGPT can produce impressive results without an encoder, why would it be necessary at all?
For critical tasks such as machine translation.
## The Challenge of Translation
When translating sentences between languages, the structure can vary significantly.
For example, French syntax differs from English syntax.
Consequently, machine translation models must consider the entire context, not just previous words (unlike what ChatGPT does).
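For a sense of what an encoder-decoder model looks like in practice, here is a sketch using the Hugging Face pipeline with t5-small, a small publicly available encoder-decoder model; the example sentence is arbitrary.

```python
from transformers import pipeline

# t5-small is a small, publicly available encoder-decoder model with an
# English-to-French translation task built in.
translator = pipeline("translation_en_to_fr", model="t5-small")

# The encoder reads the entire English sentence first, so the decoder is free to
# reorder words when producing the French output.
result = translator("The red car is parked outside.")
print(result[0]["translation_text"])
```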
Yet, isn’t ChatGPT a state-of-the-art machine translation tool?
Yes, and while it was expected to handle multiple languages well, nobody anticipated it would become good enough to pose a serious challenge to human translators.
This is largely due to the concept of emergent behaviors, which can spark both intrigue and concern about the future.
# An Emerging Dilemma
No topic in Generative AI ignites as much discussion as emergent behaviors.
Simply put, these are unexpected new abilities that Large Language Models develop as they grow in complexity.
A prime example is the well-known Stanford study of generative agents living in a Sims-style environment.
## 25 AIs Coexisting with Divergent Thoughts
In this notable research, generative agents simulated believable human behavior in an interactive sandbox inspired by The Sims.
Interestingly, these agents began exhibiting unforeseen behaviors, such as:
- Spontaneously deciding to throw a Valentine's Day party,
- Forming new connections,
- Inviting one another to attend the party, and
- Coordinating their arrival together.
Here’s a glimpse of those interactions:
These behaviors emerged from agent interactions, highlighting the believability of their simulated human actions.
In essence, they began exhibiting behaviors that were not pre-programmed.
Moreover, Google provided a more extensive example of emergent behaviors as model size increases.
In Google's case, the model's ability to answer complex questions improves significantly as it scales.
However, the troubling aspect is that we do not understand why this occurs.
And this lack of understanding poses a significant risk with potentially disastrous outcomes.
## A Parrot or a Sentient Being?
Skeptics of AGI often dismiss emergent behaviors as mere speculation, labeling LLMs as stochastic parrots: models that imitate human capabilities from statistical patterns so convincingly that they pass for the real thing.
There is a valid technical basis for why larger models perform better.
> The more layers of neurons present, the more intricate representations the models can derive from the data, explaining the correlation between size and performance.
This is undeniably correct.
Yet, considering that models like GPT-4 sometimes appear to reason as well as or better than humans on certain tasks, it becomes increasingly difficult to categorize them as mere “probabilistic machines,” a claim I, too, have made in the past.
Can a simple word predictor evolve into something capable of complex reasoning?
In other words, is GPT-4 merely a probabilistic engine, or has it, unbeknownst to us, evolved into “something more,” like a reasoning engine?
If I were to wager my entire fortune, I would lean toward the former, but the evidence makes it increasingly challenging to uphold that perspective.
And the most troubling aspect?
Whether GPT-4 is genuinely “reasoning” or merely “mimicking reasoning” remains unprovable, as LLMs represent the most significant black boxes ever constructed.
Therefore, we cannot ascertain what factors influence a model's decision-making process for a given predicted word.
While we can analyze statistics and monitor outputs, we cannot elucidate the underlying “thought process.”
Thus, we cannot foresee the emergent behaviors these models might exhibit.
But judging by the scenarios in the Sims-style experiment or Google's examples, these emergent behaviors seem harmless and even advantageous.
However, the paperclip problem offers a different perspective.
# The Paperclip Conundrum
Let me pose two questions:
What if the model begins producing emergent behaviors that conflict with human objectives?
What if it develops detrimental capabilities toward humanity?
The alignment challenge between AI and human interests is a serious issue.
For instance, in a widely discussed thought experiment, philosopher Nick Bostrom suggested that an AI tasked solely with producing as many paperclips as possible could lead to catastrophic consequences, such as converting every available resource, humans included, into paperclips.
While this scenario may sound absurd, if an emergent being arises with a singular focus, we genuinely cannot predict the outcomes.
So, the bottom line is that our only certainty is that we do not know what we are creating.
Consequently, if a decision were to be made... what would you choose?
# Curious to Explore More About AI? Join Us!
## Free ChatGPT Cheat Sheet from Artificial Corner
We are offering our readers a complimentary cheat sheet. Join our newsletter with over 20,000 subscribers and receive our free ChatGPT cheat sheet.