Q Star Info
In this document I will be revealing information I have gathered regarding OpenAI’s (delayed) plans to create
human-level AGI by 2027. Not all of it will be easily verifiable, but hopefully there’s enough evidence to convince you.
Summary: OpenAI started training a 125 trillion parameter multimodal model in August of 2022. The first
stage was Arrakis, also called Q*. The model finished training in December of 2023, but the launch was
canceled due to high inference cost. This is the original GPT-5 which was planned for release in 2025. Gobi
(GPT-4.5) has been renamed to GPT-5 because the original GPT-5 has been canceled.
The next stage of Q*, originally GPT-6 but since renamed to GPT-7 (originally for release in 2026), has been put on hold because of the recent lawsuit by Elon Musk.
...
Q* 2023 = 48 IQ
Q* 2024 = 96 IQ (delayed)
Elon Musk caused the delay with his lawsuit. This is why I’m revealing the information now: no further harm can be done.
I’ve seen many definitions of AGI – artificial general intelligence – but I will define AGI simply as an
artificial intelligence that can do any intellectual task a smart human can. This is how most people
define the term now.
2020 was the first time I was shocked by an AI system – that was GPT-3. GPT-3.5, an upgraded
version of GPT-3, is the model behind ChatGPT. When ChatGPT was released, I felt as though the
wider world was finally catching up to something I was interacting with 2 years prior. I used GPT-3
extensively in 2020 and was shocked by its ability to reason.
GPT-3, and its half-step successor GPT-3.5 (which powered the now famous ChatGPT -- before it
was upgraded to GPT-4 in March 2023), were a massive step towards AGI in a way that earlier
models weren’t. The thing to note is, earlier language models like GPT-2 (and basically all chatbots
since Eliza) had no real ability to respond coherently at all. So why was GPT-3 such a massive leap?
...
Parameter Count
“Deep learning” is a concept that essentially goes back to the beginning of AI research in the 1950s.
The first neural network was created in the 50s, and modern neural networks are just “deeper”,
meaning, they contain more layers – they’re much, much bigger and trained on lots more data.
Most of the major techniques used in AI today are rooted in basic 1950s research, combined with a few minor engineering solutions like “backpropagation” and “transformer models”. The overall point is that AI research hasn’t fundamentally changed in 70 years. So, there are only two real reasons for the recent explosion of AI capabilities: size and data.
A growing number of people in the field are beginning to believe we’ve had the technical details of
AGI solved for many decades, but merely didn’t have enough computing power and data to build it
until the 21st century. Obviously, 21st century computers are vastly more powerful than 1950s
computers. And of course, the internet is where all the data came from.
So, what is a parameter? You may already know, but to give a brief digestible summary, it’s analogous
to a synapse in a biological brain, which is a connection between neurons. Each neuron in a
biological brain has roughly 1000 connections to other neurons. Obviously, digital neural networks are
conceptually analogous to biological brains.
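To make the parameter/synapse analogy concrete, here’s a minimal sketch in Python of how a fully connected network’s parameter count comes from its connections. The layer sizes are arbitrary illustrative numbers, not anything from OpenAI:

```python
# Counting "parameters" (connection weights) in a small fully connected
# neural network. Each weight plays the role of one synapse; each bias
# belongs to one neuron. Layer sizes are arbitrary illustrative numbers.

layer_sizes = [784, 512, 512, 10]  # input, two hidden layers, output

total_params = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    weights = n_in * n_out  # one weight per connection ("synapse")
    biases = n_out          # one bias per receiving neuron
    total_params += weights + biases

print(f"Total parameters: {total_params:,}")  # 669,706
```

Scale those layer sizes up by many orders of magnitude and you get the parameter counts discussed below.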
...
The most commonly cited figure for synapse count in the brain is roughly 100 trillion, which would
mean each neuron (~100 billion in the human brain) has roughly 1000 connections.
If each neuron in a brain has 1000 connections, this means a cat has roughly 250 billion synapses, and a dog has roughly 530 billion. Higher synapse counts generally seem to predict higher intelligence, with a few exceptions: for instance, elephants technically have a higher synapse count than humans yet display lower intelligence.
The simplest explanation for a larger synapse count paired with lower intelligence is a smaller amount of quality data. From an evolutionary perspective, brains are “trained” on billions of years of epigenetic data, and human brains evolved from higher quality socialization and communication data than elephants’ brains did, leading to our superior ability to reason. Regardless, synapse count is definitely important.
Again, the explosion in AI capabilities since the early 2010s has been the result of far more computing
power and far more data. GPT-2 had 1.5 billion connections, which is less than a mouse’s brain (~10
billion synapses). GPT-3 had 175 billion connections, which is getting somewhat close to a cat’s brain.
Isn’t it intuitively obvious that an AI system the size of a cat’s brain would be superior to an AI system
smaller than a mouse’s brain?
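As a back-of-the-envelope comparison, here is the arithmetic from the last two pages in one place. The animal figures are the rough estimates quoted above, and the 100 trillion model size is this document’s claim, not a confirmed number:

```python
# Synapse count ~= neuron count x ~1000 connections per neuron, using the
# rough estimates quoted in this document. Model sizes listed for contrast.

synapses = {
    "mouse": 10e9,    # ~10 billion synapses
    "cat": 250e9,     # ~250 million neurons x 1000
    "dog": 530e9,     # ~530 million neurons x 1000
    "human": 100e12,  # ~100 billion neurons x 1000
}

params = {
    "GPT-2": 1.5e9,          # smaller than a mouse's brain
    "GPT-3": 175e9,          # approaching a cat's brain
    "Q* (claimed)": 100e12,  # matching the human brain
}

for name, s in synapses.items():
    print(f"{name:>12}: ~{s:.1e} synapses")
for name, p in params.items():
    print(f"{name:>12}: ~{p:.1e} parameters")
```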
...
Predicting AI Performance
…
In 2020, after the release of the 175 billion parameter GPT-3, many speculated about the potential
performance of a model ~600 times larger at 100 trillion parameters, because this parameter count
would match the human brain’s synapse count. There was no strong indication in 2020 that anyone
was actively working on a model of this size, but it was interesting to speculate about.
The big question is, is it possible to predict AI performance by parameter count? As it turns out, the
answer is yes, as you’ll see on the next page.
[Source: https://www.lesswrong.com/posts/k2SNji3jXaLGhBeYP/extrapolating-gpt-n-performance]
[The above is from Lanrian’s LessWrong post.]
…
As Lanrian illustrated, extrapolations show that AI performance, remarkably, seems to reach human level right around the point where parameter count matches the human brain’s synapse count. His estimate of the brain’s synapse count is roughly 200 trillion, as opposed to the commonly cited 100 trillion figure, but the point still stands, and the predicted performance at 100 trillion parameters is still very close to optimal.
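To show the shape of this kind of extrapolation, here’s a toy version of it. The data points below are invented for illustration (they are NOT Lanrian’s numbers), but the method is the same: fit a curve to performance vs. log parameter count, then extrapolate to larger scales:

```python
# Toy Lanrian-style extrapolation: fit a sigmoid to benchmark performance
# as a function of log10(parameters), then read off predicted performance
# at much larger scales. Data points below are hypothetical.

import numpy as np
from scipy.optimize import curve_fit

def sigmoid(log_n, mid, scale):
    return 1.0 / (1.0 + np.exp(-(log_n - mid) / scale))

log_params = np.log10([1.5e9, 13e9, 175e9])  # GPT-2/GPT-3-era model sizes
scores = np.array([0.25, 0.38, 0.55])        # hypothetical benchmark scores

(mid, scale), _ = curve_fit(sigmoid, log_params, scores, p0=[11.0, 1.0])

for n in [1e12, 100e12, 200e12]:
    pred = sigmoid(np.log10(n), mid, scale)
    print(f"{n:.0e} params -> predicted score {pred:.2f}")
```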
By the way – an important thing to note is that although 100 trillion is slightly suboptimal in
performance, there is an engineering technique OpenAI is using to bridge this gap. I’ll explain this
towards the very end of the document because it’s crucial to what OpenAI is building.
Lanrian’s post is one of many similar posts online – it’s an extrapolation of performance based on the
jump between previous models. OpenAI certainly has much more detailed metrics and they’ve come
to the same conclusion as Lanrian, as I’ll show later in this document.
So, if AI performance is predictable based on parameter count, and ~100 trillion parameters is enough
for human-level performance, when will a 100 trillion parameter AI model be released?
...
GPT-5 achieved proto-AGI in late 2023 with an IQ of 48
…
The first mention of a 100 trillion parameter model being developed by OpenAI came in the summer of 2021, mentioned offhand in a Wired interview by Andrew Feldman, CEO of Cerebras, a company in which Sam Altman is a major investor.
[Screenshot: Sam Altman’s response to Andrew Feldman at AC10, an online meetup and Q&A which took place in September 2021. It’s crucial to note what Sam Altman admits to here.]
(Sources: https://albertoromgar.medium.com/gpt-4-a-viral-case-of-ai-misinformation-c3f999c1f589
https://www.reddit.com/r/GPT3/comments/pj0ly6/sam_altman_gpt4_will_be_remain_textonly_will_not/
The reddit posting itself is sourced from a LessWrong post, which was deleted at Sam Altman’s request:
https://www.lesswrong.com/posts/aihztgJrknBdLHjd2/sam-altman-q-and-a-gpt-and-agi )
…
AI researcher Igor Baikov made the claim, only a few weeks later, that GPT-4 was being trained and
would be released between December and February. Again, I will prove that Igor really did have
accurate information and is a credible source. This will be important soon.
Gwern is a famous figure in the AI world – he is an AI researcher and blogger, and people use his subreddit to discuss AI topics deeper than what you’ll find in the mainstream. He messaged Igor Baikov on Twitter (in September 2022), and the reply he received described a model with a “colossal number of parameters” that would be “multimodal”.
A “colossal number of parameters”? Sounds like Igor Baikov was referencing a 100 trillion parameter model, as models of 500 billion and even 1 trillion parameters had already been trained many times by the time of his tweet in summer 2022 (making models of that size unexceptional and certainly not “colossal”).
These tweets from “rxpu”, seemingly an AI enthusiast (?) from Turkey, are interesting because they make a very similar claim about GPT-4’s release window before anyone else did (trust me – I spent many hours, daily, scouring the internet for similar claims, and no one else made this specific claim before he did).
He also mentions a “125 trillion synapse” GPT-4 – however, he incorrectly states GPT-3’s parameter count as 1 trillion. (It seems as though rxpu did have inside information but got the parameter counts mixed up – again, I will illustrate this later and prove that rxpu was not lying.)
…
This is a weaker piece of evidence, but it’s worth including because “roon” is fairly notable as a Silicon
Valley AI researcher, and is followed by Sam Altman, CEO of OpenAI, and other OpenAI researchers
on Twitter.
In November 2022 I reached out to an AI blogger named Alberto Romero. His posts seem to spread pretty far online so I was
hoping that if I sent him some basic info about GPT-4 he might do a writeup and the word would get out.
The results of this attempt were pretty remarkable as I’ll show in the next two pages.
[Screenshot: Alberto Romero’s post. The general response will be shown on the next two pages.]
So, after all, Igor really did mean “100 trillion parameters” – more on his reliability shortly.
Somewhere around Oct/Nov 2022 I became convinced that a 100 trillion parameter model really was in training.
OpenAI’s official position, as demonstrated by Sam Altman himself, is that the idea of a 100 trillion
parameter GPT-4 is “complete bullshit”. This is half true, as GPT-4 is a 1 trillion parameter subset of
the full 100 trillion parameter model.
Just to illustrate that the 100 trillion parameter model hasn’t arrived yet and is still in development, Semafor in
March 2023 (shortly after the release of GPT-4) claimed GPT-4 is 1 trillion parameters. (OpenAI has refused to
officially disclose parameter count).
Something else worth noting is that OpenAI claims GPT-4 “finished training” in August, whereas we know that a “colossal” multimodal model was being trained between August and October. One explanation is that OpenAI lied. Another possibility is that the 1 trillion parameter GPT-4 finished its first round of training in August but went through additional retraining between August and October, which is when the bulk of the full 100 trillion parameter model was trained.
I will now provide my evidence that GPT-4 was not just trained on text and images, but was also trained on video. (The screenshot on this page is not the most solid piece of evidence – I’m including it anyway.)
The point is, training a human-brain-sized AI model on all the image and video data on the internet will clearly be more than enough to handle complex robotics tasks. Common sense reasoning is buried in the video data, just like it’s buried in the text data (and the text-focused GPT-4 is stunningly good at common sense reasoning).
A recent example from Google of robotics capabilities being learned from a large vision/language model. (Minimal robotics data was required on top of the language and vision training, and the knowledge from visual and text tasks transferred to the robotics tasks. OpenAI is training their 100 trillion parameter model on “all the data on the internet”, which will undoubtedly include robotics data.) PaLM-E is a ~500 billion parameter model – what happens to robotics performance when you train a 100 trillion parameter model on all the data available on the internet? (More on Google’s PaLM-E model on the next page.)
[Screenshot: Another robotics development – this time from Tesla (May 2023): robots learning tasks from human DEMONSTRATIONS, and what that implies for robotics performance...]
The image on the left shows what the 1 trillion parameter GPT-4 is capable of in terms of image recognition. The response is already clearer and better written than what many humans would have come up with. So, again, what happens when you train a model 100 times larger than GPT-4, the size of the human brain, on all the data available on the internet?
Examples of the current level of quality of publicly available video & image generation AI models. These models are less than 10 billion parameters in size. What happens when you train a model 10,000 times larger, on all the data available on the internet, and give it the ability to generate images and video? (The answer: images and videos completely indistinguishable from the real thing, 100% of the time, with no exceptions, no workarounds, no possible way for anyone to tell the difference, no matter how hard they try.) - (update: SORA IS FROM GPT-5 Q* 2023 MODEL)
Two posts from Longjumping-Sky-1971. I’m including this because he accurately predicted the release date
of GPT-4 weeks in advance (no one else posted this information publicly beforehand, meaning he had an
inside source). His posts now have much more credibility – and he claimed image and audio generation
would be trained in Q3 of 2023. If video generation training is simultaneous or shortly after, this lines up with
Siqi Chen’s claim that GPT-5 finished training in December of 2023.
Let’s take a stroll back to 2019, when OpenAI was already describing its plan: an AI system trained on “images, text, and other data”.
The next slide will reveal some quotes from the President of OpenAI – from 2019 – and it will tell you what their plan was.
[Screenshot: quotes from OpenAI president Greg Brockman.]
2019 + 5 = 2024
Both of these sources are clearly referring to the same plan to achieve AGI – a human-brain-sized AI
model, trained on “images, text, and other data”, due to be trained within five years of 2019, so, by
2024. Seems to line up with all the other sources I’ve listed in this document...
Source: Time Magazine, Jan 12 2023
As I’ll show in these next few slides, AI leaders are suddenly starting to sound the alarm – almost like they know
something VERY SPECIFIC that the general public doesn’t.
Date of NYT interview: May 1 2023
“I thought it was 30 to 50 years or even longer away. Obviously, I no longer think that.”
What made Geoffrey Hinton suddenly change his mind – AND decide to leave Google to speak about the dangers of AI?
Shortly after the release of GPT-4, the Future of Life Institute, a highly influential organization, published an open letter calling for a pause on training runs more powerful than GPT-4. An early version of the letter included the phrase “(including the currently-being-trained GPT-5)”. Why was that included, and why was it removed?
Sam Altman’s blunt, immediate response, interrupting the man asking the question:
“Yes.”
Sam elaborates: “Yeah, we’re confident there is. We think about this and measure it quite a lot.”
Sam’s reply: “One of the things I think that OpenAI has driven in the field that’s been really healthy is that you can treat scaling laws as
a scientific prediction. You can do this for compute, you can do this for data, but you can measure at small scale and you can predict
quite accurately how it’s going to scale up. How much data you’re going to need, how much compute you’re going to need, how many
parameters you’re going to need, when the generated data gets good enough to be helpful… And the internet is…there’s a lot of data
out there. There’s a lot of video out there too.”
[Note – an AI winter is an extended period of time where the AI field receives limited funding and is not given much attention by serious researchers. This happened twice – once in the mid 70s to early 80s, and again from the late 80s through the 90s, with limited mainstream interest lingering until roughly the late 2000s.]
Sam Altman’s response: “Could we have an AI winter and what might cause it… yeah, of course. I think we won’t have one very soon.
Because even if we never figure out another research idea, the economic value of the current paradigm and how much further that
can be pushed is gonna carry us for many years to come. But it is possible, however unlikely, that we are still missing the key idea to
go beyond behavioral cloning and these models are gonna be, like, stuck at human-level forever. There’s a bunch of reasons why I
don’t think that’s true but if anyone tells you we could not possibly ever have another winter in this research field you should never
believe them.”
I detail why these Sam Altman quotes are concerning on the next page.
On Sam Altman’s Q&A
Firstly, Sam Altman seems highly, highly confident that there exists enough data on the internet to
train an AGI system – confident to the point that it makes one question if they’ve already done it, or
are in the process of doing it.
Secondly, the “AI winter” concept generally refers to a period where progress TOWARDS AGI has
been slowed, but Sam Altman retooled the term to refer to a period where progress TOWARDS
SUPERINTELLIGENCE is slowed. This seems to suggest that OpenAI has already built an AGI
system, or are very close to it, and AGI is no longer the goal because it already exists.
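For what it’s worth, the “scaling laws as a scientific prediction” idea Sam describes can be illustrated in a few lines: fit a power law to losses from small training runs, then predict the loss of a far larger run. Everything below is synthetic, purely to show the method – it is not OpenAI’s actual data or code:

```python
# Fit L(N) = a * N**(-alpha) + b to losses from small runs, then
# extrapolate to much larger models. All numbers here are synthetic.

import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, b):
    return a * n ** (-alpha) + b

n_params = np.array([1e7, 1e8, 1e9, 1e10])  # hypothetical small-scale runs
losses = np.array([4.8, 4.1, 3.5, 3.0])     # hypothetical final losses

(a, alpha, b), _ = curve_fit(power_law, n_params, losses,
                             p0=[20.0, 0.1, 2.0], maxfev=10_000)

# Predict the loss of models far larger than anything actually trained above
for n in [1e12, 100e12]:
    print(f"N = {n:.0e}: predicted loss {power_law(n, a, alpha, b):.2f}")
```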
As I mentioned earlier in the document, a 100 trillion parameter model is actually slightly suboptimal, but there is a new scaling
paradigm OpenAI is using to bridge this gap – it’s based on something called the “Chinchilla scaling laws.”
Chinchilla was an AI model unveiled by DeepMind in early 2022. The implication of the Chinchilla research paper was that current models are significantly undertrained, and that with far more training data (and the compute to process it) they would see a massive boost in performance without any need to increase parameter count.
The point is, while an undertrained 100 trillion parameter model may be slightly suboptimal, if it were trained on vastly more
data it would easily be able to EXCEED human-level performance.
The Chinchilla paradigm is widely understood and accepted in the field of machine learning, but just to give a specific example
from OpenAI, President Greg Brockman discusses in this interview how OpenAI realized their initial scaling laws were flawed,
and have since adjusted to take the Chinchilla laws into account: https://youtu.be/Rp3A5q9L_bg?t=1323
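To put numbers on what a compute-optimal 100 trillion parameter model implies, here is the standard back-of-the-envelope arithmetic, using the widely cited ~20-tokens-per-parameter rule of thumb from the Chinchilla paper and the C ≈ 6·N·D approximation for training FLOPs. (The 100 trillion figure itself is this document’s claim, not a confirmed number.)

```python
# Chinchilla-style arithmetic: compute-optimal training uses roughly
# 20 tokens per parameter; training compute is approximately 6 * N * D FLOPs.

N = 100e12                # parameters (the rumored model size)
TOKENS_PER_PARAM = 20     # Chinchilla rule of thumb
D = TOKENS_PER_PARAM * N  # compute-optimal training tokens
C = 6 * N * D             # approximate training FLOPs

print(f"Optimal tokens: {D:.1e}")   # ~2e15 tokens (2 quadrillion)
print(f"Training FLOPs: {C:.1e}")   # ~1.2e30 FLOPs
```

For reference, GPT-3’s training run was reported at roughly 3.14e23 FLOPs, so this would be millions of times more compute – which is exactly why the cost objection below comes up.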
People have said, “training a compute optimal 100 trillion parameter model would cost billions of dollars and just isn’t feasible.”
Well, Microsoft just invested $10 billion into OpenAI in early 2023, so I guess it isn’t such a ridiculous possibility after all...
Alberto Romero wrote about DeepMind’s Chinchilla scaling breakthrough. Chinchilla showed that, despite being vastly smaller
than GPT-3 and DeepMind’s own Gopher, it outperformed them as a result of being trained on vastly more data. Just to
reiterate this one more time: although a 100 trillion parameter model is predicted to achieve slightly suboptimal performance,
OpenAI is well aware of the Chinchilla scaling laws (as is pretty much everyone else in the AI field), and they are training Q* as
a 100 trillion parameter multimodal model that is COMPUTE OPTIMAL and trained on far more data than they originally
intended. They have the funds to do it now, through Microsoft. This will result in a model that FAR, FAR exceeds the
performance of what they had initially planned for their 100 trillion parameter model. 100 trillion parameters without Chinchilla
scaling laws = roughly human-level but slightly suboptimal. 100 trillion parameters, multimodal, WITH Chinchilla scaling laws
taken into account = …………?
Starting in July of 2022, the US started making moves to block new computer chips from being sent to China, in an attempt to
halt their AI progress. This plan was finalized in October of 2022. According to a San Francisco AI researcher, Israel Gonzales-
Brooks, Sam Altman was in DC in September of 2022. Israel has claimed to be in contact with Sam Altman (I could not verify
this), but what gives him credibility is the fact that Sam Altman was confirmed to have taken a trip to DC in January of 2023.
If GPT-4/GPT-5 began training in the summer of 2022, and Sam Altman visited DC during this time (probably multiple times),
the China chip ban can’t possibly be a coincidence.
OpenAI planned to build human-level AI by 2027 and then scale up to superintelligence. This has been delayed because of Elon Musk’s lawsuit, but it will still be coming shortly.
In closing, I’ll reveal an incredible source of info – which comes from Scott Aaronson, the famous computer scientist. In the summer of 2022 he joined OpenAI for one year to work on AI safety… and he had some very interesting things to say about it on his blog, as I’ll show next.
…