Elon Musk unveils Grok-3 and Grok-3 mini that outperform other AI chatbots

In a highly anticipated online live-stream, Elon Musk, owner of X and xAI and CEO of Tesla and SpaceX, unveiled the latest versions of the Grok AI chatbot: Grok-3 and Grok-3 mini.

Top members of xAI’s engineering team joined Musk in the presentation to showcase the capabilities of Grok-3 and its scaled-down version, Grok-3 mini.

According to xAI’s internal testing and public benchmarking tools, both Grok-3 and Grok-3 mini have surpassed existing chatbots such as ChatGPT, DeepSeek-R1, and Gemini-2 Flash Thinking by a large margin.

Elon Musk founded xAI in July 2023 to compete with companies like OpenAI, where he was an early investor before leaving the company over conflicts with other board members.

Elon Musk and his team at xAI unveil the Grok-3 AI chatbot to the world on Monday 17th Feb 2025. “Our mission is to understand the universe”. Credit: Elon Musk / xAI / X (Twitter).

Benchmarking Against Other Chatbots

Elon Musk’s xAI tested its Grok-3 and Grok-3 mini chatbots against other advanced AI chatbots in three major areas: mathematics, science, and computer programming (coding and game development).

On these benchmarks, Grok-3 topped the charts against existing chatbots in all three areas.

Grok-3 is designed to think deeply. According to its creators, it reasons through a problem repeatedly, solving the same problem many times before concluding which solution is right.
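
xAI has not published how Grok-3 combines its repeated attempts, so the following is only a generic sketch of the "solve it several times, then keep the most common answer" idea (often called majority voting). The helper `majority_answer` and the sample answers are hypothetical and are not taken from Grok-3.

```python
from collections import Counter


def majority_answer(attempts):
    """Return the most frequent answer and the share of attempts that agree.

    `attempts` is a list of final answers from independent solution attempts.
    Hypothetical helper for illustration only; not xAI's actual method.
    """
    if not attempts:
        raise ValueError("need at least one attempt")
    counts = Counter(attempts)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(attempts)


# Made-up final answers from five independent attempts at one math problem.
sample_attempts = ["204", "204", "198", "204", "210"]
answer, agreement = majority_answer(sample_attempts)
print(answer, agreement)
```

In this toy run, three of the five attempts agree, so that answer is kept and the agreement share serves as a rough confidence signal.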

Graph 1: Grok-3 and Grok-3 mini benchmarking results in Maths, Science, and Coding abilities against Gemini-2 Pro, DeepSeek-V3, Claude 3.5 Sonnet, and ChatGPT 4o. Credit: xAI / Elon Musk.

Math: AIME 2024 and 2025

During the presentation, an xAI engineer said, “Grok-3 is ready to go to college.” As the benchmark comparison bar chart above shows, Grok-3 and Grok-3 mini performed better than the competition on the AIME’24 mathematics test.

Even Grok-3 mini scored 40, while DeepSeek-V3 got 39 points. The larger Grok-3 version outperformed every other AI chatbot on the market with a benchmark score of 52; the closest to it is its own smaller sibling, Grok-3 mini. Interestingly, ChatGPT’s GPT-4o did the worst in this area, with a score of only 9.

xAI has named the current early version of Grok-3 “Chocolate”. Elon Musk’s team at xAI is continually improving the Grok model to solve even more complex mathematical challenges.

Graph 2: AIME 2025 Performance Bar Chart: Grok-3 Reasoning Beta and Grok-3 mini reasoning compared to ChatGPT o3 mini (high), o1, DeepSeek-R1, and Gemini-2 Flash Thinking. Credit: xAI / Elon Musk via X.

Science: GPQA

The mini version of Grok-3 scored 65 points on Ph.D.-level science questions, the same score as Gemini-2 Pro and Claude 3.5 Sonnet. In this area, China’s DeepSeek-V3 scored only 59 points.

The larger Grok-3 version did exceptionally well in science, with 75 points on the GPQA benchmark.

Grok acing the science test is a positive sign that it may be able to understand the universe better than other AI bots in the future.

Coding / Game Development

Looking at Graph 1 above, we can see that both Grok-3 and Grok-3 mini have an advantage in coding and game development over competing AI chatbots.

Using Grok-3 and its Big Brain option (extra compute), Elon Musk created a combination of Tetris and Bejeweled with the following plain-English prompt:

Using pygame, make a game that is a mix of Tetris and Bejeweled. The code could be very long. Output it as one file. Make it insanely great.

The user did no prompt engineering to make this game. The prompt also left Grok-3 free to be creative and to choose its own criteria in putting the video game together.

“We’re seeing the beginnings of creativity,” Elon Musk said.

Screenshot of Grok-3-created hybrid video game of Tetris and Bejeweled. Credit: xAI / Elon Musk via X (live-stream recording video below).

ChatBot Arena (LMSYS) Benchmarking

xAI’s Grok-3 Chocolate (early version) outperformed all other AI chatbots tested on the ChatBot Arena benchmarking system.

Grok-3 scored 1400 points in the ChatBot Arena benchmarking test. The closest AI chatbot to Grok-3 is Google’s Gemini-2 Flash Thinking, which scored between 1380 and 1400 (see Graph 3 below).
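
To help read a gap such as 1400 versus 1380, here is a rough sketch that converts a rating difference into an expected head-to-head win rate. It assumes Arena scores follow the standard Elo convention (base 10, scale 400), which this article does not confirm. The function name `expected_win_rate` is hypothetical.

```python
def expected_win_rate(rating_a, rating_b, scale=400.0):
    """Expected share of head-to-head wins for A over B under the Elo formula.

    Assumes the standard Elo convention (base 10, scale 400); illustrative only.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Scores cited above: 1400 for Grok-3 and 1380, the low end of the range
# cited for the runner-up.
print(round(expected_win_rate(1400, 1380), 3))
```

Under that assumption, equal ratings give 0.5, and larger gaps push the value toward 1.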

Graph 3: Grok-3 Chocolate vs other AI chatbots in ChatBot Arena benchmarking test (LMSYS). Credit: xAI / Elon Musk via X.
Video: Recording of the Grok-3 AI chatbot unveiling live-stream.

Iqtidar Ali (http://www.teslaoracle.com)
Author of more than 1500 articles on Tesla, SpaceX, and EVs. His work has been liked and tweeted by Elon Musk and other prominent influencers. You can reach him on Twitter @IqtidarAlii
