OpenAI Releases ChatGPT o1 Reasoning Model

Generative AI

September 13, 2024

Rokas Jurkėnas

OpenAI just launched the ChatGPT o1 reasoning model. It is designed to help users solve complex math and programming problems. Below, we discuss its main functions and what makes it different.

How is the ChatGPT o1-preview model different?

The ChatGPT o1-preview model is designed to solve more complex problems by spending more time reasoning through tasks, making it better suited for challenging domains such as science, coding, and math.

It is a significant advance over previous models such as GPT-4o, as evidenced by its much higher scores on a qualifying exam for the International Mathematics Olympiad and in coding competitions. However, it lacks some features that GPT-4o provides, such as browsing and file uploading.

It was also trained with a new safety approach that greatly improves its ability to comply with safety policies compared to previous models.

For now, paid users get only 30 messages per week with this early model, and in the future it will probably not be cheap to use for daily work.

Which benchmarks has ChatGPT o1 excelled in?

Here are some benchmark details for more technical readers:

Programming Competitions (Codeforces): ChatGPT o1 ranked in the 89th percentile, demonstrating strong coding abilities in competitive programming scenarios and significantly outperforming previous models like GPT-4o.
Mathematical Competitions (AIME 2024): On the AIME, the qualifying exam for the USA Mathematical Olympiad, it solved 74% of problems on average with a single attempt per problem. OpenAI reports that with re-ranking over many sampled answers, its score places it among the top 500 students in the U.S.
Science Problems (GPQA Diamond): The model exceeded human PhD-level accuracy on physics, biology, and chemistry questions, marking it as a leader in solving complex scientific problems.
MMLU Benchmark: It outperformed GPT-4o on 54 of the 57 subcategories in the MMLU (Massive Multitask Language Understanding) benchmark.
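
To put the AIME percentage in concrete terms, here is a quick back-of-the-envelope check. It assumes the standard 15-question AIME format; the variable names are illustrative only and not taken from any OpenAI material.

```python
# Assumption: each AIME exam has 15 questions.
AIME_QUESTIONS = 15

# Average share of problems solved, as quoted in the list above.
avg_fraction_solved = 0.74

# Convert the percentage into an average number of problems solved.
avg_problems_solved = AIME_QUESTIONS * avg_fraction_solved
print(f"About {avg_problems_solved:.1f} of {AIME_QUESTIONS} problems solved on average")
```

In other words, an average of 74% corresponds to roughly 11 of the 15 problems.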

The New o1 Model Can Finally Count How Many Rs Are In Strawberry

A funny quirk of earlier models was that they could not do simple tasks like telling how many N letters are in the word banana or how many R letters are in the word strawberry. The new model is reported to finally handle these tasks, though this is not yet fully confirmed.
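
For reference, the correct answers to these letter-counting questions are easy to verify with a few lines of ordinary code. The sketch below is only a sanity check for the expected answers, not a description of how any language model works; `count_letter` is a hypothetical helper name chosen for this example.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# The two examples mentioned in the article.
print(count_letter("strawberry", "r"))  # 3
print(count_letter("banana", "n"))      # 2
```

Anyone testing a model on these prompts can compare its reply against these values.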

The Future Of ChatGPT Models and What To Expect In The Next Update

OpenAI stated on social media that the long-term idea is for these reasoning models to think for hours, days, or even weeks to come up with the best answer to a query. We will see how that plays out in the future.

The model is still in preview, but the final version looks promising.

Final thoughts

The new OpenAI ChatGPT o1 model is all about solving complex problems. It handles advanced tasks in math, coding, and science more effectively than its predecessors. This update is especially impressive in competitive settings, with strong results on math and coding competition benchmarks. On the other hand, it doesn’t have the browsing and file-uploading features you’ll find in previous models.

The ChatGPT o1 model is great for tricky reasoning tasks. It shines in areas like physics, biology, and coding. It’s ranked pretty high in the benchmarks, and it’s outperforming some older models like GPT-4o. Even with these improvements, it’s still limited to 30 messages per week for early-access users.