OpenAI Introduces “o3” and “o4‑mini,” the Company’s Most Sophisticated Multimodal Models to Date

OpenAI today unveiled two new models, o3 and o4‑mini, designed to raise the ceiling on advanced reasoning while adding native visual‑understanding capabilities. The announcement, published on the company’s site, signals OpenAI’s intent to maintain technological leadership as global competition in AI intensifies.

OpenAI describes the flagship o3 model as its most capable reasoning system to date. The model processes both text and images, enabling it to interpret sketches, photographs, and data visualizations, then synthesize that information alongside traditional text and code inputs. According to OpenAI, o3 integrates with the full suite of ChatGPT tools, allowing users to conduct live web searches, generate images, and perform in‑conversation data analysis without leaving the chat interface.
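For developers, the same multimodal behavior is exposed through OpenAI’s API. Below is a minimal sketch of a text‑plus‑image request using the OpenAI Python SDK’s Chat Completions interface; the prompt and image URL are placeholders, and exact model identifiers and account access should be confirmed against OpenAI’s current documentation.

```python
# Minimal sketch: sending text plus an image to o3 via the OpenAI Python SDK.
# The prompt and image URL are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the trend shown in this chart."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/quarterly-revenue.png"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```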

Complementing the larger model is o4‑mini, along with o4‑mini‑high, a variant that applies more reasoning effort per request in exchange for slower responses. The smaller model is engineered for workloads that require low latency and cost efficiency while retaining strong reasoning performance. Both models are immediately accessible to ChatGPT Plus, Pro, and Team subscribers, with staged API availability for enterprise customers beginning in the coming weeks. In parallel, OpenAI will retire earlier offerings such as o1 and o3‑mini.
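As a sketch of how that latency‑versus‑depth trade‑off looks in practice, the snippet below calls o4‑mini with the reasoning_effort parameter OpenAI exposes for its o‑series models; treat the parameter values and model name as assumptions to verify against the current API reference.

```python
# Sketch: a latency-sensitive call to o4-mini. Setting reasoning_effort
# higher ("high") approximates the deeper-thinking o4-mini-high behavior;
# "low" favors fast, inexpensive responses.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o4-mini",
    reasoning_effort="low",  # "low" | "medium" | "high"
    messages=[
        {"role": "user", "content": "Classify this support ticket: 'App crashes on login.'"}
    ],
)

print(response.choices[0].message.content)
```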

By expanding the system’s ability to “think with images,” OpenAI is targeting new use cases in design automation, medical imaging, robotics, and other data‑rich domains where visual context is critical. The launch follows this week’s incremental GPT‑4.1 release and arrives amid increased pressure from rivals such as Google’s Gemini 2.5 Pro and DeepSeek’s R‑series.

OpenAI chief executive Sam Altman framed the release as another step toward “generalist agents capable of solving real‑world problems end‑to‑end,” hinting at the trajectory for the company’s next‑generation GPT‑5 initiative.

What remains to be seen is how quickly the broader developer ecosystem adopts these models and how they will fare against the rapidly advancing competition from China‑based research labs and American Big Tech peers.


Which will have the more immediate impact on your workflows: o3’s multimodal reasoning or the speed‑optimized o4‑mini?

Angel Morales

Founder and lead writer at Duck-IT Tech News, dedicated to delivering the latest news, reviews, and insights in the world of technology, gaming, and AI. With experience in the tech and business sectors, he combines a deep passion for technology with a talent for clear and engaging writing.
