The OpenAI Era at Microsoft Is Over | Microsoft Released Its Own AI Models

picture of two computers in a blurry background split in the middle

Summary

Microsoft released three proprietary AI models today: MAI-Transcribe-1 for speech-to-text, MAI-Voice-1 for voice generation, and MAI-Image-2 for image generation. All three are available now through Microsoft Foundry. This is the first major model release since Microsoft reorganized its AI division in March and shifted AI chief Mustafa Suleyman away from Copilot to focus entirely on building frontier models. It is also the clearest signal yet that Microsoft is actively reducing its dependence on OpenAI.

What Microsoft Released Today

Three models dropped today, covering three of the most commercially valuable AI modalities. Each one goes directly at OpenAI, Google, or both.

MAI-Transcribe-1 is Microsoft’s first in-house speech-to-text model. It supports 25 languages and claims the lowest Word Error Rate on the FLEURS benchmark against OpenAI’s Whisper-large-v3, Google’s Gemini 3.1 Flash, GPT-Transcribe, and ElevenLabs’ Scribe v2. Microsoft says it beats Whisper on all 25 languages and Gemini on 22 of 25. Batch transcription speed is 2.5 times faster than Microsoft’s own Azure Fast offering. Pricing starts at $0.36 per hour of audio, which Microsoft says is roughly half the GPU cost of comparable alternatives.

MAI-Voice-1 is a speech generation model that produces 60 seconds of audio in under one second on a single GPU. It also lets developers clone a custom voice from a short audio snippet. Pricing starts at $22 per million characters. MAI-Image-2 is Microsoft’s text-to-image model, already ranked third on the Arena.ai image generation leaderboard. It is rolling out in Bing and PowerPoint now. Pricing starts at $5 per million tokens for text input.

All three are available immediately through Microsoft Foundry and a new MAI Playground where developers can test them directly. TechCrunch’s coverage notes that Suleyman told VentureBeat these are “the very best in the world for transcription” and that Microsoft can deliver them at half the GPU count of state-of-the-art competition.

The Backstory: How Microsoft Got Here

This did not happen overnight. Microsoft has been quietly building toward AI self-sufficiency for nearly a year.

The pivot started with a renegotiated contract with OpenAI, signed in October 2025, which formally allowed both companies to independently pursue AGI without the exclusivity restrictions that had previously locked them together. The revised agreement gave Microsoft the legal and strategic freedom to build competing models while still maintaining the partnership and accessing OpenAI’s technology.

In March 2026, CEO Satya Nadella reorganized Microsoft’s AI division. Mustafa Suleyman, the former DeepMind co-founder Microsoft acquired through a $650 million deal for his startup Inflection in 2024, was moved off day-to-day Copilot product oversight and given a single mandate: build world-class frontier AI models over the next five years. Jacob Andreou, a former Snap executive, took over Copilot operations and unified Microsoft’s fragmented AI assistant products under a single structure.

Today’s model releases are described by VentureBeat as the “opening salvo” from Suleyman’s new superintelligence team. The team was formed six months ago and today’s drop is their first public output.

Why Copilot’s Numbers Made This Inevitable

The reorganization did not happen because everything was going well. Copilot’s market share numbers are brutal.

A Recon Analytics survey of more than 150,000 paid AI subscribers found that Copilot’s market share fell from 18.8% in July 2025 to 11.5% by January 2026, a 39% drop in six months. The finding that got the most attention internally: when workers only have access to Copilot, adoption sits at 68%. Add ChatGPT as an option and Copilot drops to 18%. Add Gemini on top of that and just 8% choose Copilot. Microsoft has 450 million users it can put Copilot in front of, and nine out of ten choose something else when given the option.

Microsoft’s stock closed its worst quarter since the 2008 financial crisis heading into April, falling roughly 17% year-to-date as investors demand proof that the hundreds of billions spent on AI infrastructure is producing returns. Q2 FY2026 revenues came in at $81.3 billion, up 17% year-over-year, which is still strong, but the Copilot story investors were banking on has not materialized in adoption numbers.

The reorg separates the product problem from the research problem. Andreou comes from consumer growth at Snap and takes on the adoption challenge. Suleyman is freed from quarterly product pressure to build the models that would make Copilot genuinely competitive rather than just accessible.

What This Means for the OpenAI Relationship

Microsoft and OpenAI are still partners. But the dynamic has shifted permanently.

Microsoft has invested more than $13 billion in OpenAI and still hosts its models across Azure, Copilot, and other products. Suleyman reaffirmed that partnership in his interview with VentureBeat ahead of today’s announcement. The two companies are also in discussions to extend their agreement through 2030 for early model access.

But the relationship is also showing cracks. The Financial Times reported in March that Microsoft is weighing legal action over a reported $50 billion deal that would make AWS the exclusive third-party cloud provider for OpenAI’s enterprise platform, potentially breaching Microsoft’s Azure exclusivity agreement. And Microsoft is now building models that compete directly with OpenAI’s offerings in transcription, voice, and image generation, using the same developer infrastructure.

The framing Suleyman gave The Verge captures the current posture well. Revising the OpenAI contract, he said, “unlocked Microsoft’s ability to pursue superintelligence” and created a “best-of-both environment” where Microsoft is “free to pursue our own superintelligence and also work closely with them.” That is a polite way of saying the dependency is over.

What Comes Next

Today’s three models are the start, not the end. Suleyman has signaled more releases are coming across Microsoft’s own products and through Foundry.

MAI-Transcribe-1 is already rolling out in Copilot Voice mode and Microsoft Teams for conversation transcription, which means Microsoft is actively replacing components of its own products that previously relied on third-party or OpenAI models. MAI-Image-2 is rolling out in Bing and PowerPoint now. The next wave will likely push into reasoning and language models, which is where the direct competition with OpenAI’s core products becomes unavoidable.

The broader question the industry is watching is whether Microsoft can actually close the capability gap in language models, where OpenAI still leads by a significant margin. Building best-in-class transcription and image generation is achievable. Building GPT-5.4-level reasoning from scratch is a different order of challenge and will take the years Suleyman has given himself.

For now, today’s release does what it needs to do: it demonstrates that Microsoft AI can ship models that beat the market’s best on real benchmarks, at competitive prices, from its own infrastructure. That is a credible opening position for the superintelligence push that follows.

Frequently Asked Questions (FAQs)

What models did Microsoft release today?

Microsoft released three in-house AI models on April 2, 2026: MAI-Transcribe-1, a speech-to-text model supporting 25 languages that outperforms OpenAI’s Whisper-large-v3 on all 25 languages on the FLEURS benchmark; MAI-Voice-1, a voice generation model that produces 60 seconds of audio in under one second and supports custom voice cloning; and MAI-Image-2, a text-to-image model ranked third on the Arena.ai image generation leaderboard. All three are available through Microsoft Foundry and MAI Playground.

Is Microsoft breaking up with OpenAI?

No, but the relationship is changing significantly. Microsoft renegotiated its OpenAI contract in October 2025, giving both companies the freedom to independently pursue AGI. Microsoft still maintains its $13 billion investment in OpenAI, still accesses OpenAI’s models across its products, and is reportedly extending the partnership through 2030. However, Microsoft is now building and releasing its own models that compete directly with OpenAI offerings in transcription, voice, and image generation.

Who is Mustafa Suleyman and what is his role at Microsoft?

Mustafa Suleyman is the CEO of Microsoft AI. He co-founded Google DeepMind, then founded the AI startup Inflection, which Microsoft acquired for $650 million in 2024. After two years overseeing Copilot, he was reassigned in March 2026 to focus exclusively on building frontier AI models under the new MAI Superintelligence team. His mandate is to deliver world-class proprietary models over the next five years, with today’s three-model release being the team’s first public output.

Why did Microsoft reorganize its AI division in March 2026?

Microsoft reorganized its AI division in response to Copilot’s declining market share, which fell from 18.8% to 11.5% among paid AI subscribers between July 2025 and January 2026. Research found that when workers have access to both Copilot and competitors like ChatGPT, only 8% choose Copilot. CEO Satya Nadella moved Suleyman to focus on foundational model development and brought in Jacob Andreou, a former Snap executive with a consumer growth background, to unify and revive the Copilot product.

What is Microsoft Foundry?

Microsoft Foundry is Microsoft’s developer AI platform that provides access to AI models, tools, infrastructure, and security for building and scaling AI applications. It previously offered access to third-party models including OpenAI’s GPT series. With today’s release, Microsoft’s own MAI models are now available through Foundry alongside third-party offerings, giving developers a first-party Microsoft AI stack for the first time.