
OpenAI's Codex Becomes a Super-App: Computer Use, Atlas Browser, Image Gen, and 111 Plugins

April 21, 2026 · 8 min read · By T.W. Ghost
OpenAI · Codex · GPT-5.4 · AI Coding · Claude Code · Developer Tools · Computer Use · Atlas

Release Summary

OpenAI shipped a major Codex update on April 16, 2026, under the name "Codex for (almost) everything." This is not a new product. It is a repositioning of the desktop app they launched in February from "agentic coding assistant" into what OpenAI's engineering lead called *"building a super app in the open."*

Three headline features landed:

  • Background Computer Use. Codex can now see your screen, move the cursor, click, and type into any macOS application.
  • In-app browser built on OpenAI's Atlas browser technology.
  • Inline image generation via gpt-image-1.5 built directly into the coding app.

Plus 90-plus (another source counts 111) new plugins, scheduled automations, memory preview, multi-terminal support, SSH to remote devboxes, and direct GitHub PR review in the app.

The strategic read: OpenAI is fighting for workflow ownership. Claude Code has won the code-quality benchmark war and most of the enterprise quality-conscious buyers. OpenAI's response is not to chase Claude on benchmarks. It is to absorb everything a developer does into one app that happens to include coding.


The Models Powering the Update

Codex now runs on three models depending on the task:

  • GPT-5.4 (1M token context, flagship reasoning)
  • GPT-5.3-Codex (coding-specialized variant)
  • GPT-5.3-Codex-Spark (fast variant on Cerebras WSE-3 hardware, 1,000-plus tokens per second)

Spark is the interesting one. It runs on Cerebras silicon and delivers the latency you need when an agent is trying to drive a macOS UI in real time. Cloud inference on standard H100 pods cannot realistically drive computer use at a pace that feels responsive.
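The three-model split implies a routing decision: pick by latency and reasoning needs. Here is a minimal sketch of that routing logic in Python. The model names come from the release notes above; the selection function and its categories are illustrative, not part of any OpenAI SDK.

```python
# Illustrative router for the three Codex models described above.
# Model names are from the release; the routing logic is a sketch,
# not an official API.

def pick_model(task_kind: str, needs_realtime: bool = False) -> str:
    """Route a task to the model that fits its latency/reasoning profile."""
    if needs_realtime:
        # Driving a live macOS UI needs Spark-class latency (1,000+ tok/s).
        return "gpt-5.3-codex-spark"
    if task_kind == "coding":
        return "gpt-5.3-codex"
    # Long-context, reasoning-heavy work goes to the flagship.
    return "gpt-5.4"

print(pick_model("coding"))                               # gpt-5.3-codex
print(pick_model("ui-automation", needs_realtime=True))   # gpt-5.3-codex-spark
print(pick_model("refactor-plan"))                        # gpt-5.4
```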


Feature 1: Background Computer Use

This is the feature that made the announcement. Codex can now operate macOS applications in the background while you work uninterrupted in the foreground. Multiple agents can run in parallel across different apps without interfering with your own work.

What it does well:

  • Testing native macOS apps (Figma, Xcode, Slack) where there is no clean API
  • Running an agent against a GUI-only vendor tool while you do something else
  • Frontend iteration where you want the agent to click around the actual rendered UI
  • Game development loops against a running build

What it does not do yet:

  • No Windows support. The Codex desktop app has been on Windows since March 4, but computer use has not been announced for Windows. Microsoft's own Copilot Studio solves computer use differently, and OpenAI has not committed to a timeline.
  • Not available in the EU, UK, or Switzerland at launch. This is almost certainly AI Act compliance work, not a technical limitation.
  • One-time permission grant. You grant macOS Accessibility and Screenshot permissions once. After that, any agent in any thread can take control. There is no per-session gate. Review what you want the agent allowed to do before granting.
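Given the one-time grant, it is worth understanding what a per-session gate would even look like. The sketch below is entirely hypothetical: Codex exposes no such hook today, and every name here is invented. It only illustrates the shape of the per-session check the last bullet says is missing.

```python
# Hypothetical per-session permission gate. Codex does not expose this:
# once macOS Accessibility is granted, any agent in any thread can act.
# This sketch only shows the gate the article says is absent.

class ComputerUseGate:
    def __init__(self, allowed_apps: set[str]):
        self.allowed_apps = allowed_apps
        self.session_approved = False

    def approve_session(self) -> None:
        """User explicitly opts in, for this session only."""
        self.session_approved = True

    def may_control(self, app_name: str) -> bool:
        return self.session_approved and app_name in self.allowed_apps

gate = ComputerUseGate(allowed_apps={"Figma", "Xcode"})
print(gate.may_control("Figma"))   # False: no session approval yet
gate.approve_session()
print(gate.may_control("Figma"))   # True
print(gate.may_control("Slack"))   # False: not on the allowlist
```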

Feature 2: Atlas Browser Embedded

OpenAI's Atlas browser is now embedded in the Codex desktop app. You can point it at a localhost development server, annotate rendered elements directly on the page, and give the agent precise instructions tied to real DOM locations.

Limitations at launch:

  • Localhost only. Full browser control (drive the real web) is "coming" but not shipping yet.
  • Unauthenticated pages only. No logged-in sessions, no cookies passed through.

For frontend developers working against a dev server, this is already a meaningful improvement over the copy-paste-screenshot loop. "Make the button padding a bit larger" with a pointer at the actual rendered button beats describing it in words.
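The annotate-and-point workflow amounts to attaching a natural-language instruction to a concrete DOM location on a localhost server. A sketch of what such an instruction payload might look like; the field names and function are invented for illustration, not Codex's actual schema.

```python
# Illustrative only: tie an instruction to a real DOM location on a
# localhost dev server, the way the embedded Atlas browser does.
# Field names are invented, not Codex's internal format.

def make_annotation(selector: str, instruction: str,
                    url: str = "http://localhost:3000") -> dict:
    if not url.startswith("http://localhost"):
        raise ValueError("Atlas embedding is localhost-only at launch")
    return {"url": url, "selector": selector, "instruction": instruction}

note = make_annotation("button.checkout", "Make the padding a bit larger")
print(note["selector"])  # button.checkout
```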


Feature 3: Image Generation Inline

gpt-image-1.5 is now inline in Codex. No more switching to ChatGPT to mock up an interface idea. Image generation in the coding tool unlocks workflows like:

  • Generate a button mockup while coding the component
  • Create placeholder game assets while building the engine
  • Iterate on a UI concept in the same thread as the implementation

One caveat Build Fast With AI's review flagged: image generation burns 3 to 5 times more tokens than equivalent text tasks. If you are on the Plus plan, the ceiling comes fast.
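That 3-to-5x multiplier makes back-of-envelope budget math worth doing before you lean on image generation. The multiplier is from the review cited above; the per-task token count and daily budget below are made-up illustrative numbers, not OpenAI's actual limits.

```python
# Back-of-envelope token budgeting. The 3-5x image multiplier is from
# the review cited above; the budget and per-task cost are illustrative
# assumptions, not real OpenAI plan limits.

TEXT_TASK_TOKENS = 2_000   # assumed average text task
IMAGE_MULTIPLIER = 4       # midpoint of the reported 3-5x range
DAILY_BUDGET = 200_000     # hypothetical Plus daily allowance

def tasks_until_ceiling(image_task_ratio: float) -> int:
    """How many mixed tasks fit in the budget, given an image share."""
    avg = (TEXT_TASK_TOKENS * (1 - image_task_ratio)
           + TEXT_TASK_TOKENS * IMAGE_MULTIPLIER * image_task_ratio)
    return int(DAILY_BUDGET // avg)

print(tasks_until_ceiling(0.0))  # 100 pure-text tasks
print(tasks_until_ceiling(0.5))  # 40: half images cuts throughput sharply
```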


The 90-Plus Plugin Ecosystem

Codex now supports 90-plus (or 111, depending on which source you read) integrations, MCP servers, and skills. The notable names:

  • Version control: GitHub, GitLab
  • Project management: Atlassian Rovo (Jira, Confluence), Linear
  • CI/CD: CircleCI, Render
  • Communication: Slack
  • Productivity: Notion, Google Workspace, Microsoft Suite
  • PR review: CodeRabbit

The interesting framing is *"curated."* OpenAI is positioning the plugin set as vetted for security, in contrast to the fragmented MCP ecosystem Claude Code and other tools support (Anthropic's ecosystem now has 3,000-plus servers, most unvetted).

This is a real trade-off. Curated means fewer options but less risk of a rogue plugin pulling your API keys. Open means every possible integration but you are on your own to review them.
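In practice, curated-vs-open is just an allowlist policy. A minimal sketch of the difference; the plugin names come from the list above, and the policy function is illustrative, not Codex internals.

```python
# Illustrative allowlist policy: the curated model in one function.
# Plugin names are from the article's list; nothing here is Codex code.

CURATED = {"GitHub", "GitLab", "Linear", "Slack", "Notion", "CodeRabbit"}

def can_install(plugin: str, mode: str = "curated") -> bool:
    if mode == "curated":
        return plugin in CURATED  # fewer options, vetted set
    return True                   # open: anything goes, you review it

print(can_install("GitHub"))                     # True
print(can_install("random-mcp-server"))          # False under curation
print(can_install("random-mcp-server", "open"))  # True, at your own risk
```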



Pricing: The Token Burn Problem

OpenAI switched Codex billing from per-message to token-based as of April 2, 2026. The full matrix:

Plan          Cost         Codex Access
Plus          $20/month    Standard daily use
Pro           $200/month   Full ChatGPT + Codex power user
$100 add-on   +$100        Doubles Plus Codex limits, adds Spark access
Business      $20/seat     Token-based team billing
Enterprise    Custom       Custom token allocation

The $100 add-on is the tell. OpenAI introduced it because *"Plus users will hit ceilings very quickly"* once they start using image generation and multi-threaded agents. Plus was fine when Codex was mostly text prompts against code. It is not fine when the agent is rendering images, driving three apps at once, and running a scheduled automation overnight.

Compare to Anthropic:

  • Claude Pro: $20/month
  • Claude Max: $100/month
  • Claude Team: $25/seat

Both vendors have converged on effectively the same three price points for the same three user segments. The competition is no longer on price.


Codex vs Claude Code: Honest Comparison

Both tools are strong. They are optimizing for different things.

Dimension                           Codex                              Claude Code
Terminal-Bench (shell automation)   77.3%                              65.4%
SWE-bench Verified (complex code)   ~49%                               80.8%
Context window                      1M (GPT-5.4)                       1M (beta)
Plugins                             90-plus curated                    3,000-plus MCP (unvetted)
Computer use (macOS)                Built in                           Via MCP integrations
Image generation                    gpt-image-1.5 inline               Not available
Desktop platforms                   Mac, Windows, experimental Linux   Mac, Windows, VS Code, web, mobile
Starting price                      $20                                $20
Power tier                          $100 add-on or $200 Pro            $100 Max

Choose Codex if:

  • You live in the terminal and DevOps-adjacent workflows
  • You do frontend visual iteration and want image gen in the loop
  • You want to automate Mac GUI apps that have no API
  • You need scheduled background work running overnight

Choose Claude Code if:

  • You work on large codebases and complex refactors where reasoning quality matters most
  • You need the highest SWE-bench scores (80.8% vs ~49%)
  • You need broad MCP ecosystem access to integrate with a long tail of niche tools
  • You need web, VS Code, and mobile parity (Anthropic ships to more surfaces)

Neither is obsolete. Most teams benefit from running both and routing tasks to the stronger one.
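The two "choose" lists above reduce to a routing table. The sketch below encodes this article's guidance as code; it is an illustration of the run-both-and-route approach, not a feature of either product.

```python
# Route task categories to the stronger tool, per the two lists above.
# The mapping encodes this article's guidance only.

ROUTES = {
    "terminal-automation":    "Codex",
    "frontend-visual":        "Codex",
    "mac-gui-automation":     "Codex",
    "scheduled-background":   "Codex",
    "large-refactor":         "Claude Code",
    "complex-reasoning":      "Claude Code",
    "niche-mcp-integration":  "Claude Code",
}

def route(task: str) -> str:
    return ROUTES.get(task, "try both")

print(route("large-refactor"))      # Claude Code
print(route("mac-gui-automation"))  # Codex
print(route("docs-writing"))        # try both
```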


The Strategic Play: Workflow Ownership

The real story here is not Codex's specific features. It is OpenAI's business strategy.

For the first half of 2026, OpenAI was losing the developer tool war to Anthropic. Claude Code hit around 135,000 daily GitHub commits by April. Cursor and other IDE assistants defaulted to Claude Sonnet and Opus for quality. Codex weekly users were at 3 million but growth was stagnant compared to Anthropic's 6x enterprise expansion in the same window.

OpenAI's response was not to chase Claude on benchmarks. It was to redefine the category. If the question is "which AI writes better code," OpenAI loses. If the question is "which AI owns the developer's entire workflow," code quality is suddenly one dimension out of ten.

Computer Use, the Atlas browser, inline image generation, 111 plugins, and scheduled automations all move in the direction of *"you don't leave Codex for anything."* The GPT-6 roadmap (codenamed Spud, expected May 2026) makes memory the cornerstone, which will make sessions even stickier.

This is a real strategy and it may well work. It also creates an opening for focused tools like Claude Code to keep winning on the core value prop (write and reason about code very well) while OpenAI is busy rebuilding Microsoft Office inside a chat box.


What This Means for You

If you are already deep in Claude Code, don't switch based on headlines. The SWE-bench gap (80.8% vs ~49%) is real. Check whether the specific Codex features actually solve a problem you have. Computer Use is genuinely new and genuinely macOS-only. If you work on a Mac and want to automate GUI apps, it is worth a $20 Plus plan to evaluate.

If you are already on Codex, the new features are free on your existing plan. Image generation in particular is a meaningful productivity lift for frontend work. Watch the token burn. The $100 add-on exists for a reason.

If you are choosing between the two for the first time, use both for a week on your actual work. Benchmarks are directional. Your codebase, your deadlines, and your preferred workflow are what actually decide.


Which AI Should You Use?

If the last three sections left you uncertain, we built a 2-minute quiz at llmmatchmaker.com/quiz that matches you to the right AI based on what you actually do with it. Not based on which one has the most features. Based on your stack, your habits, and your deadlines.

The winner is the one you actually use tomorrow.

