xAI Launched Grok Build — A Terminal Coding Agent to Fight Claude Code and Codex

TL;DR

xAI launched Grok Build beta on June 5, 2026 — a terminal TUI coding agent running on Grok 4.3. Direct comparison with Claude Code and OpenAI Codex: features, pricing, when to use each.

The terminal coding agent market just got a third serious entrant. xAI launched Grok Build beta on June 5, 2026: a terminal-first coding agent running on Grok 4.3 with a TUI, a headless CI/CD mode, and a new protocol called Agent Client Protocol (ACP) that lets it communicate with other agents. It is a direct shot at Claude Code and OpenAI Codex, both of which have been on the market for 12+ months.

The launch was quiet by xAI standards — a tweet from Elon Musk at 11:43pm PT and a sparse documentation page. No launch event. The beta invite page filled in under three hours. Here is what the product actually does based on early access testing and the published documentation.

What Grok Build Is

Grok Build is a terminal application — similar to Claude Code in the sense that you install it, open it in a project directory, and interact with it via a command-line interface. The core loop: you describe a task, Grok Build reads your codebase, writes changes to disk, runs tests or builds, and iterates based on output.

The TUI (Terminal User Interface) is Grok Build’s most visually distinctive feature. Rather than pure text output, it renders a terminal UI with panels: a code diff view on the right, a task log in the bottom panel, and the conversation interface on the left. This is closer to how Cursor looks in the terminal than how Claude Code currently renders. Whether this is better depends entirely on personal preference — the extra rendering adds 40–80ms to display latency on slower terminals.

Installation is straightforward:

# Install Grok Build
npm install -g @xai/grok-build

# Authenticate with xAI API key
grok-build auth

# Start a session
cd your-project
grok-build

During beta, access requires an xAI API key with Grok 4.3 access. The waitlist prioritizes developers who have used Grok 4.3 via the API. Enterprise access for teams is expected in Q3 2026.

Grok 4.3 Under the Hood

Grok Build runs on Grok 4.3, released alongside the Build launch. Grok 4.3 benchmarks:

Benchmark	Grok 4.3	Claude Opus 4.8	GPT-4.1
SWE-bench Verified	79.4%	88.6%	75.4%
HumanEval	91.2%	92.7%	89.8%
MMLU	89.6%	91.3%	88.4%
Context window	256K	200K	128K
Real-time web search	Yes (native)	No	No

Grok 4.3’s 79.4% SWE-bench score puts it clearly above GPT-4.1 (75.4%) but below Claude Opus 4.8 (88.6%). The 9-point gap to Opus 4.8 is significant for complex autonomous coding tasks but less meaningful for straightforward code generation, where all three models are in a similar range.

The native real-time web search integration is Grok’s differentiator. Grok 4.3 can search the web mid-task without a separate MCP server or tool configuration. When Grok Build encounters an unfamiliar API or library, it searches documentation directly. In testing on a project using a less-documented internal SDK, this meant fewer hallucinated method signatures than Claude Code in the same scenario (Claude Code requires the docs to be in the context window or reachable via an MCP server).

The 256K context window is larger than both Claude Opus 4.8 (200K) and GPT-4.1 (128K). For very large codebases, this matters — Grok Build can hold more of a monorepo in active context without chunking.
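Whether a given repository actually fits in one of these windows can be estimated before trying it. The sketch below is a rough check, not a measurement: the 4-characters-per-token ratio is a common rule of thumb rather than any vendor's tokenizer, the window sizes are the table's figures read as thousands of tokens, and the file extensions and skipped directories are arbitrary choices.

```python
import os

# Assumption: ~4 characters per token. Real tokenizers differ by model and language.
CHARS_PER_TOKEN = 4

# Window sizes from the comparison table, read as thousands of tokens.
CONTEXT_WINDOWS = {
    "Grok 4.3": 256_000,
    "Claude Opus 4.8": 200_000,
    "GPT-4.1": 128_000,
}


def estimate_tokens(num_chars: int) -> int:
    """Convert a character count into an approximate token count."""
    return num_chars // CHARS_PER_TOKEN


def repo_chars(root: str, extensions=(".py", ".ts", ".go", ".md")) -> int:
    """Sum the characters in source files under root, skipping hidden dirs and node_modules."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into these directories.
        dirnames[:] = [d for d in dirnames if not d.startswith(".") and d != "node_modules"]
        for name in filenames:
            if name.endswith(extensions):
                try:
                    with open(os.path.join(dirpath, name), encoding="utf-8", errors="ignore") as f:
                        total += len(f.read())
                except OSError:
                    continue  # unreadable file: leave it out of the estimate
    return total


def fits(tokens: int) -> dict:
    """Report, per model, whether the estimated token count fits in its window."""
    return {model: tokens <= window for model, window in CONTEXT_WINDOWS.items()}


if __name__ == "__main__":
    tokens = estimate_tokens(repo_chars("."))
    print(tokens, fits(tokens))
```

The estimate ignores the space the agent needs for its own instructions, tool output, and replies, so a repository that only just fits on paper will not fit in practice.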

Headless CI/CD Mode

Grok Build’s headless mode is the feature most enterprise teams will find interesting. It runs Grok Build as a non-interactive process that accepts a task specification file and outputs results, designed for integration with CI/CD pipelines.

# Example task specification file (task.json)
{
  "task": "Review the changes in the PR, identify security vulnerabilities, and comment on the PR with findings",
  "context": {
    "repository": ".",
    "pr_number": "{{ PR_NUMBER }}",
    "github_token": "{{ GITHUB_TOKEN }}"
  },
  "output": {
    "format": "structured_json",
    "include_diff_annotations": true
  }
}

# Run headless
grok-build run --task task.json --no-tty --output results.json

# GitHub Actions integration
- name: Grok Build Code Review
  run: |
    grok-build run \
      --task .grok/review-task.json \
      --env PR_NUMBER=${{ github.event.pull_request.number }} \
      --env GITHUB_TOKEN=${{ secrets.GITHUB_TOKEN }} \
      --output /tmp/review-results.json
  env:
    XAI_API_KEY: ${{ secrets.XAI_API_KEY }}

Claude Code has CI/CD capabilities via claude -p, its non-interactive print mode, which also runs locally. OpenAI Codex Cloud runs entirely as a cloud-based headless agent. Grok Build’s headless mode is locally executed (unlike Codex Cloud) and driven by a task specification file rather than a prompt passed on the command line (unlike claude -p).

The practical difference from Codex Cloud: Grok Build headless runs on your infrastructure, so the repository checkout, the agent process, and your credentials stay in your environment; only the context the agent sends to the Grok 4.3 API leaves it. Codex Cloud processes code on OpenAI servers. For enterprises with strict data sovereignty requirements, that distinction matters.
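The task specification above uses {{ NAME }} placeholders that are filled from --env values at run time. How Grok Build performs that substitution internally is not documented here, so the following is only a minimal sketch of the idea. The function name render_task_spec and the rule that placeholder names are upper-case identifiers are assumptions made for illustration.

```python
import json
import re

# Matches {{ NAME }} with optional inner whitespace; NAME is assumed to be upper-case.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Z_][A-Z0-9_]*)\s*\}\}")


def render_task_spec(template_text: str, values: dict) -> dict:
    """Fill {{ NAME }} placeholders inside JSON string values, then parse the result."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name}")
        # json.dumps escapes quotes and backslashes; [1:-1] drops the outer quotes
        # because the placeholder already sits inside a JSON string literal.
        return json.dumps(str(values[name]))[1:-1]

    return json.loads(PLACEHOLDER.sub(substitute, template_text))


if __name__ == "__main__":
    template = '{"context": {"repository": ".", "pr_number": "{{ PR_NUMBER }}"}}'
    print(render_task_spec(template, {"PR_NUMBER": 1287}))
```

Failing loudly on a missing value is a deliberate choice in this sketch: a review task that silently runs with a literal "{{ PR_NUMBER }}" is harder to debug in CI than one that stops immediately.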

Agent Client Protocol (ACP)

ACP is xAI’s new inter-agent communication protocol, announced alongside Grok Build. The protocol defines how autonomous agents communicate task requests, share context, and report results. The spec is published at github.com/xai-dev/acp.

The short version: ACP is xAI’s answer to MCP. Where MCP connects models to tools (databases, APIs, file systems), ACP connects agents to other agents. An ACP-enabled Grok Build instance can spawn sub-agents, delegate tasks to specialized agents, and coordinate with other ACP-compatible tools.

xAI is positioning ACP as a competitor to both MCP and Google’s Agent2Agent (A2A) protocol. The honest assessment: ACP’s adoption is at best a small fraction of MCP’s (97 million+ downloads for MCP vs. no published adoption numbers for ACP). For most developers, ACP is a specification to watch, not something to build on yet. The MCP ecosystem’s lead is significant: it has real servers for every major database, cloud provider, and developer tool. ACP has a spec and a reference implementation.

ACP and MCP are not mutually exclusive. xAI published an ACP-to-MCP bridge adapter in the same release, allowing Grok Build to call existing MCP servers via the bridge. This is pragmatic — it would be impossible to gain traction without MCP compatibility given the ecosystem size.

Pricing Comparison: Grok Build vs. Claude Code vs. Codex

Product	Price	Model	Billing model
Grok Build Beta	Free (beta)	Grok 4.3	API tokens after beta
Grok Build (post-beta est.)	~$25/month or per-token	Grok 4.3	TBD
Claude Code Pro	$20/month	Claude Sonnet 4.6	Flat rate, interactive
Claude Code Max 5x	$100/month	Claude Opus 4.8	Flat rate, interactive
OpenAI Codex Cloud	$0.10/task + tokens	GPT-4.1	Per-task + token billing
GitHub Copilot Agent	$19-39/month + AI Credits	Various	Subscription + token overage

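Grok Build’s post-beta pricing has not been announced. The beta is free, which is the expected onboarding strategy. Grok 4.3 API pricing is $1.50/1M input tokens and $5.00/1M output tokens. At those rates, 10 agentic sessions per day at ~30K billed tokens each comes to about 9M tokens over a 30-day month, or $13.50–45 depending on the input/output split. Agentic sessions usually bill far more than their visible size because the growing context is re-sent on every turn, so a realistic figure for moderate use is roughly $65–85 per month on API tokens alone. A flat-rate plan competitive with Claude Code Max 5x ($100/month) would need to be priced around $75–90 to be compelling.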
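Any token-cost estimate like this one depends heavily on how many tokens a session really bills and on the split between input and output. The sketch below makes those knobs explicit using the API rates quoted above. The 30-day month and the output_share parameter are assumptions added for the example.

```python
# Grok 4.3 API rates as quoted in this article (USD per 1M tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 5.00


def monthly_cost(sessions_per_day: int,
                 tokens_per_session: int,
                 output_share: float,
                 days: int = 30) -> float:
    """Estimate monthly API spend in USD.

    output_share is the fraction of billed tokens that are output tokens
    (0.0 = all input, 1.0 = all output). It is an assumption, not a measured value.
    """
    total_tokens = sessions_per_day * tokens_per_session * days
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


if __name__ == "__main__":
    # Bounds for 10 sessions/day at 30K billed tokens each.
    print(monthly_cost(10, 30_000, output_share=0.0))  # 13.5
    print(monthly_cost(10, 30_000, output_share=1.0))  # 45.0
```

At exactly 30K billed tokens per session the bill lands between $13.50 and $45.00 a month. Sessions that re-send a growing context on every turn bill several times that, so tokens_per_session is the number worth measuring on your own workload.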

The catch: during beta, heavy users will consume substantial tokens at zero cost. The transition from free beta to paid will be the first real test of whether Grok Build has retention. This is the same challenge Cursor faced at its 1.0 launch.

Where Grok Build Wins

Three areas where Grok Build has a genuine edge in its current form:

Real-time web search without configuration. Grok 4.3’s native search means Grok Build can look up current API documentation, check Stack Overflow for recent bug reports, and find GitHub Issues for unfamiliar packages without any setup. Claude Code requires a search MCP server; Codex has no equivalent. For working with less-documented or rapidly changing APIs, this is a meaningful productivity advantage.

Larger context window for big codebases. 256K tokens handles larger repositories without the chunking strategies you need with Claude Code’s 200K or Codex’s 128K. On a monorepo test with 400+ files, Grok Build held more of the codebase in active context in a single load, and cross-file reasoning was noticeably better on tasks that required understanding relationships between distant parts of the codebase.

Headless local execution with structured output. For developers who want CI/CD integration but cannot use Codex Cloud (data sovereignty, network access restrictions), Grok Build headless runs locally and outputs structured JSON. Claude Code is weaker here: claude -p runs non-interactively and can wrap its reply in JSON via --output-format json, but getting the findings themselves into a pipeline-friendly shape (per-file diff annotations, for example) still requires custom prompt engineering.

Where Grok Build Falls Short

An equally honest assessment of the weaknesses:

SWE-bench gap vs. Claude Opus 4.8. 79.4% vs. 88.6% is a real gap on complex autonomous coding tasks. If you are running long multi-file refactors with test-driven iteration, Claude Code on Opus 4.8 is the better choice today. The 9-point benchmark difference reflects meaningful real-world performance differences on hard tasks.

No native MCP ecosystem. The ACP-to-MCP bridge works but adds latency and an abstraction layer. Developers with established MCP server configurations for their tools (databases, GitHub, Slack, cloud providers) get zero benefit from switching to Grok Build: they are running through a bridge rather than native protocol support. Claude Code’s MCP integration is direct and battle-tested.

Beta stability. Multiple early access users reported Grok Build crashing mid-session when the context window approaches capacity and the TUI renderer is processing large diffs simultaneously. This is expected beta behavior but worth knowing before migrating production workflows.

No CLAUDE.md equivalent (yet). Grok Build has no project-level instruction file that it reads as context at the start of every session. You can set preferences in a .grokbuild.json config file, but it is less expressive than CLAUDE.md, and Grok Build treats it as settings rather than reading it as context the way Claude Code reads CLAUDE.md. An equivalent is listed as a roadmap item for Q3 2026.

Which Tool for Which Job

The practical decision matrix for developers evaluating all three:

Use Claude Code if: you do most of your development in the terminal, you have an existing MCP server setup, you value instruction-following fidelity and the CLAUDE.md project configuration system, or you need the highest SWE-bench accuracy for complex autonomous tasks.

Use Grok Build if: you work with poorly documented APIs where real-time search matters, you are on a team with a large monorepo that strains 200K context limits, you need local headless CI/CD without handing execution to a cloud-hosted agent, or you want to try the current beta at zero cost.

Use Codex Cloud if: you want cloud-managed agent execution with no local infrastructure, you are primarily using GitHub workflows, or you want per-task billing rather than subscription pricing.

These are not mutually exclusive choices. Claude Code for active development and Grok Build headless for CI review is a reasonable hybrid. Most serious developers will end up with all three installed and route different task types to different tools.
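For teams that do route work across several tools, the decision matrix above can be written down as an explicit rule so the choice is consistent and reviewable. The sketch below is one possible encoding: the Workload fields, the order in which the rules are checked, and the Claude Code default are all assumptions made for the example, not recommendations from any vendor.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Traits of a task, mirroring the criteria in the decision matrix (hypothetical fields)."""
    wants_managed_cloud: bool = False
    code_must_stay_local: bool = False
    headless_ci: bool = False
    has_mcp_setup: bool = False
    complex_refactor: bool = False
    needs_live_docs_search: bool = False
    repo_tokens: int = 0


def suggest_tool(w: Workload) -> str:
    """Return a suggested tool. Rule order is an assumption and encodes priority."""
    if w.wants_managed_cloud and not w.code_must_stay_local:
        return "Codex Cloud"
    if w.headless_ci and w.code_must_stay_local:
        return "Grok Build"
    if w.complex_refactor or w.has_mcp_setup:
        return "Claude Code"
    if w.needs_live_docs_search or w.repo_tokens > 200_000:
        return "Grok Build"
    return "Claude Code"  # assumed default for interactive terminal work


if __name__ == "__main__":
    print(suggest_tool(Workload(headless_ci=True, code_must_stay_local=True)))
    print(suggest_tool(Workload(complex_refactor=True)))
```

Putting the rules in order forces the trade-offs into the open: in this version an existing MCP setup outranks the larger context window, and a team that weighs those differently only needs to swap two branches.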


People Also Ask

Is Grok Build free to use?

Yes, during beta (launched June 5, 2026). Beta access requires an xAI API key and an invitation from the waitlist. Post-beta pricing has not been announced. Based on Grok 4.3 token costs, expect either a flat subscription around $75–90/month or per-token billing that runs roughly $65–85/month for moderate use.

How does Grok Build’s context window compare to Claude Code?

Grok 4.3 has a 256K token context window versus 200K for both Claude Sonnet 4.6 and Claude Opus 4.8. For large codebases, Grok Build can hold more files in active context before needing to chunk or reload. For typical projects under 200K tokens, the difference is not meaningful in practice.

Does Grok Build support MCP servers?

Not natively. Grok Build’s native protocol is xAI’s Agent Client Protocol (ACP). An ACP-to-MCP bridge adapter was released alongside the beta, allowing Grok Build to call existing MCP servers through the bridge. The performance and reliability of the bridge in production use are not yet established.

What is Agent Client Protocol and how is it different from MCP?

MCP (Model Context Protocol) connects models to tools — databases, APIs, file systems, external services. ACP (Agent Client Protocol) connects agents to other agents — it is designed for multi-agent coordination where one agent delegates tasks to specialized sub-agents and coordinates results. They solve different problems and are not direct competitors in function, though both xAI and Anthropic are positioning their protocols as infrastructure standards for the broader AI tooling ecosystem.

Tags: Grok Build, xAI, Terminal Coding Agent, Claude Code, OpenAI Codex


Written by

WOWHOW

The WOWHOW team brings 14+ years of production engineering experience. Every tool and product in the catalog is personally built, tested, and curated.
