Grok Build

xAI

Autonomous Agents

Watch

45.0/100


xAI's CLI coding agent — 8 parallel subagents in isolated Git worktrees, Plan Mode (on by default), native MCP + Connectors (GitHub/Linear from the CLI), ACP custom orchestration, and headless mode (-p) for CI/CD scripting. Local-first design (no source code transmitted; air-gap capable). Runs the purpose-built grok-build-0.1 model (256K context, released May 20, 2026).

Target user: SuperGrok ($30/mo), X Premium+ ($40/mo), and SuperGrok Heavy ($299/mo) subscribers who want xAI-native coding agents and are willing to accept significant organizational risk.

Launched May 14, 2026 (Heavy-only beta); went GA to all SuperGrok/X Premium+ subscribers on May 25. The architecture is genuinely interesting and integration improved this cycle, but organizational stability and independent capability validation are still missing. This is a `Tracked` watch position, not a recommendation.

AI Autonomy

9/20

Integration

10/20

Contextual Understanding

10/20

Compliance

9/20

Viability

6/20

User Interface

10/20
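
The six tile scores above are consistent with the 45.0/100 headline if the composite is the unweighted sum of the dimensions rescaled to 100. The page does not state the formula, so the sketch below is an assumption that happens to reproduce the published number, not the evaluator's documented method:

```python
# Dimension scores copied from the scorecard tiles (each out of 20).
SCORES = {
    "AI Autonomy": 9,
    "Integration": 10,
    "Contextual Understanding": 10,
    "Compliance": 9,
    "Viability": 6,
    "User Interface": 10,
}
MAX_PER_DIMENSION = 20


def composite(scores, max_per_dimension=MAX_PER_DIMENSION):
    """Assumed formula: unweighted sum of all dimensions, rescaled to 0-100."""
    total = sum(scores.values())                # 54 for the tiles above
    ceiling = max_per_dimension * len(scores)   # 6 dimensions x 20 = 120
    return round(100 * total / ceiling, 1)


print(composite(SCORES))  # 45.0
```

Under this reading, each additional point on any single dimension moves the headline score by about 0.83.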

Adoption & Proof Points

**Launched May 14, 2026** (SuperGrok Heavy beta, $299/mo); **GA May 25** to all SuperGrok ($30/mo) and X Premium+ ($40/mo); $99/mo SuperGrok Heavy promo for first 6 months
**Dedicated backing model:** grok-build-0.1 (256K context, purpose-built for agentic coding) released May 20; prior grok-code-fast-1 deprecated May 15
**Documented capability surface:** docs.x.ai/build covers headless mode (-p), custom models, native MCP out of the box, AGENTS.md, and Connectors (GitHub/Linear usable from the CLI)
**Active development:** daily release notes, v0.1.218 stability fixes, ~5 major launches in two weeks; ~4M views on the official how-to within hours
**Benchmark gap:** grok-build-0.1 has no independent benchmarks (BenchLM 'coming soon'); the 70.8% SWE-bench figure is vendor-internal and comes from the deprecated grok-code-fast-1 (rivals score roughly 17 points higher)
**No named enterprise customers** for Grok Build specifically

Risks & Limitations

**Severe organizational risk:** all 11 original co-founders departed; 50+ staff exits since SpaceX acquisition (to Meta/Thinking Machines); pre-training lead Juntang Zhuang departed and the pre-training group 'shrunk to a handful'; xAI dissolved into SpaceXAI (finalized May 6–7, 2026); SpaceX absorbed a $4.94B loss on the merger
**Capability validation gap:** grok-build-0.1 has zero independent benchmark validation; the only public 70.8% SWE-bench figure is vendor-internal and belongs to the now-deprecated grok-code-fast-1
**`acquisition-uncertainty` cap:** compliance capped at 12; post-SpaceX-IPO product roadmap for Grok Build unclear
**`unvalidated-benchmarks` cap:** autonomy capped at 14; rests on absence of any independent validation of the current model
**Enterprise governance not on the CLI:** SSO/RBAC/audit/Vault are on the separate Grok Business/Enterprise product, not the Grok Build CLI
**Viability 6 (High Risk):** there are forward-dated positives (the SpaceX IPO, ~June 12, 2026, provides capital, and the V9-Medium coding model is due mid-June), but the coding-team talent drain keeps the floor at 6
**Platform and interface gaps:** Arena Mode not live in beta; macOS/Linux only at launch (Windows not yet supported); no first-party VS Code/IDE integration confirmed

Capabilities & Integration

**Autonomy (12, Agentic floor):** 8 parallel subagents (plan/search/build) in isolated Git worktrees, Plan Mode on by default, ACP-orchestrated workflows, headless operation, MCP tool use. Architecture is documented and real — but cap holds at 12 pending any independent benchmark validation of grok-build-0.1.
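
To make the worktree isolation concrete, here is a minimal sketch of how one checkout per subagent could be prepared with plain `git worktree add`. It illustrates the general technique only. How Grok Build actually provisions its worktrees is not documented here, and the function name, branch prefix, and directory layout are hypothetical:

```python
from pathlib import Path


def worktree_commands(repo, base_ref="HEAD", count=8, prefix="agent"):
    """Build one `git worktree add` command per subagent.

    Each worktree gets its own branch and its own sibling directory, so
    parallel agents never write into the same checkout. The naming scheme
    is an illustrative assumption, not Grok Build's actual layout.
    """
    repo = Path(repo)
    commands = []
    for i in range(1, count + 1):
        branch = f"{prefix}-{i}"
        path = repo.parent / f"{repo.name}-{branch}"
        # git worktree add -b <new-branch> <path> <commit-ish>
        commands.append(
            ["git", "-C", str(repo), "worktree", "add",
             "-b", branch, str(path), base_ref]
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for argv in worktree_commands("/tmp/demo"):
        print(" ".join(argv))
```

Each command list can be passed to `subprocess.run` as-is. Because every worktree has its own branch, results can be reviewed and merged one by one.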

**Integration (12, Team-Aware floor):** native MCP out-of-the-box + AGENTS.md; CLI-consumable Connectors — GitHub (repos/issues/PRs, code search, PR summarize/review) and Linear (issues-as-context, auto-scan repo, propose changes in Plan Mode); ACP for custom orchestration; headless (-p) CI/CD scriptability; isolated-worktree execution. Gap to higher Team-Aware: no confirmed PR creation/merge write-back, no Slack/Teams.
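
For the headless CI/CD path, a pipeline step would typically wrap the CLI call and fail the job on a non-zero exit code. The sketch below assumes that `-p` is followed by the prompt text and that the exit code signals success. Neither detail is confirmed by this page, and the binary name is left as a required parameter because the executable's name is not given here:

```python
import subprocess


def build_headless_argv(binary, prompt):
    """Assemble the argument list for a non-interactive run.

    Assumption: the prompt is passed as the value of `-p`. Check the
    vendor documentation before relying on this shape.
    """
    return [binary, "-p", prompt]


def run_headless(binary, prompt, runner=subprocess.run):
    """Run the agent once and return (exit_code, stdout).

    `runner` is injectable so the wrapper can be exercised in tests
    without the real CLI installed.
    """
    completed = runner(
        build_headless_argv(binary, prompt),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.returncode, completed.stdout
```

In a CI job, the caller would pass the real executable name and treat a non-zero first element of the returned tuple as a failed step.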

**Context (12, Repository-Deep floor):** grok-build-0.1 with 256K-token context (corrected from the prior eval's 2M figure); local-first file-system access; CLI codebase traversal; worktree isolation. No cross-repo or persistent-memory awareness.

**Interface (11, upper Dual-Mode):** CLI primary; scriptable headless mode (-p); browser sign-in for auth. Arena Mode is not live in beta; no first-party VS Code/IDE integration is confirmed (community VS Code extensions only); there is no programmatic API surface and no mobile or standalone end-user web app. Not yet Multi-Platform.