Cosine Genie

Cosine

Autonomous Agents
Emerging
59.0/100

Cosine Genie is a fully autonomous AI software engineer from YC-backed, London-based Cosine (founded 2022; CEO Alistair Pullen, COO Yang Li). It is differentiated by a purpose-built platform ('no VS Code inside') and, increasingly, by owning its model layer: the Genie agent (GPT-4o-derived, GPT-5 integration since Aug 2025) now sits alongside the in-house Lumen model family, post-trained for legacy/niche languages (COBOL, Fortran, ABAP, Verilog, Rust, SQL). The strongest differentiator is sovereign deployment (fully air-gapped/on-prem/VPC, zero-egress), which anchored Cosine's selection as a first-cohort partner in the UK's £500M Sovereign AI programme (April 2026) with 500K GPU hours on Isambard-AI; Cosine claims engagement across UK defence primes and CNI/nuclear-deterrent programmes. Multi-agent parallel execution with async background operation positions it as a Devin competitor for regulated autonomous coding. The key risk is scale: ~$6-8M raised and 32 staff against a now-$26B Cognition/Devin that is winning the same defence/regulated logos.

AI Autonomy: 14/20
Integration: 13/20
Contextual Understanding: 11/20
Compliance: 12/20
Viability: 10/20
User Interface: 11/20

Adoption & Proof Points

  • SWE-Lancer: Genie 2.1 solves just over half the published benchmark (~$107K of $250K human-work value, up from ~$88K in Genie 2.0).
  • SWE-Bench: 30.1% Full (2,294 pairs) / 43.8% Verified, with outputs publicly released on GitHub for independent verification.
  • Strongest institutional proof point is the UK Sovereign AI cohort selection (April 16, 2026), with 500K GPU hours on Isambard-AI and a Sovereign Fund option on a future round. Engagement is claimed across UK defence primes and CNI/nuclear-deterrent operators (no specific customer names published).
  • Pricing is now fully public: Free plan (80 tasks, no credit card, no user cap, up to 100 projects), Hobby $20/seat, Professional $200/seat, Enterprise custom.
  • Company scale: ~32 employees (Jan 2026); ~$6-8M raised across 16 investors (Lakestar, SOMA, Gaingels, Focal, others). No named non-government enterprise customer or case study.
  • Community: Reddit/HN volume is thin (niche tool); practitioner reviews are generally positive on large-codebase productivity, with a noted learning curve and requests for broader IDE support.
  • For comparison, direct competitor Cognition/Devin now reports $492M ARR with named US Army/Navy, Goldman, Citi, Mercedes, and NASA customers.
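
The SWE-Lancer dollar figures quoted above can be expressed as shares of total benchmark value with simple arithmetic. The sketch below takes the approximate figures at face value (all are marked "~" in the source); the variable and function names are illustrative only. A share of dollar value is a different measure from a share of tasks solved, so the two need not agree.

```python
# Figures as quoted in the profile; the earned values are approximate ("~").
TOTAL_VALUE_USD = 250_000
earned_usd = {"Genie 2.0": 88_000, "Genie 2.1": 107_000}


def value_share(earned: int, total: int) -> float:
    """Return earned value as a percentage of total benchmark value."""
    return 100 * earned / total


# Percentage of benchmark value earned by each version, to one decimal place.
shares = {
    name: round(value_share(value, TOTAL_VALUE_USD), 1)
    for name, value in earned_usd.items()
}
# Absolute dollar gain between the two versions.
gain_usd = earned_usd["Genie 2.1"] - earned_usd["Genie 2.0"]

for name, pct in shares.items():
    print(f"{name}: {pct}% of benchmark value")
print(f"Gain from 2.0 to 2.1: ${gain_usd:,}")
```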

Risks & Limitations

  • Primary risk is viability against a widening competitive asymmetry: ~$6-8M total funding and 32 staff against Cognition/Devin, which closed $1B at a $26B valuation (May 27, 2026) on $492M ARR and won US Army/Navy, Goldman Sachs, Citi, Mercedes and NASA, directly contesting Cosine's regulated/defence niche. Codex (OpenAI) and Claude Code (Anthropic; 87.6% independent SWE-Bench Verified vs Cosine ~43.8%) add to that pressure.
  • No priced Series A; the UK Sovereign Fund holds only an option on a future round.
  • No named non-government enterprise customer or case study.
  • Lumen Outpost/Niche-Bench leadership claims remain vendor-only (no independent validation found).
  • Context is per-repo only: no cross-repo / monorepo-spanning awareness or persistent cross-session memory.
  • The VS Code extension depends on the CLI bridge; the CLI has scripting but no confirmed headless CI/CD mode; no JetBrains, no public API, no exposed MCP server.
  • SOC2/ISO is self-described rather than independently certified, raising governance questions for regulated buyers.
  • Pricing is now public (Free + Hobby + Pro), removing the prior TCO-opacity concern except for contact-sales Enterprise.

Capabilities & Integration

Genie executes full autonomous software engineering workflows: code retrieval from large codebases, solution planning, multi-file implementation, iterative test execution, and PR creation with evidence. Multi-agent orchestration delegates testing, implementation, documentation, and integration work across parallel agents; native code execution within the platform eliminates external CI dependency. Agent tools include file/keyword search, directory navigation, URL reading, web search, and code execution. The Lumen own-model family (8-step pipeline turning production code into verifiable training trajectories, behavioral RL on scope discipline/honesty/evidence) adds an in-house model layer post-trained for legacy/niche languages alongside Genie + GPT-5. Three interface modes: web platform, VS Code extension (CLI bridge, auto-detect, streamed diffs), and CLI (scripting/chaining, local-to-remote execution, updated May 2026). Integrations: GitHub native (full collab/PR), GitLab/Bitbucket/Git-URL import (SSH/PAT), Jira/Linear/Slack, Vercel, CI workflow monitoring. Per-repo indexing (each repo is a separate project) -- no cross-repo/monorepo spanning. No JetBrains, no confirmed headless CI/CD mode, no public API, no exposed MCP server.
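
The multi-agent delegation described above can be illustrated with a minimal concurrency sketch. This is not Cosine's implementation or API: `run_subagent`, `orchestrate`, and `SubtaskResult` are hypothetical names, and the sub-agent body is a placeholder where a real agent would search the repository, edit files, and execute code.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SubtaskResult:
    role: str
    summary: str


async def run_subagent(role: str, task: str) -> SubtaskResult:
    # Hypothetical stand-in for a real agent call; it only yields
    # control once and then reports completion.
    await asyncio.sleep(0)
    return SubtaskResult(role=role, summary=f"{role} finished for: {task}")


async def orchestrate(task: str) -> list[SubtaskResult]:
    # The four roles mirror the work types named in the profile.
    roles = ["implementation", "testing", "documentation", "integration"]
    # gather() runs the coroutines concurrently and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(run_subagent(r, task) for r in roles))


results = asyncio.run(orchestrate("fix login timeout"))
for result in results:
    print(result.role, "->", result.summary)
```

In this sketch the orchestrator simply fans out one task to every role and collects the results; a real system would also need dependency ordering between roles (for example, tests after implementation) and conflict handling for concurrent edits.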

Cosine Genie | Agentic Developer Tools Radar · Signal