Maybe AI Agents Can Be Lawyers After All

Published 2026-02-06 | Agentic AI | High

Summary

TechCrunch reported on February 6, 2026 that Claude Opus 4.6 significantly advanced performance on Mercor's APEX-Agents Leaderboard, a benchmark measuring AI agents' capabilities on professional tasks including legal work and corporate analysis. Previous frontier models had all scored below 25% on the benchmark, leading analysts to conclude that professional legal work remained out of reach for AI agents. Opus 4.6 scored just under 30% in one-shot trials, and reached an average of 45% when given

Alignment: New signal not yet covered

anthropic, claude-opus-4-6, mercor, apex-agents, benchmark, legal-ai, agentic-ai, professional-services, enterprise-ai