DeepSeek Releases DeepSeek-OCR 2 with Visual Causal Flow Architecture
Published 2026-01-27 | Foundation Models | Low
Summary
DeepSeek released DeepSeek-OCR 2, a 3B-parameter model for document understanding and image-to-text extraction. The model introduces DeepEncoder V2, a new vision encoder architecture built on "Visual Causal Flow": instead of scanning in the traditional fixed left-to-right, top-to-bottom order, it dynamically rearranges image segments based on semantic meaning, mimicking how humans naturally read complex documents.
Performance: The model scored 91.09% on OmniDocBench v1.5 document understand
Alignment: Neutral
deepseek, ocr, document-understanding, vision, visual-causal-flow, open-source, china, multimodal, document-processing, efficiency