Large Genome Model: Open-Source AI Trained on Trillions of DNA Bases
Published 2026-03-04 | Ingested 2026-03-07 | Foundation Models | Low
Summary
Ars Technica reports on a new open-source AI model trained on trillions of DNA base pairs drawn from large genomic datasets. Building on earlier work such as the Evo system, which was trained on bacterial genomes, this large genome model can identify genes, regulatory sequences, splice sites, and other biologically significant features directly from raw sequence data. The model represents a significant scaling effort in applying transformer-based architectures to biological sequences rather than natural language.
Alignment: Neutral
genomics, foundation-models, open-source, biology, dna, large-language-models, domain-specific-ai, life-sciences, pretraining