
arXiv:2605.31268v1 (announce type: new)
Abstract: We present Mellum 2, an open-weight 12B-parameter Mixture-of-Experts (MoE) language model with 2.5B active parameters per token. Mellum 2 is a general-purpose language model specialized in software engineering, spanning code generation and editing, debugging, multi-step reasoning, tool use and function calling, agentic coding, and conversational programming assistance, and it is the successor to the completion-focused 4B dense Mellum model. The architecture builds on the Mixture-of-Experts (64 experts, 8 active) and combines Grouped-Query Attenti
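To make the "64 experts, 8 active" figure concrete, here is a minimal sketch of top-k expert routing in plain Python. Only the expert counts and the 12B / 2.5B parameter figures come from the abstract. The routing scheme (softmax over the selected logits) is a common MoE design assumed here for illustration, not a confirmed detail of Mellum 2, and the names `route_token` and `active_fraction` are hypothetical.

```python
import math
import random

NUM_EXPERTS = 64  # total experts, from the abstract
TOP_K = 8         # experts active per token, from the abstract


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs. Weights are a softmax over the
    selected logits only, so they sum to 1. This is an assumed, generic
    routing scheme, not Mellum 2's documented router.
    """
    if top_k > len(router_logits):
        raise ValueError("top_k cannot exceed the number of experts")
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - max_logit) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def active_fraction(active_params, total_params):
    """Share of parameters used per token."""
    return active_params / total_params


# Random router scores stand in for a learned router's output.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_token(logits)
expert_share = TOP_K / NUM_EXPERTS          # 8 of 64 experts
param_share = active_fraction(2.5e9, 12e9)  # 2.5B of 12B parameters

print(f"experts used per token: {len(selected)} of {NUM_EXPERTS}")
print(f"expert share: {expert_share:.3f}, parameter share: {param_share:.3f}")
```

The parameter share (about 21%) is larger than the expert share (12.5%). A plausible reason is that components outside the expert layers, such as attention and embeddings, run for every token, though the truncated abstract does not give that breakdown.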
Mellum 2 arrives amid rapid progress in open-weight language models, particularly those specialized for software engineering, where demand for AI coding assistance continues to grow.
A more capable open-weight Mixture-of-Experts model specialized in software engineering could accelerate AI adoption in development workflows: because only 2.5B of its 12B parameters are active per token, per-token compute is lower than for a dense model of the same total size.
With Mellum 2 released as open weights, its code generation, debugging, and agentic coding capabilities become broadly accessible, potentially lowering the barrier for smaller entities to deploy advanced AI development tools.
- Software developers
- Open-source AI community
- Tech startups
- Cloud providers
- Proprietary code AI models with inferior performance
- Manual software engineering tasks that are easily automated
Increased productivity within software development teams due to enhanced AI assistance across the coding lifecycle.
A commoditization of certain software engineering tasks, shifting focus towards higher-level design and complex problem-solving.
The acceleration of AI agent development, as specialized models like Mellum 2 provide robust foundations for autonomous coding and deployment.
Read at arXiv cs.CL