Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

TL;DR

IBM has launched two new multilingual embedding models, Granite Embedding Multilingual R2, under Apache 2.0. These models support over 200 languages, handle longer contexts, and outperform previous open models in retrieval benchmarks. They aim to improve multilingual and code retrieval tasks for enterprise and developer use.

IBM has announced the release of two open-source multilingual embedding models, Granite Embedding Multilingual R2, under the Apache 2.0 license, designed to support over 200 languages with enhanced retrieval capabilities.

The release consists of a compact 97-million-parameter model and a full-size 311-million-parameter model, both built on the ModernBERT architecture. They support more than 200 languages, 52 of which were explicitly targeted in training, and can handle context lengths of up to 32,768 tokens, a 64-fold increase over previous versions.
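To put the context-length figure in perspective, a 64-fold increase to 32,768 tokens implies a previous limit of 512 tokens (32,768 / 64). The sketch below is illustrative arithmetic only: the 20,000-token document is a made-up example, and `chunks_needed` is a hypothetical helper that assumes non-overlapping chunks.

```python
import math

NEW_CONTEXT = 32_768             # context length stated in the article
OLD_CONTEXT = NEW_CONTEXT // 64  # implied by the "64-fold" figure: 512


def chunks_needed(doc_tokens: int, context: int) -> int:
    """Count the non-overlapping chunks needed to cover a document."""
    if doc_tokens < 0 or context <= 0:
        raise ValueError("doc_tokens must be >= 0 and context must be > 0")
    return math.ceil(doc_tokens / context)


doc_tokens = 20_000  # hypothetical document length, for illustration only
print(chunks_needed(doc_tokens, OLD_CONTEXT))  # 40
print(chunks_needed(doc_tokens, NEW_CONTEXT))  # 1
```

Under these assumptions, a document that previously had to be split into 40 pieces fits in a single embedding call.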

Both models are optimized for enterprise deployment and work with frameworks such as sentence-transformers, LangChain, and LlamaIndex with minimal integration effort. ONNX and OpenVINO weights are included for CPU inference, which makes the models suitable for large-scale, real-time applications.
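As a rough illustration of what embedding-based retrieval looks like in practice, here is a minimal sketch that ranks documents by cosine similarity to a query. It is not IBM's code: `rank` and `toy_encode` are hypothetical names, and `toy_encode` is a word-count stand-in so the example runs without downloading a model. With sentence-transformers installed, the `encode` method of a loaded `SentenceTransformer` could be passed in its place; the article does not give the model identifiers, so none are shown here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(
    query: str,
    docs: List[str],
    encode: Callable[[List[str]], Sequence[Vector]],
) -> List[Tuple[str, float]]:
    """Embed the query and documents together, then sort documents by similarity."""
    vectors = encode([query] + docs)
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(doc, cosine(query_vec, vec)) for doc, vec in zip(docs, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_encode(texts: List[str]) -> List[List[float]]:
    """Stand-in encoder (assumption): word-count vectors over the batch vocabulary."""
    vocab = sorted({word for text in texts for word in text.lower().split()})
    return [[float(text.lower().split().count(word)) for word in vocab] for text in texts]


docs = ["how to reset your password", "quarterly revenue report"]
ranked = rank("reset password", docs, toy_encode)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

A real embedding model would replace the word-count vectors with dense vectors that capture meaning across languages; the ranking step stays the same.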

Why It Matters

This release marks a significant advance in open multilingual embedding models: it offers high retrieval quality across many languages at a small model size, which benefits cross-lingual search, retrieval-augmented generation, and code retrieval for international teams. By narrowing the trade-off between retrieval quality and efficiency, it could enable broader adoption in enterprise AI solutions.

Background

Earlier models such as XLM-RoBERTa were limited by smaller context windows and weaker coverage of many languages. The R2 models are a ground-up rebuild that draws on recent transformer innovations and more carefully curated training data, including programming code, to improve multilingual retrieval performance. The release follows an ongoing trend toward more capable open models for enterprise AI applications.

“The Granite Embedding Multilingual R2 models set new benchmarks for open multilingual retrieval, supporting over 200 languages with enterprise-ready performance.”

— IBM AI Research

“Our models are designed to be easily integrated into existing frameworks, requiring no task-specific instructions and supporting long document contexts.”

— IBM Data Science Team

What Remains Unclear

It is not yet clear how these models perform in real-world enterprise deployments across diverse use cases or how they compare to proprietary models in production environments. Further benchmarks and user feedback are pending.

What’s Next

Next steps include broader adoption by developers and enterprises, integration into AI pipelines, and ongoing benchmarking. IBM may also release updated versions or extensions supporting more languages and tasks.

Key Questions

What are the main differences between the R1 and R2 models?

The R2 models are built on ModernBERT and combine a larger context window (up to 32,768 tokens) with an updated architecture and better training data, which results in higher retrieval scores and support for longer texts.

Can these models be used for code retrieval?

Yes, both models support cross-lingual code retrieval across nine programming languages, making them suitable for software development and related tasks.

Are these models suitable for enterprise deployment?

Yes, they are designed for enterprise use, with compatibility for popular frameworks, CPU-optimized inference options, and compliance with governance standards.

How many languages do these models support?

They support over 200 languages, with explicit training and retrieval optimization for 52 languages.

What is the licensing for these models?

Both models are released under the Apache 2.0 license, allowing open use and modification.
