Introducing TurboAgents: Supercharge Your Agents with TurboQuant

The next bottleneck in agent systems is not just model quality. It is also retrieval cost, memory pressure, and the context infrastructure that starts to hurt once your system grows beyond toy demos. That is why we built TurboAgents.

TurboAgents is a Python package for compressed retrieval and KV-style optimization for agent and RAG systems. It is designed to work with the tools you already use instead of forcing you into a new framework, a new database, or a new architecture. If you already have a vector store, a retrieval layer, or an agent runtime, TurboAgents is meant to sit in that stack and make it more efficient.

TurboAgents is now public. The docs are live, the source is on GitHub, and the package is on PyPI.

Why TurboAgents Exists

Teams building RAG systems and agent workflows usually hit the same pattern:

Retrieval quality starts to matter more as the corpus grows
Memory cost starts to rise faster than expected
Latency gets worse once reranking and broader search are added
Replacing the entire stack is rarely realistic

Most people do not want a new framework. They want something that helps the current one. TurboAgents was built for that exact gap. The goal is to make compressed retrieval practical for real systems while keeping the integration surface small and predictable.

What TurboAgents Is

TurboAgents is a standalone Python package for compressed retrieval, vector reranking, KV-style optimization paths, local benchmarking and evaluation, and adapter-based integration with common vector backends.

Compressed Retrieval

Apply compressed scoring and reranking between your embeddings and the final retrieved results, so you can improve retrieval without replacing your stack.
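To make "compressed scoring" concrete, here is a minimal sketch in plain Python. It is not the TurboAgents implementation: this post does not describe the package's quantization scheme, so the uniform scalar quantizer below is a stand-in, and the names `quantize` and `approx_dot` are hypothetical. It only shows the general idea of scoring a query against a low-bit version of a stored vector.

```python
import random

def quantize(vec, bits=8):
    """Uniformly quantize a float vector to signed integer codes plus one scale.

    Illustrative stand-in only; not the TurboAgents quantizer.
    """
    levels = 2 ** (bits - 1) - 1              # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(x) for x in vec) or 1.0  # avoid a zero scale for all-zero vectors
    scale = max_abs / levels
    codes = [round(x / scale) for x in vec]
    return codes, scale

def approx_dot(query, codes, scale):
    """Score a float query against integer codes, applying the scale once at the end."""
    return scale * sum(q * c for q, c in zip(query, codes))

def exact_dot(a, b):
    """Full-precision dot product, used as the reference score."""
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
dim = 64
doc = [random.gauss(0, 1) for _ in range(dim)]
query = [random.gauss(0, 1) for _ in range(dim)]

codes8, scale8 = quantize(doc, bits=8)
codes4, scale4 = quantize(doc, bits=4)

reference = exact_dot(query, doc)
err8 = abs(reference - approx_dot(query, codes8, scale8))
err4 = abs(reference - approx_dot(query, codes4, scale4))
print(f"8-bit error: {err8:.4f}, 4-bit error: {err4:.4f}")
```

Fewer bits mean a smaller stored vector and a coarser score. That is the tradeoff a compressed retrieval layer has to manage.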

Vector Reranking

Add a reranking layer on top of retrieved candidates from any supported vector backend for better precision.

KV-Style Optimization

Reduce memory pressure and retrieval cost with optimization paths designed for real-world agent and RAG workloads.
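A quick back-of-the-envelope calculation shows why bit width matters for memory pressure. The corpus size and embedding dimension below are illustrative assumptions, not TurboAgents benchmark figures, and `payload_bytes` is a hypothetical helper that counts only the raw vector payload, ignoring index structures and metadata.

```python
def payload_bytes(num_vectors, dim, bits_per_value):
    """Bytes needed for the raw vector payload at a given bit width (rounded up)."""
    total_bits = num_vectors * dim * bits_per_value
    return (total_bits + 7) // 8

# Assumed workload: one million 768-dimensional embeddings.
num_vectors, dim = 1_000_000, 768

for bits in (32, 8, 4):
    gib = payload_bytes(num_vectors, dim, bits) / 2**30
    print(f"{bits:>2}-bit values: {gib:.2f} GiB")
```

Under these assumptions the payload shrinks from roughly 2.86 GiB at 32 bits to roughly 0.36 GiB at 4 bits, an 8x reduction before any quality tradeoff is considered.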

Local Benchmarking

Evaluate retrieval quality and latency tradeoffs with built-in benchmark harnesses and adapter coverage.
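For a sense of what a quality and latency benchmark measures, here is a small self-contained sketch. It does not use the TurboAgents benchmark harness, whose interface is not shown in this post. The names `recall_at_k`, `benchmark`, and `toy_search` are hypothetical, and the toy search function stands in for whatever retrieval path you want to evaluate.

```python
import time

def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    top = set(retrieved[:k])
    return len(top & set(relevant)) / len(relevant)

def benchmark(search_fn, queries, ground_truth, k=5):
    """Run search_fn over all queries; return mean recall@k and mean latency in ms."""
    recalls, latencies = [], []
    for query, relevant in zip(queries, ground_truth):
        start = time.perf_counter()
        retrieved = search_fn(query, k)
        latencies.append((time.perf_counter() - start) * 1000)
        recalls.append(recall_at_k(retrieved, relevant, k))
    n = max(len(recalls), 1)  # guard against an empty query set
    return {"recall_at_k": sum(recalls) / n, "mean_latency_ms": sum(latencies) / n}

def toy_search(query, k):
    """Stand-in retrieval path: returns the same ranked ids for every query."""
    return ["a", "b", "c", "d", "e"][:k]

report = benchmark(toy_search, ["q1", "q2"], [["a", "c"], ["a", "z"]], k=3)
print(report)
```

Running the same function against two retrieval paths, for example a full-precision one and a compressed one, gives a like-for-like view of the quality and latency tradeoff.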

TurboAgents is framework-agnostic infrastructure, and that matters. It is not a replacement for your orchestration layer, and it is not a new agent framework. It is a performance and retrieval layer that can plug into the system you already have.

Who TurboAgents Is For

Agent Framework Builders

Add a retrieval and compression layer under your existing abstractions. Keep your framework surface, gain a more efficient retrieval path.

RAG Application Teams

Get better memory efficiency and compressed reranking without rewriting your entire system. Drop it into the stack you already have.

Researchers & Infra Engineers

A concrete package and benchmark harness for experimenting with retrieval quality, long-context behavior, and vector compression tradeoffs.

Local-First AI Builders

Designed around constrained hardware and practical local setups. Keep retrieval costs manageable without sacrificing quality.

Supported Vector Backends

TurboAgents currently supports validated retrieval paths across several vector backends. Keep your current storage choice and still use TurboAgents as a compressed retrieval and reranking layer.

Chroma

FAISS

LanceDB

pgvector

SurrealDB

How TurboAgents Works

At a high level, TurboAgents sits between your embeddings and your final retrieved results. A simple mental model:

Your vector database returns candidate matches

TurboAgents applies compressed scoring and reranking

Your agent or RAG pipeline receives the final results

That sounds simple because it should be simple. The point is not to replace the entire retrieval system. The point is to improve the part that becomes expensive as your system scales.
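The three steps above can be sketched in a few lines of plain Python. This is a shape-of-the-pipeline illustration under stated assumptions, not the TurboAgents API: `rerank`, `dot`, and the hard-coded candidates are hypothetical, and the plain dot product marks the spot where a compressed scorer would plug in.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Candidate = tuple[str, Vector]  # (document id, embedding)

def dot(a: Vector, b: Vector) -> float:
    """Placeholder scorer; a compressed scorer would take its place."""
    return sum(x * y for x, y in zip(a, b))

def rerank(
    query: Vector,
    candidates: Sequence[Candidate],
    scorer: Callable[[Vector, Vector], float],
    top_k: int,
) -> list[tuple[str, float]]:
    """Score every candidate against the query and keep the top_k best."""
    scored = [(doc_id, scorer(query, vec)) for doc_id, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Step 1: candidates as they might come back from a vector database.
candidates = [
    ("doc-a", [0.9, 0.1]),
    ("doc-b", [0.2, 0.8]),
    ("doc-c", [0.6, 0.6]),
]
query = [1.0, 0.0]

# Step 2: the reranking layer rescores the candidates.
final = rerank(query, candidates, scorer=dot, top_k=2)

# Step 3: the agent or RAG pipeline receives the final ids.
print([doc_id for doc_id, _ in final])
```

Keeping the scorer as a parameter mirrors the design goal described in this post: the vector store and the calling pipeline stay as they are, and only the scoring step changes.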

In practice, TurboAgents can be used as a backend adapter for a supported vector store, as a reranking layer on top of retrieved candidates, as part of an end-to-end framework integration, or as a benchmarkable retrieval surface for evaluating quality and latency tradeoffs.

What Makes TurboAgents Different

A lot of infrastructure projects try to win by asking you to adopt a new stack. TurboAgents takes the opposite approach.

The design principle: keep your framework, keep your vector database, add TurboAgents where retrieval cost and memory pressure start to hurt. That makes adoption much more realistic. It also means TurboAgents can be used incrementally. You do not need a giant migration plan to try it.

Benchmarks and Validation

We did not want TurboAgents to be just a packaging exercise. It needed real validation. The current benchmark coverage includes adapter benchmarks, MLX benchmark runs, pgvector validation, Chroma benchmark coverage, a minimal long-context Needle-style evaluation path, and checked-in benchmark harnesses and summaries.

Current Results

Chroma and FAISS both performed strongly on the validated adapter sweep
pgvector showed a credible higher-bit path
LanceDB, SurrealDB, and other adapters now have real integration and validation coverage
The long-context path is documented with its current limits stated, rather than overclaimed

That last point matters. We want TurboAgents to be credible, not inflated.

Getting Started

TurboAgents is built around a simple install path. Check the getting started guide for the full walkthrough.

Core Install

uv add turboagents

Retrieval Extras

uv add "turboagents[rag]"

MLX Extras

uv add "turboagents[mlx]"
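After installing, one way to confirm that the package landed in your environment is to query its distribution metadata from the standard library. This sketch assumes only the distribution name `turboagents` used in the install commands above. It does not import the package, because the import path is not stated in this post, and `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

found = installed_version("turboagents")
print(f"turboagents: {found or 'not installed'}")
```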

Watch the Demo

See TurboAgents in action: installation, CLI benchmarks, adapter integration, and a compressed retrieval walkthrough. You can follow along with the turboagents-demo repo.

TurboAgents and SuperOptiX

TurboAgents is a standalone library, but SuperOptiX is the first full reference integration. That is an important distinction. TurboAgents is not tied to SuperOptiX, but SuperOptiX is where we have already integrated it end to end in a real agent framework environment. That makes SuperOptiX the clearest proof that TurboAgents is useful beyond isolated examples.

SuperOptiX now supports TurboAgents-backed retrieval paths including turboagents-chroma, turboagents-lancedb, and turboagents-surrealdb. This means TurboAgents is already wired into a broader agent optimization and orchestration environment, not just exposed as raw backend code.

Why this matters: TurboAgents can operate as standalone infrastructure, and it can also power retrieval paths inside a real agent system. Adoption does not need to start with framework-by-framework upstream contributions. One strong reference integration is more valuable than a dozen shallow adapters.

Explore SuperOptiX docs, the SuperOptiX GitHub repo, or the SuperOptiX PyPI package.

What You Can Do with TurboAgents Today

Test compressed retrieval locally
Compare supported vector stores
Integrate a compressed reranking path into your RAG stack
Experiment with retrieval quality versus memory tradeoffs
Use the SuperOptiX reference integration for end-to-end agent workflows

If you are already using Chroma, LanceDB, SurrealDB, FAISS, or pgvector, TurboAgents is worth trying directly. If you are already using SuperOptiX, TurboAgents is now a natural retrieval option inside that ecosystem.

What We Are Not Claiming

A launch post should be honest. TurboAgents is public and usable now, but we are not pretending every possible integration and benchmark surface is done. The package is strongest today as a compressed retrieval package, a backend adapter layer, a benchmarkable retrieval system, and a reference-integrated retrieval path inside SuperOptiX. That is already enough to be useful.

Why We Are Launching Now

We are launching now because TurboAgents has crossed the threshold from internal experimentation to something other people can actually install, run, and evaluate. It now has a real package on PyPI, live public docs, validated backend support, benchmark coverage, and a reference integration in SuperOptiX. That is the right point to make it public.

Try TurboAgents

If you want to try it now, start with the docs, the getting started guide, or the benchmarks. The source is on GitHub and the package is on PyPI.

And if you want to see the reference integration: SuperOptiX docs, SuperOptiX GitHub, and SuperOptiX PyPI.

TurboAgents is public now. The right next step is simple: install it, run it against your retrieval stack, and see where it helps.