About Subquadratic (SubQ)
Overview
SubQ is the first large language model built on a fully sub-quadratic sparse-attention (SSA) architecture, designed specifically for long-context reasoning tasks. Unlike transformer-based models that scale attention compute at O(n²), SubQ scales at O(n), reducing attention compute by nearly 1,000× at 12M tokens.
SubQ targets developers, enterprise teams, and coding agents that need to reason across entire codebases, months of pull request history, and long-running agent state — all in a single prompt call at a fraction of the cost of comparable frontier models.
Key Benefits
- 12M token context window allows reasoning across full repositories and extended agent histories in one prompt.
- Runs at 150 tokens per second with costs at one-fifth of other leading LLMs.
- Scores 95.6% on RULER @ 128K, outperforming Claude Opus 4.6 (94.8%) on long-context accuracy.
- Scores 86.2% on MRCR v2 (8-needle, 1M), surpassing all listed competitors including Claude Opus 4.7 (74.0%).
- All benchmark results are third-party validated.
- SubQ Code delivers approximately 25% lower bills and 10× faster codebase exploration for coding agents.
How It Works
SubQ's sparse-attention architecture identifies only the fraction of token relationships that matter and focuses compute exclusively on those, rather than evaluating every possible pair. Developers access SubQ via an OpenAI-compatible API endpoint, submitting full repositories or pipeline states in a single call with streaming and tool use support. Coding agent users install SubQ Code with a one-line command, and it automatically redirects expensive model turns to SubQ for token-heavy queries.
Use Cases
- Backend engineers processing full Python or JavaScript repositories through the API to get codebase-wide answers in one call.
- Coding agent developers integrating SubQ Code into Claude Code, Codex, or Cursor to cut inference costs on large context lookups.
- Enterprise ML teams running multi-step agentic pipelines that require persistent state across long task histories.
- DevOps teams analyzing months of pull request history — for example, six months of React PRs — within a single prompt.
- AI infrastructure researchers evaluating sub-quadratic architectures as a replacement for transformer-based long-context models.
Why Choose This Product
SubQ is best suited for teams and agents that regularly hit the context limits or cost ceilings of existing frontier models like GPT-5 or Claude Opus when working with large codebases or long task histories. The product is currently in private early access, so availability is limited to approved API and Code preview participants.
Subquadratic (SubQ)Pros & Cons
- 12M token context window — largest explicitly stated in the provided content
- Sub-quadratic O(n) attention reduces compute ~1,000× vs transformers at 12M tokens
- Benchmark results are third-party validated
- OpenAI-compatible API endpoints enable easy migration
- SubQ Code installs in one line and plugs into major coding agent tools
- Currently in private early access — not publicly available
- No public pricing listed; requires sales contact or access request
- Technical report not yet published (listed as coming soon)
Key Features
12M Token Context
SubQ supports a 12M token context window, enabling reasoning across entire codebases, six months of PR history, and long agent states in a single prompt.
Sub-Quadratic Architecture
SubQ uses a fully sub-quadratic sparse-attention (SSA) architecture that scales at O(n) instead of O(n²), reducing attention compute by nearly 1,000× at 12M tokens.
OpenAI-Compatible API
The SubQ API exposes OpenAI-compatible endpoints with streaming and tool use support, allowing developers to integrate without changing existing tooling.
SubQ Code Integration
SubQ Code is a one-line installable layer for coding agents that plugs into Claude Code, Codex, and Cursor to handle token-heavy queries at lower cost.
1/5 Cost of Frontier Models
SubQ is priced at one-fifth the cost of other leading LLMs, and SubQ Code specifically delivers approximately 25% lower bills for coding agent workflows.
Third-Party Benchmarks
SubQ's benchmark results on SWE-Bench Verified, RULER @ 128K, and MRCR v2 are third-party validated, with 95.6% on RULER @ 128K and 86.2% on MRCR v2 at 1M tokens.
Auto-Redirect Model Turns
SubQ Code automatically redirects expensive model turns from frontier models to SubQ for token-heavy questions, reducing cost without manual intervention.
150 Tokens Per Second
SubQ generates output at 150 tokens per second, maintaining high throughput even at extreme context lengths.
Frequently asked questions about Subquadratic (SubQ)
What is SubQ's context window size?
SubQ supports a 12M token context window, which is large enough to hold the entire Python 3.13 standard library (approximately 5.1M tokens) or six months of React pull requests (approximately 7.5M tokens) in a single prompt.
How does SubQ compare to GPT-5 and Claude on benchmarks?
On RULER @ 128K, SubQ scores 95.6%, outperforming Claude Opus 4.6 at 94.8%. On MRCR v2 (8-needle, 1M), SubQ scores 86.2%, ahead of Claude Opus 4.7 (74.0%) and GPT-5.5 (74.0%). All SubQ results are third-party validated.
How do I use SubQ with my existing coding agent setup?
SubQ Code is available as a one-line install that plugs directly into Claude Code, Codex, and Cursor. It automatically redirects expensive token-heavy model turns to SubQ, delivering approximately 25% lower bills and 10× faster codebase exploration.
Is SubQ publicly available?
SubQ is currently in private early access. Developers and teams can request API access or SubQ Code access via the website. Enterprise teams can contact the sales team directly.
Is Subquadratic (SubQ) free?
Subquadratic (SubQ) is a commercial product and pricing is provided on request. Contact the vendor for a quote.
What platforms does Subquadratic (SubQ) support?
Subquadratic (SubQ) is available on: Web.
How Subquadratic (SubQ) compares
Subquadratic (SubQ)This | ||||
|---|---|---|---|---|
| Starting price | Contact sales | $9/month | — | Free |
| Pricing model | Contact sales | Freemium | Freemium | Free |
| Platforms | Web | Web | Web, macOS | macOS, iOS, Web |
| Top features |
|
|
|
|
| Rating | — | — | — | — |
