Arcee AI News

Latest Arcee AI news and updates: model releases, announcements, benchmarks, and developments. Updated daily.

News Feed

Apr 7

I can’t help rooting for tiny open source AI model maker Arcee

Arcee is a tiny 26-person U.S. startup that built a high-performing, massive, open source LLM. And it's gaining popularity with OpenClaw users.

TechCrunch AI·4/7/2026·Arcee AI Open Source

Apr 2

[AINews] A quiet April Fools

a quiet day

Latent Space·4/2/2026·OpenRouter Nousresearch Open Source Benchmarks

Apr 1

Arcee AI: Trinity Large Thinking (arcee-ai/trinity-large-thinking)

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance on PinchBench, agentic workloads, and reasoning tasks. It is free in OpenClaw for the first five days. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7

OpenRouter·4/1/2026·Arcee AI Open Source Benchmarks

Jan 30

[AINews] SpaceXai Grok Imagine API - the #1 Video Model, Best Pricing and Latency

xAI cements its position as a frontier lab and prepares to merge with SpaceX

Latent Space·1/30/2026·Anthropic Xai Open Source Benchmarks

Jan 28

Tiny startup Arcee AI built a 400B open source LLM from scratch to best Meta’s Llama

30-person startup Arcee AI has released a 400B model called Trinity, which it says is one of the biggest open source foundation models from a US company.

TechCrunch AI·1/28/2026·Arcee AI Meta-llama Open Source Release

[AINews] Moonshot Kimi K2.5 - Beats Sonnet 4.5 at half the cost, SOTA Open Model, first Native Image+Video, 100 parallel Agent Swarm manager

China takes another huge leap ahead in open models

Latent Space·1/28/2026·OpenRouter Anthropic Open Source Benchmarks

Jan 27

Arcee AI: Trinity Large Preview (free) (arcee-ai/trinity-large-preview)

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing. It excels at creative writing, storytelling, role-play, chat scenarios, and real-time voice assistance, handling them better than a typical reasoning model does. It also introduces some of Arcee’s newer agentic capabilities: it was trained to work well in agent harnesses like OpenCode, Cline, and Kilo Code, and to handle complex toolchains and long, constraint-filled prompts. The architecture natively supports very long context windows of up to 512k tokens, with the Preview API currently served at 128k context using 8-bit quantization for practical deployment. Trinity-Large-Preview reflects Arcee’s efficiency-first design philosophy, offering a production-oriented frontier model with open weights and permissive licensing suitable for real-world applications and experimentation.

OpenRouter·1/27/2026·Arcee AI Open Source Release
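The "4-of-256 expert routing" mentioned for Trinity-Large-Preview can be illustrated with a minimal top-k routing sketch. This is a generic illustration, not Arcee's implementation: the function name `top_k_route`, the random router scores, and the softmax over the selected experts are all assumptions, since the listing does not describe how the router normalizes scores or balances load.

```python
import math
import random


def top_k_route(router_logits, k=4):
    """Pick the k highest-scoring experts and softmax-normalize their scores.

    Hypothetical helper: real routers may normalize differently or add
    load-balancing terms that the listing does not describe.
    """
    # Indices of the k largest router scores, best first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(router_logits[i] for i in top)
    exp_scores = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exp_scores)
    return [(i, s / total) for i, s in zip(top, exp_scores)]


random.seed(0)
# Made-up router scores for a single token across 256 experts.
logits = [random.gauss(0.0, 1.0) for _ in range(256)]
routes = top_k_route(logits, k=4)

for expert_id, weight in routes:
    print(f"expert {expert_id}: mixing weight {weight:.3f}")
```

Only the four selected experts would run for that token, which is how a 400B-parameter model can activate far fewer parameters per token.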

Dec 1

Arcee AI: Trinity Mini (free) (arcee-ai/trinity-mini)

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. It is engineered for efficient reasoning over long contexts (131k tokens), with robust function calling and multi-step agent workflows.

OpenRouter·12/1/2025·Arcee AI Open Source Release
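As a quick check on the sparsity figures quoted for the Trinity models, the short sketch below computes the share of experts and of parameters that are active per token. The helper name `active_share` is hypothetical; the numbers are taken from the listings above.

```python
def active_share(total, active):
    """Fraction of a model's experts (or parameters) used per token."""
    return active / total


# Trinity Mini: 128 experts with 8 active, 26B parameters with 3B active.
mini_expert_share = active_share(128, 8)
mini_param_share = active_share(26e9, 3e9)

# Trinity-Large-Preview: 4-of-256 routing, 400B parameters with 13B active.
large_expert_share = active_share(256, 4)
large_param_share = active_share(400e9, 13e9)

print(f"Trinity Mini:  {mini_expert_share:.4f} of experts, "
      f"{mini_param_share:.4f} of parameters")
print(f"Trinity Large: {large_expert_share:.4f} of experts, "
      f"{large_param_share:.4f} of parameters")
```

In both cases the parameter share is larger than the expert share, presumably because non-expert weights such as attention layers and embeddings run for every token; the listings do not break this down.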

Sep 16

Arcee AI: AFM 4.5B (arcee-ai/afm-4.5b)

AFM-4.5B is a 4.5 billion parameter instruction-tuned language model developed by Arcee AI. The model was pretrained on approximately 8 trillion tokens, including 6.5 trillion tokens of general data and 1.5 trillion tokens with an emphasis on mathematical reasoning and code generation.

OpenRouter·9/16/2025·Arcee AI Open Source Release

May 5

Arcee AI: Caller Large (arcee-ai/caller-large)

Caller Large is Arcee's specialist "function‑calling" SLM built to orchestrate external tools and APIs. Instead of maximizing next‑token accuracy, training focuses on structured JSON outputs, parameter extraction and multi‑step tool chains, making Caller a natural choice for retrieval‑augmented generation, robotic process automation or data‑pull chatbots. It incorporates a routing head that decides when (and how) to invoke a tool versus answering directly, reducing hallucinated calls. The model is already the backbone of Arcee Conductor's auto‑tool mode, where it parses user intent, emits clean function signatures and hands control back once the tool response is ready. Developers thus gain an OpenAI‑style function‑calling UX without handing requests to a frontier‑scale model.

OpenRouter·5/5/2025·Arcee AI OpenAI Coding
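The structured-JSON tool calling described for Caller Large can be sketched on the application side as follows. This is a minimal sketch under stated assumptions: the `TOOLS` registry, the `get_weather` tool, and the `{"name": ..., "arguments": ...}` payload shape are hypothetical and are not taken from Arcee's documentation. The code only shows how an application might tell a tool call apart from a direct answer and check extracted parameters before executing anything.

```python
import json

# Hypothetical tool registry: tool name -> required parameter names and types.
TOOLS = {
    "get_weather": {"city": str, "unit": str},
}


def parse_model_output(raw):
    """Return ("tool", name, args) for a valid tool call, else ("text", raw).

    Raises ValueError when the output names a known tool but the arguments
    do not match that tool's schema.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return ("text", raw)  # not JSON, so treat it as a direct answer

    if not isinstance(payload, dict):
        return ("text", raw)
    name = payload.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return ("text", raw)

    schema = TOOLS[name]
    args = payload.get("arguments", {})
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError(f"arguments for {name!r} do not match its schema")
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            raise ValueError(f"argument {key!r} has the wrong type")
    return ("tool", name, args)


call = parse_model_output(
    '{"name": "get_weather", "arguments": {"city": "Oslo", "unit": "celsius"}}'
)
reply = parse_model_output("It is usually cold in Oslo in January.")
```

Rejecting malformed calls before execution is one way an application can guard against the hallucinated calls the description mentions, regardless of how reliable the model's own routing is.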

Arcee AI: Spotlight (arcee-ai/spotlight)

Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fine‑tuned by Arcee AI for tight image‑text grounding tasks. It offers a 32 k‑token context window, enabling rich multimodal conversations that combine lengthy documents with one or more images. Training emphasized fast inference on consumer GPUs while retaining strong captioning, visual‐question‑answering, and diagram‑analysis accuracy. As a result, Spotlight slots neatly into agent workflows where screenshots, charts or UI mock‑ups need to be interpreted on the fly. Early benchmarks show it matching or out‑scoring larger VLMs such as LLaVA‑1.6 13 B on popular VQA and POPE alignment tests.

OpenRouter·5/5/2025·Arcee AI Qwen Open Source Benchmarks

Arcee AI: Maestro Reasoning (arcee-ai/maestro-reasoning)

Maestro Reasoning is Arcee's flagship analysis model: a 32 B‑parameter derivative of Qwen 2.5‑32 B tuned with DPO and chain‑of‑thought RL for step‑by‑step logic. Compared to the earlier 7 B preview, the production 32 B release widens the context window to 128 k tokens and doubles pass‑rate on MATH and GSM‑8K, while also lifting code completion accuracy. Its instruction style encourages structured "thought → answer" traces that can be parsed or hidden according to user preference. That transparency pairs well with audit‑focused industries like finance or healthcare where seeing the reasoning path matters. In Arcee Conductor, Maestro is automatically selected for complex, multi‑constraint queries that smaller SLMs bounce.

OpenRouter·5/5/2025·Arcee AI Qwen Open Source Benchmarks
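The "thought → answer" traces described for Maestro Reasoning can be parsed or hidden on the client side. The sketch below assumes the reasoning is wrapped in `<think>...</think>` tags; that delimiter is an assumption for illustration, since the listing does not state the actual trace format, and `split_trace` is a hypothetical helper.

```python
import re

# Assumed delimiter for reasoning spans; the real format may differ.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_trace(text):
    """Separate reasoning spans from the final answer.

    Returns (thoughts, answer): a list of reasoning strings and the
    remaining text with those spans removed.
    """
    thoughts = [span.strip() for span in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return thoughts, answer


sample = "<think>17 has no divisors other than 1 and 17.</think>Yes, 17 is prime."
thoughts, answer = split_trace(sample)
```

An audit-focused application could log `thoughts` for review while showing only `answer` to the end user.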

Arcee AI: Virtuoso Large (arcee-ai/virtuoso-large)

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k context inherited from Qwen 2.5, letting it ingest books, codebases or financial filings wholesale. Training blended DeepSeek R1 distillation, multi‑epoch supervised fine‑tuning and a final DPO/RLHF alignment stage, yielding strong performance on BIG‑Bench‑Hard, GSM‑8K and long‑context Needle‑In‑Haystack tests. Enterprises use Virtuoso‑Large as the "fallback" brain in Conductor pipelines when other SLMs flag low confidence. Despite its size, aggressive KV‑cache optimizations keep first‑token latency in the low‑second range on 8× H100 nodes, making it a practical production‑grade powerhouse.

OpenRouter·5/5/2025·Arcee AI Deepseek Open Source Benchmarks

Arcee AI: Coder Large (arcee-ai/coder-large)

Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32k context window, enabling multi‑file refactoring or long diff review in a single call, and understands 30‑plus programming languages with special attention to TypeScript, Go and Terraform. Internal benchmarks show 5–8 pt gains over CodeLlama‑34 B‑Python on HumanEval and competitive BugFix scores thanks to a reinforcement pass that rewards compilable output. The model emits structured explanations alongside code blocks by default, making it suitable for educational tooling as well as production copilot scenarios. Cost‑wise, Together AI prices it well below proprietary incumbents, so teams can scale interactive coding without runaway spend.

OpenRouter·5/5/2025·Arcee AI Qwen Open Source Benchmarks

Arcee AI: Virtuoso Medium V2 (arcee-ai/virtuoso-medium-v2)

Virtuoso‑Medium‑v2 is a 32 B model distilled from DeepSeek‑v3 logits and merged back onto a Qwen 2.5 backbone, yielding a sharper, more factual successor to the original Virtuoso Medium. The team harvested ~1.1 B logit tokens and applied "fusion‑merging" plus DPO alignment, which pushed scores past Arcee‑Nova 2024 and many 40 B‑plus peers on MMLU‑Pro, MATH and HumanEval. With a 128 k context and aggressive quantization options (from BF16 down to 4‑bit GGUF), it balances capability with deployability on single‑GPU nodes. Typical use cases include enterprise chat assistants, technical writing aids and medium‑complexity code drafting where Virtuoso‑Large would be overkill.

OpenRouter·5/5/2025·Arcee AI Deepseek Open Source Benchmarks

Arcee AI: Arcee Blitz (arcee-ai/arcee-blitz)

Arcee Blitz is a 24 B‑parameter dense model distilled from DeepSeek and built on Mistral architecture for "everyday" chat. The distillation‑plus‑refinement pipeline trims compute while keeping DeepSeek‑style reasoning, so Blitz punches above its weight on MMLU, GSM‑8K and BBH compared with other mid‑size open models. With a default 128 k context window and competitive throughput, it serves as a cost‑efficient workhorse for summarization, brainstorming and light code help. Internally, Arcee uses Blitz as the default writer in Conductor pipelines when the heavier Virtuoso line is not required. Users therefore get near‑70 B quality at ~⅓ the latency and price.
