Join the conversation on AI models, pricing, and tools. Price Per Token Community

Qwen News

Latest Qwen AI news and updates. Model releases, announcements, benchmarks, and developments. Updated daily.

News Feed

May 1

[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work

a quiet day lets us reflect on coding agents "breaking containment"

Latent Space·5/1/2026·Anthropic Tencent Open Source Benchmarks

Apr 30

This startup’s new mechanistic interpretability tool lets you debug LLMs

The San Francisco–based startup Goodfire just released a new tool, called Silico, that lets researchers and engineers peer inside an AI model and adjust its parameters—the settings that determine a model’s behavior—during training. This could give model makers more fine-grained control over how this technology is built than was once thought possible. Goodfire claims Silico…

MIT Technology Review AI·4/30/2026·OpenAI Google Open Source Coding

Apr 29

[AINews] not much happened today

a quiet day.

Latent Space·4/29/2026·OpenRouter Deepseek Open Source Benchmarks

Apr 23

[AINews] Tasteful Tokenmaxxing

a quiet day lets us reflect on the top conversation that AI leaders are having everywhere.

Latent Space·4/23/2026·Xiaomi Qwen Open Source Benchmarks

Apr 22

[AINews] OpenAI launches GPT-Image-2

with Cursor getting a $10B contract with xAI and a right to acquire for $60B.

Latent Space·4/22/2026·Anthropic OpenAI Open Source Benchmarks

Apr 21

China’s open-source bet

Silicon Valley AI companies follow a familiar playbook: Keep the secret sauce behind an API, and charge for every drop. China’s leading AI labs are playing a different game: They ship models as downloadable “open-weight” packages. This lets developers adapt the models and run them on their own hardware to build products without negotiating a…

MIT Technology Review AI·4/21/2026·Z-ai Hugging Face Open Source Benchmarks

[AINews] Moonshot Kimi K2.6: the world's leading Open Model refreshes to catch up to Opus 4.6 (ahead of DeepSeek v4?)

Yay Kimi!!!

Latent Space·4/21/2026·OpenRouter Anthropic Open Source Benchmarks

Apr 20

Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances

Today, we are thrilled to announce the availability of G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Amazon SageMaker AI. You can provision instances with 1, 2, 4, or 8 RTX PRO 6000 GPUs, with each GPU providing 96 GB of GDDR7 memory. This launch lets you use a single-GPU G7e.2xlarge instance to host powerful open source foundation models (FMs) like GPT-OSS-120B, Nemotron-3-Super-120B-A12B (NVFP4 variant), and Qwen3.5-35B-A3B, offering organizations a cost-effective and high-performing option.

AWS Machine Learning·4/20/2026·Nvidia Amazon Open Source Benchmarks

Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4

At what point do the financial markets price in the singularity?

Import AI·4/20/2026·Meta-llama Qwen Open Source Benchmarks

Apr 18

[AINews] The Two Sides of OpenClaw

a quiet day lets us reflect on openclaw this week.

Latent Space·4/18/2026·Anthropic Qwen Open Source Benchmarks

Apr 15

Accelerating decode-heavy LLM inference with speculative decoding on AWS Trainium and vLLM

In this post, you will learn how speculative decoding works and why it helps reduce cost per generated token on AWS Trainium2.

AWS Machine Learning·4/15/2026·Aws Amazon Open Source Benchmarks
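
For readers unfamiliar with the technique named in the item above, here is a minimal sketch of speculative decoding with greedy verification. It is an illustration under stated assumptions, not the code from the AWS post: `target_next` and `draft_next` are hypothetical toy stand-ins for a large target model and a small draft model, and production systems typically verify all drafted tokens in one batched forward pass and use probabilistic acceptance rather than exact matching.

```python
from typing import Callable, List

# A "model" here is any function mapping a token sequence to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Draft proposes up to k tokens; target keeps the longest agreeing prefix."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The cheap draft model proposes tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target checks each position (sequential here for clarity).
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # First mismatch: keep the agreeing prefix plus the target's token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every drafted token was accepted; take one extra target token if room.
            out.extend(proposal)
            if len(out) < goal:
                out.append(target(out))
    return out[:goal]


# Hypothetical toy models: the target counts upward modulo 10,
# and the draft agrees except after token 4, where it guesses wrong.
def target_next(seq: List[int]) -> int:
    return (seq[-1] + 1) % 10


def draft_next(seq: List[int]) -> int:
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10
```

With greedy verification the output is identical to plain greedy decoding with the target model. The potential saving comes from the target accepting several drafted tokens per verification step, which matters most when decoding dominates the workload.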

Apr 14

Use-case based deployments on SageMaker JumpStart

We're excited to announce the launch of Amazon SageMaker JumpStart optimized deployments. These optimized deployments address the need for rich and straightforward deployment customization on SageMaker JumpStart by offering pre-defined deployment configurations designed for specific use cases. Customers maintain the same level of visibility into the details of their proposed deployments, but deployments are now optimized for their specific use case and performance constraints.

AWS Machine Learning·4/14/2026·Aws Meta-llama Open Source Speed

[AINews] Top Local Models List - April 2026

a quiet day lets us check in on the local models scene

Latent Space·4/14/2026·Meta-llama Qwen Open Source Benchmarks

Apr 10

[AINews] AI Engineer Europe 2026

Two quiet days in a row let us reflect on the first AIE in London.

Latent Space·4/10/2026·Z-ai Anthropic Open Source Benchmarks

Apr 7

[AINews] Gemma 4 crosses 2 million downloads

a quiet day lets us give due respect to the enormously successful Gemma 4 launch

Latent Space·4/7/2026·Nousresearch Qwen Open Source Hardware

Apr 6

Accelerate agentic tool calling with serverless model customization in Amazon SageMaker AI

In this post, we walk through how we fine-tuned Qwen 2.5 7B Instruct for tool calling using RLVR. We cover dataset preparation across three distinct agent behaviors, reward function design with tiered scoring, training configuration and results interpretation, evaluation on held-out data with unseen tools, and deployment.

AWS Machine Learning·4/6/2026·Meta-llama Qwen Open Source Coding
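
As a rough illustration of what a tiered, verifiable reward for tool calling might look like, here is a minimal sketch. The tier values (0.0, 0.25, 0.5, 1.0), the JSON shape with `name` and `arguments` keys, and the function name `tiered_reward` are all assumptions made for this example; the AWS post's actual reward design is not reproduced here.

```python
import json
from typing import Any, Dict


def tiered_reward(completion: str, expected: Dict[str, Any]) -> float:
    """Score a model's tool call against a reference call in tiers.

    Assumed tiers (hypothetical, for illustration only):
      0.0   output is not a JSON object with a "name" field
      0.25  well-formed call, but the wrong tool
      0.5   right tool, wrong arguments
      1.0   right tool and exactly matching arguments
    """
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(call, dict) or "name" not in call:
        return 0.0
    if call["name"] != expected["name"]:
        return 0.25
    if call.get("arguments") != expected.get("arguments"):
        return 0.5
    return 1.0


# Hypothetical reference call used for the examples below.
reference = {"name": "get_weather", "arguments": {"city": "Paris"}}
```

Partial credit of this kind gives the policy a learning signal even when it has only picked the right tool, rather than rewarding nothing short of an exact match.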

AI is changing how small online sellers decide what to make

For years Mike McClary sold the Guardian LTE Flashlight, a heavy-duty black model, online through his small outdoor brand. The product, designed for brightness and durability, became one of his most popular items ever. Even after he stopped offering it around 2017, customers kept sending him emails asking where they could buy it. When McClary…

MIT Technology Review AI·4/6/2026·Qwen Anthropic Open Source Coding

Apr 3

[AINews] Gemma 4: The best small Multimodal Open Models, dramatically better than Gemma 3 in every way

A welcome update from Google!

Latent Space·4/3/2026·Qwen Google Open Source Benchmarks

Apr 2

Qwen: Qwen3.6 Plus (free) (qwen/qwen3.6-plus)

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers major gains in agentic coding, front-end development, and overall reasoning, with a significantly improved “vibe coding” experience. The model excels at complex tasks such as 3D scenes, games, and repository-level problem solving, achieving a 78.8 score on SWE-bench Verified. It represents a substantial leap in both pure-text and multimodal capabilities, performing at the level of leading state-of-the-art models.