AI speed and performance news. Inference optimization, latency improvements, throughput benchmarks, and model efficiency.
How uv got so fast

Andrew Nesbitt provides an insightful teardown of why uv is so much faster than pip. It's not nearly as simple as just "they rewrote it in Rust": uv gets to skip a huge amount of Python packaging history (which pip needs to implement for backwards compatibility) and benefits enormously from work over recent years that makes it possible to resolve dependencies across most packages without having to execute the code in setup.py using a Python interpreter.

Two notes caught my eye that I hadn't understood before:

> HTTP range requests for metadata. Wheel files are zip archives, and zip archives put their file listing at the end. uv tries PEP 658 metadata first, falls back to HTTP range requests for the zip central directory, then full wheel download, then building from source. Each step is slower and riskier. The design makes the fast path cover 99% of cases. None of this requires Rust. [...]
>
> Compact version representation. uv packs versions into u64 integers where possible, making comparison and hashing fast. Over 90% of versions fit in one u64. This is micro-optimization that compounds across millions of comparisons.

I wanted to learn more about these tricks, so I fired up an asynchronous research task and told it to check out the astral-sh/uv repo, find the Rust code for both of those features and try porting it to Python to help me understand how it works. Here's the report that it wrote for me, the prompts I used and the Claude Code transcript.

You can try the script it wrote for extracting metadata from a wheel using HTTP range requests like this:

uv run --with httpx https://raw.githubusercontent.com/simonw/research/refs/heads/main/http-range-wheel-metadata/wheel_metadata.py https://files.pythonhosted.org/packages/8b/04/ef95b67e1ff59c080b2effd1a9a96984d6953f667c91dfe9d77c838fc956/playwright-1.57.0-py3-none-macosx_11_0_arm64.whl -v

The Playwright wheel there is ~40MB. Adding -v at the end causes the script to spit out verbose details of how it fetched the data - which looks like this. Key extract from that output:

[1] HEAD request to get file size...
    File size: 40,775,575 bytes
[2] Fetching last 16,384 bytes (EOCD + central directory)...
    Received 16,384 bytes
[3] Parsed EOCD:
    Central directory offset: 40,731,572
    Central directory size: 43,981
    Total entries: 453
[4] Fetching complete central directory...
...
[6] Found METADATA: playwright-1.57.0.dist-info/METADATA
    Offset: 40,706,744
    Compressed size: 1,286
    Compression method: 8
[7] Fetching METADATA content (2,376 bytes)...
[8] Decompressed METADATA: 3,453 bytes

Total bytes fetched: 18,760 / 40,775,575 (100.0% savings)

The section of the report on compact version representation is interesting too. Here's how it illustrates sorting version numbers correctly based on their custom u64 representation:

Sorted order (by integer comparison of packed u64):
  1.0.0a1      (repr=0x0001000000200001)
  1.0.0b1      (repr=0x0001000000300001)
  1.0.0rc1     (repr=0x0001000000400001)
  1.0.0        (repr=0x0001000000500000)
  1.0.0.post1  (repr=0x0001000000700001)
  1.0.1        (repr=0x0001000100500000)
  2.0.0.dev1   (repr=0x0002000000100001)
  2.0.0        (repr=0x0002000000500000)

Tags: performance, python, rust, uv
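A postscript on that compact version representation: here's a small Python sketch of the idea. It is an illustration of the technique, not uv's actual bit layout - it packs the common "major.minor.patch plus optional dev/pre/post segment" shape into a single 64-bit integer so that plain integer comparison reproduces the ordering shown above.

```python
import re

# Illustrative sketch only: a simplified u64 packing for "small" versions,
# not uv's exact bit layout. Integer comparison of the packed values then
# matches PEP 440 ordering for the common case.
PHASES = {"dev": 1, "a": 2, "b": 3, "rc": 4, None: 5, "post": 7}  # dev < pre < final < post

VERSION_RE = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)"             # major.minor.patch
    r"(?:\.?(dev|a|b|rc|post)(\d+))?$"  # optional dev/pre/post segment
)

def pack(version: str) -> int:
    m = VERSION_RE.match(version)
    if m is None:
        raise ValueError(f"not a small version: {version}")
    major, minor, patch = (int(m.group(i)) for i in (1, 2, 3))
    phase, number = m.group(4), int(m.group(5) or 0)
    if max(major, minor, patch) > 0xFFFF or number > 0xFFF:
        raise ValueError("component too large for the compact form")
    # Layout: major (16 bits) | minor (16) | patch (16) | phase (4) | number (12)
    return (major << 48) | (minor << 32) | (patch << 16) | (PHASES[phase] << 12) | number

versions = ["1.0.1", "1.0.0", "2.0.0", "1.0.0rc1", "1.0.0a1",
            "1.0.0b1", "1.0.0.post1", "2.0.0.dev1"]
for v in sorted(versions, key=pack):
    print(f"{v:>12}  0x{pack(v):016x}")
```

Anything that doesn't match this pattern (epochs, local versions, more release segments) would need to fall back to a slower full representation - the same trade-off behind the "over 90% of versions fit in one u64" figure quoted above.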
Enterprise organizations increasingly rely on web-based applications for critical business processes, yet many workflows remain manually intensive, creating operational inefficiencies and compliance risks. Despite significant technology investments, knowledge workers routinely navigate between eight to twelve different web applications during standard workflows, constantly switching contexts and manually transferring information between systems. Data entry and validation tasks […]
In this post, we explore how agentic QA automation addresses these challenges and walk through a practical example using Amazon Bedrock AgentCore Browser and Amazon Nova Act to automate testing for a sample retail application.
In this post, we demonstrate how to optimize large language model (LLM) inference on Amazon SageMaker AI using BentoML's LLM-Optimizer to systematically identify the best serving configurations for your workload.
This post explores Chain-of-Draft (CoD), an innovative prompting technique introduced in the Zoom AI Research paper Chain of Draft: Thinking Faster by Writing Less, which revolutionizes how models approach reasoning tasks. While Chain-of-Thought (CoT) prompting has been the go-to method for enhancing model reasoning, CoD offers a more efficient alternative that mirrors human problem-solving patterns: concise, high-signal thinking steps rather than verbose explanations.
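The difference is purely in the prompt. Here's a minimal sketch of the two instruction styles - the CoD wording below paraphrases the paper's idea of capping each reasoning step at a few words, and is not quoted from it:

```python
# Chain-of-Thought vs Chain-of-Draft style instructions (paraphrased, not
# verbatim from the paper). The only change is how the model is told to
# format its intermediate reasoning.
COT_INSTRUCTION = (
    "Think step by step to answer the question. "
    "Explain each step, then give the final answer after ####."
)

COD_INSTRUCTION = (
    "Think step by step, but keep only a minimal draft for each thinking "
    "step, at most five words per step. Give the final answer after ####."
)

QUESTION = (
    "Jason had 20 lollipops. He gave Denny some lollipops. "
    "Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?"
)

# A CoD-style answer is expected to look roughly like:
#   20 - x = 12; x = 8
#   #### 8
# instead of several sentences of narrated arithmetic, cutting output
# tokens (and therefore latency and cost) substantially.
```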
In this post, we demonstrate hosting Voxtral models on Amazon SageMaker AI endpoints using vLLM and the Bring Your Own Container (BYOC) approach. vLLM is a high-performance library for serving large language models (LLMs) that features paged attention for improved memory management and tensor parallelism for distributing models across multiple GPUs.
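For a sense of what the serving side looks like, here's a minimal vLLM sketch. The model ID and GPU count are illustrative placeholders, not the configuration from the post (serving Voxtral's audio inputs needs the BYOC setup the post describes):

```python
# Minimal vLLM offline-inference sketch. PagedAttention is handled
# internally by vLLM; tensor_parallel_size shards the model weights
# across multiple GPUs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder model ID
    tensor_parallel_size=2,              # split the model across 2 GPUs
)

params = SamplingParams(temperature=0.2, max_tokens=128)
outputs = llm.generate(["Explain paged attention in one sentence."], params)
print(outputs[0].outputs[0].text)
```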
Today, we are excited to introduce a new feature for SageMaker Studio: SOCI (Seekable Open Container Initiative) indexing. SOCI supports lazy loading of container images, where only the necessary parts of an image are downloaded initially rather than the entire container.
This post demonstrates how to use the powerful combination of Strands Agents, Amazon Bedrock AgentCore, and NVIDIA NeMo Agent Toolkit to build, evaluate, optimize, and deploy AI agents on Amazon Web Services (AWS) from initial development through production deployment.
In this post, you will learn about bi-directional streaming on AgentCore Runtime and the prerequisites to create a WebSocket implementation. You will also learn how to use Strands Agents to implement a bi-directional streaming solution for voice agents.
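As background, bi-directional streaming simply means the client and server can each send frames whenever they like over a single connection, which is what makes low-latency voice agents possible. Here's a minimal Python sketch of that pattern using the websockets library - the endpoint URL and message shapes are placeholders, not the AgentCore Runtime protocol:

```python
# Generic bi-directional WebSocket sketch: one task streams chunks up
# while another consumes responses as they arrive. The URL and message
# format are placeholders, not AgentCore's actual protocol.
import asyncio
import json
import websockets

async def main(url: str = "wss://example.invalid/agent") -> None:  # placeholder endpoint
    async with websockets.connect(url) as ws:

        async def send_chunks() -> None:
            for i in range(5):                     # stand-in for audio frames
                await ws.send(json.dumps({"type": "audio", "seq": i}))
                await asyncio.sleep(0.1)
            await ws.send(json.dumps({"type": "end"}))

        async def receive() -> None:
            async for message in ws:               # server may respond at any time
                print("agent:", message)

        await asyncio.gather(send_chunks(), receive())

if __name__ == "__main__":
    asyncio.run(main())
```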
It continues to be a busy December, if not quite as busy as last year. Today's big news is Gemini 3 Flash, the latest in Google's "Flash" line of faster and less expensive models. Google are emphasizing the comparison between the new Flash and their previous generation's top model, Gemini 2.5 Pro:

> Building on 3 Pro’s strong multimodal, coding and agentic features, 3 Flash offers powerful performance at less than a quarter the cost of 3 Pro, along with higher rate limits. The new 3 Flash model surpasses 2.5 Pro across many benchmarks while delivering faster speeds.

Gemini 3 Flash's characteristics are almost identical to Gemini 3 Pro: it accepts text, image, video, audio, and PDF input, outputs only text, handles up to 1,048,576 input tokens and up to 65,536 output tokens, and has the same knowledge cut-off date of January 2025 (also shared with the Gemini 2.5 series).

The benchmarks look good. The cost is appealing: 1/4 the price of Gemini 3 Pro for prompts ≤200k tokens and 1/8 the price for prompts >200k, and it's nice not to have a price increase for the new Flash at larger token lengths.

It's a little more expensive than previous Flash models - Gemini 2.5 Flash was $0.30/million input tokens and $2.50/million for output, Gemini 3 Flash is $0.50/million and $3/million respectively. Google claim it may still end up cheaper though, due to more efficient output token usage:

> Gemini 3 Flash is able to modulate how much it thinks. It may think longer for more complex use cases, but it also uses 30% fewer tokens on average than 2.5 Pro.

Here's a more extensive price comparison on my llm-prices.com site.

Generating some SVGs of pelicans

I released llm-gemini 0.28 this morning with support for the new model. You can try it out like this:

llm install -U llm-gemini
llm keys set gemini # paste in key
llm -m gemini-3-flash-preview "Generate an SVG of a pelican riding a bicycle"

According to the developer docs the new model supports four different thinking level options: minimal, low, medium, and high. This is different from Gemini 3 Pro, which only supported low and high. You can run those like this:

llm -m gemini-3-flash-preview --thinking-level minimal "Generate an SVG of a pelican riding a bicycle"

Here are four pelicans, for thinking levels minimal, low, medium, and high:

I built the gallery component with Gemini 3 Flash

The gallery above uses a new Web Component which I built using Gemini 3 Flash to try out its coding abilities. The code on the page looks like this:

<image-gallery width="4">
  <img src="https://static.simonwillison.net/static/2025/gemini-3-flash-preview-thinking-level-minimal-pelican-svg.jpg" alt="A minimalist vector illustration of a stylized white bird with a long orange beak and a red cap riding a dark blue bicycle on a single grey ground line against a plain white background." />
  <img src="https://static.simonwillison.net/static/2025/gemini-3-flash-preview-thinking-level-low-pelican-svg.jpg" alt="Minimalist illustration: A stylized white bird with a large, wedge-shaped orange beak and a single black dot for an eye rides a red bicycle with black wheels and a yellow pedal against a solid light blue background." />
  <img src="https://static.simonwillison.net/static/2025/gemini-3-flash-preview-thinking-level-medium-pelican-svg.jpg" alt="A minimalist illustration of a stylized white bird with a large yellow beak riding a red road bicycle in a racing position on a light blue background." />
  <img src="https://static.simonwillison.net/static/2025/gemini-3-flash-preview-thinking-level-high-pelican-svg.jpg" alt="Minimalist line-art illustration of a stylized white bird with a large orange beak riding a simple black bicycle with one orange pedal, centered against a light blue circular background." />
</image-gallery>

Those alt attributes are all generated by Gemini 3 Flash as well, using this recipe:

llm -m gemini-3-flash-preview --system '
You write alt text for any image pasted in by the user. Alt text is always
presented in a fenced code block to make it easy to copy and paste out. It is
always presented on a single line so it can be used easily in Markdown images.
All text on the image (for screenshots etc) must be exactly included. A short
note describing the nature of the image itself should go first.' \
  -a https://static.simonwillison.net/static/2025/gemini-3-flash-preview-thinking-level-high-pelican-svg.jpg

You can see the code that powers the image gallery Web Component here on GitHub. I built it by prompting Gemini 3 Flash via LLM like this:

llm -m gemini-3-flash-preview '
Build a Web Component that implements a simple image gallery. Usage is like this:

<image-gallery width="5">
  <img src="image1.jpg" alt="Image 1">
  <img src="image2.jpg" alt="Image 2" data-thumb="image2-thumb.jpg">
  <img src="image3.jpg" alt="Image 3">
</image-gallery>

If an image has a data-thumb= attribute that one is used instead, other images
are scaled down. The image gallery always takes up 100% of available width.
The width="5" attribute means that five images will be shown next to each other
in each row. The default is 3. There are gaps between the images. When an image
is clicked it opens a modal dialog with the full size image.

Return a complete HTML file with both the implementation of the Web Component
several example uses of it. Use https://picsum.photos/300/200 URLs for those
example images.'

It took a few follow-up prompts using llm -c:

llm -c 'Use a real modal such that keyboard shortcuts and accessibility features work without extra JS'
llm -c 'Use X for the close icon and make it a bit more subtle'
llm -c 'remove the hover effect entirely'
llm -c 'I want no border on the close icon even when it is focused'

Here's the full transcript, exported using llm logs -cue. Those five prompts took:

225 input, 3,269 output
2,243 input, 2,908 output
4,319 input, 2,516 output
6,376 input, 2,094 output
8,151 input, 1,806 output

Added together that's 21,314 input and 12,593 output for a grand total of 4.8436 cents.

The guide to migrating from Gemini 2.5 reveals one disappointment:

> Image segmentation: Image segmentation capabilities (returning pixel-level masks for objects) are not supported in Gemini 3 Pro or Gemini 3 Flash. For workloads requiring native image segmentation, we recommend continuing to utilize Gemini 2.5 Flash with thinking turned off or Gemini Robotics-ER 1.5.

I wrote about this capability in Gemini 2.5 back in April. I hope it comes back in future models - it's a really neat capability that is unique to Gemini.

Tags: google, ai, web-components, generative-ai, llms, llm, gemini, llm-pricing, pelican-riding-a-bicycle, llm-release
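For anyone who wants to check that arithmetic, here's the calculation behind the 4.8436 cent total, using the $0.50/million input and $3/million output prices quoted above:

```python
# Token counts from the five prompts above, priced at Gemini 3 Flash's
# $0.50 per million input tokens and $3.00 per million output tokens.
inputs = [225, 2_243, 4_319, 6_376, 8_151]
outputs = [3_269, 2_908, 2_516, 2_094, 1_806]

total_in, total_out = sum(inputs), sum(outputs)        # 21,314 and 12,593
cost = total_in * 0.50 / 1_000_000 + total_out * 3.00 / 1_000_000
print(f"{total_in:,} input, {total_out:,} output -> ${cost:.6f}")
# 21,314 input, 12,593 output -> $0.048436
```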
The new ChatGPT Images is here

OpenAI shipped an update to their ChatGPT Images feature - the feature that gained them 100 million new users in a week when they first launched it back in March, but which has since been eclipsed by Google's Nano Banana and then further by Nano Banana Pro in November.

The focus for the new ChatGPT Images is speed and instruction following:

> It makes precise edits while keeping details intact, and generates images up to 4x faster

It's also a little cheaper: OpenAI say that the new gpt-image-1.5 API model makes image input and output "20% cheaper in GPT Image 1.5 as compared to GPT Image 1".

I tried a new test prompt against a photo I took of Natalie's ceramic stand at the farmers market a few weeks ago:

Add two kakapos inspecting the pots

Here's the result from the new ChatGPT Images model:

And here's what I got from Nano Banana Pro:

The ChatGPT Kākāpō are a little chonkier, which I think counts as a win.

I was a little less impressed by the result I got for an infographic from the prompt "Infographic explaining how the Datasette open source project works" followed by "Run some extensive searches and gather a bunch of relevant information and then try again" (transcript):

See my Nano Banana Pro post for comparison. Both models are clearly now usable for text-heavy graphics though, which makes them far more useful than previous generations of this technology.

Update 21st December 2025: I realized I already have a tool for accessing this new model via the API. Here's what I got from the following:

OPENAI_API_KEY="$(llm keys get openai)" \
  uv run openai_image.py -m gpt-image-1.5 \
  'a raccoon with a double bass in a jazz bar rocking out'

Total cost: $0.2041.

Tags: ai, kakapo, openai, generative-ai, text-to-image, nano-banana
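If you'd rather call the API directly than use the openai_image.py script above, a sketch with the OpenAI Python SDK looks something like this - I'm assuming gpt-image-1.5 takes the same parameters as gpt-image-1 and returns base64-encoded image data, so treat the details as unverified:

```python
# Hedged sketch: generate an image with the gpt-image-1.5 API model and
# save it to disk. Assumes the same call shape as gpt-image-1 (base64
# response data); parameters may differ in practice.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
result = client.images.generate(
    model="gpt-image-1.5",
    prompt="a raccoon with a double bass in a jazz bar rocking out",
    size="1024x1024",
)

with open("raccoon.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```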
ty: An extremely fast Python type checker and LSP

The team at Astral have been working on this for quite a long time, and are finally releasing the first beta. They have some big performance claims:

> Without caching, ty is consistently between 10x and 60x faster than mypy and Pyright. When run in an editor, the gap is even more dramatic. As an example, after editing a load-bearing file in the PyTorch repository, ty recomputes diagnostics in 4.7ms: 80x faster than Pyright (386ms) and 500x faster than Pyrefly (2.38 seconds).

ty is very fast! The easiest way to try it out is via uvx:

cd my-python-project/
uvx ty check

I tried it against sqlite-utils and it turns out I have quite a lot of work to do!

Astral also released a new VS Code extension adding ty-powered language server features like go to definition. I'm still getting my head around how this works and what it can do.

Via Hacker News

Tags: python, vs-code, astral
The new ChatGPT Images is powered by our flagship image generation model, delivering more precise edits, consistent details, and image generation up to 4× faster. The upgraded model is rolling out to all ChatGPT users today and is also available in the API as GPT-Image-1.5.
When the ABB FIA Formula E World Championship launched its first race through Beijing’s Olympic Park in 2014, the idea of all-electric motorsport still bordered on experimental. Batteries couldn’t yet last a full race, and drivers had to switch cars mid-competition. Just over a decade later, Formula E has evolved into a global entertainment brand…
I recently came across JustHTML, a new Python library for parsing HTML released by Emil Stenström. It's a very interesting piece of software, both as a useful library and as a case study in sophisticated AI-assisted programming.

First impressions of JustHTML

I didn't initially know that JustHTML had been written with AI assistance at all. The README caught my eye due to some attractive characteristics:

- It's pure Python. I like libraries that are pure Python (no C extensions or similar) because it makes them easy to use in less conventional Python environments, including Pyodide.
- "Passes all 9,200+ tests in the official html5lib-tests suite (used by browser vendors)" - this instantly caught my attention! HTML5 is a big, complicated but meticulously written specification.
- 100% test coverage. That's not something you see every day.
- CSS selector queries as a feature. I built a Python library for this many years ago and I'm always interested in seeing new implementations of that pattern.
- html5lib has been inconsistently maintained over the last few years, leaving me interested in potential alternatives.
- It's only 3,000 lines of implementation code (and another ~11,000 of tests).

I was out and about without a laptop, so I decided to put JustHTML through its paces on my phone. I prompted Claude Code for web on my phone and had it build this Pyodide-powered HTML tool for trying it out:

This was enough for me to convince myself that the core functionality worked as advertised. It's a neat piece of code!

Turns out it was almost all built by LLMs

At this point I went looking for some more background information on the library and found Emil's blog entry about it: How I wrote JustHTML using coding agents:

> Writing a full HTML5 parser is not a short one-shot problem. I have been working on this project for a couple of months on off-hours.
>
> Tooling: I used plain VS Code with Github Copilot in Agent mode. I enabled automatic approval of all commands, and then added a blacklist of commands that I always wanted to approve manually. I wrote an agent instruction that told it to keep working, and don't stop to ask questions. Worked well!

Emil used several different models - an advantage of working in VS Code Agent mode rather than a provider-locked coding agent like Claude Code or Codex CLI. Claude Sonnet 3.7, Gemini 3 Pro and Claude Opus all get a mention.

Vibe engineering, not vibe coding

What's most interesting about Emil's 17-step account covering those several months of work is how much software engineering was involved, independent of typing out the actual code.

I wrote about vibe engineering a while ago as an alternative to vibe coding. Vibe coding is when you have an LLM knock out code without any semblance of code review - great for prototypes and toy projects, definitely not an approach to use for serious libraries or production code. I proposed "vibe engineering" as the grown-up version of vibe coding, where expert programmers use coding agents in a professional and responsible way to produce high quality, reliable results.

You should absolutely read Emil's account in full. A few highlights:

- He hooked in the 9,200-test html5lib-tests conformance suite almost from the start. There's no better way to construct a new HTML5 parser than using the test suite that the browsers themselves use.
- He picked the core API design himself - a TagHandler base class with handle_start() etc. methods - and told the model to implement that.
- He added a comparative benchmark to track performance compared to existing libraries like html5lib, then experimented with a Rust optimization based on those initial numbers.
- He threw the original code away and started from scratch as a rough port of Servo's excellent html5ever Rust library.
- He built a custom profiler and new benchmark and let Gemini 3 Pro loose on it, finally achieving micro-optimizations to beat the existing pure Python libraries.
- He used coverage to identify and remove unnecessary code.
- He had his agent build a custom fuzzer to generate vast numbers of invalid HTML documents and harden the parser against them (a generic sketch of that pattern follows at the end of this post).

This represents a lot of sophisticated development practices, tapping into Emil's deep experience as a software engineer. As described, this feels to me more like a lead architect role than a hands-on coding one. It perfectly fits what I was thinking about when I described vibe engineering.

Setting the coding agent up with the html5lib-tests suite is also a great example of designing an agentic loop.

"The agent did the typing"

Emil concluded his article like this:

> JustHTML is about 3,000 lines of Python with 8,500+ tests passing. I couldn't have written it this quickly without the agent. But "quickly" doesn't mean "without thinking." I spent a lot of time reviewing code, making design decisions, and steering the agent in the right direction. The agent did the typing; I did the thinking. That's probably the right division of labor.

I couldn't agree more. Coding agents replace the part of my job that involves typing the code into a computer. I find what's left to be a much more valuable use of my time.

Tags: html, python, ai, generative-ai, llms, ai-assisted-programming, vibe-coding, coding-agents
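The fuzzing step mentioned above is a pattern worth stealing for any parser project. Here's a generic sketch of the idea - not Emil's actual fuzzer, and using Python's built-in html.parser as a stand-in for whichever parser you're hardening: mutate a valid seed document at random and assert the parser survives whatever it's fed.

```python
# Generic HTML fuzzing sketch (not Emil's actual fuzzer): mutate a seed
# document at random and check that the parser never raises, however
# broken the input. html.parser stands in for the parser under test.
import random
from html.parser import HTMLParser

SEED = "<!DOCTYPE html><html><body><p class='x'>Hello <b>world</b></p></body></html>"
ALPHABET = list("<>&\"'=/abc ")

def mutate(doc: str, rng: random.Random) -> str:
    chars = list(doc)
    for _ in range(rng.randint(1, 10)):
        op = rng.choice(("insert", "delete", "replace"))
        pos = rng.randrange(len(chars)) if chars else 0
        if op == "insert":
            chars.insert(pos, rng.choice(ALPHABET))
        elif op == "delete" and chars:
            del chars[pos]
        elif chars:
            chars[pos] = rng.choice(ALPHABET)
    return "".join(chars)

def main(iterations: int = 10_000) -> None:
    rng = random.Random(0)
    for _ in range(iterations):
        doc = mutate(SEED, rng)
        parser = HTMLParser()
        parser.feed(doc)   # must not raise, no matter how mangled the input
        parser.close()

if __name__ == "__main__":
    main()
```

In a real setup you would run this for far more iterations and save any input that crashes the parser as a permanent regression test.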
OpenAI takes an ownership stake in Thrive Holdings to accelerate enterprise AI adoption, embedding frontier research and engineering directly into accounting and IT services to boost speed, accuracy, and efficiency while creating a scalable model for industry-wide transformation.
How Stripe built a payments foundation model, why stablecoins are powering more of the AI economy, and growing internal AI adoption to 8,500 employees daily.