Compare and vote on the best tools for running LLMs locally — from desktop apps and CLI tools to web UIs and inference frameworks. Community-ranked by developers.

10 tools
| Name | Category | Subscription | Score |
|---|---|---|---|
| O | CLI Tool | $0 | 0 |
| L | Desktop App | $0 | 0 |
| J | Desktop App | $0 | 0 |
| A | Web UI | $0 · From $6.99/mo | 0 |
| G | Desktop App | $0 | 0 |
| l | Framework | $0 | 0 |
| O | Web UI | $0 | 0 |
| M | Desktop App | $0 · $9.99/mo | 0 |
| L | Framework | $0 | 0 |
| P | Desktop App | $0 | 0 |
A local LLM is a large language model that runs entirely on your own hardware — your laptop, desktop, or home server — instead of through a cloud API. By running models locally, you get complete privacy (no data leaves your machine), zero API costs, offline access, and full control over which models you use and how they're configured.
Thanks to advances in model quantization (GGUF, GPTQ) and efficient inference engines like llama.cpp, it's now practical to run capable models on consumer hardware. Tools like Ollama, LM Studio, and Jan make the process as simple as downloading an app and picking a model — no machine learning expertise required.
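To show what "running locally" looks like in practice, here is a minimal Python sketch that sends a prompt to an Ollama server running on the same machine. It assumes Ollama is installed, its default local API is listening on port 11434, and a model (here llama3.2, an assumption; substitute whatever you have pulled with `ollama pull`) is already downloaded.

```python
import json
import urllib.request

# Assumes Ollama is running locally on its default port (11434) and the
# model named below has already been pulled, e.g. `ollama pull llama3.2`.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_local_model(prompt: str, model: str = "llama3.2") -> str:
    """Send a single prompt to the local Ollama server and return its reply."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete response instead of a token stream
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]

if __name__ == "__main__":
    print(ask_local_model("Explain GGUF quantization in one sentence."))
```

Because the request goes over localhost, the prompt and the model's reply never leave the machine.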
The right tool depends on your technical comfort level and use case: desktop apps like LM Studio and Jan offer the easiest point-and-click experience, CLI tools like Ollama suit developers who want scripting and automation, web UIs work well for self-hosted setups you access from a browser, and frameworks like llama.cpp are for building local inference into your own applications.
The hardware you need depends on the model size. As a rough guide for quantized (Q4) models, a 7B model runs comfortably on a machine with 8GB of RAM, a 13B model wants about 16GB, a 30B model about 24-32GB, and a 70B model roughly 48-64GB.
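If you want to estimate requirements for a size not listed above, the back-of-the-envelope arithmetic is simple: weight memory is roughly the parameter count times bits per weight divided by 8, plus headroom for the KV cache and runtime buffers. The sketch below encodes that rule of thumb; the 4.5 bits-per-weight and 20% overhead figures are assumptions typical of Q4-style quantization, not exact numbers for any particular model or runtime.

```python
def estimate_model_ram_gb(params_billion: float,
                          bits_per_weight: float = 4.5,
                          overhead: float = 1.2) -> float:
    """Rough memory footprint of a quantized model of a given size."""
    # Weights: parameter count times bits per weight, converted to bytes.
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    # Overhead factor is an assumed allowance for the KV cache and buffers.
    return weight_bytes * overhead / 1e9

if __name__ == "__main__":
    for size in (7, 13, 30, 70):
        footprint = estimate_model_ram_gb(size)
        print(f"{size}B parameters at ~Q4: about {footprint:.0f} GB just for the model")
```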
Apple Silicon Macs (M1/M2/M3/M4) are particularly well-suited for local LLMs thanks to their unified memory architecture, which lets the GPU draw on system RAM directly rather than being limited by a separate pool of VRAM. A MacBook Pro with 32GB can comfortably run quantized 30B models.