Alibaba · AI Models · META Tier

Qwen2.5-Coder

Best local coding model for Ollama. Tops HumanEval among open models at the 7B-32B scale. Run the Q4_K_M quant on 8 GB VRAM at 40+ tokens/sec. The indie self-host meta.

llm · local · ollama · coding · qwen · open-source · free

Why it matters

Qwen2.5-Coder is the top local coding model for Ollama. Run it with `ollama run qwen2.5-coder:7b-q4_K_M` on 8 GB of VRAM. It tops HumanEval at the 7B-14B scale and is the best pick for a fully local agentic stack with Goose.
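
Once the model is pulled, you can hit Ollama's local REST endpoint directly instead of the CLI. A minimal stdlib-only sketch, assuming Ollama's default port 11434 and the `qwen2.5-coder:7b` tag; swap in whichever quant tag you pulled:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks for a single JSON response instead of a JSONL stream
    return {"model": model, "prompt": prompt, "stream": False}

def complete(model: str, prompt: str) -> str:
    # Requires a running Ollama daemon with the model already pulled,
    # e.g. `ollama pull qwen2.5-coder:7b`
    body = json.dumps(build_payload(model, prompt)).encode()
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (needs the daemon running):
# complete("qwen2.5-coder:7b", "Write a Python one-liner that reverses a string.")
```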

Specifications

Local: Ollama
Quant: Q4_K_M / Q5_K_M
VRAM: 8-16 GB
Speed: 40+ tokens/sec
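
The VRAM figures follow from simple arithmetic: file size is roughly parameters times bits per weight. A back-of-envelope sketch, assuming ~4.8 bits/weight as a rough average for Q4_K_M (real GGUF files add some metadata, and the KV cache needs VRAM on top):

```python
def gguf_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough quantized model file size in GB: parameters x bits per weight.

    Ignores GGUF metadata overhead and runtime KV-cache memory.
    """
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# A 7B model at ~4.8 bits/weight is about 4.2 GB on disk, which is why it
# fits an 8 GB card with headroom for context; 14B lands near 8.4 GB,
# hence the 16 GB upper bound above.
```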

Strengths

  • Completely free — run on your own hardware
  • Tops HumanEval at 7B-14B scale, punches well above its weight
  • Q4_K_M quantization keeps ~95-97% of full-precision quality in 4-8 GB of RAM at 40+ tokens/sec
  • Pairs with Goose/Aider for a fully local agentic stack
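
Agent tools like Goose and Aider talk to models over an OpenAI-style chat endpoint, and Ollama serves one at `/v1/chat/completions`, which is what makes the fully local stack possible. A minimal stdlib sketch of that wiring, assuming the default port and the 7B tag:

```python
import json
from urllib import request

CHAT_URL = "http://localhost:11434/v1/chat/completions"  # Ollama's OpenAI-compatible route

def chat_payload(model: str, user_msg: str) -> dict:
    # Same request shape an OpenAI-style client (Goose, Aider, ...) sends
    return {"model": model, "messages": [{"role": "user", "content": user_msg}]}

def chat(model: str, user_msg: str) -> str:
    # Needs the Ollama daemon running with the model pulled
    body = json.dumps(chat_payload(model, user_msg)).encode()
    req = request.Request(
        CHAT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Pointing an agent tool's OpenAI base URL at `http://localhost:11434/v1` is the same trick, just done in its config instead of code.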

Trade-offs

  • Slower than cloud APIs on commodity hardware
  • Needs 8GB+ VRAM for good performance
