Ollama
Run large language models locally. One command pulls and runs any open model, on Apple Silicon or NVIDIA CUDA GPUs.
Why it matters
Ollama is the de facto standard runtime for local LLMs. `ollama run qwen2.5-coder:7b-q4_K_M` fits in 8 GB of VRAM and can deliver 40+ tokens/sec on coding tasks. Prefer Q4_K_M or Q5_K_M quantization for the best quality/speed tradeoff.
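Beyond the CLI, Ollama exposes a local HTTP API (by default at `http://localhost:11434`). A minimal sketch of a non-streaming completion request, assuming the model above has already been pulled with `ollama pull`:

```python
import json

# Default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a single non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_request("qwen2.5-coder:7b-q4_K_M",
                        "Write a Python hello world.")
body = json.dumps(payload)

# To actually send it (requires a running Ollama server):
#   import urllib.request
#   req = urllib.request.Request(
#       OLLAMA_URL, data=body.encode(),
#       headers={"Content-Type": "application/json"})
#   print(json.loads(urllib.request.urlopen(req).read())["response"])
```

With `"stream": False` the server returns one JSON object whose `response` field holds the full completion; omit it (streaming is the default) to receive newline-delimited JSON chunks instead.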
Alternatives in Local Dev
Securely manage secrets, SSH keys, and certificates directly from the terminal. The dev-team standard for credential security.
Open-source secrets management platform. Sync secrets across environments and inject them into local dev automatically.
High-performance agent sandbox for secure autonomous coding. VM-based with instant startup, stateful forking, and full system compatibility. Top HN Show April 2026.