✨ Why Ollama in 2026?
- 100% Free: No API costs, unlimited requests
- Privacy-First: Everything runs on your machine
- Latest Models: Llama 3.2, Mistral, Gemma, Phi-3
- LangChain Integration: Works with all major frameworks
What is Ollama?
Ollama lets you run large language models locally on your Mac, PC, or Linux machine. No API costs, no rate limits, complete privacy. Perfect for development, testing, and production workloads that need data privacy.
Step 1: Install Ollama
macOS:
# Download from ollama.com or use Homebrew
brew install ollama
# Start Ollama
ollama serve
Windows:
# Download installer from ollama.com
# Run installer
# Ollama starts automatically
Linux:
# One-line install
curl -fsSL https://ollama.com/install.sh | sh
# Start Ollama
ollama serve
Step 2: Download Models
Popular Models (2026):
# Llama 3.2 (Meta) - Best all-around
ollama pull llama3.2
# Llama 3.2 3B - Lightweight, fast
ollama pull llama3.2:3b
# Mistral - Great for reasoning
ollama pull mistral
# Gemma 2 (Google) - Creative tasks
ollama pull gemma2
# Phi-3 (Microsoft) - Code specialist
ollama pull phi3
# Nomic Embed - For RAG/vector search
ollama pull nomic-embed-text
Check Downloaded Models:
ollama list
Step 3: Test Ollama
Chat Mode:
# Start interactive chat
ollama run llama3.2
# Type your message, press Enter
# Type /bye to exit
Single Query:
# One-off query
ollama run llama3.2 "What is AI orchestration?"
Step 4: Use with LangChain
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Initialize local model
llm = ChatOllama(
model="llama3.2",
base_url="http://localhost:11434"
)
# Create prompt
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
("user", "{input}")
])
# Create chain
chain = prompt | llm | StrOutputParser()
# Run
response = chain.invoke({"input": "Explain AI orchestration"})
print(response)
🔧 Troubleshooting
Connection Refused
Make sure Ollama is running: ollama serve
Model Not Found
Download the model first: ollama pull llama3.2
Slow Performance
Try smaller models: llama3.2:3b or phi3
🎉 Ollama is Ready!
You now have free, private AI running locally. Use it with LangChain, CrewAI, or any orchestration framework. Check the AI Models page for more local model options.