Jan is a free, open-source desktop app that lets you run AI models locally on your machine with an OpenAI-compatible API. That means any code using the OpenAI SDK works with Jan — just change the base URL.
No API keys. No usage fees. No data leaving your machine.
## Why Use Jan?
- 100% local — your data never leaves your computer
- OpenAI-compatible — drop-in replacement for OpenAI API
- GPU accelerated — uses your NVIDIA/AMD/Apple Silicon GPU
- Model hub — download Llama, Mistral, Phi, Gemma with one click
- Free forever — no subscription, no token limits
## Quick Setup
### 1. Install Jan

Download from jan.ai for Mac, Windows, or Linux. Or via CLI:

```bash
# macOS
brew install --cask jan

# Linux: download the AppImage
wget https://github.com/janhq/jan/releases/latest/download/jan-linux-x86_64.AppImage
chmod +x jan-linux-x86_64.AppImage
```
### 2. Download a Model
In Jan UI: Hub → Search for model → Download
Popular choices:
- Llama 3.1 8B — great general-purpose (needs 8GB RAM)
- Mistral 7B — fast and capable
- Phi-3 Mini — Microsoft's small but powerful model (4GB RAM)
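Before downloading, it's worth checking that your machine has enough memory for the model you pick. A rough POSIX-only sketch (the RAM figures are the guideline numbers from the list above; the 8 GB figure for Mistral 7B is my assumption, not stated by Jan):

```python
import os

# Rough RAM needed per model, in GB (Mistral's figure is an assumption).
MODEL_RAM_GB = {
    "Llama 3.1 8B": 8,
    "Mistral 7B": 8,
    "Phi-3 Mini": 4,
}

def total_ram_gb() -> float:
    """Total physical RAM in GB (POSIX only: uses os.sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3

def models_that_fit() -> list[str]:
    """Models whose rough RAM requirement fits in this machine's memory."""
    ram = total_ram_gb()
    return [name for name, need in MODEL_RAM_GB.items() if need <= ram]
```

On Windows you'd need a different memory query (e.g. via `ctypes`), but the idea is the same: leave headroom beyond the model's minimum, since the OS and Jan itself also need memory.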
### 3. Start the API Server

In Jan: Settings → Advanced → Enable API Server

Default endpoint: `http://localhost:1337/v1`
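Once the server is enabled, you can confirm it's reachable before wiring anything else up. A stdlib-only sketch that pings the `/models` endpoint:

```python
import urllib.request
import urllib.error

def server_is_up(base_url: str = "http://localhost:1337/v1",
                 timeout: float = 2.0) -> bool:
    """Return True if Jan's API server answers on /models, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns `False`, double-check that the API server toggle is on and that nothing else is using port 1337.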
### 4. Use with curl

```bash
# Chat completion (OpenAI-compatible)
curl -s http://localhost:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1-8b-instruct",
    "messages": [{"role": "user", "content": "Explain eBPF in 3 sentences"}],
    "temperature": 0.7,
    "max_tokens": 200
  }' | jq '.choices[0].message.content'

# List available models
curl -s http://localhost:1337/v1/models | jq '.data[] | .id'

# Embeddings
curl -s http://localhost:1337/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "web scraping best practices"}' \
  | jq '.data[0].embedding[:5]'
```
### 5. Use with the OpenAI Python SDK

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.1-8b-instruct",
    messages=[{"role": "user", "content": "Write a Python function to validate email addresses"}],
    temperature=0.3,
)

print(response.choices[0].message.content)
```
### 6. Streaming

```python
stream = client.chat.completions.create(
    model="llama3.1-8b-instruct",
    messages=[{"role": "user", "content": "List 5 web scraping best practices"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
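Often you want the full reply as a string as well as live output. A small helper that accumulates the deltas while printing them, assuming the chunk objects have the shape the OpenAI SDK uses (`chunk.choices[0].delta.content`):

```python
def collect_stream(stream) -> str:
    """Print each streamed delta as it arrives and return the full reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```

Call it as `reply = collect_stream(stream)` in place of the loop above.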
## Key Endpoints

| Endpoint | Description |
|---|---|
| `/v1/chat/completions` | Chat with an AI model |
| `/v1/completions` | Text completion |
| `/v1/models` | List available models |
| `/v1/embeddings` | Generate text embeddings |
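Because these endpoints speak plain JSON over HTTP, you don't strictly need the OpenAI SDK at all. A stdlib-only sketch of the chat endpoint (the model name is assumed to match whatever `/v1/models` reports on your machine):

```python
import json
import urllib.request

def chat_request(prompt: str,
                 model: str = "llama3.1-8b-instruct",
                 base_url: str = "http://localhost:1337/v1") -> urllib.request.Request:
    """Build a POST request for Jan's /v1/chat/completions endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(prompt: str, **kwargs) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(chat_request(prompt, **kwargs)) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Handy for scripts or containers where you'd rather avoid an extra dependency.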
## Jan vs Alternatives
| Feature | Jan | Ollama | LM Studio |
|---|---|---|---|
| Desktop UI | Yes | No (CLI) | Yes |
| OpenAI API | Yes | Yes | Yes |
| Extensions | Yes | No | No |
| Open source | Yes | Yes | No |
| GPU support | NVIDIA/AMD/Apple | NVIDIA/AMD/Apple | NVIDIA/Apple |
Need a custom data extraction or scraping solution? I build production-grade scrapers for any website. Email: Spinov001@gmail.com | My Apify Actors