How to Connect SillyTavern to a Private AI API (NanoGPT) — Setup Guide
SillyTavern is a customizable, open-source chat interface that runs locally on your machine. It supports a wide range of models and is pretty flexible. The catch is that most people connect it to APIs that log everything, which kind of defeats the purpose of running your own frontend.
This guide walks through connecting SillyTavern to NanoGPT, a privacy-focused API that speaks the OpenAI format. Your conversations stay private and you stay in control of your setup.
What We're Building
- SillyTavern running locally on your machine
- NanoGPT as the AI backend (OpenAI-compatible API)
- No data used for training
- Custom model and parameter configuration
Prerequisites
- Node.js 16+ (Node.js 20 recommended)
- Git
- A web browser
- A NanoGPT API key from nano-gpt.com
Step 1: Install SillyTavern
Clone the repo and start it up:
git clone https://github.com/SillyTavern/SillyTavern.git
cd SillyTavern
./start.sh
On Windows, use start.bat instead. First run installs dependencies automatically, so give it a minute.
Once it's running, open your browser to:
http://localhost:8000
You'll see the SillyTavern interface with a default character loaded.
Step 2: Open API Settings
In the top menu bar, click the Extensions icon (puzzle piece) or navigate to API Connections. You need to configure the Chat Completion source.
Most guides stop here.
Step 3: Configure the NanoGPT Backend
In the API settings panel, you'll see a dropdown for Chat Completion Source. Select Custom (OpenAI-compatible). This lets you point SillyTavern at any OpenAI-compatible endpoint.
Fill in the fields:
Custom Endpoint (Base URL): https://nano-gpt.com/api/v1
API Key: your-nanogpt-api-key
Then click Connect. If the connection works, the available models will show up in the model selector.
Why Custom (OpenAI-compatible)?
This means SillyTavern sends standard OpenAI-format requests to whatever endpoint you specify. NanoGPT speaks the same API language, so chat completions, streaming, and function calls all work out of the box.
Step 4: Select Your Model
After connecting, NanoGPT models will show up in the dropdown. Pick one based on your use case:
| Model | Best For | Context Window |
|---|---|---|
minimax/minimax-m2.7 |
General purpose, great balance | Large |
google/gemini-2.5-flash |
Fast responses, lower cost | Large |
For SillyTavern (character chats, roleplay, creative writing), I'd start with minimax/minimax-m2.7. It handles long conversations well and follows character descriptions reliably.
Step 5: Optimize Your Parameters
Tune these settings in SillyTavern's API settings:
Temperature
- 0.3–0.5: More focused, follows instructions closely. Good for code assistance.
- 0.7–0.9: More creative, varied responses. Great for character chats.
- 1.0+: Very random. Fun for brainstorming but can get weird.
I usually run at 0.8 for character chats and 0.4 for utility tasks.
Max Context
Set this to match your model's context window. Larger values let you have longer conversations before the AI "forgets" early messages.
Top-P and Top-K
If your model supports these parameters, they give you fine-grained control over response diversity. Start with defaults and adjust if responses feel too repetitive or too random.
Repetition Penalty
Set this between 1.0–1.2 to prevent the AI from repeating phrases. Higher values reduce repetition but can make responses feel less natural.
Step 6: Character Setup
This is where SillyTavern shines. Create or import a character card:
- Click the character name in the top-left
- Select Create Character or import a
.pngcard - Fill in the character description, personality, and scenario
NanoGPT models respond well to detailed character cards. The more specific you get with personality and speech patterns, the more consistent the responses.
System Prompt Configuration
In the character's advanced settings, you can add a system prompt. This is useful for setting the AI's behavior:
You are acting as {{char}}. Stay in character at all times.
Do not break the fourth wall. Respond as {{char}} would,
not as an AI assistant. Use {{char}}'s speech patterns
and knowledge base.
Step 7: Privacy Hardening
Since you're here for privacy, let's lock things down:
Disable Telemetry
SillyTavern doesn't phone home by default, but double-check:
config.yaml:
enableLogging: false
Use Local Storage Only
By default, SillyTavern stores everything locally in the SillyTavern/data/ directory. Your conversations, characters, and settings never leave your machine unless you explicitly configure an external API.
With NanoGPT as your backend, the only data that leaves your machine is the API request/response. NanoGPT doesn't log or retain that data for training.
Network Isolation
If you want to go full paranoia mode, run SillyTavern on a machine that can reach the NanoGPT API but nothing else. Use firewall rules or a VPN to restrict outbound connections:
# Only allow connections to nano-gpt.com
iptables -A OUTPUT -d nano-gpt.com -j ACCEPT
iptables -A OUTPUT -j DROP
API Key Security
Store your API key in an environment variable, not in the UI:
export NANOGPT_API_KEY="your-key-here"
Then reference it in SillyTavern's connection settings or use the .env file approach.
Step 8: Advanced — Multiple Backends
One thing I like about SillyTavern is that you can set up multiple API connections and switch between them on the fly. This is useful if you want:
- A fast model for quick questions
- A larger model for deeper conversations
- Different models for different characters
Configure each backend in the API connections panel, and use the model selector in the chat interface to switch.
You could run NanoGPT alongside a local Ollama instance too: cloud for when quality matters, local for when privacy is paramount.
Step 9: Backups
Don't forget to back up your data. Everything lives in the SillyTavern directory:
# Simple backup
tar -czf sillytavern-backup-$(date +%Y%m%d).tar.gz SillyTavern/data/
# Or set up a cron job for automatic backups
0 2 * * * tar -czf /backups/sillytavern-$(date +\%Y\%m\%d).tar.gz /path/to/SillyTavern/data/
Your character cards, conversation history, and settings are all in there.
Why This Setup Matters
Running your own frontend means nothing if your API provider is vacuuming up every message. Pairing SillyTavern with a privacy-focused backend like NanoGPT keeps you in control.
You get to choose where your data goes. That's kind of the whole point.
Originally published at ai-privacy-tools.vercel.app
Top comments (0)