SillyTavern Backend Setup Guide: Connect Local Ollama & Cloud APIs
SillyTavern Backend Setup Guide: Connect Local Ollama & Cloud APIs
If you are serious about local adult AI roleplay and companionship, the default terminal interfaces or standard cloud wrappers quickly feel highly limiting.
SillyTavern is the undisputed king of NSFW AI frontend shells. It runs locally in your browser, providing a breathtaking, gaming-styled interface with support for character avatar cards, vector-based long-term memory databases, custom expression sprites, text-to-speech outputs, and world-building lore books.
However, SillyTavern is just a frontend "shell"—it contains no AI model inside itself. To write responses, you must connect it to a hosting processor.
This SillyTavern backend setup guide explains how to bind SillyTavern to local runners (like Ollama and KoboldCpp) and cloud API aggregators (like OpenRouter) for a fully customized, 100% uncensored roleplay experience.
What is SillyTavern?
Unlike standard chatbot sites, SillyTavern keeps your data completely private on your machine while allowing you to switch backends with a single click.
It handles advanced prompt engineering dynamically behind the scenes, injecting custom context, memory buffers, and story variables into whichever model you are running.
Step 1: Install SillyTavern Locally
SillyTavern requires Node.js to compile and run its local web server.
- Download and install the latest LTS version of Node.js from
nodejs.org. - Download SillyTavern from the official GitHub repository:
- Method A (Recommended): If you have Git installed, open your command prompt/terminal, navigate to your desired install directory, and run:
git clone https://github.com/SillyTavern/SillyTavern.git - Method B: Visit
github.com/SillyTavern/SillyTavern, click the green "Code" button, select "Download ZIP", and extract the folder to your computer.
- Method A (Recommended): If you have Git installed, open your command prompt/terminal, navigate to your desired install directory, and run:
- Once downloaded, navigate inside the folder:
- Windows: Double-click the file named
start.bat. - macOS/Linux: Open terminal in the directory and run
./start.sh.
- Windows: Double-click the file named
- Your browser will automatically launch a new tab pointing to the local host address:
http://127.0.0.1:8000
Step 2: Bind SillyTavern to a Local Ollama Backend
Ollama is the easiest runner to deploy highly advanced uncensored models on your machine. To learn how to download and optimize Dolphin Llama 3 for this setup, check out our Dolphin Llama 3 on Ollama deployment guide.
1. Enable Ollama Network Access
By default, Ollama only listens to local requests. To allow SillyTavern to talk to it, you must configure the environment variable:
- Windows: Close Ollama from your taskbar. Open your Start Menu, search for "Environment Variables", click "Edit the system environment variables". In the window, click "Environment Variables" at the bottom. Under "User variables", click "New". Set Variable name to
OLLAMA_ORIGINSand Variable value to*. Click OK and launch Ollama again. - macOS/Linux: Start Ollama from your terminal using:
OLLAMA_ORIGINS="*" ollama serve
2. Configure the API Connection in SillyTavern
- In the top-right corner of the SillyTavern browser interface, click the API Connections icon (looks like a plug).
- Set the API dropdown to
Ollama. - In the Ollama Server URL field, ensure the default port is populated:
http://127.0.0.1:11434
- Click the Connect button. A green indicator will appear showing a successful binding.
- In the Model dropdown, select your downloaded model (e.g.
dolphin-llama3:latestor your compiled custom assistant).
Step 3: Bind SillyTavern to KoboldCpp (GGUF Models)
KoboldCpp is a highly popular alternative local runner that specializes in executing GGUF format files. It is highly optimized for splitting model execution between system RAM and GPU VRAM, allowing you to run larger models than your VRAM would normally support.
- Download the latest version of
koboldcpp.exefrom GitHub. - Download an uncensored GGUF model file from Hugging Face (such as a Mistral-based or Llama-based 8B model).
- Launch KoboldCpp, select your model file, adjust your GPU layers to offload computation, and click Launch. The server will boot and display:
http://localhost:5001
- In SillyTavern, open the API Connections panel, set API to
Text Completion, select API Type asKoboldCpp, enterhttp://127.0.0.1:5001in the connection box, and click Connect.
Step 4: Connecting Cloud-Based API Aggregators (OpenRouter)
If your local computer doesn't have a dedicated graphics card, you can still experience premium, completely uncensored roleplay by routing SillyTavern through OpenRouter. OpenRouter acts as a unified hub allowing you to pay pennies per million tokens to stream high-end models hosted in the cloud.
- Create a free account at
openrouter.aiand generate an API key. - In SillyTavern's API Connections panel, select API as
Chat Completionand set API Type toOpenRouter. - Paste your generated API key into the API Key input box.
- Select your target uncensored model from the list (such as Command R+, Psyfighter, or Fimbulvetr).
Step 5: Importing and Customizing Character Lore Cards
With your backend successfully bound, you can now load interactive characters:
- Click the Characters icon in the SillyTavern header.
- You can download and import character files (commonly saved as PNG images containing embedded metadata, known as V2 Character Cards) from online directories like Chub.ai or character catalogs.
- Drag and drop the PNG avatar directly into the SillyTavern window to import it instantly.
- Click on the character card to customize their permanent prompt settings, relationship level, and first message prompts.
API Comparison Reference Table
To help configure your ideal system configuration, use this comparative guide:
| Backend Provider | Execution Type | Cost | Privacy | Best Model Candidate |
|---|---|---|---|---|
| Ollama | Local | 100% Free | Absolute (No logs) | dolphin-llama3 (8B / 70B) |
| KoboldCpp | Local | 100% Free | Absolute (No logs) | psyfighter-8b |
| OpenRouter | Cloud | Pay-per-token | Subject to Host Policy | cohere/command-r-plus |
| NovelAI | Cloud | Subscription | Encrypted | Kayra |
For complete catalog listings of other desktop software setups, adult companion platforms, and generative suites, visit our full Uncensored Open-Source LLMs category.
