SillyTavern Backend Setup Guide: Connect Local Ollama & Cloud APIs

If you are serious about local adult AI roleplay and companionship, the default terminal interfaces or standard cloud wrappers quickly feel highly limiting.

SillyTavern is the undisputed king of NSFW AI frontend shells. It runs locally in your browser, providing a breathtaking, gaming-styled interface with support for character avatar cards, vector-based long-term memory databases, custom expression sprites, text-to-speech outputs, and world-building lore books.

However, SillyTavern is just a frontend "shell"—it contains no AI model inside itself. To write responses, you must connect it to a hosting processor.

This SillyTavern backend setup guide explains how to bind SillyTavern to local runners (like Ollama and KoboldCpp) and cloud API aggregators (like OpenRouter) for a fully customized, 100% uncensored roleplay experience.

What is SillyTavern?

Unlike standard chatbot sites, SillyTavern keeps your data completely private on your machine while allowing you to switch backends with a single click.

It handles advanced prompt engineering dynamically behind the scenes, injecting custom context, memory buffers, and story variables into whichever model you are running.

Step 1: Install SillyTavern Locally

SillyTavern requires Node.js to compile and run its local web server.

Download and install the latest LTS version of Node.js from nodejs.org.
Download SillyTavern from the official GitHub repository:
- Method A (Recommended): If you have Git installed, open your command prompt/terminal, navigate to your desired install directory, and run:
```
git clone https://github.com/SillyTavern/SillyTavern.git
```
- Method B: Visit github.com/SillyTavern/SillyTavern, click the green "Code" button, select "Download ZIP", and extract the folder to your computer.
Once downloaded, navigate inside the folder:
- Windows: Double-click the file named start.bat.
- macOS/Linux: Open terminal in the directory and run ./start.sh.
Your browser will automatically launch a new tab pointing to the local host address:
- http://127.0.0.1:8000

Step 2: Bind SillyTavern to a Local Ollama Backend

Ollama is the easiest runner to deploy highly advanced uncensored models on your machine. To learn how to download and optimize Dolphin Llama 3 for this setup, check out our Dolphin Llama 3 on Ollama deployment guide.

1. Enable Ollama Network Access

By default, Ollama only listens to local requests. To allow SillyTavern to talk to it, you must configure the environment variable:

Windows: Close Ollama from your taskbar. Open your Start Menu, search for "Environment Variables", click "Edit the system environment variables". In the window, click "Environment Variables" at the bottom. Under "User variables", click "New". Set Variable name to OLLAMA_ORIGINS and Variable value to *. Click OK and launch Ollama again.
macOS/Linux: Start Ollama from your terminal using:
```
OLLAMA_ORIGINS="*" ollama serve
```

2. Configure the API Connection in SillyTavern

In the top-right corner of the SillyTavern browser interface, click the API Connections icon (looks like a plug).
Set the API dropdown to Ollama.
In the Ollama Server URL field, ensure the default port is populated:
- http://127.0.0.1:11434
Click the Connect button. A green indicator will appear showing a successful binding.
In the Model dropdown, select your downloaded model (e.g. dolphin-llama3:latest or your compiled custom assistant).

Step 3: Bind SillyTavern to KoboldCpp (GGUF Models)

KoboldCpp is a highly popular alternative local runner that specializes in executing GGUF format files. It is highly optimized for splitting model execution between system RAM and GPU VRAM, allowing you to run larger models than your VRAM would normally support.

Download the latest version of koboldcpp.exe from GitHub.
Download an uncensored GGUF model file from Hugging Face (such as a Mistral-based or Llama-based 8B model).
Launch KoboldCpp, select your model file, adjust your GPU layers to offload computation, and click Launch. The server will boot and display:
- http://localhost:5001
In SillyTavern, open the API Connections panel, set API to Text Completion, select API Type as KoboldCpp, enter http://127.0.0.1:5001 in the connection box, and click Connect.

Step 4: Connecting Cloud-Based API Aggregators (OpenRouter)

If your local computer doesn't have a dedicated graphics card, you can still experience premium, completely uncensored roleplay by routing SillyTavern through OpenRouter. OpenRouter acts as a unified hub allowing you to pay pennies per million tokens to stream high-end models hosted in the cloud.

Create a free account at openrouter.ai and generate an API key.
In SillyTavern's API Connections panel, select API as Chat Completion and set API Type to OpenRouter.
Paste your generated API key into the API Key input box.
Select your target uncensored model from the list (such as Command R+, Psyfighter, or Fimbulvetr).

Step 5: Importing and Customizing Character Lore Cards

With your backend successfully bound, you can now load interactive characters:

Click the Characters icon in the SillyTavern header.
You can download and import character files (commonly saved as PNG images containing embedded metadata, known as V2 Character Cards) from online directories like Chub.ai or character catalogs.
Drag and drop the PNG avatar directly into the SillyTavern window to import it instantly.
Click on the character card to customize their permanent prompt settings, relationship level, and first message prompts.

API Comparison Reference Table

To help configure your ideal system configuration, use this comparative guide:

| Backend Provider | Execution Type | Cost | Privacy | Best Model Candidate | |---|---|---|---|---| | Ollama | Local | 100% Free | Absolute (No logs) | dolphin-llama3 (8B / 70B) | | KoboldCpp | Local | 100% Free | Absolute (No logs) | psyfighter-8b | | OpenRouter | Cloud | Pay-per-token | Subject to Host Policy | cohere/command-r-plus | | NovelAI | Cloud | Subscription | Encrypted | Kayra |

For complete catalog listings of other desktop software setups, adult companion platforms, and generative suites, visit our full Uncensored Open-Source LLMs category.