
Self-Host Ollama with Open WebUI Online

Introduction

In the rapidly advancing AI landscape, running large language models (LLMs) such as Meta’s Llama 3, Google’s Gemma, and Mistral on local systems offers clear advantages for data privacy and customization. Making those local tools securely accessible online multiplies their usefulness: developers can demo prototypes, researchers can collaborate remotely, and businesses can integrate AI into customer-facing applications.

This guide provides step-by-step instructions for securely sharing Ollama’s API and Open WebUI online using Pinggy, a simple tunneling service. You will learn how to make your local AI setup accessible worldwide without cloud infrastructure or complex configuration.

Summary of the steps:

  1. Install Ollama & Download a Model:

    • Get Ollama from ollama.com and run a model:
    ollama run llama3:8b
    
    
  2. Deploy Open WebUI

    • Run via Docker
    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main
    
    
  3. Expose WebUI Online

    • Tunnel port 3000:
    ssh -p 443 -R0:localhost:3000 a.pinggy.io
    
    

Share the generated URL for ChatGPT-like access to your LLMs.

Why Share Ollama API and Open WebUI Online?

The Rise of Local AI Deployments:

Due to growing concerns about data privacy and API expenses, running LLMs locally using tools like Ollama and Open WebUI has become a popular choice. However, keeping access limited to your local network restricts their usability. Sharing these tools online enables:

  • AI integration into web and mobile applications.
  • Project demonstrations without cloud deployment.
  • Remote access from anywhere while inference stays on your own hardware.

Why Use Pinggy for Tunneling?

Pinggy simplifies the process of port forwarding by providing secure tunnels. Its standout features include:

  • Free HTTPS URLs without requiring signup.
  • No rate limitations on the free plan.
  • SSH-based encrypted connections for enhanced security.

Prerequisites for Sharing Ollama and Open WebUI

A. Install Ollama

  1. Download and install Ollama based on your operating system:

    • Windows: Run the .exe installer.
    • macOS/Linux: Execute:
     curl -fsSL https://ollama.com/install.sh | sh
    
  2. Verify the installation:

    ollama --version
    

B. Download a Model

Ollama supports a wide range of models. Start with a lightweight one:

ollama run qwen:0.5b

For multimodal models:

ollama run llava:13b
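Once a model has been pulled, ollama run also accepts a one-shot prompt as an argument, which is handy for quick scripted checks (the model name here is just the lightweight example from above):

ollama run qwen:0.5b "Explain SSH tunneling in one sentence."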

C. Install Open WebUI

Open WebUI offers a ChatGPT-like interface for Ollama. Install it via Docker:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main


Access the interface at http://localhost:3000 and set up an admin account.
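If the WebUI starts but shows no models, the container may not be reaching Ollama on the host. Open WebUI supports an OLLAMA_BASE_URL environment variable; a variant of the command above that sets it explicitly (assuming Ollama listens on its default port 11434) looks like this:

# Same as above, but pointing Open WebUI at the host's Ollama explicitly
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main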


Sharing Ollama API Online: Detailed Steps

  1. Start Ollama Locally

    By default, Ollama runs on port 11434. Launch the server:

    ollama serve
    
  2. Create a Public URL with Pinggy

    Run this SSH command to tunnel the Ollama API:

    ssh -p 443 -R0:localhost:11434 -t qr@a.pinggy.io "u:Host:localhost:11434"
    
    

    After executing, you will receive a URL such as https://abc123.pinggy.link.

  3. Verify API Access

    Test the shared API using curl:

    curl https://abc123.pinggy.link/api/tags
    

Alternatively, use a browser to verify access.
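Beyond listing models, the same tunnel exposes the full Ollama API, so remote clients can run generations too. A minimal sketch using the placeholder URL from above (it assumes llama3:8b has already been pulled locally):

# Request a completion through the public tunnel
curl https://abc123.pinggy.link/api/generate -d '{
  "model": "llama3:8b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'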


Sharing Open WebUI Online: Step-by-Step

  1. Expose Open WebUI via Pinggy

    To share port 3000, execute:

    ssh -p 443 -R0:localhost:3000 a.pinggy.io
    
    

    You will receive a unique URL, such as
    https://xyz456.pinggy.link.

  2. Access WebUI Remotely

    1. Open the provided URL in a browser.
    2. Log in using your Open WebUI credentials.
    3. Utilize features such as:
      • Chatting with various models.
      • Uploading documents for Retrieval-Augmented Generation (RAG).
      • Switching between different models.

Advanced Security and Optimization Tips

  1. Enhance Security

    Add basic authentication to your Pinggy tunnel by appending username/password credentials (a quick verification check follows this list):

    ssh -p 443 -R0:localhost:3000 user:pass@a.pinggy.io
    
  2. Utilize Custom Domains

    Upgrade to Pinggy Pro to configure custom domains:

   ssh -p 443 -R0:localhost:3000 -T yourdomain.com@a.pinggy.io
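To confirm that the basic authentication from step 1 is actually enforced, a quick check with curl (using the placeholder URL and credentials from above) should fail without credentials and succeed with them:

# Expect 401 Unauthorized without credentials
curl -I https://xyz456.pinggy.link

# Expect 200 OK once the username/password are supplied
curl -I -u user:pass https://xyz456.pinggy.link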

Real-World Applications for Remote AI Access

Collaborative Development

  • Share an Ollama instance for collaborative code reviews and documentation generation.
  • Co-train custom models using Open WebUI.

Customer-Facing Applications

  • Power AI-driven chatbots for enhanced customer support.
  • Automate content generation for blogs and social media.

Academic and Research Projects

  • Securely share proprietary models with research collaborators.

Troubleshooting Common Issues

Connection Refused

  • Ensure Ollama is running with ollama serve.
  • Check firewall settings for ports 11434 and 3000.
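Before debugging the tunnel itself, confirm both services answer locally on their default ports; if these checks fail on the machine itself, the problem is local rather than with Pinggy:

# Ollama API should return the installed model list
curl http://localhost:11434/api/tags

# Open WebUI should return an HTTP response
curl -I http://localhost:3000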

Model Loading Failures

  • Verify model compatibility with your current Ollama version.
  • Free up system memory for larger models such as llama3:70b.
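Two quick commands help narrow this down (ollama ps requires a reasonably recent Ollama release): check which models are actually downloaded, and which are currently loaded into memory:

# Models available on disk
ollama list

# Models currently loaded and their memory footprint
ollama ps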

Conclusion

By combining Ollama, Open WebUI, and Pinggy, you can transform your local AI environment into a secure, shareable platform without relying on cloud services. This setup caters perfectly to startups, researchers, and anyone prioritizing data privacy and performance.


Top comments (2)

artydev: Thank you :-)

R Jones: Thanks for this.
