Introduction
In the rapidly advancing AI landscape, deploying large language models (LLMs) such as Meta’s Llama 3, Google’s Gemma, and Mistral on local systems offers clear advantages for data privacy and customization. Enabling secure online access to these self-hosted tools extends their reach even further, whether for developers demonstrating prototypes, researchers collaborating remotely, or businesses integrating AI into customer-facing applications.
This comprehensive guide provides step-by-step instructions for securely sharing Ollama’s API and Open WebUI online using Pinggy, a simple tunneling service. Learn how to seamlessly make your local AI setup accessible worldwide without the need for cloud infrastructure or complex configurations.
Summary of the steps:
- Install Ollama & Download a Model: Get Ollama from ollama.com and run a model:
ollama run llama3:8b
- Deploy Open WebUI: Run it via Docker:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main
- Expose WebUI Online: Tunnel port 3000:
ssh -p 443 -R0:localhost:3000 a.pinggy.io
- Share the generated URL for ChatGPT-like access to your LLMs.
Why Share Ollama API and Open WebUI Online?
The Rise of Local AI Deployments:
Due to growing concerns about data privacy and API expenses, running LLMs locally using tools like Ollama and Open WebUI has become a popular choice. However, keeping access limited to your local network restricts their usability. Sharing these tools online enables:
- AI integration into web and mobile applications.
- Project demonstrations without cloud deployment.
- Lower latency while keeping inference local.
Why Use Pinggy for Tunneling?
Pinggy simplifies the process of port forwarding by providing secure tunnels. Its standout features include:
- Free HTTPS URLs without requiring signup.
- No rate limitations on the free plan.
- SSH-based encrypted connections for enhanced security.
Prerequisites for Sharing Ollama and Open WebUI
A. Install Ollama
- Download and install Ollama based on your operating system:
  - Windows: Run the .exe installer.
  - macOS/Linux: Execute:
curl -fsSL https://ollama.com/install.sh | sh
- Verify the installation:
ollama --version
B. Download a Model
Ollama supports a wide range of models. Start with a lightweight one:
ollama run qwen:0.5b
For multimodal models:
ollama run llava:13b
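Multimodal models can also be given a local image straight from the CLI by including the file path in the prompt. A minimal sketch, using a hypothetical image at ./example.png:
ollama run llava:13b "Describe the contents of ./example.png"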
C. Install Open WebUI
Open WebUI offers a ChatGPT-like interface for Ollama. Install it via Docker:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Access the interface at http://localhost:3000 and set up an admin account.
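Before tunneling anything, it is worth confirming the container started cleanly. A quick check, assuming the container name open-webui from the command above:
docker ps --filter name=open-webui
docker logs --tail 20 open-webui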
Sharing Ollama API Online: Detailed Steps
- Start Ollama Locally
By default, Ollama runs on port 11434. Launch the server:
ollama serve
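If Ollama was installed as a background service, this port may already be bound and the extra ollama serve step is unnecessary. Either way, a quick local check (assuming the default port 11434) confirms the API answers before you expose it:
curl http://localhost:11434/api/version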
- Create a Public URL with Pinggy
Run this SSH command to tunnel the Ollama API:
ssh -p 443 -R0:localhost:11434 -t qr@a.pinggy.io "u:Host:localhost:11434"
After executing, you will receive a URL such as https://abc123.pinggy.link. The "u:Host:localhost:11434" option tells Pinggy to rewrite the Host header to localhost:11434, which Ollama expects; without it, requests arriving through the tunnel are typically rejected.
- Verify API Access
Test the shared API using curl:
curl https://abc123.pinggy.link/api/tags
Alternatively, use a browser to verify access.
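Beyond listing models, you can issue a full generation request through the tunnel. A minimal sketch, reusing the example URL above and the llama3:8b model (substitute whichever model you actually pulled):
curl https://abc123.pinggy.link/api/generate -d '{
  "model": "llama3:8b",
  "prompt": "Explain retrieval-augmented generation in one sentence.",
  "stream": false
}'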
Sharing Open WebUI Online: Step-by-Step
- Expose Open WebUI via Pinggy
To share port 3000, execute:
ssh -p 443 -R0:localhost:3000 a.pinggy.io
You will receive a unique URL, such as https://xyz456.pinggy.link.
- Access WebUI Remotely
- Open the provided URL in a browser.
- Log in using your Open WebUI credentials.
- Utilize features such as:
- Chatting with various models.
- Uploading documents for Retrieval-Augmented Generation (RAG).
- Switching between different models.
Advanced Security and Optimization Tips
- Enhance Security
Add basic authentication to your Pinggy tunnel by appending username/password credentials:
ssh -p 443 -R0:localhost:3000 user:pass@a.pinggy.io
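If the tunnel enforces HTTP basic authentication, a browser will simply prompt for the credentials, but scripted clients have to send them explicitly. For example, assuming the user:pass credentials and the sample URL from earlier:
curl -u user:pass https://xyz456.pinggy.link/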
- Utilize Custom Domains
Upgrade to Pinggy Pro to configure custom domains:
ssh -p 443 -R0:localhost:3000 -T yourdomain.com@a.pinggy.io
Real-World Applications for Remote AI Access
Collaborative Development
- Share an Ollama instance for collaborative code reviews and documentation generation.
- Co-train custom models using Open WebUI.
Customer-Facing Applications
- Power AI-driven chatbots for enhanced customer support.
- Automate content generation for blogs and social media.
Academic and Research Projects
- Securely share proprietary models with research collaborators.
Troubleshooting Common Issues
Connection Refused
- Ensure Ollama is running with ollama serve.
- Check firewall settings for ports 11434 and 3000.
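To narrow down where the failure is, confirm that both services answer locally before suspecting the tunnel. A minimal sketch, assuming the default ports:
curl -s http://localhost:11434/api/tags
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000
The first command should return a JSON list of models; the second should print an HTTP status such as 200 if Open WebUI is up.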
Model Loading Failures
- Verify model compatibility with your current Ollama version.
- Free up system memory for larger models such as llama3:70b.
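Ollama's own CLI is useful here: ollama list shows which models (and their sizes) are on disk, while ollama ps shows what is currently loaded into memory:
ollama list
ollama ps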
Conclusion
By combining Ollama, Open WebUI, and Pinggy, you can transform your local AI environment into a secure, shareable platform without relying on cloud services. This setup caters perfectly to startups, researchers, and anyone prioritizing data privacy and performance.