TLDR - successfully got Ollama running locally and through Google Colab using ngrok and my VPS.
Intro
As usual, I have gotten massively sidetracked on an interesting web development problem. It all started when I decided to update my yet-to-be-deployed developer profile site. I remembered I had created a small memory card game in vanilla JS, and I wanted to share it on my site as well. However, I knew it needed more work. Once I figured out which one of my devices it was saved on, I quickly pushed it to GitHub.
A Decision was Made
Upon reviewing my tiny game that was over a year old, I decided it needed a massive facelift. I had created it just to learn JavaScript in a more fun, gamified way, so it was simple and basic. It lacked any real functionality at all besides playing the same game over and over with the tiles in different places each time. My significant other kept talking about Ollama and how I should use it, and finally I came up with this plan of using AI to make my tiles.
View of original game:
I took my original code with me downtown to Google AI Studio, where I fed Gemini the code and explained what I wanted to happen:
- Use my existing code and rebuild it to be mobile-first, using React/Vite.
- Add functionality by creating a PDF/text loader area so that a user can generate their own custom flashcards using AI.
- Allow a user to select the number of flashcards.
- Generate a practice test with review and print options.
- Play their own memory game based on their flashcards.
I tend to run out of my free tiers in about 30 minutes. For this annoying reason, I decided it was time to listen to my significant other and try Ollama locally. I downloaded Ollama and started asking Mr. Ollama questions right away about how I might use it to create questions from text/PDFs. I needed to choose a model, and Ollama informed me that Mistral was performant and efficient, so I chose to go with that.
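For anyone curious what "asking Mr. Ollama" looks like from code, Ollama exposes a local HTTP API on port 11434 and a `/api/generate` endpoint. Here is a rough sketch of the kind of request an app like mine ends up building; the prompt wording and the `build_flashcard_payload` helper name are just my illustration, not part of Ollama itself:

```python
import json
from urllib import request


def build_flashcard_payload(text: str, num_cards: int) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    prompt = (
        f"Create {num_cards} question/answer flashcards "
        f"from the following text:\n\n{text}"
    )
    # stream=False asks Ollama for one complete JSON response
    return {"model": "mistral", "prompt": prompt, "stream": False}


def ask_ollama(payload: dict,
               url: str = "http://localhost:11434/api/generate") -> str:
    """POST the payload to a running Ollama server and return its reply text."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


payload = build_flashcard_payload("Mitochondria are the powerhouse of the cell.", 3)
# answer = ask_ollama(payload)  # needs `ollama serve` + `ollama pull mistral` running
```

The actual call is commented out since it needs a live Ollama server; the payload shape is the part worth seeing.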
After many revisions and rounds of error handling, I was able to get something minimally functional.
I got the PDFs uploading and sending a prompt to the AI for questions and flashcards. I then started to modify the styling, since it was massively bothering me.
I then moved on to test generation by having the AI interpret the provided text and create questions for the users, as well as putting questions and answers into a matching game, modeled after my first idea. It was ugly at first, but I was so excited when I got it working and producing the tests and games as I wanted. From a functionality standpoint, this was a win.
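To put questions and answers into a game, the model's reply has to be turned into pairs somehow. A minimal sketch, assuming the model is asked to reply with alternating `Q:`/`A:` lines (that format is my assumption for illustration, not necessarily what my app does verbatim):

```python
def parse_flashcards(raw: str) -> list[tuple[str, str]]:
    """Pair up 'Q:'/'A:' lines from a model reply into (question, answer) cards."""
    cards, question = [], None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            cards.append((question, line[2:].strip()))
            question = None  # wait for the next Q: before pairing again
    return cards


sample = "Q: What is 2+2?\nA: 4\nQ: Capital of France?\nA: Paris"
print(parse_flashcards(sample))
# → [('What is 2+2?', '4'), ('Capital of France?', 'Paris')]
```

Local models don't always follow the format perfectly, which is where a lot of the "many revisions and error handling" time goes.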
Let me just cut to the final styling, which I am much happier with.
If you are also a student, you will see the value here of creating your own flashcards quickly and with ease. I am currently fighting with Google Colab, ngrok, and my VPS to see how I can set this up for public use at almost no cost. (Broke college student. cough cough, XD) When I figure it out, there will be an announcement lol.
If you are curious about my flashcard app you can find it here:
- Flippy Card (doesn't work hosted, only locally atm)
- Or also the repo: Repo
I am open to suggestions. I am new to React and Ollama, so it might be a hot mess. Just playing with stuff and making useful things. XD
Basically at the end of the day, I got my first open source local model running on my PC.
24 HOURS LATER...
I decided I would try a fresh start on Google Colab. Blank slate. Now that I'm sort of familiar with it, I found it easier to manage the code by executing everything in separate blocks/chunks instead of trying to run one big fat file. I think part of my original problem was trying to run everything in one "cell", as they call them.
The Process Explained With Tips
Create a new Google Colab Project
Go to ngrok and get your auth token. They have a free one. The endpoint for the free tier cycles, so you will have to continuously update it in your code as needed, but there is a free tier for students if you are so lucky. Check out ngrok stuff here: ngrok docs
After you have successfully gotten your auth token, head back to your Google Colab project and pop that bad boy into Colab's Secrets panel as NGROK_AUTHTOKEN:
You're gonna need that in a bit.
Break up the code into parts on Google Colab.
Do not attempt to run it all at once; it makes debugging miserable. Put different things into different "cells". First, you are going to want to install Ollama in Google Colab. The warnings are fine depending on your use case.
On the bottom right of the notebook, you can change your runtime type if needed.
The next two code blocks should look like this:
Then install dependencies...
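Roughly, those two cells amount to the following; the official Linux install script URL and the `pyngrok` package are assumptions based on the imports the later code uses:

```
# Cell 1: install Ollama inside the Colab VM (official Linux install script)
!curl -fsSL https://ollama.com/install.sh | sh

# Cell 2: install the Python ngrok wrapper used in the next cell
!pip install pyngrok
```

The `!` prefix is Colab's way of running shell commands from a notebook cell.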
Here is a nice snippet of code that the lovely @trickell shared with me to get everything behaving. (I was very thankful after days of defeat.) This will be the next cell you're gonna wanna create.
import subprocess

from google.colab import userdata
from pyngrok import ngrok


def start_ollama_server() -> None:
    """Starts the Ollama server."""
    subprocess.Popen(['ollama', 'serve'])
    print("Ollama server started.")


def check_ollama_port(port: str) -> None:
    """Check if the Ollama server is listening at the specified port."""
    try:
        result = subprocess.run(['sudo', 'lsof', '-i', '-P', '-n'],
                                check=True, capture_output=True, text=True)
        if any(f":{port} (LISTEN)" in line for line in result.stdout.splitlines()):
            print(f"Ollama is listening on port {port}")
        else:
            print(f"Ollama does not appear to be listening on port {port}.")
    except subprocess.CalledProcessError as e:
        print(f"Error checking Ollama port: {e}")


def setup_ngrok_tunnel(port: str) -> ngrok.NgrokTunnel:
    """Sets up an ngrok tunnel.

    Args:
        port: The port to tunnel.

    Returns:
        The ngrok tunnel object.

    Raises:
        RuntimeError: If the ngrok authtoken is not set.
    """
    ngrok_auth_token = userdata.get('NGROK_AUTHTOKEN')
    if not ngrok_auth_token:
        raise RuntimeError("NGROK_AUTHTOKEN is not set.")
    ngrok.set_auth_token(ngrok_auth_token)
    tunnel = ngrok.connect(port, host_header=f'localhost:{port}')
    print(f"ngrok tunnel created: {tunnel.public_url}")
    return tunnel
Run that cell, make sure it works, then assign the ngrok port and start the Ollama server.
Then the last few cells should look like:

Click the link to make sure that the tunnel is active and working. Pull whatever model you intend to use in your project.
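Pieced together, those closing cells boil down to starting the server and pulling a model. A sketch under my own assumptions (the five-second sleep and the `mistral` model choice are mine):

```python
import subprocess
import time

OLLAMA_PORT = "11434"  # Ollama's default port
MODEL = "mistral"      # assumption: whichever model your app requests


def final_cells(execute: bool = False) -> list[list[str]]:
    """Sketch of the closing Colab cells: serve Ollama, then pull the model.

    Pass execute=True only inside Colab with Ollama installed; by default
    this just returns the commands so you can see what would run.
    """
    serve_cmd = ["ollama", "serve"]
    pull_cmd = ["ollama", "pull", MODEL]
    if execute:
        subprocess.Popen(serve_cmd)           # same job as start_ollama_server()
        time.sleep(5)                         # give the server time to bind the port
        subprocess.run(pull_cmd, check=True)  # download the model weights
    return [serve_cmd, pull_cmd]


# In Colab: final_cells(execute=True), then setup_ngrok_tunnel(OLLAMA_PORT)
```

After that, `setup_ngrok_tunnel(OLLAMA_PORT)` from the earlier cell prints the public URL you click to verify.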
At this point you have everything working in Google Colab.
- Ollama is running
- ngrok makes you a tunnel with your public endpoint
Connecting up your virtual private server
Go to your app on your server and make sure that your model name is set correctly, as well as all endpoints. I have my endpoint in a .env file. To clarify, when I say endpoint I mean that ngrok link.
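Since the app is built with Vite, remember that Vite only exposes env vars prefixed with `VITE_` to client code. So the .env entry looks something like this (the variable name and URL here are placeholders, not my real values):

```
VITE_OLLAMA_URL=https://your-subdomain.ngrok-free.app
```

and it gets read in the client as `import.meta.env.VITE_OLLAMA_URL`. When the free-tier ngrok endpoint cycles, this is the one line to update.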
After many days and hours, I can say I built my first app locally with AI, deployed it to my VPS, and ran the connected AI part in the cloud through Google Colab and ngrok.
YAY! IT WORKS!
AHHHH NO BUGS....(after many hours)
The Future
- Figure out how to keep it running 24/7 for user friendliness
- Add user accounts and connect a DB so users can store their tests and cards.