Have you ever wondered if your old phone could become an AI research lab? I turned my OnePlus 3T (6GB RAM) into a working LLM chatbot, running completely offline with llama.cpp
and a quantized TinyLLaMA model. No cloud. No API keys. Just Linux, RAM, and willpower.
Here’s how I did it — step by step.
This is the first post in a series where I document building an offline-capable local LLM chatbot that runs directly on an Android phone — with no cloud dependencies, no telemetry, and no internet required once the model is downloaded.
This setup uses a OnePlus A3010 running LineageOS, with UserLAnd providing the Linux environment in which we build and run llama.cpp to power the local LLM.
✅ Device Specs
- Device: OnePlus A3010 (OnePlus 3T)
- OS: LineageOS 18.1 (Android 11)
- Rooted: Yes (via Magisk)
- Memory: 6 GB RAM
- Storage: 64 GB
- Linux environment: UserLAnd (Arch Linux)
📦 Tools Installed on Android
- Magisk (for root)
- UserLAnd (installed Arch Linux distro)
- Termux (optional fallback)
1. Install and Set Up UserLAnd
- Install UserLAnd from F-Droid or Play Store.
- Create a new session with the Arch distribution.
- Launch the session and wait for the initial environment setup.
✔️ To ensure it's working:
Run the following:
uname -a
If you see an output like:
Linux localhost 3.x.x #1 SMP PREEMPT ... aarch64
then your UserLAnd Arch environment is working.
2. Update System and Install Dependencies
Run:
sudo pacman -Syyu --noconfirm
✔️ To verify:
The command should finish without errors such as "failed to synchronize all databases". If the update succeeds, pacman exits cleanly with no error messages.
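As an extra sanity check, listing pending upgrades should come back empty right after a successful full sync (optional, not strictly required):
pacman -Qu
If this prints nothing, every installed package is up to date.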
3. Install Required Build Tools
sudo pacman -Sy --noconfirm base-devel clang cmake make git python
This installs everything you need to build and run llama.cpp.
✔️ To verify:
Each of the following commands should return a version number:
git --version
cmake --version
clang --version
python --version
If these all work, you are ready to build.
4. Prepare the Project Workspace
mkdir -p ~/llmchat && cd ~/llmchat
Clone the llama.cpp repository:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
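Storage on the phone is limited, so if you don't need the full git history, a shallow clone is a reasonable alternative (optional; the plain clone above works just as well):
git clone --depth 1 https://github.com/ggerganov/llama.cpp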
5. Build the LLM Engine
Create a build directory and compile the binaries:
mkdir build && cd build
cmake ..
make -j$(nproc)
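On a phone with 6 GB of RAM, a fully parallel build can occasionally run out of memory and get killed. If that happens, capping the job count is a safe fallback (same build, just slower):
make -j2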
✔️ To verify:
You should see a message like:
[100%] Built target llama-cli
Run:
ls -lh bin/
You should see binaries such as llama-cli, llama-server, and llama-quantize.
6. Download a Lightweight Model
TinyLLaMA is a great choice for phones with limited RAM.
Use wget to download the model:
mkdir -p ~/llmchat/models && cd ~/llmchat/models
wget https://huggingface.co/cmp-nct/tiny-llama-gguf/resolve/main/tiny-llama.Q4_K_M.gguf -O tiny-llama.Q4_K_M.gguf
Note: if the repository is gated, Hugging Face will ask for an access token, which has to be passed in the request header (see the example below).
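A minimal sketch of an authenticated download, assuming your Hugging Face token is stored in the HF_TOKEN environment variable:
wget --header="Authorization: Bearer $HF_TOKEN" https://huggingface.co/cmp-nct/tiny-llama-gguf/resolve/main/tiny-llama.Q4_K_M.gguf -O tiny-llama.Q4_K_M.gguf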
✔️ To verify:
Check file size with:
ls -lh tiny-llama.Q4_K_M.gguf
If it's ~400MB+, it has downloaded correctly.
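You can also confirm the file really is a GGUF model: the format begins with the ASCII magic bytes GGUF, so reading the first four bytes should print exactly that:
head -c 4 tiny-llama.Q4_K_M.gguf; echo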
7. Run the Model Locally
Go back to the llama.cpp directory and run:
cd ~/llmchat/llama.cpp/build/bin
./llama-cli -m ~/llmchat/models/tiny-llama.Q4_K_M.gguf -p "Hello!"
✔️ To verify:
You should receive a text response within a few seconds, like:
Hello! How can I help you today?
If this happens, you're successfully running a local LLM — fully offline — on your phone.
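The defaults are fine for a first test, but a couple of flags help on a phone. A hedged example (flag names can differ slightly between llama.cpp versions): -t pins the number of CPU threads and -n caps how many tokens are generated, which keeps responses snappy on a mobile SoC:
./llama-cli -m ~/llmchat/models/tiny-llama.Q4_K_M.gguf -t 4 -n 128 -p "Explain what a quantized model is in one sentence."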
If you’ve built something similar or have ideas to extend this, I’d love to hear about it in the comments.