
# Running a Local LLM on an Android Phone

Have you ever wondered if your old phone could become an AI research lab? I turned my OnePlus 3T (6GB RAM) into a working LLM chatbot, running completely offline with llama.cpp and a quantized TinyLLaMA model. No cloud. No API keys. Just Linux, RAM, and willpower.

Here’s how I did it — step by step.


This is the first post in a series where I document building an offline-capable local LLM chatbot that runs directly on an Android phone — with no cloud dependencies, no telemetry, and no internet required once the model is downloaded.

This setup uses a OnePlus A3010 running LineageOS and UserLAnd to provide a Linux environment, where we build and run llama.cpp to power the local LLM.


✅ Device Specs

  • Device: OnePlus A3010 (OnePlus 3T)
  • OS: LineageOS 18.1 (Android 11)
  • Rooted: Yes (via Magisk)
  • Memory: 6 GB RAM
  • Storage: 64 GB
  • Linux environment: UserLAnd (Arch Linux)

📦 Tools Installed on Android

  • Magisk (for root)
  • UserLAnd (installed Arch Linux distro)
  • Termux (optional fallback)

1. Install and Set Up UserLAnd

  1. Install UserLAnd from F-Droid or Play Store.
  2. Create a new session with the Arch distribution.
  3. Launch the session and wait for the initial environment setup.

✔️ To ensure it's working:

Run the following:

uname -a

If you see an output like:

Linux localhost 3.x.x #1 SMP PREEMPT ... aarch64

then your UserLAnd Arch environment is working.
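
Since free RAM is the main constraint for running the model later, it is also worth checking how much of the phone's 6 GB is actually visible inside the UserLAnd session. This isn't part of the original setup flow, just a quick sanity check (if `free` is missing on Arch, it lives in the procps-ng package):

```bash
# Show total and available memory inside the UserLAnd session
free -h
```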


2. Update System and Install Dependencies

Run:

sudo pacman -Syyu --noconfirm

✔️ To verify:

The command should finish without errors such as "failed to synchronize all databases"; if the update succeeds, pacman simply exits cleanly.
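
If you'd rather not eyeball the output, the shell's exit status gives an explicit pass/fail signal. This is plain shell behaviour, not anything UserLAnd-specific:

```bash
# Run this immediately after the update; 0 means pacman exited successfully
echo $?
```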


3. Install Required Build Tools

sudo pacman -Sy --noconfirm base-devel clang cmake make git python

This installs everything you need to build and run llama.cpp.

✔️ To verify:

Each of the following commands should return a version number:

git --version
cmake --version
clang --version
python --version

If these all work, you are ready to build.


4. Prepare the Project Workspace

mkdir -p ~/llmchat && cd ~/llmchat

Clone the llama.cpp repository:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

5. Build the LLM Engine

Create a build directory and compile the binaries:

mkdir build && cd build
cmake ..
make -j$(nproc)
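
One caveat from building on low-RAM devices in general (not something specific to this phone): `make -j$(nproc)` can run out of memory on the heavier translation units. If the build gets killed partway through, limiting the number of parallel jobs is a safe fallback; the `-j2` below is just a conservative guess.

```bash
# Resume the build with fewer parallel jobs if the full -j$(nproc) run was OOM-killed
make -j2
```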

✔️ To verify:

You should see a message like:

[100%] Built target llama-cli

Run:

ls -lh bin/

You should see binaries like llama-cli, llama-server, and llama-quantize.
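
Beyond checking that the files exist, you can ask one of the binaries for its version as a quick smoke test that it actually executes on this CPU. Recent llama.cpp builds support `--version`; if yours doesn't, `--help` serves the same purpose:

```bash
# Confirm the freshly built binary runs at all
./bin/llama-cli --version
```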


6. Download a Lightweight Model

TinyLLaMA is a great choice for phones with limited RAM.

Use wget to download the model:

mkdir -p ~/llmchat/models && cd ~/llmchat/models
wget https://huggingface.co/cmp-nct/tiny-llama-gguf/resolve/main/tiny-llama.Q4_K_M.gguf -O tiny-llama.Q4_K_M.gguf

Info: if Hugging Face rejects the anonymous download, you need to pass your access token in the request header (see the example below).
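
Here is a hedged example of what that looks like with wget, in case the plain download above is rejected with a 401/403. The token string below is a placeholder for your own Hugging Face access token, not a real value:

```bash
# Only needed if the anonymous download is rejected; substitute your own token
wget --header="Authorization: Bearer YOUR_HF_TOKEN" \
  https://huggingface.co/cmp-nct/tiny-llama-gguf/resolve/main/tiny-llama.Q4_K_M.gguf \
  -O tiny-llama.Q4_K_M.gguf
```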

✔️ To verify:

Check file size with:

ls -lh tiny-llama.Q4_K_M.gguf

If the file is several hundred megabytes (~400 MB or more), it has downloaded correctly.


7. Run the Model Locally

Go to the build output directory inside llama.cpp and run:

cd ~/llmchat/llama.cpp/build/bin
./llama-cli -m ~/llmchat/models/tiny-llama.Q4_K_M.gguf -p "Hello!"

✔️ To verify:

You should receive a text response within a few seconds, like:

Hello! How can I help you today?

If this happens, you're successfully running a local LLM — fully offline — on your phone.

Here is the first response I got:

[Screenshot: TinyLLaMA's first response]
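
The command above exits after a single reply. Since the goal of this series is a chatbot, it's worth noting that llama-cli also has a conversation mode for back-and-forth chat. The flags below are illustrative values for a 6 GB phone rather than tuned settings, and `-cnv` needs a reasonably recent llama.cpp build (older ones use `-i` for interactive mode instead):

```bash
# Back-and-forth chat instead of a one-shot prompt (values are illustrative)
cd ~/llmchat/llama.cpp/build/bin
./llama-cli -m ~/llmchat/models/tiny-llama.Q4_K_M.gguf -cnv -t 4 -c 2048 --color
```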

If you’ve built something similar or have ideas to extend this, I’d love to hear about it in the comments.
