DEV Community

Ponikar

Building an AI Agent That Commits and Raises PRs from Your Phone

Ever wondered when you'll be able to ship code straight from your phone? Since LLMs are capable of generating large chunks of code, you may not need your laptop to ship tiny features or hotfixes. What if an AI app could do that for you?

This blog is all about how I am building a mobile AI agent that helps you ship code directly from your phone, and hopefully, you won’t need to open your laptop while enjoying your favorite cocktail on a beach.

Tech Stack

I am using React Native to build a cross-platform app, Cloudflare Workers as a backend, and Vercel AI SDK to set up an AI agentic flow that enables support for multiple LLMs, including Google Gemini, OpenAI, and Claude.

For the sake of the prototype, the AI agent defaults to Google Gemini 2.0 Flash, since it's one of the cheaper yet effective options out there.

The workflow

I have named it Gitfix. The code is fully open source; you can check it out here. Basically, it connects to your GitHub account and asks for the permissions required to read repos and raise PRs. Once the AI agent suggests changes, it can automatically raise a PR with a single tap.

So the flow looks like this:

High-level overview of the Gitfix AI agentic flow

GitHub OAuth flow

Millions of developers use GitHub every day, and it has a rich set of API tools, including GitHub Apps that let developers customize their experience.

As a user, you have to install the Gitfix GitHub app in your GitHub organization. This app will interact with your repos and activities like PRs, issues, etc.

Ultimately, our AI agent takes help from this GitHub app to read and raise PRs.

Our app uses the standard GitHub OAuth flow to authenticate users.
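Here's a rough sketch of what that standard flow looks like from a Cloudflare Worker. This is not the Gitfix source, just a minimal illustration against GitHub's documented OAuth endpoints; the function names and the `repo` scope choice are my assumptions.

```typescript
// Sketch of the GitHub OAuth web flow, assuming a Cloudflare Worker backend.
// buildAuthorizeUrl / exchangeCode are hypothetical names, not Gitfix's API.

function buildAuthorizeUrl(clientId: string, redirectUri: string): string {
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: redirectUri,
    scope: "repo", // enough to read repos and raise PRs
  });
  return `https://github.com/login/oauth/authorize?${params}`;
}

// After GitHub redirects back with ?code=..., exchange it for a token.
async function exchangeCode(
  clientId: string,
  clientSecret: string,
  code: string
): Promise<string> {
  const res = await fetch("https://github.com/login/oauth/access_token", {
    method: "POST",
    headers: { Accept: "application/json", "Content-Type": "application/json" },
    body: JSON.stringify({ client_id: clientId, client_secret: clientSecret, code }),
  });
  const data = (await res.json()) as { access_token: string };
  return data.access_token; // handed back to the client; the server keeps nothing
}
```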

What about privacy?

Our server architecture follows a stateless approach, which means the server doesn’t store any GitHub-related data, including repos, access tokens, or any meta information.

The server relies totally on the client side to provide all the required context that helps the AI agent perform specific operations.
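To make the stateless idea concrete, here's a hypothetical shape of what the client might send on every request. The field names are my assumptions, not Gitfix's actual payload; the point is that token, repo info, file references, and conversation history all travel with the request.

```typescript
// Hypothetical request shape for a stateless server: the Worker keeps no
// session, so the client sends all context on every call.
interface AgentRequest {
  githubToken: string; // user's OAuth token, held on-device
  repo: { owner: string; name: string; branch: string };
  fileRefs: { path: string; sha: string }[]; // files mentioned via @filename
  messages: { role: "user" | "assistant"; content: string }[];
}

// Minimal guard the Worker could run before touching the LLM.
function isValidAgentRequest(body: unknown): body is AgentRequest {
  const b = body as AgentRequest;
  return (
    typeof b?.githubToken === "string" &&
    typeof b?.repo?.owner === "string" &&
    Array.isArray(b?.fileRefs) &&
    Array.isArray(b?.messages)
  );
}
```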

If you haven't tried the app yet, you can refer to this tweet, which demonstrates the app flow.

@filename feature

If you’ve used Cursor’s @filename feature, you know how handy it is for telling the AI which file to look at and change.

I tried to implement similar behavior in the app. GitHub shares meta information about the repo tree and branch data; when the user mentions a file name, the app simply iterates over the tree data and shows suggestions. Once the user taps one, the logic refers to that file in the prompt.
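The suggestion step is essentially a filter over the cached tree. A minimal sketch, assuming entries shaped like GitHub's `git/trees` API (the `suggestFiles` name is mine):

```typescript
// Sketch of the @filename autocomplete: filter the cached repo tree by the
// text typed after "@". Field names mirror GitHub's git/trees API response.
interface TreeEntry {
  path: string; // e.g. "src/components/Button.tsx"
  sha: string;
  type: "blob" | "tree";
}

function suggestFiles(tree: TreeEntry[], query: string, limit = 5): TreeEntry[] {
  const q = query.toLowerCase();
  return tree
    .filter((e) => e.type === "blob" && e.path.toLowerCase().includes(q))
    .slice(0, limit);
}
```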

Each file has a unique sha and filePath that later helps AI fetch the file content. This information is stored on the client side to provide the LLM with file context throughout the conversation.

The AI agent uses these fileRefs to fetch file content using GitHub APIs. Once the contents are ready, we append them to the LLM context so the AI can effectively suggest changes.
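Fetching by `sha` maps naturally onto GitHub's `git/blobs` endpoint, which returns base64-encoded content. A sketch under that assumption (error handling omitted; `fetchFileContent` is a hypothetical name):

```typescript
// GitHub's git/blobs API returns base64 content with embedded newlines,
// which atob() cannot digest directly, so strip them before decoding.
function decodeBlob(base64WithNewlines: string): string {
  return atob(base64WithNewlines.replace(/\n/g, ""));
}

async function fetchFileContent(
  token: string,
  owner: string,
  repo: string,
  sha: string
): Promise<string> {
  const res = await fetch(
    `https://api.github.com/repos/${owner}/${repo}/git/blobs/${sha}`,
    { headers: { Authorization: `Bearer ${token}`, Accept: "application/vnd.github+json" } }
  );
  const blob = (await res.json()) as { content: string };
  return decodeBlob(blob.content);
}
```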

Tool calling to the rescue

We leverage the LLM’s tool/function calling feature to communicate with external APIs such as GitHub. LLMs can smartly decide which tool to use for a specific operation.

This is the high-level architecture of how the AI agent works. It consists of multiple chains of LLM calls that leverage tool calling to perform certain actions.
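To show the dispatch idea without the SDK machinery, here's a stripped-down illustration. This is not the Vercel AI SDK API (which wraps this for you); the model emits a tool name plus arguments, and our code routes that to the matching function. Tool names here are hypothetical.

```typescript
// Minimal tool-calling dispatch: route a model-emitted tool call to a
// registered function. Hypothetical tools a Gitfix-like agent might have.
type Tool = (args: Record<string, string>) => string;

const tools: Record<string, Tool> = {
  readFile: ({ path }) => `contents of ${path}`,
  raisePr: ({ title }) => `opened PR: ${title}`,
};

interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

function dispatch(call: ToolCall): string {
  const tool = tools[call.name];
  if (!tool) throw new Error(`unknown tool: ${call.name}`);
  return tool(call.arguments);
}
```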

High-level overview of the AI agent that manages multiple LLM calls

The AI agent, at a high level, acts as an orchestrator to decide which route to take based on the user’s query.

High-quality context provides high-quality results

The more relevant, higher-quality context you provide to an LLM, the more accurate its results. Since the server strictly follows a stateless architecture, our app needs to store all the information regarding repos, conversation threads, meta information, file versioning, references, etc.

For example, our app stores file references based on threads. As users use the @fileName feature, the app stores these references corresponding to that thread. When we make a call to the LLM, the app provides these file refs as context.
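The thread-scoped store can be as simple as a map keyed by thread ID. A minimal sketch (names are mine, not Gitfix's):

```typescript
// On-device store sketch: file references are keyed by conversation thread,
// so each LLM call carries only that thread's context.
interface FileRef {
  path: string;
  sha: string;
}

const refsByThread = new Map<string, FileRef[]>();

function addFileRef(threadId: string, ref: FileRef): void {
  const refs = refsByThread.get(threadId) ?? [];
  // Avoid duplicate mentions of the same file in one thread.
  if (!refs.some((r) => r.path === ref.path)) refs.push(ref);
  refsByThread.set(threadId, refs);
}

function contextForThread(threadId: string): FileRef[] {
  return refsByThread.get(threadId) ?? [];
}
```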

Incremental changes support

LLMs are good at making small changes effectively, and we usually iterate on those changes over several rounds. The app maintains different versions of each file and makes sure to provide the previous iteration as context.
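Per-file versioning can be sketched as an ordered history per path, with the previous entry handed back as context. The structure below is my assumption of how such a store could look, not the app's actual code:

```typescript
// Version history sketch: every AI iteration appends a new version of a
// file; the previous one is what gets fed back to the LLM as context.
const versions = new Map<string, string[]>(); // path -> ordered contents

function saveVersion(path: string, content: string): void {
  const history = versions.get(path) ?? [];
  history.push(content);
  versions.set(path, history);
}

function previousVersion(path: string): string | undefined {
  const history = versions.get(path) ?? [];
  return history.length > 1 ? history[history.length - 2] : undefined;
}
```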

In summary, the app manages context effectively to help the AI agent generate responses efficiently.

User experience is the ultimate moat you can have in the race of AI wrappers.

The app uses git diff to highlight new changes similar to how GitHub does.
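The computation behind that highlighting is a line-level diff. Here's a naive LCS-based version to show the shape of it; production diff tools use smarter algorithms (e.g. Myers), and this is my illustration rather than the app's implementation:

```typescript
// Naive line-level diff via longest-common-subsequence: " " = unchanged,
// "-" = removed from the old text, "+" = added in the new text.
type DiffLine = { tag: " " | "-" | "+"; text: string };

function diffLines(oldText: string, newText: string): DiffLine[] {
  const a = oldText.split("\n");
  const b = newText.split("\n");
  // dp[i][j] = LCS length of a[i..] and b[j..]
  const dp: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--)
    for (let j = b.length - 1; j >= 0; j--)
      dp[i][j] =
        a[i] === b[j] ? dp[i + 1][j + 1] + 1 : Math.max(dp[i + 1][j], dp[i][j + 1]);
  // Walk the table to emit kept, removed, and added lines in order.
  const out: DiffLine[] = [];
  let i = 0, j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push({ tag: " ", text: a[i] });
      i++; j++;
    } else if (dp[i + 1][j] >= dp[i][j + 1]) {
      out.push({ tag: "-", text: a[i++] });
    } else {
      out.push({ tag: "+", text: b[j++] });
    }
  }
  while (i < a.length) out.push({ tag: "-", text: a[i++] });
  while (j < b.length) out.push({ tag: "+", text: b[j++] });
  return out;
}
```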

Screenshot of Gitfix mobile app

The magical moment

Eventually, once the user is happy with the changes, they can instruct the AI agent to directly raise a PR to their repo.

Behind the scenes, the AI agent checks out a new branch, commits the changes, and then raises a PR to the main branch. This entire process is hidden behind the abstraction layer provided by the AI agent.
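That hidden sequence maps onto a handful of GitHub REST calls. To keep the sketch testable, the function below just returns the request descriptors instead of firing them; a real version would `fetch()` each one. The function and option names are my assumptions:

```typescript
// The PR flow behind the agent, expressed as the GitHub REST calls it
// maps to: create a branch ref, commit files, then open the pull request.
interface ApiCall {
  method: string;
  path: string;
  body: Record<string, string>;
}

function planPullRequest(opts: {
  owner: string;
  repo: string;
  baseSha: string; // commit the new branch starts from
  branch: string;
  title: string;
}): ApiCall[] {
  const repoPath = `/repos/${opts.owner}/${opts.repo}`;
  return [
    // 1. Check out a new branch from the base commit.
    {
      method: "POST",
      path: `${repoPath}/git/refs`,
      body: { ref: `refs/heads/${opts.branch}`, sha: opts.baseSha },
    },
    // 2. Commit each changed file: PUT /repos/{owner}/{repo}/contents/{path}
    //    (one call per file; omitted from this plan for brevity).
    // 3. Open the PR against main.
    {
      method: "POST",
      path: `${repoPath}/pulls`,
      body: { title: opts.title, head: opts.branch, base: "main" },
    },
  ];
}
```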

For now, this flow is not very dynamic: it only raises PRs to the main branch. Later, as I continue working on the app, users will be able to control factors such as the branch name, commit message, and PR description.

Overall, Gitfix is a POC that shows how your mobile can help you ship changes, thanks to the modern era of AI.

It has certain limitations as of now, but the potential is limitless. When Cursor first came out, it was just an LLM making changes across a bunch of files; now agentic flows can do far more.

Mobile devices are powerful and can even run LLMs locally. This is the beginning of an era where mobile apps can perform intensive tasks with the help of AI.

Thanks for reading this blog. I hope you like it. Let me know your thoughts and questions in the comments.

I’ll see you in another blog. You can reach me on X.

Thanks.
