Shrinidhi Achar

ResearchMate — I Finally Finished the AI Research Tool I Wish I Had in College

This is a submission for the GitHub Finish-Up-A-Thon Challenge


What I Built

Okay so — like most side projects, this one started with a very specific frustration.

I was working on an assignment and had more than 30 PDFs open across different tabs, a notes doc that was getting out of control, and absolutely no good way to connect ideas across papers. I kept thinking: why can't I just... ask a question and have something search through all of these for me?

So I built ResearchMate.

It's a full-stack research management platform where you can upload your papers, organize them into project workspaces, and actually talk to your document library. Ask "what methodology did the 2021 papers use?" and get a cited answer pulled from your actual uploads. Generate summaries. Auto-tag papers. Annotate PDFs directly in the browser.

But the part I'm most proud of — and the part that was half-broken until this challenge — is the LaTeX Agent.

It's a ReAct-based AI agent (running locally via Ollama + Qwen 2.5 Coder 3B) that can draft and edit research papers. You tell it "add a related work section," and it actually reads your uploaded documents first, reasons about what to include, then makes targeted edits to your LaTeX file. It streams its thinking to you in real time so you can watch it work. It's kind of wild to see it go through the thought → action → observation loop live.

The stack:

  • Frontend: React, TypeScript, Vite, Tailwind, shadcn/ui
  • Backend: Flask, FastAPI (for the LaTeX agent service), PostgreSQL, ChromaDB, LangChain
  • AI: Ollama running Qwen 2.5 Coder 3B locally, RAG pipeline with vector embeddings

🔗 Frontend · Backend


Demo

📺 YouTube demo walkthrough

The demo covers uploading papers into a project workspace, running Q&A against the library, and using the LaTeX agent to draft a new section — including the live agent thinking stream.


The Comeback Story

Honestly? This project had been sitting at about 60% done for months.

The RAG pipeline worked. Auth was solid. You could upload documents and organize them. But every time I opened the repo, the LaTeX agent was staring back at me like an unfinished promise. The WebSocket layer wasn't hooked up. The tool-dispatch logic was half-implemented. The frontend had pages that were basically empty shells. I kept telling myself I'd finish it "this weekend."

The Finish-Up-A-Thon was the kick I needed.

Here's what actually changed:

The LaTeX Agent went from prototype to something real. I rewrote the FastAPI service properly with WebSocket support so the agent's reasoning steps stream live to the frontend. The full ReAct loop — thought, action, observation, up to 20 steps — is now implemented with loop detection and a fallback so it doesn't just hang forever. The four tools (search_docs, read_doc, read_current_paper, none) are wired to the actual database and vector store.
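
For readers who haven't built one, here is a minimal sketch of a ReAct loop with a step limit, loop detection, and a forced-completion fallback. It is not the code from the repo: the `run_agent` function, the shape of the `model` callable, and the stub tool bodies are all assumptions made for illustration. Only the tool names and the 20-step limit come from the description above.

```python
from typing import Callable

MAX_STEPS = 20  # step limit mentioned in the post

# Stub tools (hypothetical bodies); the real ones query the database and vector store.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_docs": lambda query: f"(search results for {query!r})",
    "read_doc": lambda doc_id: f"(contents of document {doc_id})",
    "read_current_paper": lambda _: "(current LaTeX source)",
    "none": lambda _: "",
}


def run_agent(model: Callable[[str, list], dict], task: str,
              max_steps: int = MAX_STEPS) -> dict:
    """Run thought -> action -> observation until the model finishes or we bail out.

    `model(task, history)` is assumed to return a dict with either a "final"
    answer, or "thought", "action", and "input" keys.
    """
    history: list[dict] = []
    seen_calls: set[tuple[str, str]] = set()

    for _ in range(max_steps):
        step = model(task, history)
        if step.get("final") is not None:
            return {"answer": step["final"], "steps": history, "forced": False}

        call = (step["action"], step["input"])
        if call in seen_calls:
            # Loop detection: the same tool call with the same input
            # would only produce the same observation again.
            break
        seen_calls.add(call)

        tool = TOOLS.get(step["action"], TOOLS["none"])
        history.append({**step, "observation": tool(step["input"])})

    # Forced completion: return what was gathered instead of hanging.
    notes = " | ".join(s["observation"] for s in history)
    return {"answer": f"Stopped early. Notes so far: {notes}",
            "steps": history, "forced": True}
```

The `forced` flag is one way to let the frontend tell a clean finish apart from a bail-out; the real service may signal this differently.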

The thing I'm most happy about: the agent now reads the current state of the paper before making any changes. Earlier it would just overwrite everything. Now it makes incremental, targeted edits and preserves whatever was already there. That sounds like a small thing but it makes it actually usable.
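
To make "incremental, targeted edits" concrete, here is a rough sketch of one way to do it: replace a single section if it exists, otherwise add it just before `\end{document}`, and leave everything else untouched. The `upsert_section` helper is hypothetical and assumes plain `\section{...}` headings; the actual agent may edit the file quite differently.

```python
import re


def upsert_section(tex: str, title: str, body: str) -> str:
    """Replace the named section's body, or add the section at the end of the document.

    Everything outside that one section is returned unchanged.
    """
    heading = f"\\section{{{title}}}"
    block = f"{heading}\n{body.strip()}\n\n"

    start = tex.find(heading)
    if start != -1:
        # The existing section runs until the next \section or \end{document}.
        rest = tex[start + len(heading):]
        nxt = re.search(r"\\section\{|\\end\{document\}", rest)
        end = start + len(heading) + (nxt.start() if nxt else len(rest))
        return tex[:start] + block + tex[end:]

    end_doc = tex.rfind("\\end{document}")
    if end_doc == -1:
        # No document environment found: append rather than guess.
        return tex.rstrip("\n") + "\n\n" + block
    return tex[:end_doc] + block + tex[end_doc:]
```

Calling `upsert_section(paper, "Related Work", new_text)` on a paper that already has an introduction keeps the introduction as it was and only adds or swaps the one section.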

The frontend finally caught up. The annotation system, project workspace views, and the agent chat interface were all completed and connected to real endpoints instead of placeholder data.

The READMEs became actual READMEs. Both repos now have proper setup instructions, architecture overviews, environment variable templates, WebSocket protocol docs, and a troubleshooting table. Before this they were... not that.


My Experience with GitHub Copilot

I'll be honest — I went into this a bit skeptical. I've had mixed experiences with AI coding tools where the suggestions are confidently wrong and you spend more time fixing them than you would've writing from scratch.

But it genuinely helped in a few specific places.

The agent's tool-dispatch logic was the biggest one. A ReAct loop involves a lot of structural repetition — parse the model output, figure out which tool was called, run it, format the observation, feed it back, handle errors at every step. Once Copilot saw the pattern from the first tool, it scaffolded the rest pretty accurately. That let me focus on the actual interesting parts: loop detection, the step limit, the forced-completion fallback when things go sideways.
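
The repetitive part looks roughly like the sketch below: parse the model output, look up the tool, run it, and turn the result (or the error) into an observation. The `Action:` / `Action Input:` text format and the `parse_step` and `dispatch` names are assumptions for illustration; the repo's prompt format may differ.

```python
import re
from typing import Callable

# Assumed output format: a line "Action: <tool>" and a line "Action Input: <text>".
ACTION_RE = re.compile(r"^Action:\s*(\w+)\s*$", re.MULTILINE)
INPUT_RE = re.compile(r"^Action Input:\s*(.*)$", re.MULTILINE)


def parse_step(model_output: str) -> tuple[str, str]:
    """Extract (tool_name, tool_input); fall back to the no-op tool "none"."""
    action = ACTION_RE.search(model_output)
    if action is None:
        return "none", ""
    arg = INPUT_RE.search(model_output)
    return action.group(1), (arg.group(1).strip() if arg else "")


def dispatch(model_output: str, tools: dict[str, Callable[[str], str]]) -> str:
    """Run the requested tool and format the observation fed back to the model."""
    name, arg = parse_step(model_output)
    tool = tools.get(name)
    if tool is None:
        return f"Observation: error: unknown tool '{name}'"
    try:
        return f"Observation: {tool(arg)}"
    except Exception as exc:
        # Report the failure as an observation so the model can react to it.
        return f"Observation: error: {name} failed: {exc}"
```

Returning errors as observations, rather than raising, keeps the loop alive and gives the model a chance to pick a different tool on the next step.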

It also saved real time on the async WebSocket handler in FastAPI. Async Python has enough quirks that getting the first draft right is non-trivial, and Copilot's suggestion was close enough to correct that I only had to tweak it rather than reason through it from scratch.
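
Stripped of the framework, the streaming pattern is small. The sketch below uses only `asyncio` so it runs anywhere; `agent_steps`, `stream_agent`, and the event shape are made up for illustration. In a FastAPI endpoint, the `send` argument would presumably be the WebSocket's `send_text` method.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable


async def agent_steps(task: str) -> AsyncIterator[dict]:
    """Stand-in for the agent: yields one event per reasoning step."""
    yield {"type": "thought", "content": f"Planning how to handle: {task}"}
    await asyncio.sleep(0)  # hand control back to the event loop, as a model call would
    yield {"type": "action", "content": "read_current_paper"}
    yield {"type": "observation", "content": "(current LaTeX source)"}
    yield {"type": "done", "content": "Edit applied."}


async def stream_agent(task: str, send: Callable[[str], Awaitable[None]]) -> None:
    """Forward each step to the client as soon as it is produced."""
    try:
        async for event in agent_steps(task):
            await send(json.dumps(event))
    except Exception as exc:
        # Tell the client something went wrong instead of going silent.
        await send(json.dumps({"type": "error", "content": str(exc)}))


async def print_message(message: str) -> None:
    print(message)


if __name__ == "__main__":
    asyncio.run(stream_agent("add a related work section", print_message))
```

Because every step is awaited and sent individually, the client sees each thought, action, and observation as it happens rather than one batch at the end.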

Where I turned it off: the RAG pipeline design and ChromaDB integration. Those needed deliberate decisions about chunking strategy, embedding model choice, and retrieval logic. That's not the kind of work that benefits from autocomplete; you need to actually think through the tradeoffs. Copilot suggestions there would've just been distracting.
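
As an example of the kind of tradeoff involved, here is a toy fixed-size chunker with overlap. It splits on characters, which is an assumption made to keep the sketch short; the `chunk_text` name and the default sizes are illustrative and not taken from the repo.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last chunk is not just a repeat of the previous tail.
    last_start_limit = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start_limit, step)]
```

Larger chunks keep more context per retrieved passage but make each embedding less specific. More overlap reduces the chance of splitting an idea across two chunks, at the cost of storing and searching more near-duplicate text.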

The honest summary: it's a great tool for the structural and repetitive parts of finishing a project. It doesn't replace the thinking. But this project had a lot of structural work left, so the timing was good.


If you're a student who's ever been buried in research papers — this one's for you. Go check it out, and please do open issues if something's broken. There is almost certainly something broken.

— Shrinidhi