Gizem Yılmaz

Posted on Jun 6 • Edited on Jun 7

ExcelPy: From Tkinter Prototype to AI-Assisted Workbook Review

#devchallenge #githubchallenge

GitHub “Finish-Up-A-Thon” Challenge Submission

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

I rebuilt ExcelPy from an unfinished desktop stock-comparison prototype into a working spreadsheet QA workspace for teams that need to compare an older, trusted workbook against a newer one before publishing pricing, catalog, or inventory changes.

The original was a Tkinter script with file pickers. It could select two .xlsx files, but the actual comparison logic, validation, mismatch handling, and reporting layer were all still placeholders.

The current app is a Next.js 16 + TypeScript workflow built around a clear path: upload → map → review → report.

Upload & validation — .xlsx upload from the browser (or one-click demo workbooks), with server-side parsing, sheet selection, and file/size/type/row/column checks.
Schema-drift resolution before diffing — pick a key column (with warnings for blank or duplicate keys), auto-match identical headers, get suggested renames for drifted columns, or map them manually when baseline and current don't line up.
Deterministic diff & review — added/removed/changed/unchanged records with summary cards and a filterable, searchable table, plus a mismatch summary covering renamed columns, row anomalies, layout drift, and cell-style changes.
Optional AI price review — explains only verified comparison data, flags price-like changes, recommends next checks, and falls back to a deterministic report when AI is unavailable.
Built to be public-safe — content-type checks, body-size limits, per-route rate limiting, AI timeout protection, shared Zod contracts keeping UI and API aligned, and Vitest coverage on the comparison/reporting services and route boundaries.
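
To make the "deterministic diff" idea concrete, here is a minimal sketch of a keyed row comparison. It is illustrative only and not taken from the ExcelPy repository: the `Row` and `DiffEntry` types and the `diffRows` function are hypothetical names, and it assumes both sheets have already been parsed into plain objects with their columns mapped.

```typescript
// Hypothetical types and names; a sketch, not ExcelPy's actual implementation.
type Row = Record<string, string | number | null>;
type DiffStatus = "added" | "removed" | "changed" | "unchanged";

interface DiffEntry {
  key: string;
  status: DiffStatus;
  changedColumns: string[];
}

function diffRows(baseline: Row[], current: Row[], keyColumn: string): DiffEntry[] {
  // Index rows by key. Blank keys and repeated keys are skipped here;
  // in this sketch they are assumed to be reported by a separate validation step.
  const indexByKey = (rows: Row[]): Map<string, Row> => {
    const map = new Map<string, Row>();
    for (const row of rows) {
      const key = String(row[keyColumn] ?? "").trim();
      if (key !== "" && !map.has(key)) map.set(key, row);
    }
    return map;
  };

  const before = indexByKey(baseline);
  const after = indexByKey(current);
  const entries: DiffEntry[] = [];

  before.forEach((oldRow, key) => {
    const newRow = after.get(key);
    if (newRow === undefined) {
      entries.push({ key, status: "removed", changedColumns: [] });
      return;
    }
    // Compare the union of columns so a value missing on one side counts as a change.
    const columns = Array.from(new Set(Object.keys(oldRow).concat(Object.keys(newRow))));
    const changedColumns = columns
      .filter((c) => (oldRow[c] ?? null) !== (newRow[c] ?? null))
      .sort();
    entries.push({
      key,
      status: changedColumns.length > 0 ? "changed" : "unchanged",
      changedColumns,
    });
  });

  after.forEach((_row, key) => {
    if (!before.has(key)) entries.push({ key, status: "added", changedColumns: [] });
  });

  // Sort by key so the result does not depend on input row order.
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}
```

Because the output depends only on the two inputs and the chosen key column, the same pair of workbooks always produces the same diff, which is what makes it safe to hand to a narrative report afterwards.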

The real thing I finished wasn't a new UI; it was a set of explicit rules. Choose the trusted baseline, select the newer workbook, resolve schema drift before trusting the diff, review the deterministic changes, and only then generate a narrative report. That's the difference between "compare two Excel files" and a product you'd let near real pricing data.

Demo

Final Demo: https://excel-py.vercel.app/
GitHub: https://github.com/00Gizem00/ExcelPy
Video Walkthrough: https://youtu.be/8EJiguMn2t4

The Comeback Story

ExcelPy started as a small Tkinter desktop prototype with a real idea behind it: compare two stock Excel files and show what changed. But the project had stopped at the file-picker stage. The user could choose two .xlsx files, yet the comparison logic, validation rules, difference output, and reporting flow were still missing.

For the comeback, I kept the original idea but rebuilt the product around the parts that make spreadsheet comparison trustworthy. I moved the workflow into a Next.js 16 app, added server-side workbook parsing, defined clear diff states for added, removed, changed, and unchanged rows, and introduced validation that blocks structurally unsafe files while surfacing blank and duplicate row keys as data-quality warnings.
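
As a rough illustration of treating blank and duplicate keys as warnings rather than hard failures, a check along these lines would do the job. The `KeyWarnings` shape and the `checkKeyColumn` name are assumptions made for this sketch, not ExcelPy's actual API.

```typescript
// Hypothetical helper; the names and the warning shape are assumptions.
interface KeyWarnings {
  blankRows: number[]; // 1-based data row numbers whose key is empty
  duplicateKeys: string[]; // keys that appear more than once
}

function checkKeyColumn(
  rows: Array<Record<string, unknown>>,
  keyColumn: string
): KeyWarnings {
  const counts = new Map<string, number>();
  const blankRows: number[] = [];

  rows.forEach((row, i) => {
    const raw = row[keyColumn];
    const key = raw === null || raw === undefined ? "" : String(raw).trim();
    if (key === "") {
      blankRows.push(i + 1);
      return;
    }
    counts.set(key, (counts.get(key) ?? 0) + 1);
  });

  const duplicateKeys: string[] = [];
  counts.forEach((count, key) => {
    if (count > 1) duplicateKeys.push(key);
  });

  return { blankRows, duplicateKeys: duplicateKeys.sort() };
}
```

Returning the findings instead of throwing lets the UI show them next to the key-column picker while still allowing the comparison to proceed.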

I also expanded the product beyond a simple diff. The new version detects schema drift, suggests renamed column matches, supports manual column mapping, surfaces row and style mismatches, and gives users a focused upload → map → review → report flow. The AI layer was added carefully: it does not decide what changed. It only turns verified comparison data into an operational summary, with a deterministic fallback when AI is unavailable.
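
The schema-drift step can be pictured as a two-pass header match: identical headers pair up automatically, near-identical ones become rename suggestions, and the rest are left for manual mapping. The sketch below assumes a very simple similarity rule (headers that are equal once case, spaces, and punctuation are ignored); `mapHeaders`, `HeaderMapping`, and that rule are all hypothetical, and the real app may use a different heuristic.

```typescript
// Hypothetical sketch; the matching rule is an assumption, not ExcelPy's actual heuristic.
interface HeaderMapping {
  matched: Array<[string, string]>; // identical headers: [baseline, current]
  suggested: Array<[string, string]>; // likely renames: [baseline, current]
  unmatched: string[]; // baseline headers that need manual mapping
}

// Ignore case, spaces, and punctuation when comparing header names.
const normalizeHeader = (h: string): string => h.toLowerCase().replace(/[^a-z0-9]/g, "");

function mapHeaders(baseline: string[], current: string[]): HeaderMapping {
  const remaining = current.slice();
  const result: HeaderMapping = { matched: [], suggested: [], unmatched: [] };

  // Remove and return the first remaining header that satisfies the predicate.
  const take = (predicate: (h: string) => boolean): string | undefined => {
    const i = remaining.findIndex(predicate);
    return i === -1 ? undefined : remaining.splice(i, 1)[0];
  };

  // Pass 1: auto-match identical headers.
  const pending: string[] = [];
  for (const header of baseline) {
    const exact = take((h) => h === header);
    if (exact !== undefined) result.matched.push([header, exact]);
    else pending.push(header);
  }

  // Pass 2: suggest renames for headers that differ only in formatting.
  for (const header of pending) {
    const near = take((h) => normalizeHeader(h) === normalizeHeader(header));
    if (near !== undefined) result.suggested.push([header, near]);
    else result.unmatched.push(header);
  }

  return result;
}
```

For example, a baseline of `SKU`, `Unit Price`, `Stock` against a current sheet of `SKU`, `unit_price`, `Qty` would auto-match `SKU`, suggest `Unit Price` → `unit_price`, and leave `Stock` for the user to map by hand.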

The project went from an unfinished local script into a working spreadsheet QA workspace with typed contracts, API routes, tests, sample workbooks, security guardrails, and a submission package that preserves the full before-and-after story.

My Experience with GitHub Copilot

GitHub Copilot wasn't just autocomplete for me on this project; it was what turned a half-finished main.py Excel-comparison script into a real Next.js SaaS app. A few moments where it actually moved the needle:

It gave me a path, not just code. I started with a rough script and one goal: "make this real." I asked Copilot to plan the rebuild around Next.js 16 + Tailwind with an AI-powered report, and it produced an actual phased plan (Phase 1 → Phase 7) I could execute step by step. Then I drove the build one phase at a time — "start with Phase 1," "okay, Phase 3" — and it held the context between phases: the data flow, the file structure, the naming. I never had to re-explain the project. That continuity is the part you don't get from a search-and-paste workflow.

One word was enough: "fix." Early on, a generated file broke the build. Instead of writing a careful bug report, I literally typed fix. Copilot traced it, patched it, and then, without me asking, pointed out that the broken artifact shouldn't be in version control at all and helped me add it to .gitignore. It didn't just fix the symptom; it stopped me from committing the mess in the first place.

It found the edge cases I couldn't name. The whole app compares two spreadsheets, and I knew rows weren't lining up, but I couldn't say why. I just said: "handle all the cases where the files don't match styles, rows, names." It came back with a plan covering mismatched column names, missing rows, and formatting differences, the exact things that crash on a real user's messy file. I didn't have to enumerate the failure modes; it did.

It pushed back when I was wrong. At one point, I asked, "Is this even SaaS?" and instead of just agreeing, it helped me rethink the architecture (moving to a managed backend) so what I was building actually matched what I claimed to be building. Those reframes saved me from shipping something half-right.

A security review I didn't know I needed. I asked it to check the project "as a cybersecurity dev," and it reviewed the real code and flagged real issues, not generic advice. Later, realizing a public SaaS should never leak infrastructure or service names to a regular user, I had it sweep the entire project and the README to strip them out. It caught spots I'd have missed before pushing to a public repo.

Design direction, not just implementation. The UI was my weak point. It helped me redesign around a stepper-style multi-step flow, a card-style dropzone, and a minimal cream-white theme with Framer Motion, even researching current small-SaaS design trends so the redesign was grounded in something real instead of my gut feeling.

The first run just worked. I said, "fire up whatever services are necessary and run the app," expecting to lose an hour wiring up the backend and chasing missing config. Instead, it brought the services up, got it running, and wrote the run instructions straight into the README. The evening I'd budgeted for "why won't this start locally" simply didn't happen.

The pattern across all of it: I described what I wanted, often vaguely, and Copilot filled in the how: the plan, the structure, the edge cases, the security gaps, the design language, all while keeping the whole project in its head.

Tooling Note

I also tested different AI coding assistants while finishing this project, and the experience taught me something important: the model matters, but the way you direct it matters even more.

Some tools were useful but often produced familiar or generic answers unless I pushed them with very specific product, UX, and edge-case constraints. Gemini was especially helpful for exploring interface directions when guided clearly, although I later refined parts of the UI again during implementation. Claude Sonnet was useful for reasoning and cleanup, but I had to keep pushing it away from safe, predictable answers.

The strongest coding experience for this rebuild came from GitHub Copilot, running its GPT-5.3-Codex and GPT-5.4 models. It handled the project context, implementation steps, refactors, and debugging with much less friction. GitHub Copilot’s auto mode was also a major part of the experience: being able to approve changes quickly and let it move through files almost hands-free made the rebuild feel much closer to pair programming than copy-pasting snippets.

There are trade-offs. Planning modes are great for architecture and sequencing, but sometimes limiting when you want to immediately move from plan to code. Still, that separation helped me think more clearly before implementation.

The main lesson: AI coding tools are not magic buttons. They become powerful when you give them constraints, force them to reason about edge cases, and keep ownership of the product decisions yourself.
