visiohex

Posted on • Originally published at apacitationgenerator.online

How I Built an AI-Powered APA Citation Generator with Next.js, Supabase, and PayPal

If you have ever tried to cite random web pages in APA style, you know the pain:

  • missing authors
  • missing publication dates
  • inconsistent metadata
  • lots of manual fixing after "auto generation"

I built APA Citation Generator to solve that specific workflow:

  • accept a URL or DOI
  • extract metadata with rule-first parsing
  • use AI only for missing fields
  • output APA reference + in-text citation
  • clearly mark confidence and review warnings

Live demo: https://apacitationgenerator.online

Why I built this

Most citation tools work great on clean sources, but fail on real-world pages. In practice, users still need to:

  1. open the page manually
  2. find author/date by hand
  3. patch placeholders like (n.d.)

My goal was not to "generate everything with AI".
It was to build a fast, reviewable citation pipeline where users can trust what was extracted and quickly fix what is uncertain.

Tech stack

  • Frontend/App: Next.js 16 (App Router)
  • DB/Auth: Supabase (custom apa schema, Google OAuth)
  • AI: OpenRouter (Gemini model fallback chain)
  • Payments: PayPal (one-time credit packs + subscription)
  • Deploy: Vercel
  • Domain/DNS: Cloudflare

Core architecture

1) Input classification

User input is classified as either:

  • URL
  • DOI

Anything else is rejected early with explicit errors.
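The classification step can be sketched like this (function and type names are illustrative, not the production code; the DOI regex also accepts common `doi.org` URL prefixes):

```typescript
type InputKind = "url" | "doi";

interface ClassifiedInput {
  kind: InputKind;
  value: string;
}

// DOIs start with "10." followed by a registrant code and a suffix.
const DOI_PATTERN = /^10\.\d{4,9}\/\S+$/i;

function classifyInput(raw: string): ClassifiedInput {
  const input = raw.trim();

  // Accept bare DOIs and doi.org-style URLs.
  const doi = input.replace(/^https?:\/\/(dx\.)?doi\.org\//i, "");
  if (DOI_PATTERN.test(doi)) {
    return { kind: "doi", value: doi };
  }

  // Fall back to URL parsing; only http(s) schemes are allowed.
  try {
    const url = new URL(input);
    if (url.protocol === "http:" || url.protocol === "https:") {
      return { kind: "url", value: url.href };
    }
  } catch {
    // not a parseable URL; fall through to the explicit error
  }

  throw new Error("Input must be an http(s) URL or a DOI (e.g. 10.1000/xyz123)");
}
```

Rejecting everything else up front keeps the downstream extractors simple: each one only ever sees the input shape it was written for.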

2) Rule-first metadata extraction

For URLs, I first parse metadata from HTML:

  • og:title
  • og:site_name
  • article:published_time
  • common author/date tags

For DOI, I query Crossref.
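The HTML side of this rule-first pass can be sketched as a meta-tag scan (regex-based here for brevity; production code would use a real HTML parser, and these names are illustrative):

```typescript
interface PageMeta {
  title?: string;
  siteName?: string;
  publishedTime?: string;
  author?: string;
}

// Pull the content attribute of a <meta> tag, in either attribute order.
function metaContent(html: string, property: string): string | undefined {
  const patterns = [
    new RegExp(
      `<meta[^>]+(?:property|name)=["']${property}["'][^>]+content=["']([^"']+)["']`, "i"),
    new RegExp(
      `<meta[^>]+content=["']([^"']+)["'][^>]+(?:property|name)=["']${property}["']`, "i"),
  ];
  for (const re of patterns) {
    const m = html.match(re);
    if (m) return m[1];
  }
  return undefined;
}

function extractPageMeta(html: string): PageMeta {
  return {
    title: metaContent(html, "og:title"),
    siteName: metaContent(html, "og:site_name"),
    publishedTime: metaContent(html, "article:published_time"),
    author: metaContent(html, "author") ?? metaContent(html, "article:author"),
  };
}
```

For DOIs, the Crossref REST API (`https://api.crossref.org/works/<doi>`) returns structured metadata directly, so no scraping is needed on that path.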

3) AI completion only for missing fields

If critical fields are missing, I send a trimmed text snapshot to AI and ask for strict JSON output:

  • authors
  • publicationDate
  • title
  • containerTitle

This kept latency low and reduced hallucination risk versus full-AI generation.
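A sketch of that step: compute which critical fields are empty, build a prompt that asks only for those, and parse the reply defensively. The exact production prompt is not shown in the post, so this shape is an assumption:

```typescript
interface CitationMeta {
  authors?: string[];
  publicationDate?: string;
  title?: string;
  containerTitle?: string;
}

const CRITICAL_FIELDS: (keyof CitationMeta)[] = [
  "authors", "publicationDate", "title", "containerTitle",
];

function missingFields(meta: CitationMeta): (keyof CitationMeta)[] {
  return CRITICAL_FIELDS.filter((f) => {
    const v = meta[f];
    return v === undefined || (Array.isArray(v) && v.length === 0);
  });
}

function buildPrompt(missing: string[], textSnapshot: string): string {
  return [
    `Extract ONLY these fields from the page text: ${missing.join(", ")}.`,
    `Respond with a single JSON object containing exactly those keys.`,
    `Use null for anything you cannot find. Do not guess.`,
    ``,
    textSnapshot.slice(0, 4000), // trimmed snapshot keeps latency and cost down
  ].join("\n");
}

// Strict parse: strip a possible code fence, then reject anything
// that is not a bare JSON object.
function parseStrictJson(raw: string): Partial<CitationMeta> {
  const trimmed = raw.trim().replace(/^```(?:json)?|```$/g, "").trim();
  if (!trimmed.startsWith("{")) throw new Error("Model did not return JSON");
  return JSON.parse(trimmed);
}
```

Because the model only ever fills gaps, a bad completion can at worst degrade one field, and the rule-extracted fields are never overwritten.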

4) APA formatter + confidence model

The formatter creates:

  • full reference
  • in-text citation

Then a confidence score is calculated from available fields and inference flags.
Low-confidence outputs are labeled Needs Review with specific warnings.
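A minimal sketch of the formatter-plus-confidence idea. The field weights, the 0.7 review threshold, and the penalty for AI-inferred fields are illustrative assumptions, not the production values:

```typescript
interface Meta {
  authors?: string[];
  publicationDate?: string; // e.g. "2023-04-01"
  title?: string;
  containerTitle?: string;
  inferredFields?: string[]; // fields the AI filled in
}

function apaReference(m: Meta, url: string): string {
  // With no author, APA moves the title into the author position.
  const author = m.authors?.length ? m.authors.join(", ") : (m.title ?? "Untitled");
  const year = m.publicationDate ? m.publicationDate.slice(0, 4) : "n.d.";
  const titlePart = m.authors?.length && m.title ? ` ${m.title}.` : "";
  const container = m.containerTitle ? ` ${m.containerTitle}.` : "";
  return `${author} (${year}).${titlePart}${container} ${url}`;
}

function inTextCitation(m: Meta): string {
  const who = m.authors?.length ? m.authors[0].split(",")[0] : (m.title ?? "Untitled");
  const year = m.publicationDate ? m.publicationDate.slice(0, 4) : "n.d.";
  return `(${who}, ${year})`;
}

function confidence(m: Meta): { score: number; needsReview: boolean; warnings: string[] } {
  const weights: [keyof Meta, number][] = [
    ["authors", 0.3], ["publicationDate", 0.3], ["title", 0.25], ["containerTitle", 0.15],
  ];
  let score = 0;
  const warnings: string[] = [];
  for (const [field, w] of weights) {
    const v = m[field];
    const present = Array.isArray(v) ? v.length > 0 : Boolean(v);
    if (!present) { warnings.push(`Missing ${field}`); continue; }
    // AI-inferred fields still count, but at reduced weight.
    score += m.inferredFields?.includes(field) ? w * 0.6 : w;
  }
  return { score, needsReview: score < 0.7, warnings };
}
```

The warnings list is what powers the "Needs Review" label: the user sees exactly which field is missing or inferred instead of a bare low score.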

Monetization model

I used a mixed model:

  • free daily quota
  • paid credit packs
  • monthly Pro subscription

This works better than a hard paywall for utility tools.
Users can test value first, then upgrade for volume.
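Spending a credit has to be safe under concurrent requests (this is the compare-and-set approach mentioned in the hardening section below). A sketch against an in-memory store; with Supabase the equivalent is an UPDATE guarded by a WHERE clause on the expected balance, and all names here are illustrative:

```typescript
interface Account { id: string; credits: number }

const store = new Map<string, Account>();

function read(id: string): Account {
  const acct = store.get(id);
  if (!acct) throw new Error("unknown account");
  return { ...acct };
}

// The write only succeeds if the balance is still the value we read.
function compareAndSetCredits(id: string, expected: number, next: number): boolean {
  const acct = store.get(id);
  if (!acct || acct.credits !== expected) return false;
  store.set(id, { ...acct, credits: next });
  return true;
}

function spendOneCredit(id: string, maxRetries = 3): boolean {
  for (let i = 0; i < maxRetries; i++) {
    const { credits } = read(id);
    if (credits <= 0) return false; // out of credits
    if (compareAndSetCredits(id, credits, credits - 1)) return true;
    // Another request spent concurrently; re-read and retry.
  }
  return false;
}
```

A plain read-then-write here would let two simultaneous requests both see the same balance and each deduct from it, which is exactly the race the compare-and-set guard closes.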

SEO strategy (early stage)

Primary keyword: "APA Citation Generator".

Then I added related landing pages and long-tail clusters:

  • no author citation
  • no date citation
  • DOI to APA
  • in-text citation guides

I also shipped:

  • sitemap
  • robots
  • metadata/canonical
  • structured data (FAQ/Breadcrumb/SoftwareApplication)
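As an illustration, the FAQ structured data is a JSON-LD object rendered into a `<script type="application/ld+json">` tag; the question text below is a made-up example based on the long-tail topics above, not the live site's copy:

```typescript
// Illustrative FAQPage JSON-LD payload (schema.org vocabulary).
const faqJsonLd = {
  "@context": "https://schema.org",
  "@type": "FAQPage",
  mainEntity: [
    {
      "@type": "Question",
      name: "How do I cite a web page with no author in APA?",
      acceptedAnswer: {
        "@type": "Answer",
        text: "Move the title into the author position and keep the date, site name, and URL.",
      },
    },
  ],
};

// Serialized form, ready to embed in the page <head>.
const faqScriptBody = JSON.stringify(faqJsonLd);
```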

What I had to harden before launch

After initial release, I did a security pass and fixed a few important issues:

  1. SSRF protections on URL fetch
     • block localhost/internal hosts/private IP ranges
     • restrict ports
     • safe redirect handling
  2. OAuth callback open redirect
     • allow only internal next paths
  3. Edit API authorization
     • verify auth + ownership before updating saved citation jobs
  4. Quota/credit race conditions
     • switched to compare-and-set style updates for concurrent safety
  5. CSP + security headers
     • nonce-based CSP
     • frame/content/referrer protections
  6. Lead form anti-bot
     • honeypot + timing checks + per-IP daily limit
     • optional Cloudflare Turnstile support
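The SSRF checks can be sketched as a pre-fetch validator. Production code should additionally resolve DNS and re-validate the resolved IP, and re-run the check on every redirect hop; the host list and port policy here are illustrative:

```typescript
const ALLOWED_PORTS = new Set(["", "80", "443"]); // "" = scheme default

const BLOCKED_HOSTNAMES = new Set(["localhost", "0.0.0.0", "[::1]"]);

function isPrivateIpv4(host: string): boolean {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const [a, b] = [Number(m[1]), Number(m[2])];
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local, incl. cloud metadata
  );
}

function assertSafeFetchTarget(rawUrl: string): URL {
  const url = new URL(rawUrl);
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error("Only http(s) URLs may be fetched");
  }
  if (!ALLOWED_PORTS.has(url.port)) {
    throw new Error(`Port ${url.port} is not allowed`);
  }
  const host = url.hostname.toLowerCase();
  if (BLOCKED_HOSTNAMES.has(host) || isPrivateIpv4(host)) {
    throw new Error("Refusing to fetch internal/private hosts");
  }
  return url;
}
```

The redirect half of the fix follows the same shape: fetch with redirects disabled, and run each Location target back through the validator before following it.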

Lessons learned

  1. Rule-first + AI fallback beats AI-only for this use case.
  2. Explain uncertainty in UI. Users accept imperfection when they can see why.
  3. Monetization should follow workflow friction, not just pageviews.
  4. Security hardening early saves painful migrations later.

Current limitations

  • still focused on URL + DOI flows
  • some edge cases need manual verification (as expected in citation work)
  • content and internal-link SEO still improving

What’s next

  • better source-specific parsing heuristics
  • export formats (BibTeX/Word)
  • stronger account history and saved references UX
  • deeper analytics for conversion and retention

If you build in the education/productivity SEO space, I’d love feedback on two things:

  1. Which citation edge cases break most often in your experience?
  2. Which growth loop would you prioritize first: browser extension, exports, or team collaboration?

I’m happy to share implementation details if anyone wants to replicate this architecture.
