This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
PaySnap is an AI-powered wage theft detector that helps workers
understand their paystubs and recover stolen wages, in their own language.
Every year, $50 billion is stolen from American workers through wage
theft. Construction workers, restaurant staff, and farmworkers are hit hardest. They don't know their rights. They can't read their paystub. Many are afraid to report.
PaySnap changes that. A worker uploads a paystub photo, or describes
their situation in Hindi, Spanish, or Chinese, and PaySnap tells them
exactly what they're owed, which law was broken, and how to report it.
Try it live: paysnap.vercel.app
Key features:
- Gemma 4 E2B fine-tuned on 365,393 real DOL enforcement cases
- Native Gemma 4 function calling: the model itself decides which tools to call
- Reads paystub photos via Gemma 4 vision (llama.cpp)
- Explains violations in 11 languages
- Detects overtime, illegal deductions, minimum wage violations
- Always provides the DOL hotline: 1-866-487-9243 (free, confidential)
Demo
Live app: https://paysnap.vercel.app/
Scenario 1 — Texas construction worker:
- Input: 52 hours, $15/hour, Texas, no overtime shown
- PaySnap detects: 12 unpaid overtime hours
- Result: $90 owed under FLSA 29 USC 207(a)(1)
Scenario 2 — New York restaurant worker (Hindi):
- Input: 48 hours, $16/hour, NY, UNIFORM $35 + BREAKAGE $50 deductions
- PaySnap detects: overtime violation + 2 illegal deductions
- Result: $149 owed — full explanation in Hindi
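The dollar figures in both scenarios can be reproduced with a few lines of arithmetic. The sketch below is not PaySnap's actual code. It assumes the employer already paid straight time for every hour worked, so only the half-time overtime premium is outstanding, and it assumes the listed deductions are fully recoverable. The `Paystub` class and `wages_owed` function are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple

FLSA_OVERTIME_THRESHOLD = 40.0  # hours per workweek under 29 USC 207(a)(1)
OVERTIME_PREMIUM = 0.5          # assumption: straight time was paid, so only the extra half is owed


@dataclass
class Paystub:
    """Hypothetical container for the fields used in the two demo scenarios."""
    hours_worked: float
    hourly_rate: float
    illegal_deductions: Tuple[float, ...] = ()


def wages_owed(stub: Paystub) -> float:
    """Unpaid overtime premium plus any deductions treated as illegal."""
    overtime_hours = max(0.0, stub.hours_worked - FLSA_OVERTIME_THRESHOLD)
    unpaid_premium = overtime_hours * stub.hourly_rate * OVERTIME_PREMIUM
    return unpaid_premium + sum(stub.illegal_deductions)


# Scenario 1: 52 hours at $15/hour, no overtime shown
print(wages_owed(Paystub(52, 15.0)))                # 90.0

# Scenario 2: 48 hours at $16/hour, UNIFORM $35 + BREAKAGE $50 deductions
print(wages_owed(Paystub(48, 16.0, (35.0, 50.0))))  # 149.0
```

If the overtime hours had not been paid at all, the amount owed would be the full 1.5x rate for those hours rather than the half-time premium, so the result depends on what the paystub actually shows.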
Code
GitHub: https://github.com/Aadarsh-Praveen/Paysnap
Fine-tuned model (GGUF):
https://huggingface.co/Aadarsh-Praveen/paysnap-gemma4-gguf
LoRA weights:
https://huggingface.co/Aadarsh-Praveen/paysnap-gemma4-lora
Training notebook:
https://kaggle.com/code/aadarshpraveen/paysnap-gemma4-finetuning
Dataset (365,393 DOL cases):
https://kaggle.com/datasets/aadarshpraveen/paysnap-labor-law-dataset
How I Used Gemma 4
I chose Gemma 4 E2B for three specific reasons:
1. Edge deployment: The workers PaySnap serves often use older
devices. E2B runs at 63 tokens/second on an Apple M3 Pro and fits
in 3.4 GB as a Q4_K_M GGUF. A much larger model would not fit on those devices.
2. Fine-tuning efficiency: I fine-tuned E2B on 365,393 real
DOL enforcement cases using Unsloth LoRA on a Kaggle T4 GPU.
Training loss reached 0.009. Fine-tuning a 31B model would not
have been feasible on free compute.
3. Multilingual capability: Despite its small size, E2B
generates coherent responses in Hindi, Spanish, Chinese, and 8
other languages, which is critical for reaching vulnerable workers.
Four ways Gemma 4 powers PaySnap:
- Vision: reads paystub photos via the llama.cpp multimodal API
- Native function calling: Gemma 4 autonomously decides which tools to call (calculate_overtime, check_deductions, get_applicable_statutes, get_dol_contact)
- Fine-tuned knowledge: learned real DOL enforcement patterns from 365,393 cases, +11.7% improvement on LLM-as-Judge eval
- Multilingual explanation: explains violations in the worker's language with exact statute citations
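To make the function-calling step concrete, here is a minimal sketch of how an application might route a model-emitted tool call to local Python functions. Only the tool names come from the list above. The argument names, the return shapes, and the `{"name": ..., "arguments": ...}` JSON layout are assumptions for illustration, not Gemma 4's actual tool-call format or PaySnap's implementation. Two of the four tools are shown and the other two are omitted for brevity.

```python
import json
from typing import Any, Callable, Dict


def calculate_overtime(hours_worked: float, hourly_rate: float) -> Dict[str, float]:
    """Hypothetical signature: half-time premium on hours beyond 40."""
    overtime_hours = max(0.0, float(hours_worked) - 40.0)
    return {
        "overtime_hours": overtime_hours,
        "premium_owed": overtime_hours * hourly_rate * 0.5,
    }


def get_dol_contact() -> Dict[str, str]:
    """Returns the DOL hotline quoted in this post."""
    return {"hotline": "1-866-487-9243"}


# Registry mapping tool names to callables; check_deductions and
# get_applicable_statutes would be registered the same way.
TOOLS: Dict[str, Callable[..., Any]] = {
    "calculate_overtime": calculate_overtime,
    "get_dol_contact": get_dol_contact,
}


def dispatch(tool_call_json: str) -> Any:
    """Parse one tool call (assumed JSON shape) and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


result = dispatch(
    '{"name": "calculate_overtime", "arguments": {"hours_worked": 52, "hourly_rate": 15}}'
)
print(result)
```

Keeping the tools in an explicit registry means the model can only trigger functions the application has chosen to expose, and an unrecognized name fails loudly instead of being silently ignored.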
Evaluation (LLM-as-Judge, base Gemma 4 E2B as judge):
Base Gemma 4 E2B: 8.12/10
PaySnap fine-tuned: 9.07/10
Improvement: +11.7%
All 5 dimensions improved: Legal Accuracy +1.73, Statute Quality
+1.33, Actionability +0.73, Dollar Accuracy +0.67, Worker Clarity +0.27
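The headline improvement follows directly from the two overall scores. The short check below recomputes it and also averages the five per-dimension gains. Variable names are illustrative, and the averaging step is an assumption about how the overall score relates to the dimensions, not a description of the actual eval script.

```python
base_score = 8.12   # base Gemma 4 E2B, out of 10
tuned_score = 9.07  # PaySnap fine-tuned, out of 10

# Relative improvement of the fine-tuned model over the base model
relative_gain = (tuned_score - base_score) / base_score * 100
print(f"{relative_gain:.1f}%")  # 11.7%

# Per-dimension gains as reported above
dimension_gains = {
    "Legal Accuracy": 1.73,
    "Statute Quality": 1.33,
    "Actionability": 0.73,
    "Dollar Accuracy": 0.67,
    "Worker Clarity": 0.27,
}
mean_gain = sum(dimension_gains.values()) / len(dimension_gains)
print(f"{mean_gain:.2f}")  # 0.95
```

The mean per-dimension gain of about 0.95 matches the 0.95-point difference between the overall scores, which is consistent with the overall score being an unweighted average of the five dimensions.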
Team
This project was built by:
Aadarsh Praveen Selvaraj Ajithakumari — @aadarsh_praveen
Suriya Kasiyalan Siva — @suriya_ks_0902