I built an AI tool that turns survey data into research papers — here's the architecture
Hey DEV community! I'm a solo founder building AI tools for researchers. My latest product is Data2Paper — it takes raw survey/questionnaire export data and produces complete research paper drafts.
The problem
Researchers collect survey data → export CSV → spend weeks turning it into a paper.
The manual workflow looks like this:
- Clean the exported data (fix encoding, remove junk rows, identify the actual response sheet)
- Recode variables and set up analysis frameworks
- Run statistical tests in SPSS/R/Python
- Build tables and charts
- Write methodology, results, and discussion sections
- Format everything into a deliverable document
Data2Paper compresses that entire workflow into a single pipeline.
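To make the first manual step concrete, here's a minimal sketch of export cleanup in pandas. The heuristics (drop fully-empty rows, reset the index) are illustrative, not Data2Paper's actual code:

```python
import io
import pandas as pd

def clean_export(raw_csv: str) -> pd.DataFrame:
    """Load a survey export and drop obvious junk rows (illustrative heuristic)."""
    df = pd.read_csv(io.StringIO(raw_csv))
    # Rows where every cell is empty are export artifacts, not responses
    df = df.dropna(how="all")
    return df.reset_index(drop=True)

# A toy export with one blank junk row in the middle
raw = "Q1,Q2\n5,4\n,\n3,2\n"
cleaned = clean_export(raw)
```

Real exports need more than this (encoding fixes, junk *columns*, partial submissions), but the shape of the step is the same.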
Architecture overview
┌─────────────┐
│ Upload │ CSV / XLSX / XLS
│ (Survey │ from any questionnaire platform
│ Export) │
└──────┬──────┘
│
▼
┌─────────────┐
│ Data │ Identify response sheet vs summary
│ Intake │ Parse machine headers (Q1, SC2...)
│ │ Detect variable types
└──────┬──────┘
│
▼
┌─────────────┐
│ Analysis │ Python execution chain
│ Engine │ Statistical tests based on variable types
│ │ Generate charts & tables
└──────┬──────┘
│
▼
┌─────────────┐
│ Paper │ Multi-language (7 languages)
│ Generation │ Full academic structure
│ │ Claude API
└──────┬──────┘
│
▼
┌─────────────┐
│ Export │ PDF / Word / LaTeX / ZIP
│ & Delivery │
└─────────────┘
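The stages above can be sketched as a sequential pipeline where each stage consumes the previous stage's output. The stage functions below are hypothetical stand-ins, not the real implementations:

```python
from dataclasses import dataclass, field

@dataclass
class PipelineResult:
    # Names of stages that completed, in order
    completed: list = field(default_factory=list)

def run_pipeline(upload, stages):
    """Feed each stage the previous stage's output, recording progress."""
    result = PipelineResult()
    data = upload
    for name, stage in stages:
        data = stage(data)
        result.completed.append(name)
    return result, data

# Hypothetical stages standing in for intake, analysis, generation, export
stages = [
    ("intake",   lambda d: {"responses": d["csv"].strip().count("\n")}),
    ("analysis", lambda d: {**d, "tests": ["chi-square"]}),
    ("paper",    lambda d: {**d, "sections": ["methods", "results"]}),
    ("export",   lambda d: {**d, "formats": ["pdf", "docx"]}),
]
result, artifact = run_pipeline({"csv": "Q1,Q2\n5,4\n3,2\n"}, stages)
```

A linear chain like this also makes it easy to surface per-stage progress to the user and to retry a single failed stage.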
Key technical decisions
Why Python execution instead of LLM-generated stats?
Language models can hallucinate numbers. For a research tool, that's unacceptable. The analysis engine runs actual Python code to compute statistics — correlation, regression, chi-square, ANOVA, etc. The LLM interprets the results, but doesn't generate them.
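Here's a sketch of the "compute, don't generate" principle, using scipy as an example library (the data and choice of tests are illustrative): the numbers come from real statistical calls, and only the computed results would ever be handed to the LLM to interpret.

```python
from scipy import stats

# Toy Likert-style responses from two groups (illustrative data)
group_a = [5, 4, 4, 5, 3, 4, 5]
group_b = [2, 3, 3, 2, 4, 3, 2]

# Actual computed statistics — never LLM-generated numbers
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Only this structured summary would reach the LLM for interpretation
summary = {"test": "independent t-test",
           "t": round(float(t_stat), 3),
           "p": round(float(p_value), 4)}
```

The division of labor is strict: the code produces `summary`, and the LLM's job is limited to writing prose *about* `summary`.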
Why survey-specific, not generic?
Generic "data to text" tools don't understand that row 1 might be a machine header, that columns might represent Likert scales, or that the first sheet might be a summary rather than raw data. By focusing specifically on survey exports, the system handles these patterns reliably.
Why multi-language from day one?
Research is global. A tool that only outputs English misses a huge segment of users — Chinese grad students, European consulting teams, Japanese research groups. Supporting 7 languages natively in the generation pipeline (generating in the target language rather than translating English output) was a deliberate product decision.
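One way generation-first multilinguality can work, sketched below: the target language is baked into the generation instruction itself. The prompt wording, language list, and payload shape are assumptions for illustration, not Data2Paper's actual prompts:

```python
# Hypothetical language set — the post says 7 languages but not which ones
SUPPORTED = {"en": "English", "zh": "Chinese", "ja": "Japanese",
             "de": "German", "fr": "French", "es": "Spanish", "ko": "Korean"}

def build_generation_request(lang_code, stats_summary):
    """Assemble a Claude-style request asking for a paper section written
    directly in the target language — not written in English and translated."""
    language = SUPPORTED[lang_code]
    system = (f"You are an academic writing assistant. Write directly in "
              f"{language}. Do not write in English and translate.")
    user = ("Write the Results section of a survey study from these "
            f"computed statistics: {stats_summary}")
    return {"system": system,
            "messages": [{"role": "user", "content": user}]}

request = build_generation_request("ja", {"test": "t-test", "t": 3.89, "p": 0.002})
```

Generating natively tends to produce idiomatic academic register in each language, which post-hoc translation of English output often loses.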
Tech stack
- Frontend/Backend: Next.js on Vercel
- AI: Claude API
- Analysis: Python execution chain
- Payments: Stripe
- Export: PDF, DOCX, LaTeX rendering
Try it
If you work with survey data or know someone in academia who does: datatopaper.com
I'd love feedback from the DEV community, especially around the analysis pipeline design and the multi-language generation approach. Drop a comment or reach out!