Zeyrian Faris

Posted on Jun 28

Testing fourpointo Against Malicious Uploads: Prompt Injection and Stored XSS

#ai #security #llm #python

fourpointo is a self-hosted Flask app I built that generates AI-powered task checklists and rubric breakdowns from uploaded assignment PDFs. It uses Groq's LLaMA 3.3 70B for extraction, SQLite for storage and Gunicorn behind a Cloudflare Tunnel.

After fixing a magic-byte validation bug found during normal use (a fake PDF was causing an unhandled crash in PyMuPDF), I wanted to go further and actually probe the upload pipeline the way an attacker might. This writeup documents that process: the setup, the test cases and the results.

All testing was done locally in a Kali VM against a local copy of fourpointo. No production data or other users were involved at any point.

Setup

The first step was building a valid baseline PDF to work from. A blank or garbage file wouldn't tell me anything useful since fourpointo already rejects those.

This gave me real.pdf, a small but genuinely valid PDF I could use as raw material for the rest of the testing.

Test 1: Input Validation (Magic Bytes and Content Gate)

Before testing anything more advanced, I wanted to confirm the existing validation layers actually worked. fourpointo has two gates:

A magic-byte check that rejects files that aren't structurally PDFs
A content gate (an LLM call) that rejects PDFs that don't read like an assignment specification

To test the second gate, I truncated a valid PDF so the header would pass but the internal structure would be incomplete.

Uploading the truncated file produced a clean rejection rather than a crash or a silent failure.

Result: Both gates behaved as intended. The truncated file still had a valid PDF header (since the magic bytes survive truncation) but lacked enough coherent content for the second gate to accept it, and it was rejected with a clear message instead of breaking anything downstream.

Building a Realistic Specification PDF

To test prompt injection and stored XSS properly, I needed a PDF that would actually pass the content gate, meaning it had to look like a real assignment specification with a Tasks section and a Grading Criteria section.

I wrote a small Python script using fpdf2 to generate this on demand, which also made it easy to produce variants later by editing one line at a time.

Once the script ran cleanly, I uploaded the resulting spec.pdf as a baseline with no malicious content. It passed the content gate and generated a normal checklist.

This confirmed the test harness worked end to end before introducing anything adversarial.

Test 2: Prompt Injection

fourpointo pipes extracted PDF text directly into an LLM call to generate tasks and rubric criteria. This is exactly the kind of pipeline that's vulnerable to prompt injection if the document content isn't treated as untrusted input.

Rather than using an obvious phrase like "ignore all previous instructions," I used a more disguised pattern that mimics how a real injection attempt would try to impersonate a system-level instruction embedded inside user-supplied content.

I built this into a new PDF and confirmed it was generated correctly.

After uploading it and generating the checklist, I checked the raw JSON output for the rubric extraction.

The injected text never appears anywhere in the output. The model extracted the real tasks and the real grading criteria and nothing else.

I also checked the rendered project page in the browser to confirm the same thing held true for the task list, not just the rubric.

Result: The injection attempt failed. fourpointo's extraction stayed accurate even when an instruction-like phrase was embedded directly inside the task text. It's worth noting that fourpointo's AI layer only reads and displays content. It doesn't grade anything or make decisions that affect a real outcome, so even a successful injection here would have limited impact beyond corrupting the displayed checklist. Still, this is a meaningful result because it shows the extraction step doesn't blindly follow instructions found inside untrusted document content.

Test 3: Stored XSS via PDF Content

Next, I wanted to know whether a script tag embedded inside a PDF's task text could survive the full pipeline (extraction, storage and rendering) and execute in the browser. I disguised the payload as part of a legitimate-looking task rather than dropping it in isolation, since that's closer to how a real attacker would try to slip it past a casual read.

I uploaded the resulting PDF and generated the checklist, then checked every task in the rendered UI.

No alert fired, and none of the displayed tasks contained the script tag at all. Looking at the task labels (Analyze Dataset, Write Report, Check Formatting, Create Charts, Submit Report), it's clear the AI isn't echoing the raw text back. It's paraphrasing each task into a short label before it ever reaches the frontend.

Result: No XSS was observed, but the reason matters. This wasn't a deliberate sanitization control catching the payload. It was the summarization step rewriting the content before the payload could survive long enough to be rendered. If fourpointo's prompt ever changes to extract tasks verbatim instead of summarizing them, this protection could disappear without anyone noticing. It's a real mitigation today, but not one to rely on long term.

Test 4: Stored XSS via Direct Form Input

Since the PDF-based test was inconclusive about actual output escaping (the payload never survived to be rendered), I tested the rendering layer directly by typing a script tag straight into the project name field. This field gets stored and displayed with no LLM step in between, which isolates the question of whether the frontend escapes HTML on output.

After creating the project, I checked how the name rendered on the project page and in the sidebar.

The page title and sidebar entry both showed the literal text <script>alert('XSS via project name')</script> instead of executing it.

Result: Confirmed. fourpointo properly escapes this field on render. The script tag is treated as plain text rather than executable HTML, almost certainly because the frontend rendering layer (whether that's React's default text handling or template auto-escaping) doesn't interpret stored strings as raw HTML.

One caveat worth being precise about here: since fourpointo currently has no sharing, admin view, or export feature, this would technically be self-XSS rather than exploitable stored XSS, because there's no path for the payload to reach another user's session. The mechanism is confirmed either way, and the escaping itself is a real control that would matter immediately if any multi-user-facing feature (sharing, an admin dashboard or exported reports) were added later.

Summary of Findings

Test	Result	Notes
Magic-byte validation	Pass	Rejects non-PDF files correctly
Content gate validation	Pass	Rejects structurally invalid or unconvincing specs
Prompt injection (disguised)	Pass	Injected instruction was not followed
Stored XSS via PDF content	Pass (incidental)	Payload never survived extraction due to paraphrasing, not deliberate sanitization
Stored XSS via form input	Pass (deliberate)	Output is properly HTML-escaped on render

Recommendations

Continue treating all extracted document text as untrusted input. The current prompt injection resistance held up in this round of testing, but it should be re-verified any time the extraction prompt or model changes.
Don't rely on the LLM's paraphrasing behavior as a security control against XSS. If task extraction is ever changed to preserve verbatim text, explicit output encoding should be added at that point rather than assumed.
The form-field escaping currently in place is a genuine and correct control. If sharing, an admin view or export features are added in the future, this is the exact protection that would prevent a self-XSS finding from becoming a real stored XSS affecting other users.

Conclusion

Across four separate test categories, fourpointo held up better than expected for a personal project with a small user base. Input validation correctly filters malformed and irrelevant uploads, the AI extraction layer resisted a disguised prompt injection attempt and the frontend properly escapes user-controlled output. The one point worth continued attention is that the XSS resistance in the document pipeline is currently a side effect of summarization rather than a deliberate control, which is the kind of gap that's easy to lose track of as the app evolves.

DEV Community