<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: SAMUELVANUNU</title>
    <description>The latest articles on DEV Community by SAMUELVANUNU (@samuelvanunu).</description>
    <link>https://dev.to/samuelvanunu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4013434%2F6fc36036-30dd-430f-a40e-f5c6c20e5a60.png</url>
      <title>DEV Community: SAMUELVANUNU</title>
      <link>https://dev.to/samuelvanunu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/samuelvanunu"/>
    <language>en</language>
    <item>
      <title>I Built a File Analyzer During My Summer Break — Here's What I Learned</title>
      <dc:creator>SAMUELVANUNU</dc:creator>
      <pubDate>Fri, 03 Jul 2026 10:46:43 +0000</pubDate>
      <link>https://dev.to/samuelvanunu/i-built-a-file-analyzer-during-my-summer-break-heres-what-i-learned-5di7</link>
      <guid>https://dev.to/samuelvanunu/i-built-a-file-analyzer-during-my-summer-break-heres-what-i-learned-5di7</guid>
      <description>&lt;h2&gt;The Idea&lt;/h2&gt;

&lt;p&gt;I wanted a summer break project that wasn't just another to-do app: something that would touch a few different skills (backend API design, error handling, PDF generation, and a bit of frontend) without being too ambitious to finish.&lt;/p&gt;

&lt;p&gt;The result: a &lt;strong&gt;File Analysis &amp;amp; Report System&lt;/strong&gt; built with FastAPI. You upload a file (or several), it detects the type, runs a type-specific analyzer, and hands back a structured report — either as JSON or as a downloadable PDF.&lt;/p&gt;

&lt;h2&gt;What It Does&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Supports 7 file types: &lt;code&gt;.py&lt;/code&gt;, &lt;code&gt;.js&lt;/code&gt;, &lt;code&gt;.json&lt;/code&gt;, &lt;code&gt;.csv&lt;/code&gt;, &lt;code&gt;.xml&lt;/code&gt;, &lt;code&gt;.yaml&lt;/code&gt;, &lt;code&gt;.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Each type gets its own analyzer:

&lt;ul&gt;
&lt;li&gt;Python → &lt;code&gt;ast&lt;/code&gt; module for syntax checking and import detection&lt;/li&gt;
&lt;li&gt;JSON/YAML → parses and reports validity + top-level keys&lt;/li&gt;
&lt;li&gt;CSV → column consistency checks&lt;/li&gt;
&lt;li&gt;XML → element tree parsing&lt;/li&gt;
&lt;li&gt;Markdown → heading/link/code-block counting&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Single or batch upload, with a live web UI&lt;/li&gt;
&lt;li&gt;PDF export (single file or combined batch report)&lt;/li&gt;
&lt;li&gt;Dockerized — &lt;code&gt;docker run&lt;/code&gt; and you're up&lt;/li&gt;
&lt;/ul&gt;
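
&lt;p&gt;To make the Python analyzer concrete, here is a minimal sketch of syntax checking and import detection with the standard-library &lt;code&gt;ast&lt;/code&gt; module. The function name &lt;code&gt;analyze_python&lt;/code&gt; and the shape of the returned dict are assumptions for illustration, not the project's actual code.&lt;/p&gt;

```python
import ast


def analyze_python(source: str) -> dict:
    """Report syntax validity and imported module names (hypothetical report shape)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # A syntax error becomes part of the report instead of propagating.
        return {"valid": False, "error": f"line {exc.lineno}: {exc.msg}", "imports": []}

    imports = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os, sys" yields one alias per module.
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from json import loads" records "json"; bare relative imports are skipped.
            imports.add(node.module)
    return {"valid": True, "error": None, "imports": sorted(imports)}
```

&lt;p&gt;Because &lt;code&gt;ast.parse&lt;/code&gt; only parses the source and never executes it, this kind of check is safe to run on uploaded files.&lt;/p&gt;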

&lt;h2&gt;A Few Things I'd Do Differently&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Design for compactness from the start.&lt;/strong&gt;&lt;br&gt;
My first PDF export put each file on its own page. For 10 files, that's 11 pages, mostly blank. I only noticed how wasteful it looked once I actually printed a preview. The fix was switching from forced page breaks to a flowing layout, which cut the page count by ~70%.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Error handling isn't optional, it's the whole point.&lt;/strong&gt;&lt;br&gt;
What broke the app most often during testing was assuming a file would parse cleanly. A single malformed JSON file or an unclosed parenthesis in a Python file would turn the whole request into a 500 error unless the parsing was wrapped in try/except. Once I made &lt;em&gt;every&lt;/em&gt; analyzer fail gracefully into the report format instead of throwing, the whole system got a lot more trustworthy.&lt;/p&gt;
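
&lt;p&gt;The "fail gracefully into the report format" idea can be sketched roughly like this. The helper names &lt;code&gt;safe_analyze&lt;/code&gt; and &lt;code&gt;json_top_level_keys&lt;/code&gt; and the report fields are hypothetical, chosen only to show the pattern.&lt;/p&gt;

```python
import json


def safe_analyze(filename: str, text: str, parse) -> dict:
    """Run a parser and always return a report dict, even when parsing fails."""
    report = {"file": filename, "status": "ok", "error": None, "details": None}
    try:
        report["details"] = parse(text)
    except Exception as exc:  # broad on purpose: any parse failure lands in the report
        report["status"] = "error"
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


def json_top_level_keys(text: str) -> list:
    """Parse JSON and list top-level keys (empty list for non-object documents)."""
    data = json.loads(text)
    return sorted(data) if isinstance(data, dict) else []


good = safe_analyze("ok.json", '{"b": 1, "a": 2}', json_top_level_keys)
bad = safe_analyze("broken.json", '{"a": ', json_top_level_keys)
print(good["status"], good["details"])
print(bad["status"], bad["error"])
```

&lt;p&gt;With this shape, one bad file in a batch produces an error entry in the report while the other files are still analyzed.&lt;/p&gt;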

&lt;p&gt;&lt;strong&gt;3. .gitignore matters more than I thought.&lt;/strong&gt;&lt;br&gt;
Early on I accidentally committed my entire &lt;code&gt;venv/&lt;/code&gt; folder — 3,500+ files. Lesson learned: set up &lt;code&gt;.gitignore&lt;/code&gt; before your first commit, not after.&lt;/p&gt;

&lt;h2&gt;Tech Stack&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend:&lt;/strong&gt; FastAPI, Python's &lt;code&gt;ast&lt;/code&gt;/&lt;code&gt;csv&lt;/code&gt;/&lt;code&gt;xml&lt;/code&gt;/&lt;code&gt;json&lt;/code&gt; stdlib modules, PyYAML&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PDF generation:&lt;/strong&gt; ReportLab, with a Unicode font for proper character support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Vanilla HTML/CSS/JS (no framework — kept it simple), localStorage for history&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Containerization:&lt;/strong&gt; Docker + docker-compose&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Try It&lt;/h2&gt;

&lt;p&gt;Repo's here if you want to poke around or use it: &lt;a href="https://github.com/SamulVanunu/dosya-analiz-sistemi" rel="noopener noreferrer"&gt;github.com/SamulVanunu/dosya-analiz-sistemi&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy to hear feedback, especially on the analyzer architecture — I tried to keep it modular (a &lt;code&gt;BaseAnalyzer&lt;/code&gt; interface) so adding new file types is just a few lines of code, but I'm curious if there's a cleaner pattern.&lt;/p&gt;
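
&lt;p&gt;For readers thinking about that question, one possible shape for such an interface is a base class that registers subclasses by file extension. This is a sketch under assumptions, not the repo's actual code: only the name &lt;code&gt;BaseAnalyzer&lt;/code&gt; comes from the project, while &lt;code&gt;ANALYZERS&lt;/code&gt;, &lt;code&gt;extensions&lt;/code&gt;, &lt;code&gt;MarkdownAnalyzer&lt;/code&gt;, and &lt;code&gt;analyze_file&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
from abc import ABC, abstractmethod

ANALYZERS = {}  # hypothetical registry: file extension -> analyzer class


class BaseAnalyzer(ABC):
    extensions = ()  # subclasses list the extensions they handle

    def __init_subclass__(cls, **kwargs):
        # Runs once per subclass definition, so new analyzers register themselves.
        super().__init_subclass__(**kwargs)
        for ext in cls.extensions:
            ANALYZERS[ext] = cls

    @abstractmethod
    def analyze(self, text: str) -> dict:
        ...


class MarkdownAnalyzer(BaseAnalyzer):
    extensions = (".md",)

    def analyze(self, text: str) -> dict:
        # Simplified: counts lines starting with "#", including any inside code blocks.
        headings = sum(1 for line in text.splitlines() if line.startswith("#"))
        return {"headings": headings}


def analyze_file(filename: str, text: str) -> dict:
    """Pick an analyzer by extension and run it."""
    for ext, cls in ANALYZERS.items():
        if filename.lower().endswith(ext):
            return cls().analyze(text)
    return {"error": "unsupported file type"}
```

&lt;p&gt;With a registry like this, adding a file type means defining one subclass with its &lt;code&gt;extensions&lt;/code&gt; and an &lt;code&gt;analyze&lt;/code&gt; method. The dispatch function does not change.&lt;/p&gt;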

</description>
      <category>python</category>
      <category>fastapi</category>
      <category>webdev</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
