I am a B.Tech Computer Science undergraduate at Amrita Vishwa Vidyapeetham who enjoys building privacy-focused systems and learning by deploying real software end-to-end.
The Motivation
The inception of InfoStuffs was not driven by a desire to build just another productivity application. It started with a very specific user requirement from my sister. She needed a digital space to organize personal documents and sensitive notes but refused to use standard cloud services such as Google Keep or Notion.
Her constraint was simple but technically demanding: she wanted the convenience of the cloud without trusting the cloud provider with her plaintext data.
This challenge became the foundation of InfoStuffs. My goal shifted from building a simple web application to architecting a Zero-Knowledge Information Management System that prioritizes privacy by default.
The Problem Statement
Modern productivity tools generally fall into two categories:
- Convenient SaaS (e.g., Notion, Keep): These store user data in plaintext or encrypt it with server-managed keys, leaving it exposed to insider access or database breaches.
- Self-Hosted (e.g., Obsidian, Nextcloud): These offer strong privacy but are difficult to access and maintain across multiple devices.
InfoStuffs bridges this gap. The system had to be secure enough that a complete server-side compromise would yield nothing but garbage data, yet accessible via a standard web browser on any device.
High-Level Architecture
To satisfy these constraints, InfoStuffs uses a decoupled, cloud-native architecture with clearly separated responsibilities.
- Frontend: React (Vite) with Material UI. Responsibilities include UI rendering and client-side cryptographic operations (encryption/decryption).
- Backend: Node.js and Express, utilizing a Serverless Architecture.
- Local Development: Runs as a fully dockerized monolithic container, ensuring a consistent development environment that mirrors production dependencies.
- Production Deployment: Deployed to Vercel as stateless Serverless Functions, allowing the API to scale to zero when idle (cost-efficient) while maintaining a single Express codebase.
- Database: MongoDB Atlas for storing encrypted metadata and ciphertext.
- Authentication: Clerk. Delegating identity management reduced the attack surface for auth flows (MFA, session management).
- Storage: Supabase Storage, used strictly for isolating binary objects (images and PDFs) via obfuscated paths and client-side encrypted blobs.
Security by Design: The Zero-Knowledge Vault
Security was not an optional feature; it was the primary architectural constraint.
1. The Problem with Static Keys
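In my initial design, I used a static encryption key stored in the server's environment variables (VITE_SECRET_KEY). I quickly realized this was a critical flaw. If an attacker or a compromised hosting environment were to expose environment variables, they could decrypt everyone's data. Vite also exposes variables prefixed with VITE_ to client code, so a key with that name can end up in the bundle served to every browser. Either way, the key was visible to someone other than the user, which violated the core concept of Zero-Knowledge.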
2. The Solution: User-Derived Cryptography
To fix this, I removed the static key entirely. I implemented PBKDF2 (Password-Based Key Derivation Function 2) on the client side.
- When a user logs in, they enter a Vault Password. This password is never transmitted or stored and exists only transiently in the client’s memory.
- The browser runs PBKDF2 to derive a temporary 256-bit AES-GCM key in memory.
- This key is used to encrypt notes, titles, and file paths before the network request is even formed.
The server only ever sees (and stores) ciphertext. If the database administrator (me) were to look at the data, I would see nothing but unreadable strings.
3. Ephemeral Access to Media
For file storage, I avoided public buckets entirely.
- Encrypted Paths: The database stores an encrypted string pointing to the file path (e.g., "user/123/image.jpg" is encrypted).
- Obfuscated Storage: Uploaded files generate randomized UUID storage paths, so the storage provider never sees the original filenames.
- On-Demand Access: When a user unlocks their vault, the client decrypts the path and requests a Signed URL from Supabase.
- Time-Limited: Signed URLs are valid for one hour. This limits the damage from shared or intercepted links: a leaked URL stops working once it expires, and the object behind it is still a client-side encrypted blob.
4. Zero-Knowledge Offline Vault (PWA)
To make the app truly resilient, I engineered a complete offline mode without sacrificing the encryption model.
- Offline Caching: During an online session, raw encrypted ciphertexts of the user's notes are saved into IndexedDB; plaintext is never stored on disk.
- Instant Offline Detection: A module-level event interceptor and a fast connectivity probe immediately detect offline states.
- Auth Bypass & Key Derivation: If offline, Clerk initialization is bypassed entirely. The app regenerates the AES-GCM key dynamically using the Vault Password and the cached, plaintext Clerk User ID as the PBKDF2 salt.
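A fast connectivity probe of the kind mentioned above could look roughly like this. It is an illustrative sketch, not the project's code: the function name, the HEAD request, the timeout value, and the injectable fetch parameter are all assumptions.

```javascript
// Sketch only: the probe URL, timeout, and function name are assumptions.
// Resolves to true if any HTTP response arrives in time, false on error or timeout.
async function probeOnline(url, timeoutMs = 1500, fetchImpl = globalThis.fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Any response, even an error status, proves the network path works.
    await fetchImpl(url, { method: "HEAD", cache: "no-store", signal: controller.signal });
    return true;
  } catch (err) {
    // Network failure or abort after the timeout: treat as offline.
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```

When the probe reports offline, the app can skip auth initialization and derive the key from the Vault Password and the cached user ID, then decrypt the ciphertexts held in IndexedDB.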
Infrastructure Evolution: Solving the Cost Problem
One of the most valuable learning experiences came from adapting the infrastructure to real-world cost constraints.
Phase 1: The "Enterprise" Trap (GCP)
My initial deployment used Google Cloud Platform with Cloud Run and Cloud Build. While this was an industry-standard "Enterprise" setup, it introduced significant problems for a personal project:
- High Costs: Paying for load balancers, container registry storage, and compute time quickly added up.
- Complexity: Managing IAM roles and build triggers for a simple app was overkill.
Phase 2: The "Serverless Monolith" (Vercel)
To solve the deployment cost problem, I re-architected the stack to run for $0/month:
- Local Development (Dockerized): I kept the convenience of a containerized environment. A single docker-compose up spins up the Frontend, Backend, and Database services. This ensures that the development environment is isolated and reproducible on any machine.
- Production Deployment (Serverless): Instead of paying for a permanently running container (which costs money even when idle), I refactored the Express application to run on Vercel Serverless Functions.
- The Result: I effectively have a "Serverless Monolith." I develop it like a standard monolithic app (easy to debug, easy to run locally in Docker) but deploy it as serverless functions. This gives me the best of both worlds: zero infrastructure management and zero cost for personal use.
I intentionally avoided microservices, as the domain does not yet justify multiple bounded contexts, and premature service decomposition would increase complexity and attack surface without tangible benefits.
Technical Challenges & Performance Optimizations
1. Environment Variable Visibility
Relying on .env files for security was a mistake. The migration to user-derived keys solved this, but it required handling edge cases like "Lost Passwords." Since I no longer had the key, I had to implement a "Nuclear Reset" feature. This allows users to wipe their unrecoverable data and start fresh, prioritizing security over recovery.
2. Client-Side Decryption Bottlenecks
Rendering a vault full of encrypted images and text placed heavy load on the UI thread. I implemented two major optimizations:
- Intersection Observer Deferral: Encrypted image attachments are only requested, downloaded, and decrypted when they scroll into the viewport.
- High Performance Base64 Parser: I replaced slow native Uint8Array.from() callbacks with custom loops, significantly reducing CPU overhead during decryption.
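To illustrate the Base64 change, here is a minimal sketch of the two decoding styles. The function names are hypothetical, and the actual speedup depends on the engine and payload size, so treat this as a shape to benchmark rather than a guaranteed win.

```javascript
// Sketch only: function names are hypothetical.

// Callback style: invokes a mapping function once per character.
function base64ToBytesWithCallback(b64) {
  return Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
}

// Loop style: allocate the output once and fill it by index.
function base64ToBytes(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

Both functions return the same bytes for the same input; the loop version simply avoids the per-element callback on large attachments.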
3. The Docker vs. Serverless Routing Mismatch
One of the most complex challenges was reconciling the difference between a running Docker container and Vercel's file-system routing. In my local Docker container, Express handled all routing internally. However, when deployed to Vercel, requests to sub-paths (like /api/info/nuke) were hitting Vercel's 404 handler before reaching my Express app. I resolved this by creating a Vercel-compatible entry point (api/info.js) and a rewrite rule that pipes all sub-route traffic directly into the Express instance.
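A rewrite rule of the kind described might look like the following vercel.json fragment. The exact source pattern is an assumption; the idea is that every sub-path under /api/info is handed to the single function built from api/info.js, which exports the Express app.

```json
{
  "rewrites": [
    { "source": "/api/info/(.*)", "destination": "/api/info" }
  ]
}
```

With a rule like this, a request to /api/info/nuke reaches the Express instance, which can then match the route internally just as it does inside the Docker container.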
4. Production OAuth Strictness
A critical UI bug emerged in production: users completing Google Sign-In were dumped onto a blank white screen at /sso-callback. My SPA lacked a dedicated route to handle strict server-side redirects enforced in production. I architected a dedicated "OAuth Landing Pad" component that intercepts the OAuth token, displays a branded loading state, and seamlessly completes the handshake before forwarding the user.
Future Roadmap
While InfoStuffs is fully functional, I plan to explore:
- Redis Caching: To reduce database reads for frequently accessed (encrypted) metadata.
- React Native Mobile App: Wrapping the existing logic to allow biometric vault unlocking (FaceID) instead of typing the password.
Closing Thoughts
InfoStuffs is more than a note-taking application. It is a practical exploration of Zero-Knowledge Engineering.
By addressing the real-world problems of data visibility and cloud costs, I built a system where privacy is enforced by mathematics, not by policy. It satisfies a real user need while serving as a valuable learning experience in full-stack security and DevOps.
Repository: GitHub
Live Deployment: Link
Note: The live deployment requires authentication and a vault password. The core security properties are enforced client-side and are best understood via the architecture discussion above.


Top comments (11)
Great example and solid work, thanks for sharing this, Pranav. Curious to know if you have gotten more feedback from users, and whether you have seen more interest from others in using this.
And how do you deal with search? Does data encryption introduce a challenge in search, especially as the data grows more and more?
Thanks! Most feedback so far has been from family, friends, and privacy-focused users. Search is the main trade-off. Since the server only sees encrypted data, I handle search on the client by decrypting notes in memory and filtering locally. It works well for personal use, but it will need rethinking as the dataset grows.
Great to hear this. Yes, search would be a challenge for big datasets, but also because of the whole encryption and privacy-preserving implementation. How would you do it, keeping the encryption and privacy intact of course? Metadata? Indexing?
For larger datasets, I’d lean toward client-side indexing with minimal encrypted metadata. The server stays blind, and search scales without breaking the privacy model. More advanced approaches exist, but they add complexity that isn’t needed yet.
This is genuinely solid work. You didn’t just build an app, you clearly thought through the security model end to end, and catching the static-key issue early shows good instincts. The user-derived key approach & short-lived signed URLs is exactly how zero-trust systems should be built.
What’s interesting is how naturally this could grow without breaking your design. If you ever move beyond purely personal use (shared vaults, delegated access, automation), confidential compute ideas, like running logic in TEEs on platforms such as Oasis, could fit nicely while keeping the same “don’t trust the server” mindset.
Really nice balance of practicality, security, and cost awareness. This is the kind of project that actually teaches you how systems fail in the real world.
Thanks, I appreciate that. The static-key issue was a key turning point, and extending the same threat model to shared access and confidential compute is definitely on my list.
This is a solid example of actual zero-trust design, not just “encrypted at rest” marketing. Client-side key derivation & server seeing only ciphertext is the right call, and your pivot away from env-stored secrets shows good threat modeling. The serverless-monolith approach is also very pragmatic for personal projects.
If you ever want to push this further, the next interesting step would be verifiable or confidential server-side logic e.g., TEEs for things like metadata processing or search, so even compute can be proven without exposing data. But as it stands, this is a clean, well reasoned privacy architecture with real-world tradeoffs handled thoughtfully.
Thank you, I really appreciate that. The shift away from env-stored secrets was a key design turning point, and I agree that TEEs or verifiable compute would be a natural next step if server-side logic ever expands beyond simple coordination.
Neat
Amazing work!
Nice @pranav_kishan_f81e2fc8327