DEV Community

Adrian Crîșmaruc

Posted on • Originally published at adriancrismaruc.com

My 18-Month Journey Building a SaaS App

10,000+ photos. Hundreds of videos. One specific memory I desperately needed to find.

Sound familiar?

💡 Tip: Skip to the solution

Try the RekoSearch Demo yourself, sign up for the RekoSearch Closed Beta or explore the technical deep-dive if you're curious about the architecture and implementation.

When Digital Hoarding Becomes a Real Problem

About a year and a half ago, I was looking for my next big project, but I wasn't sure what it would be. Fortunately, while I was trying to filter through my media and find some very specific photos and videos, I encountered a problem that motivated me to build this.

The problem is that I have literally over 10,000 photos and videos, and while they are organized by year and even by period, that still results in hundreds per folder. Going through everything to find exactly what I need takes time, and a lot of it.

While there are tools that let you find media by face or search for text within files, they either lack sufficiently deep search capabilities or a broad enough scope of searchable content, or they are tailored to a narrow use case.

RekoSearch: Semantic Search Across All Your Content

RekoSearch, on the other hand, allows you to simultaneously search across photos, videos, documents, and audio files semantically, using natural language queries.

Key Features

  • 🔍 Natural Language Search: Search using phrases like 'dog mountain' to find all photos and videos with mountains and dogs
  • 📁 Multi-Format Support: Search across photos, videos, documents, and audio files simultaneously
  • ⚡ Advanced Queries: Use Boolean operators and filters for precise results with complex search logic

For example, a simple query like "dog mountain" returns all photos and videos containing both a mountain and a dog, while a phrase like "happy birthday" returns photos, videos, documents, and audio files that either contain the sentence "happy birthday" or show happy people in a birthday-like scene.
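To make the "dog mountain" behavior concrete, here is a minimal, hypothetical sketch of multi-term matching; the file names, labels, and `search` function are illustrative, not RekoSearch's actual data model. The idea is simply that every term in the query must appear among a file's detected labels:

```python
# Hypothetical sketch: a file matches a query only if every query term
# appears among the labels detected in it. All names here are illustrative.
def matches(query: str, labels: set[str]) -> bool:
    """True if every whitespace-separated query term is a detected label."""
    return all(term.lower() in labels for term in query.split())

# Toy index of files and their detected labels
files = {
    "hike.jpg": {"dog", "mountain", "sky"},
    "party.mp4": {"cake", "people", "balloons"},
}

def search(query: str) -> list[str]:
    """Return the names of all files whose labels satisfy the query."""
    return [name for name, labels in files.items() if matches(query, labels)]
```

Here `search("dog mountain")` would return only `hike.jpg`, the one file whose labels contain both terms.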

Or you may have a more advanced need. Consider this example: you have created a new product, such as a couch, and recorded a promotional video for it, but it required many takes. You specifically want the takes that feature the product and the brand name, and in which the person is not sad. In that case, you could write an advanced search query like this:

label:couch AND text:"brand name" NOT face:sad
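As a rough sketch of how such a query could be evaluated (a simplified, hypothetical evaluator, not RekoSearch's actual parser: it handles only `field:value` terms with `AND`/`NOT` applied left to right), consider:

```python
import re

# Simplified, hypothetical evaluator for queries like:
#   label:couch AND text:"brand name" NOT face:sad
# It matches field:value terms (values optionally quoted) against a per-file
# metadata dict and applies AND / NOT left to right. A real parser would also
# handle OR, precedence, and error reporting.
TOKEN = re.compile(r'AND|NOT|(\w+):(?:"([^"]*)"|(\S+))')

def evaluate(query: str, meta: dict[str, set[str]]) -> bool:
    result, negate = True, False
    for m in TOKEN.finditer(query):
        if m.group(0) == "AND":   # AND is implicit in the running result
            continue
        if m.group(0) == "NOT":   # negate the next term
            negate = True
            continue
        field, value = m.group(1), m.group(2) or m.group(3)
        hit = value in meta.get(field, set())
        result = result and (not hit if negate else hit)
        negate = False
    return result
```

For a take whose metadata contains the label `couch`, the text `brand name`, and no `sad` face, the example query evaluates to `True`.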

Furthermore, RekoSearch also offers filtering, saved searches, and downloadable results, with many more features planned.

Building this, however, turned out to be far more complex than I initially anticipated.


The Technical Challenge: Why This Was Hard to Build

At first, building it didn't seem that hard, but I soon realized it was far more complex than I expected. Even something seemingly simple, such as authentication and authorization, was challenging for someone like me who had never implemented OAuth 2.0 from scratch before, and you have to get security right when real people's money is involved.

Another challenge that arose as I delved deeper into the project is the sheer size and complexity of managing everything. You have the infrastructure, the API methods, each with its own program (15+ Lambda functions), the backend processing servers, which could be entire projects on their own, and the frontend Homepage and Dashboard (two separate entities). To make a change or add something new, you need to ensure it propagates and works well with everything else, from the backbone to what the user sees.

Plus many other challenges, including but not limited to (with non-exhaustive examples):

  • Scalability: processing many files without delays, wasted resources, or unnecessary costs (solution: scheduling job processors on AWS Fargate).
  • Security: ensuring user data is handled securely, encrypted, and access to it is limited.
  • Performance: not leaving resources idle when processing simple files like images, and instead processing multiple files in parallel for faster processing times.
  • Cost Optimization: using the cheapest hosting options (e.g., DigitalOcean instead of AWS for the Kubernetes cluster) and building more efficient components (e.g., Rust Lambda functions).
  • AWS Service Integration: learning and using the SDK for each AWS service, each with its own tricks and ways of handling things.
  • Niche Implementation Challenges: problems you would assume someone has already solved, or that a standard must cover, only to find out there isn't one and you are on your own. One example: reliably counting the pages in a PDF file, in a tamper-proof and exploit-proof way, by streaming it, and doing so exceptionally fast.
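To illustrate the PDF page-counting problem, here is a deliberately naive, hypothetical sketch: it streams the file in chunks and counts `/Type /Page` markers, keeping an overlap so a marker split across a chunk boundary is neither missed nor double-counted. A robust implementation (handling compressed object streams, cross-reference tables, and malicious inputs) is far more involved.

```python
import re

# Naive streaming page count: look for "/Type /Page" dictionary entries,
# excluding "/Pages" tree nodes via a lookahead. Illustrative only; real
# PDFs can hide page objects in compressed object streams.
PAGE = re.compile(rb"/Type\s*/Page(?![A-Za-z])")

def count_pages(stream, chunk_size=1 << 16, overlap=32):
    count, tail = 0, b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            count += len(PAGE.findall(tail))  # flush the final carry-over
            return count
        data = tail + chunk
        cutoff = max(len(data) - overlap, 0)
        # Only count matches that start before the overlap region; matches
        # inside it are carried over and counted in a later iteration.
        count += sum(1 for m in PAGE.finditer(data) if m.start() < cutoff)
        tail = data[cutoff:]
```

Feeding this any file-like object keeps memory bounded regardless of file size, which is the whole point of the streaming requirement.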

Combine all of these, and the 3-4 months I initially thought development would take turned into 1.5 years and over 60,000 lines of code, and this is only the start of the beta.

Yet, by no means do I regret taking on this project; in fact, I am actually thrilled. The amount of knowledge I gained in this year and a half of building a proper, enterprise-like SaaS application is something I could never have acquired by building smaller desktop apps.

The Tech Stack

I could not possibly explain everything in a paragraph or two here, so if you are interested in a deep dive, please don't hesitate to check the public repository containing highly in-depth documentation and technical diagrams that cover every aspect of it: https://github.com/Obscurely/RekoSearch-Public

1. Infrastructure

All managed by Terraform.

  • AWS services for authentication (Cognito), databases (DynamoDB), storage (S3), and file processing (Rekognition, Textract, and Transcribe).
  • Kubernetes Cluster on DigitalOcean for hosting the Dashboard, the Queue Processor (which listens to jobs pushed to the SQS queue and spins up job processing applications on AWS Fargate, allowing for infinite job processing scalability), the stats application (Grafana), and the analytics application (Umami).
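As a sketch of that queue-to-Fargate flow (the task definition `job-processor`, container name `processor`, and message fields are hypothetical; the real wiring differs), the Queue Processor's core step is translating an SQS job message into an ECS `run_task` call:

```python
import json

# Hypothetical sketch: turn an SQS job message into arguments for an ECS
# run_task call that launches a Fargate job processor. The task definition,
# container name, and message fields are made up for illustration.
def run_task_args(message_body: str, task_def: str = "job-processor") -> dict:
    job = json.loads(message_body)
    return {
        "taskDefinition": task_def,
        "launchType": "FARGATE",
        "count": 1,
        "overrides": {
            "containerOverrides": [{
                "name": "processor",
                # Pass job details to the container as environment variables
                "environment": [
                    {"name": "JOB_ID", "value": job["job_id"]},
                    {"name": "S3_BUCKET", "value": job["bucket"]},
                    {"name": "S3_KEY", "value": job["key"]},
                ],
            }],
        },
    }

# In the real loop this would feed boto3, roughly:
#   ecs.run_task(cluster="jobs", **run_task_args(message["Body"]))
```

Because each job becomes its own Fargate task, scaling is a matter of how many tasks you launch, which is what makes the "infinite job processing scalability" claim work.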

2. Backend

All programmed in Rust, allowing fast and safe execution of all the backend services, as well as lower costs in the case of AWS Lambda (pre-compiled Rust functions are much quicker and lighter weight, and they can also run on arm64, making them dirt cheap).

3. Frontend

Built using TypeScript and Python, it's split into two components:

  • The Homepage: a Next.js static website, exported and hosted on S3 behind CloudFront; it handles any amount of traffic with ease and at a very low cost.
  • The Dashboard: a React SPA hosted on the Kubernetes cluster, with Gunicorn (a production WSGI server) and Flask as the business logic layer. It can be scaled as needed, with the cost covered by users paying to process their files.

✅ The Power of Rust + ARM64

Here's how much of a crazy difference developing your Lambda functions in Rust and running them on ARM makes:

  • Rust Lambda functions execute on average 10 to 100 times faster than Python-based ones.
  • Say you pay about $100 a month for Lambda functions written in Python running on x86. Converting to arm64 saves about 20%, bringing you down to about $80. Converting to Rust, thanks to its greater efficiency, brings you down to about $15-30 per month. That's a 70-85% reduction in costs!
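Spelled out as arithmetic (the dollar figures are the illustrative estimates from above, not measured bills):

```python
# Illustrative cost arithmetic from the estimates above, not measured bills.
python_x86 = 100.00                     # monthly Python/x86 Lambda spend
python_arm64 = python_x86 * (1 - 0.20)  # arm64 is ~20% cheaper -> ~$80
rust_low, rust_high = 15.00, 30.00      # assumed Rust/arm64 monthly range
reduction_low = 1 - rust_high / python_x86   # 70% savings
reduction_high = 1 - rust_low / python_x86   # 85% savings
```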

What's Next

I plan to continually work on and improve this project, as it's very close to my heart and I genuinely enjoy working on it. Some of the planned features that have my immediate attention include:

  • Implementing parentheses in the search query to make writing complex queries easier. Currently, searches follow standard Boolean operator precedence.
  • Implementing an API keys system, allowing the use of the API outside the Dashboard.
  • Adding support for more file types by converting them on the server.
  • Adding social provider identities for login (OIDC for Google, etc.)
  • Implementing a library-like feature for organizing searches and files, combining multiple jobs, and more, to turn it into a fully fledged knowledge mining platform.

My long-term vision for RekoSearch is tied to the last feature I mentioned. I want to implement multiple features, including organizing, filtering, combining various data sources, adding integrations with other platforms, and others, to make it a one-stop solution for all knowledge mining-related needs.

Experience RekoSearch Yourself

Quick Links

  • Get Started with Closed Beta: 50 free credits ($5 value) • Perfect for testing with your own files • No commitment required
  • GitHub Repository: Complete architecture documentation • Technical diagrams and implementation details • Deep dive into how I built it
  • Main Website: Learn more about RekoSearch and its capabilities
