Asadbek Karimov

I Nearly Lost All My Users to 30-Second Database Queries

Built Mylinx as a solo dev -> grew to $750 MRR.

Mylinx started as my alternative to Linktree. Built in three months, it grew into something I'm proud of:

  • Full customization (layout, colors, themes, etc)
  • Music/video embeds for creators
  • QR codes for offline promotion
  • SEO control and 11+ support
  • Built-in URL shortener

For over a year, everything ran smoothly. Then the analytics started choking.

When "Simple" Analytics Becomes Not So Simple

Like most developers, I built analytics the straightforward way: two tables (HitPage and HitPageLink) storing every click and view as raw data. Classic "I'll optimize later" thinking.
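The raw tables themselves aren't shown in this post. Based on the fields the aggregation job reads later on, a HitPage row looks roughly like this (the exact columns are my guess, not the actual Mylinx schema):

// Rough shape of one raw hit - inferred from the aggregation code below,
// not the real Mylinx model. Optional fields are assumptions.
type HitPageRow = {
  id: number;
  kyteId: string;   // which page the hit belongs to
  timestamp: Date;  // when the view happened
  ip: string;       // used (with device) to estimate unique visitors
  device: string;
  country?: string; // assumption: feeds the per-country breakdowns
  referrer?: string;
};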

12 months later:

  • 150K+ new rows monthly
  • Queries timing out at 30+ seconds
  • Users complaining about broken analytics
  • Database costs climbing

The wake-up call came when users started asking for refunds because the analytics features were unusable. At $750/month in revenue, losing customers over fixable technical issues felt especially painful.

The Constraint That Shaped the Solution

I had two paths:

  1. Enterprise approach: Rebuild with ClickHouse, TimescaleDB, and a separate analytics infrastructure
  2. Bootstrap approach: Work within existing constraints and get creative

Learning From the Big Players (On a Small Budget)

I studied how Instagram and other platforms handle analytics data and adapted their tiered storage approach:

Hot Data (0–90 days): Keep in HitPage for real-time queries
Warm Data (3–12 months): Pre-aggregate into daily summaries
Cold Data (1+ years): Archive or delete

The key insight: most analytics queries don't need individual hit-level data. Daily aggregates work fine for historical trends.
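As a rough illustration, the three tiers reduce to two cutoff dates. The helper below is only a sketch; the names hotCutoff, coldCutoff, and tierFor are mine, not from the Mylinx codebase:

import { subDays, subMonths } from 'date-fns';

// Tier boundaries matching the retention policy above (90 days / 12 months).
const hotCutoff = () => subDays(new Date(), 90);
const coldCutoff = () => subMonths(new Date(), 12);

// Classify a hit's age: 'hot' stays raw, 'warm' lives as a daily aggregate,
// 'cold' gets archived or deleted.
function tierFor(timestamp: Date): 'hot' | 'warm' | 'cold' {
  if (timestamp >= hotCutoff()) return 'hot';
  if (timestamp >= coldCutoff()) return 'warm';
  return 'cold';
}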

The Schema That Made It Work

I added a HitPageDaily table for pre-calculated metrics:

model HitPageDaily {
  id             Int      @id @default(autoincrement())
  kyteId         String   // which page the metrics belong to
  date           DateTime // the calendar day being summarized
  hitCount       Int      // total views that day
  uniqueVisitors Int      // distinct ip-device pairs that day
  countryStats   Json?    // optional per-country breakdown
  deviceStats    Json?    // optional per-device breakdown
  referrerStats  Json?    // optional per-referrer breakdown
  createdAt      DateTime @default(now())
  updatedAt      DateTime @updatedAt

  @@unique([kyteId, date]) // exactly one aggregate row per page per day
  @@index([kyteId])
  @@index([date])
}

This turns millions of individual hits into manageable daily rows. A year of data becomes 365 rows per page instead of millions of raw hits.
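To make the compression concrete, here's what a single HitPageDaily row could hold. The JSON shapes of the *Stats columns are my assumption; the post never shows them:

// Hypothetical aggregate row - the breakdown formats below are assumptions.
const exampleDailyRow = {
  kyteId: 'kyte_abc123',                               // hypothetical id
  date: new Date('2025-01-15'),
  hitCount: 412,                                       // total views that day
  uniqueVisitors: 187,                                 // distinct ip-device pairs
  countryStats: { US: 120, DE: 35, UZ: 32 },           // country -> hits
  deviceStats: { mobile: 290, desktop: 122 },          // device -> hits
  referrerStats: { 'instagram.com': 200, direct: 150 } // referrer -> hits
};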

The Aggregation Job

I built a serverless function that runs nightly, processing old data in batches:

import { PrismaClient } from '@prisma/client';
import { format, startOfDay, subDays } from 'date-fns';

const prisma = new PrismaClient();

const FETCH_BATCH_SIZE = 10000;
const cutoffDate = subDays(new Date(), 90); // raw hits older than the 90-day hot window get aggregated

type DailyAggregate = {
  kyteId: string;
  date: Date;
  hitCount: number;
  uniqueVisitors: Set<string>;
};

const dailyAggregates = new Map<string, DailyAggregate>();

// fetch one batch of old raw hits (the job loops until none are left)
const oldHitPageData = await prisma.hitPage.findMany({
  where: { timestamp: { lt: cutoffDate } },
  take: FETCH_BATCH_SIZE,
});

// aggregate into daily buckets keyed by page + calendar day
oldHitPageData.forEach((hit) => {
  const key = `${hit.kyteId}-${format(hit.timestamp, 'yyyy-MM-dd')}`;
  if (!dailyAggregates.has(key)) {
    dailyAggregates.set(key, {
      kyteId: hit.kyteId,
      date: startOfDay(hit.timestamp),
      hitCount: 0,
      uniqueVisitors: new Set(),
    });
  }
  const aggregate = dailyAggregates.get(key)!;
  aggregate.hitCount++;
  aggregate.uniqueVisitors.add(`${hit.ip}-${hit.device}`); // rough unique-visitor key
});

// save results: one upsert per (kyteId, date) bucket
for (const agg of dailyAggregates.values()) {
  await prisma.hitPageDaily.upsert({
    where: { kyteId_date: { kyteId: agg.kyteId, date: agg.date } },
    update: { hitCount: agg.hitCount, uniqueVisitors: agg.uniqueVisitors.size },
    create: {
      kyteId: agg.kyteId,
      date: agg.date,
      hitCount: agg.hitCount,
      uniqueVisitors: agg.uniqueVisitors.size,
    },
  });
}

The job processes data in chunks, aggregates by day, then cleans up the old raw data.
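The cleanup step itself isn't shown above. A minimal sketch, assuming a straightforward deleteMany against the same cutoff the aggregation used:

// Remove raw hits that have already been rolled up into HitPageDaily.
// (A sketch - running this only after the upserts succeed is an assumption
// about how the real job sequences things.)
const deleted = await prisma.hitPage.deleteMany({
  where: { timestamp: { lt: cutoffDate } },
});
console.log(`Deleted ${deleted.count} raw hits after aggregation`);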

Hybrid Queries for Seamless Experience

The tricky part was making queries work across both recent raw data and historical aggregates:

// fetch recent raw hits (hot tier) and pre-aggregated daily rows (warm tier) in parallel
const [hitPageData, dailyData] = await Promise.all([
  prisma.$queryRaw`
    SELECT TO_CHAR("timestamp", 'MM/DD') as date,
           COUNT(DISTINCT CONCAT("ip", '-', "device"))::int as views
    FROM "HitPage"
    WHERE "kyteId" = ${kyteId} AND "timestamp" >= ${daysAgo}
    GROUP BY 1
  `,
  prisma.$queryRaw`
    SELECT TO_CHAR("date", 'MM/DD') as date, "uniqueVisitors" as views
    FROM "HitPageDaily"
    WHERE "kyteId" = ${kyteId} AND "date" >= DATE(${daysAgo})
  `
]);

This gives users recent data with full granularity, plus historical context without the performance hit.
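The post stops at the two queries, so here is a minimal sketch of how the two result sets could be combined into one chart-ready series. The merge logic is my assumption; the date and views field names come from the SQL above:

// Merge hot and warm rows into a single date -> views series.
type Row = { date: string; views: number };

const merged = new Map<string, number>();
for (const row of [...(dailyData as Row[]), ...(hitPageData as Row[])]) {
  merged.set(row.date, (merged.get(row.date) ?? 0) + row.views);
}

// note: 'MM/DD' keys sort correctly only within a single calendar year
const series = [...merged.entries()]
  .map(([date, views]) => ({ date, views }))
  .sort((a, b) => a.date.localeCompare(b.date));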

Testing and Deployment

I set up a production-like dev environment with cloned real data to test the migration. After confirming everything worked as expected, I deployed during low-traffic hours with a rollback plan ready.

Results after deployment:

  • Analytics queries dropped from 30+ seconds to under 2 seconds
  • Database size reduced by ~60%
  • User complaints stopped
  • No data loss or accuracy issues

What I Learned

"I'll optimize later" is expensive. What seemed like premature optimization would have saved weeks of stress and nearly losing customers.

Constraints breed creativity. Limited budget forced me to find a solution that big companies might overlook - sometimes simple aggregation beats complex infrastructure.

Solo dev advantages are real. I could test on real data, deploy quickly, and iterate without committees or approval processes.

Performance is a feature. Users don't care about your technical debt - they want things to work fast.

The Bottom Line

This isn't how Google or Facebook would solve analytics scaling, but it doesn't need to be. For a solo developer with a $750/month side project, this pragmatic approach worked perfectly.

Two months later, the system handles 3x the original traffic with better performance than day one. Sometimes the best architecture is the one you can actually build, deploy, and maintain yourself.
