DEV Community

Mohammad Waseem
Mohammad Waseem

Posted on

Streamlining Production Databases: A TypeScript-driven Approach to Eliminating Clutter

Streamlining Production Databases: A TypeScript-driven Approach to Eliminating Clutter

Managing cluttered production databases is a common challenge faced by Lead QA Engineers tasked with maintaining application performance and data integrity. Over time, databases accumulate redundant, obsolete, or inconsistent data that can degrade system performance, complicate troubleshooting, and impair data analysis.

In this article, we’ll explore how to leverage open source tools and TypeScript to automate the cleanup process, ensuring production databases remain lean, efficient, and reliable.

The Challenge of Database Clutter

Cluttering often results from various factors: outdated logs, orphaned records, duplicate entries, and incomplete transactions. Traditional manual cleanup is time-consuming and error-prone, especially in complex systems with high data volumes.

To address this, automation and precise scripting are vital. TypeScript, with its robust type system and friendly ecosystem, presents an excellent choice for developing reliable and maintainable database maintenance scripts.

Setting Up The Environment

Our environment will involve:

  • Node.js with TypeScript
  • Open source database client libraries (e.g., pg for PostgreSQL)
  • Helper libraries like TypeORM or Prisma for ORM abstraction
  • Testing frameworks such as Jest

Installing Dependencies

npm init -y
npm install typescript ts-node pg typeorm jest @types/jest --save-dev
Enter fullscreen mode Exit fullscreen mode

Configure TypeScript (tsconfig.json):

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "CommonJS",
    "strict": true,
    "esModuleInterop": true
  }
}
Enter fullscreen mode Exit fullscreen mode

Developing the Cleanup Script

1. Connecting to the Database

import { createConnection } from 'typeorm';

async function connectDB() {
  return await createConnection({
    type: 'postgres',
    host: 'localhost',
    port: 5432,
    username: 'user',
    password: 'pass',
    database: 'prod_db',
    entities: [],
    synchronize: false,
  });
}
Enter fullscreen mode Exit fullscreen mode

2. Identifying Cluttered Data

A common clutter source is duplicate entries. Using SQL's ROW_NUMBER() window function, we can identify duplicates:

async function removeDuplicates() {
  const query = `
    DELETE FROM my_table
    WHERE id IN (
      SELECT id FROM (
        SELECT id, ROW_NUMBER() OVER (PARTITION BY key_field ORDER BY created_at DESC) AS rn
        FROM my_table
      ) sub
      WHERE sub.rn > 1
    );`
  ;
  await connection.query(query);
}
Enter fullscreen mode Exit fullscreen mode

3. Removing Obsolete Records

For obsolete logs or old sessions:

async function deleteOldRecords(days: number) {
  const cutoffDate = new Date();
  cutoffDate.setDate(cutoffDate.getDate() - days);

  await connection.getRepository('logs').delete({ timestamp: LessThan(cutoffDate) });
}
Enter fullscreen mode Exit fullscreen mode

4. Encapsulating into a Utility

Combining these functions ensures a comprehensive cleanup routine:

async function runCleanup() {
  const connection = await connectDB();
  try {
    await removeDuplicates();
    await deleteOldRecords(30); // delete logs older than 30 days
    console.log('Database cleanup completed successfully.');
  } catch (error) {
    console.error('Error during cleanup:', error);
  } finally {
    await connection.close();
  }
}

runCleanup();
Enter fullscreen mode Exit fullscreen mode

Best Practices and Considerations

  • Transaction Management: Wrap cleanup operations in transactions to maintain database integrity.
  • Logging and Monitoring: Implement logging to audit cleanup activities.
  • Testing: Use mock databases during development to prevent accidental data loss.
  • Scheduling: Automate scripts via cron jobs or serverless functions for routine maintenance.

Conclusion

Using TypeScript and open source libraries provides a structured, reliable approach to automate the decluttering of production databases. This methodology not only enhances system performance but also reduces manual effort and human error, ensuring data health and operational stability.

Properly maintained databases are essential for optimal application performance, and embracing automation with TypeScript empowers QA engineers to proactively manage data quality at scale.


🛠️ QA Tip

Pro Tip: Use TempoMail USA for generating disposable test accounts.

Top comments (0)