<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sudheer Tripathi</title>
    <description>The latest articles on DEV Community by Sudheer Tripathi (@sudheer121).</description>
    <link>https://dev.to/sudheer121</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F387129%2F698b216f-9f2e-4975-8ea2-550239ad86c5.jpg</url>
      <title>DEV Community: Sudheer Tripathi</title>
      <link>https://dev.to/sudheer121</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sudheer121"/>
    <language>en</language>
    <item>
      <title>Reading 20k excel files per minute with Node</title>
      <dc:creator>Sudheer Tripathi</dc:creator>
      <pubDate>Sun, 16 Mar 2025 11:37:55 +0000</pubDate>
      <link>https://dev.to/sudheer121/reading-20k-excel-files-per-minute-with-node-df0</link>
      <guid>https://dev.to/sudheer121/reading-20k-excel-files-per-minute-with-node-df0</guid>
      <description>&lt;p&gt;Recently I was working on an adhoc activity which involved writing a script that would extract some cell data from around 75k excel files, assume these files are readily available on your disk. Because these are user provided files, some of these are corrupt and can cause a range of issues when parsed. In an attempt to optimize the data extraction script, I ended up at a point where the final script ran 30X faster and handled errors gracefully. Lets look at the approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  Excel files are like bombs
&lt;/h2&gt;

&lt;p&gt;Parsing Excel files is like running user-provided code on your system: when these files are read using Excel libraries, they can cause heap overflows, block the main thread, throw unforeseen exceptions, and more. The most popular Excel libraries in the JS ecosystem are &lt;a href="https://sheetjs.com/" rel="noopener noreferrer"&gt;SheetJS&lt;/a&gt; and &lt;a href="https://github.com/exceljs/exceljs" rel="noopener noreferrer"&gt;ExcelJS&lt;/a&gt;, and both suffer from issues like &lt;a href="https://github.com/SheetJS/sheetjs/issues/1363" rel="noopener noreferrer"&gt;blocking of the main thread&lt;/a&gt; &amp;amp; &lt;a href="https://github.com/exceljs/exceljs/issues/1898" rel="noopener noreferrer"&gt;heap overflow&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How do we read them, then?
&lt;/h2&gt;

&lt;p&gt;Because these files can block the main thread (infinite loops), cause heap overflows, and throw unknown exceptions during parsing, we need to run them in a container-like environment, and Node.js worker threads can serve as that container. Node.js allows us to specify the memory allocated to a worker, beyond which the worker is terminated. If the worker goes into an infinite loop, we can add a timer in the main thread that terminates the worker if it doesn't respond within a few seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfb29pku5n3s4v63vhl1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfb29pku5n3s4v63vhl1.png" width="800" height="505"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 1: Spawn worker thread when reading file
&lt;/h2&gt;

&lt;p&gt;We can limit a worker thread's memory by specifying it in the &lt;code&gt;resourceLimits&lt;/code&gt; property (see &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads/blob/main/src/1-sequential-spawn-worker/index.ts#L72" rel="noopener noreferrer"&gt;this&lt;/a&gt;) when creating the worker. In case the worker goes into an infinite loop, we specify a &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads/blob/main/src/1-sequential-spawn-worker/index.ts#L42" rel="noopener noreferrer"&gt;timeout&lt;/a&gt; after which we terminate it. In Node.js, a worker thread is automatically terminated once the script inside it has finished running.&lt;/p&gt;

&lt;p&gt;You can find the code for this approach &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads/tree/main/src/1-sequential-spawn-worker" rel="noopener noreferrer"&gt;here&lt;/a&gt;. When run on 20k Excel files (with sizes ranging from 200 to 300 KB), with 40 corrupt files, it completes in 40 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 2: Re-use the same thread
&lt;/h2&gt;

&lt;p&gt;In the previous approach we spawned a new thread for each Excel file, because the worker exited after running its script. To make the worker run indefinitely, we can add a really long &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads/blob/main/src/2-sequential-worker-reuse/worker.ts#L6" rel="noopener noreferrer"&gt;interval timer in the worker&lt;/a&gt;. Now we want to push the next file into the worker as soon as it has finished processing the previous one or thrown an exception.&lt;/p&gt;

&lt;p&gt;How do we do it?&lt;/p&gt;

&lt;p&gt;We can use an &lt;code&gt;EventEmitter&lt;/code&gt; to notify the part of the code that pushes new files to the worker. With this we get a completely event-driven system that processes files and listens to events to decide when to push a file to the worker, when to terminate the worker, and so on.&lt;/p&gt;

&lt;p&gt;You can find the code for this approach &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads/tree/main/src/2-sequential-worker-reuse" rel="noopener noreferrer"&gt;here&lt;/a&gt;. When run on 20k Excel files (with sizes ranging from 200 to 300 KB), with 40 corrupt files, it completes in 5 minutes 🔥. Wow, this is much better than our previous script's performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 3: Use all cores !
&lt;/h2&gt;

&lt;p&gt;We are still not using all available resources: we can process files in parallel across multiple threads. On an octa-core system, we can spawn 7 parallel workers and leave the last core for the main thread.&lt;/p&gt;

&lt;p&gt;This also adds complexity, because we now have to manage a pool of 7 workers, manage their timeouts, and route each file to whichever worker becomes available. Writing this kind of code with an event-driven methodology is really easy: whenever a worker exits or sends any kind of message, we use its threadId to take the appropriate action.&lt;/p&gt;
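
&lt;p&gt;A simplified pool sketch (worker creation and timeout handling are elided; the message shape is illustrative):&lt;/p&gt;

```javascript
// Sketch: register workers by threadId and refill whichever one reports back.
function runPool(workers, files) {
  const queue = files.slice();
  const byId = new Map();
  for (const w of workers) byId.set(w.threadId, w);

  const feed = (threadId) => {
    const next = queue.shift();
    if (next !== undefined) byId.get(threadId).postMessage(next);
  };

  for (const w of workers) {
    // On every message (result or error report), refill the sender.
    w.on('message', () => feed(w.threadId));
    feed(w.threadId); // prime each worker with its first file
  }
}
```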

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyotiyv9pick51gkkdlw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyotiyv9pick51gkkdlw.png" width="800" height="589"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can find the code for this approach &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads/tree/main/src/2-sequential-worker-reuse" rel="noopener noreferrer"&gt;here&lt;/a&gt;. When run on 20k Excel files (with sizes ranging from 200 to 300 KB), with 40 corrupt files, it completes in around 1 minute 💀.&lt;/p&gt;

&lt;p&gt;The scripts were run on a MacBook Air M3 and peaked at around 600 MB of RAM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The approach is very similar to how message queues work in a distributed system: queue consumers pull the next message as soon as they have finished processing the previous one, and if a consumer/worker goes down, it can be restarted. Multiple consumers can pull from the message queue without any delay. It's also a good example of how strong Node's I/O is: used well, it can bring I/O-bound tasks down from hours to minutes.&lt;/p&gt;

&lt;p&gt;The source code for the examples can be found in &lt;a href="https://github.com/dumbbellcode/blazing-fast-excel-reads" rel="noopener noreferrer"&gt;this repository&lt;/a&gt;, give it a star if you learned something interesting.&lt;/p&gt;


</description>
      <category>node</category>
      <category>workerthread</category>
      <category>concurrency</category>
      <category>excel</category>
    </item>
    <item>
      <title>Observability and Security with AWS</title>
      <dc:creator>Sudheer Tripathi</dc:creator>
      <pubDate>Fri, 29 Mar 2024 14:45:09 +0000</pubDate>
      <link>https://dev.to/sudheer121/observability-and-security-with-aws-36lf</link>
      <guid>https://dev.to/sudheer121/observability-and-security-with-aws-36lf</guid>
      <description>&lt;p&gt;Compliance is boring, as developers we don't actively think about security loopholes in our infrastructure, just a few standard practices and that's it. However, AWS provides us with the tools that can help improve the observability of our infra and protect us from security issues down the line.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS S3 Security
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Observe who is accessing the S3 buckets&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can enable &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerLogs.html"&gt;server access logs&lt;/a&gt; for your S3 buckets and view detailed records of the requests made to each bucket. The logs can be loaded into Athena to identify malicious access to buckets. Ideally you should have one server access logs bucket per region, with logs separated by bucket-name prefix paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observe S3 configuration changes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You should be able to log all configuration changes to your S3 buckets; this is useful for detecting malicious modifications to bucket configurations, and it's easy with AWS CloudTrail, which logs all API access to your buckets in CloudWatch. You can add a &lt;a href="https://docs.fugue.co/FG_R00083.html"&gt;metric filter&lt;/a&gt; with an alarm in CloudWatch to notify you if any malicious S3 configuration change activity is detected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Block public access to buckets if possible&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you are serving public data via CloudFront, you can &lt;a href="https://aws.amazon.com/s3/features/block-public-access/"&gt;block all public access&lt;/a&gt; to S3 buckets and add a bucket policy that only allows CloudFront to access them. This is an additional security measure, although not strictly necessary if you are serving non-sensitive data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enforce HTTPS access on buckets&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It's good practice to disable HTTP access to buckets. To do this you just need to &lt;a href="https://repost.aws/knowledge-center/s3-bucket-policy-for-config-rule"&gt;add a policy&lt;/a&gt; (aws:SecureTransport) to your bucket that denies insecure access.&lt;/p&gt;
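
&lt;p&gt;For reference, a minimal sketch of such a policy (the bucket name is a placeholder):&lt;/p&gt;

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*"
      ],
      "Condition": { "Bool": { "aws:SecureTransport": "false" } }
    }
  ]
}
```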

&lt;h3&gt;
  
  
  Cloudfront
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Enable WAF on CloudFront distributions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;WAF protects your distribution from bots and common web exploits by blocking such requests. This feature is just &lt;a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/enable-waf-security-tab.html"&gt;one click away&lt;/a&gt; on your CloudFront dashboard. Be careful, as it can block genuine requests too; you can enable it in monitor mode to start with and add custom rules later on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CloudFront Standard Logs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is similar to S3 server access logs: detailed information about requests to your distribution is &lt;a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/AccessLogs.html"&gt;logged in a separate S3 bucket&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  IAM
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cloudwatch alarm for sign in / authorization failures&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you have enabled management events in CloudTrail, authorization failures will also be logged in the associated CloudWatch log group. All you need to do is add a metric filter, with an alarm, that catches &lt;a href="https://docs.fugue.co/FG_R00065.html"&gt;console sign-in failures&lt;/a&gt; and authorization failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rotate IAM user passwords&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS recommends regularly rotating the passwords of all IAM users. This can be done by adding a &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_passwords_account-policy.html#PasswordPolicy_CLI"&gt;password policy&lt;/a&gt; from the AWS console.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At least one user with support role&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS provides the ability to manage incidents with AWS Support through a managed policy called arn:aws:iam::aws:policy/AWSSupportAccess. This allows an IAM user to interact with the AWS Support Center, where users can talk to AWS agents to resolve issues. It is recommended that at least one IAM user is assigned this policy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enable IAM access analyzer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;IAM Access Analyzer is an &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/what-is-access-analyzer.html"&gt;AWS security feature&lt;/a&gt; that regularly scans your IAM policies for misconfigurations, excessive privileges, and other possible security risks. It is recommended to enable it in all active regions.&lt;/p&gt;

&lt;h3&gt;
  
  
  VPC
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;VPC flow logs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html"&gt;Flow logs&lt;/a&gt; capture information about incoming / outgoing / restricted traffic without any impact on network performance. They are helpful in diagnosing overly restrictive security group rules. It is recommended to enabled flog logs on all subnets in your VPC&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observe VPC configuration changes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;VPC configuration changes should be logged to CloudTrail, and a log metric filter with an associated alarm should be configured in CloudWatch so that you are notified of all changes to VPC configurations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Code Signing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/configuration-codesigning.html"&gt;This feature&lt;/a&gt; ensures only code from trusted sources runs in your lambda functions. For this you'll have to first create an AWS signer profile and AWS signing config using another service called is AWS signer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CloudWatch Lambda Insights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Lambda functions' runtime performance metrics (e.g. CPU, memory, disk usage) can be monitored by enabling &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Lambda-Insights.html"&gt;enhanced monitoring&lt;/a&gt; from the Lambda function console. This will allow you to spot unusual behaviour in your Lambda functions.&lt;/p&gt;

&lt;h3&gt;
  
  
  EBS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Volumes and Snapshots should be encrypted&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It is recommended that all EBS snapshots and volumes in use are encrypted using an AWS KMS customer managed key. For existing volumes, take a snapshot, encrypt it, and restore the volume from it. The console also has an option to enable encryption by default for newly created volumes.&lt;/p&gt;

&lt;h3&gt;
  
  
  KMS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Customer managed keys must be rotated regularly&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rotation of AWS managed keys is handled by AWS; customer managed keys can be configured from the AWS console to rotate at a fixed interval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor KMS configuration changes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just like the services mentioned previously, metric filters can be added to the CloudTrail logs in CloudWatch to catch API calls like &lt;code&gt;ScheduleKeyDeletion&lt;/code&gt; and &lt;code&gt;DisableKey&lt;/code&gt;, with an alarm associated with them.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;EC2&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Ensure IMDSv1 is not in use&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The instance metadata service on EC2 instances provides an API to access meta information about the instance; however, it is also an exploitation target for attackers. You need to ensure that your EC2 instances use IMDSv2 and not IMDSv1; read more &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html"&gt;here&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;Many of the security features mentioned above come at a cost, so be mindful of their impact on your AWS cloud bill.&lt;/p&gt;

&lt;p&gt;Hopefully, you learned something new!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>security</category>
      <category>developer</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Designing SQL databases for rock solid data quality</title>
      <dc:creator>Sudheer Tripathi</dc:creator>
      <pubDate>Sun, 02 Oct 2022 18:07:27 +0000</pubDate>
      <link>https://dev.to/sudheer121/designing-sql-databases-for-rock-solid-data-quality-3ech</link>
      <guid>https://dev.to/sudheer121/designing-sql-databases-for-rock-solid-data-quality-3ech</guid>
      <description>&lt;p&gt;The blog focuses on 3 things,&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prevention from data redundancy.&lt;/li&gt;
&lt;li&gt;Good design choices.&lt;/li&gt;
&lt;li&gt;Bad design choices.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;u&gt;Prerequisites&lt;/u&gt;: Basic knowledge of databases and practices like indexing, normalization, etc. &lt;/p&gt;




&lt;h2&gt;
  
  
  1. Unique partial index to avoid data duplication
&lt;/h2&gt;

&lt;p&gt;While indexes are generally used to improve database performance, they can also be used to prevent redundancy in data. You can add &lt;a href="https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-UNIQUE-CONSTRAINTS"&gt;unique constraints&lt;/a&gt; on your columns in DDL, but for conditional unique constraints you will have to define a unique &lt;a href="https://www.postgresql.org/docs/current/indexes-partial.html"&gt;partial index&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Consider an example of a shopping-cart application, where along with orders you would also like to uniquely identify every checkout.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frpzhxtvf77whatmn3ock.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frpzhxtvf77whatmn3ock.png" alt="user_orders table" width="800" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the above example, under cart_id &lt;code&gt;abc-123&lt;/code&gt;, the product_id &lt;code&gt;xyz-456&lt;/code&gt; has a duplicate entry. Such cases can be detected by adding a unique index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="k"&gt;unique&lt;/span&gt; &lt;span class="k"&gt;index&lt;/span&gt; &lt;span class="n"&gt;unique_product_cart&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;user_orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cart_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;deleted_at&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Soft delete cascades
&lt;/h2&gt;

&lt;p&gt;Keeping &lt;a href="https://en.wiktionary.org/wiki/soft_deletion#:~:text=(databases)%20An%20operation%20in%20which,data%20itself%20from%20the%20database."&gt;soft deleted&lt;/a&gt; data in your database is awesome, but it makes ensuring data quality harder: every time you add a constraint, you'll have to account for the &lt;code&gt;deleted_at&lt;/code&gt; column (the first point is a good example).&lt;/p&gt;

&lt;p&gt;Managing soft deletes has to be done at the application level, since in SQL databases cascading on update only happens for foreign keys. If you are lucky, your ORM supports soft-delete cascades; otherwise you'll have to implement them yourself, for instance with model observers or by extending your ORM's query builder. They can be implemented using triggers too.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Transactions at application level
&lt;/h2&gt;

&lt;p&gt;A single operation at the application level might be a combination of multiple writes at the database level. While you might enforce all kinds of constraints on the database, your data will still get ruined if you don't implement transactions at the application level.&lt;/p&gt;

&lt;p&gt;Consider a movie ticket booking application: reserving a seat for a movie can be a combination of multiple operations at the database level, e.g. adding rows to the &lt;code&gt;seat_reservation&lt;/code&gt;, &lt;code&gt;user_activity&lt;/code&gt; and &lt;code&gt;snacks_orders&lt;/code&gt; tables. All of these operations would usually sit behind a function like &lt;code&gt;reserveTicket()&lt;/code&gt;. Such a function has to be wrapped in a transaction in application code.&lt;/p&gt;
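
&lt;p&gt;As a sketch, the writes behind &lt;code&gt;reserveTicket()&lt;/code&gt; wrapped in a single transaction (the column names are illustrative):&lt;/p&gt;

```sql
begin;
insert into seat_reservation (user_id, seat_id) values (7, 'A12');
insert into user_activity (user_id, action) values (7, 'reserved_seat');
insert into snacks_orders (user_id, item) values (7, 'popcorn');
commit; -- either all three rows land together, or (on rollback) none do
```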

&lt;h2&gt;
  
  
  4. Keep historic changes and quick lookup columns
&lt;/h2&gt;

&lt;p&gt;Sometimes saving historic data in a separate table becomes important. For example consider a credit card company, when you apply for a credit card it goes through multiple intermediate stages before getting approved. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkrs5gblkmg351jefri6e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkrs5gblkmg351jefri6e.png" alt="Image description" width="800" height="138"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the above example, the &lt;code&gt;applications&lt;/code&gt; table keeps track of users' applications, while the &lt;code&gt;application_status&lt;/code&gt; table keeps track of the status of each application.&lt;br&gt;
The reason for keeping historic changes is to track who changed the status (creator_id) and why it was changed (comments).&lt;br&gt;
Notice that the applications table also has a &lt;code&gt;latest_status_id&lt;/code&gt; column for quick lookup of the latest status; it's good practice to have such columns.&lt;/p&gt;

&lt;p&gt;Credit card companies don't actually work this way, but I hope you get my point 🙃.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Save last data point in a JSON meta column
&lt;/h2&gt;

&lt;p&gt;Sometimes you might not need to track how a row's data has changed historically, but only keep the last data point for rolling back. Consider the same example from point 4: this time we save the status in the same table, but we also save the old status in a JSON column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5m2kyh4o3hjhb5gwgoe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5m2kyh4o3hjhb5gwgoe.png" alt="applications_jsonb" width="800" height="178"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This example doesn't do justice to how useful saving old critical data points in a JSON meta column can be 😶; used in the right place, it has the potential to save you a lot of time during data analysis or data rollback.&lt;/p&gt;
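
&lt;p&gt;A minimal Postgres sketch of the idea, assuming a &lt;code&gt;jsonb&lt;/code&gt; meta column on the applications table:&lt;/p&gt;

```sql
-- Stash the previous status in the meta column before overwriting it.
update applications
set meta = jsonb_set(coalesce(meta, '{}'), '{old_status}', to_jsonb(status)),
    status = 'rejected'
where id = 42;
```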

&lt;h2&gt;
  
  
  6. Log data changes
&lt;/h2&gt;

&lt;p&gt;While schema changes are tracked by migration files, it's also important to keep track of all data changes in the database. Usually this is implemented using a polymorphic table that records changes to all models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon77waujqv6w6xihykz3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon77waujqv6w6xihykz3.png" alt="changelogs" width="800" height="166"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How to do it?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Aspect Oriented Programming (you are lucky if your framework has good support for it)&lt;/li&gt;
&lt;li&gt;Database &lt;a href="https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-DML-TRIGGER"&gt;Triggers&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Overriding your query builder's CRUD methods/events or using external libraries that do that.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's also important to regularly clean up these logs and archive them elsewhere, as the table grows heavy quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Use Polymorphic tables
&lt;/h2&gt;

&lt;p&gt;The table used in point 6 is a polymorphic table: the column &lt;code&gt;model_id&lt;/code&gt; is a foreign key that can point to any table in the database.&lt;/p&gt;

&lt;p&gt;Let's take another example of an online forum. The forum can have comment sections in multiple places and from multiple sources, e.g. comments from users, admins, automated comments from the system, etc. Instead of storing comments in multiple tables, we can have one polymorphic table for storing comments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focktig7pi976wf0tts9z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focktig7pi976wf0tts9z.png" alt="Comments polymorphic" width="800" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Avoid saving UUIDs as string
&lt;/h2&gt;

&lt;p&gt;While keeping UUIDs as primary keys is a &lt;a href="https://dzone.com/articles/uuid-as-primary-keys-how-to-do-it-right"&gt;good choice&lt;/a&gt;, saving them as binary is much better than saving them as strings.&lt;/p&gt;

&lt;p&gt;UUIDs consist of 32 hexadecimal digits and 4 hyphens. Saved as a string, a UUID can take up to 36 bytes (288 bits). Most SQL databases support saving UUIDs as 128-bit binary values, and the index size of binary UUIDs is less than half that of string UUIDs.&lt;/p&gt;
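
&lt;p&gt;In MySQL 8, for example, built-in conversion helpers make binary storage straightforward:&lt;/p&gt;

```sql
-- 16-byte binary primary key instead of a 36-character string.
create table users (
  id binary(16) primary key
);
insert into users (id) values (uuid_to_bin(uuid()));
select bin_to_uuid(id) as id from users;
```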

&lt;h2&gt;
  
  
  9. Avoid unnecessary indexes
&lt;/h2&gt;

&lt;p&gt;Indexes are cool, but it's important to keep in mind that they occupy memory and need to be updated on every write operation (in Postgres). Too many indexes means too many indexes to update on every insert/delete, which will slow down your database writes.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. Use views when possible
&lt;/h2&gt;

&lt;p&gt;There are multiple cases where views are a good choice; I'll describe one of them. Consider a case where stakeholders require reports over all the data in your critical tables. For this, you would aggregate data from multiple tables using joins, which overall makes for a complex query. There can be multiple places where you want to reuse such aggregate data, i.e. reuse the query. In such cases, writing a view is a good choice.&lt;/p&gt;

&lt;p&gt;Creating a view is the same as writing the same query at the application level, except that with a view the query is saved, and reused, at the database level.&lt;/p&gt;




&lt;p&gt;There are many more things that make for good choices when it comes to designing databases and maintaining data quality. If you have things to share, please do so in the comments.&lt;/p&gt;

&lt;p&gt;-- &lt;a href="https://github.com/Sudheer121"&gt;sudheer121&lt;/a&gt; &lt;/p&gt;

</description>
      <category>sql</category>
      <category>postgres</category>
      <category>database</category>
      <category>mysql</category>
    </item>
    <item>
      <title>Nodejs developer gets blown away by Laravel</title>
      <dc:creator>Sudheer Tripathi</dc:creator>
      <pubDate>Fri, 04 Feb 2022 07:54:40 +0000</pubDate>
      <link>https://dev.to/sudheer121/nodejs-developer-gets-blown-away-by-laravel-5dpi</link>
      <guid>https://dev.to/sudheer121/nodejs-developer-gets-blown-away-by-laravel-5dpi</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;"They've got docker setup upfront, that's awesome !!" - on the first day of learning laravel&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I don't know what you think about PHP, but the developer experience with Laravel has been really good, and it motivated me to write this blog.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foe5blnc2tjrnca8w9041.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foe5blnc2tjrnca8w9041.jpeg" alt="Laravel deveolper" width="307" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  My background with other Frameworks / Libs
&lt;/h3&gt;

&lt;p&gt;Most of my projects have been full-stack JavaScript / TypeScript, which makes me look at Laravel through a different lens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here are the top 5 things I liked about Laravel.
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Automated dockerizing with &lt;code&gt;Laravel Sail&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Managing different database types/versions and switching between them for different projects gets messy.&lt;/li&gt;
&lt;li&gt;With Laravel Sail, you can get your Laravel application and the database of your choice running inside a Docker container in no time.&lt;/li&gt;
&lt;li&gt;The best part: the &lt;code&gt;sail&lt;/code&gt; CLI connects you to your dockerized Laravel application from outside Docker.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Server Side Rendering with &lt;code&gt;Blade&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If your frontend is complex, separating it out is the better option, and &lt;code&gt;NextJs / NuxtJs / etc&lt;/code&gt; would be the way to go. &lt;/li&gt;
&lt;li&gt;But when it comes to server-side rendering inside your backend application, Laravel really shines. SSR code written with Blade is far cleaner than the &lt;code&gt;Express + EJS&lt;/code&gt; duo.&lt;/li&gt;
&lt;li&gt;You get more powerful directives, multiple ways to nest components, custom directives, and more. If a component involves heavy logic, you can make it a class-based component. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consider an example where you have to show some posts to a logged-in user, skipping the first post. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhxqrk0xv4ly9t7tm5uz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhxqrk0xv4ly9t7tm5uz.png" alt="With EJS" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrac4766f8wosg95smza.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrac4766f8wosg95smza.png" alt="With Blade" width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Laravel Blade creates an "aha" moment. &lt;/p&gt;
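&lt;p&gt;As a rough sketch of the Blade side of that example (the &lt;code&gt;$posts&lt;/code&gt; variable and the markup are assumed, not taken from the screenshot):&lt;/p&gt;

```blade
{{-- Hypothetical sketch: show posts to a logged-in user, skip the first --}}
@auth
  &lt;ul&gt;
    @foreach ($posts as $post)
      @continue($loop->first) {{-- skip the first post --}}
      &lt;li&gt;{{ $post->title }}&lt;/li&gt;
    @endforeach
  &lt;/ul&gt;
@endauth
```

&lt;p&gt;The &lt;code&gt;$loop&lt;/code&gt; variable is available inside every &lt;code&gt;@foreach&lt;/code&gt;, which replaces the index bookkeeping you would do by hand in EJS.&lt;/p&gt;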




&lt;h3&gt;
  
  
  Application bootstrapping and Dependency Injection
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The Laravel application instance is a &lt;code&gt;service container&lt;/code&gt;. You can bind a class into the &lt;code&gt;service container&lt;/code&gt; as a singleton and reuse the same instance wherever you need it. &lt;/li&gt;
&lt;li&gt;Laravel automatically resolves &lt;a href="https://freecontent.manning.com/understanding-constructor-injection/"&gt;constructor injection&lt;/a&gt; and &lt;a href="https://freecontent.manning.com/understanding-method-injection/"&gt;method injection&lt;/a&gt; (&lt;code&gt;kinda like NestJs&lt;/code&gt;). &lt;/li&gt;
&lt;li&gt;With &lt;code&gt;AppServiceProvider&lt;/code&gt; you can easily swap out which implementation gets injected by default. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy5a71fxj1pe0p9ov5mz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy5a71fxj1pe0p9ov5mz2.png" alt="Bind to app" width="800" height="642"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6l7xeu2clcszkuha90z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6l7xeu2clcszkuha90z.png" alt="Auto Inject" width="800" height="676"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Migrations, Factories and Query Builder
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Writing migrations and seeding the database is easy. &lt;/li&gt;
&lt;li&gt;Personally, I have used &lt;code&gt;Sequelize&lt;/code&gt; heavily, and its migrations come with an initial learning curve.&lt;/li&gt;
&lt;li&gt;Laravel migrations felt cleaner to me. Laravel also ships with a developer-friendly query builder out of the box, similar to &lt;code&gt;TypeORM&lt;/code&gt; but more flexible.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consider the example below, where we eager-load the Author, Category, and Comments relations of a Post and filter the results by post body and category name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foup7zazm71dh56x22f3u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foup7zazm71dh56x22f3u.png" alt="Laravel Query Builder Example" width="800" height="698"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Top-notch support for &lt;code&gt;miscellaneous requirements&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;With NodeJs frameworks, adding (and updating) npm packages is a frequent chore; you usually install a package for every miscellaneous requirement. &lt;/li&gt;
&lt;li&gt;Laravel has built-in or first-party support for authentication, request validation, cron jobs, mailing, event handling, HTTP requests, notifications, caching, file storage, OAuth, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reviews above are based solely on developer experience. Laravel is slower than many other backend frameworks in raw performance, but it does fine for most requirements.&lt;/p&gt;

&lt;p&gt;There's something unique to learn from every framework (I learned some new design patterns in Laravel) and it's important to not get attached to the one you are using.&lt;/p&gt;




&lt;h3&gt;
  
  
  Hiring Alert
&lt;/h3&gt;

&lt;p&gt;I work as a Software Engineer Intern at ClearGlass, a Cost Transparency company based in London, UK.&lt;/p&gt;

&lt;p&gt;ClearGlass is looking for Senior Software Engineers for the Engineering team. Most of our tech stack is NodeJS / PHP. Learn more about us and apply &lt;a href="https://clearglass.com/careers/"&gt;here&lt;/a&gt;, and check out our tech stack &lt;a href="https://stackshare.io/clearglass/clearglass"&gt;here&lt;/a&gt;. &lt;br&gt;
See you there 👋. &lt;/p&gt;

</description>
      <category>laravel</category>
      <category>node</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
