I was recently tasked with expanding the back end of a rental home listing application and in this post I am going to share some of the things I applied!
The goal for this project was to ensure that the application could handle over 100 requests per second on an EC2 micro instance while utilizing a database that contained a minimum of 10 million records.
I was tasked with the photo carousel portion of the app and it contained 2 datasets, one for all of the home listings and one for all of the photos. Listings to photos is a one to many relationship with each listing having 5 to 10 photos but each photo only corresponding to one listing. I had to make a choice between using a non-relational database mongoDB or a relational database postgres.
It made much more sense to use postgres for this application because postgres is a relational database and the data I was working with is inherently relational with each listing having multiple corresponding photos.
I created a table for each dataset, one for all of the listings that contained 10 million records and one for the photos that contained 50 million records.
The primary request made by the application is an inner join that selects a listing and all of the images that pertain to that listing. So I added a listing id foreign key on the photos table that corresponded to the id on the listings table.
My initial query time to select a listing and its photos was quite slow at 700ms so I decided to apply an index to the listing id on the photos table. This allows all of the photos with a particular listing id to be selected at one spot in memory and selected all at once rather than having to iteratively search through 50 million records. After adding the index my query time improved dramatically to .8ms.
After finishing the database design locally I migrated it to a EC2 micro instance with one instance for the database and one for the application. Initially the average response time was 200ms and the google page load score was at 88%.
The application consists almost entirely of images so I decided to set up a cloud distribution network instead of hosting them directly from S3. I attached the cdn to my S3 bucket for this project and this cut down the distance that the files need to travel by utilizing servers that are closest to the user which greatly increases the speed of delivery.
I then set up redis as a cache to store files that have been accessed from postgres recently. All recently requested files are stored in redis so if they are requested again they can just be withdrawn from redis instead of postgres. This sped up my query times quite a bit and I was now at an average request time of 78ms. I then decided to horizontally scale the application with nginx by making 4 more images of the service. After horizontally scaling I ended up with a request time of 34ms and a google page load speed of 92%.
And thats all of the steps I made to improve this application! Good luck and I hope this may be of help.