This article was written by Darshan Jayarama.
I was recently packing for a short trip, and I hate carrying two or three pieces of luggage: one for my laptop, another for clothes. So a person like me has definitely used a "vacuum sealer," which lets me pack as many clothes as I want into a single bag.
During the trip, I had an idea: which "vacuum sealers" do we have in MongoDB to store data efficiently and reduce the bill? Here, I have jotted down a few.
Online Archive:
Does your dataset contain cold data? For a supply chain company, data older than three months is considered infrequently accessed. For an e-commerce site or a courier company, orders that have already been delivered are rarely accessed by users. You can identify such documents with a rule, such as "lastAccess older than 30 days," and let Online Archive move them into cloud object storage. Worried about how to access the data when needed? Relax, it's all covered: archived data remains queryable with no access restrictions. Storage costs drop from $4/GB/month to $0.02/GB/month.
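Before creating an archive rule, it helps to estimate how much cold data you actually have. A minimal mongosh sketch, assuming a hypothetical orders collection with a deliveredAt date field (both names are illustrative):

```javascript
// mongosh: count documents that would qualify for archiving
// under a "delivered more than 90 days ago" rule.
const cutoff = new Date(Date.now() - 90 * 24 * 60 * 60 * 1000); // 90 days ago
db.orders.countDocuments({ deliveredAt: { $lt: cutoff } });
```

The archiving rule itself is configured on the Atlas side (UI or Admin API), using the same date-based criteria.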
Compact:
In a write-intensive application where you frequently delete data, the storage watermark stays high. You can run the command below to check:
db.COLLNAME.stats().freeStorageSize
The above command outputs how many bytes are available for reuse. WiredTiger reuses this space over time on its own, but if you need to reclaim that storage right away, you can run compact on the collection.
We can run compact with the following syntax:
use mydb // switch to the database
db.runCommand( { compact: "myCollection" } ) // mention the collection name
Since compact has several other options, I recommend visiting the official documentation of compact.
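To decide which collections are worth compacting, you can survey the free space in every collection at once. A quick sketch to run in mongosh against the current database:

```javascript
// Print the reusable (free) bytes for each collection,
// so you can target compact at the worst offenders.
db.getCollectionNames().forEach(name => {
  const free = db.getCollection(name).stats().freeStorageSize || 0;
  print(`${name}: ${free} bytes reusable`);
});
```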
Resync Secondary:
Running compact is fine if you've got just a handful of collections. But when you're dealing with many collections with high watermarks, especially large ones, running compact on each one isn't ideal.
Instead, you can resync the members in a rolling fashion so that every member releases that space back to the operating system and utilizes storage efficiently.
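At a high level, a rolling resync replaces each secondary's data files one member at a time, so the replica set stays available throughout. A rough sketch, assuming a systemd-managed mongod with its dbPath at /var/lib/mongodb (adjust both for your deployment):

```shell
# Repeat for ONE secondary at a time; never take down a majority.
systemctl stop mongod          # 1. stop the secondary
rm -rf /var/lib/mongodb/*      # 2. empty its dbPath
systemctl start mongod         # 3. restart; the member runs an initial sync
# 4. wait until rs.status() shows the member back in SECONDARY,
#    then move on to the next member.
```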
TTL Index:
So far, we have discussed reactive approaches to storage utilization: we prepare a plan once storage hits the ceiling. But is there a proactive approach? Yes, and this is where TTL indexes come into play. If you know the data is no longer needed after a certain time, create a TTL index, and expired documents will be deleted automatically.
This approach is useful for applications that generate large volumes of logs. Set a TTL index of three days, and each log entry will be removed from the collection once it expires.
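For example, to expire log entries three days after they were written, create the TTL index on a date field (collection and field names here are illustrative):

```javascript
// mongosh: delete each document ~259200 seconds (3 days) after createdAt.
// createdAt must hold a BSON date for TTL expiry to apply.
db.logs.createIndex({ createdAt: 1 }, { expireAfterSeconds: 259200 });
```

Note that the TTL background thread runs periodically (roughly once a minute), so deletion is not instantaneous.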
Query Optimization:
Being a query enthusiast, I circle back to query optimization, because a single bad query can eat up your IOPS, network, and compute.
- Tune the query.
- Create a compound index that covers the query and the projection.
- Reduce network round-trips and IOPS.
- If possible, precompute the value and store it in a collection.
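As an example of the second point, here is a sketch of a covered query in mongosh: the compound index serves both the filter and the projection, so MongoDB never has to fetch the documents themselves (collection and field names are illustrative):

```javascript
// Index covers the filter (status), the sort (orderDate), and the projection.
db.orders.createIndex({ status: 1, orderDate: -1 });

// Excluding _id and projecting only indexed fields keeps the query covered.
db.orders.find(
  { status: "DELIVERED" },
  { _id: 0, status: 1, orderDate: 1 }
).sort({ orderDate: -1 });
```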
Don't know where to start? Use (and abuse) the Atlas Performance Advisor for index recommendations, inefficient queries, and schema advice. Or visit my previous blogs:
- How schema anti-patterns in MongoDB can cost you $$$$
- Whistleblower for Database: Setup your internal informant who exposes performance
Flex Cluster:
Planning to run some dev/API tests? Go with a Flex cluster rather than an M10+ cluster for non-prod deployments. You get the same API, and you can essentially pause/resume it when not needed.
Conclusion:
MongoDB gives you plenty of storage-saving options, but only smart DBAs use them. Your cluster may be bleeding $$$ right now. Online Archive, TTL indexes, compaction, and proper sizing can cut your bill dramatically, by as much as 60% in some cases.
Pick three techniques, deploy them, and watch the bill shrink. Simple as that. 💰
