Danny Chan for MongoDB Builders

🌐 Get started: MongoDB Change streams, Concurrency, backup snapshot & checkpoint, Compound Wildcard Indexes

💫 Change streams:

  • Use cases: change events for large documents, document pre-images, and post-images
  • Access real-time data changes
  • Subscribe to data changes on a collection, a database, or an entire deployment, then react to them
  • Available for replica sets and sharded clusters
  • Must use WiredTiger storage engine
  • Can use encryption-at-rest
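
As a rough illustration of subscribing to changes, here is a minimal sketch assuming the Node.js driver's `collection.watch()` API. The function name `subscribeToChanges` and the `onChange` callback are hypothetical, and the filter on inserts and updates is only an example choice.

```javascript
// Options asking the server to attach the pre-image and post-image of each
// changed document. Pre- and post-images must also be enabled on the
// collection itself for these fields to be populated.
const changeStreamOptions = {
  fullDocument: "whenAvailable",
  fullDocumentBeforeChange: "whenAvailable",
};

// Only react to inserts and updates; other event types are filtered out.
const changeStreamPipeline = [
  { $match: { operationType: { $in: ["insert", "update"] } } },
];

// `collection` is expected to behave like a Node.js driver Collection.
// `onChange` is a hypothetical callback supplied by the application.
function subscribeToChanges(collection, onChange) {
  const stream = collection.watch(changeStreamPipeline, changeStreamOptions);
  stream.on("change", onChange);
  return stream;
}
```

The caller keeps the returned stream so it can close it when the subscription is no longer needed.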


🕰️ Time series data:

  • High-volume datasets
  • Improved storage optimization and compression
  • High cardinality data
  • Fine-grained updates & deletes
{
    "metadata": { "sensorId": 5578, "type": "temperature" },
    "timestamp": ISODate("2021-05-18T00:00:00.000Z"),
    "temp": 12
}
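
A collection for documents shaped like the sample above could be created along these lines. This is a sketch assuming the Node.js driver's `db.createCollection()`; the collection name `weather` and the `hours` granularity are assumptions, not taken from the post.

```javascript
// Options for createCollection(); field names mirror the sample document.
const timeSeriesOptions = {
  timeseries: {
    timeField: "timestamp", // required: the field holding each measurement's date
    metaField: "metadata",  // optional: the field identifying the series (the sensor)
    granularity: "hours",   // assumed: sensors report roughly hourly
  },
};

// `db` is expected to behave like a Node.js driver Db.
// "weather" is a hypothetical collection name.
function createWeatherCollection(db) {
  return db.createCollection("weather", timeSeriesOptions);
}
```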



Concurrency:



🔒 WiredTiger Storage Engine:

  • Can't pin documents to the WiredTiger cache
  • Can't reserve a portion of the cache for reads or writes
  • A heavy write workload can affect performance
  • Allocates its cache to the entire MongoDB instance


🌍 Feature: Transaction (Read & Write) Concurrency:

  • Dynamically adjust maximum number of concurrent storage engine transactions (read and write)
  • Optimizes database throughput during cluster overload
  • Never exceeds 128 read & write tickets
  • Read & write are always equal


Document-Level Concurrency Control:

  • WiredTiger uses it for write operations
  • Multiple clients can modify different documents of a collection at the same time
  • Uses only intent locks at the global, database, and collection levels
  • When the storage engine detects a conflict between two operations, one proceeds and the other incurs a write conflict
  • MongoDB transparently retries the conflicting operation
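
The conflict-and-retry behaviour can be pictured with a toy in-memory model. This is only a sketch of optimistic concurrency in general, not WiredTiger's implementation; `ToyStore`, `updateWithRetry`, and the `interleave` hook are hypothetical names.

```javascript
// Toy in-memory store: each document carries a version that changes on every write.
class ToyStore {
  constructor() {
    this.docs = new Map();
    this.conflicts = 0;
  }

  insert(id, fields) {
    this.docs.set(id, { version: 0, fields: { ...fields } });
  }

  read(id) {
    const doc = this.docs.get(id);
    return { version: doc.version, fields: { ...doc.fields } };
  }

  // Commit succeeds only if nobody else wrote this document since it was read.
  tryCommit(id, readVersion, newFields) {
    const doc = this.docs.get(id);
    if (doc.version !== readVersion) {
      this.conflicts += 1;
      return false; // write conflict on this document
    }
    this.docs.set(id, { version: readVersion + 1, fields: { ...newFields } });
    return true;
  }
}

// Read, modify, commit; on a write conflict, start over without telling the caller.
// `interleave` is a hypothetical hook that lets a demo inject a competing write.
function updateWithRetry(store, id, modify, interleave = () => {}) {
  for (;;) {
    const snapshot = store.read(id);
    const next = modify(snapshot.fields);
    interleave();
    if (store.tryCommit(id, snapshot.version, next)) return next;
  }
}
```

Writes to different documents never touch the same version counter, so in this model only two writers on the same document can conflict.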



Backup:



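🛡️ Feature: point-in-time snapshot

  • WiredTiger uses MultiVersion Concurrency Control (MVCC)
  • At the start of an operation, WiredTiger provides the operation with a point-in-time snapshot of the data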


🗄️ Feature: checkpoint

  • Checkpoints act as recovery points
  • At a checkpoint, the snapshot data becomes durable in the data files
  • The data files are consistent up to and including the last checkpoint
  • MongoDB writes snapshot data to disk every 60 seconds
  • If MongoDB encounters an error while writing a new checkpoint, it can recover from the last valid checkpoint


📜 Journal:

  • Ensures data durability by combining a write-ahead log (the journal) with checkpoints
  • Persists all data modifications between checkpoints
  • If MongoDB exits unexpectedly between checkpoints, it uses the journal to replay all data modified since the last checkpoint
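
How a checkpoint and a journal combine during recovery can be sketched with a toy model. This is a simplified illustration under my own assumptions, not MongoDB's recovery code; `ToyEngine` and its methods are hypothetical.

```javascript
// Toy engine: `dataFiles` holds the last checkpoint, `journal` holds writes made since then.
class ToyEngine {
  constructor() {
    this.memory = new Map();    // in-memory state, lost on a crash
    this.dataFiles = new Map(); // durable state as of the last checkpoint
    this.journal = [];          // durable write-ahead log since the last checkpoint
  }

  write(key, value) {
    this.journal.push({ key, value }); // log first (write-ahead)
    this.memory.set(key, value);       // then apply in memory
  }

  checkpoint() {
    this.dataFiles = new Map(this.memory); // snapshot becomes the new recovery point
    this.journal = [];                     // older journal entries are no longer needed
  }

  crash() {
    this.memory = new Map(); // only dataFiles and journal survive
  }

  recover() {
    this.memory = new Map(this.dataFiles);       // start from the last checkpoint
    for (const { key, value } of this.journal) { // replay writes made after it
      this.memory.set(key, value);
    }
  }
}
```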


📜 Compression:

  • Snappy compression for collections
  • Prefix compression for indexes (deduplicate common prefixes)
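
The idea behind prefix compression can be shown with a small front-coding sketch over sorted keys. This illustrates the general technique only and is not WiredTiger's on-disk format; the function names and sample keys are hypothetical.

```javascript
// Encode sorted keys as (length of the prefix shared with the previous key, remaining suffix).
function prefixCompress(sortedKeys) {
  let previous = "";
  return sortedKeys.map((key) => {
    let shared = 0;
    while (
      shared < key.length &&
      shared < previous.length &&
      key[shared] === previous[shared]
    ) {
      shared += 1;
    }
    previous = key;
    return { shared, suffix: key.slice(shared) };
  });
}

// Rebuild each key from the previous key's prefix plus the stored suffix.
function prefixDecompress(entries) {
  let previous = "";
  return entries.map(({ shared, suffix }) => {
    previous = previous.slice(0, shared) + suffix;
    return previous;
  });
}

const sampleKeys = ["tenant1/addr", "tenant1/blockId", "tenant1/name"];
const compressed = prefixCompress(sampleKeys);
```

Here the common prefix `tenant1/` is stored once in full, and later entries keep only their differing suffix.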


📜 Memory Use:

  • Utilizes both the WiredTiger internal cache and the filesystem cache


📜 Block compression:

  • Provides significant on-disk storage savings
  • Collection data in the WiredTiger internal cache is kept uncompressed so that it can be manipulated



Compound Wildcard Indexes:



Feature: Compound Wildcard Indexes

  • One wildcard index can replace a large number of individual indexes
  • Can efficiently cover many potential queries
  • Applies well to the Attribute Pattern

From

{ tenantId: 1, "customFields.addr": 1 }
{ tenantId: 1, "customFields.name": 1 }
{ tenantId: 1, "customFields.blockId": 1 }

To

{ tenantId: 1, "customFields.$**": 1 }
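
Creating that single index and querying against it might look like the following sketch, which assumes the Node.js driver's `collection.createIndex()`. The helper names and the filter builder are hypothetical.

```javascript
// Key pattern from the text: one regular field plus a wildcard over customFields.
const compoundWildcardKeys = { tenantId: 1, "customFields.$**": 1 };

// `collection` is expected to behave like a Node.js driver Collection.
function createTenantCustomFieldsIndex(collection) {
  return collection.createIndex(compoundWildcardKeys);
}

// Hypothetical helper: build a filter on tenantId plus any one custom field,
// the query shape this index is meant to serve.
function customFieldFilter(tenantId, fieldName, value) {
  return { tenantId, [`customFields.${fieldName}`]: value };
}
```

For example, `customFieldFilter("t1", "addr", "1 Main St")` produces a filter on `tenantId` and `customFields.addr` without needing a dedicated index per custom field.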



More Features:



🔄 cluster-to-cluster sync (mongosync)

  • Can sync specific data sets instead of the entire cluster


📊 approximate percentiles

  • Use $percentile as an accumulator in the $group stage or as an aggregation expression

Record:

{ studentId: "2345", test01: 62, test02: 81, test03: 80 }

Result:

{ studentId: '2345', testPercentiles: [ 80, 81 ] }
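
A pipeline that could produce a result of this shape is sketched below. The post does not say which percentiles were requested, so the `p` values of 0.5 and 0.95 are an assumption. The local `nearestRankPercentiles` helper is also hypothetical: it is a plain nearest-rank calculation used to sanity-check the sample, not the approximation algorithm the server runs.

```javascript
// Stage using $percentile as an aggregation expression over three fields.
const percentilePipeline = [
  {
    $project: {
      _id: 0,
      studentId: 1,
      testPercentiles: {
        $percentile: {
          input: ["$test01", "$test02", "$test03"],
          p: [0.5, 0.95], // assumed percentiles
          method: "approximate",
        },
      },
    },
  },
];

// Local nearest-rank percentile, used here only to check the sample numbers.
function nearestRankPercentiles(values, ps) {
  const sorted = [...values].sort((a, b) => a - b);
  return ps.map((p) => sorted[Math.max(0, Math.ceil(p * sorted.length) - 1)]);
}

const record = { studentId: "2345", test01: 62, test02: 81, test03: 80 };
const localResult = nearestRankPercentiles(
  [record.test01, record.test02, record.test03],
  [0.5, 0.95]
);
```

With the three scores sorted as 62, 80, 81, the local check picks 80 for the 50th percentile and 81 for the 95th.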


👤 user role

  • Use the USER_ROLES system variable within aggregation pipelines
  • Example: an application displays different data based on the current user's permissions
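
One way to picture role-based filtering is a `$match` stage that compares a per-document role list with the `$$USER_ROLES` system variable. This is a sketch: the `allowedRoles` field is a hypothetical schema choice, and `isVisibleTo` is a local stand-in for trying the same logic without a server.

```javascript
// Keep only documents whose `allowedRoles` array (hypothetical field)
// shares at least one role with the user running the query.
const roleFilter = {
  $expr: {
    $not: {
      $eq: [{ $setIntersection: ["$allowedRoles", "$$USER_ROLES.role"] }, []],
    },
  },
};

const roleAwarePipeline = [{ $match: roleFilter }];

// Local stand-in for the same check. `userRoles` mimics the shape of
// USER_ROLES entries, each of which has a `role` name.
function isVisibleTo(doc, userRoles) {
  const names = userRoles.map((r) => r.role);
  return doc.allowedRoles.some((role) => names.includes(role));
}
```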


🔒 Queryable Encryption

  • Security: data stays encrypted while it is queried
  • Fully randomized encrypted data

Editor

Danny Chan, specializing in FSI and Serverless

Kenny Chan, specializing in FSI and Machine Learning
