DEV Community

Jami

What Is MongoDB?

What is MongoDB?

MongoDB stores data in JSON-like documents encoded as BSON. BSON stands for Binary JSON: a binary serialization of JSON-like documents that MongoDB uses to store data and to send it over the network. Because each value carries explicit type and length information, BSON is easier for machines to parse than JSON. BSON files are written in binary, so they are not human-readable, but they are quick for machines to build and scan. To convert BSON files back to JSON, you can use the bsondump command. While JSON is commonly used to send data through web APIs, BSON is what MongoDB uses to store data. MongoDB itself is source available (released under a model where the source code can be viewed), and it supports range queries (retrieving records with values between a lower and an upper boundary) as well as regular-expression searches. Beyond the JSON types, BSON can handle data types such as Binary Data (BinData), MinKey, MaxKey, ObjectId, Regular Expression, JavaScript, and Decimal128.
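To make the binary layout concrete, here is a minimal sketch of a BSON encoder in Python. It covers only flat documents whose values are strings or 32-bit integers, and the function name encode_bson is hypothetical. Real applications would rely on the BSON library that ships with a MongoDB driver rather than hand-rolled code like this.

```python
import struct


def encode_bson(doc):
    """Encode a flat dict of str / int32 values as BSON (tiny subset only)."""
    body = b""
    for key, value in doc.items():
        # Element names are stored as null-terminated C strings.
        name = key.encode("utf-8") + b"\x00"
        if isinstance(value, str):
            # Type 0x02: int32 byte length (including the null), then the bytes.
            data = value.encode("utf-8") + b"\x00"
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # Type 0x10: little-endian 32-bit integer.
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")
    # A document is: int32 total size, the elements, and a trailing null byte.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


encoded = encode_bson({"hello": "world"})
print(len(encoded))  # 22 bytes
```

The explicit length prefixes are what let a parser skip over values without scanning them character by character, which is the main reason BSON is quick for machines to traverse.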

Explaining BSON Commands

mongorestore - loads data from a binary database dump, such as one created by mongodump (use it to restore, replicate, and migrate data)

bsondump collection.bson - converts the BSON file to JSON and prints it to standard output

bsondump --outFile=collection.json collection.bson - writes the converted JSON to collection.json instead

Schema

A schema is the blueprint that describes how data is organized in a database; in a relational database, that means tables made up of rows and columns. Schemas are a big part of how MongoDB works. To understand MongoDB's functionality, we must first understand where schemas and NoSQL fit into the grand scheme of MongoDB.

Types of Schemas

Conceptual Schemas - Give a big-picture view of the database's organization by describing entities, attributes, and relationships. These schemas do not depend on storage, hardware, or SQL, and they bridge the user's requirements with the logical schema.

Logical Database Schemas - Describe tables, columns, primary keys, foreign keys, relationships, and integrity rules which ensure proper data structuring. They are abstract and independent of physical storage.

Physical Schemas - Determine how the data is stored on disk. They are not very abstract and are designed mostly by the database administrator (who creates and organizes systems to store and secure data).

Star Schema - Has a fact table at its center and multiple dimension tables connecting to that fact table.

Snowflake Schema - A star schema whose dimension tables are further split into sub-dimension tables, so several levels of dimensions branch out from the single fact table.
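As a rough illustration of the star schema described above, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database using Python's standard library. The table names, column names, and sample rows are all made up for the example, and SQLite stands in here only because it is a convenient relational engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the "who/what/where" of each sale.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

    -- The fact table sits at the center and points at every dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'Keyboard'), (2, 'Mouse');
    INSERT INTO dim_store   VALUES (1, 'Austin'), (2, 'Denver');
    INSERT INTO fact_sales  VALUES (1, 1, 50.0), (2, 1, 20.0), (1, 2, 50.0);
""")


def sales_by_city(connection):
    """Join the fact table to one dimension and total the sales per city."""
    return connection.execute("""
        SELECT s.city, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.city
        ORDER BY s.city
    """).fetchall()


print(sales_by_city(conn))
```

A snowflake schema would take this one step further, for example by moving city into its own table that dim_store references.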

Other Model Examples:

Flat Model

Hierarchical Model

Network Model

Relational Model

Why Data Schemas?

With schemas, data is organized into separate entities, making it easier to share across databases. Access is granted through permissions, which gives administrators control over the database and, together with encryption, supports privacy and security. In addition, a schema acts as a single source of truth, which supports the data's organizational integrity and its adherence to ACID properties (mentioned later on).

NoSQL

MongoDB is classified as a NoSQL database product. NoSQL databases provide flexible schemas and work well with large amounts of data. They are often described as BASE compliant (Basically Available, Soft state, Eventually consistent), meaning they can tolerate partial failures and temporary inconsistencies.

NoSQL Types

Document databases - store information in documents similar to JSON objects.

Key-Value databases - where each unique key is used to store a single value. This data is often cached in memory, which provides high performance.

Wide-Column databases - where data is stored in tables, rows, and dynamic columns. They are flexible, and different rows can have different sets of columns.

Graph databases - where data is stored in nodes and edges. This works well for connected data where relationships need to be modeled.
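To show what a flexible schema looks like in a document database, here is a small sketch that uses plain Python dictionaries in place of stored documents. The collection contents and the find helper are hypothetical; a real document database would offer a much richer query language.

```python
# Two documents in the same collection with different sets of fields.
collection = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Grace", "languages": ["COBOL", "FORTRAN"]},
]


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in docs
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


print(find(collection, name="Grace"))
```

Neither document had to declare its fields up front, which is the flexibility that a fixed table of rows and columns does not give.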

Features

ACID transactions group database read and write operations so that they either all complete successfully or have no effect at all. ACID stands for atomicity, consistency, isolation, and durability. MongoDB distinguishes two kinds of transactions: single-document and multi-document. A single-document operation is atomic on its own, even when it changes several fields or embedded documents. Multi-document transactions span multiple documents and can reach across collections and databases. Atomic means that one failure results in total failure of the transaction, and consistent means that every transaction moves the data from one valid state to another. As for the consistency of duplicated data, when data is duplicated in a schema, there are different methods for keeping the duplicates consistent across collections.
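The all-or-nothing behavior of atomicity can be sketched with Python's built-in sqlite3 module. This uses SQLite rather than MongoDB, purely to show the general idea, and the accounts table, the transfer function, and the sample balances are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(connection, sender, receiver, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            # This update violates the CHECK constraint on an overdraft,
            # which also undoes the credit made just above.
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(connection.execute("SELECT name, balance FROM accounts"))
```

Calling transfer(conn, "alice", "bob", 500) fails and leaves both balances untouched, while a transfer of 30 succeeds and changes both rows together.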

Methods for Duplicates

When the same data is duplicated across collections, MongoDB offers several ways to keep the copies consistent. Here are a few:

Transactions - updates to multiple collections happen in a single atomic operation, so the application always reads up-to-date data; this can cost some performance during periods of heavy reads.

Embedded Data - embeds related data in a single collection; suited to applications that read and update the related data at the same time, and it prevents the need for lookups.

Atlas Database Triggers - when one collection is updated, a trigger automatically updates the second collection; outdated data can be seen if a query is run after the update and before the trigger finishes.
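The difference between duplicated and embedded data can be sketched with plain Python dictionaries standing in for documents. All names and sample values below are hypothetical, and the code only illustrates the bookkeeping involved, not how MongoDB performs updates.

```python
# Duplicated design: each order keeps its own copy of the customer's email.
customers = {1: {"email": "old@example.com"}}
orders = [
    {"order_id": 10, "customer_id": 1, "customer_email": "old@example.com"},
    {"order_id": 11, "customer_id": 1, "customer_email": "old@example.com"},
]


def update_email_duplicated(customer_id, new_email):
    """Update the source record and every copy; return how many copies changed."""
    customers[customer_id]["email"] = new_email
    touched = 0
    for order in orders:
        if order["customer_id"] == customer_id:
            order["customer_email"] = new_email
            touched += 1
    return touched


# Embedded design: the orders live inside the customer document, so the
# email is stored once and there is nothing to keep in sync.
customer_doc = {
    "_id": 1,
    "email": "old@example.com",
    "orders": [{"order_id": 10}, {"order_id": 11}],
}


def update_email_embedded(doc, new_email):
    """Change the single stored email on the customer document."""
    doc["email"] = new_email
    return doc
```

In the duplicated design, a failure partway through the loop would leave stale copies behind, which is the gap that transactions or triggers are meant to close.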

What MongoDB Utilizes

MongoDB uses geospatial search, lexical search, vector search, horizontal scaling, and geography-aware fault tolerance.

Geospatial Search - queries that find data based on where it is located. Results are filtered by geographical coordinates and distance. MongoDB provides geospatial indexes, which support queries on GeoJSON data and legacy coordinate pairs.
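Filtering by coordinates and distance can be sketched in a few lines of Python. The example below scans a list of GeoJSON-style points and uses the haversine formula for distance; the place names, coordinates, and function names are assumptions for illustration, and this linear scan is not how MongoDB's geospatial indexes work internally.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def find_near(places, center, max_km):
    """Return names of places within max_km of center, given as (lon, lat)."""
    lon, lat = center
    return [
        place["name"]
        for place in places
        if haversine_km(lon, lat, *place["location"]["coordinates"]) <= max_km
    ]


# GeoJSON points list longitude first, then latitude.
places = [
    {"name": "Empire State Building",
     "location": {"type": "Point", "coordinates": [-73.9857, 40.7484]}},
    {"name": "Central Park",
     "location": {"type": "Point", "coordinates": [-73.9665, 40.7812]}},
    {"name": "Los Angeles",
     "location": {"type": "Point", "coordinates": [-118.2437, 34.0522]}},
]

print(find_near(places, (-73.9857, 40.7484), 10))
```

An index lets the database avoid computing the distance to every document, which is what makes the same kind of query practical on large collections.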

Lexical Search - relies on matching the words in a query against the words in the data to produce a result.
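Word matching can be sketched as a simple set comparison. The tokenizer and the sample documents below are assumptions for the example; production search engines add steps such as stemming and relevance scoring that this sketch leaves out.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_search(docs, query):
    """Return the documents that contain every word in the query."""
    terms = tokenize(query)
    return [doc for doc in docs if terms <= tokenize(doc)]


docs = [
    "MongoDB stores BSON documents",
    "Redis is a key-value store",
]

print(lexical_search(docs, "bson documents"))
```

Because the match is on exact words, a query for "document" finds nothing here even though "documents" appears in the first entry.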

Vector Search - represents words or documents as numerical vectors that capture the relationships between data points, then finds the items whose vectors are closest to the query's vector.
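A minimal sketch of the idea uses cosine similarity to rank items by how closely their vectors point in the same direction. The three-dimensional vectors below are invented for illustration (real embeddings come from a model and have far more dimensions), and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(embeddings, query_vector, k=1):
    """Return the k names whose vectors are most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_similarity(embeddings[name], query_vector),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors: "cat" and "kitten" point in nearly the same direction.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

print(nearest(embeddings, [0.0, 0.0, 1.0]))  # ['car']
```

Unlike lexical search, nothing here depends on the spelling of the words, only on how close their vectors are.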

Horizontal Scaling - increasing performance by adding servers to distribute a workload.
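One common way to spread a workload across servers is to hash each record's key and use the result to pick a server. The sketch below shows that general technique in Python; the function names and keys are made up, and it is not a description of MongoDB's own sharding implementation.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard number for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(keys, shard_count):
    """Group keys into one list per shard."""
    shards = [[] for _ in range(shard_count)]
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


keys = [f"user-{n}" for n in range(12)]
for number, shard in enumerate(distribute(keys, 3)):
    print(number, shard)
```

The hash is stable, so the same key always lands on the same server, while adding servers gives each one a smaller share of the keys.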

Commands to install MongoDB dependencies

MacOS:
brew install mongodb-atlas
brew install --cask docker
Windows:
choco install mongodb-atlas
choco install docker-desktop

Local Deployment Vs Cloud Deployment

Local and cloud deployments can both be managed with the Atlas CLI, a command line interface for interacting with deployments. A local deployment runs on the developer's own machine (localhost), while a cloud deployment is reached over the internet. A local deployment is limited by the machine's hardware and may require hardware upgrades to grow, while a cloud deployment is much more scalable and accessible from anywhere. Cloud deployment is also often described as faster, and it typically offers higher-availability services for handling downtime.

Connecting to a Deployment

You can connect to a deployment using one of three methods.

  1. A connection string, if the deployment is hosted on Atlas or its connection string is already available
  2. Advanced connection settings, if the connection string needs to be customized and the connection options need to be visible
  3. The Atlas CLI, to connect using an existing Atlas CLI configuration. Best when the Atlas CLI is already installed and you need to connect to an existing deployment

Connection Strings:

SRV Connection Strings - use the mongodb+srv:// prefix; the hosts in the seed list are discovered automatically from a DNS SRV record instead of being listed in the string.

Standard Connection Strings - uses the mongodb:// prefix
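The two formats can be told apart and pulled apart with Python's standard URL parser. The sketch below is a rough illustration only: describe_connection_string is a hypothetical helper, the host names are placeholders, and a real driver performs the parsing (and the DNS lookup for SRV strings) itself.

```python
from urllib.parse import parse_qs, urlsplit


def describe_connection_string(uri):
    """Split a MongoDB connection string into its main parts."""
    parts = urlsplit(uri)
    if parts.scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError(f"not a MongoDB connection string: {parts.scheme!r}")
    # Drop any "user:password@" prefix, then split the comma-separated hosts.
    hosts = parts.netloc.rpartition("@")[2].split(",")
    return {
        "srv": parts.scheme == "mongodb+srv",
        "hosts": hosts,
        "database": parts.path.lstrip("/") or None,
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


standard = "mongodb://db0.example.com:27017,db1.example.com:27017/inventory?replicaSet=rs0"
srv = "mongodb+srv://cluster0.example.mongodb.net/?retryWrites=true"

print(describe_connection_string(standard))
print(describe_connection_string(srv))
```

The standard string has to list every host, while the SRV string names a single domain and leaves host discovery to DNS.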

Finding Atlas Connection String
atlas clusters connectionStrings describe <clusterName> --projectId <projectId>

https://discuss.cryosparc.com/t/what-is-mongo-db-migrating-cryosparc-to-new-server/15467/2
https://www.mongodb.com/docs/
https://www.mongodb.com/resources/languages/bson#what-is-bson
https://www.mongodb.com/resources/basics/databases/nosql-explained
https://www.handybackup.net/backup_terms/database-dump.shtml
https://www.mongodb.com/resources/basics/databases/acid-transactions
https://www.geeksforgeeks.org/elasticsearch/geospatial-search-and-location-based-queries/
https://www.mongodb.com/resources/basics/lexical-search
https://www.ibm.com/think/topics/vector-search
https://www.geeksforgeeks.org/system-design/system-design-horizontal-and-vertical-scaling/
https://www.ibm.com/think/topics/database-schema
https://www.geeksforgeeks.org/dbms/database-schemas/
https://www.bls.gov/ooh/computer-and-information-technology/database-administrators.htm
https://www.mongodb.com/docs/atlas/cli/current/index/
https://www.geeksforgeeks.org/cloud-computing/on-premises-vs-on-cloud/
https://www.mongodb.com/docs/manual/reference/connection-string/?deployment-type=atlas&interface-atlas-only=atlas-cli
