Also on my personal blog at nativealgorithms.com
Overview
Industry and academia share the need for a fundamental resource: data. The ability to collect, store, and retrieve data is vital to meaningful research and enterprise prosperity. The increasing importance of data has led to the development of many software applications aimed at managing it.
Programming advancements have driven an evolution from monolithic databases to modern non-relational, or NoSQL, databases. Although relational/SQL databases are the industry standard, the newer “not only SQL” databases are not bound by the scaling limits of a single server or by rigid tabular constructs. This provides the efficiency needed to handle the high-volume data behind today’s microservices.
NoSQL databases are non-relational databases that address the problem of ever-increasing data volume. They leverage the benefits of a highly scalable multi-server system and work well with microservices and the high-volume unstructured data of contemporary e-commerce. Their resilience lies in the ease with which they can be built upon or scaled out as data increases, requiring little downtime and few major updates. From a software engineer’s perspective, the ability to make on-the-fly changes has made NoSQL the go-to choice for services driven by high-volume data.
It is important to note that relational/SQL databases are still the most common type of database, mainly because of their minimal end-user coding requirements and their consistency and low latency when data is retrieved. Many enterprises today have adopted a combination of SQL and NoSQL. This combination, along with the microservices it supports, provides a robust framework on which enterprises rely. For example, Craigslist uses MySQL, a relational/SQL database, for its live or current listings, while using MongoDB, a NoSQL database, for its high-scale archived data. Using NoSQL for archived data ensures ease of use, agility, and resilience in the system as data increases.
What is SQL?
SQL is the language used to interact with relational databases. Relational databases organize data in a row-column-table design. Tables can be related through common data in a single query, most often requiring only a few lines of code that non-technical individuals can learn. This is because the database management system does the work: it compiles the query and works out which data points to return. Being able to manage the database without a substantial amount of code is a plus. For example, a single query of a few lines can bring up a large number of records. One drawback is that non-distributed SQL databases rely on vertical scaling.
Moreover, the vertical scaling of relational databases is limited by server size. Increased data volume is accommodated by moving to a bigger server, and relying on one large central machine often creates a single point of failure. Unfortunately, horizontal scaling, which has no such hard ceiling, is often supported for SQL databases only by less mature technologies or in an ad hoc way. This makes horizontal scaling very challenging for relational databases.
When a single large server is employed, relational databases are resistant to change, which often leads to downtime. Relational databases also require careful up-front design to ensure good performance, which makes them costlier to start up than NoSQL databases.
SQL is a long-established language standardized by ANSI and ISO. This is in contrast to NoSQL, which has no real standard. Overall, although client usability is less complicated, the back-end interfacing of a SQL database is more complex than that of NoSQL. Furthermore, even though SQL is a standard, database vendors have added their own proprietary extensions or features that usually work only on their systems. This often results in vendor lock-in. However, the standard SQL commands such as SELECT, CREATE, DELETE, UPDATE, INSERT, ALTER, and DROP can be used to accomplish almost everything with any SQL interface.
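To make the standard commands concrete, here is a minimal sketch that runs each of them once. It assumes Python's built-in sqlite3 module as the SQL interface, and the listings table and its columns are made up for illustration; other database systems accept the same basic statements, though their extensions differ.

```python
import sqlite3

def run_standard_commands():
    """Run CREATE, INSERT, UPDATE, DELETE, ALTER, SELECT, and DROP once each."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()

    # CREATE: define a table with a fixed set of columns (hypothetical schema).
    cur.execute(
        "CREATE TABLE listings (id INTEGER PRIMARY KEY, title TEXT, price REAL)"
    )

    # INSERT: add rows; the ? placeholders keep values separate from the SQL text.
    cur.executemany(
        "INSERT INTO listings (title, price) VALUES (?, ?)",
        [("bicycle", 120.0), ("desk", 45.0), ("lamp", 10.0)],
    )

    # UPDATE: change an existing row.
    cur.execute("UPDATE listings SET price = ? WHERE title = ?", (100.0, "bicycle"))

    # DELETE: remove a row.
    cur.execute("DELETE FROM listings WHERE title = ?", ("lamp",))

    # ALTER: change the table definition after the fact.
    cur.execute("ALTER TABLE listings ADD COLUMN city TEXT")

    # SELECT: read back what is left (city is NULL for the existing rows).
    rows = cur.execute(
        "SELECT title, price, city FROM listings ORDER BY id"
    ).fetchall()

    # DROP: remove the table, then confirm it is gone from the catalog.
    cur.execute("DROP TABLE listings")
    remaining = cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'listings'"
    ).fetchall()

    conn.close()
    return rows, remaining

rows, remaining = run_standard_commands()
print(rows)
print(remaining)
```

The ALTER step is the one to watch in practice: adding a column here is instant because the table is tiny, whereas the same change on a large production table is the kind of schema change that calls for up-front planning.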
Example SQL: SELECT Statement Query and Tables
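As a rough illustration of a SELECT query over related tables, the sketch below builds two small tables and joins them on a shared column. It assumes Python's built-in sqlite3 module, and the customers and orders tables, their columns, and the sample rows are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Two related tables: each order points at a customer through customer_id.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        item TEXT,
        amount REAL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (customer_id, item, amount) VALUES
        (1, 'keyboard', 50.0),
        (1, 'monitor', 200.0),
        (2, 'mouse', 25.0);
""")

# One query relates the tables on their common data and summarizes the result.
query = """
    SELECT c.name, COUNT(o.id) AS order_count, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
"""
results = conn.execute(query).fetchall()
conn.close()

for name, order_count, total in results:
    print(name, order_count, total)
```

The query states what is wanted (each customer with an order count and a total) and leaves it to the database engine to decide how to match the rows, which is why a few lines are enough.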
Resources:
Indiana University, SQL examples: https://kb.iu.edu/d/ahux#examples
More resources:
Illinois Institute of Technology, free mini course and quiz on SQL: http://www.cs.iit.edu/~cs561/cs425/VenkatashSQLIntro/default.htm
Examples of Notable SQL Databases (Relational Databases)
PostgreSQL
Db2
MySQL
Oracle
Microsoft SQL Server
Amazon Aurora
IBM Informix
NuoDB
CockroachDB
YugabyteDB
SQL Pros and Cons
Pros
On a single server, components communicate through inter-process communication rather than network calls between machines, which lends itself to speed and consistency. Relational databases employ normalization, which prevents duplication of data; because of hardware limitations, they were designed to optimize storage. They use a common and well-understood standard language, SQL. Queries require minimal coding but can retrieve large amounts of data, which makes them well suited to non-technical users.
Cons
Horizontal scaling is challenging or not supported at all. Single-server hardware means vertical scaling and a single point of failure. Data growth is therefore limited by server size, requiring a bigger server to accommodate more data, which imposes an upper limit on scalability. Furthermore, SQL has rigid data models that adhere strictly to the row-column-table relational schema, which is not optimal for web pages or document-type data. Being restricted to more structured data leads businesses to seek alternatives. Additionally, SQL vendors add proprietary extensions that create vendor lock-in, so switching is difficult. Moreover, setup is complex and requires a carefully crafted design.
What is NoSQL?
NoSQL arose from two perceived problems in the existing system: the lack of horizontal scalability and the rigidity of the table design in relational databases. NoSQL databases are non-relational databases that do not use SQL as their primary interface and do not conform to the rigid row-column-table schema. Many have also abandoned or limited patterns such as transactions and JOINs. Thus, they are well suited to unstructured data such as a collection of web pages or documents. Moreover, as data increases, NoSQL leverages horizontal scaling, which distributes data across several servers rather than relying on a single bigger machine as in vertical scaling. The resilience of NoSQL lies in its ability to divert requests to another server if one server fails, limiting downtime and single points of failure.
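The idea of spreading data across servers and surviving the loss of one can be sketched with a toy model. Everything here is an assumption for illustration: the ToyCluster class, the server names, and the choice of a simple hash-modulo placement with one replica are mine, and the "servers" are just dictionaries in memory rather than real machines.

```python
import hashlib

class ToyCluster:
    """Spread keys across several in-memory 'servers', one replica per key."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.storage = {name: {} for name in self.node_names}
        self.down = set()  # names of servers currently marked as failed

    def owners(self, key):
        # A stable hash picks the primary; the next server in line holds the replica.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        index = int(digest, 16) % len(self.node_names)
        primary = self.node_names[index]
        replica = self.node_names[(index + 1) % len(self.node_names)]
        return primary, replica

    def put(self, key, value):
        # Write to every owner that is currently reachable.
        for node in self.owners(key):
            if node not in self.down:
                self.storage[node][key] = value

    def get(self, key):
        # Try the primary first, then fall back to the replica.
        for node in self.owners(key):
            if node not in self.down and key in self.storage[node]:
                return self.storage[node][key]
        raise KeyError(key)

    def fail(self, node):
        self.down.add(node)

cluster = ToyCluster(["server-a", "server-b", "server-c"])
for n in range(6):
    cluster.put(f"page:{n}", f"<html>page {n}</html>")

primary, replica = cluster.owners("page:0")
cluster.fail(primary)               # simulate losing one server
recovered = cluster.get("page:0")   # still served, now from the replica
print(recovered)
```

One caveat on the design: with plain hash-modulo placement, adding a server changes where most keys live. Production systems commonly use schemes such as consistent hashing to limit that reshuffling, which this sketch does not attempt.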
Some Notable Examples of NoSQL Databases (Non-Relational Databases)
ArangoDB
MongoDB
DynamoDB
VoltDB
Redis
Apache CouchDB
Elasticsearch
Cassandra
NoSQL Pros and Cons
Pros
NoSQL is scalable and highly available. It employs several servers, often at different locations, which facilitates horizontal scaling without significant single points of failure. Multiple servers allow a shift to working servers if one fails, which increases resilience and agility. The linear addition of servers allows for practically limitless expansion. NoSQL is schema-less and not limited to structured data, so a collection of web pages or documents is supported. These databases have low overhead, are easy to start with, and are cost-efficient. Overall, their flexible data models can change on the fly.
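To show what schema-less means in practice, here is a small sketch of a document collection in which every record can carry different fields. ToyDocumentStore and the sample documents are hypothetical stand-ins written for this post, not the API of MongoDB or any other real NoSQL product.

```python
import json

class ToyDocumentStore:
    """A dict-backed stand-in for a document collection (not a real NoSQL client)."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, document):
        doc_id = self._next_id
        self._next_id += 1
        # Round-trip through JSON to mimic storing a serialized copy of the document.
        self._docs[doc_id] = json.loads(json.dumps(document))
        return doc_id

    def find(self, **criteria):
        # Return documents that contain every requested field with a matching value.
        return [
            doc
            for doc in self._docs.values()
            if all(
                field in doc and doc[field] == value
                for field, value in criteria.items()
            )
        ]

store = ToyDocumentStore()
# No table definition is needed: each document brings its own shape.
store.insert({"type": "page", "url": "/home", "title": "Home"})
store.insert({"type": "page", "url": "/blog", "tags": ["sql", "nosql"]})
store.insert({"type": "receipt", "total": 19.99, "items": [{"sku": "A1", "qty": 2}]})

pages = store.find(type="page")
receipts = store.find(type="receipt")
print(len(pages), len(receipts))
```

The second page has a tags field the first one lacks, and the receipt nests a list of items inside the record. In a relational design each of those differences would mean a new column or a new table.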
Cons
Multiple servers require load-balancing protocols. Slow network I/O calls between servers cause latency. Multiple servers also require sending the data to multiple places, which can lead to inconsistency (for example, uploading a new picture on social media, then refreshing the page and finding the old picture is still there).
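The stale-picture scenario can be reproduced with a toy model of delayed replication. This is a sketch under simplifying assumptions: ToyReplicatedStore is a made-up class with one primary and one replica, and the sync() call stands in for replication that a real system would perform in the background.

```python
class ToyReplicatedStore:
    """One primary and one replica; writes reach the replica only on sync()."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes accepted by the primary, not yet replicated

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read_from_replica(self, key):
        # A load balancer may route a read here before replication catches up.
        return self.replica.get(key)

    def sync(self):
        # Apply queued writes in order, as background replication eventually would.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()

store = ToyReplicatedStore()
store.write("avatar", "old.png")
store.sync()                              # both copies now agree on old.png

store.write("avatar", "new.png")          # the user uploads a new picture
stale = store.read_from_replica("avatar")  # page refresh hits the replica too early
store.sync()
fresh = store.read_from_replica("avatar")  # after replication, the new picture appears
print(stale, fresh)
```

The write itself succeeded; the replica simply had not heard about it yet. That window between the write and the sync is the inconsistency described above.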
💡 Interview Question 💡
A combined and thoughtfully designed SQL and NoSQL database system can provide the agility, stability, consistency, and reliability to support e-commerce. Can you articulate why this is a solution for many enterprises?