DEV Community

shiva yada
shiva yada

Posted on

SQL Concepts

ACID Properties

Accessing and modification of the contents of a database using transaction which contain a single logical unit of work with accessing data using read and write operations.

In order to maintain consistency in a database, before and after the transaction, certain properties are followed. These are called ACID properties.

Atomicity:

Transactions do not occur partially. Each transaction is considered as one unit and either runs to completion or is not executed at all. Atomicity is also known as the ‘All or nothing rule’.

Two operations:

  • Abort : If transaction aborts, changes made to the database are not visible.
  • Commit: if a transaction commits, changes made are visible.

Consistency:

The database must be consistent before and after the transaction. It refers to correctness of database.

Isolation:

Multiple transactions occur independently without interference.Changes occurring in a particular transaction will not be visible to any other transaction until that particular change in that transaction is written to memory or has been committed.

Durability:

The Changes of a successful transaction occurs even if the system failure occurs. These updates now become permanent and are stored in non-volatile memory. The effects of the transaction, thus, are never lost.

The ACID properties, in totality, provide a mechanism to ensure the correctness and consistency of a database in a way such that each transaction is a group of operations that acts as a single unit, produces consistent results, acts in isolation from other operations, and updates that it makes are durably stored.

CAP Theorem

Cap theorem used to explain some of the completing requirements in a distributed system with replication.

C- Consistency
A- Availability
P- Partition tolerance

The CAP theorem states that it is not possible to guarantee all three of the desirable properties – consistency, availability, and partition tolerance at the same time in a distributed system with data replication.

  • Consistency:

Consistency means that the nodes will have the same copies of a
replicated data item visible for various transactions. Every node in a distributed cluster returns same(most recent and a successful write).

  • Availability :

Read and write request for a data item will either successful or failure. For every successful node returns a response for all the read and write requests in a reasonable amount of time.

  • Partition Tolerance:

The system can continue operating even if the network connecting the nodes has a fault that results in two or more partitions, where the nodes in each partition can only communicate among each other. Distributed systems guaranteeing partition tolerance can gracefully recover from partitions once the partition heals.

JOINS

SQL Join statement is used to combine data or rows from two or more tables based on a common field between them.

  • Inner join

The INNER JOIN keyword selects all rows from both the tables when where condition satisfy. JOIN is same as INNER JOIN.

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1 
INNER JOIN table2
ON table1.matching_column = table2.matching_column;
table1: First table.
table2: Second table
matching_column: Column common to both the tables.
Enter fullscreen mode Exit fullscreen mode
  • Left join

This join returns all the rows of the table on the left side of the join and matches rows for the table on the right side of the join. LEFT JOIN is also known as LEFT OUTER JOIN.

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1 
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
Enter fullscreen mode Exit fullscreen mode
  • Right join

Similar to left join. It returns all the rows of the table on the right side of the join and matching rows for the table on the left side of the join. RIGHT JOIN is also known as RIGHT OUTER JOIN.

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1 
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;
Enter fullscreen mode Exit fullscreen mode
  • Full join

FULL JOIN creates the result-set by combining results of both LEFT JOIN and RIGHT JOIN. The result-set will contain all the rows from both tables.

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1 
FULL JOIN table2
ON table1.matching_column = table2.matching_column;
Enter fullscreen mode Exit fullscreen mode

Aggregations:

An aggregate function in SQL returns one value after calculating multiple values of a column. We often use aggregate functions with the GROUP BY and HAVING clauses of the SELECT statement.
Various types of SQL aggregate functions are:

  • Count() The COUNT() function returns the number of rows in a database table.

Syntax:

COUNT(*)

id name quantity unit_price
1 foam 70 1.21
2 pork 49 4.65
3 Lettuce 38 3.35
4 brocolinni 90 4.53
5 sauce 94 1.63
6 petit 14 2.39
7 sweet 98 3.29
8 island 26 0.74
SELECT COUNT(id) from PRODUCTS
Enter fullscreen mode Exit fullscreen mode

Result:

count(id)
8
  • SUM() Function Sum() returns the total sum of a numeric column.
SELECT SUM(unit_price) from products;
Enter fullscreen mode Exit fullscreen mode

Result:

sum(unit_price)
21.72
  • AVG()

The AVG() function calculates the average of a set of values.

SELECT AVG(quantity) from products;
Enter fullscreen mode Exit fullscreen mode

Result:

AVG(quantity)
59.875
  • MIN() MIN() return the lowest value in set of non null values.
SELECT MIN(quantity) from products;
Enter fullscreen mode Exit fullscreen mode
MIN(quantity)
14
  • MAX()

MAX() returns the highest value in set of non null values.

SELECT MAX(quantity) from products;
Enter fullscreen mode Exit fullscreen mode
MAX(quantity)
98

Filter Clause

The filter clause is another tool for filtering in SQL. The filter clause is used to, as the name suggests, filter the input data to an aggregation function. They differ from the WHERE clause because they are more flexible than the WHERE clause. Only one WHERE clause can be used at a time in a query. While on the other hand, multiple FILTER clauses can be present in a single query.

Table:

ID NAME EMPLOYED
1 Drake Maverick 1
2 Nancy Peloski 0
3 Anakin Skywalker 0
4 Uchiha Obito 1
5 Anakin Skywalker 1

QUERY:

SELECT COUNT(ID)
FILTER(WHERE ID!=2),
AVG(LENGTH(NAME))
FILTER(WHERE LENGTH(NAME)>4)
FROM TEST;
Enter fullscreen mode Exit fullscreen mode

Result:

COUNT AVG
5 13.8333333333333333

Normalization:

  • The process of organizing the data in the database. It is used to minimize the redundancy from a relation or set of relations.
  • Normalization divides the larger table into smaller and links them using relationships.

Data modification anomalies can be categorized into three types:

  • Insertion Anomaly, deletion Anomaly, updatation anomaly.

TYPES OF NORMAL FORMS

1NF:

A relation is in 1NF if it contains an atomic value.
The First normal form disallows the multi-valued attribute, composite attribute, and their combinations.

2NF:

A relation will be in 2NF if it is in 1NF and all non key attributes are fully functional dependent on the primary key.

3NF:

A relation will be in 3NF if it is in 2NF and no transition dependency exists.
The third Normal Form ensures the reduction of data duplication. It is also used to achieve data integrity.

BCNF:

A stronger definition of 3NF is known as Boyce Codd's normal form.
The first condition for the table to be in Boyce Codd Normal Form is that the table should be in the third normal form. Secondly, every Right-Hand Side (RHS) attribute of the functional dependencies should depend on the super key of that particular table.

Indexes

An index is a schema object. It is used by the server to speed up the retrieval of rows by using a pointer. Reduces disk I/O by using rapid
path access method to locate data quickly.

  • It speed up select queries and where clauses, but it slows down data input, with the update and the insert statements.
  • It can be created or dropped with no effect on the data.

Creating an Index:

CREATE INDEX index
 ON TABLE column;
Enter fullscreen mode Exit fullscreen mode

Multi Indexes

 CREATE INDEX index
 ON TABLE (column1, column2,.....);
Enter fullscreen mode Exit fullscreen mode

Unique Indexes:

Unique indexes are used for the maintenance of the integrity of the data present in the table as well as for the fast performance, it does not allow multiple values to enter into the table.

CREATE UNIQUE INDEX index
 ON TABLE column;
Enter fullscreen mode Exit fullscreen mode

Removing an Index:

To remove an index from the data dictionary by using the DROP INDEX command.

DROP INDEX index;
Enter fullscreen mode Exit fullscreen mode

Altering an Index:

To modify an existing table’s index by rebuilding, or reorganizing the index.

ALTER INDEX IndexName ON TableName REBUILD;
Enter fullscreen mode Exit fullscreen mode

Transactions:

A transaction consist of a sequence of query and/or update statements and is a “unit” of work.

  • The sql standard specifies that a transaction begins implicitly when an SQL statement is executed.
  • COMMIT work The updates performed by the transaction become permanent in the database.

  • ROLLBACK work: All the updates performed by the sql statements in the transaction are undone.

  • Atomic transaction

either successful execution or rolled back as if it never occurred.

Locking Mechanism

In dbms with multiple users, there is a trade-off between concurrency and consistency. Users need to be able to access data any time, but changes to the database must be done in an orderly sequence that maintains the underlying structure of the data.

  • The locking system is to manage access to resources shared by user databases, tables, pages and rows to guarantee the consistency of the shared data.

  • Concurrency control in database management systems(DBMS) ensures that database transactions are performed concurrently without violating the integrity of the database.

Database Isolation Levels

Isolation determines how transaction integrity is visible to other users and systems. It means that a transaction should take place in a system in such a way that it is the only transaction that is accessing the resources in a database system.

  • Dirty Read: A dirty read is a situation when a transaction reads data that has not yet been committed.

  • Non Repeatable read: Non Repeatable read occurs when a transaction reads the same row twice and gets a different value each time.

  • Phantom Read : Phantom read occurs when two same queries are executed,but the rows retrieved by the two, are different.

Isolation levels:

  1. Read Uncommitted: lowest isolation level.
    In this level, one transaction may read not yet committed changes made by other transactions, thereby allowing dirty reads.

  2. Read Committed –in this level it guarantees that any data read is committed at the moment it is read. Transaction holds a read or write lock on the current row,then it prevents from other transactions from reading, updating, or deleting it.

  3. Repeatable read: The transaction holds read locks on all rows it references and writes locks on referenced rows for updates and delete actions. Since other transactions cannot read, update or delete these rows, consequently it avoids non-repeatable read.

  4. Serializable : Highest isolation level. In this execution of operations in which concurrently executing transactions appears to be serially executing.

Triggers :

  • A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database.

  • To design a trigger mechanism, we must:

    • Specify the conditions under which the trigger is to be executed.
    • Specify the actions to be taken when the trigger executes
create trigger [trigger_name] 
[before | after]  
{insert | update | delete}  
on [table_name]  
[for each row]  
[trigger_body] 
Enter fullscreen mode Exit fullscreen mode

Explanation of syntax:

  • create trigger [trigger_name]: Creates or replaces an existing trigger with the trigger_name.
  • [before | after]: This specifies when the trigger will be executed.
  • {insert | update | delete}: This specifies the DML operation.
  • on [table_name]: This specifies the name of the table associated with the trigger.
  • [for each row]: This specifies a row-level trigger, i.e., the trigger will be executed for each row being affected.
  • [trigger_body]: This provides the operation to be performed as trigger is fired.
CREATE OR REPLACE FUNCTION calculate_marks()
RETURNS TRIGGER AS $$
BEGIN
    NEW.total := NEW.subj1 + NEW.subj2 + NEW.subj3;
    NEW.per := NEW.total * 60 / 100;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Enter fullscreen mode Exit fullscreen mode
CREATE TRIGGER stud_marks
BEFORE INSERT ON Student
FOR EACH ROW
EXECUTE FUNCTION calculate_marks();
Enter fullscreen mode Exit fullscreen mode

Insert query:

insert into Student values(0, "ABCDE", 20, 20, 20, 0, 0); 
Enter fullscreen mode Exit fullscreen mode
select * from Student; 
Enter fullscreen mode Exit fullscreen mode
tid name subj1 subj2 subj3 total per
100 ABCDE 20 20 20 60 36

Resources:

Top comments (0)