A comprehensive reference of database terms from A to Z written for beginners and anyone learning how data is stored, managed, and retrieved. Each term includes a plain-English definition and a real world example.
Quick Navigation
A · B · C · D · E · F · G · H · I · J · K · L · M · N · O · P · Q · R · S · T · U · V · W · X · Y · Z
A
| Term | Definition | Example |
|---|---|---|
| ACID Properties | A set of four guarantees — Atomicity, Consistency, Isolation, Durability — that ensure database transactions are processed reliably even when something goes wrong | When you transfer money between bank accounts, ACID ensures the amount is deducted from one account and added to the other — never half-done |
| Aggregate Function | A function that performs a calculation on a group of rows and returns a single result | COUNT(), SUM(), AVG(), MIN(), and MAX() — e.g. SUM(sales) to get total revenue across all orders |
| ALTER TABLE | An SQL command used to add, remove, or modify columns and constraints in an existing table | Adding a "phone_number" column to a "customers" table that was created without it |
| Anomaly (Data Anomaly) | An error or inconsistency in a database caused by poor design — specifically by data redundancy | Storing a customer's city in multiple rows, then updating only some of them so different rows show different cities for the same customer |
| Attribute | A characteristic or property of an entity in a database — equivalent to a column in a table | In a "Student" table, "name", "age", and "enrollment_date" are attributes |
| Auto-Increment | A feature that automatically generates a unique, incrementing number each time a new row is inserted | A new customer gets ID 1001, the next gets 1002, and so on — without the user specifying the number |
B
| Term | Definition | Example |
|---|---|---|
| Backup | A copy of the database saved at a point in time so it can be restored if data is lost or corrupted | Scheduling an automatic nightly backup of a hospital's patient records database |
| Base Table | A real, physically stored table in a database, as opposed to a view which is virtual | The "orders" table where actual order data lives — as distinct from a "recent_orders" view built on top of it |
| Batch Processing | Processing a large volume of data records together at scheduled intervals rather than one at a time in real time | Running payroll calculations for all 5,000 employees every Friday night in a single batch job |
| BCNF (Boyce-Codd Normal Form) | A stricter version of Third Normal Form (3NF) that eliminates certain remaining redundancies in table design | Resolving a table where a non-key column can determine part of the primary key — a subtle design flaw BCNF catches |
| Binary Large Object (BLOB) | A data type used to store large binary files — such as images, audio, or documents — directly in the database | Storing a user's profile photo as a BLOB in the database instead of just saving a file path |
| Buffer Pool | A region of memory where the database caches recently accessed data pages to reduce slow disk reads | The database serving the same popular product page from memory instead of hitting the disk every time |
C
| Term | Definition | Example |
|---|---|---|
| Candidate Key | Any column or combination of columns that could uniquely identify each row in a table — before one is chosen as the primary key | A "students" table where both "student_id" and "national_id" uniquely identify each student — both are candidate keys |
| Cardinality | The number of unique values in a column, or the nature of the relationship between two tables (one-to-one, one-to-many, many-to-many) | A "gender" column with 3 values has low cardinality; a "customer_id" column with millions of unique values has high cardinality |
| CASCADE | A rule applied to a foreign key that automatically propagates changes — when a parent row is deleted or updated, child rows follow | Deleting a customer automatically deletes all their orders, because the foreign key is set to CASCADE |
| Check Constraint | A rule enforced at the database level that limits what values can be entered in a column | Ensuring the "age" column never accepts values below 0 or above 150 |
| Clustered Index | An index that physically sorts and stores table rows in the order of the index key — only one can exist per table | A "customers" table sorted on disk by customer_id, making lookups by ID very fast |
| Column | A vertical field in a table that holds one specific type of data across all rows | The "email" column in a "users" table stores an email address for every user |
| Composite Key | A primary key made up of two or more columns combined, because no single column is unique enough on its own | In an "order_items" table, the combination of "order_id" and "product_id" together uniquely identifies each line item |
| Concurrency | Multiple users or processes accessing and modifying the database at the same time | Hundreds of shoppers adding items to their carts simultaneously on an e-commerce site |
| Constraint | A rule applied to a column or table that restricts what data can be entered, enforcing data integrity | NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY are common constraints |
| CRUD | The four basic operations of any database — Create, Read, Update, Delete | Adding a new user (Create), viewing their profile (Read), changing their email (Update), and removing their account (Delete) |
| Cursor | A database object that allows you to retrieve query results one row at a time, rather than all at once | Iterating through a list of 10,000 customers one by one to process each individually inside a stored procedure |
D
| Term | Definition | Example |
|---|---|---|
| Data Definition Language (DDL) | SQL commands that define or modify the structure of a database — its tables, columns, and constraints | CREATE TABLE, ALTER TABLE, DROP TABLE — used by developers to build or change the database schema |
| Data Integrity | The accuracy, consistency, and reliability of data throughout its lifecycle in the database | Ensuring a "date_of_birth" field never contains a future date, and a "country_code" field only accepts valid country codes |
| Data Lake | A large storage repository that holds raw, unstructured, or semi-structured data in its native format until it is needed | A company storing all clickstream logs, emails, and sensor data in Amazon S3 before deciding how to analyse them |
| Data Manipulation Language (DML) | SQL commands that retrieve or change the actual data within tables | SELECT, INSERT, UPDATE, DELETE — used every time you query or modify records |
| Data Mart | A smaller, subject-specific version of a data warehouse focused on one department or business function | A sales data mart containing only revenue, pipeline, and quota data for the sales team |
| Data Migration | The process of moving data from one system, format, or location to another | Moving customer records from an old Oracle database to a new PostgreSQL system |
| Data Redundancy | Storing the same piece of data in multiple places, which wastes storage and risks inconsistency | Storing a customer's city name in both the "customers" table and every row of the "orders" table |
| Data Type | The classification that defines what kind of value a column can hold | INT for whole numbers, VARCHAR for text, DATE for dates, BOOLEAN for true/false values |
| Data Warehouse | A large, centralised database designed specifically for reporting and business intelligence rather than day-to-day operations | Amazon Redshift or Snowflake storing years of sales history so analysts can run trend reports |
| Database | An organised collection of structured data stored electronically, managed by a Database Management System (DBMS) | A library's catalogue system storing every book's title, author, availability, and borrower history |
| Database Administrator (DBA) | The person responsible for installing, configuring, securing, monitoring, and maintaining a database system | A DBA who schedules backups, tunes slow queries, and manages user access permissions |
| Database Management System (DBMS) | The software layer that sits between users and the physical database, handling storage, retrieval, and security | MySQL, PostgreSQL, Oracle, Microsoft SQL Server — the engine that makes databases work |
| Deadlock | A situation where two transactions are each waiting for the other to release a resource, causing both to freeze indefinitely | Transaction A holds a lock on Table 1 and wants Table 2; Transaction B holds Table 2 and wants Table 1 — neither can proceed |
| Denormalisation | The deliberate introduction of redundancy into a database to improve read performance at the cost of write complexity | Storing a pre-calculated "total_order_value" in the orders table to avoid joining three tables every time a report runs |
| Dirty Read | Reading data that another transaction has changed but not yet committed — which may be rolled back | Reading a bank balance that is being updated mid-transfer and seeing an amount that will never actually be the final value |
| DROP | An SQL command that permanently deletes a table, database, or other object and all the data inside it | DROP TABLE customers — immediately and irreversibly removes the entire customers table |
| Dump | A complete export of a database's structure and data into a file, typically used for backup or migration | Running pg_dump in PostgreSQL to export the entire database to a .sql file |
E
| Term | Definition | Example |
|---|---|---|
| Entity | A real-world object or concept that a database table represents | "Customer", "Product", "Order", and "Invoice" are entities in an e-commerce database |
| Entity-Relationship Diagram (ERD) | A visual diagram that shows the tables in a database, their columns, and how they relate to each other | A diagram showing that one "customer" can have many "orders", and each "order" contains many "products" |
| Execution Plan | The step-by-step strategy a database engine uses to run a query — reveals how it accesses data and whether indexes are being used | Running EXPLAIN on a slow query to discover the database is scanning 2 million rows when an index could narrow it to 10 |
F
| Term | Definition | Example |
|---|---|---|
| Field | Another word for a column — the smallest unit of data storage in a relational table | The "email" field in the "users" table |
| File-Based Database | An early form of data storage where data is kept in flat files without a DBMS managing relationships or integrity | Old systems that stored all customer records in a single .csv or .txt file |
| First Normal Form (1NF) | The first rule of database normalisation — every column must hold atomic (indivisible) values and every row must be unique | Splitting "John, Jane" in a single "names" cell into two separate rows — one per person |
| Foreign Key | A column in one table that references the primary key of another table, creating a link between the two | An "orders" table with a "customer_id" column that refers to the "id" column in the "customers" table |
| Full Table Scan | A query operation where the database reads every single row in a table to find matching records — slow on large tables | A query with no WHERE clause, or a WHERE clause on a column with no index, triggers a full table scan |
| Function (SQL) | A reusable block of SQL code that accepts inputs, performs an operation, and returns a value | UPPER('hello') returns 'HELLO' — a built-in scalar function |
G
| Term | Definition | Example |
|---|---|---|
| Graph Database | A database that stores data as nodes and edges — designed for highly connected data with complex relationships | Neo4j modelling a social network where people are nodes and friendships are edges |
| GROUP BY | An SQL clause that groups rows sharing a common value so aggregate functions can be applied to each group | GROUP BY country with SUM(revenue) to get total revenue per country |
| GRANT | An SQL command that gives a user or role permission to perform specific operations on database objects | GRANT SELECT ON customers TO analyst — allowing a user to read but not modify the table |
H
| Term | Definition | Example |
|---|---|---|
| HAVING | An SQL clause that filters grouped results after GROUP BY — like WHERE but applied after aggregation | HAVING SUM(revenue) > 100000 — showing only countries with total revenue above ₹1 lakh |
| Heap | The unsorted physical storage of table rows on disk, with no particular order — the default when no clustered index exists | A table where rows are stored in the order they were inserted, with no logical sort |
| Horizontal Scaling (Sharding) | Distributing a database across multiple servers by splitting the data — each server holds a subset of the rows | Users with IDs 1–1,000,000 on Server A; users 1,000,001–2,000,000 on Server B |
I
| Term | Definition | Example |
|---|---|---|
| Index | A data structure that speeds up data retrieval by creating a quick lookup path — like the index at the back of a book | An index on the "email" column of a "users" table lets the database find a user by email instantly instead of scanning all rows |
| Inner Join | An SQL join that returns only rows where there is a matching value in both tables | Joining "orders" and "customers" with INNER JOIN returns only orders that have a matching customer — orphan orders are excluded |
| INSERT | The SQL command used to add new rows of data into a table | INSERT INTO students (name, age) VALUES ('Priya', 21) |
| Instance | A running copy of a database management system on a server | Two departments running separate MySQL instances on the same machine, each managing their own databases |
| Isolation Level | A database setting that controls how much one transaction can see of another's uncommitted changes | READ COMMITTED prevents dirty reads; SERIALIZABLE prevents all concurrency anomalies but is slower |
J
| Term | Definition | Example |
|---|---|---|
| Join | An SQL operation that combines rows from two or more tables based on a related column | Combining "orders" and "products" to show what items are in each order — linking them via "product_id" |
| JSON (in Databases) | A flexible data format used to store semi-structured data inside a column — supported natively by many modern databases | Storing a product's variable attributes (colour options, sizes, specs) as a JSON column instead of creating dozens of columns |
K
| Term | Definition | Example |
|---|---|---|
| Key | A column or set of columns used to identify rows uniquely or to create relationships between tables | Primary key, foreign key, candidate key, and composite key are the main types |
| Key-Value Store | A simple NoSQL database that stores data as pairs of keys and their associated values — extremely fast for lookups | Redis storing session tokens where the key is the session ID and the value is the user's data |
L
| Term | Definition | Example |
|---|---|---|
| Latency | The time delay between sending a query to the database and receiving the result | A query taking 3 seconds to return results has high latency — a sign of a missing index or poor schema design |
| Left Join (Left Outer Join) | An SQL join that returns all rows from the left table, and matching rows from the right table — unmatched rows show NULL | Joining "customers" and "orders" with LEFT JOIN returns all customers, including those who have never placed an order |
| Lock | A mechanism that prevents two transactions from modifying the same data simultaneously, avoiding corruption | Transaction A locks a row while updating a bank balance — Transaction B must wait until the lock is released |
| Log (Transaction Log) | A file that records every change made to the database, used for recovery if the system crashes | If a server crashes mid-transaction, the database replays the transaction log to restore a consistent state |
M
| Term | Definition | Example |
|---|---|---|
| Many-to-Many Relationship | A relationship where multiple rows in Table A relate to multiple rows in Table B — requires a junction table | Students can enrol in many courses; courses can have many students — linked via an "enrolments" junction table |
| Materialized View | A saved query result stored as a physical table that can be refreshed periodically — faster than running the query each time | A daily sales summary view that is pre-calculated at midnight so reports load instantly all day |
| Migration | A versioned, scripted change to a database schema — adding a column, renaming a table, or changing a data type | Deploying version 2.0 of an application that adds a "loyalty_points" column to the "customers" table |
| Multi-Version Concurrency Control (MVCC) | A method that allows multiple transactions to read and write simultaneously by keeping multiple versions of a row | PostgreSQL using MVCC so readers never block writers — each transaction sees a consistent snapshot of data |
N
| Term | Definition | Example |
|---|---|---|
| NoSQL Database | A category of databases that store data in formats other than relational tables — document, key-value, graph, or column-family | MongoDB (documents), Redis (key-value), Cassandra (column-family), Neo4j (graph) |
| Normalisation | The process of organising a database to reduce data redundancy and improve data integrity by applying a series of formal rules | Splitting a table that stores customer name, customer city, and order details into separate "customers" and "orders" tables |
| NOT NULL | A constraint that prevents a column from being left empty — a value must always be provided | Applying NOT NULL to the "email" column so every user must have an email address on record |
| NULL | A special marker indicating that a value is unknown, missing, or not applicable — it is not zero or an empty string | A "middle_name" column showing NULL for customers who have no middle name |
O
| Term | Definition | Example |
|---|---|---|
| OLAP (Online Analytical Processing) | Database systems optimised for complex analytical queries across large historical datasets | A business analyst running a query across 5 years of sales data to identify seasonal trends |
| OLTP (Online Transaction Processing) | Database systems optimised for fast, high-volume, everyday transactions — insert, update, delete | A retail point-of-sale system processing hundreds of transactions per second |
| One-to-Many Relationship | A relationship where one row in Table A relates to multiple rows in Table B | One customer can have many orders — but each order belongs to exactly one customer |
| One-to-One Relationship | A relationship where each row in Table A corresponds to exactly one row in Table B | A "users" table and a "user_profiles" table — each user has exactly one profile |
| ORM (Object-Relational Mapping) | A programming technique that lets developers interact with the database using objects in their code instead of writing SQL | Django's ORM letting a Python developer write User.objects.filter(age__gt=18) instead of SELECT * FROM users WHERE age > 18 |
P
| Term | Definition | Example |
|---|---|---|
| Partition | Dividing a large table into smaller, more manageable pieces while still treating it as one logical table | Partitioning an "orders" table by year — 2022, 2023, 2024 each stored separately — so queries for one year don't scan all years |
| Phantom Read | A problem in concurrent databases where a transaction re-runs a query and sees new rows that appeared from another transaction | Transaction A counts 50 orders; Transaction B inserts 5 more; Transaction A counts again and gets 55 — it "sees" phantom rows |
| Primary Key | A column (or combination of columns) that uniquely identifies each row in a table — no two rows can share the same primary key value, and it cannot be NULL | "customer_id" in a customers table — every customer has a different ID, and no customer can exist without one |
| Procedure — see Stored Procedure |
Q
| Term | Definition | Example |
|---|---|---|
| Query | A request sent to the database to retrieve, insert, update, or delete data — usually written in SQL | SELECT name, email FROM customers WHERE city = 'Nagpur' — retrieving all Nagpur customers |
| Query Optimiser | The component inside a DBMS that analyses a query and determines the most efficient way to execute it | Choosing to use an index rather than a full table scan to run a WHERE clause faster |
| Query Plan — see Execution Plan |
R
| Term | Definition | Example |
|---|---|---|
| RDBMS (Relational Database Management System) | A DBMS that organises data into tables with rows and columns, and uses SQL to manage and query it | MySQL, PostgreSQL, Oracle Database, and Microsoft SQL Server are all RDBMSs |
| Record | A single row in a database table — represents one complete instance of the entity the table describes | One row in the "students" table represents one student with all their details |
| Referential Integrity | The rule that a foreign key value in one table must always match an existing primary key value in the referenced table | An "orders" table cannot have a customer_id that does not exist in the "customers" table |
| Replication | Copying data from one database server to one or more other servers to improve availability and fault tolerance | A primary database in Mumbai continuously copying its data to a replica in Bengaluru — if Mumbai fails, Bengaluru takes over |
| REVOKE | An SQL command that removes previously granted permissions from a user or role | REVOKE DELETE ON orders FROM intern — preventing an intern from deleting order records |
| Right Join (Right Outer Join) | An SQL join that returns all rows from the right table and matching rows from the left | Joining "employees" and "departments" with RIGHT JOIN returns all departments, including those with no employees yet |
| Roll Back | Undoing all changes made by a transaction that failed or was explicitly cancelled | A failed payment transaction rolling back so the customer's balance is not deducted |
| Row | A horizontal record in a table — one complete set of data values for a single entity | One row in the "products" table holds all the details of one product |
S
| Term | Definition | Example |
|---|---|---|
| Schema | The blueprint of a database — the definition of its tables, columns, data types, and relationships | A schema defines that the "orders" table has columns: order_id (INT), customer_id (INT), order_date (DATE), and total (DECIMAL) |
| Second Normal Form (2NF) | A normalisation rule that requires a table to be in 1NF and that every non-key column depends on the entire primary key, not just part of it | Removing "product_name" from an "order_items" table whose key is (order_id, product_id) — product_name depends only on product_id, not the full key |
| SELECT | The most frequently used SQL command — retrieves data from one or more tables based on specified conditions | SELECT * FROM employees WHERE department = 'Engineering' |
| Sequence | A database object that generates a series of unique numbers in order, used to populate primary key columns | A sequence that produces 1, 2, 3, 4... automatically as each new order is inserted |
| Sharding — see Horizontal Scaling | ||
| SQL (Structured Query Language) | The standard language used to communicate with relational databases — used to create, retrieve, update, and delete data | Virtually every relational database — MySQL, PostgreSQL, Oracle — uses SQL as its primary interface |
| SQL Injection | A security attack where malicious SQL code is inserted into an input field to manipulate or destroy the database | Entering ' OR '1'='1 into a login form to bypass authentication — a classic injection attack |
| Stored Procedure | A pre-written, saved block of SQL code stored in the database that can be called by name — like a function for SQL | A "ProcessPayment" stored procedure that handles all the steps of a payment transaction in one reusable call |
| Subquery | A query nested inside another query — the inner query's result is used by the outer query | SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees) — finding employees earning above average |
| Surrogate Key | An artificial primary key — typically an auto-incremented number — created purely to uniquely identify rows, with no business meaning | Assigning a customer_id of 10042 to a customer purely for database purposes — the number itself means nothing to the business |
T
| Term | Definition | Example |
|---|---|---|
| Table | The fundamental storage structure in a relational database — data is organised into rows (records) and columns (fields) | A "products" table with columns for product_id, name, price, and stock_quantity |
| Third Normal Form (3NF) | A normalisation rule that requires every non-key column to depend only on the primary key — not on another non-key column | Removing "city" from a "customers" table where city depends on "zip_code" rather than directly on "customer_id" |
| Transaction | A group of one or more SQL operations treated as a single unit — either all succeed together, or none are applied | Transferring ₹10,000 between two bank accounts is one transaction — the debit and credit must both succeed or both fail |
| Trigger | An automated action that the database executes automatically when a specific event occurs on a table | A trigger that automatically logs every DELETE operation on the "customers" table to an audit log table |
| Truncate | An SQL command that removes all rows from a table quickly without logging individual row deletions — faster than DELETE but cannot be rolled back easily | TRUNCATE TABLE temp_data — clearing a staging table before loading fresh data into it |
U
| Term | Definition | Example |
|---|---|---|
| Union | An SQL operator that combines the results of two or more SELECT queries into a single result set, removing duplicates | Combining a list of email addresses from "customers" and "newsletter_subscribers" into one deduplicated list |
| UNION ALL | Like UNION but keeps all rows including duplicates — faster because it skips the deduplication step | Combining two sets of transaction records where duplicates are expected and acceptable |
| Unique Constraint | A rule that ensures all values in a column (or combination of columns) are distinct across all rows | Applying UNIQUE to the "email" column so no two users can register with the same email address |
| UPDATE | The SQL command used to modify existing data in a table | UPDATE products SET price = 499 WHERE product_id = 7 — changing the price of one specific product |
V
| Term | Definition | Example |
|---|---|---|
| View | A virtual table defined by a saved SELECT query — it has no physical data of its own but behaves like a table | A "active_customers" view that automatically shows only customers who placed an order in the last 90 days |
| Vertical Scaling | Increasing the power of a single database server by adding more CPU, RAM, or storage | Upgrading the database server from 16GB RAM to 128GB to handle heavier query loads |
W
| Term | Definition | Example |
|---|---|---|
| WHERE | An SQL clause that filters which rows are returned or affected by a query based on one or more conditions | SELECT * FROM orders WHERE status = 'pending' — retrieving only orders that have not been fulfilled |
| Wildcard | A special character used in SQL queries to match patterns in text data | The % symbol in LIKE 'Na%' matches any value starting with "Na" — e.g. "Nagpur", "Nandita", "Naresh" |
| Write-Ahead Logging (WAL) | A technique where changes are recorded in a log file before being applied to the database — ensures durability and crash recovery | PostgreSQL writing every committed transaction to the WAL file so changes can be replayed if the system crashes |
X
| Term | Definition | Example |
|---|---|---|
| XML (in Databases) | A structured data format supported by many databases for storing hierarchical or semi-structured data within columns | Storing a product's full specification tree as an XML column in SQL Server |
Y
| Term | Definition | Example |
|---|---|---|
| YAML (Database Config) | A human-readable format commonly used to write database configuration files and connection settings | Defining a PostgreSQL connection — host, port, username, password — in a YAML config file for a web application |
Z
| Term | Definition | Example |
|---|---|---|
| Zero Downtime Migration | A database migration strategy designed to apply schema changes without taking the application offline | Adding a new column to a live production table in stages so users never experience an outage |
| Zone (Data Zone) | A logical or physical partition of a database or data lake that separates data by sensitivity, source, or processing stage | Separating raw incoming data (Bronze Zone), cleaned data (Silver Zone), and analytics-ready data (Gold Zone) in a data lake |
*This glossary covers 100+ terms across the full database landscape, from beginner SQL to advanced concepts like MVCC, WAL, and sharding.
Top comments (0)