Hello, I'm Maneshwar. I'm working on FreeDevTools, currently building **one place for all dev tools, cheat codes, and TLDRs**: a free, open-source hub where developers can quickly find and use tools without the hassle of searching all over the internet.
Yesterday, we grounded ourselves in the physical reality of disks—blocks, sectors, seek time, and why storage is slow and unreliable.
That gave us the why behind many database design choices.
Now it’s time to move one layer up.
Before we can talk about tables, indexes, or transactions, we need to understand what databases actually store and what it means for that stored state to be correct.
This starts with the most basic unit of information: the data item.
Data Items: Where Information Begins
A data item is anything that carries a piece of information.
The information itself is represented by the value stored in that data item, and the meaning we assign to that value is what makes it useful.
In practice, we often use the terms data, value, and information interchangeably, even though they are conceptually distinct.
A data item can be almost anything:
- An integer
- A person’s name
- A house address
- A binary blob
- A table
- Or something even more abstract
The size of a data item is called its granularity. Granularity defines how much information a single data item can carry.
For example:
- An integer might occupy 1, 2, 4, or 8 bytes
- A string might span dozens or thousands of bytes
- A blob could be arbitrarily large
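To make the granularity idea concrete, here is a small Python sketch (the values and labels are my own, not from any particular database) that packs the same integer into 1-, 2-, 4-, and 8-byte representations and compares them with a string and a blob:

```python
import struct

value = 42

# The same logical value can occupy different amounts of storage,
# depending on the granularity of the data item that holds it.
for fmt, label in [("b", "1-byte int"), ("h", "2-byte int"),
                   ("i", "4-byte int"), ("q", "8-byte int")]:
    packed = struct.pack(fmt, value)
    print(f"{label}: {len(packed)} byte(s) -> {packed!r}")

name = "Ada Lovelace".encode("utf-8")  # a short string: a few dozen bytes
blob = bytes(1_000_000)                # a blob: as large as we like
print(f"string: {len(name)} bytes, blob: {len(blob)} bytes")
```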
In most database discussions, data items are treated abstractly—their exact size or meaning is often left unspecified.
This abstraction is intentional: it allows database systems to reason about correctness and performance without being tied to specific representations.
At its core, every data item:
- Resides somewhere in storage
- Has a name or address by which it is referenced
- Holds a value constrained by its type
The type defines:
- What values are allowed
- What operations are permitted
At a minimum, every data item must support:
- Reading its current value
- Overwriting it with a new value
That simple ability to read and overwrite is enough to create surprisingly complex systems once persistence and concurrency enter the picture.
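As a rough sketch of those properties (the `DataItem` class and its names are invented here, not a real DBMS interface), a data item is an addressable, typed slot that supports exactly two operations: read and overwrite.

```python
class DataItem:
    """A named, typed slot in storage that supports read and overwrite."""

    def __init__(self, name: str, value_type: type, initial):
        self.name = name              # the name/address by which it is referenced
        self.value_type = value_type  # constrains which values are allowed
        self._value = self._check(initial)

    def _check(self, value):
        # The type defines what values are allowed.
        if not isinstance(value, self.value_type):
            raise TypeError(f"{self.name} expects a {self.value_type.__name__}")
        return value

    def read(self):
        """Return the current value."""
        return self._value

    def write(self, value):
        """Overwrite the current value with a new one."""
        self._value = self._check(value)


balance = DataItem("account_42_balance", int, 100)
balance.write(balance.read() - 30)  # read, then overwrite
print(balance.read())               # 70
```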
From Data Items to Databases
A database is not just a container of data items.
It is a single repository of many persistent data items that are related to one another.
Data items in a database rarely exist in isolation. Their values are connected through relationships, and those relationships are governed by integrity constraints—rules that define which combinations of values are allowed.
The complete set of values of all data items at a given moment defines the database state.
A database state is said to be consistent if all data items satisfy all defined integrity constraints.
This idea of consistency is foundational.
A database does not merely store values; it represents a real-world system at a particular point in time.
For example:
- A university database models students, courses, instructors, and enrollments
- A consistent state corresponds to a real, valid university scenario
- An inconsistent state represents something that cannot exist in the real world
The job of a database system is not just to store data, but to prevent impossible worlds from being written to disk.
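Here is a hedged sketch of that university example in Python, with the state as plain data and the integrity constraints as predicates over it (the layout and constraint names are invented for illustration):

```python
# The database state: the values of all data items at one moment in time.
state = {
    "students":    {"s1": "Alice", "s2": "Bob"},
    "courses":     {"c1": "Databases"},
    "enrollments": [("s1", "c1"), ("s2", "c1")],
}

# Integrity constraints: rules about which combinations of values are allowed.
def enrollments_reference_real_rows(db):
    return all(s in db["students"] and c in db["courses"]
               for s, c in db["enrollments"])

def no_duplicate_enrollments(db):
    return len(db["enrollments"]) == len(set(db["enrollments"]))

constraints = [enrollments_reference_real_rows, no_duplicate_enrollments]

def is_consistent(db):
    """A state is consistent iff every constraint holds."""
    return all(check(db) for check in constraints)

print(is_consistent(state))                # True: a valid university scenario
state["enrollments"].append(("s9", "c1"))  # an enrollment for a student that does not exist
print(is_consistent(state))                # False: an impossible world
```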
Database Operations and State Transitions
Users interact with databases by applying database operations.
These operations allow users to:
- Retrieve information
- Insert new data
- Modify existing data
- Delete obsolete data
Each operation transitions the database from one state to another.
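One way to picture this (a toy sketch with invented names, not how a real engine is written): each operation is a function from one database state to the next.

```python
import copy

def insert_user(db, user_id, name):
    """Insert: produce a new state containing one more data item."""
    new = copy.deepcopy(db)
    new["users"][user_id] = name
    return new

def update_user(db, user_id, name):
    """Modify: produce a new state with a changed value."""
    new = copy.deepcopy(db)
    new["users"][user_id] = name
    return new

def delete_user(db, user_id):
    """Delete: produce a new state without that data item."""
    new = copy.deepcopy(db)
    del new["users"][user_id]
    return new

state_1 = {"users": {1: "Alice"}}         # initial state
state_2 = insert_user(state_1, 2, "Bob")  # insert: state 1 -> state 2
state_3 = update_user(state_2, 1, "Ada")  # modify: state 2 -> state 3
state_4 = delete_user(state_3, 2)         # delete: state 3 -> state 4
print(state_1, state_4)  # each operation produced a new state; the old one is unchanged
```

A real DBMS does not copy the entire state on every operation, of course; the copying here is only to make the state transitions visible.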
Internally, these high-level database operations do not directly manipulate disks.
Instead, they are translated by the database management system (DBMS) into lower-level file operations.
Physically:
- A database lives in one or more ordinary files
- These files reside on disk
- Modifying the database means modifying these files
This translation layer is critical. It is where abstract concepts like records and tables meet concrete realities like blocks, writes, and syncs.
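To give a feel for that translation layer, here is a deliberately naive sketch (not how any real DBMS lays out its files): a high-level "increment a counter" operation becomes ordinary file reads and writes, finished with a sync so the blocks actually reach the disk.

```python
import json
import os

DB_FILE = "tiny.db"  # the "database" is just an ordinary file on disk

def load_state():
    """Read the whole database state back from the file."""
    if not os.path.exists(DB_FILE):
        return {}
    with open(DB_FILE, "r") as f:
        return json.load(f)

def save_state(state):
    """A logical update becomes file operations: write bytes, then sync blocks."""
    with open(DB_FILE, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # force the OS to push the dirty blocks to disk

state = load_state()
state["counter"] = state.get("counter", 0) + 1  # the high-level database operation
save_state(state)                               # the low-level file operations underneath
```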
Database Applications and Controlled Access
In early systems, users often manipulated data files directly using scripts, editors, or ad hoc tools. This approach is commonly called a file processing system, and it quickly proved to be fragile and dangerous.
Direct file manipulation:
- Is error-prone
- Easily violates integrity constraints
- Becomes unmanageable as data grows
- Breaks down completely under concurrent access
The alternative is database applications.
Database applications are carefully designed programs that:
- Encapsulate database logic
- Expose only well-defined operations
- Shield users from storage details
Modern databases are almost always accessed through such applications. Users execute queries and updates without needing to understand:
- How data items are stored
- Where files live on disk
- How concurrency is handled
- How failures are recovered from
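A rough sketch of what controlled access looks like in code (every name and the file format here are invented for illustration): callers get one well-defined operation, and the storage details stay hidden behind it.

```python
import json
import os

class UniversityApp:
    """A tiny 'database application': exposes operations, hides storage."""

    def __init__(self, path="university.db"):
        self._path = path
        self._state = self._load()

    def enroll(self, student_id: str, course_id: str):
        """The only way callers may add an enrollment."""
        enrollment = (student_id, course_id)
        if enrollment in self._state["enrollments"]:
            raise ValueError("already enrolled")  # the application enforces the constraint
        self._state["enrollments"].append(enrollment)
        self._save()                              # persistence is handled here, not by the caller

    # --- storage details the user never sees ---
    def _load(self):
        if not os.path.exists(self._path):
            return {"enrollments": []}
        with open(self._path) as f:
            data = json.load(f)
        data["enrollments"] = [tuple(e) for e in data["enrollments"]]
        return data

    def _save(self):
        with open(self._path, "w") as f:
            json.dump({"enrollments": [list(e) for e in self._state["enrollments"]]}, f)
            f.flush()
            os.fsync(f.fileno())

app = UniversityApp()
try:
    app.enroll("s1", "c1")  # the caller never touches files, blocks, or constraints directly
except ValueError as err:
    print(err)
```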
Crucially, multiple users may run these applications independently and concurrently, all operating on the same database.
Combine this concurrency with persistence on unreliable disks, and that is where the real complexity of database systems begins.
Why This Layer Matters
By now, we’ve moved up a layer:
- Disks and blocks → physical constraints
- Data items and databases → logical meaning and correctness
Everything that comes next (transactions, concurrency control, logging, and recovery) exists to preserve the illusion that:
- Data items behave atomically
- Database states transition cleanly
- Integrity constraints are never violated
Even though, underneath it all, we’re still just overwriting blocks on a slow, failure-prone disk.
👉 Check out: FreeDevTools
Feedback and contributions are welcome!
It’s online, open-source, and ready for anyone to use.
⭐ Star it on GitHub: freedevtools
