Skip to content
loading...

re: Explain indexing in databases Like I'm Five VIEW POST

FULL DISCUSSION
 

It is highly likely you are already aware of some common indexing techniques, like alphabetic or numeric order.

The latter you might have come into contact with when using arrays or vectors in some programming language and the former in libraries or encyclopedias.

In both cases the indexing system allows for quick insertion as well as quick retrieval once you have identified what you want to retrieve or what the thing you're inserting looks like.

With a database you tend to have more data than is convenient to index with just numbers or alphabetic characters, so those are usually combined with structuring and factoring of the data involved.

It could be pages with tables and relations between them or tree structures or something like that, what kind of data one is handling determines what is a good indexing and structuring technique. A good database engine gives you some control over this so you can optimise for different use cases while developing your program, e.g. flexibility or raw computational performance or visualisation.

Usually one uses index layers that contain reductive interpretations of the stored data that speed up searches by providing shortcuts. It could be links between copies of beginnings of words in the data set, like 'ju', to every data point that starts with these characters, for example the entry about the book Jurassic Park (Michael Crichton) as well as the one regarding Jubilees, as in the Book of Jubilees in Ethiopian Orthodoxy.

Programming such an index is fairly straightforward, one would basically check for the first two or three characters at insertion and add a layer of links to the underlying data, and a more tricky and interesting example of something similar is the soundex algorithm, but there are many, many clever ways to do these kinds of things.

For further reading Knuth's classic The Art of Computer Programming as well as more recent stuff like Cormens Introduction to Algorithms are time well spent.

 

I thinks this covers the ELI5 level pretty well.

If you want to dig deeper into indexing and how it works with a relational database I would absolutely recommend Use the index luke. Even from the 3 minute test I learnt something.

code of conduct - report abuse