I've been programming seriously now for about a year; in that time I've brushed up on my python skills, completed a few YouTube courses on HTML/CSS, studied JavaScript, and built a few React apps. So far, everything has been awesome. I love to code and this journey has allowed me to seamlessly integrate the technical and creative sides of my brain. Recently, however, I have been faced with this lingering question: are my programs optimized for performance?
Curiosity has led me to dive deeper into some of the computer science fundamentals, something that I feel was missing from the previous stage of my learning. Sure I was learning to write code; heck, I even got out of "tutorial hell" and was quickly building projects on my own. However, I lacked the theory necessary to be confident that I was writing the best code.
Enter my introduction to data structures. For the past few months now, I have been following the old Stanford CS106B lectures available through YouTube. The course does an excellent job at transforming you from someone who knows how to code into someone who knows how to use computers to solve problems.
Data Structures
So why would you want to learn about data structures? Well, just as there are many tools that serve unique purposes in the construction of a house, there are various data structures that are useful in solving different kinds of problems in computer science. Here, I will briefly discuss the theory and applications of stacks, queues, sets, and maps implemented in C++.
Stacks
Like the name implies, a stack is simply a stack of things; In this structure, items can only be added or removed from the top. The order of a stack is known as "LIFO", or the last item that goes in is the first item that comes out.
Imagine you are grading papers, every time you finish one you place the paper into a neat pile on your desk. When you get up and look at those papers, the first one you will see is the last you finished grading. Similarly, if you grade another new paper and place it in the pile, it is now at the top of the pile. Stacks are just like this.
In a stack, the three fundamental operations are top, push, and pop. Calling "top" on a stack allows you to examine the top item but does not remove it. When you "push" an item to a stack, you place it at the top. When you "pop" from a stack, you remove the top item.
int main(){
stack<int> s;
s.push(1); // s -> {1}
s.push(2); // s -> {1,2}
cout << s.top() << endl;
s.pop(); // s -> {1}
s.push(3); // s -> {1,3}
cout << s.top() << endl;
s.pop(); // s -> {1}
cout << s.top() << endl;
return 0;
}
2
3
1
Stacks are useful in an abundance of applications. One of the most popular is the undo feature that is used in word processors. Every time a change is made to a document, that change gets added to a stack. When you press CMD+Z (or CTRL+Z for my windows friends), the most recent action gets "popped" from the stack. Another area where stacks find use is in the balancing of braces and parentheses in a program. The compiler checks for matching braces by implementing the stack data structure.
Queues
A queue is another data structure that is a lot like the line at a grocery store or movie theater. At the movie theater, people are served in the order in which they get in line; the first person that gets in line is the first person who gets their ticket. The order of a queue is then known as "FIFO", or the first item that goes in is the first item that comes out.
The fundamental operations of a queue are pretty similar to those of a stack; You can push to a queue, you can pop from a queue, and you can check both the frontmost and rearmost elements of a queue. The difference, however, is in the order in which elements are enqueued (pushed) and dequeued (popped); enqueuing an element places it at the back and dequeuing removes the element at the front.
int main(){
queue<int> q;
q.push(1); // q -> {1}
q.push(2); // q -> {1,2}
q.push(3); // q -> {1,2,3}
cout << q.front() << endl;
cout << q.back() << endl;
q.pop(); // q -> {2,3}
cout << q.front() << endl;
return 0;
}
1
3
2
Like stacks, queues are useful in a variety of problems. For instance, a queue might be used to schedule the order in which items get processed by a printer. Another type of problem that a queue excels in is the implementation of a breadth-first-search algorithm.
Sets
A set is a data structure that represents a unique collection of values. Sets cannot contain duplicates and are not indexed like vectors. Consequentially, adding and removing items from a set is efficient because, unlike vectors, there is no reindexing of other elements during these kinds of operations.
Sets have three primary operations: insert, erase, and count. "Insert" adds an element to the set so long as it is unique, "erase" removes an element from the set, and "count" checks if an element is present in the set. Of course, there are plenty of other things you can do with a set, and if you are interested you can check those out on the C++ STL reference page.
int main(){
set<int> st;
st.insert(1); // set contains 1
st.insert(2); // set contains 1,2
st.insert(2); // 2 is not unique and set is unchanged
st.insert(3); // set contains 1,2,3
st.insert(4); // set contains 1,2,3,4
cout << st.count(2) << endl; // return of 1 -> set contains element
st.erase(2); // set contains 1,3,4
cout << st.count(2) << endl; // return of 0 -> set does not contain element
cout << "--" << endl;
for (int item : st){
cout << item << endl;
}
return 0;
}
1
0
--
1
3
4
Sets, similar to other data structures mentioned, have specific cases where their use is ideal. One example that implements a set is a shopping list. We can add items to the shopping list and we can remove them, but we don't really care about the order of items on our list. The point is that we put all items on our list into our cart exactly one time. Another great scenario for a set that comes to mind is determining the number of unique words in a book. In this case, the book would be parsed and all words would be inserted into the set. Since sets ignore duplicates, attempting to add the same word more than once does nothing. At the end of the parsing, we simply count the number of items in our set, representing unique words from the book.
Maps
If you are familiar with python, a map in C++ is a lot like a python dict
. More formally, a map is a data structure that stores key-value pairs. Intuitively, a map is a lot like a dictionary, with words representing unique keys and the respective definitions representing their values. The concept of indexing is still present in a map, though the "index" (key) does not necessarily need to be an int
.
Maps support a variety of operations but the main ones are insertion of a key-value pair, removal of a key, and retrieval of a key. When you remove a key from the map, it also removes the value that was associated with it.
Consider the previous scenario that implemented a set to find all unique words in a book. Imagine that instead of simply wanting to get the count of all unique words in the document, you wanted to know the number of times that each word occurred. You could implement a map to do this:
// this file is named document.txt
to be or not to be
to be a not
void readFile(const string& filename){
ifstream file;
map <string, int> tally;
file.open(filename);
string word;
while (file >> word){
if (!tally.count(word)){
// tally.insert({word,1});
tally[word] = 1;
}
else{
tally[word]++;
}
}
for (auto& s : tally) {
cout << '{' << s.first << ':' << s.second << '}' << endl;
}
}
int main(){
readFile("document.txt");
return 0;
}
{a:1}
{be:3}
{not:2}
{or:1}
{to:3}
Besides this simple example, maps are used extensively in a variety of problems. You can imagine a map-type structure being used to keep track of a user and their friend list on a social media platform. Similarly, a map would be a perfect data structure to store the information in a phonebook. Maps are incredibly useful for storing associative data.
The next steps
Of course, this is not an exhaustive list of all the different kinds of data structures; if you want to learn more, the lectures I am following provide much more detail than I can share here. I have never had the pleasure of taking one of his courses, but Marty Strepp does an excellent job of presenting the material.
If you are interested in reading about how I got involved in programming, I wrote a medium article which you can find here. Similarly, if you want to stay up to date with my progress in this field, please visit my GitHub: https://github.com/michaelarn0ld
Keep on coding...
Top comments (0)