Every variable we'd worked with for four weeks held exactly one thing.
One name. One score. One balance. One grade.
That's fine for simple programs. But real data doesn't come in ones. A school has hundreds of students. A shop has thousands of products. M-Pesa processes millions of transactions every day.
I opened Week 5 with a question: "How do you store 500 student names in Python?"
Someone said: name1 = "Amina", name2 = "Brian" ...
I let them get to name5 before raising my hand.
"What about name 500?"
This week we got the answer. Not one box - a whole warehouse.
Lists: The Numbered Shopping List
A list is an ordered, changeable collection. Think of it as a numbered shopping list - item 1, item 2, item 3. The order matters. You can add things, remove things, change things.
students = ["Amina", "Brian", "Njeri", "Kamau", "Wanjiku"]
print(students) # the whole list
print(students[0]) # first item — Amina
print(students[2]) # third item — Njeri
print(students[-1]) # last item — Wanjiku
['Amina', 'Brian', 'Njeri', 'Kamau', 'Wanjiku']
Amina
Njeri
Wanjiku
That -1 index caused the first pause of the session. "Negative indexing?"
Yes. In Python, -1 always means the last item. -2 is second from last. It's Python's shortcut for "count from the back" - useful when you don't know how long the list is.
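Here is a small sketch of how negative indexes line up with positive ones, reusing the `students` list from above. The slice on the last line goes a little beyond what we covered in the session, so treat it as an extra.

```python
students = ["Amina", "Brian", "Njeri", "Kamau", "Wanjiku"]

# -1 is the last item, whatever the length of the list
last = students[-1]
same_last = students[len(students) - 1]  # the long way round

# -2 is second from last
second_last = students[-2]

# A slice with a negative start takes the last N items
last_two = students[-2:]

print(last, same_last)  # Wanjiku Wanjiku
print(second_last)      # Kamau
print(last_two)         # ['Kamau', 'Wanjiku']
```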
List Methods: What You Can Do With a List
Lists come with built-in tools:
scores = [78, 45, 92, 61, 88]
scores.append(73) # add to the end
print(scores)
scores.insert(0, 100) # insert at position 0
print(scores)
scores.remove(45) # remove first occurrence of 45
print(scores)
scores.sort() # sort in place
print(scores)
print(len(scores)) # how many items?
print(max(scores)) # highest value
print(min(scores)) # lowest value
print(sum(scores)) # total
[78, 45, 92, 61, 88, 73]
[100, 78, 45, 92, 61, 88, 73]
[100, 78, 92, 61, 88, 73]
[61, 73, 78, 88, 92, 100]
6
100
61
492
And looping through a list - which felt familiar from Week 3:
for score in scores:
    if score >= 80:
        print(f"{score} ✅ Pass")
    else:
        print(f"{score} ❌ Below average")
61 ❌ Below average
73 ❌ Below average
78 ❌ Below average
88 ✅ Pass
92 ✅ Pass
100 ✅ Pass
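Two things about list methods tend to trip people up, so here is a hedged sketch of both: `sort()` changes the list itself and returns `None`, while the built-in `sorted()` hands back a new list; and `remove()` raises a `ValueError` if the value isn't in the list. The value 50 below is just an illustration of something that isn't there.

```python
scores = [78, 45, 92, 61, 88]

# sorted() returns a new list and leaves the original alone
ordered = sorted(scores)
print(ordered)  # [45, 61, 78, 88, 92]
print(scores)   # [78, 45, 92, 61, 88]

# sort() changes the list in place and returns None
result = scores.sort()
print(result)   # None

# remove() raises ValueError if the value is missing, so check first
if 50 in scores:
    scores.remove(50)
print(scores)   # [45, 61, 78, 88, 92]
```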
Tuples: The Sealed Envelope
A tuple looks like a list but uses round brackets - and once you create it, you cannot change it. It's sealed.
coordinates = (-1.286389, 36.817223) # Nairobi CBD coordinates
rgb_blue = (0, 0, 255)
weekdays = ("Mon", "Tue", "Wed", "Thu", "Fri")
print(coordinates[0]) # -1.286389
print(len(weekdays)) # 5
-1.286389
5
Try to change it:
coordinates[0] = 99 # trying to modify a tuple
TypeError: 'tuple' object does not support item assignment
"Why would you ever want something you can't change?"
Good question - and it's the right one to ask. Tuples are for data that should not change. GPS coordinates. Days of the week. A student's ID number and date of birth. If you use a list for these and something accidentally modifies it, your data is wrong. A tuple makes that impossible.
Rule I gave them: if the data shouldn't change - use a tuple. If it might change - use a list.
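A short sketch of what working with a sealed tuple looks like in practice, using the same `coordinates` value. Unpacking and the `try`/`except` are additions of mine rather than something from the session, and `moved` is a made-up name for illustration.

```python
coordinates = (-1.286389, 36.817223)  # Nairobi CBD, as above

# Unpacking: pull both values out of the tuple in one line
latitude, longitude = coordinates
print(f"Lat: {latitude}, Lon: {longitude}")

# Changing an item fails, and we can catch that error
try:
    coordinates[0] = 99
except TypeError as error:
    print(f"Blocked: {error}")

# To "change" a tuple you build a new one instead
moved = (coordinates[0] + 0.01, coordinates[1])
print(type(moved).__name__)  # tuple
```

The original tuple is untouched after all of this, which is exactly the guarantee the rule relies on.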
Dictionaries: The Contacts Book
This is the one that changed how students thought about data.
A dictionary stores data in key-value pairs. Instead of looking things up by position (item 0, item 1), you look them up by name.
My analogy: a contacts book. You don't flip to page 7 to find someone's number - you look them up by their name. The name is the key. The number is the value.
student = {
    "name": "Amina Waweru",
    "age": 24,
    "course": "Data Engineering",
    "city": "Nairobi",
    "score": 88
}
print(student["name"]) # Amina Waweru
print(student["score"]) # 88
print(student["course"]) # Data Engineering
Amina Waweru
88
Data Engineering
Adding, updating, and removing entries:
# Add a new key
student["email"] = "amina@luxacademy.co.ke"
# Update an existing value
student["score"] = 92
# Remove a key
del student["age"]
print(student)
{'name': 'Amina Waweru', 'course': 'Data Engineering', 'city': 'Nairobi', 'score': 92, 'email': 'amina@luxacademy.co.ke'}
Looping through a dictionary:
for key, value in student.items():
    print(f"{key}: {value}")
name: Amina Waweru
course: Data Engineering
city: Nairobi
score: 92
email: amina@luxacademy.co.ke
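One thing the examples above don't show is what happens when a key is missing: `student["age"]` would now raise a `KeyError`, because we deleted it. A minimal sketch of two safer options, `.get()` and `in`, follows. The `"phone"` key and its default text are made up for illustration.

```python
student = {"name": "Amina Waweru", "course": "Data Engineering", "score": 92}

# .get() returns None, or a default you choose, instead of crashing
age = student.get("age")
phone = student.get("phone", "not provided")

# "in" checks whether a key exists
has_score = "score" in student

print(age)        # None
print(phone)      # not provided
print(has_score)  # True
```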
The Moment That Stopped the Room
We'd been working with one dictionary - one student. Then I asked: "What if we had a list of dictionaries? One per student?"
students = [
    {"name": "Amina", "course": "Data Engineering", "score": 88},
    {"name": "Brian", "course": "Data Science", "score": 74},
    {"name": "Njeri", "course": "Data Engineering", "score": 91},
    {"name": "Kamau", "course": "Data Science", "score": 62},
    {"name": "Wanjiku", "course": "Data Engineering", "score": 79},
]
for student in students:
    print(f"{student['name']} | {student['course']} | Score: {student['score']}")
Amina | Data Engineering | Score: 88
Brian | Data Science | Score: 74
Njeri | Data Engineering | Score: 91
Kamau | Data Science | Score: 62
Wanjiku | Data Engineering | Score: 79
Otieno looked at it for a moment, then said: "This is basically a table. Like SQL."
The room shifted.
He was exactly right. Each dictionary is a row. Each key is a column. A list of dictionaries is a table - the same structure that SQL uses, the same structure that Pandas DataFrames use. The students headed for Data Engineering had just seen their first dataset. The ones headed for Data Science had just seen their first DataFrame — before ever importing Pandas.
I told them to remember this moment.
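To make the rows-and-columns idea concrete, here is a sketch that reads one "row" and one "column" out of the same list of dictionaries. It uses only a plain loop, and the variable names are mine.

```python
students = [
    {"name": "Amina", "course": "Data Engineering", "score": 88},
    {"name": "Brian", "course": "Data Science", "score": 74},
    {"name": "Njeri", "course": "Data Engineering", "score": 91},
    {"name": "Kamau", "course": "Data Science", "score": 62},
    {"name": "Wanjiku", "course": "Data Engineering", "score": 79},
]

# A row: one dictionary, picked by position
third_row = students[2]
print(third_row["name"])  # Njeri

# A column: the same key taken from every row
scores = []
for student in students:
    scores.append(student["score"])
print(scores)  # [88, 74, 91, 62, 79]

# Once you have a column, the list tools from earlier apply
average = sum(scores) / len(scores)
print(f"Average score: {average:.1f}")  # Average score: 78.8
```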
Filtering a List of Dictionaries
Once they saw the table connection, the next step was obvious - filtering:
# Find all Data Engineering students
de_students = []
for student in students:
    if student["course"] == "Data Engineering":
        de_students.append(student)
for s in de_students:
    print(f"{s['name']}: {s['score']}")
Amina: 88
Njeri: 91
Wanjiku: 79
Someone said quietly: "That's a WHERE clause."
Yes. Exactly. We'd just written a Python WHERE filter - no SQL, no Pandas, just a loop and a conditional. These concepts connect across every tool they'll ever use.
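The same filter can be written in one line as a list comprehension, which is the form the registry project later in this post uses. A sketch, assuming the same `students` list; the second filter with `and` is my own illustration of a two-condition WHERE.

```python
students = [
    {"name": "Amina", "course": "Data Engineering", "score": 88},
    {"name": "Brian", "course": "Data Science", "score": 74},
    {"name": "Njeri", "course": "Data Engineering", "score": 91},
    {"name": "Kamau", "course": "Data Science", "score": 62},
    {"name": "Wanjiku", "course": "Data Engineering", "score": 79},
]

# Same result as the loop-and-append version: keep rows that match
de_students = [s for s in students if s["course"] == "Data Engineering"]
de_names = [s["name"] for s in de_students]
print(de_names)  # ['Amina', 'Njeri', 'Wanjiku']

# Two conditions, like WHERE course = ... AND score >= 80
top_de = [
    s["name"]
    for s in students
    if s["course"] == "Data Engineering" and s["score"] >= 80
]
print(top_de)  # ['Amina', 'Njeri']
```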
Sets: The Register With No Duplicates
The last structure - and the one with the best surprise moment.
A set is an unordered collection with no duplicates. If you try to add the same item twice, it just ignores the second one.
# Imagine students signing up for a workshop
signups = ["Amina", "Brian", "Amina", "Njeri", "Brian", "Brian", "Kamau"]
unique_signups = set(signups)
print(unique_signups)
print(f"Unique attendees: {len(unique_signups)}")
{'Njeri', 'Kamau', 'Amina', 'Brian'}
Unique attendees: 4
"It just... removed the duplicates?!"
Yes. Automatically. No loop. No checking. Just set().
Aisha immediately said: "So this is like SELECT DISTINCT in SQL?"
Exactly. Different syntax, same idea. Sets exist for moments when uniqueness is what matters - unique visitors, unique products, unique error types. If you need order or labels - use a list or dictionary. If you just need to know what's there without repetition - use a set.
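Sets can do more than deduplicate. The sketch below shows `add()` ignoring a repeat, a membership check, and two set operations. The `attended` set and the extra name "Otieno" are hypothetical data I made up for the example. Because a set has no fixed order, the code prints through `sorted()` so the output is predictable.

```python
signups = ["Amina", "Brian", "Amina", "Njeri", "Brian", "Brian", "Kamau"]
unique_signups = set(signups)

unique_signups.add("Amina")   # already there, silently ignored
unique_signups.add("Otieno")  # new, so it is added
print(len(unique_signups))            # 5
print("Brian" in unique_signups)      # True
print(sorted(unique_signups))         # ['Amina', 'Brian', 'Kamau', 'Njeri', 'Otieno']

# Hypothetical attendance list for the same workshop
attended = {"Amina", "Njeri", "Wanjiku"}

# & keeps names in both sets; - keeps names only in the first
showed_up = sorted(unique_signups & attended)
no_shows = sorted(unique_signups - attended)
print(showed_up)  # ['Amina', 'Njeri']
print(no_shows)   # ['Brian', 'Kamau', 'Otieno']
```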
Choosing the Right Structure
I ended the theory section with a decision table:
| I need to... | Use |
|---|---|
| Store items in order, change them later | List |
| Store items that should never change | Tuple |
| Look things up by name/label | Dictionary |
| Store unique items, no duplicates | Set |
This table ended up on everyone's notes. The question isn't "how do I use each one" - it's "which one fits what I'm trying to do."
We Built This Together: Student Registry System
The session project - combining lists, dictionaries, and a set with functions from Week 4:
# Our student registry - a list of dictionaries
registry = []
def add_student(name, course, score):
    student = {
        "name": name,
        "course": course,
        "score": score,
        "grade": calculate_grade(score)
    }
    registry.append(student)

def calculate_grade(score):
    if score >= 80: return "A"
    elif score >= 70: return "B"
    elif score >= 60: return "C"
    elif score >= 50: return "D"
    else: return "F"

def print_registry():
    print(f"\n{'='*50}")
    print(" LUX ACADEMY - STUDENT REGISTRY")
    print(f"{'='*50}")
    for s in registry:
        print(f" {s['name']:<15} {s['course']:<20} {s['score']} ({s['grade']})")
    print(f"{'='*50}")
    print(f" Total students: {len(registry)}")

def course_summary():
    # A set of course names, so each course is summarised once
    courses = set(s["course"] for s in registry)
    print("\n--- Course Summary ---")
    for course in courses:
        enrolled = [s for s in registry if s["course"] == course]
        avg = sum(s["score"] for s in enrolled) / len(enrolled)
        print(f" {course}: {len(enrolled)} students | Avg score: {avg:.1f}")
# Populate the registry
add_student("Amina Waweru", "Data Engineering", 88)
add_student("Brian Otieno", "Data Science", 74)
add_student("Njeri Kamau", "Data Engineering", 91)
add_student("Kamau Mwangi", "Data Science", 62)
add_student("Wanjiku Achieng", "Data Engineering", 79)
add_student("Aisha Baraka", "Data Science", 85)
print_registry()
course_summary()
==================================================
 LUX ACADEMY - STUDENT REGISTRY
==================================================
 Amina Waweru    Data Engineering     88 (A)
 Brian Otieno    Data Science         74 (B)
 Njeri Kamau     Data Engineering     91 (A)
 Kamau Mwangi    Data Science         62 (C)
 Wanjiku Achieng Data Engineering     79 (B)
 Aisha Baraka    Data Science         85 (A)
==================================================
 Total students: 6
--- Course Summary ---
Data Engineering: 3 students | Avg score: 86.0
Data Science: 3 students | Avg score: 73.7
That output on the projector got a round of quiet impressed noises. It looks like something you'd actually use. Because it is.
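If you want to push the registry one step further, a ranking is a natural next feature. The sketch below is not something we built in the session: `top_students` is a hypothetical helper, the small `registry` is a stand-in with the same keys, and the `key=lambda ...` argument to `sorted()` is a tool we have not formally covered yet.

```python
# A small stand-in for the registry built above (same keys)
registry = [
    {"name": "Amina Waweru", "course": "Data Engineering", "score": 88},
    {"name": "Brian Otieno", "course": "Data Science", "score": 74},
    {"name": "Njeri Kamau", "course": "Data Engineering", "score": 91},
]

def top_students(records, n=2):
    """Return the n highest-scoring records (hypothetical helper)."""
    # key tells sorted() which value to compare; reverse puts highest first
    ranked = sorted(records, key=lambda s: s["score"], reverse=True)
    return ranked[:n]

for s in top_students(registry):
    print(f"{s['name']}: {s['score']}")
# Njeri Kamau: 91
# Amina Waweru: 88
```

Because `sorted()` returns a new list, the registry itself keeps its original order.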
Practice Problems
Easy:
# 1. Create a list of 5 Kenyan cities. Print the first, last, and middle one.
# 2. Create a tuple of your top 3 favourite foods. Try to change one item — read the error.
# 3. Create a set from this list and print how many unique items there are:
# items = ["unga", "sukari", "unga", "mafuta", "sukari", "chumvi", "unga"]
Medium:
# Build a simple phone book using a dictionary
# It should support: add contact, lookup contact, delete contact
phone_book = {}
def add_contact(name, number):
    pass # your code here
def lookup(name):
    pass # your code here - handle missing contacts gracefully
def delete_contact(name):
    pass # your code here
add_contact("Amina", "0712 345 678")
add_contact("Brian", "0723 456 789")
print(lookup("Amina")) # 0712 345 678
print(lookup("Njeri")) # "Contact not found"
delete_contact("Brian")
Challenge:
# M-Pesa transaction log analyser
# Given this list of transactions, find:
# 1. Total amount sent
# 2. Largest single transaction
# 3. Unique recipients (use a set)
# 4. How many transactions were above KES 1000
transactions = [
    {"recipient": "Amina", "amount": 500, "type": "send"},
    {"recipient": "Brian", "amount": 2000, "type": "send"},
    {"recipient": "Amina", "amount": 750, "type": "send"},
    {"recipient": "Njeri", "amount": 150, "type": "send"},
    {"recipient": "Brian", "amount": 3500, "type": "send"},
    {"recipient": "Kamau", "amount": 1200, "type": "send"},
    {"recipient": "Amina", "amount": 800, "type": "send"},
]
# Your analysis here...
What I Noticed Teaching This Session
1. The list-of-dictionaries moment is the pivot point of the whole course. Otieno's "this is basically a table" - that connection to SQL, to DataFrames, to real data - is what separates Python-as-syntax from Python-as-a-data-tool. Make sure that moment has space to breathe.
2. Sets always get the best surprise reaction. Students expect deduplication to require a loop and a lot of logic. When set() does it in one word, the reaction is genuine. Let them sit with it before explaining why.
3. The decision table mattered more than I expected. Students kept coming back to it during exercises - "should I use a list or a dictionary here?" The fact that they were asking that question at all means the concept landed.
4. Functions + data structures together feel like real software. The registry project is the first thing we've built that genuinely looks like something you'd find in a codebase. That matters for motivation - especially as we head into the home stretch of the foundations course.
What's Coming in Week 6: String Methods & File Handling
So far all our data has lived inside the program. We type it in, and it disappears when the program ends.
Week 6 changes that — we read data from files. Real files. CSV files:
with open("students.csv", "r") as file:
    for line in file:
        print(line.strip())
name,course,score
Amina,Data Engineering,88
Brian,Data Science,74
Njeri,Data Engineering,91
A file full of student records - read into Python, processed with loops and dictionaries, results written back out. For the Data Engineering students, this is the beginning of pipelines. For the Data Science students, this is the step right before Pandas.
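To show how a file like that turns back into this week's list of dictionaries, here is a rough sketch using `csv.DictReader` from Python's standard library. This may not be the approach we take in Week 6. To keep the sketch runnable without a file on disk, it reads the same rows from a string through `io.StringIO`, which is my stand-in for `students.csv`.

```python
import csv
import io

# Stand-in for students.csv so the sketch runs without a real file
csv_text = """name,course,score
Amina,Data Engineering,88
Brian,Data Science,74
Njeri,Data Engineering,91
"""

# DictReader uses the header line as keys: one dictionary per row
reader = csv.DictReader(io.StringIO(csv_text))

students = []
for row in reader:
    row["score"] = int(row["score"])  # CSV values arrive as strings
    students.append(row)

print(len(students))          # 3
print(students[0]["name"])    # Amina
print(students[2]["score"])   # 91
```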
One session away. See you then.
Try It Yourself
- Download Python 3
- Download VS Code
- This week's code on GitHub ← link coming soon
Start with the M-Pesa transaction analyser - it uses lists, dictionaries, and sets, and will tell you immediately which concepts need more practice.
I'm a data trainer in Nairobi running a full data programme -
Python foundations → Data Science or Data Engineering specialisations.
I write weekly about what we covered, what worked, and what surprised me.
Follow along or drop your questions in the comments.