Finding ways to apply your knowledge after the learning process essentially means that the learning happened without much sense of a destination. All we were trying to do was amass all the knowledge we could, in the hope that it would come in useful in some distant, mystical future.
Doesn't that feel like procrastination?
When you try to make something, you discover a hundred things that you don’t know. You discover things that you thought you knew but don’t really know. You trip over things that seemed so simple that you didn’t even pay attention to them. You fill the gaps in your learning.
Also, it is super fun and adventurous!
You can get all that only if you do a project. So, I think that it’s worth it to center your entire learning around completing a project.
If you want to dive into building something interesting and learn useful Python/programming skills along the way, this guide is for you.
Radek Osmulski (@radekosmulski): @hwasiti @nityeshaga @fastdotai @dhh @levelsio learning as you go might be the only way to ensure you are learning something that is genuinely useful 🙂
I have lived most of my life in the camp of learning for the sake of learning so these are quite novel concepts for me as well 🙂
(10:54 AM - 02 Nov 2019)
With this guide, I aim to walk you through building something interesting, let you experience difficult-to-grasp programming intuitions as you build it, and help you go from a basic Pythonista to an advanced one.
Most importantly, I want to give you motivation and the incentive for you to teach yourself.
Here are some textbook skills that you will pick up:
- File handling
- String operations in Python
- pip and using 3rd-party packages
- Regular Expressions (RegEx) in Python
But this is not a textbook. So along with them, you will also develop intuitions about good programming practices like:
- The importance of readability of your code and coding style
- When and how to break your code into functions
- How to go about debugging your code (when you want to bang your head against the wall, instead)
- How to look things up on the Internet - use Google, use StackOverflow, read documentation etc.
- Understand the need for different data structures and when to use what
Let's get to it then!
When chatting with a close friend, have you ever wanted to know -
- the number of messages sent by each of you
- the average length of your messages
- who texts first and the first text in each conversation
- your chatting time patterns - hourly, daily and monthly
- most shared website links
- most common words that each of you use
Wouldn't it be cool if you wrote a program that would just calculate all this stuff for you?!?
Your program is going to find similar results and print them for you without those graphs and visuals.
“Every great developer you know got there by solving problems they were unqualified to solve until they actually did it.”
— Patrick McKenzie
Thinking along these lines, I believe that:
- If you know the basics of the following in Python - variables, lists, dictionaries, loops, conditions, functions - you are ready.
- Otherwise, if you are new to Python but know the basics in some other language - go through this quick Python tutorial and I think you'll be ready.
Just dive into the 1st "hello world"-equivalent exercise below. If you can complete it, you are ready!
Whatsapp allows you to export any chat into a text file that looks something like this -
So you can write a program that will read this chat file, parse it, analyse it and give you the results.
But that's not enough help, right?
That is why I have written this short guide for you to follow like a roadmap. I have divided the task of building into 10 milestones (MS) and have written small pieces of advice on what you need to learn to cover each milestone. Treat it like an apprenticeship.
"Okay, let's do it then!" 😃
When you are starting out, you don't want to spend hours setting up your environment. Half your motivation gets killed right there! Right?
Repl.it is the way out of the setup-frustration.
It is a website that provides an online IDE for almost every language, which you can access for free with just a few clicks. It is great for small projects like the one we are building.
Every programming book/tutorial ever starts out with a "Hello World!" program. Why is it so?
Apart from being welcoming to newcomers, this program does the job of reassuring the learner that her environment is set up and that things work. So, if she does it right, her program will work too!
With these goals in mind, here is your Hello World-equivalent program:
"I love you 3000".. 3000 times! (Any Marvel fans out there? :p)
- See if you are ready to dive deeper into the project
If not, then it's time to learn the basics of Python. Don't worry, it isn't too difficult.
From here onwards, you will build a piece of the project with each chapter.
There are 2 files that you will need for the project -
- Your Whatsapp chat file (ending in
- A Python code file (ending in
Once you have them, this first chapter requires you to open the chat file using your Python program and print all of its contents.
- Understand how to handle files with Python
You know that any editor that you use to open a text file on your computer (Notepad, VS Code, Vim, etc.) is a program, right?
You know what? - you can make your own Python program do that. Almost easily!
Go through this excellent tutorial by Real Python to learn the concepts of file handling in Python.
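To make this concrete, here is a minimal sketch of reading a chat export and printing it line by line. The file name chat.txt and the helper name read_chat_lines are my own assumptions for illustration, and I am assuming the export is saved as UTF-8 text.

```python
from pathlib import Path


def read_chat_lines(path):
    """Return the lines of a text file without their trailing newlines."""
    # The with-statement closes the file for us, even if reading fails.
    with open(path, encoding="utf-8") as chat_file:
        return [line.rstrip("\n") for line in chat_file]


if __name__ == "__main__":
    chat_path = Path("chat.txt")  # hypothetical file name
    if chat_path.exists():
        for line in read_chat_lines(chat_path):
            print(line)
    else:
        print(f"Put your exported chat next to this script as {chat_path}")
```

Returning a list of lines (instead of printing inside the function) means the later milestones can reuse the same helper.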
Count the number of messages that you and your friend have exchanged. Then, count each of your individual shares, both by the number of messages and by the number of words. Print the results.
- Understand Strings in Python
- Strings are sequences of characters, much like lists, and the in operator checks whether one string occurs inside another. So you can search like this:
if "- Paridhi:" in chat_line: counter+=1
Python strings are famous (as compared to the ones in other languages) because Python equips them with a rich set of built-in methods that you can use to perform operations on them. I suggest that you use this tutorial by W3Schools as your reference material for those methods.
Python's ability to slice and negative index strings can be really handy at times!
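Here is a rough sketch of how string methods can do the counting. It assumes that each message line looks like "date, time - Name: text", which may not match your export exactly. The sample lines, the name Alex and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample lines; a real export may use a different layout.
SAMPLE_CHAT = [
    "12/08/19, 21:15 - Paridhi: Hey, are you awake?",
    "12/08/19, 21:16 - Alex: Yes! What's up?",
    "12/08/19, 21:17 - Paridhi: Nothing much",
]


def split_sender_and_message(chat_line):
    """Return (sender, message), or None if the line is not a normal message."""
    if " - " not in chat_line:
        return None
    # Drop the timestamp: keep everything after the first " - ".
    after_timestamp = chat_line.split(" - ", 1)[1]
    if ": " not in after_timestamp:
        return None
    sender, message = after_timestamp.split(": ", 1)
    return sender, message


def count_messages_and_words(chat_lines):
    """Count messages and words for every sender found in the lines."""
    message_counts = Counter()
    word_counts = Counter()
    for chat_line in chat_lines:
        parsed = split_sender_and_message(chat_line)
        if parsed is None:
            continue  # skip lines that do not match the assumed layout
        sender, message = parsed
        message_counts[sender] += 1
        word_counts[sender] += len(message.split())
    return message_counts, word_counts


message_counts, word_counts = count_messages_and_words(SAMPLE_CHAT)
print(message_counts)
print(word_counts)
```

A Counter is just a dictionary that starts every key at zero, which saves you from checking whether a name has been seen before.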
Caution: From now on, you will feel your program grow in size and complexity. As it does so, you should start paying attention to your coding style and keep the readability of your code in mind.
Brian W. Kernighan says in his book - The Practice of Programming:
"The purpose of style is to make the code easy to read for yourself and others, and good style is crucial to good programming."
Personally, whenever I try to take decisions about readability of my code, this line from the Zen of Python plays in my brain:
Explicit is better than implicit
Here are 3 simple, actionable rules that you can keep in mind to develop a good coding style:
1. Put some thought into choosing your variables' names
I find Brian W. Kernighan's advice really helpful here:
- Global functions, classes, and structures should have descriptive names that suggest their role in a program.
- By contrast, shorter names suffice for local variables; within a function, n may be sufficient, npoints is fine, and numberOfPoints is overkill.
- Local variables used in conventional ways can have very short names. The use of i and j for loop indices, p and q for pointers, and s and t for strings is so frequent that there is little profit and perhaps some loss in longer names.
2. Use functions wherever necessary
- Break long pieces of code into functions
- Don't Repeat Yourself (DRY) - use functions to remove duplicate pieces of code
More on functions in the next chapter.
3. Write helpful comments
- Comments are meant to help the reader of a program. They do not help by saying things the code already plainly says, or by contradicting the code, or by distracting the reader with elaborate typographical displays.
- As much as possible, write code that is easy to understand; the better you do this, the fewer comments you need. Good code needs fewer comments than bad code. Comments are, at best, a necessary evil.
- Don't contradict the code. Most comments agree with the code when they are written, but as bugs are fixed and the program evolves, the comments are often left in their original form, resulting in disagreement with the code.
In the end, remember that the principles of programming style are based on common sense guided by experience, not on arbitrary rules and prescriptions.
Now that you have calculated your individual shares using 2 metrics - message count and word count - you can use them to calculate the average message length for each of you. Print the results.
Understand functions as a means to:
- reduce repetition
- make code more readable
Duplication may be the root of all evil in software. Functions were one of the first techniques developed to control this evil.
It is easy to understand the syntax of writing functions but it takes practice and some sense of design to learn when to break the code into functions. One goal is to design functions such that they can be reused when extending your program to new cases.
What's more, making such design choices is what makes programming fun!
Here are 3 heuristics from Bob Martin's book Clean Code that will guide you while making such choices:
- Functions should be small; how small? No more than a screenful or 20 lines
- Functions should have descriptive names. The smaller and more focused a function is, the easier it is to choose a descriptive name. Don’t be afraid to make a name long. A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.
- Functions should do only one thing and have no "side effects" - a function's intent should be clear from its name
When you first write a function it will probably come out long and complicated and not follow any of the above rules. And that's ok. You can refine and reformat your code later. I don't think anyone could start with writing functions that follow all the rules mentioned above.
Remember they are function-building goals that you need to strive towards. Don't let them paralyse you.
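As one possible illustration of these heuristics, here is a tiny function for the average message length. The name average_words_per_message and the sample totals are assumptions of mine; your own counts will come from the previous milestone.

```python
def average_words_per_message(word_count, message_count):
    """Return the mean number of words per message (0.0 if there are no messages)."""
    if message_count == 0:
        return 0.0  # avoid dividing by zero for someone who never wrote
    return word_count / message_count


# Hypothetical totals as (words, messages); real values come from your own counts.
totals = {"Paridhi": (6, 2), "Alex": (3, 1)}
for name, (words, messages) in totals.items():
    average = average_words_per_message(words, messages)
    print(f"{name}: {average:.1f} words per message")
```

The function is small, does one thing, and its name says what it returns, so the loop that calls it reads almost like a sentence. Because it only takes two numbers, it works for any number of people.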
Do you want to resolve the issue of "who texts first" once and for all? 😜
After this milestone, you will. You will know exactly how many conversations each of you has initiated and have a list of those first texts. Print all that out.
- Understand modules - you'll need Python's
- Learn how to look things up and read the documentation
Caution: Don't be intimidated by the docs. They're your friends.
Every file of Python source code whose name ends in a .py extension is a module.
A Python installation comes with a standard library that contains such modules out-of-the-box. These are useful pieces of code that you don't have to write!
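To show what a standard library module buys you, here is a sketch that finds who started each conversation using the time and calendar modules. Everything specific in it is an assumption: the timestamp layout, the rule that 6 quiet hours begin a new conversation, the sample messages and the function names.

```python
import calendar
import time

TIMESTAMP_FORMAT = "%d/%m/%y, %H:%M"  # assumed layout, e.g. "12/08/19, 21:15"
CONVERSATION_GAP_SECONDS = 6 * 60 * 60  # assumption: 6 quiet hours start a new conversation


def to_seconds(timestamp_text):
    """Convert a timestamp string into seconds, treating it as UTC for simplicity."""
    parsed = time.strptime(timestamp_text, TIMESTAMP_FORMAT)
    return calendar.timegm(parsed)


def find_conversation_starters(messages):
    """Return the (sender, text) pairs that open each conversation.

    messages is a list of (timestamp_text, sender, text) tuples in chat order.
    """
    starters = []
    previous_seconds = None
    for timestamp_text, sender, text in messages:
        current_seconds = to_seconds(timestamp_text)
        is_first_message = previous_seconds is None
        if is_first_message or current_seconds - previous_seconds > CONVERSATION_GAP_SECONDS:
            starters.append((sender, text))
        previous_seconds = current_seconds
    return starters


# Hypothetical messages, already split into their three parts.
sample_messages = [
    ("12/08/19, 21:15", "Paridhi", "Hey, are you awake?"),
    ("12/08/19, 21:16", "Alex", "Yes! What's up?"),
    ("13/08/19, 09:30", "Alex", "Good morning!"),
]
for sender, text in find_conversation_starters(sample_messages):
    print(f"{sender} started with: {text}")
```

What counts as a "new conversation" is a design choice, so feel free to pick a different gap and see how the results change.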
Now, it's time to find out your usual chatting patterns.
- What hour of the day do you chat the most? What about the rest of the hours?
- Which day of the week do you usually chat the most? What about the rest of the days?
- Which month have you chatted the most? What about the rest?
Print the results.
- Understand the need for different data structures for storing all this data and think about how to design a data structure to suit your needs
Note: You will need the time module again here. It's important for you to know that it's okay if you don't remember it; you are allowed to use Google and check the documentation as many times as you need.
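One possible data structure for this milestone is a separate counter for each question: one keyed by hour, one by weekday and one by month. The sketch below assumes the same hypothetical timestamp layout as before, and the function and variable names are mine.

```python
import time
from collections import Counter

TIMESTAMP_FORMAT = "%d/%m/%y, %H:%M"  # assumed layout, e.g. "12/08/19, 21:15"
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def count_chat_patterns(timestamps):
    """Return three Counters: messages per hour, per weekday and per month."""
    by_hour = Counter()
    by_weekday = Counter()
    by_month = Counter()
    for timestamp_text in timestamps:
        parsed = time.strptime(timestamp_text, TIMESTAMP_FORMAT)
        by_hour[parsed.tm_hour] += 1
        by_weekday[WEEKDAY_NAMES[parsed.tm_wday]] += 1  # tm_wday counts Monday as 0
        by_month[parsed.tm_mon] += 1
    return by_hour, by_weekday, by_month


# Hypothetical timestamps taken from the start of three chat lines.
sample_timestamps = ["12/08/19, 21:15", "12/08/19, 21:16", "13/08/19, 09:30"]
by_hour, by_weekday, by_month = count_chat_patterns(sample_timestamps)
print("Busiest hour:", by_hour.most_common(1))
print("Busiest weekday:", by_weekday.most_common(1))
print("Busiest month:", by_month.most_common(1))
```

Keeping the three counters separate makes each question easy to answer on its own. A single nested dictionary would also work, so treat this as one option rather than the answer.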
Caution: Implementing this can be quite tricky. You are likely to spend a majority of your coding time banging your head over broken code.
Remember: "It's not the computer but your code that is at fault." :)
- Explain the code to a friend or use the "Rubber duck technique"
- Pick a friend (or a rubber duck)
- Open the problematic code and explain it to him (/her/it), line by line, slowly and patiently
- Find the problem staring at you, in your face, without any help of your friend (or the duck), as if by magic!
- Add print statements
Although adding such print statements isn't the proper way to debug, I find them incredibly effective at times, especially when I'm working in a text editor like Vim and not in a full-fledged IDE that has a debugger (or when I'm too lazy to learn how to use a debugger :p).
But I have to say, once you learn how to use an IDE debugger, there is no going back..
- Use an IDE debugger
Programming Wisdom: "Debuggers don't remove bugs. They only show them in slow motion." - unknown (18:01 PM - 11 Jan 2018)
As of writing, repl.it doesn't fully support a debugger yet. My favourite IDEs for Python that do support it are PyCharm and VS Code.
A debugger can be so useful that I recommend you switch to one of them and learn how to use its debugger. Trust me, it is totally worth the pain! (Especially now that your code is of considerable complexity.)
Personal advice: I am saying "IDE debugger" because Python also provides a debugger in the standard library, the pdb module, and I suggest that you don't get into using it now.
- Learn RegEx
- Understand Python dictionaries as traditional hashtables: mapping from website name to the number of occurrences
A regular expression is a special text string for describing a search pattern.
You are probably familiar with wildcard notations such as *.txt to find all text files in a file manager. You can think of regular expressions as wildcards on steroids.
"I want every string that is between "https://" and the second / after that, if present. Or else, the first
Here are a few favourite resources to learn Regex:
- RegexOne - an interactive tutorial for learning RegEx
- An Introduction to Regex in Python
- Python's documentation - it refers to them as "a tiny, highly specialized programming language embedded inside Python".
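Here is a small sketch of how a regular expression and a dictionary-like Counter can work together to find the most shared websites. The pattern is one of several that could work, and the sample messages and function name are hypothetical.

```python
import re
from collections import Counter

# Capture everything after http:// or https:// up to the next slash or whitespace.
WEBSITE_PATTERN = re.compile(r"https?://([^/\s]+)")


def count_shared_websites(messages):
    """Map each website name to the number of times it was shared."""
    website_counts = Counter()
    for message in messages:
        for website in WEBSITE_PATTERN.findall(message):
            website_counts[website.lower()] += 1
    return website_counts


# Hypothetical messages for illustration.
sample_messages = [
    "Check this out https://example.com/page/1",
    "and this one too https://example.com/page/2",
    "the docs are at https://docs.python.org/3/library/re.html",
    "no link in this message",
]
website_counts = count_shared_websites(sample_messages)
for website, count in website_counts.most_common():
    print(website, count)
```

The parentheses in the pattern form a capturing group, so findall returns only the website name and leaves out the "https://" prefix.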
I'll let you figure this one out on your own!
You must be using some print statements to print the results of each milestone. Now, it's time to focus on the presentation of those results. Print all the above results in pretty, neat tables.
To do this, you might need to restructure a large portion of the code in order to decouple the print statements from the function definitions (assuming you haven't already been doing it).
- Realize what it means when people advise - "functions should do just one thing"
- Learn to search, install and use 3rd party modules that Python's awesome, vibrant community provides through
- Give a personal touch to the project with the way you design the tables!
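To illustrate what decoupling can look like, here is a sketch that uses only built-in string methods: one function prepares the rows, another turns them into text, and printing happens at the very end. The function names and sample counts are assumptions for illustration.

```python
def build_table_rows(message_counts):
    """Compute the rows only; printing is left to the caller."""
    return [(name, count) for name, count in sorted(message_counts.items())]


def format_table(headers, rows):
    """Return a plain-text table with every column padded to the same width."""
    all_rows = [tuple(headers)] + [tuple(str(cell) for cell in row) for row in rows]
    widths = [max(len(row[i]) for row in all_rows) for i in range(len(headers))]
    lines = []
    for row in all_rows:
        lines.append(" | ".join(cell.ljust(width) for cell, width in zip(row, widths)))
    # Put a separator line between the header and the data rows.
    lines.insert(1, "-+-".join("-" * width for width in widths))
    return "\n".join(lines)


# Hypothetical counts; real values come from the earlier milestones.
rows = build_table_rows({"Paridhi": 2, "Alex": 1})
table = format_table(("Name", "Messages"), rows)
print(table)
```

Because build_table_rows returns plain data, you can later swap format_table for a 3rd-party package such as tabulate without touching the counting code.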
Python's ecosystem has contributors ranging from individual developers to megacorps like Facebook and Google (rich ecosystem, eh? :p). They offer modules and libraries of code to aid in website construction, numeric programming, game development, data science, machine learning, deep learning and well, printing pretty tables.
Now, that's a whole lot of code you don't have to write!
PyPI is the home of all these 3rd-party Python packages. You can find a page for every open-source, 3rd-party package there.
Here are a few things that will get you up to speed with using PyPI:
- You can install every package using a simple terminal command - pip. You can find exactly what you need to type on a package's page on PyPI.
pip install tabulate
- Any good package also has a How To Use guide (or documentation) on its page on PyPI
- Even newbies can publish their experimental packages to PyPI. You should be careful before using them; they may be incomplete or unmaintained. You can check out a package's Release History or its GitHub statistics to determine its credibility.
With this milestone, you will be extending your program to a new case - group chats. Up until this point, you have worked with a direct message chat file with one friend. Now, you will modify your program so that it works with Whatsapp group chat files as well.
- Evaluate your functions. Are you able to reuse at least some of them?
- Feel the benefits of a good coding style and good programming practices
- See the importance of a version control system and learn Git
It will do you well to remember what Brian W. Kernighan says about good software in his book The Practice of Programming:
The basic principles that form the bedrock of good software are simplicity, which keeps programs short and manageable; clarity, which makes sure they are easy to understand, for people as well as machines; generality, which means they work well in a broad range of situations and adapt well as new situations arise; and automation, which lets the machine do the work for us, freeing us from mundane tasks.
Alright, I hope this has been useful for you. You will gain a true understanding of all the mini-lessons in this guide, once you actually dive into doing the project yourself.
Here's some code to give your start a boost:
Don't be afraid of starting out because things will get difficult when you get stuck. That is the adventure; it will feel super cool every time you dig yourself out.
Also, you can ask me or your fellow learners your doubts in the Build To Learn Slack group.
As an ending note, I would like you to remember the words of Jen Simmons as you work on this project (or any other programming project for that matter):
If you are a developer, and you feel bad about not knowing everything, I have one item I want you to memorize:
No one knows everything. No one.
The best coders in the world only know a small fraction of everything there is to know about coding. (17:20 PM - 26 Jul 2018)
The only skill you need is to know 1) how to identify what you don’t know / when you don’t know something; and 2) how to look things up, how to read documentation, how to try & try & try and keep trying while things fail, until they work. That’s literally the job of writing code.
Everything is always changing anyway. So as soon as you “know” it, it’s changed. Knowing what to do when you are stuck and at a loss - that’s the most important skill. Being ok with that scary feeling of not knowing, that’s the job.
Those coders who act like they do know it all - who brag and boast and try to make other people feel small… They are just covering up insecurities about not knowing what they are doing. That arrogance is a lie.
They are likely pretty bad coders because they can’t identify when they don’t know something. They can’t handle it emotionally, so they treat other people crappy.
The better option is to learn to breath through the feeling of incompetence. Identify it as part of the process.
Don’t take it personally. You don’t know what code to write? Yeah, we all don’t know what code to write. We figure it out. Line by line. Bit by bit. It’s broken most of the time. Until it’s not. Then we go write new broken code. Until it’s not broken.
Welcome. That’s the job.
Whatsapp Chat Analyser is one of the 20 cool programming projects that I mentioned in the last post in the series - Build To Learn. If you want me to do a similar guide for any of the others, feel free to comment below or reach out to me directly!
Subscribe to the Build To Learn newsletter to get an email when I do new guides and articles.