Hello everyone! For this year’s Researchers' Night, my professors Pilar García Díaz and Juan Antonio Martínez suggested a few weeks ago that I present a preview of a project I’ve been working on and collaborating with them on for several months. It's not finished yet, but we’ve made enough progress to have a demo to show this year, and that’s the plan!
This post is the English version of the original published on my LinkedIn.
Many of the visitors coming to our university for this event are middle and high school students. Besides checking out the projects on display and what they’re about, they’re probably looking for advice and guidance for when it’s their turn. For that reason, and because I don’t want to make my presentation overly technical, I’ve decided to accompany my live presentation with this post. Here, I’ll summarize what I plan to show to those who stop by, plus an extra section for future students.
My project is entirely written in Python, which is why the extra section I mentioned will include a list of Python concepts and topics that aren’t usually covered in class and that I highly recommend exploring if you want to learn more about the language.
I’ll also include a list of tools at the end, some of which I’ve used in my project, in case you find them useful. With this, I hope you take something valuable away from your visit, so if you’re too lazy to read everything, feel free to skip the explanation of my project and go straight to the end—at least that part might be helpful to you!
Part One: The Project
Our project involves the development of various tools for audio analysis that will allow us to create programs capable of identifying specific sounds. The goal is to be able to distinguish, for example, the sound of a fire or blaze and provide responses based on the potential risks they may pose.
We are exploring several approaches around this idea, and I’ll tell you about the one I’ve been working on. My part of the project focuses on developing a library written in Python, based on the use of audio analysis libraries, to perform mathematical comparisons and return results. My project has several components, but the most important one, and the one I’ve focused on the most so far, is the core. The overall structure of the project looks something like this:
The core of my application features a collection of twenty metrics that I use to extract the characteristics of each audio file. I then perform comparisons using the resulting arrays. To do this, I use different mathematical metrics, which you can find in this section of my project:
With these metrics, I’ve been creating result sheets to try to distinguish sounds and group them by themes. The following images are an example of one of the first ones I made. What you see below is a heat map where the color indicates the similarity between different audio files. The corresponding list of sounds is part of the small sound library I’ve uploaded along with the rest of the project. All the sounds are free to use, and you don’t even need to mention the source—feel free to use them!
Here’s the link to the sound library; the formats are .mp3 and .wav
The sounds are arranged in the table by rows and columns, in alphabetical order. The first audio corresponds to row and column 1, the second to row and column 2, and so on. The value in each cell indicates how similar one audio file is to the one it’s being matched with. As you can see, and as expected, the diagonal of the heat map shows 100% similarity, since those cells compare each audio file with itself.
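To make the table layout concrete, here is a minimal sketch of how such a grid of similarities could be built. It is not the code from my project: the feature vectors are made-up numbers, cosine similarity is just one possible comparison metric chosen for illustration, and the names `cosine_similarity` and `similarity_matrix` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def similarity_matrix(features):
    """Return (names, matrix), with names sorted alphabetically.

    matrix[i][j] is the similarity, in percent, between file i and file j.
    """
    names = sorted(features, key=str.lower)  # case-insensitive order (assumption)
    matrix = [
        [100.0 * cosine_similarity(features[row], features[col]) for col in names]
        for row in names
    ]
    return names, matrix


# Made-up feature vectors standing in for real audio metrics.
features = {
    "thunder.mp3": [0.2, 0.8, 0.5],
    "Fire.mp3": [0.9, 0.1, 0.3],
    "wind.mp3": [0.3, 0.7, 0.6],
}

names, matrix = similarity_matrix(features)
for name, row in zip(names, matrix):
    print(name, [round(value, 1) for value in row])
```

Each file is compared with every other file, so the grid is symmetric, and the diagonal is where a file meets itself.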
Here’s the list of audios compared in the image, as I mentioned earlier, alphabetically ordered in the table from top to bottom and left to right:
- A-fireplace-with-a-crackling-fire.mp3
- An-open-fire-in-fireplace.mp3
- Burning-fireplace.wav
- Crackling-bark-wood-in-a-closed-fireplace.mp3
- Crickets-and-cicadas-at-the-night-San-Francisco-Libre-Nicaragua.mp3
- Crows Picking and Eating.mp3
- Daytime Forrest Bonfire.mp3
- Desert Howling Wind.mp3
- Fire.mp3
- Fire-in-an-open-fireplace.mp3
- Forest Wind Summer.mp3
- Forest-atmosphere-in-village-outskirts.mp3
- Horse Eating Grass.mp3
- light-rain-and-thunder.mp3
- Morning Highway in Distance.mp3
- thunder-distant.mp3
- thunder-distant-2.mp3
- thunder-distant-3.mp3
- Warm-and-toasty-fireplace.wav
- Water Hose on Concrete.mp3
- Water Sizzle.mp3
- waves-on-the-shore.mp3
- Wood-crackling-in-a-fireplay.mp3
- Woodpecker Eating Distant.mp3
All of these are available for download and listening through the link I shared above.
This is the core idea of my project: the goal is to combine my results with the work of others and develop a multi-faceted system for performing measurements and returning similarity results.
If you’d like to know more details, I’ll be happy to share them during the event. And if you didn’t get the chance or still have questions, don’t hesitate to contact me! Just keep in mind I might not respond instantly, haha!
Part Two: Recommendations
In this section I’d like to share a list of Python concepts and tools that I consider essential if you’re interested in studying the language. I’ll start with Python itself.
Python is a vast language with many shortcuts, and it’s easy to get lost in the details without taking the time to understand how it works overall. That’s why most of the concepts I’m sharing here relate to the internal workings of the language. I’m not trying to give a lesson in this post, but rather to leave you with a few breadcrumbs that took me a while to discover and that I would have appreciated finding earlier.
First of all, I highly recommend the official Python documentation. It may sound obvious, but I’m guilty of overlooking things like this, and reading through the documentation helped me discover many things I didn’t know existed.
With that and a couple of YouTube videos, you'll have a general understanding of the language. Once you’ve done that, you should look into Python’s thread and process management, as well as the functioning of one of the language's core mechanisms: the GIL (Global Interpreter Lock). In this post I talk about both topics; if you're only curious about what the GIL is, that might be enough, but if not, I recommend researching further on your own. This video gave me the idea to write the post linked above.
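If you want to see the GIL's effect for yourself, here is a small sketch using only the standard library. The function `count_primes` and the numbers are made up for illustration. On the standard CPython build, only one thread runs Python bytecode at a time, so the threaded version below gives the same results as the plain loop but should not be expected to finish faster for CPU-heavy work like this.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


limits = [2_000, 3_000, 4_000]

# One task after another, in the main thread.
serial = [count_primes(limit) for limit in limits]

# The same tasks handed to a pool of threads.
with ThreadPoolExecutor(max_workers=3) as pool:
    threaded = list(pool.map(count_primes, limits))

print(serial == threaded)
```

`concurrent.futures.ProcessPoolExecutor` offers the same `map` interface but runs the tasks in separate processes, each with its own interpreter, which is the usual route for CPU-bound work. Threads are still a good fit when the program spends most of its time waiting on files or the network.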
Aside from thread and process management, it’s very useful to take advantage of the built-in structures in Python. More than once, I’ve implemented an idea only to realize later that the language already had a built-in solution. Knowing how to use classes, data classes, and other built-in structures, and how to make the most of their features, is fundamental. Here’s another video on this topic that really helped me out at the time.
If you’ve already dealt with graphical interfaces in this or any other language, you know it can be a painful process, especially if you’re using older software. This video covers my favorite tool for building interfaces in Python. I’ve even come across it at work! Highly recommended ;)
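As a taste of what data classes give you for free, here is a short sketch. The `Comparison` class and its values are hypothetical, made up for this example. The `@dataclass` decorator generates the constructor, a readable `repr`, and equality. With `order=True` it also generates the comparison operators, and `frozen=True` makes instances read-only.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Comparison:
    """Result of comparing two audio files (hypothetical example)."""

    similarity: float  # listed first, so ordering compares it first
    first: str
    second: str


results = [
    Comparison(42.5, "fire.mp3", "wind.mp3"),
    Comparison(91.0, "fire.mp3", "fireplace.wav"),
]

best = max(results)  # works because order=True generated __lt__ and friends
print(best)
```

Writing the same class by hand would mean spelling out `__init__`, `__repr__`, `__eq__` and the ordering methods yourself.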
Finally, I’ll recommend another official resource that often gets overlooked when you’re starting out. When you’re just writing small scripts, it’s easy to ignore dependency and library management. I made that mistake: ignoring the versions and compatibility of the libraries you’re using can cause major problems, especially as your project grows in size and dependencies. On PyPI (the Python Package Index), you can check the status of libraries, their requirements, versions, and compatibility with different versions of the language. If you’re thinking about starting a serious project in Python, this site is essential.
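Before comparing against what PyPI lists, it helps to know which versions you actually have installed. Here is a small sketch using the standard library's `importlib.metadata` (Python 3.8 or newer). The helper `installed_version` and the package names are just examples, not the dependencies of my project.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Example package names; swap in whatever your project depends on.
for name in ["numpy", "pip", "surely-not-a-real-package-xyz-123"]:
    print(name, installed_version(name))
```

Writing those versions down in a `requirements.txt` file is a common way to keep a project reproducible as it grows.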
I hope you found this useful and interesting. To wrap up, here’s a list of websites that have been incredibly helpful to me in building and enhancing this and other projects.
If your path someday leads you to the computer science field at the Polytechnic School of Alcalá, remember, choose EOC in English...
See you next time!