DEV Community

Ioan Papuc

Movies Recommendation Software

The program presented below is a recommendation tool that suggests a group of films belonging to a genre chosen by the user. The complete Python source code is available in the GitHub repository: https://github.com/IoanPapuc/recommendation_software.git

First, the user is asked to enter the desired genre and can type either the entire word or just its first characters into the terminal. The categories that match the input are displayed, and the user is asked again to pick one of the available options. This process continues until a single option remains, at which point the user decides either to list the recommended films from that genre or to choose another genre.

Algorithms and Data Structures

The Python code for the movie recommendation software is split into three files to make it easier to work with:

  • dataset.py, which stores the collection of movies, the genres and the corresponding data,
  • algorithms.py, where the algorithms and classes we need are defined,
  • recommendation_software.py, which includes the main functions for running the software.

Now let's go through the most important steps in the development of the program and briefly describe the problems, resources and methods that each one involves.

The first thing the software does is get the desired genre from the user. For this purpose, recommendation_software.py defines the function construct_user_choice(genres_list), where the unsorted list genres_list = ['genre_a', 'genre_b', 'genre_c', 'genre_d', ... ] is located in dataset.py. Inside the function, the user is asked to enter a genre, or just the beginning of its name, into the terminal; the program then prints the list of options to choose from. To find the options that match the input, a simple linear prefix search is performed:

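def linear_search(genres_list, target):
    """Return every genre whose name starts with target."""
    matches = []
    # Check each genre once, in list order.
    for item in genres_list:
        # Same test as comparing target with item[0:len(target)].
        if item.startswith(target):
            matches.append(item)
    return matches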

Although this search takes O(len(genres_list)) time (each comparison looks at no more than len(target) characters), the number of genres stays relatively small as long as subgenres are not considered, so the computing cost is low.
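The narrowing loop described above can be sketched as follows. This is only a minimal illustration, not the code from the repository: the helper names prefix_matches and narrow_genre, the ask parameter (a stand-in for input) and the sample genre list are all assumptions made for the example.

```python
def prefix_matches(genres_list, target):
    """Return every genre that starts with target (same idea as linear_search)."""
    return [genre for genre in genres_list if genre.startswith(target)]


def narrow_genre(genres_list, ask=input):
    """Keep prompting until exactly one genre matches the typed prefix.

    `ask` is a hypothetical hook so the loop can be driven without a terminal.
    """
    while True:
        typed = ask("Type a genre or its first letters: ").strip().lower()
        matches = prefix_matches(genres_list, typed)
        if len(matches) == 1:
            return matches[0]
        if matches:
            print("Matching genres:", ", ".join(matches))
        else:
            print("No genre starts with", repr(typed))


# Sample data, invented for illustration only.
sample_genres = ["comedy", "crime", "drama", "documentary", "horror"]
```

Calling narrow_genre(sample_genres) and typing c lists both comedy and crime; typing cr on the next prompt leaves a single match, so the function returns crime.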

Next, once the user has chosen a genre, the list of films belonging to it is printed by the second function defined in recommendation_software.py: print_movies(genres_list, movies_collection), where movies_collection is the collection of movie data found in dataset.py.

If we store the movie data as lists in movies_collection as below:

movies_collection = [[Movie_Title_1, genre_1, year_1, score_1],
                     [Movie_Title_2, genre_2, year_2, score_2],
                     [Movie_Title_3, genre_3, year_3, score_3],
                     ...
                    ]

the program has to iterate through the entire collection to gather all the movies of a specific genre, which takes O(len(movies_collection)) time. To avoid this, a MoviesCollection hash map class was created in algorithms.py. With a hash map keyed by genre, a lookup is expected to take constant time on average, so printing the movies of a given genre depends mostly on how many movies that genre holds. In the worst case, when the computed hash code of almost every key leads to a collision, it takes len(genres_list) + len(movies_of_certain_genre) steps.
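To illustrate the idea, here is a minimal sketch of a genre-keyed hash map that resolves collisions with separate chaining. The MoviesCollection class from algorithms.py is not reproduced in this post, so the class name GenreHashMap, its methods, the bucket count, the hash function and the sample movies are all assumptions; the real implementation may differ, for example in how it handles collisions.

```python
class GenreHashMap:
    """Hypothetical hash map: genre -> list of movie records (separate chaining)."""

    def __init__(self, bucket_count=16):
        # Each bucket holds [genre, movies] pairs whose hash lands on that index.
        self.buckets = [[] for _ in range(bucket_count)]

    def _index(self, genre):
        # Sum of the byte values: a deliberately simple hash for illustration.
        return sum(genre.encode()) % len(self.buckets)

    def add(self, movie):
        """Store a [title, genre, year, score] record under its genre."""
        genre = movie[1]
        bucket = self.buckets[self._index(genre)]
        for entry in bucket:
            if entry[0] == genre:
                entry[1].append(movie)
                return
        bucket.append([genre, [movie]])

    def movies_of(self, genre):
        """Return the movies of one genre; only a single bucket is scanned."""
        for key, movies in self.buckets[self._index(genre)]:
            if key == genre:
                return movies
        return []


# Sample records, invented for illustration only.
sample_movies = [
    ["Movie A", "drama", 2001, 7.9],
    ["Movie B", "comedy", 2010, 6.5],
    ["Movie C", "drama", 2015, 8.2],
]

collection = GenreHashMap()
for movie in sample_movies:
    collection.add(movie)

drama_titles = [movie[0] for movie in collection.movies_of("drama")]
```

If the map is created with bucket_count=1, every genre collides in the same bucket, so a lookup walks through all the genre entries before reaching the movies. That is the worst case mentioned above.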
