Analyzing likes using the Instagram API with Python - part 3
Some time ago, we started working on an app to analyze likes on Instagram posts. We want to answer the question: What percentage of followers like Instagram posts? This will tell us whether the account should focus on gaining new followers or improving content since its current followers might not be interested. The result will be a bar chart for each post showing the ratio of all likes to likes from followers.
In our previous post How to analyze Instagram likes – part 2, we created an API client in Python. We decided to use the Instagram Scraper 2023 API. We wrote a client to fetch the data needed for our task. Today, it’s time to finish our app.
Save requests to Instagram API by adding file caching
Let’s revisit our rapidapi_client. We want to cache every call to the endpoint. We don’t want to waste requests while working, so it’s worth returning previously fetched data.
Currently, we have three functions that we will add caching to. Let’s write the caching method first:
# Imports needed at the top of the client module
import os
import pickle
from typing import Any, Optional

import requests

# Methods added to the RapidApiClient class
def __get_cache_file_path(self, endpoint: str, identifier: str, count: int, end_cursor: Optional[str]) -> str:
    """
    Generate a file path for caching based on the endpoint, identifier, count, and end_cursor.
    """
    filename = f"{endpoint}_{identifier}_{count}_{self.__get_end_cursor(end_cursor)}.pkl"
    return os.path.join(self.cache_dir, filename)

def __get_or_set_cache(self, endpoint: str, identifier: str, count: int, end_cursor: Optional[str], url: str) -> Any:
    """
    Retrieve data from the cache if it exists; otherwise, fetch data from the URL, cache it, and return the data.
    :param endpoint: The API endpoint being queried.
    :param identifier: The identifier for the request (e.g., user ID or post shortcode).
    :param count: The number of items requested.
    :param end_cursor: The pagination cursor for the request.
    :param url: The URL to fetch data from if not cached.
    :return: The data retrieved from the cache or fetched from the URL.
    """
    cache_file_path = self.__get_cache_file_path(endpoint, identifier, count, end_cursor)
    # Check if the cache file exists
    if os.path.exists(cache_file_path):
        # Load and return data from cache
        with open(cache_file_path, 'rb') as cache_file:
            return pickle.load(cache_file)
    # Make the API request if cache does not exist
    response = requests.get(url, headers=self.headers)
    # Raise on HTTP errors so that a failed response is never written to the cache
    response.raise_for_status()
    data = response.json()
    # Save the fetched data to the cache
    with open(cache_file_path, 'wb') as cache_file:
        pickle.dump(data, cache_file)
    return data
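To see the caching behavior in isolation, here is a minimal standalone sketch of the same get-or-set pattern. It is not the client code itself: the names cache_path, get_or_set_cache, and fake_fetch are hypothetical, the API call is replaced by a fake function, and the cache key is hashed on the assumption that a pagination cursor might contain characters that are awkward in file names.

```python
import hashlib
import os
import pickle
import tempfile
from typing import Any, Callable, List


def cache_path(cache_dir: str, *key_parts: Any) -> str:
    """Build a filesystem-safe cache path by hashing the key parts (hypothetical helper)."""
    raw_key = "_".join(str(part) for part in key_parts)
    digest = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, f"{digest}.pkl")


def get_or_set_cache(path: str, fetch: Callable[[], Any]) -> Any:
    """Return the pickled value stored at path, or call fetch, store its result, and return it."""
    if os.path.exists(path):
        with open(path, "rb") as cache_file:
            return pickle.load(cache_file)
    data = fetch()
    with open(path, "wb") as cache_file:
        pickle.dump(data, cache_file)
    return data


calls: List[int] = []  # Records how many times the fake API was hit


def fake_fetch() -> dict:
    """Stand-in for a real API request (made-up response)."""
    calls.append(1)
    return {"data": {"likes": []}}


with tempfile.TemporaryDirectory() as tmp_dir:
    path = cache_path(tmp_dir, "postlikes", "ABC123", 50, None)
    first = get_or_set_cache(path, fake_fetch)   # Cache miss: calls fake_fetch
    second = get_or_set_cache(path, fake_fetch)  # Cache hit: reads the pickle file
```

The second call never reaches fake_fetch, which is exactly the saving we want for real API requests. Keep in mind that pickle files should only be loaded from a cache directory you control.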
We also need to change the constructor to create a folder for caching files:
def __init__(self, cache_dir: str = 'cache'):
    self.headers = {
        'x-rapidapi-key': RAPIDAPI_KEY,
        'x-rapidapi-host': RAPIDAPI_HOST
    }
    self.cache_dir = cache_dir
    # Create the cache folder if it does not exist yet
    os.makedirs(self.cache_dir, exist_ok=True)
Everything looks fantastic! Now, we need to change the methods that communicate with the API:
def get_user_posts(self, userid, count, end_cursor=None):
    url = self.__get_api_url(f"/userposts/{userid}/{count}/{self.__get_end_cursor(end_cursor)}")
    return self.__get_or_set_cache("userposts", userid, count, end_cursor, url)

def get_post_likes(self, shortcode, count, end_cursor=None):
    url = self.__get_api_url(f"/postlikes/{shortcode}/{count}/{self.__get_end_cursor(end_cursor)}")
    return self.__get_or_set_cache("postlikes", shortcode, count, end_cursor, url)

def get_user_followers(self, userid, count, end_cursor=None):
    url = self.__get_api_url(f"/userfollowers/{userid}/{count}/{self.__get_end_cursor(end_cursor)}")
    return self.__get_or_set_cache("userfollowers", userid, count, end_cursor, url)
Nothing extraordinary here: we just call __get_or_set_cache, which returns the cached content if it is found instead of querying the Instagram API.
We are now protected against wasting requests. Let’s move on!
Creating plain old Python objects (POPO)
POPOs are simple Python objects with no methods or attributes other than those explicitly defined. We need them to pass data between classes.
Let’s think about the POPOs we need. We must gather data from the API and merge it into one object to pass it further. So, we need a simple PostData class:
from typing import List

class PostData:
    def __init__(self, post_id, post_likers: List[str], followers: List[str]):
        self.post_id = post_id
        self.post_likers = post_likers
        self.followers = followers
Another POPO will hold the analysis result. Let’s name this class PostResult:
class PostResult:
    def __init__(self, post_id, all_likes_count, likers_likes_count):
        self.all_likes_count = all_likes_count
        self.likers_likes_count = likers_likes_count
        self.post_id = post_id
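As an aside, the same two containers could also be written with the standard library’s dataclasses module, which generates the constructor and a readable repr automatically. This is only an alternative sketch, not the code used in the rest of this post, and the field types (string IDs, integer counts) are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PostData:
    post_id: str  # Assumed to be a string ID
    post_likers: List[str]
    followers: List[str]


@dataclass
class PostResult:
    post_id: str  # Assumed to be a string ID
    all_likes_count: int
    likers_likes_count: int


# Made-up values, just to show how an instance is created
example = PostResult(post_id="123", all_likes_count=10, likers_likes_count=4)
```

Dataclasses also compare by value, which can be handy when writing tests for the analyzer.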
Let’s create a list of PostData objects built from Instagram API data.
# Create an instance of the RapidApiClient
api_client = RapidApiClient()
end_cursor = None
# Replace with your actual Instagram account ID
ACCOUNT_ID = '__PASTE_ACCOUNT_ID__'
# Retrieve followers for the account (maximum 50 followers for testing purposes)
followers = api_client.get_user_followers(ACCOUNT_ID, 50)
# Extract usernames of the followers
followers_usernames = [user['username'] for user in followers["data"]["user"]]
# Initialize an empty list to store post data
posts_data: List[PostData] = []
PAGE_LIMIT = 1  # Limit the number of pages to retrieve for testing purposes
for _ in range(PAGE_LIMIT):
    # Retrieve user posts (maximum 5 posts per page for testing purposes)
    posts = api_client.get_user_posts(ACCOUNT_ID, 5, end_cursor)
    data = posts["data"]
    edges = data["edges"]  # Extract the posts data
    for edge in edges:
        node = edge["node"]
        post_id = node["id"]  # Extract the post ID
        # Retrieve likes for the post (maximum 50 likes for testing purposes)
        post_likes = api_client.get_post_likes(node["shortcode"], 50)
        # Extract usernames of the users who liked the post
        post_likers = [like['username'] for like in post_likes["data"]["likes"]]
        # Create a PostData object and add it to the posts_data list
        posts_data.append(PostData(post_id, post_likers, followers_usernames))
    # Stop once the last page has been processed; checking this before
    # reading the edges would skip the posts on the final page
    if not data["next_page"]:
        break
    end_cursor = data["end_cursor"]  # Update the end_cursor for the next page
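The paging logic is easy to get subtly wrong, so here is a small sketch that exercises it without touching the API. The response shape (data, edges, next_page, end_cursor) follows the code above, while FAKE_PAGES, fake_get_user_posts, and collect_post_ids are hypothetical names with made-up data. The one detail worth noticing is that each page’s edges are read before next_page is checked, so the last page is not dropped.

```python
from typing import Any, Callable, Dict, List, Optional

# Made-up two-page response set, keyed by the cursor that requests each page
FAKE_PAGES: Dict[Optional[str], Dict[str, Any]] = {
    None: {"data": {"edges": [{"node": {"id": "1"}}, {"node": {"id": "2"}}],
                    "next_page": True, "end_cursor": "c1"}},
    "c1": {"data": {"edges": [{"node": {"id": "3"}}],
                    "next_page": False, "end_cursor": None}},
}


def fake_get_user_posts(end_cursor: Optional[str]) -> Dict[str, Any]:
    """Stand-in for the API client call (hypothetical)."""
    return FAKE_PAGES[end_cursor]


def collect_post_ids(get_page: Callable[[Optional[str]], Dict[str, Any]],
                     page_limit: int) -> List[str]:
    """Walk up to page_limit pages and return every post ID seen."""
    post_ids: List[str] = []
    end_cursor: Optional[str] = None
    for _ in range(page_limit):
        data = get_page(end_cursor)["data"]
        # Read this page's posts first...
        post_ids.extend(edge["node"]["id"] for edge in data["edges"])
        # ...and only then decide whether another page exists
        if not data["next_page"]:
            break
        end_cursor = data["end_cursor"]
    return post_ids


all_ids = collect_post_ids(fake_get_user_posts, page_limit=5)
first_page_only = collect_post_ids(fake_get_user_posts, page_limit=1)
```

With a page limit of 1, only the first page is read, which matches the testing setup used in this post.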
At this point, posts_data contains a list of PostData objects composed of post IDs, lists of likers, and lists of followers. Great! We have the data ready; it’s time to feed it to the analyzer.
Analyzing data from the Instagram API
Let’s think about what such an analyzer should do. Essentially, it should count, just count 😉 But what exactly?
- The total number of likes on a given post
- The number of likes from followers
This is achieved by the following code:
from typing import List

class LikesAnalyzer:
    def __init__(self, posts_data: List[PostData]):
        """
        Initialize the LikesAnalyzer with a list of PostData objects.
        :param posts_data: List of PostData objects containing data for each post.
        """
        self.posts_data = posts_data

    def get_analysis(self) -> List[PostResult]:
        """
        Analyze the likes data to determine the total likes and the likes from followers for each post.
        :return: A list of PostResult objects containing the analysis results for each post.
        """
        results: List[PostResult] = []  # Initialize an empty list to store the analysis results
        # Iterate through each PostData object in the posts_data list
        for p in self.posts_data:
            all_likes_count = len(p.post_likers)  # Count the total number of likes for the post
            likers_likes_count = 0  # Initialize the count for likes from followers
            # Iterate through each liker of the post
            for liker in p.post_likers:
                # Check if the liker is also a follower
                if liker in p.followers:
                    likers_likes_count += 1  # Increment the count if the liker is a follower
            # Create a PostResult object with the post ID, total likes, and likes from followers
            results.append(PostResult(p.post_id, all_likes_count, likers_likes_count))
        return results  # Return the list of PostResult objects
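For accounts with many followers, checking each liker against a list gets slow, because every lookup scans the whole list. A possible variation, sketched here with a hypothetical count_likes helper and made-up usernames, converts the followers to a set first so that each membership check is fast.

```python
from typing import Iterable, Tuple


def count_likes(post_likers: Iterable[str], followers: Iterable[str]) -> Tuple[int, int]:
    """Return (total likes, likes from followers) for a single post."""
    likers = list(post_likers)
    follower_set = set(followers)  # Set lookups avoid scanning the whole follower list
    from_followers = sum(1 for liker in likers if liker in follower_set)
    return len(likers), from_followers


# Made-up usernames: bob and carl are followers, anna is not
total, from_followers = count_likes(["anna", "bob", "carl"], ["bob", "carl", "dave"])
```

Here total is 3 and from_followers is 2, the same numbers the nested loop in LikesAnalyzer would produce for this input.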
The get_analysis function returns a list of PostResult objects. Yes, these are the results of our analysis. Such dry results don’t tell us much. Let’s make a chart out of them!
Displaying the chart based on data from the Instagram API
I propose the class name PostLikesPlotter to keep it simple. The best representation for the results will be a bar chart. It will immediately show on one bar how many followers liked a given post.
import matplotlib.pyplot as plt
from typing import List

# Constants for the bar chart
BAR_WIDTH = 0.5  # Width of the bars in the chart
ALL_LIKES_COLOR = 'blue'  # Color of the bars representing all likes
LIKERS_LIKES_COLOR = 'green'  # Color of the bars representing likes from followers

class PostLikesPlotter:
    def plot_analysis(self, results: List[PostResult]) -> None:
        """
        Plot the analysis of post likes, showing total likes and likes from followers for each post.
        :param results: A list of PostResult objects containing analysis results for each post.
        """
        # Extract post IDs (as strings, so they are plotted as categories),
        # total likes counts, and likes-from-followers counts from the results
        post_ids = [str(result.post_id) for result in results]
        all_likes_counts = [result.all_likes_count for result in results]
        likers_likes_counts = [result.likers_likes_count for result in results]
        # Create a figure and axis for the bar chart
        fig, ax = plt.subplots()
        # Plot the bars for total likes
        ax.bar(post_ids, all_likes_counts, BAR_WIDTH, color=ALL_LIKES_COLOR, label='All Likes')
        # Plot the bars for likes from followers at the same x positions,
        # so that they overlay the total likes bars exactly
        ax.bar(post_ids, likers_likes_counts, BAR_WIDTH, color=LIKERS_LIKES_COLOR, label='Likes from Followers')
        # Set the labels and title of the chart
        ax.set_xlabel('Post ID')
        ax.set_ylabel('Likes Count')
        ax.set_title('Likes Analysis per Post')
        # Move the legend outside the bar chart area
        ax.legend(loc='upper left', bbox_to_anchor=(1, 1))
        # Adjust the appearance of the x-axis labels
        plt.setp(ax.get_xticklabels(), fontsize=8, rotation=45, ha='right')
        # Make room for the rotated labels and the outside legend
        fig.tight_layout()
        # Display the bar chart
        plt.show()
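Since the question we started with is about percentages, it can also help to print one number per post next to the chart. The sketch below uses a hypothetical follower_share helper and made-up counts; posts with no likes are reported as 0% to avoid dividing by zero.

```python
from typing import Dict, List, Tuple


def follower_share(all_likes_count: int, follower_likes_count: int) -> float:
    """Return the percentage of a post's likes that came from followers."""
    if all_likes_count == 0:
        return 0.0  # No likes at all, so there is nothing to divide
    return 100.0 * follower_likes_count / all_likes_count


# Made-up results: (post ID, all likes, likes from followers)
sample: List[Tuple[str, int, int]] = [("post_a", 50, 20), ("post_b", 40, 30), ("post_c", 0, 0)]

shares: Dict[str, float] = {
    post_id: follower_share(total, from_followers)
    for post_id, total, from_followers in sample
}

for post_id, share in shares.items():
    print(f"{post_id}: {share:.1f}% of likes come from followers")
```

Remember that with the testing limits used in this post (50 followers and 50 likes per post), these percentages describe only a sample of the account’s data.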
Summary
We have a working application written in Python for Instagram data analysis! For those who want to download the source code, visit our website UseMyApi.com where you will find a link to GitHub with the code.
If you liked the post ❤️ leave a comment or any reaction. If you need any other application or changes to the current code, you can "hire" me for programming work. Feel free to contact me!
All the best to you all!