Joerg Rech

Posted on Dec 14, 2024

Weekend Project: Create a Personalized Job Posting Agent

#career #sideprojects #weekend #programming

Introduction

Job hunting can feel like a full-time job in itself. Tailoring searches and scanning multiple job boards every day can be time-consuming and overwhelming. A personalized job posting agent automates this process, saving you valuable time and effort—it’s like having your own digital assistant scouring the web for jobs that fit you.

This tutorial helps you build a personalized job posting agent that queries job postings you're interested in from an API, formats the data, and emails it to you automatically. Whether you’re actively looking for a new role or just want to keep an eye on the market, this project is a practical and fun way to make technology work for you.

Prerequisites

Before diving into the implementation, it’s crucial to prepare your system and set up the tools and services required for this tutorial. This setup will enable you to run the Python script, configure the cron job, and send email notifications.

First, you need to create an account on RapidAPI, a platform that provides access to numerous APIs, including the job postings API we’ll use. Once your account is set up, go to the Daily International Job Postings API, subscribe to the free Basic plan, and obtain your API key. The Basic plan gives you access to 25 free requests with 10 job postings each per month.

Second, you’ll need access to an SMTP-based email provider, like Gmail. SMTP (Simple Mail Transfer Protocol) is the standard protocol for sending emails, and it allows your Python script to communicate with the email server to deliver messages. You'll need the SMTP server address and your credentials to send emails. Alternatively, you can start by writing the job postings to a file and creating a calendar reminder to look at it.

Next, install the necessary Python tools and libraries for your operating system. Use pip to install the third-party requests package for making API calls; smtplib, json, and the other modules used in the code ship with Python's standard library and need no installation. Once all prerequisites are met, verify that your Python environment is functional by running python3 --version.

NOTE: Using programming languages other than Python isn't too hard - RapidAPI shows the core API access for other languages in their Code Snippets selector tab.

Finally, on Linux or macOS, ensure that cron is installed and operational; Windows users can achieve similar functionality using Task Scheduler. These tools will enable the automation of your Python script at specified intervals.

Understanding the Job Postings API

Techmap's Daily International Job Postings API provides comprehensive access to global job data from the last 3 months. It returns structured job postings and supports queries with various filtering parameters to narrow search results such as date of posting, location, occupation, and required skills.

You can find the documentation for constructing API requests on RapidAPI's Playground for the Paginated Search or the API's OpenAPI documentation page. We will use the /api/v2/jobs/search endpoint with the following query parameters:

dateCreated for filtering jobs posted on a specific date (day or month),
countryCode for selecting the country (using ISO 3166 country codes),
city for specifying a particular city,
occupation for the job type (the API matches many synonyms),
skills for required expertise, and
page to paginate through larger lists of jobs.

These parameters help us in building a personalized job posting agent for a specific type of occupation in a specific area. The API returns data in Schema.org Microdata format using JSON, which is ideal for parsing, further processing, or direct use in webpages to improve SEO.

Building a Query with Parameters

To query job postings in Python using the API, you need to structure your request with the appropriate parameters. Consider the following example parameter configuration:

parameters = {
  "dateCreated": "2024-12",    # Filter jobs posted for December 2024
  "countryCode": "de",         # Country code for Germany
  "city":        "Berlin",     # Specific city in Germany
  "occupation":  "Programmer", # Job type
  "skills":      "Java"        # Required skill
}

These parameters filter the job postings down to programmer roles in Berlin, Germany, that require Java expertise. If more than 10 results are found, we add the pagination parameter (page) to fetch the additional results.

These parameters will be used to construct the following query:

/api/v2/jobs/search?dateCreated=2024-12&countryCode=de&city=Berlin&occupation=Programmer&skills=Java
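
If you want to see how the parameter dictionary turns into this query string, here is a minimal sketch using Python's standard urllib.parse module. The helper name build_query is made up for illustration; the requests library does the same encoding for you when you pass params.

```python
from urllib.parse import urlencode

# Same example parameters as above
parameters = {
    "dateCreated": "2024-12",
    "countryCode": "de",
    "city": "Berlin",
    "occupation": "Programmer",
    "skills": "Java",
}

def build_query(path, params):
    """Hypothetical helper: append URL-encoded parameters to a path."""
    return f"{path}?{urlencode(params)}"

query = build_query("/api/v2/jobs/search", parameters)
print(query)
```

Note that urlencode writes spaces as + by default (for example city=New+York); passing quote_via=urllib.parse.quote produces %20 instead.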

Alternatively, you can use other parameters described in the API documentation such as title, workPlace (e.g. remote or hybrid), company, geoPoints, timezones, etc.

// Remote jobs in core US Timezones (UTC-8 to UTC-5):
/api/v2/jobs/search?dateCreated=2023-11-01&page=1&workPlace=remote&timezoneMin=-8&timezoneMax=-5

// Jobs in the Healthcare industry of the United Kingdom:
/api/v2/jobs/search?dateCreated=2023-11-01&page=1&countryCode=uk&industry=healthcare

// Jobs for Java Developers in San Francisco:
/api/v2/jobs/search?dateCreated=2023-11-01&page=1&skills=Java&occupation=java,programmer&geoPointLat=37.757&geoPointLng=-122.449&geoDistance=100mi

// Jobs for JavaScript Developers in New York, USA:
/api/v2/jobs/search?dateCreated=2023-11-01&page=1&skills=JavaScript&countryCode=us&city=New%20York

// Jobs in English language located in Germany:
/api/v2/jobs/search?dateCreated=2023-11-01&page=1&countryCode=de&language=en

NOTE: Please remember that the free Basic plan only gives you 25 free requests per month, so it's probably best to start with our example parameters and then tailor them to your needs. That said, the API is cheap: 100 requests beyond the free ones cost only $5.
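
To get a feeling for how far the free quota stretches, here is a small back-of-the-envelope sketch. It assumes the numbers stated above (25 requests per month, 10 postings per request); the function names are made up for this example.

```python
import math

FREE_REQUESTS_PER_MONTH = 25  # Basic plan quota mentioned above
JOBS_PER_REQUEST = 10         # postings returned per request on the Basic plan

def pages_needed(total_count, page_size=JOBS_PER_REQUEST):
    """Number of requests needed to page through total_count postings."""
    return math.ceil(total_count / page_size)

def requests_per_run(runs_per_month, monthly_quota=FREE_REQUESTS_PER_MONTH):
    """Requests each scheduled run may use without exceeding the quota."""
    return monthly_quota // runs_per_month

print(pages_needed(37))     # requests needed for 37 matching postings
print(requests_per_run(5))  # budget per run in a month with five weekly runs
```

With a weekly schedule and up to five runs in a month, each run can spend five requests, which covers about 50 postings per week.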

Using curl to Call the API

The curl command is an easy way to interact with the API via the terminal and test different queries and parameters. Here’s how you can query the API using curl with the above parameters:

curl --request GET \
  --url 'https://daily-international-job-postings.p.rapidapi.com/api/v2/jobs/search?dateCreated=2024-12&countryCode=de&city=Berlin&occupation=Programmer&skills=Java&page=1' \
  --header 'x-rapidapi-host: daily-international-job-postings.p.rapidapi.com' \
  --header 'x-rapidapi-key: <YOUR_API_KEY>'

After replacing <YOUR_API_KEY> with your unique API key obtained from RapidAPI, this command sends a GET request to the API endpoint, including the query parameters and authentication headers. The response, in JSON format, will contain the job postings matching your criteria including a description of the query that the server understood.

Calling the API with Python

Using Python to call the API offers greater flexibility for processing and integrating the data into your application. Here’s an example code snippet that achieves this:

import requests

# API endpoint and headers
url = "https://daily-international-job-postings.p.rapidapi.com/api/v2/jobs/search"
headers = {
  "x-rapidapi-host": "daily-international-job-postings.p.rapidapi.com",
  "x-rapidapi-key": "<YOUR_API_KEY>"
}

# Query parameters
parameters = {
  "dateCreated": "2024-12",
  "countryCode": "de",
  "city":        "Berlin",
  "occupation":  "Programmer",
  "skills":      "Java"
}

# Make the GET request
response = requests.get(url, headers=headers, params=parameters)

# Check for successful response
if response.status_code == 200:
  job_data = response.json() # Parse JSON response
  print(f"Found {job_data.get('totalCount', 0)} jobs.")
else:
  print(f"Error: {response.status_code} - {response.text}")

The script constructs the request, sends it to the API, and processes the JSON response to extract job postings. By using Python, you can further manipulate and analyze the data, save it to a file, store it in a database, or send notifications.
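
As a rough sketch of what that processing can look like, the snippet below summarizes a response offline. The sample data is invented for illustration; the field names (result, title, company, dateCreated, jsonLD.url) are the ones the full script in this article reads, so check them against a real response before relying on them.

```python
# Invented sample data shaped like the fields used in this article
sample_response = {
    "totalCount": 2,
    "pageSize": 10,
    "result": [
        {
            "title": "Java Developer",
            "company": "Example GmbH",
            "city": "Berlin",
            "dateCreated": "2024-12-10",
            "jsonLD": {"url": "https://example.com/jobs/1"},
        },
        # Deliberately incomplete entry to show the fallbacks
        {"title": "Backend Programmer", "city": "Berlin", "dateCreated": "2024-12-12"},
    ],
}

def summarize(job_data):
    """Hypothetical helper: one summary line per posting, 'N/A' for missing fields."""
    lines = []
    for job in job_data.get("result", []):
        url = job.get("jsonLD", {}).get("url", "N/A")
        lines.append(
            f"{job.get('dateCreated', 'N/A')} | {job.get('title', 'N/A')} | "
            f"{job.get('company', 'N/A')} | {url}"
        )
    return lines

for line in summarize(sample_response):
    print(line)
```

Using .get() with a default keeps the script running when a posting lacks a field.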

The complete picture

Now that we have covered the basics, let's walk through the full implementation of the personalized job posting agent as a Python script. This script fetches job postings from the API based on the specified parameters, deduplicates and sorts the data, and sends an email notification to the user. The main parts of the script are:

Importing Required Libraries: The script begins by importing the necessary Python libraries for handling API requests, email communication, file manipulation, and data processing. These libraries provide the core functionality required to fetch, process, and email job postings.
Setting Up Variables and Configuration: To personalize the job search, the script defines variables for the API query parameters, email configuration, and file paths. These variables can be easily adjusted to suit the user’s requirements.
Fetching Job Postings with Pagination: The script queries the API in a loop to handle pagination, ensuring all relevant job postings are fetched. Results are written to a temporary file for further processing.
Deduplicate and sort Database File: After fetching the job postings, the script processes the data to remove duplicates and sorts the entries by the dateCreated field in descending order. The cleaned data is saved to a final JSON file for record-keeping.
Sending the Email: Finally, the script sends an email containing the job postings to the specified recipient. The email content is read from the prepared file, and the SMTP library is used to send the message.
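
The deduplicate-and-sort step can be tried in isolation before running the whole script. This is a minimal sketch on invented sample data; the function name dedupe_and_sort is made up here, and it treats two postings as duplicates only when all of their fields are identical.

```python
import json

def dedupe_and_sort(jobs):
    """Drop exact duplicates, then sort newest first by dateCreated."""
    # Serializing with sort_keys=True gives identical postings the same key,
    # regardless of the order of their fields.
    unique = {json.dumps(job, sort_keys=True): job for job in jobs}
    # Postings without a dateCreated sort last instead of raising an error.
    return sorted(unique.values(), key=lambda job: job.get("dateCreated", ""), reverse=True)

# Invented sample data
jobs = [
    {"title": "A", "dateCreated": "2024-12-01"},
    {"dateCreated": "2024-12-01", "title": "A"},  # same posting, different key order
    {"title": "B", "dateCreated": "2024-12-05"},
    {"title": "C"},                               # no date
]

for job in dedupe_and_sort(jobs):
    print(job)
```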

Combining all the steps above, the full script for the personalized job posting agent looks as follows. It is organized into clearly separated sections and can be customized easily by changing the query parameters or the email configuration.

NOTE: to run this script you need to edit the configuration and replace the placeholders for EMAIL_TO, SMTP_* info, API_KEY, city, occupation, and skills.
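
If you would rather not store secrets in the script file, one option is to read them from environment variables. The following is only a sketch: the variable names (JOB_AGENT_API_KEY and so on) and both helper functions are made up for this example and are not part of the script below.

```python
import os

def load_config(env=None):
    """Read settings from environment variables, falling back to placeholders."""
    env = os.environ if env is None else env
    return {
        "api_key": env.get("JOB_AGENT_API_KEY", "<YOUR_API_KEY>"),
        "smtp_from": env.get("JOB_AGENT_SMTP_FROM", "<SMTP_FROM_ACCOUNT_EMAIL>"),
        "smtp_pass": env.get("JOB_AGENT_SMTP_PASS", "<SMTP_PASSWORD>"),
        "email_to": env.get("JOB_AGENT_EMAIL_TO", "recipient@example.com"),
    }

def missing_settings(config):
    """Names of settings that still hold a <PLACEHOLDER> value."""
    return sorted(
        name for name, value in config.items()
        if value.startswith("<") and value.endswith(">")
    )

config = load_config()
if missing_settings(config):
    print(f"Please configure: {', '.join(missing_settings(config))}")
```

Checking for leftover placeholders up front gives a clearer error than a failed API call or SMTP login later on.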

File: fetch_jobs_and_email.py

################################
# Importing Required Libraries #
################################

import os
import json
import requests
import smtplib
from email.mime.text import MIMEText
from datetime import datetime
from time import sleep
from operator import itemgetter

##########################################
# Setting Up Variables and Configuration #
##########################################

# Current date in "yyyy-MM" format - "yyyy-MM-dd" is possible
DATE_CREATED = datetime.now().strftime("%Y-%m")

# Email Configuration
EMAIL_TO = "recipient@example.com"
SMTP_SERVER = "smtp.gmail.com"
SMTP_FROM = "<SMTP_FROM_ACCOUNT_EMAIL>"
SMTP_PASS = "<SMTP_PASSWORD>"
EMAIL_SUBJECT = "My Daily Job Postings"
EMAIL_FILE = "techmap_jobs_email.txt"

# Prepare the email file
with open(EMAIL_FILE, "w") as email_file:
  email_file.write(f"Job Postings in {DATE_CREATED} for you:\n")
  email_file.write("======================================\n\n")

JOB_FILE_TEMP = f"techmap_jobs_{DATE_CREATED}_all.json"
JOB_FILE = f"techmap_jobs_{DATE_CREATED}.json"

# API details
MAX_REQUESTS_PER_MONTH = 25
MAX_REQUESTS_PER_CALL = 2 # use 2 for tests and 5 for weekly execution
API_ENDPOINT = "https://daily-international-job-postings.p.rapidapi.com/api/v2/jobs/search"
API_HOST = "daily-international-job-postings.p.rapidapi.com"
API_KEY = "<YOUR_API_KEY>"

# Query parameters
parameters = {
  "dateCreated": DATE_CREATED,
  "countryCode": "de",
  "city": "Berlin",
  "occupation": "Programmer",
  "skills": "Java"
}

#########################################
# Fetching Job Postings with Pagination #
#########################################

print(f"Fetching job postings for date: {DATE_CREATED}")

PAGE = 1
TOTAL_COUNT = 0
# Send at most MAX_REQUESTS_PER_CALL requests (Python has no ++ operator)
for _ in range(MAX_REQUESTS_PER_CALL):
  # Add pagination to parameters
  parameters["page"] = PAGE
  print(f"Querying: {API_ENDPOINT} with parameters {parameters}")

  try:
    response = requests.get(API_ENDPOINT, headers={
      "x-rapidapi-host": API_HOST,
      "x-rapidapi-key": API_KEY
    }, params = parameters)
    response.raise_for_status()
  except requests.RequestException as e:
    print(f"ERROR: Failed to fetch jobs (Maybe the API_KEY): {e}")
    break

  data = response.json()
  TOTAL_COUNT = data.get("totalCount", 0)
  page_size = data.get("pageSize", 0)
  results = data.get("result", [])
  print(f" Found {len(results)} jobs of {TOTAL_COUNT} on page: {PAGE}")

  if not results:
    print("No more jobs found. Stopping.")
    break

  # Save jobs to temporary file
  with open(JOB_FILE_TEMP, "a") as temp_file:
    for job in results:
      temp_file.write(json.dumps(job) + "\n")

  # Append this page's jobs (newest first) to the email file
  with open(EMAIL_FILE, "a") as email_file:
    for job in sorted(results, key=lambda x: x.get("dateCreated", ""), reverse = True):
      email_file.write(
        f"Title: {job.get('title', 'N/A')}\n"
        f"Company: {job.get('company', 'N/A')}\n"
        f"City: {job.get('city', 'N/A')}\n"
        f"Posted On: {job.get('dateCreated', 'N/A')}\n"
        f"URL: {job.get('jsonLD', {}).get('url', 'N/A')}\n\n"
      )

  # Stop if there are no more results
  if len(results) < page_size:
    break

  PAGE += 1
  if PAGE > MAX_REQUESTS_PER_CALL:
    print("Max requests reached. Exiting.")
    break
  # Sleep due to throttled access to API
  sleep(1)

######################################
# Deduplicate and sort Database File #
######################################
try:
  with open(JOB_FILE_TEMP, "r") as temp_file:
    job_list = [json.loads(line) for line in temp_file]
  # Deduplicate
  deduplicated_jobs = {json.dumps(job, sort_keys=True): job for job in job_list}.values()
  # Sort by dateCreated in reverse order
  sorted_jobs = sorted(deduplicated_jobs, key=lambda job: job.get("dateCreated", ""), reverse = True)
  # Save the sorted, deduplicated results to the final job file
  with open(JOB_FILE, "w") as final_file:
    for job in sorted_jobs:
      final_file.write(json.dumps(job) + "\n")
  print(f"Successfully deduplicated and sorted: {JOB_FILE}")
except (FileNotFoundError, json.JSONDecodeError) as e:
  print(f"ERROR: Error processing job files: {e}")

#####################
# Sending the Email #
#####################
print(f"Sending email to {EMAIL_TO}")
try:
  with open(EMAIL_FILE, "r") as email_content:
    message = MIMEText(email_content.read())
    message["Subject"] = EMAIL_SUBJECT
    message["From"] = SMTP_FROM
    message["To"] = EMAIL_TO
  with smtplib.SMTP(SMTP_SERVER, 587) as server:
    server.starttls()
    server.login(SMTP_FROM, SMTP_PASS) # Update with your credentials
    server.send_message(message)
  print("Email sent successfully.")
except Exception as e:
  print(f"ERROR: Failed to send email (maybe wrong SMTP info): {e}")

print("Done fetching job postings.")
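
Before pointing the script at a real SMTP server, it can help to preview the message it would send. Here is a small sketch that builds the same kind of MIMEText message and prints it instead of sending it; the function name build_message and the example addresses are made up for illustration.

```python
from email.mime.text import MIMEText

def build_message(body, subject, sender, recipient):
    """Build a plain-text email message without sending it."""
    message = MIMEText(body)
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = recipient
    return message

preview = build_message(
    "Job Postings in 2024-12 for you:\n",
    "My Daily Job Postings",
    "me@example.com",
    "recipient@example.com",
)
# Print headers and body exactly as they would go over the wire
print(preview.as_string())
```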

Setting Up a Weekly Schedule

To automate the Python script to run on a weekly schedule, you’ll need to configure a task scheduler specific to your operating system. Here’s how to do it on macOS, Linux, and Windows.

Setting up the Cron Job on macOS or Linux

On macOS and Linux, you can use cron, a built-in task scheduler, to automate your script. Follow these steps:

Open the Terminal: On macOS, press Command + Space to open the Spotlight search bar, type Terminal, and hit Enter. On Linux, open a terminal window.
Edit the Crontab: Type crontab -e to open the cron configuration file. If prompted, choose a text editor such as nano or vim.
Add the Cron Job: To schedule the Python script to run weekly, add the line below. This runs the script every Saturday at 6:00 AM. Replace /usr/local/bin/python3 with the path to your Python interpreter and /path/to/your_script.py with the full path to your Python script.

0 6 * * 6 /usr/local/bin/python3 /path/to/your_script.py

Save and Exit: Save the file (Ctrl + O in nano; Esc and ':x!' in vim) and exit (Ctrl + X in nano).
Finally, you can check if the cron job is active with crontab -l.
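
If the five schedule fields in the crontab line look cryptic, this short sketch splits the line into named parts. The field order (minute, hour, day of month, month, day of week) is standard cron; the helper name parse_cron_line is made up, and it only labels the fields without validating them.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")

def parse_cron_line(line):
    """Split a crontab line into its five schedule fields and the command."""
    parts = line.split(None, 5)  # at most 5 splits: 5 fields plus the command
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    return dict(zip(CRON_FIELDS, parts[:5])), parts[5]

schedule, command = parse_cron_line(
    "0 6 * * 6 /usr/local/bin/python3 /path/to/your_script.py"
)
print(schedule)
print(command)
```

Here minute 0 and hour 6 mean 6:00 AM, the two asterisks mean every day of the month and every month, and day_of_week 6 is Saturday.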

Setting up a Scheduled Task on Windows

Windows uses the Task Scheduler for automation. Here’s how to set up a weekly task:

Open Task Scheduler: Search for Task Scheduler in the Start menu and open it.
Create a New Task: Click on Create Basic Task in the right-hand menu, name your task (e.g., “Weekly Job Agent Python Script”) and click Next.
Set the Schedule: Choose Weekly and specify the day of the week and time you want the script to run.
Configure the Action: Choose Start a Program and enter the path to your Python executable (e.g., C:\Python39\python.exe) in the Program/Script field. In the Add Arguments field, enter the full path to your script (e.g., C:\path\to\your_script.py).
Finish the Setup: Confirm and save the task. Your script will now run automatically as scheduled.

Conclusion

In this tutorial, we developed a fully functional and personalized job posting agent in Python to automate the retrieval of job postings using an API, customized the results for specific criteria, and sent the data as a formatted email notification. The script demonstrates how to query different API parameters such as location, occupation, and skills. With just a few adjustments, you can fetch data tailored to a wide range of use cases, from remote jobs in specific time zones to opportunities in specific industries or geographic regions.

Beyond sending notifications, the collected job data opens doors to more advanced applications. For instance, storing the retrieved job postings in a database enables the creation of a niche job board. You could enhance this board by integrating search and filtering features, allowing users to find tailored job opportunities. Additionally, analyzing this data for trends—such as demand for specific skills, regional salary variations, or emerging industries—could provide valuable insights for businesses and job seekers alike.
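
As a first step toward that kind of analysis, here is a minimal sketch that counts postings per city or company with collections.Counter. The sample data is invented, and the helper name count_by is made up; in practice you would load the postings from the JSON file the script writes.

```python
from collections import Counter

def count_by(jobs, field):
    """Count postings per value of the given field ('N/A' when missing)."""
    return Counter(job.get(field, "N/A") for job in jobs)

# Invented sample data using fields the script already reads
jobs = [
    {"title": "Java Developer", "company": "Example GmbH", "city": "Berlin"},
    {"title": "Backend Programmer", "company": "Sample AG", "city": "Berlin"},
    {"title": "Java Engineer", "company": "Example GmbH", "city": "Potsdam"},
]

print(count_by(jobs, "city").most_common())
print(count_by(jobs, "company").most_common(1))
```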

I hope you found value in this tutorial and are inspired to try it out as a personal project over a weekend or holidays. It’s an excellent opportunity to enhance your Python skills, experiment with API integration, and explore the world of automation.

DEV Community