DEV Community

IntelliTools

Automate Hacker News Job Tracking with Python Script

import requests
from bs4 import BeautifulSoup
import csv

# Function to fetch Hacker News job postings
def fetch_hn_jobs():
    url = 'https://news.ycombinator.com/jobs'
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail early on HTTP errors
    soup = BeautifulSoup(response.text, 'html.parser')
    jobs = []

    # Each posting is a table row with class "athing"; the title and
    # link live in an anchor inside the "titleline" span
    for item in soup.select('.athing'):
        anchor = item.select_one('.titleline > a')
        if anchor is None:
            continue  # skip rows without a title link
        jobs.append({
            'title': anchor.get_text(strip=True),
            'link': anchor['href'],
        })

    return jobs

The fetch_hn_jobs function above demonstrates how to programmatically extract job postings from Hacker News. It uses the requests and BeautifulSoup libraries to fetch and parse the HTML of the jobs page. Hacker News does not mark up company, role, and location as separate elements, so the script captures the title line (which typically contains the company and role) along with the posting link. This eliminates manual copy-paste and ensures the data is captured consistently.
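To see the selector logic in isolation, you can run BeautifulSoup against a static string. The snippet below is a hand-written stand-in for one row of HN's jobs table, not a captured page:

```python
from bs4 import BeautifulSoup

# Minimal stand-in for one row of the HN jobs table
html = '''
<tr class="athing">
  <td><span class="titleline">
    <a href="https://example.com/job">Acme (YC W20) Is Hiring Engineers</a>
  </span></td>
</tr>
'''

soup = BeautifulSoup(html, 'html.parser')
# Same selector path the script uses on the live page
anchor = soup.select_one('.athing .titleline > a')
print(anchor.get_text(strip=True))  # Acme (YC W20) Is Hiring Engineers
print(anchor['href'])               # https://example.com/job
```

Note that html.parser is lenient about a bare `<tr>` outside a `<table>`, which makes it convenient for testing fragments like this.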

# Function to export jobs to CSV
def export_to_csv(jobs, filename='hn_jobs.csv'):
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        fieldnames = ['title', 'link']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(jobs)

Once you've fetched the job data, the export_to_csv function writes it to a CSV file. The csv module handles quoting and encoding, so the output imports cleanly into spreadsheet tools or job tracking systems for downstream use.
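As a quick round-trip check (the filename and job data here are illustrative, not from a real fetch), the exported file can be read back with csv.DictReader:

```python
import csv
import os
import tempfile

# Illustrative data standing in for fetch_hn_jobs() output
jobs = [{'title': 'Example Co Is Hiring', 'link': 'https://example.com/job'}]

path = os.path.join(tempfile.gettempdir(), 'hn_jobs_demo.csv')
with open(path, 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['title', 'link'])
    writer.writeheader()
    writer.writerows(jobs)

# Read the file back to confirm the structure survived
with open(path, newline='', encoding='utf-8') as f:
    rows = list(csv.DictReader(f))
print(rows[0]['title'])  # Example Co Is Hiring
```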

# Main execution block
if __name__ == '__main__':
    jobs = fetch_hn_jobs()
    export_to_csv(jobs)
    print(f"Exported {len(jobs)} job postings to 'hn_jobs.csv'")

This script is designed to be run as a standalone Python program. When executed, it fetches the latest Hacker News job postings, processes the data, and exports it to a CSV file. This automation saves time by eliminating the need for manual data entry or copy-paste workflows.

The script is built with real-world use cases in mind. As written, it fetches the first page of listings and raises an exception on HTTP errors via raise_for_status; pagination and retry logic can be layered on top if you need a complete dataset. This is particularly useful for remote developers and job seekers who want to track new job postings without spending hours manually scraping the site.
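Pagination can be handled by following HN's "More" link at the bottom of each page, which is an anchor with class morelink. The helper below is a sketch I'm adding for illustration (next_page_url is not part of the original script), shown against a static snippet rather than a live fetch:

```python
from bs4 import BeautifulSoup

# Hypothetical helper: given one page's HTML, return the relative URL
# of the next page (HN's "More" link has class "morelink"), or None.
def next_page_url(html):
    soup = BeautifulSoup(html, 'html.parser')
    more = soup.select_one('a.morelink')
    return more['href'] if more else None

# Static snippet standing in for a fetched page
page = '<a class="morelink" href="jobs?next=12345">More</a>'
print(next_page_url(page))  # jobs?next=12345
```

A crawl loop would call this after each fetch and stop when it returns None, ideally with a short sleep between requests to stay polite to the server.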

To get started, simply download the script and follow the setup instructions in the included README.txt. The script requires Python 3.7+ and the requests and beautifulsoup4 libraries, which can be installed with a single pip install command. The requirements.txt file ensures that all dependencies are managed efficiently.
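For reference, a minimal requirements.txt for a script like this might look as follows (the version pins here are illustrative, not the ones shipped with the product):

```
requests>=2.28
beautifulsoup4>=4.11
```

Install everything at once with `pip install -r requirements.txt`.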

For developers looking to streamline their job tracking process, this tool offers a practical and efficient solution. By automating the tracking of Hacker News job postings, you can focus more on your work and less on data collection. This script is a great example of how Python can be used to automate repetitive tasks and improve productivity.

If you're interested in learning more about how this script was built or want to try it out, you can find the full project at https://intellitools.gumroad.com/l/hn-job-exporter. This resource includes the complete Python script, setup instructions, and a sample output to help you get started quickly.
