DEV Community

IntelliTools

HN Job Exporter: Automate Real-Time Hacker News Job Tracking

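import requests
from bs4 import BeautifulSoup

def fetch_hn_jobs(page=0):
    """Fetch one page of Hacker News job listings as (company, title, link) tuples."""
    url = f'https://news.ycombinator.com/jobs?show=story&sort=votes&desc=1&page={page}'
    headers = {'User-Agent': 'Mozilla/5.0'}
    # A timeout keeps a stalled connection from hanging the script.
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    soup = BeautifulSoup(response.text, 'html.parser')
    jobs = []

    # These selectors depend on the site's current markup; adjust them if
    # the page structure differs.
    for item in soup.select('.story'):
        title_link = item.select_one('a.storylink')
        company_link = item.select_one('.subtext a')
        if title_link is None or company_link is None:
            continue  # skip rows that lack the expected elements
        title = title_link.text.strip()
        link = title_link.get('href', '')
        company = company_link.text.strip()
        jobs.append((company, title, link))

    return jobs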

HN Job Exporter is a Python script that fetches current job postings from Hacker News and exports them to a CSV file with the essential details: company, role, and link. It is aimed at remote developers and job seekers who want to follow new openings without copying listings by hand.

The script uses the requests library to download the Hacker News jobs page and BeautifulSoup to parse the HTML. The full script handles pagination and network errors; the fetch_hn_jobs function shown above retrieves one page per call, so collecting several pages means calling it in a loop. The output is a CSV file that imports easily into spreadsheet tools or job tracking systems.
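To make the pagination and error-handling idea concrete, here is a minimal sketch of a page loop with retries. It is not the script's actual implementation: the name fetch_all_pages and the parameters max_pages, retries, and delay are hypothetical, and the fetcher is passed in as a callable so that a function such as fetch_hn_jobs could be plugged in.

```python
import time
from typing import Callable, List, Tuple

Job = Tuple[str, str, str]  # (company, title, link)

def fetch_all_pages(
    fetch_page: Callable[[int], List[Job]],
    max_pages: int = 3,
    retries: int = 2,
    delay: float = 1.0,
) -> List[Job]:
    """Collect jobs from pages 0..max_pages-1, retrying pages that fail.

    All names here are illustrative assumptions, not the script's real API.
    """
    all_jobs: List[Job] = []
    for page in range(max_pages):
        jobs: List[Job] = []
        for attempt in range(retries + 1):
            try:
                jobs = fetch_page(page)
                break  # success: stop retrying this page
            except Exception as exc:  # e.g. a network error raised by the fetcher
                if attempt == retries:
                    print(f'Giving up on page {page}: {exc}')
                    jobs = []
                else:
                    time.sleep(delay)  # wait before the next attempt
        if not jobs:
            break  # an empty (or failed) page is treated as the end of the listing
        all_jobs.extend(jobs)
    return all_jobs
```

Under these assumptions, a call such as fetch_all_pages(fetch_hn_jobs, max_pages=3) would return a single list of tuples. Stopping at the first empty page is a design choice of this sketch; a stricter version could re-raise the error instead.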

import csv

def export_jobs(jobs, filename='hn_jobs.csv'):
    """Write (company, title, link) tuples to a CSV file with a header row."""
    # newline='' lets the csv module control line endings on every platform.
    with open(filename, 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerow(['Company', 'Role', 'Location', 'Link'])
        for company, title, link in jobs:
            # Location is left blank: fetch_hn_jobs does not extract it.
            writer.writerow([company, title, '', link])

    print(f'Exported {len(jobs)} jobs to {filename}')

The export step is straightforward. After fetching the job data, the script writes it to a CSV file with a header row. Each row contains the company name, the job title, a Location column (left blank here, because the fetch step does not extract a location), and a direct link to the job posting. This format is easy to read and fits into most spreadsheet or tracking workflows.
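As a quick illustration of that row format, the sketch below writes one sample job to an in-memory CSV and reads it back. The helpers jobs_to_csv_text and read_jobs and the sample data are made up for this example and are not part of the script; only the header names come from the export code above.

```python
import csv
import io

def jobs_to_csv_text(jobs):
    """Render (company, title, link) tuples as CSV text (hypothetical helper)."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    writer.writerow(['Company', 'Role', 'Location', 'Link'])
    for company, title, link in jobs:
        writer.writerow([company, title, '', link])  # Location left blank
    return buffer.getvalue()

def read_jobs(csv_text):
    """Parse CSV text back into one dict per job, keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text, newline='')))

# Made-up sample row; the comma in the company name forces CSV quoting.
sample = [('Acme, Inc.', 'Backend Engineer', 'https://example.com/jobs/1')]
text = jobs_to_csv_text(sample)
rows = read_jobs(text)
print(rows[0]['Company'], '|', rows[0]['Role'], '|', rows[0]['Link'])
```

Going through the csv module rather than joining strings by hand means fields that contain commas or quotes are escaped correctly, so the file round-trips cleanly.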

One of the key benefits of HN Job Exporter is that it saves time by automating the process of tracking job postings. Instead of manually visiting Hacker News and copying job details, the script does this for you. This can save up to 2 hours per week, allowing you to focus on more important tasks.

The script is designed to be run from the command line, and it requires Python 3.7 or higher. To get started, you'll need to install the necessary dependencies using pip:

pip install requests beautifulsoup4

Once installed, you can run the script and specify the number of pages to fetch. The script will handle the rest, including error checking and pagination. The output is saved as a CSV file, which you can review or import into your preferred job tracking tool.
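The article does not show the script's actual command-line options, so the following is only a sketch of how a page count could be accepted with argparse. The flag names --pages and --output and their defaults are assumptions for illustration.

```python
import argparse

def parse_args(argv=None):
    """Parse hypothetical command-line options for the exporter."""
    parser = argparse.ArgumentParser(
        description='Export Hacker News job postings to CSV.'
    )
    # Both flags are illustrative; the real script may use different names.
    parser.add_argument('--pages', type=int, default=1,
                        help='number of pages to fetch')
    parser.add_argument('--output', default='hn_jobs.csv',
                        help='CSV file to write')
    args = parser.parse_args(argv)
    if args.pages < 1:
        parser.error('--pages must be at least 1')
    return args

# Example: the equivalent of passing "--pages 3" on the command line.
args = parse_args(['--pages', '3'])
print(f'Would fetch {args.pages} page(s) into {args.output}')
```

Passing argv explicitly keeps the parser easy to test; calling parse_args() with no argument reads the real command line instead.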

HN Job Exporter gives job seekers an automated way to track Hacker News job postings without copying them by hand. The script ships with documentation and a sample output file that shows the CSV format.

If you're interested in downloading the script and exploring its full capabilities, you can find it at HN Job Exporter. The package includes the Python script, setup instructions, and a sample output file to get you started quickly.

Overall, HN Job Exporter is a practical solution for developers and job seekers who want to automate their job tracking process. It's a simple yet powerful tool that can significantly improve productivity and reduce the time spent on manual data entry.
