DEV Community

agenthustler

Scraping Subsidy and Government Grant Databases

Government subsidies and grants represent billions in funding, but finding relevant opportunities means navigating dozens of fragmented databases. Let's build a scraper that aggregates grant data into a searchable pipeline.

Key Data Sources

  • grants.gov — US federal grants (API available)
  • USAspending.gov — Federal spending data (API)
  • EU Open Data Portal — European funding
  • State-level portals — Vary by state

Setting Up

pip install requests beautifulsoup4 pandas schedule

Grants.gov API

import requests

GRANTS_URL = "https://apply07.grants.gov/grantsws/rest/opportunities/search/"

def search_grants(keyword, page=1):
    payload = {
        "keyword": keyword,
        "oppStatuses": "forecasted|posted",
        "rows": 25,
        "startRecord": (page - 1) * 25
    }
    resp = requests.post(GRANTS_URL, json=payload,
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()  # fail fast on HTTP errors
    data = resp.json()
    opportunities = []
    for opp in data.get("oppHits", []):
        opportunities.append({
            "title": opp.get("title", ""),
            "agency": opp.get("agencyCode", ""),
            "opp_number": opp.get("number", ""),
            "close_date": opp.get("closeDate", ""),
            # awardCeiling/awardFloor can be missing or null, so default to 0
            "award_ceiling": opp.get("awardCeiling") or 0,
            "award_floor": opp.get("awardFloor") or 0,
        })
    return opportunities, data.get("hitCount", 0)

grants, total = search_grants("artificial intelligence")
print(f"Found {total} AI-related grants")
for g in grants[:5]:
    # float() handles ceilings returned as strings
    print(f"  {g['title'][:60]} - ${float(g['award_ceiling']):,.0f}")
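The hit count returned above makes pagination straightforward. Here's a minimal pagination sketch: it takes any `search_fn(keyword, page)` that returns `(results, total)`, like `search_grants` above, so it isn't tied to one API. The `page_size` and `max_pages` defaults are assumptions matching the 25-row requests shown earlier.

```python
# Pagination sketch: search_fn(keyword, page) is assumed to return
# (results_for_page, total_hit_count), matching search_grants above.
def fetch_all(search_fn, keyword, page_size=25, max_pages=40):
    results, page = [], 1
    while page <= max_pages:
        batch, total = search_fn(keyword, page)
        results.extend(batch)
        # stop once we've seen every hit (or the API returns nothing)
        if not batch or page * page_size >= total:
            break
        page += 1
    return results
```

Calling `fetch_all(search_grants, "artificial intelligence")` then pulls every page, with `max_pages` as a safety cap against runaway loops.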

USAspending.gov API

USA_SPENDING = "https://api.usaspending.gov/api/v2"

def search_spending(keyword, limit=50):
    payload = {
        "filters": {
            "keywords": [keyword],
            # the endpoint requires award_type_codes; 02-05 are grant types
            "award_type_codes": ["02", "03", "04", "05"],
            "time_period": [{"start_date": "2025-01-01", "end_date": "2026-12-31"}]
        },
        # fields is also required: name the columns you want returned
        "fields": ["Award ID", "Recipient Name", "Award Amount"],
        "limit": limit,
        "page": 1
    }
    resp = requests.post(f"{USA_SPENDING}/search/spending_by_award/", json=payload)
    resp.raise_for_status()
    return resp.json().get("results", [])

awards = search_spending("machine learning")
for a in awards[:5]:
    print(f"  {a.get('Recipient Name', 'N/A')}: ${a.get('Award Amount', 0):,.0f}")
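Once you have results, pandas makes quick summaries easy. A small aggregation sketch that ranks recipients by total funding (it assumes each award dict carries the "Recipient Name" and "Award Amount" keys used in the print loop above):

```python
import pandas as pd

# Rank recipients by total award dollars across the returned results.
def top_recipients(awards, n=10):
    df = pd.DataFrame(awards)
    return (df.groupby("Recipient Name")["Award Amount"]
              .sum()
              .sort_values(ascending=False)
              .head(n))
```

`top_recipients(awards)` returns a Series indexed by recipient, handy for spotting which organizations dominate a funding area.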

Scraping State-Level Portals

Many state grant portals lack APIs. Use ScraperAPI for JavaScript-heavy sites:

from bs4 import BeautifulSoup

def scrape_state_grants(state_url):
    params = {
        "api_key": "YOUR_SCRAPERAPI_KEY",
        "url": state_url,
        "render": "true",
        "wait_for_selector": ".grant-listing"
    }
    resp = requests.get("https://api.scraperapi.com", params=params)
    soup = BeautifulSoup(resp.text, "html.parser")
    grants = []
    for item in soup.select(".grant-listing"):
        title = item.select_one(".title")
        amount = item.select_one(".amount")
        deadline = item.select_one(".deadline")
        if title:
            grants.append({
                "title": title.get_text(strip=True),
                "amount": amount.get_text(strip=True) if amount else "N/A",
                "deadline": deadline.get_text(strip=True) if deadline else "N/A"
            })
    return grants
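State portals format award amounts inconsistently ("$50,000", "Up to $1.2M"), so it helps to normalize the scraped strings into numbers before filtering. A parsing sketch; the regex and the M/K suffix handling are assumptions about the formats you're likely to meet:

```python
import re

# Convert scraped amount strings ("$50,000", "Up to $1.2M") to floats.
def parse_amount(text):
    match = re.search(r"\$?([\d,]+(?:\.\d+)?)\s*([MK])?", text, re.I)
    if not match:
        return None  # e.g. "N/A" or "Varies"
    value = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    return value * {"M": 1_000_000, "K": 1_000}.get(suffix, 1)
```

Run it over the `amount` field returned by `scrape_state_grants` so federal and state records can be compared on the same numeric scale.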

Building an Alert System

import time

import pandas as pd
import schedule

def check_new_grants():
    keywords = ["AI", "machine learning", "data science", "cybersecurity"]
    all_grants = []
    for kw in keywords:
        grants, _ = search_grants(kw)
        all_grants.extend(grants)
    df = pd.DataFrame(all_grants).drop_duplicates(subset="opp_number")
    df.to_csv("grants_latest.csv", index=False)
    print(f"Updated: {len(df)} unique grants found")

schedule.every().day.at("08:00").do(check_new_grants)

# schedule only fires inside an active loop
while True:
    schedule.run_pending()
    time.sleep(60)
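The CSV written by the daily job can double as state for change detection: compare the latest pull against the previous one and alert only on opportunity numbers you haven't seen before. A minimal diff sketch (the DataFrame column names match the `search_grants` output above):

```python
import pandas as pd

# Return only rows whose opportunity number wasn't in the previous pull.
def find_new_grants(latest_df, previous_df, key="opp_number"):
    seen = set(previous_df[key])
    return latest_df[~latest_df[key].isin(seen)]
```

Feed the returned rows into whatever notifier you prefer (email, Slack webhook, etc.) so alerts cover only genuinely new opportunities.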

Scale with ThorData proxies and monitor with ScrapeOps.

Key Takeaways

  • Federal APIs (grants.gov, USAspending) provide structured grant data
  • State portals often need JavaScript rendering for scraping
  • Automated monitoring catches new opportunities early
  • Combining federal and state data creates comprehensive views

Government data is public by law. Respect rate limits and use data responsibly.
