Grant writing is time-consuming, but much of the work is research: finding relevant foundations, understanding their priorities, and matching your project to their criteria. Let's automate the research part by scraping foundation databases.
## The Problem with Grant Research
There are over 100,000 grant-making foundations in the US alone. Manually searching each one's website for funding opportunities takes hundreds of hours. By scraping foundation databases, we can build a system that matches your project to relevant funders automatically.
## Data Sources
Key databases to scrape:
- Foundation Directory Online (Candid/GuideStar)
- Grants.gov (federal grants)
- State arts/humanities councils
- University grant portals
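One practical way to organize the sources above is a small registry the scraper iterates over. In this sketch, only the Grants.gov URL is real (it is the one used later in this article); the other entries are placeholders to fill in with the actual portal URLs:

```python
# Seed registry of grant databases. Only the Grants.gov URL is real;
# the example.org entries are placeholders.
GRANT_SOURCES = {
    "grants_gov": {
        "url": "https://www.grants.gov/search-grants",
        "type": "federal",
    },
    "candid": {
        "url": "https://example.org/candid",  # placeholder
        "type": "private_foundation",
    },
    "state_councils": {
        "url": "https://example.org/state-arts",  # placeholder
        "type": "state",
    },
    "university_portals": {
        "url": "https://example.org/university-grants",  # placeholder
        "type": "institutional",
    },
}

def sources_by_type(kind):
    """Return the names of all registered sources of a given type."""
    return [name for name, meta in GRANT_SOURCES.items()
            if meta["type"] == kind]
```

Keeping source metadata in one place makes it easy to add per-source scrapers later without touching the matching code.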
## Setting Up

```bash
pip install requests beautifulsoup4 pandas scikit-learn
```
We'll route requests through ScraperAPI for reliable access to foundation websites:
```python
import requests
from bs4 import BeautifulSoup
import json

SCRAPER_KEY = "YOUR_SCRAPERAPI_KEY"

def fetch(url):
    """Fetch a page through ScraperAPI, with JS rendering enabled."""
    return requests.get(
        "http://api.scraperapi.com",
        params={"api_key": SCRAPER_KEY, "url": url, "render": "true"},
        timeout=60,
    )
```
## Scraping Grants.gov
```python
from urllib.parse import quote_plus

def scrape_grants_gov(keyword, page=1):
    """Search Grants.gov for matching opportunities."""
    url = (
        f"https://www.grants.gov/search-grants?"
        f"keywords={quote_plus(keyword)}&page={page}"
    )
    resp = fetch(url)
    soup = BeautifulSoup(resp.text, "html.parser")
    grants = []
    for item in soup.select(".grant-result-item"):
        title = item.select_one(".grant-title")
        agency = item.select_one(".grant-agency")
        deadline = item.select_one(".grant-deadline")
        amount = item.select_one(".grant-amount")
        if title:
            grants.append({
                "title": title.text.strip(),
                "agency": agency.text.strip() if agency else "",
                "deadline": deadline.text.strip() if deadline else "",
                "amount": amount.text.strip() if amount else "",
                "url": title.get("href", ""),
                "source": "grants.gov",
            })
    return grants

# Search for technology education grants
tech_grants = scrape_grants_gov("technology education")
print(f"Found {len(tech_grants)} grants")
```
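Results from different keyword searches often overlap, so it's worth deduplicating before saving. A minimal sketch using the pandas install from earlier (`save_grants` is a helper name introduced here; the columns match the dicts that `scrape_grants_gov` returns):

```python
import pandas as pd

def save_grants(grants, path="grants.csv"):
    """Deduplicate scraped grants on (title, agency) and write to CSV."""
    df = pd.DataFrame(grants)
    df = df.drop_duplicates(subset=["title", "agency"])
    df.to_csv(path, index=False)
    return df
```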
## Scraping Foundation Profiles
```python
def scrape_foundation_profile(foundation_url):
    """Extract detailed info from a foundation's page."""
    resp = fetch(foundation_url)
    soup = BeautifulSoup(resp.text, "html.parser")
    profile = {
        "name": "",
        "mission": "",
        "focus_areas": [],
        "grant_range": "",
        "total_giving": "",
        "application_info": "",
        "deadlines": [],
    }
    # Extract mission statement
    mission = soup.select_one(".mission-statement, .about-text")
    if mission:
        profile["mission"] = mission.text.strip()
    # Extract focus areas
    for tag in soup.select(".focus-area, .program-area"):
        profile["focus_areas"].append(tag.text.strip())
    # Extract grant size range
    grant_info = soup.select_one(".grant-range, .funding-info")
    if grant_info:
        profile["grant_range"] = grant_info.text.strip()
    return profile
```
## Building the Matching Engine
Use TF-IDF to match your project description to foundation missions:
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

def match_foundations(project_description, foundations_df):
    """Match project to relevant foundations using NLP."""
    # Combine foundation text fields
    foundations_df["combined"] = (
        foundations_df["mission"] + " " +
        foundations_df["focus_areas"].apply(" ".join)
    )
    # TF-IDF vectorization
    vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
    corpus = [project_description] + foundations_df["combined"].tolist()
    tfidf_matrix = vectorizer.fit_transform(corpus)
    # Calculate similarity scores
    similarities = cosine_similarity(
        tfidf_matrix[0:1], tfidf_matrix[1:]
    ).flatten()
    foundations_df["match_score"] = similarities
    results = foundations_df.sort_values("match_score", ascending=False)
    return results.head(20)

# Example (assumes foundations_df was built from scraped profiles)
project = """
We are developing an AI-powered tutoring platform for
underserved high school students in STEM subjects.
The platform uses adaptive learning to personalize
the curriculum for each student.
"""
matches = match_foundations(project, foundations_df)
print(matches[["name", "match_score", "grant_range"]])
```
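The example above assumes a `foundations_df` already populated by the profile scraper. To sanity-check the matching engine without scraping anything, you can run the same TF-IDF pipeline on a tiny hand-built frame (both foundations here are invented for illustration):

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Two invented foundations, just to exercise the pipeline.
foundations_df = pd.DataFrame([
    {"name": "STEM Futures Fund",
     "mission": "Advancing STEM education for underserved students",
     "focus_areas": ["education", "technology"],
     "grant_range": "$10k-$50k"},
    {"name": "Riverkeeper Trust",
     "mission": "Protecting watersheds and wetland habitats",
     "focus_areas": ["environment", "conservation"],
     "grant_range": "$5k-$25k"},
])

project = "AI tutoring platform for underserved STEM students"

# Same pipeline as match_foundations, inlined so it runs standalone.
foundations_df["combined"] = (
    foundations_df["mission"] + " " +
    foundations_df["focus_areas"].apply(" ".join)
)
vec = TfidfVectorizer(stop_words="english")
m = vec.fit_transform([project] + foundations_df["combined"].tolist())
foundations_df["match_score"] = cosine_similarity(m[0:1], m[1:]).flatten()
top = foundations_df.sort_values("match_score", ascending=False).iloc[0]
# The STEM-focused fund shares terms with the project; the other does not,
# so it should rank first.
```

Shared terms like "STEM", "underserved", and "students" give the first foundation a nonzero cosine score while the second scores zero, which is exactly the behavior the full engine relies on.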
## Generating Grant Summaries
```python
def generate_grant_brief(foundation, project_desc):
    """Create a research brief for a matching foundation."""
    brief = f"""
# Grant Opportunity Brief

## Foundation: {foundation['name']}
**Match Score**: {foundation['match_score']:.0%}

## Foundation Mission
{foundation['mission']}

## Focus Areas
{', '.join(foundation['focus_areas'])}

## Grant Range
{foundation['grant_range']}

## Alignment Analysis
Your project focuses on: {project_desc}
Foundation priorities overlap in: {', '.join(foundation['focus_areas'][:3])}

## Next Steps
1. Review full application guidelines
2. Contact program officer if available
3. Draft LOI (Letter of Intent)
4. Deadline: {foundation.get('deadline', 'Check website')}
"""
    return brief
```
## Scaling Your Research
For comprehensive foundation research, use ThorData proxies to scrape at scale without getting blocked. Monitor your scrapers with ScrapeOps to ensure you're not missing data from failed requests.
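Whichever proxy or monitoring service you use, it also helps to wrap requests in retries with backoff so transient failures don't silently drop data. A generic sketch, not tied to any provider (`fetch_fn` stands in for the `fetch` helper defined earlier):

```python
import time

def fetch_with_retries(fetch_fn, url, max_attempts=3, backoff=2.0):
    """Call fetch_fn(url), retrying with exponential backoff.

    Returns the first response with status_code 200, or None after
    exhausting max_attempts. Exceptions count as failed attempts.
    """
    for attempt in range(max_attempts):
        try:
            resp = fetch_fn(url)
            if resp is not None and resp.status_code == 200:
                return resp
        except Exception:
            pass  # network error: fall through and retry
        if attempt < max_attempts - 1:
            time.sleep(backoff * (2 ** attempt))
    return None
```

Logging the `None` returns gives you the failed-request visibility that monitoring dashboards provide, even before you wire one up.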
## Database Schema
```python
import sqlite3

def create_grant_db(db="grants.db"):
    conn = sqlite3.connect(db)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS foundations (
            id INTEGER PRIMARY KEY,
            name TEXT, mission TEXT,
            focus_areas TEXT, grant_range TEXT,
            total_giving TEXT, website TEXT,
            last_updated TEXT
        )
    """)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS opportunities (
            id INTEGER PRIMARY KEY,
            foundation_id INTEGER,
            title TEXT, deadline TEXT,
            amount TEXT, status TEXT,
            match_score REAL,
            FOREIGN KEY (foundation_id) REFERENCES foundations(id)
        )
    """)
    conn.commit()
    return conn
```
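With the schema in place, storing a scraped profile is an upsert keyed on name. A sketch (`upsert_foundation` is a helper name introduced here) that serializes the `focus_areas` list as JSON, since SQLite has no array type:

```python
import json
import sqlite3
from datetime import date

def upsert_foundation(conn, profile, website=""):
    """Insert or update a foundations row, keyed on name."""
    row = conn.execute(
        "SELECT id FROM foundations WHERE name = ?", (profile["name"],)
    ).fetchone()
    values = (
        profile["mission"],
        json.dumps(profile["focus_areas"]),
        profile["grant_range"],
        profile.get("total_giving", ""),
        website,
        date.today().isoformat(),
    )
    if row:
        conn.execute(
            """UPDATE foundations SET mission=?, focus_areas=?, grant_range=?,
               total_giving=?, website=?, last_updated=? WHERE id=?""",
            values + (row[0],),
        )
    else:
        conn.execute(
            """INSERT INTO foundations
               (mission, focus_areas, grant_range, total_giving, website,
                last_updated, name)
               VALUES (?, ?, ?, ?, ?, ?, ?)""",
            values + (profile["name"],),
        )
    conn.commit()
```

Re-running the scraper then refreshes existing rows instead of duplicating them, and `last_updated` tells you which profiles are stale.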
## Conclusion
A grant writing assistant powered by web scraping can cut research time from weeks to hours. The key is building good scrapers for foundation databases, then using NLP to match your project to relevant funders. Start with Grants.gov (structured, easy to scrape), then expand to private foundations.
The matching engine alone can save nonprofits hundreds of hours per year in grant research.