Grant writing is time-consuming, but much of the work is research: finding relevant foundations, understanding their priorities, and matching your project to their criteria. Let's automate the research part by scraping foundation databases.
## The Problem with Grant Research
There are over 100,000 grant-making foundations in the US alone. Manually searching each one's website for funding opportunities takes hundreds of hours. By scraping foundation databases, we can build a system that matches your project to relevant funders automatically.
## Data Sources
Key databases to scrape:
- Foundation Directory Online (Candid/GuideStar)
- Grants.gov (federal grants)
- State arts/humanities councils
- University grant portals
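One practical way to organize the sources above is a small registry the scraper iterates over. In this sketch, only the Grants.gov URL is real (it is the one used later in this article); the other entries are placeholders to fill in with the actual portal URLs:

```python
# Seed registry of grant databases. Only the Grants.gov URL is real;
# the example.org entries are placeholders.
GRANT_SOURCES = {
    "grants_gov": {
        "url": "https://www.grants.gov/search-grants",
        "type": "federal",
    },
    "candid": {
        "url": "https://example.org/candid",  # placeholder
        "type": "private_foundation",
    },
    "state_councils": {
        "url": "https://example.org/state-arts",  # placeholder
        "type": "state",
    },
    "university_portals": {
        "url": "https://example.org/university-grants",  # placeholder
        "type": "institutional",
    },
}

def sources_by_type(kind):
    """Return the names of all registered sources of a given type."""
    return [name for name, meta in GRANT_SOURCES.items()
            if meta["type"] == kind]
```

Keeping source metadata in one place makes it easy to add per-source scrapers later without touching the matching code.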
## Setting Up

```bash
pip install requests beautifulsoup4 pandas scikit-learn
```
We'll route requests through ScraperAPI for reliable access to foundation websites:
```python
import requests
from bs4 import BeautifulSoup
import json

SCRAPER_KEY = "YOUR_SCRAPERAPI_KEY"

def fetch(url):
    """Fetch a page through ScraperAPI, with JS rendering enabled."""
    return requests.get(
        "http://api.scraperapi.com",
        params={"api_key": SCRAPER_KEY, "url": url, "render": "true"},
        timeout=60,
    )
```
## Scraping Grants.gov
```python
from urllib.parse import quote_plus

def scrape_grants_gov(keyword, page=1):
    """Search Grants.gov for matching opportunities."""
    url = (
        f"https://www.grants.gov/search-grants?"
        f"keywords={quote_plus(keyword)}&page={page}"
    )
    resp = fetch(url)
    soup = BeautifulSoup(resp.text, "html.parser")
    grants = []
    for item in soup.select(".grant-result-item"):
        title = item.select_one(".grant-title")
        agency = item.select_one(".grant-agency")
        deadline = item.select_one(".grant-deadline")
        amount = item.select_one(".grant-amount")
        if title:
            grants.append({
                "title": title.text.strip(),
                "agency": agency.text.strip() if agency else "",
                "deadline": deadline.text.strip() if deadline else "",
                "amount": amount.text.strip() if amount else "",
                "url": title.get("href", ""),
                "source": "grants.gov",
            })
    return grants

# Search for technology education grants
tech_grants = scrape_grants_gov("technology education")
print(f"Found {len(tech_grants)} grants")
```
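Results from different keyword searches often overlap, so it's worth deduplicating before saving. A minimal sketch using the pandas install from earlier (`save_grants` is a helper name introduced here; the columns match the dicts that `scrape_grants_gov` returns):

```python
import pandas as pd

def save_grants(grants, path="grants.csv"):
    """Deduplicate scraped grants on (title, agency) and write to CSV."""
    df = pd.DataFrame(grants)
    df = df.drop_duplicates(subset=["title", "agency"])
    df.to_csv(path, index=False)
    return df
```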
## Scraping Foundation Profiles
```python
def scrape_foundation_profile(foundation_url):
    """Extract detailed info from a foundation's page."""
    resp = fetch(foundation_url)
    soup = BeautifulSoup(resp.text, "html.parser")
    profile = {
        "name": "",
        "mission": "",
        "focus_areas": [],
        "grant_range": "",
        "total_giving": "",
        "application_info": "",
        "deadlines": [],
    }
    # Extract mission statement
    mission = soup.select_one(".mission-statement, .about-text")
    if mission:
        profile["mission"] = mission.text.strip()
    # Extract focus areas
    for tag in soup.select(".focus-area, .program-area"):
        profile["focus_areas"].append(tag.text.strip())
    # Extract grant size range
    grant_info = soup.select_one(".grant-range, .funding-info")
    if grant_info:
        profile["grant_range"] = grant_info.text.strip()
    return profile
```
## Building the Matching Engine
Use TF-IDF to match your project description to foundation missions:
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

def match_foundations(project_description, foundations_df):
    """Match project to relevant foundations using NLP."""
    # Combine foundation text fields
    foundations_df["combined"] = (
        foundations_df["mission"] + " " +
        foundations_df["focus_areas"].apply(" ".join)
    )
    # TF-IDF vectorization
    vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
    corpus = [project_description] + foundations_df["combined"].tolist()
    tfidf_matrix = vectorizer.fit_transform(corpus)
    # Calculate similarity scores
    similarities = cosine_similarity(
        tfidf_matrix[0:1], tfidf_matrix[1:]
    ).flatten()
    foundations_df["match_score"] = similarities
    results = foundations_df.sort_values("match_score", ascending=False)
    return results.head(20)

# Example (assumes foundations_df was built from scraped profiles)
project = """
We are developing an AI-powered tutoring platform for
underserved high school students in STEM subjects.
The platform uses adaptive learning to personalize
the curriculum for each student.
"""
matches = match_foundations(project, foundations_df)
print(matches[["name", "match_score", "grant_range"]])
```
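The example above assumes a `foundations_df` already populated by the profile scraper. To sanity-check the matching engine without scraping anything, you can run the same TF-IDF pipeline on a tiny hand-built frame (both foundations here are invented for illustration):

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Two invented foundations, just to exercise the pipeline.
foundations_df = pd.DataFrame([
    {"name": "STEM Futures Fund",
     "mission": "Advancing STEM education for underserved students",
     "focus_areas": ["education", "technology"],
     "grant_range": "$10k-$50k"},
    {"name": "Riverkeeper Trust",
     "mission": "Protecting watersheds and wetland habitats",
     "focus_areas": ["environment", "conservation"],
     "grant_range": "$5k-$25k"},
])

project = "AI tutoring platform for underserved STEM students"

# Same pipeline as match_foundations, inlined so it runs standalone.
foundations_df["combined"] = (
    foundations_df["mission"] + " " +
    foundations_df["focus_areas"].apply(" ".join)
)
vec = TfidfVectorizer(stop_words="english")
m = vec.fit_transform([project] + foundations_df["combined"].tolist())
foundations_df["match_score"] = cosine_similarity(m[0:1], m[1:]).flatten()
top = foundations_df.sort_values("match_score", ascending=False).iloc[0]
# The STEM-focused fund shares terms with the project; the other does not,
# so it should rank first.
```

Shared terms like "STEM", "underserved", and "students" give the first foundation a nonzero cosine score while the second scores zero, which is exactly the behavior the full engine relies on.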
## Generating Grant Summaries
```python
def generate_grant_brief(foundation, project_desc):
    """Create a research brief for a matching foundation."""
    brief = f"""
# Grant Opportunity Brief

## Foundation: {foundation['name']}
**Match Score**: {foundation['match_score']:.0%}

## Foundation Mission
{foundation['mission']}

## Focus Areas
{', '.join(foundation['focus_areas'])}

## Grant Range
{foundation['grant_range']}

## Alignment Analysis
Your project focuses on: {project_desc}
Foundation priorities overlap in: {', '.join(foundation['focus_areas'][:3])}

## Next Steps
1. Review full application guidelines
2. Contact program officer if available
3. Draft LOI (Letter of Intent)
4. Deadline: {foundation.get('deadline', 'Check website')}
"""
    return brief
```
## Scaling Your Research
For comprehensive foundation research, use ThorData proxies to scrape at scale without getting blocked. Monitor your scrapers with ScrapeOps to ensure you're not missing data from failed requests.
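Whichever proxy or monitoring service you use, it also helps to wrap requests in retries with backoff so transient failures don't silently drop data. A generic sketch, not tied to any provider (`fetch_fn` stands in for the `fetch` helper defined earlier):

```python
import time

def fetch_with_retries(fetch_fn, url, max_attempts=3, backoff=2.0):
    """Call fetch_fn(url), retrying with exponential backoff.

    Returns the first response with status_code 200, or None after
    exhausting max_attempts. Exceptions count as failed attempts.
    """
    for attempt in range(max_attempts):
        try:
            resp = fetch_fn(url)
            if resp is not None and resp.status_code == 200:
                return resp
        except Exception:
            pass  # network error: fall through and retry
        if attempt < max_attempts - 1:
            time.sleep(backoff * (2 ** attempt))
    return None
```

Logging the `None` returns gives you the failed-request visibility that monitoring dashboards provide, even before you wire one up.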
## Database Schema
```python
import sqlite3

def create_grant_db(db="grants.db"):
    conn = sqlite3.connect(db)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS foundations (
            id INTEGER PRIMARY KEY,
            name TEXT, mission TEXT,
            focus_areas TEXT, grant_range TEXT,
            total_giving TEXT, website TEXT,
            last_updated TEXT
        )
    """)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS opportunities (
            id INTEGER PRIMARY KEY,
            foundation_id INTEGER,
            title TEXT, deadline TEXT,
            amount TEXT, status TEXT,
            match_score REAL,
            FOREIGN KEY (foundation_id) REFERENCES foundations(id)
        )
    """)
    conn.commit()
    return conn
```
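With the schema in place, storing a scraped profile is an upsert keyed on name. A sketch (`upsert_foundation` is a helper name introduced here) that serializes the `focus_areas` list as JSON, since SQLite has no array type:

```python
import json
import sqlite3
from datetime import date

def upsert_foundation(conn, profile, website=""):
    """Insert or update a foundations row, keyed on name."""
    row = conn.execute(
        "SELECT id FROM foundations WHERE name = ?", (profile["name"],)
    ).fetchone()
    values = (
        profile["mission"],
        json.dumps(profile["focus_areas"]),
        profile["grant_range"],
        profile.get("total_giving", ""),
        website,
        date.today().isoformat(),
    )
    if row:
        conn.execute(
            """UPDATE foundations SET mission=?, focus_areas=?, grant_range=?,
               total_giving=?, website=?, last_updated=? WHERE id=?""",
            values + (row[0],),
        )
    else:
        conn.execute(
            """INSERT INTO foundations
               (mission, focus_areas, grant_range, total_giving, website,
                last_updated, name)
               VALUES (?, ?, ?, ?, ?, ?, ?)""",
            values + (profile["name"],),
        )
    conn.commit()
```

Re-running the scraper then refreshes existing rows instead of duplicating them, and `last_updated` tells you which profiles are stale.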
## Conclusion
A grant writing assistant powered by web scraping can cut research time from weeks to hours. The key is building good scrapers for foundation databases, then using NLP to match your project to relevant funders. Start with Grants.gov (structured, easy to scrape), then expand to private foundations.
The matching engine alone can save nonprofits hundreds of hours per year in grant research.