I had a list of 40 keywords and needed a blog post for each one.
Writing them manually would take two weeks.
Writing a script to generate them took one afternoon — and 47 lines of Python.
Here's exactly how I built it.
This isn't a tutorial about AI being magic. It's a tutorial about the specific, unsexy plumbing you need to turn a keyword into a structured, usable blog post — with retry logic, output formatting, and a folder of .md files you can actually work with.
By the end of this, you'll have a working script that:
- Takes a list of keywords from a .txt file
- Sends a structured prompt to the OpenAI API
- Parses and saves each response as a Markdown file
- Handles rate limit errors without crashing
Why this is harder than it looks
The naive version is 5 lines:
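Roughly this (a sketch, assuming the v1 `openai` client and an `OPENAI_API_KEY` environment variable):

```python
from openai import OpenAI

client = OpenAI()
for kw in open("keywords.txt"):
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Write a blog post about {kw.strip()}"}],
    )
    print(resp.choices[0].message.content)
```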

That works exactly once, in a Jupyter notebook, for a demo.
In practice, you hit three problems immediately:
Rate limits. OpenAI's default tier for gpt-4o is 3 requests per minute. Try to fire 40 at once and you'll get a RateLimitError on request 4.
Unstructured output. "Write a blog post" gets you anything from a 200-word paragraph to a 3,000-word essay with inconsistent headers. If you're using this content anywhere, you need predictable structure.
No persistence. If the script crashes on keyword 22, you've lost the first 21. You need to write each output to disk as it completes.
The 47-line version handles all three.
Prerequisites
- Python 3.9+
- An OpenAI API key
- openai library: pip install openai
That's it. No frameworks, no databases, no Docker.
The setup
Create this file structure:
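Three items, matching the names the script uses (the project folder name is arbitrary):

```
keyword-pipeline/
├── generate.py     # the script below
├── keywords.txt    # one keyword per line
└── output/         # generated .md files land here (the script creates it)
```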

Your keywords.txt should look like this — one keyword per line:
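For example, three hypothetical keywords to test with:

```
how to deploy a flask app
python virtual environments explained
docker compose for beginners
```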
The script
Here's generate.py in full:
```python
import os
import time
from pathlib import Path

from openai import OpenAI, RateLimitError

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

OUTPUT_DIR = Path("output")
OUTPUT_DIR.mkdir(exist_ok=True)

PROMPT_TEMPLATE = """Write a blog post about: "{keyword}"

Use this exact structure:

# [Title]

## Introduction
[2-3 sentences introducing the topic]

## [Section 1 heading]
[3-4 sentences]

## [Section 2 heading]
[3-4 sentences]

## [Section 3 heading]
[3-4 sentences]

## Conclusion
[2-3 sentences wrapping up with a practical takeaway]

Tone: conversational and practical. Avoid fluff. Total length: ~400 words."""


def slugify(keyword: str) -> str:
    return keyword.lower().replace(" ", "-").replace("/", "-")


def generate_post(keyword: str, retries: int = 3) -> str:
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(keyword=keyword)}],
                max_tokens=700,
                temperature=0.7,
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait = 20 * (attempt + 1)  # back off: 20s, 40s, 60s
            print(f" Rate limit hit. Waiting {wait}s before retry {attempt + 1}/{retries}...")
            time.sleep(wait)
    raise RuntimeError(f"Failed to generate post for '{keyword}' after {retries} retries.")


def run(keywords_file: str = "keywords.txt"):
    keywords = Path(keywords_file).read_text().strip().splitlines()
    print(f"Processing {len(keywords)} keywords...\n")
    for i, keyword in enumerate(keywords, 1):
        output_path = OUTPUT_DIR / f"{slugify(keyword)}.md"
        if output_path.exists():
            print(f"[{i}/{len(keywords)}] Skipping '{keyword}' — already generated.")
            continue
        print(f"[{i}/{len(keywords)}] Generating: '{keyword}'")
        content = generate_post(keyword)
        output_path.write_text(content, encoding="utf-8")
        # OpenAI free tier: ~3 requests/minute. This keeps us just under.
        time.sleep(22)
    print("\nDone. Check the output/ folder.")


if __name__ == "__main__":
    run()
```
What each part is actually doing
The prompt template is doing the real work here. "Write a blog post" is too open-ended — the model will vary wildly in length and structure. The template locks in a specific H2 structure, a word count target, and a tone instruction. Your output becomes predictable enough to actually use.
The retry loop in `generate_post` uses a linearly increasing backoff (20s, 40s, 60s) because OpenAI's rate limit errors are temporary. Most of the time, waiting 20 seconds is enough. The retry loop means the script keeps running instead of crashing and forcing you to restart.
The `if output_path.exists(): continue` check is the most important line for long runs. If you're processing 100 keywords and the script dies at #73, you don't want to regenerate the first 72. This check skips already-completed files and resumes from where you left off.
The `time.sleep(22)` at the bottom of the loop is tuned for the free-tier rate limit. If you're on a paid OpenAI tier with higher limits, you can drop this to `time.sleep(2)` or remove it entirely and let the retry logic handle any occasional errors.
Running it
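Set your API key and run the script (bash shown; adjust for your shell):

```bash
export OPENAI_API_KEY="sk-..."
python generate.py
```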
Your output/ folder now has three .md files, each with consistent structure:
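Every file mirrors the skeleton the prompt template enforces (headings shown with placeholder section names; the actual headings come from the model):

```markdown
# [Title]

## Introduction
...

## [Section 1 heading]
...

## [Section 2 heading]
...

## [Section 3 heading]
...

## Conclusion
...
```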
What can go wrong
`RateLimitError` even with the sleep. This happens if you've been running the script multiple times in the same minute. The rate limit is per-minute across all your requests, not just this script. Fix: increase `time.sleep(22)` to `time.sleep(30)` if you're hitting it consistently.
The model ignores your structure prompt. This happens more with gpt-3.5-turbo than gpt-4o. If you switch models to reduce cost, test with 5 keywords first and inspect the output structure before running it on your full list.
You get back an empty string. Rare, but it happens. `generate_post` returns the raw content string, so add a check after the call, e.g. `if not content.strip(): raise ValueError(...)`, to catch empty responses before they're written to disk as empty files.
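One way to wire that in (a sketch; `validate_content` is a helper name I'm introducing, not part of the original script):

```python
def validate_content(keyword: str, content: str) -> str:
    # Flag empty or whitespace-only model responses before they
    # get written to disk as empty .md files.
    if not content.strip():
        raise ValueError(f"Empty response for keyword: {keyword!r}")
    return content
```

In `run()`, this would replace the bare call: `content = validate_content(keyword, generate_post(keyword))`.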
Special characters in keywords break the filename. `slugify` handles spaces and forward slashes, but if your keywords have apostrophes, colons, or question marks, you'll get OS-level errors. Add `.replace("'", "").replace(":", "").replace("?", "")` to `slugify` if your keyword list is user-generated.
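If you'd rather not keep chaining `.replace()` calls, a regex version covers the whole class of problem characters (a sketch; note it whitelists letters, digits, and hyphens, which is stricter than the original):

```python
import re

def slugify(keyword: str) -> str:
    # Collapse runs of whitespace and slashes into single hyphens,
    # then drop any remaining character that isn't a-z, 0-9, or a hyphen.
    slug = re.sub(r"[\s/\\]+", "-", keyword.lower())
    slug = re.sub(r"[^a-z0-9-]", "", slug)
    return slug.strip("-")
```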
What I'd add next
This script is intentionally minimal — 47 lines, no external dependencies beyond openai. But if you're running this in production for more than a few hundred keywords, here's what breaks next:
Cost tracking. Add a token counter so you know what each run costs before you've spent $40 without noticing.
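A minimal sketch of that counter: the v1 client's `response.usage` reports `prompt_tokens` and `completion_tokens` per call, so you can accumulate them across a run and convert at your model's rates (the prices below are placeholders; check OpenAI's current pricing page before trusting the number):

```python
# Placeholder per-1M-token prices (verify against OpenAI's current pricing).
INPUT_PRICE_PER_1M = 2.50
OUTPUT_PRICE_PER_1M = 10.00

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Rough dollar cost of one API call, from its token counts."""
    return (prompt_tokens * INPUT_PRICE_PER_1M
            + completion_tokens * OUTPUT_PRICE_PER_1M) / 1_000_000
```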
Quality validation. A second API call that checks whether the output meets a minimum quality bar (does it have all the required sections? Is it close to the target word count?). This sounds expensive but catching bad outputs early is cheaper than rewriting them manually.
Concurrent requests. The serial approach (one keyword at a time, sleep 22 seconds) is slow. With asyncio and a proper rate limiter, you can process 3 keywords simultaneously and cut wall-clock time by ~60%.
I'll cover the async version in the next post.
The full script on GitHub
The complete source, with a requirements.txt and a sample keywords.txt to test with, is at:
Drop a star if it saved you time.
Have you hit rate limits running keyword pipelines at scale? What's your retry strategy — and are you running requests serially or concurrently? Curious what others have landed on.



