Most of us write across multiple platforms: personal blogs, dev.to, company blogs, and more. But our GitHub profile — the place other developers often visit first — rarely reflects all of that activity.
In this article, we’ll build a simple system that:
- 📚 Pulls your latest articles from:
  - dev.to
  - cosine.sh/blog (only the posts you've authored); you can also add your personal blog, your company's blog, or whatever else you're using
- 📃 Merges them into a single list
- 📅 Shows the latest 6 on your GitHub profile README
- ♻️ Automatically updates once per day via GitHub Actions
P.S. If you'd rather watch than read, here's a video walking through the whole process:
## What we're building
End goal:
- Your GitHub profile README (e.g. https://github.com/YOUR_USERNAME) will have a section like:

```markdown
## Recent Articles

<!-- recent-blog-posts start -->
<!-- recent-blog-posts end -->
```
A GitHub Action runs daily, fetches your latest posts from dev.to and cosine.sh/blog, and replaces everything between those markers with a grid of:
- Cover image
- Article title (linked)
- Source (dev.to or Cosine, or whatever platforms you add)
- Date

It always shows the 6 most recent posts across both platforms, sorted by date.
High-level architecture:

1. A GitHub Action runs on a schedule (cron + workflow_dispatch).
2. A Python script that:
   - Calls the dev.to API for your username.
   - Scrapes cosine.sh/blog and filters for your posts only.
   - Merges and sorts the posts.
   - Renders a small HTML grid.
   - Replaces the marker section in README.md.
3. The Action commits the updated README back to your profile repo.
## Step 1: Add markers to your GitHub profile README
First, in your profile repository (same name as your username, e.g. EleftheriaBatsou/EleftheriaBatsou), edit README.md and add a section like this:
```markdown
## Recent Blog Posts

<!-- recent-blog-posts start -->
<!-- recent-blog-posts end -->
```
Those comments are the “anchors” the script will use to know where to inject the generated content. Everything between them will be replaced automatically.
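Mechanically, the update is nothing more than a regex replacement between those two comments. Here's a minimal sketch of the idea (the full script in Step 3 does exactly this):

```python
import re

START = "<!-- recent-blog-posts start -->"
END = "<!-- recent-blog-posts end -->"

def inject(readme: str, content: str) -> str:
    # Swap out everything between the markers, keeping the markers themselves.
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    return pattern.sub(START + "\n" + content + "\n" + END, readme)
```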
## Step 2: Create the GitHub Action workflow
Create a file at `.github/workflows/update_blog.yml` with this content:
```yaml
name: Update Recent Blog Posts

on:
  schedule:
    - cron: "0 3 * * *" # daily at 03:00 UTC
  workflow_dispatch: # allow manual runs from the Actions tab

permissions:
  contents: write # the workflow pushes the updated README back to the repo

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install requests beautifulsoup4 python-dateutil

      - name: Update README with latest posts
        env:
          # Optional: enable verbose logs during development
          RECENT_BLOG_VERBOSE: "false"
          # Optional: seed known Cosine URLs if you want to be explicit
          # COSINE_AUTHOR_URLS: "https://cosine.sh/blog/cosine-vs-codex-vs-windsurf,https://cosine.sh/blog/projects-you-can-build-with-cosine"
        run: |
          python scripts/update_blog.py

      - name: Commit changes
        run: |
          if [[ -n "$(git status --porcelain)" ]]; then
            git config user.name "github-actions[bot]"
            git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
            git add README.md
            git commit -m "chore: update recent blog posts"
            git push
          else
            echo "No changes to commit."
          fi
```
Notes:

- `workflow_dispatch` lets you trigger the workflow manually from GitHub's UI.
- `cron` runs it once a day.
- The `permissions: contents: write` block makes sure the built-in `GITHUB_TOKEN` is allowed to push the commit.
- We install only a few standard Python libraries: `requests`, `beautifulsoup4`, `python-dateutil`.
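For example, to refresh more often you could change the schedule line to `- cron: "0 */6 * * *"` (every six hours); any standard cron expression works here.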
## Step 3: The Python script that does the work
Create `scripts/update_blog.py` with the following:
```python
import os
import re
import sys
import json
import time
import unicodedata
import requests
from bs4 import BeautifulSoup
from datetime import datetime, timezone
from dateutil import parser as dateparser
import xml.etree.ElementTree as ET

README_PATH = "README.md"
START_MARK = "<!-- recent-blog-posts start -->"
END_MARK = "<!-- recent-blog-posts end -->"

DEVTO_USERNAME = "eleftheriabatsou"
COSINE_BLOG_INDEX = "https://cosine.sh/blog"
COSINE_SITEMAP = "https://cosine.sh/sitemap.xml"

AUTHOR_NAME = "Eleftheria Batsou"
AUTHOR_ROLE = "Developer Advocate"
AUTHOR_TWITTER_HANDLE = "BatsouElef"  # used for x.com/twitter.com handle checks

MAX_POSTS = 6
TIMEOUT = 20
VERBOSE = os.getenv("RECENT_BLOG_VERBOSE", "").lower() in {"1", "true", "yes"}

# Optional manual fallback: extra Cosine post URLs (comma-separated env var)
COSINE_AUTHOR_URLS = [u.strip() for u in os.getenv("COSINE_AUTHOR_URLS", "").split(",") if u.strip()]

# Known Cosine posts (helps when structure changes)
DEFAULT_KNOWN_COSINE = {
    "https://cosine.sh/blog/cosine-vs-codex-vs-windsurf",
    "https://cosine.sh/blog/projects-you-can-build-with-cosine",
    "https://cosine.sh/blog/ai-coding-tools-comparison",
}

session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (compatible; GitHubAction; +https://github.com/EleftheriaBatsou)",
    "Accept-Language": "en",
    "Referer": "https://cosine.sh/",
})


def log(msg):
    if VERBOSE:
        print(msg)


def normalize_text(s):
    if not s:
        return ""
    s = unicodedata.normalize("NFKC", s)
    s = s.replace("\u00A0", " ")
    s = re.sub(r"\s+", " ", s)
    return s.strip()


def normalize_date(dt):
    if not dt:
        return None
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def safe_get(url, timeout=TIMEOUT):
    try:
        r = session.get(url, timeout=timeout)
        r.raise_for_status()
        return r
    except Exception as e:
        log(f"[WARN] GET failed: {url} -> {e}")
        return None


def fetch_devto_posts():
    url = f"https://dev.to/api/articles?username={DEVTO_USERNAME}"
    resp = safe_get(url)
    if not resp:
        return []
    try:
        data = resp.json()
    except Exception as e:
        log(f"[WARN] dev.to JSON parse failed: {e}")
        return []
    posts = []
    for item in data:
        published = item.get("published_at") or item.get("created_at")
        try:
            dt_raw = dateparser.parse(published)
        except Exception:
            dt_raw = None
        dt = normalize_date(dt_raw)
        posts.append({
            "source": "dev.to",
            "title": item.get("title") or "",
            "url": item.get("url") or "",
            "cover_image": item.get("cover_image") or item.get("social_image") or "",
            "date": dt,
            "date_str": dt.strftime("%Y-%m-%d") if dt else (published or ""),
        })
    log(f"[INFO] dev.to posts fetched: {len(posts)}")
    return posts


def get_cosine_links_from_index():
    resp = safe_get(COSINE_BLOG_INDEX)
    if not resp:
        return set()
    soup = BeautifulSoup(resp.text, "html.parser")
    links = set()
    for a in soup.select("a[href^='/blog/']"):
        href = a.get("href", "").strip()
        if not href:
            continue
        if href.rstrip("/").endswith("/blog"):
            continue
        if href.count("/") >= 2:
            full = f"https://cosine.sh{href}" if href.startswith("/") else href
            links.add(full)
    log(f"[INFO] Cosine index links found: {len(links)}")
    return links


def get_cosine_links_from_sitemap():
    resp = safe_get(COSINE_SITEMAP)
    if not resp:
        return set()
    links = set()
    try:
        root = ET.fromstring(resp.text)
        for loc in root.iter():
            if loc.tag.endswith("loc"):
                url = (loc.text or "").strip()
                if "/blog/" in url:
                    links.add(url)
        log(f"[INFO] Cosine sitemap blog links found: {len(links)}")
    except Exception as e:
        log(f"[WARN] sitemap parse error: {e}")
    return links


def detect_author(page):
    # 1) Meta name="author"
    meta_author = page.find("meta", attrs={"name": "author"})
    if meta_author and meta_author.get("content"):
        return normalize_text(meta_author["content"])
    # 2) Common meta properties
    for prop in ["article:author", "og:article:author"]:
        m = page.find("meta", attrs={"property": prop})
        if m and m.get("content"):
            return normalize_text(m["content"])
    # 3) rel=author links
    rel_author = page.select_one("a[rel='author']")
    if rel_author:
        txt = normalize_text(rel_author.get_text())
        if txt:
            return txt
    # 4) JSON-LD
    for ld in page.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(ld.string or "")
        except Exception:
            continue
        if isinstance(data, dict):
            a = data.get("author")
            if isinstance(a, dict) and a.get("name"):
                return normalize_text(a["name"])
            if isinstance(a, list) and a:
                entry = a[0]
                if isinstance(entry, dict) and entry.get("name"):
                    return normalize_text(entry["name"])
        elif isinstance(data, list):
            for item in data:
                if isinstance(item, dict):
                    a = item.get("author")
                    if isinstance(a, dict) and a.get("name"):
                        return normalize_text(a["name"])
                    if isinstance(a, list) and a:
                        entry = a[0]
                        if isinstance(entry, dict) and entry.get("name"):
                            return normalize_text(entry["name"])
    # 5) Visible text fallback: name + role or twitter handle
    page_text = normalize_text(page.get_text(" "))
    name_match = re.search(r"Eleftheria\s+Batsou", page_text, flags=re.IGNORECASE)
    role_match = re.search(r"Developer\s+Advocate", page_text, flags=re.IGNORECASE)
    twitter_match = re.search(rf"(x\.com|twitter\.com)/{re.escape(AUTHOR_TWITTER_HANDLE)}", page_text, flags=re.IGNORECASE)
    if name_match and (role_match or twitter_match):
        return AUTHOR_NAME
    return None


def parse_post_page(url):
    r = safe_get(url)
    if not r:
        return None
    page = BeautifulSoup(r.text, "html.parser")

    author = detect_author(page)
    if not author or "eleftheria" not in author.lower() or "batsou" not in author.lower():
        log(f"[SKIP] Not authored by {AUTHOR_NAME}: {url} (detected: {author})")
        return None

    # Title
    title = None
    og_title = page.find("meta", property="og:title")
    if og_title and og_title.get("content"):
        title = normalize_text(og_title["content"])
    if not title and page.title and page.title.string:
        title = normalize_text(page.title.string)
    if not title:
        h1 = page.find("h1")
        if h1:
            title = normalize_text(h1.get_text())
    if not title:
        title = url

    # Date
    dt = None
    date_str = ""
    pub_meta = page.find("meta", property="article:published_time")
    if pub_meta and pub_meta.get("content"):
        try:
            dt = dateparser.parse(pub_meta["content"])
        except Exception:
            date_str = pub_meta["content"]
    if not dt:
        time_el = page.find("time")
        candidate = (
            (time_el.get("datetime") if time_el else None)
            or (normalize_text(time_el.get_text()) if time_el else None)
        )
        if candidate:
            try:
                dt = dateparser.parse(candidate)
            except Exception:
                date_str = candidate
    if not dt:
        page_text = normalize_text(page.get_text(" "))
        m = re.search(r"(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2},\s+\d{4}", page_text)
        if m:
            candidate = m.group(0)
            try:
                dt = dateparser.parse(candidate)
            except Exception:
                date_str = candidate
    dt = normalize_date(dt)
    if dt and not date_str:
        date_str = dt.strftime("%Y-%m-%d")

    # Cover image
    cover = ""
    og_img = page.find("meta", property="og:image")
    if og_img and og_img.get("content"):
        cover = og_img["content"].strip()
    if not cover:
        twitter_img = page.find("meta", property="twitter:image")
        if twitter_img and twitter_img.get("content"):
            cover = twitter_img["content"].strip()
    if not cover:
        img = page.select_one("article img") or page.find("img")
        if img and img.get("src"):
            src = img["src"].strip()
            cover = src if src.startswith("http") else f"https://cosine.sh{src}" if src.startswith("/") else src

    return {
        "source": "Cosine",
        "title": title,
        "url": url,
        "cover_image": cover,
        "date": dt,
        "date_str": date_str or (dt.strftime("%Y-%m-%d") if dt else ""),
    }


def fetch_cosine_author_posts():
    index_links = get_cosine_links_from_index()
    sitemap_links = get_cosine_links_from_sitemap()
    manual_links = set(COSINE_AUTHOR_URLS)
    links = sorted(index_links.union(sitemap_links).union(DEFAULT_KNOWN_COSINE).union(manual_links))
    posts = []
    for url in links:
        time.sleep(0.25)  # be gentle
        p = parse_post_page(url)
        if p:
            posts.append(p)
    log(f"[INFO] Cosine posts authored by {AUTHOR_NAME}: {len(posts)}")
    return posts


def render_markdown_grid(posts):
    # HTML grid: 2 columns x 3 rows; images width-limited to 280px
    rows = []
    items = posts[:MAX_POSTS]
    if len(items) % 2 == 1:
        items.append({"title": "", "url": "", "cover_image": "", "source": "", "date_str": ""})

    def cell_html(p):
        if not p.get("title"):
            return "<td></td>"
        img_html = f'<img src="{p["cover_image"]}" alt="cover" style="width:280px; max-width:100%; border-radius:8px;" />' if p.get("cover_image") else ""
        title_html = f'<a href="{p["url"]}">{p["title"]}</a>'
        meta_html = f'{p.get("source","")} • {p.get("date_str","")}'
        return f"<td valign='top' style='padding:8px;'>{img_html}<div style='margin-top:6px; font-weight:600;'>{title_html}</div><div style='color:#666;'>{meta_html}</div></td>"

    for i in range(0, len(items), 2):
        left = cell_html(items[i])
        right = cell_html(items[i + 1])
        rows.append(f"<tr>{left}{right}</tr>")

    html = []
    html.append("")
    html.append("### Recent Articles")
    html.append("")
    html.append("<table>")
    for r in rows[:3]:  # 3 rows max
        html.append(r)
    html.append("</table>")
    html.append("")
    html.append("_Auto-updated daily from dev.to and cosine.sh/blog_")
    html.append("")
    return "\n".join(html)


def update_readme_section(new_content):
    if not os.path.exists(README_PATH):
        print("README.md not found.", file=sys.stderr)
        sys.exit(1)
    with open(README_PATH, "r", encoding="utf-8") as f:
        readme = f.read()
    if START_MARK not in readme or END_MARK not in readme:
        print("Markers not found in README.md. Please add the markers to enable updates.", file=sys.stderr)
        sys.exit(1)
    pattern = re.compile(
        re.escape(START_MARK) + r"(.*?)" + re.escape(END_MARK),
        re.DOTALL
    )
    updated = pattern.sub(
        START_MARK + "\n" + new_content + "\n" + END_MARK,
        readme
    )
    if updated != readme:
        with open(README_PATH, "w", encoding="utf-8") as f:
            f.write(updated)
        print("README.md updated.")
    else:
        print("README.md already up to date.")


def main():
    devto = fetch_devto_posts()
    cosine = fetch_cosine_author_posts()
    all_posts = devto + cosine

    def sort_key(p):
        return p["date"] or datetime.min.replace(tzinfo=timezone.utc)

    all_posts.sort(key=sort_key, reverse=True)
    latest = all_posts[:MAX_POSTS]
    md = render_markdown_grid(latest)
    update_readme_section(md)


if __name__ == "__main__":
    main()
```
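Before wiring everything into the Action, you can test the script locally from the repo root with `RECENT_BLOG_VERBOSE=true python scripts/update_blog.py` and inspect the diff it makes to README.md; the env var just turns on the `log()` output shown above.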
## Debugging: when sites don't behave like APIs
One interesting part of this build is cosine.sh/blog itself. Unlike dev.to, it doesn’t expose a dedicated public JSON API for blog posts, so we:
- Use the blog index for links: https://cosine.sh/blog
- Use the sitemap for a more complete list: https://cosine.sh/sitemap.xml
- Crawl each page and detect the author, the date, and the cover image.
The author detection has to be robust, because templates and metadata can vary. We look for:
- `meta name="author"` or `meta property="article:author"`.
- JSON-LD `author.name`.
- A visible byline that looks like "Eleftheria Batsou", "Developer Advocate", and/or your X/Twitter handle.
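To make that concrete, here's a minimal, invented example of markup that would satisfy the first rule (with JSON-LD as a fallback); the HTML is hypothetical, not copied from cosine.sh:

```python
from bs4 import BeautifulSoup

# Hypothetical page markup (invented for illustration); any one of these
# signals is enough for detect_author() to identify the post's author.
sample_html = """
<meta name="author" content="Eleftheria Batsou">
<script type="application/ld+json">
  {"@type": "BlogPosting", "author": {"@type": "Person", "name": "Eleftheria Batsou"}}
</script>
"""

page = BeautifulSoup(sample_html, "html.parser")
meta = page.find("meta", attrs={"name": "author"})
print(meta["content"])  # -> Eleftheria Batsou (rule 1 fires before the JSON-LD fallback)
```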
This is the kind of thing Cosine is very good at automating. You could easily imagine giving Cosine a task like:
“Make sure my GitHub profile README always shows my last 6 articles from dev.to and cosine.sh/blog. Parse the author name and role correctly, and don’t break if Cosine’s blog markup changes slightly.”
and letting it iterate until the pipelines and selectors are robust.
## Why this is useful
Some practical benefits:
- Your GitHub profile stays up to date without you thinking about it.
- Recruiters or collaborators see your latest thinking, not just your repos.
- You can write wherever it makes sense (personal blog, dev.to, company blog like Cosine) and still have a single “portfolio view” on GitHub.
- The setup is simple: one workflow file, one Python script.
This also fits nicely with the way Cosine approaches development work:
- Small, automatable tasks
- Clear, visible diffs (your README changes are just regular commits)
- Easy to extend over time
## Ideas for future features
Once this is working, there are several directions you can take it:
1. Include tags or topics:
   Parse tags from dev.to and Cosine and show them under each article title:
   - "React, TypeScript"
   - "AI tooling, Developer Experience"
2. Filter by category:
   For example, only show "Insights" posts from Cosine:
   - Filter by a URL pattern like /blog/… plus a tag
   - Or parse category labels from the page.
3. Add a fallback text mode:
   Some people prefer plain Markdown instead of HTML tables. You could add a configuration flag that switches between:
   - Grid layout (HTML table, smaller covers)
   - Simple list layout (Markdown): `- [Title](url) — Source • 2025-11-17`
   There's a sketch of this (combined with the tags from idea 1) right after this list.
4. Cache responses:
   To be friendly to dev.to and Cosine, you could store a simple cache file in the repo (or in Actions artifacts). This isn't strictly necessary for a daily cron, but it becomes helpful if you start running the workflow more frequently; see the cache sketch after the list.
5. Integrate Cosine directly:
   Right now, we're using raw Python + GitHub Actions. You could use Cosine to:
   - Generate and maintain this script over time.
   - Automatically adjust selectors when the Cosine blog layout changes.
   - Add tests to validate that your Cosine posts are still being detected correctly.
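For ideas 1 and 3, a plain-Markdown renderer is only a few lines. A hypothetical sketch (`render_markdown_list` is not part of the script above, and it assumes you've added a `tags` key to each post dict, e.g. from the `tag_list` field the dev.to API returns):

```python
def render_markdown_list(posts):
    # Hypothetical alternative to render_markdown_grid(): one bullet per post,
    # with optional tags shown after the date.
    lines = []
    for p in posts:
        line = f'- [{p["title"]}]({p["url"]}) — {p["source"]} • {p["date_str"]}'
        if p.get("tags"):
            line += f' — _{", ".join(p["tags"])}_'
        lines.append(line)
    return "\n".join(lines)
```

Switching between the two layouts could then be a single env var check in `main()`.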
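And for idea 4, a tiny JSON file is plenty. A sketch that caches raw response bodies keyed by URL (`CACHE_PATH` and the TTL are arbitrary choices for illustration, not part of the script above):

```python
import json
import os
import time

CACHE_PATH = ".blog_cache.json"  # hypothetical filename
TTL = 6 * 60 * 60  # re-fetch after 6 hours

def cached_fetch(url, fetch_fn):
    # Load the cache, reuse a fresh entry if we have one, otherwise fetch and store.
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH, encoding="utf-8") as f:
            cache = json.load(f)
    entry = cache.get(url)
    if entry and time.time() - entry["ts"] < TTL:
        return entry["body"]
    body = fetch_fn(url)
    cache[url] = {"ts": time.time(), "body": body}
    with open(CACHE_PATH, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    return body
```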
This is similar to another workflow where we synced YouTube videos into a GitHub profile. These are exactly the kinds of “small but annoying” tasks that Cosine can take off your plate while keeping everything transparent and reviewable.
## Conclusion
We’ve built a small but powerful system:
→ A scheduled GitHub Action
→ A Python script that:
- Fetches your dev.to posts via their API
- Scrapes cosine.sh/blog and filters posts authored by you
- Merges, sorts, and renders the latest 6 into a 2×3 grid
- Updates your GitHub profile README automatically
It’s not a big framework or a complicated service—just a focused tool that keeps your profile in sync with your writing across platforms.
This is the sweet spot where tools like Cosine shine: automating the repetitive glue work around your developer presence, while still giving you full control and visibility.
If you’d like to go further, you could:
- Extend this to YouTube, personal blogs, or newsletters.
- Have Cosine manage the workflow, selectors, and tests for you.
- Turn this into a reusable template for your team’s GitHub profiles.
Either way, you now have a pattern: small automations that keep your developer identity consistent across platforms—with your GitHub profile as the source of truth.
Happy Coding ✌️