YMori

Cross-Repo README Sync with GitHub Actions — Push vs Pull Pattern

The Problem

When you manage multiple GitHub repositories, you often want to display stats from one repo in another — for example, showing contribution counts in your profile README.

Manually updating these numbers is error-prone. Lists get out of sync, numbers become stale, and you forget to update after changes.

This article covers how to build cross-repo README sync with GitHub Actions, and a key architectural decision that saves you from permission headaches.

Two Approaches: Push vs Pull

Push: Source repo writes to target

Source repo → (PAT) → Update target repo's README
  • Requires a Personal Access Token (PAT)
  • Fine-grained PATs can unexpectedly return 403 even with correct permissions
  • PAT management overhead (rotation, scope, etc.)

Pull: Target repo reads from source

Target repo → (GITHUB_TOKEN) → Read source repo's README via API
            → (GITHUB_TOKEN) → Update own README
  • No PAT needed: GITHUB_TOKEN can write to its own repo once the workflow grants contents: write
  • Public repo data is readable without extra credentials; unauthenticated API calls work too, at a lower rate limit
  • Just add a workflow to the target repo

Verdict: Pull wins. It eliminates PAT management entirely.
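
To illustrate the read side of the pull approach, here is a minimal sketch that fetches a public README with no token at all, using only the Python standard library. The function names are hypothetical and not part of the script later in this article. Unauthenticated requests are limited to 60 per hour per IP address, which is one reason the workflow further down still passes GITHUB_TOKEN to gh.

```python
import base64
import json
import urllib.request


def decode_contents_payload(payload: dict) -> str:
    """Decode the base64 `content` field of a Contents API response."""
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # GitHub wraps the base64 text with newlines; b64decode skips them.
    return base64.b64decode(payload["content"]).decode("utf-8")


def fetch_public_readme(owner: str, repo: str) -> str:
    """Read README.md from a public repo without sending any token."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/README.md"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return decode_contents_payload(json.load(response))
```

Splitting the decoding from the network call keeps the decoding step testable offline.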

Implementation

1. HTML Comment Markers

Mark the auto-updated sections in your target README:

## Stats

<!-- STATS_START -->(10 PRs / 5 Merged)<!-- STATS_END --> across 3 repositories.

Only the content between markers gets replaced — everything else stays untouched.

2. Python Sync Script

import base64
import re
import subprocess
import sys
from pathlib import Path

README = Path(__file__).resolve().parent.parent / "README.md"


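def run(cmd: list[str]) -> str:
    """Run a command and return its stdout, or "" if it exits non-zero."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
    if result.returncode != 0:
        # Report the failure instead of passing unreliable output downstream
        print(result.stderr.strip(), file=sys.stderr)
        return ""
    return result.stdout.strip()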


def fetch_source_readme(owner: str, repo: str) -> str | None:
    """Fetch README via GitHub API (public repos need no PAT; gh authenticates with GH_TOKEN)."""
    output = run([
        "gh", "api",
        f"repos/{owner}/{repo}/contents/README.md",
        "--jq", ".content",
    ])
    if not output:
        return None
    return base64.b64decode(output).decode("utf-8")


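def replace_marker(text: str, marker: str, replacement: str) -> str:
    """Replace content between HTML comment markers."""
    name = re.escape(marker)
    pattern = rf"(<!-- {name}_START -->).*?(<!-- {name}_END -->)"
    # A function replacement inserts the text literally, so digits or
    # backslashes in `replacement` are never read as group references.
    return re.sub(
        pattern, lambda m: m.group(1) + replacement + m.group(2), text, flags=re.DOTALL
    )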


def parse_stats(source_text: str) -> dict:
    """Extract stats from a markdown summary table."""
    m = re.search(
        r"\| \*\*Total\*\* \|.*?\| \*\*(\d+)\*\* \| \*\*(\d+)\*\*",
        source_text,
    )
    if not m:
        return {}
    return {"total": int(m.group(1)), "merged": int(m.group(2))}


def main():
    source = fetch_source_readme("your-org", "your-source-repo")
    if not source:
        print("Failed to fetch source README", file=sys.stderr)
        sys.exit(1)

    stats = parse_stats(source)
    if not stats:
        print("Failed to parse stats", file=sys.stderr)
        sys.exit(1)

    readme = README.read_text(encoding="utf-8")
    readme = replace_marker(
        readme, "STATS",
        f"({stats['total']} PRs / {stats['merged']} Merged)",
    )
    README.write_text(readme, encoding="utf-8")
    print(f"Updated: {stats['total']} PRs / {stats['merged']} Merged")


if __name__ == "__main__":
    main()
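
The regular expression in parse_stats is strict about the shape of the source table, so a worked example may help. The sketch below reuses the same pattern against a hypothetical source README (the project names and numbers are invented for illustration). Note that the pattern needs at least one column between the **Total** label and the two bold numbers.

```python
import re

# Same pattern as parse_stats in the script above
TOTAL_ROW = re.compile(r"\| \*\*Total\*\* \|.*?\| \*\*(\d+)\*\* \| \*\*(\d+)\*\*")

# Hypothetical summary table; names and numbers are made up
SAMPLE = """\
| Project | Repos | PRs | Merged |
|---|---|---|---|
| alpha | 1 | 6 | 3 |
| beta | 2 | 4 | 2 |
| **Total** | 3 | **10** | **5** |
"""


def parse_total_row(text: str) -> dict[str, int]:
    """Return the two bold numbers from the Total row, or {} if absent."""
    m = TOTAL_ROW.search(text)
    if not m:
        return {}
    return {"total": int(m.group(1)), "merged": int(m.group(2))}


print(parse_total_row(SAMPLE))
# A three-column row has nothing between the label and the numbers,
# so the pattern does not match and the result is empty.
print(parse_total_row("| **Total** | **10** | **5** |"))
```

If the source table has a different layout, the pattern has to change with it; the script exits with "Failed to parse stats" rather than writing wrong numbers.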

3. Workflow

name: Sync README Stats

on:
  schedule:
    # Run after the source repo's update schedule
    - cron: '30 9 * * 1'
  workflow_dispatch:

permissions:
  contents: write

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Sync stats from source repo
        env:
          GH_TOKEN: ${{ github.token }}
        run: python scripts/sync_stats.py

      - name: Commit and push if changed
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add README.md
          if ! git diff --cached --quiet; then
            git commit -m "docs: sync stats $(date -u +%Y-%m-%d)"
            git push
          fi
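
The commit step skips empty commits by checking git diff --cached --quiet. The same idea can be applied inside the script, so its log output says whether anything actually changed. This is a minimal sketch; write_if_changed is a hypothetical helper, not part of the script above.

```python
from pathlib import Path


def write_if_changed(path: Path, new_text: str) -> bool:
    """Write new_text to path only when it differs; return True if written."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else None
    if old_text == new_text:
        return False
    path.write_text(new_text, encoding="utf-8")
    return True
```

In main(), the README.write_text call could be swapped for this helper, printing "Updated" or "No change" depending on the return value.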

Common Pitfalls

PAT 403 Errors

With the push approach, Fine-grained PATs can return 403 even when configured with "All repositories" and "Contents: Read and write":

remote: Permission to user/repo.git denied to user.
fatal: unable to access '...': The requested URL returned error: 403

A PUT request to the GitHub Contents API fails with the same 403. Rather than debugging token permissions, switching to the pull approach is the most reliable fix.

Cron Timing

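If your source repo updates at 09:00 UTC on Mondays, schedule the sync workflow for 09:30 or later. Cron times in GitHub Actions are in UTC, and scheduled runs can start late under load, so leave generous headroom: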

# Bad: same time as source → may fetch stale data
- cron: '0 9 * * 1'

# Good: after source update completes
- cron: '30 9 * * 1'
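
The gap between the two schedules is simple arithmetic, which can be sketched as below. This assumes both schedules are plain weekly expressions of the form "M H * * D" with single numeric values; the helper names are hypothetical, and real cron syntax (ranges, lists, steps) is deliberately not handled.

```python
def cron_minute_of_week(expr: str) -> int:
    """Minutes since Sunday 00:00 UTC for a simple 'M H * * D' expression."""
    minute, hour, dom, month, dow = expr.split()
    if dom != "*" or month != "*":
        raise ValueError("only weekly 'M H * * D' expressions are supported")
    # Day 7 is treated as Sunday, like day 0
    return (int(dow) % 7) * 24 * 60 + int(hour) * 60 + int(minute)


def stagger_minutes(source_cron: str, sync_cron: str) -> int:
    """Minutes from the source run to the next sync run, wrapping weekly."""
    week = 7 * 24 * 60
    return (cron_minute_of_week(sync_cron) - cron_minute_of_week(source_cron)) % week


print(stagger_minutes("0 9 * * 1", "30 9 * * 1"))  # 30
print(stagger_minutes("0 9 * * 1", "0 9 * * 1"))   # 0
```

A result of 0 corresponds to the "Bad" case above: both workflows are scheduled for the same minute.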

Marker Design

Use unique marker names per section to avoid collisions:

<!-- PROJECT_STATS_START -->...<!-- PROJECT_STATS_END -->
<!-- BADGE_COUNT_START -->...<!-- BADGE_COUNT_END -->

The replace_marker function only touches content between markers, so the rest of your README is safe.
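
One caveat: re.sub returns the text unchanged when a marker pair is missing, and it replaces every matching pair when a name is reused, so a typo in a marker can go unnoticed. A small check like the following could run before the replacement. It is a sketch that assumes marker names consist of uppercase letters, digits, and underscores, as in the examples above; check_markers is a hypothetical helper.

```python
import re

# Matches <!-- NAME_START --> and <!-- NAME_END -->
MARKER = re.compile(r"<!-- ([A-Z0-9_]+?)_(START|END) -->")


def check_markers(text: str) -> list[str]:
    """Return a list of problems found with START/END marker pairs."""
    problems = []
    open_name = None  # name of the marker currently open, if any
    seen = set()
    for m in MARKER.finditer(text):
        name, kind = m.group(1), m.group(2)
        if kind == "START":
            if open_name is not None:
                problems.append(f"{name}_START appears before {open_name}_END")
            if name in seen:
                problems.append(f"{name} is used more than once")
            seen.add(name)
            open_name = name
        else:
            if open_name != name:
                problems.append(f"{name}_END has no matching START")
            open_name = None
    if open_name is not None:
        problems.append(f"{open_name}_START is never closed")
    return problems
```

An empty list means every marker is paired, unique, and not nested.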

Summary

| Principle | Description |
|---|---|
| Use pull, not push | Place the workflow in the target repo, use GITHUB_TOKEN |
| HTML comment markers | Isolate auto-updated sections from manual content |
| Stagger cron schedules | Run sync after the source has finished updating |
| Single Source of Truth | One canonical data source, everything else pulls from it |

This pattern works for any cross-repo data sync — contribution stats, package versions, badge counts, or anything else you want to keep consistent across repositories.
