SEN LLC

Building a Changelog Generator in Rust — From Conventional Commits to Keep a Changelog

A Rust CLI that parses conventional commit messages from git log and produces a structured CHANGELOG.md grouped by type, with version headers, scope annotations, and breaking change detection.

Every project reaches the point where someone asks, "What changed in this release?" If your team uses conventional commits, the answer is already embedded in your git history. The problem is extracting it. You could scroll through git log, mentally categorize each commit, and type up a changelog by hand. Or you could write a tool that does it in under a second.

I built changelog-rs to solve exactly that. It reads git log output, parses conventional commit subjects, groups them by type, and renders a changelog in Keep a Changelog format. No configuration files. No plugins. No runtime dependencies beyond git itself.

📦 GitHub: https://github.com/sen-ltd/changelog-rs

What conventional commits look like

The conventional commits specification defines a structured format for commit messages:

type(scope): description

optional body

optional footer

The type tells you what kind of change it is: feat for new features, fix for bug fixes, refactor for code restructuring, perf for performance improvements, and so on. The optional scope narrows the area of change. A trailing ! or a BREAKING CHANGE: footer marks the commit as a breaking change.

Here are some real examples:

feat(auth): add OAuth2 support
fix: resolve crash on empty input
refactor(api)!: remove deprecated endpoints
docs: update installation guide

The tool recognizes ten types: feat, fix, refactor, perf, docs, test, build, ci, style, and chore, plus tests as an alias for test. Anything that does not match the conventional format is silently skipped: merge commits, WIP messages, and freeform subjects are ignored without error.

Architecture: three pure functions

The entire library fits into three functions. This is deliberate. A changelog generator does not need a plugin system, a template engine, or a configuration DSL. It needs to parse lines, group them, and render text.

Parsing

pub fn parse_commit(line: &str) -> Option<Commit> {
    let line = line.trim();
    if line.is_empty() {
        return None;
    }

    let (hash, subject) = split_hash_subject(line)?;

    let re = Regex::new(
        r"^(feat|fix|refactor|perf|docs|test|tests|build|ci|style|chore)(\([^)]+\))?(!)?\s*:\s*(.+)$"
    ).unwrap();

    let caps = re.captures(subject)?;
    // ... extract fields, return Some(Commit { ... })
}

The function takes a single line in <hash> <subject> format and returns Option<Commit>. The regex matches the conventional commit pattern, extracting the type, optional scope (in parentheses), optional breaking change marker (!), and the description after the colon. If the line does not match, it returns None.

A key design decision here is that the hash is optional. When piping from git log --format="%H %s", each line starts with a 40-character hex hash. But the tool also accepts bare subjects like feat: add feature with no hash prefix. This makes it easy to test and to use with custom input formats.
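The split_hash_subject helper is not shown in this post. As a rough sketch of how an optional hash prefix could be handled, the version below treats the first token as a hash only when it is 7 to 40 hex digits. The return type and the length bounds are assumptions made for illustration, not the code in the repository.

```rust
/// Hypothetical sketch of `split_hash_subject`; the real helper may differ.
/// Returns `(hash, subject)`, where `hash` is `None` for bare subjects.
pub fn split_hash_subject(line: &str) -> Option<(Option<&str>, &str)> {
    let line = line.trim();
    if line.is_empty() {
        return None;
    }
    // Split at the first whitespace: candidate hash on the left, subject on the right.
    if let Some((first, rest)) = line.split_once(char::is_whitespace) {
        // Assumption: abbreviated (7+) or full (40) hex hashes count as a hash.
        let looks_like_hash = (7..=40).contains(&first.len())
            && first.chars().all(|c| c.is_ascii_hexdigit());
        if looks_like_hash {
            return Some((Some(first), rest.trim_start()));
        }
    }
    // No hash prefix: the whole line is the subject.
    Some((None, line))
}
```

One caveat of this heuristic: a freeform subject whose first word happens to be seven or more hex letters would be misread as a hash. Such a line would not match the conventional format either way, so it ends up skipped in both cases.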

The Commit struct captures everything the renderer needs:

pub struct Commit {
    pub hash: String,
    pub category: Category,
    pub scope: Option<String>,
    pub description: String,
    pub breaking: bool,
}

The Category enum maps commit types to changelog section headings. feat becomes "Added", fix becomes "Fixed", and so on — following Keep a Changelog conventions rather than echoing the raw type string.
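To make the mapping concrete, here is a minimal sketch of what such an enum could look like. The variants Feat, Fix, Refactor, and Test appear in the tests later in this post; the remaining variant names and the method names from_type and heading are assumptions for illustration.

```rust
/// Sketch of a category enum. Variant names beyond Feat, Fix, Refactor and
/// Test are assumed, as are the method names below.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Category {
    Feat, Fix, Refactor, Perf, Docs, Test, Build, Ci, Style, Chore,
}

impl Category {
    /// Maps a commit type string to a category; unknown types yield `None`.
    pub fn from_type(t: &str) -> Option<Self> {
        Some(match t {
            "feat" => Self::Feat,
            "fix" => Self::Fix,
            "refactor" => Self::Refactor,
            "perf" => Self::Perf,
            "docs" => Self::Docs,
            "test" | "tests" => Self::Test, // plural alias
            "build" => Self::Build,
            "ci" => Self::Ci,
            "style" => Self::Style,
            "chore" => Self::Chore,
            _ => return None,
        })
    }

    /// Section heading used in the rendered changelog.
    pub fn heading(self) -> &'static str {
        match self {
            Self::Feat => "Added",
            Self::Fix => "Fixed",
            Self::Refactor => "Refactored",
            Self::Perf => "Performance",
            Self::Docs => "Documentation",
            Self::Test => "Tests",
            Self::Build => "Build",
            Self::Ci => "CI",
            Self::Style => "Style",
            Self::Chore => "Chores",
        }
    }
}
```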

Grouping

pub fn group_commits(commits: &[Commit]) -> BTreeMap<Category, Vec<Commit>> {
    let mut groups: BTreeMap<Category, Vec<Commit>> = BTreeMap::new();
    for commit in commits {
        groups.entry(commit.category).or_default().push(commit.clone());
    }
    groups
}

This function is almost trivially simple, but the choice of BTreeMap over HashMap is intentional. A BTreeMap iterates in key order, and the key here is Category, so the order follows the enum's Ord implementation (for a derived Ord, the order in which the variants are declared). The changelog sections therefore always appear in the same deterministic order: Added, Fixed, Refactored, Performance, Documentation, Tests, Build, CI, Style, Chores. No sorting step needed; the data structure handles it.
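The ordering behavior is easy to check in isolation. The sketch below uses a cut-down, hypothetical Section enum with three variants and shows that insertion order does not matter: keys come back in declaration order because Ord is derived.

```rust
use std::collections::BTreeMap;

/// Hypothetical three-variant stand-in for the real category enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Section {
    Added,
    Fixed,
    Chores,
}

/// Groups (section, entry) pairs and returns the sections in iteration order.
fn section_order(entries: &[(Section, &str)]) -> Vec<Section> {
    let mut groups: BTreeMap<Section, Vec<&str>> = BTreeMap::new();
    for &(section, text) in entries {
        groups.entry(section).or_default().push(text);
    }
    // BTreeMap yields keys sorted by Ord, which for a derived Ord
    // is the variant declaration order.
    groups.keys().copied().collect()
}
```

Feeding it entries in the order Chores, Added, Fixed still returns Added, Fixed, Chores. With a HashMap the order would be unspecified and could change between runs.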

Rendering

The renderer produces Keep a Changelog format. Every changelog starts with a standard header explaining the format and linking to the specification. Then comes the version section.

pub fn render_changelog(
    groups: &BTreeMap<Category, Vec<Commit>>,
    opts: &RenderOpts
) -> String {
    let mut out = String::new();
    out.push_str("# Changelog\n\n");
    out.push_str("All notable changes to this project ...\n\n");

    if let Some(ref version) = opts.version {
        let date = opts.date.as_deref().unwrap_or("Unreleased");
        out.push_str(&format!("## [{version}] - {date}\n\n"));
    } else {
        out.push_str("## [Unreleased]\n\n");
    }

    // Breaking changes section first
    // Then each category with its commits
    // (both steps are elided here)

    out
}

Breaking changes get their own section at the top, regardless of which commit type they came from. A feat!: drop Node 14 support appears both in the breaking changes section and in the Added section. This mirrors how tools like conventional-changelog handle it — breaking changes deserve prominent visibility.
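The breaking changes pass is elided from the renderer above. One way it could work, as a sketch under assumptions: walk every group, keep the commits flagged as breaking, and emit them under their own heading. The "### Breaking Changes" heading text and the trimmed-down Commit and Category types are illustrative, not taken from the repository.

```rust
use std::collections::BTreeMap;

/// Trimmed-down stand-ins for the real types, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Category {
    Feat,
    Fix,
}

#[derive(Debug, Clone)]
pub struct Commit {
    pub category: Category,
    pub description: String,
    pub breaking: bool,
}

/// Renders the breaking changes block, or an empty string if there are none.
/// The heading text is an assumption.
pub fn render_breaking(groups: &BTreeMap<Category, Vec<Commit>>) -> String {
    // Collect breaking commits from every category, in section order.
    let breaking: Vec<&Commit> = groups
        .values()
        .flatten()
        .filter(|c| c.breaking)
        .collect();
    if breaking.is_empty() {
        return String::new();
    }
    let mut out = String::from("### Breaking Changes\n\n");
    for commit in breaking {
        out.push_str("- ");
        out.push_str(&commit.description);
        out.push('\n');
    }
    out.push('\n');
    out
}
```

Because the commits are only borrowed here, the same entries remain in their groups and are rendered again in their regular section afterwards.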

Each commit line includes the optional scope in bold and the optional abbreviated hash in parentheses:

- **auth**: add OAuth2 support (a1b2c3d)
- add dark mode toggle (b2c3d4e)
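A single entry line like the ones above can be produced by a small helper. This is a sketch only: the function name format_entry and its parameters are hypothetical, and truncating the hash to seven characters is an assumption based on the sample output.

```rust
/// Formats one changelog bullet. Name and signature are illustrative.
pub fn format_entry(scope: Option<&str>, description: &str, hash: Option<&str>) -> String {
    let mut line = String::from("- ");
    if let Some(scope) = scope {
        // Scope is rendered in bold, followed by a colon.
        line.push_str(&format!("**{scope}**: "));
    }
    line.push_str(description);
    if let Some(hash) = hash {
        // Assumption: abbreviate to 7 characters; shorter input is kept as-is.
        let short = hash.get(..7).unwrap_or(hash);
        line.push_str(&format!(" ({short})"));
    }
    line
}
```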

The CLI shell

The main.rs file is a thin shell around the library. It handles three input modes:

Auto mode (default): Runs git log --format="%H %s" as a subprocess and captures its output. This is the zero-configuration path — run changelog-rs in any git repository and get a changelog.

Stdin mode: When --stdin is passed or stdin is not a terminal, the tool reads lines from stdin. This supports piping: git log --format="%H %s" v1.0.0..HEAD | changelog-rs --stdin.

Range mode: The --from and --to flags construct a git log range. changelog-rs --from v1.0.0 --to HEAD is equivalent to running git log v1.0.0..HEAD internally.
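As a sketch of how the range could be assembled, the helper below builds the argument list and hands it to std::process::Command. The function names are hypothetical. What the real tool does when only one of the two flags is given is not covered in this post; here a missing --to falls back to HEAD, which is an assumption.

```rust
use std::process::Command;

/// Builds the arguments for `git log`. Handling of a lone `from` or `to`
/// is an assumption, not documented behavior of the tool.
pub fn git_log_args(from: Option<&str>, to: Option<&str>) -> Vec<String> {
    let mut args = vec!["log".to_string(), "--format=%H %s".to_string()];
    match (from, to) {
        // `from..to`, defaulting the upper bound to HEAD.
        (Some(from), to) => args.push(format!("{from}..{}", to.unwrap_or("HEAD"))),
        // Only an upper bound: everything reachable from it.
        (None, Some(to)) => args.push(to.to_string()),
        // No range: the whole history of the current branch.
        (None, None) => {}
    }
    args
}

/// Runs `git log` in the current directory and returns its stdout.
pub fn run_git_log(from: Option<&str>, to: Option<&str>) -> std::io::Result<String> {
    let output = Command::new("git").args(git_log_args(from, to)).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Passing the arguments as a list rather than through a shell means tag names with unusual characters need no quoting.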

Terminal detection uses std::io::IsTerminal, which was stabilized in Rust 1.70. When stdin is not a terminal (i.e., data is being piped in), the tool automatically switches to stdin mode without requiring the --stdin flag. This follows the Unix convention of tools that read from pipes transparently.
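The mode selection described above can be sketched with the decision kept separate from the terminal check, so it can be tested without a real TTY. The InputMode enum and both function names are hypothetical; only std::io::IsTerminal comes from the standard library.

```rust
use std::io::{self, IsTerminal};

/// Hypothetical enum naming the two input sources.
#[derive(Debug, PartialEq, Eq)]
pub enum InputMode {
    Git,
    Stdin,
}

/// Pure decision logic: read stdin if asked to, or if something is piped in.
pub fn choose_mode(stdin_flag: bool, stdin_is_terminal: bool) -> InputMode {
    if stdin_flag || !stdin_is_terminal {
        InputMode::Stdin
    } else {
        InputMode::Git
    }
}

/// Wires the decision to the real stdin handle.
pub fn detect_mode(stdin_flag: bool) -> InputMode {
    choose_mode(stdin_flag, io::stdin().is_terminal())
}
```

One consequence worth knowing: in environments where stdin is not a terminal but nothing is piped in (some CI runners, for example), this logic also selects stdin mode.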

Avoiding unnecessary dependencies

The temptation with a Rust CLI is to pull in chrono for date formatting, serde for structured output, and tokio for async subprocess execution. Each dependency adds compile time, binary size, and a surface area for supply chain issues.

This tool uses two dependencies: clap for argument parsing and regex for commit parsing. For dates, it shells out to the system date command. This is a pragmatic trade-off: the tool only needs today's date in YYYY-MM-DD format, and every Unix system has date. The alternative, adding chrono and its transitive dependencies, is hard to justify for a single date +%Y-%m-%d call. The cost is portability: this approach assumes a Unix-like environment.
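Shelling out for the date takes only a few lines. The sketch below is one way to do it, with hypothetical function names. It validates the output shape so that an unexpected date implementation yields None instead of a malformed header.

```rust
use std::process::Command;

/// Returns today's date as YYYY-MM-DD by running the system `date` command.
/// Yields `None` if the command is missing, fails, or prints something else.
pub fn today() -> Option<String> {
    let output = Command::new("date").arg("+%Y-%m-%d").output().ok()?;
    if !output.status.success() {
        return None;
    }
    let text = String::from_utf8(output.stdout).ok()?;
    let date = text.trim();
    is_iso_date(date).then(|| date.to_string())
}

/// Checks the shape only (four digits, dash, two digits, dash, two digits),
/// not whether the date exists on the calendar.
pub fn is_iso_date(s: &str) -> bool {
    let bytes = s.as_bytes();
    bytes.len() == 10
        && bytes.iter().enumerate().all(|(i, b)| match i {
            4 | 7 => *b == b'-',
            _ => b.is_ascii_digit(),
        })
}
```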

The release profile is optimized for small binaries:

[profile.release]
strip = true
lto = true
codegen-units = 1
opt-level = "z"
panic = "abort"

This produces a binary around 1 MB on macOS, small enough to distribute as a standalone tool without concern for disk space.

Testing strategy

The test suite covers 20 cases across three layers:

Parsing tests verify that each commit type is recognized, that scopes are extracted correctly, that breaking changes are detected via both ! suffix and BREAKING CHANGE: footer, that non-conventional commits return None, and that edge cases like empty lines, missing hashes, and full 40-character hashes are handled.

#[test]
fn test_parse_feat_with_scope() {
    let c = parse_commit("abc1234 feat(parser): add JSON support").unwrap();
    assert_eq!(c.category, Category::Feat);
    assert_eq!(c.scope.as_deref(), Some("parser"));
    assert_eq!(c.description, "add JSON support");
    assert!(!c.breaking);
}

#[test]
fn test_parse_non_conventional_returns_none() {
    assert!(parse_commit("abc1234 Update README").is_none());
    assert!(parse_commit("abc1234 Merge branch 'main'").is_none());
}

Grouping tests verify that commits are correctly distributed into categories and that empty input produces empty output.

Rendering tests verify the complete markdown output: the Keep a Changelog header, version sections with dates, the breaking changes section, scope formatting, hash formatting, and the empty-group fallback message.

The test that verifies all ten commit types deserves a closer look:

#[test]
fn test_parse_all_categories() {
    let types = [
        ("feat", Category::Feat),
        ("fix", Category::Fix),
        ("refactor", Category::Refactor),
        // ... all 11 type strings (including "tests" alias)
    ];
    for (t, expected) in types {
        let line = format!("abc1234 {t}: something");
        let c = parse_commit(&line).unwrap();
        assert_eq!(c.category, expected, "failed for type {t}");
    }
}

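This table-driven approach keeps coverage cheap: when a new commit type is added to the regex, the matching test case is one more row in the table. Nothing enforces that the row gets added, but the table makes an omission easy to spot in review. The "tests" alias (plural) maps to the same Test category as "test", because both forms appear in the wild.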

Docker support

The Dockerfile uses a two-stage build. The builder stage compiles the release binary on rust:1.90-alpine. The runtime stage copies the binary into a minimal alpine:3.20 image with git installed. Git is necessary because the default mode runs git log as a subprocess.

FROM alpine:3.20
RUN apk add --no-cache git && adduser -D -u 1000 app
WORKDIR /work
USER app
COPY --from=builder /build/target/release/changelog-rs /usr/local/bin/changelog-rs
ENTRYPOINT ["/usr/local/bin/changelog-rs"]

Mount your repository as a volume to use it:

docker run --rm -v "$(pwd):/work" changelog-rs --version 1.0.0

Usage patterns

The simplest use case is generating an unreleased changelog for the current repository:

$ cd my-project
$ changelog-rs

For a release, add a version header:

$ changelog-rs --version 2.0.0

This automatically adds today's date. To override the date:

$ changelog-rs --version 2.0.0 --date 2026-04-01

For a changelog covering only the changes since the last release:

$ changelog-rs --from v1.0.0 --to HEAD --version 2.0.0

To write directly to a file:

$ changelog-rs --version 2.0.0 -o CHANGELOG.md

To include commit hashes for traceability:

$ changelog-rs --hashes

Design decisions worth noting

Why regex instead of a handwritten parser? The conventional commit subject format is regular. It is a type, an optional parenthesized scope, an optional !, a colon, and a description. A single regex captures all four groups in one pass. A handwritten parser would be longer and harder to read, and for subjects this short any speed difference is unlikely to matter.

Why Option<Commit> instead of Result? Non-conventional commits are not errors. They are expected. A repository with 100 commits might have 60 conventional ones and 40 merge commits, WIP messages, and freeform subjects. Returning None for non-matching lines lets the caller use filter_map naturally. Returning Err would force error handling for a non-error condition.
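To show what that looks like at the call site, here is a small sketch. parse_subject is a deliberately simplified, hypothetical stand-in for parse_commit (no hash, scope, or ! handling, and only three types), so the example runs without the regex crate.

```rust
/// Simplified stand-in for `parse_commit`: accepts `type: description`
/// for a few known types and returns `None` for everything else.
fn parse_subject(line: &str) -> Option<(&str, &str)> {
    let (kind, description) = line.split_once(':')?;
    matches!(kind, "feat" | "fix" | "docs").then(|| (kind, description.trim()))
}

/// Keeps only the lines that parse; non-conventional lines drop out silently.
pub fn parse_all(log: &str) -> Vec<(&str, &str)> {
    log.lines().filter_map(parse_subject).collect()
}
```

With a Result return type, the same call site would need an explicit .ok() or a match just to discard errors that are not really errors.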

Why Keep a Changelog format? It is one of the most widely recognized changelog formats. Tools like Dependabot and Renovate can parse it. Humans can read it without explanation. Section headings such as "Added" and "Fixed" are clear enough that even non-developers can understand what changed.

Why deterministic section ordering? Using BTreeMap means the changelog sections always appear in the same order regardless of which commits are present. If a release only has features and fixes, the "Added" section always comes before "Fixed". This consistency makes changelogs easier to scan across releases.

What I would add next

If this tool were to grow, the most useful additions would be:

  1. Append mode: Read an existing CHANGELOG.md and prepend the new version section, preserving previous entries.
  2. Template support: Allow custom section headings and formatting via a simple template file.
  3. Commit body extraction: Parse multi-line commit messages to include detailed descriptions under each entry.
  4. Link generation: Auto-generate GitHub compare links for version headers.

But for now, the tool does one thing — converts conventional commits to a changelog — and does it reliably. That is enough to be useful.


changelog-rs is part of the SEN portfolio — a collection of 200 open-source tools and libraries built to demonstrate practical software engineering.
