git diff shows you lines. But when you're reviewing code, you think in functions, classes, and methods.
I built https://github.com/Ataraxy-Labs/sem, a CLI that uses tree-sitter to break source code into semantically meaningful chunks and diff them as individual entities.
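To make "entities" concrete, here is a minimal sketch of the idea, not sem's code. sem uses tree-sitter; this example swaps in Python's built-in `ast` module so it runs without extra dependencies. The `list_entities` helper and the sample source are hypothetical.

```python
import ast

def list_entities(source: str):
    """Return (kind, name, start_line, end_line) for each function and class."""
    entities = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entities.append(("function", node.name, node.lineno, node.end_lineno))
        elif isinstance(node, ast.ClassDef):
            entities.append(("class", node.name, node.lineno, node.end_lineno))
    # Order by starting line so the listing reads top to bottom.
    return sorted(entities, key=lambda e: e[2])

# Hypothetical sample input, used only for illustration.
SAMPLE = '''\
class Parser:
    def visit_node(self, node):
        return node

def extract_name(node):
    return node.name
'''

entities = list_entities(SAMPLE)
for kind, name, start, end in entities:
    print(f"{kind} {name} [{start}-{end}]")
```

Once the source is a list of named entities with line ranges, diffing, blaming, and logging can all work on those units instead of raw lines.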
What it looks like
On a recent commit that added Dart language support, git diff showed x lines of changes across lock files and source code. sem diff showed this:
crates/sem-core/src/parser/plugins/code/entity_extractor.rs
∆ function find_name_byte_range [modified]
∆ function visit_node [modified]
⊖ function extract_name [deleted]
⊕ function walk_dart_class_member [added]
⊕ function map_class_member_type [added]
5 entities changed. That's what a reviewer actually needs to know.
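A rough sketch of how an entity-level diff can be computed, assuming each version of a file has already been reduced to a map of entity name to body text. How sem actually compares entities isn't covered here, so the whitespace-normalized hash and the `diff_entities` helper are assumptions made for illustration.

```python
import hashlib

def fingerprint(body: str) -> str:
    # Assumption: collapse whitespace so pure reformatting is not a change.
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def diff_entities(old: dict, new: dict):
    """Classify each entity as added, deleted, or modified."""
    changes = []
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            changes.append((name, "added"))
        elif name not in new:
            changes.append((name, "deleted"))
        elif fingerprint(old[name]) != fingerprint(new[name]):
            changes.append((name, "modified"))
    return changes

# Hypothetical before/after snapshots; the bodies are made up.
old = {
    "visit_node": "fn visit_node() { walk(); }",
    "extract_name": "fn extract_name() {}",
    "helper": "fn helper() {}",
}
new = {
    "visit_node": "fn visit_node() { walk_all(); }",
    "helper": "fn  helper()  {}",  # whitespace-only change, ignored
    "walk_dart_class_member": "fn walk_dart_class_member() {}",
}

changes = diff_entities(old, new)
for name, status in changes:
    print(f"function {name} [{status}]")
```

The lock-file noise disappears because only named entities are compared, and unchanged entities produce no output at all.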
Impact analysis
The part I find most useful: point it at any function and it shows everything that depends on it, transitively, across the whole repo.
$ sem impact visit_node
→ depends on: 13 functions
← depended on by: extract_entities, extract_ocaml_named_bindings
! 2 entities transitively affected
Before you refactor something, you know exactly what's downstream.
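The transitive part boils down to a graph walk. Here is a minimal sketch, assuming a dependency map of "entity → entities it calls" is already available. The toy graph and the `transitive_dependents` helper are hypothetical and are not taken from sem's source.

```python
from collections import defaultdict, deque

def transitive_dependents(deps: dict, target: str) -> set:
    """Return every entity that directly or indirectly depends on target."""
    # Invert the "caller -> callees" map into "callee -> callers".
    reverse = defaultdict(set)
    for caller, callees in deps.items():
        for callee in callees:
            reverse[callee].add(caller)

    # Breadth-first walk over callers; `seen` also guards against cycles.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in reverse[current]:
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    return seen

# Hypothetical toy graph: each key calls the entities in its list.
DEPS = {
    "parse": ["tokenize"],
    "format": ["tokenize"],
    "lint": ["parse"],
    "cli": ["lint", "parse"],
}

affected = transitive_dependents(DEPS, "tokenize")
print(f"! {len(affected)} entities transitively affected")
```

In this toy graph, changing `tokenize` reaches `parse` and `format` directly, then `lint` and `cli` through `parse`.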
Commands
- sem diff - entity-level diff with word-level highlights
- sem entities - list all entities in a file with line ranges
- sem impact - show what breaks if an entity changes
- sem blame - git blame at the entity level
- sem log - track how an entity evolved over time
- sem context - token-budgeted context for LLMs
Supports 20+ languages (Rust, Python, TypeScript, Go, Java, C, C++, Ruby, Swift, Kotlin, and more). Written in Rust. Open source.