DEV Community

benzsevern

Building open-source data quality tools in Python. Creator of the Golden Suite: GoldenMatch, GoldenFlow, GoldenCheck, GoldenPipe, and InferMap. 2,400+ tests, 10K+ monthly downloads on PyPI.

Location: Pennsylvania, United States of America

Email address: benzsevern@gmail.com

Personal website: https://bensevern.dev

Education

West Chester University of Pennsylvania

Work

Creator of the Golden Suite

Reconciling 15 OSS Vulnerability Databases: What They Actually Cover

12 min read

Wallet Attribution at Scale: ER on 13M Blockchain Records

11 min read
The OSS ER Bargain: What Entity Resolution Actually Costs You

9 min read
Golden Suite + MCP: Giving AI Agents a Data Cleaning Toolkit

5 min read
From Dirty CSV to Golden Records: A Python Walkthrough

10 min read
GoldenMatch vs. Splink vs. Dedupe vs. RecordLinkage: A Practical Comparison

8 min read
GoldenMatch vs. BPID: Testing Against an EMNLP Benchmark

7 min read
Deduplicating 401,000 Equipment Auction Records with LLM Calibration

6 min read
AI-Powered Deduplication: How LLMs Supercharge the Golden Suite

8 min read
Getting Started with GoldenPipe: Clean Data in Your Python Backend

6 min read
Entity Resolution on 208,000 Real Records with the Golden Suite

7 min read
10 Data Problems Every Pipeline Hits (and the One-Liner Fixes)

4 min read
Two Hospitals Matched Patient Records Without Sharing a Single Name

4 min read
I Deduplicated 100K Records in 12 Seconds With One Command

5 min read
How to Deduplicate 100,000 Records in 13 Seconds with Python

3 min read