Pengdows LLC

Posted on Apr 21

Dapper Has 464 Open Issues. I Had My pengdows.crud Codebase Audited Against Every One of Them.

#dapper #adonet #opensource

When I built pengdows.crud, I wanted every line to be testable. That meant building a fake provider — a full in-process ADO.NET implementation that lets you run tests without a real database. pengdows.crud ships with 94%+ line coverage as a result.

That same instinct led me to look at Dapper's test coverage: 0.61%. So I wrote 775 unit tests and submitted PR #2199, bringing their line coverage to 86.1%. I know that codebase.

Dapper currently has 464 open issues. Issue triage has stalled: the needs-triage label has become a holding state rather than part of an active triage pipeline. Releases still occur, but they don't materially reduce the issue backlog. The maintainers have moved on to DapperAOT, a build-time code generation successor. Those 464 issues are not a backlog being worked down. They are the permanent state of that codebase.

I'm not writing this to pile on a library that has done genuinely useful work. I'm writing this because I wanted to know, precisely, whether pengdows.crud makes any of the same mistakes: the same classes of bugs, the same structural patterns. We're both doing a lot of the same things at the ADO.NET level. Similar mistakes are possible. I went into this asking "am I doing this wrong too?" and had the codebase audited against every theme in Dapper's backlog to find out.

Before I get to results, the most important context: pengdows.crud and Dapper aren't competing solutions that made different tradeoffs. They're answers to different questions, independently designed around different constraints.

The core architecture — connection lifecycle ownership as the foundation, SqlContainer as the execution unit, TableGateway as the SQL-generation layer — was designed from scratch around the constraint that connection lifetime, parameter naming, and SQL construction must never be the caller's responsibility.

Dapper asked: How do I make raw ADO.NET less painful at the call site?

pengdows.crud asked: How do I make connection lifecycle and SQL construction safe and explicit at scale?

Dapper's bugs flow directly from its question. When you optimize for call-site convenience, you push lifetime management, parameter naming, and composition discipline onto the caller. The 464-issue backlog is the accumulated cost of that trade — not a failure of execution, but a consequence of the original design goal.

pengdows.crud doesn't share those bugs because it never made that trade. The safety properties aren't retrofitted. They're load-bearing, baked in from the start, and that matters — you can't patch your way to a different architecture.

The caller never owns connection lifecycle under any execution path. That's the core invariant. Everything that follows is a consequence of it.

With that established, here's the actual breakdown across every issue cluster in Dapper's backlog.

Issue classification was performed using a reproducible script against the GitHub Issues API. The script separates bug-like issues from feature requests and questions, then assigns each to a primary category based on heuristic matching. The script and the full classification output (as CSV) are in the repository for audit, sampling, and verification.

Of Dapper's 464 open issues, 270 classify as bug-like under this analysis (as of April 21, 2026). Here's how they map to pengdows.crud's architecture:

Bug Cluster	Open Bugs	% of Bugs	Outcome in pengdows.crud
Parameters / type handling	123	45.6%	Eliminated — no global handler registry; explicit per-instance construction
Mapping / materialization	68	25.2%	Fail-fast controlled — explicit column mapping; throws on bad input
Async / cancellation / lifetime	35	13.0%	Eliminated — caller never owns connection under any path
Provider compatibility / dialect	17	6.3%	Mitigated — dialect layer centralizes; CI tests 11 databases
Performance / caching / concurrency	7	2.6%	Eliminated — bounded caches; finite key spaces; no global state
Diagnostics / docs / usability	1	0.4%	Not applicable
Uncategorized	19	7.0%	—

The 19 bug-like issues (7.0%) that fall outside these categories are left uncategorized; they do not materially change the overall distribution.

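The three largest clusters (parameters, mapping, and async/lifetime) account for 226 of the 270 open bugs, or 83.7%, and all three are structurally eliminated or fail-fast controlled in pengdows.crud. Adding the performance/caching cluster brings the total to 233 (86.3%). Of the remainder, the 17 provider-compatibility bugs are drift issues that no abstraction layer can fully eliminate, only centralize and detect; the rest are uncategorized or not applicable.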

Structurally Eliminated

Bugs in these two categories cannot occur without violating pengdows.crud's invariants. They're not "handled well"; they're unexpressible in the current design.

Connection / Reader Lifetime

Dapper borrows your connection. Lifetime is your problem. When an async path throws mid-execution, what gets cleaned up depends on where the exception lands. Dapper cannot fix this without breaking its core contract — the extension method model assumes the caller owns the connection.

In pengdows.crud, callers never work with a DbConnection or DbCommand directly — under any execution path.

In normal execution, callers build SQL through SqlContainer and call an execution method. Internally, the context acquires a TrackedConnection, creates the command, executes, and runs cleanup in a finally block. The caller never sees any of it. SafeAsyncDisposableBase underlies every tracked type; Interlocked.Exchange ensures idempotent disposal — double-dispose is a no-op, not a second cleanup pass.
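The idempotent-disposal guard described above can be sketched in a few lines. This is a minimal illustration of the Interlocked.Exchange pattern, not the library's SafeAsyncDisposableBase; the type names IdempotentDisposable and CountingResource are hypothetical and exist only for this example.

```csharp
using System;
using System.Threading;

// Hypothetical base class illustrating the pattern; not pengdows.crud's own type.
public abstract class IdempotentDisposable : IDisposable
{
    // 0 = live, 1 = disposed.
    private int _disposed;

    public bool IsDisposed => Volatile.Read(ref _disposed) == 1;

    public void Dispose()
    {
        // Exchange returns the previous value, so only the first caller sees 0.
        if (Interlocked.Exchange(ref _disposed, 1) != 0)
        {
            return; // second and later calls are no-ops
        }

        ReleaseResources();
    }

    protected abstract void ReleaseResources();
}

// Hypothetical resource that counts how many times cleanup actually ran.
public sealed class CountingResource : IdempotentDisposable
{
    public int CleanupCount { get; private set; }

    protected override void ReleaseResources() => CleanupCount++;
}
```

Calling Dispose() twice on a CountingResource leaves CleanupCount at 1, because the second call returns before reaching ReleaseResources().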

The only externally visible streaming surface is ITrackedReader — and even that is a controlled façade, not a raw provider object. TrackedReader holds the connection lock for its entire read lifetime, owns command teardown, and auto-disposes when Read() reaches EOF. The caller streams rows; pengdows.crud owns everything beneath.
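As a rough sketch of the auto-dispose-at-EOF behavior, assuming a wrapper around a plain IDataReader: the AutoClosingReader type and its releaseConnection callback are hypothetical stand-ins, since the real ITrackedReader surface is not shown in this article.

```csharp
using System;
using System.Data;
using System.Threading;

// Hypothetical wrapper; illustrates the idea only.
public sealed class AutoClosingReader : IDisposable
{
    private readonly IDataReader _inner;
    private readonly Action _releaseConnection; // assumed hook, e.g. release a lock
    private int _closed;

    public AutoClosingReader(IDataReader inner, Action releaseConnection)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _releaseConnection = releaseConnection ?? throw new ArgumentNullException(nameof(releaseConnection));
    }

    public bool Read()
    {
        if (Volatile.Read(ref _closed) != 0) return false;
        if (_inner.Read()) return true;

        Dispose(); // EOF: tear down without waiting for the caller
        return false;
    }

    public object this[int ordinal] => _inner[ordinal];

    public void Dispose()
    {
        // Same idempotency guard: only the first call performs teardown.
        if (Interlocked.Exchange(ref _closed, 1) != 0) return;

        try { _inner.Dispose(); }
        finally { _releaseConnection(); }
    }
}
```

A caller that loops `while (reader.Read())` to the end triggers teardown exactly once, and a later explicit Dispose() does nothing further.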

Transactions preserve the same invariant. BeginTransactionAsync() returns an ITransactionContext. Internally, a tracked connection is acquired, pinned, and held privately for the transaction's lifetime. The ITransactionContext exposes commit, rollback, and savepoint semantics — not the connection. All SQL execution within the transaction still routes through SqlContainer and the same internal acquisition path. On commit or rollback, cleanup runs in finally regardless of outcome. The caller controls transaction outcome; pengdows.crud owns connection lifetime.

This invariant holds without exception: connection ownership is never representable at the API boundary. The class of bugs that requires the caller to be the connection lifecycle authority (leaked connections, orphaned commands, partial async cleanup, connection reuse after rollback) cannot be caused by caller misuse.

Parameter Naming / Collisions

Dapper's parameter model is caller-controlled. You name parameters, you manage composition, you handle prefix conventions per provider. When you get that wrong — and composition bugs are easy — you get silent incorrect results or runtime errors with unhelpful messages.

pengdows.crud uses deterministic, namespace-isolated naming for all generated parameters: i0, i1 for INSERT values; s0, s1 for UPDATE SET clauses; w0, w1 for WHERE predicates; v0 for version columns. These namespaces don't collide by construction. Prefix stripping normalizes provider-specific prefixes (@, :, ?, $) on input. Clone counters ensure copied containers get independent parameter sets.
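To make the naming scheme concrete, here is a minimal sketch of per-clause counters and prefix stripping. The ParameterNamer type and its method names are assumptions made for illustration; only the i/s/w/v prefixes and the stripped characters come from the description above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not pengdows.crud's API.
public sealed class ParameterNamer
{
    private static readonly char[] ProviderPrefixes = { '@', ':', '?', '$' };

    // One counter per clause namespace: "i" (INSERT), "s" (SET), "w" (WHERE), "v" (version).
    private readonly Dictionary<string, int> _counters = new();

    public string Next(string clausePrefix)
    {
        _counters.TryGetValue(clausePrefix, out var n); // n is 0 when the prefix is new
        _counters[clausePrefix] = n + 1;
        return clausePrefix + n; // i0, i1, ... s0, s1, ... w0, ...
    }

    // Normalizes "@id", ":id", and "$id" to "id" so one logical name maps to one entry.
    public static string StripPrefix(string name)
    {
        if (string.IsNullOrEmpty(name))
        {
            throw new ArgumentException("Parameter name is required.", nameof(name));
        }

        return name.TrimStart(ProviderPrefixes);
    }
}
```

Because each clause has its own letter and its own counter, a name generated for a WHERE predicate can never equal one generated for an INSERT value, whatever order the clauses are built in.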

The parameter container is a custom OrderedDictionary<string, DbParameter> — per-instance, ordered (critical for positional providers like older Oracle and ODBC drivers), not shared across threads. There is no global parameter state to corrupt.

Composition collisions require the naming system to produce a collision. It cannot.

Controlled (Fail-Fast, Fully Tested)

These categories aren't structurally impossible — provider behavior can still produce surprises — but pengdows.crud handles them explicitly with fail-fast semantics and comprehensive test coverage. The blast radius is contained.

Cancellation Semantics

Dapper's cancellation story is a retrofit. The synchronous-first design got async layered on top, and the seams show in open issues for missing CancellationToken overloads and OperationCanceledException being swallowed in certain paths.

In pengdows.crud, cancellation tokens flow through both the semaphore acquisition layer and the execution layer. OperationCanceledException is never swallowed. Every public async method has a CancellationToken overload — this is a code review hard requirement, not a backlog item.
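A minimal sketch of what "the token flows through both layers" can look like, assuming a SemaphoreSlim gate in front of the work. GatedExecutor is a hypothetical type for this example and is not the library's internal implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gate illustrating token flow; not pengdows.crud's own type.
public sealed class GatedExecutor
{
    private readonly SemaphoreSlim _gate;

    public GatedExecutor(int maxConcurrency)
    {
        _gate = new SemaphoreSlim(maxConcurrency, maxConcurrency);
    }

    public async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> work,
        CancellationToken ct = default)
    {
        // Layer 1: waiting for a slot observes the token.
        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            ct.ThrowIfCancellationRequested();

            // Layer 2: the same token is handed to the work itself.
            // There is no catch block, so OperationCanceledException propagates.
            return await work(ct).ConfigureAwait(false);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

The slot is released in finally only after a successful WaitAsync, so a cancellation during the wait does not over-release the semaphore.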

Provider behavior at the network level can still produce surprising cancellation timing (Npgsql and SqlClient behave differently under load). That becomes a provider problem, not an abstraction problem.

Null / Value Coercion

Dapper has open issues where type coercion fails silently — a null becomes a default value, a boolean coerces to 0 or 1 depending on provider, and nothing throws. Silent defaults are the worst category of data bug because they corrupt data without raising an exception.

TypeCoercionHelper throws on bad input. There are no silent defaults. The philosophy is fail-fast, not fail-silent.
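The fail-fast idea can be sketched as follows. StrictCoercion is a hypothetical helper written for this example; the real TypeCoercionHelper may apply different conversion rules, and the use of Convert.ChangeType here is an assumption.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; shows "throw instead of default", nothing more.
public static class StrictCoercion
{
    public static T Coerce<T>(object? value, string columnName)
    {
        var target = typeof(T);
        var underlying = Nullable.GetUnderlyingType(target);

        if (value is null || value is DBNull)
        {
            // NULL is only legal when the target type can represent it.
            if (underlying != null || !target.IsValueType) return default!;

            throw new InvalidCastException(
                $"Column '{columnName}' is NULL but target type {target.Name} is not nullable.");
        }

        if (value is T typed) return typed;

        try
        {
            return (T)Convert.ChangeType(value, underlying ?? target, CultureInfo.InvariantCulture);
        }
        catch (Exception ex) when (ex is FormatException or InvalidCastException or OverflowException)
        {
            // Re-throw with the column name so the failure points at its source.
            throw new InvalidCastException(
                $"Column '{columnName}': cannot convert {value.GetType().Name} to {target.Name}.", ex);
        }
    }
}
```

With this shape, reading a NULL into an int throws with the column name in the message, while reading it into an int? returns null.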

Edge cases remain: DBNull, driver-specific structs, JSON column handling. These aren't eliminated, but they fail loudly so you know immediately where and why.

IN-List Expansion

Dapper's WHERE id IN (@ids) handling is one of its most-reported problem areas: empty collections generating invalid SQL, NULL semantics ambiguity, and query plan instability from variable-length parameter lists.

Empty collections are rejected explicitly. NULL semantics are handled correctly. On PostgreSQL, expansion uses ANY(@param) with a native array: one parameter, correct semantics, stable query plan. With an expanded list, each distinct element count produces different SQL text, so a 5-element list and a 6-element list are prepared and planned as separate statements; ANY(@param) sidesteps this entirely because the SQL text never changes. For other providers, parameter lists use power-of-2 bucketing (round up to 1, 2, 4, 8, 16...) to limit plan cache pollution.
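Power-of-2 bucketing is simple enough to show directly. In this sketch the InListBuckets type is hypothetical, and padding the list by repeating its last value is an assumption about how a bucket might be filled; the article does not say how pengdows.crud pads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the bucketing arithmetic.
public static class InListBuckets
{
    // Rounds count up to the next power of two: 5 -> 8, 8 -> 8, 9 -> 16.
    public static int BucketSize(int count)
    {
        if (count <= 0)
        {
            throw new ArgumentException("IN-list must contain at least one value.", nameof(count));
        }
        if (count > (1 << 30))
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        var size = 1;
        while (size < count) size <<= 1;
        return size;
    }

    // Assumed padding strategy: repeat the last value.
    // Duplicate values do not change the result of an IN predicate.
    public static IReadOnlyList<T> Pad<T>(IReadOnlyList<T> values)
    {
        var size = BucketSize(values.Count);
        var padded = new List<T>(size);
        padded.AddRange(values);
        while (padded.Count < size) padded.Add(values[values.Count - 1]);
        return padded;
    }
}
```

Lists of 5, 6, 7, and 8 values all produce the same 8-placeholder SQL text, so they share one cached statement instead of four.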

Parameter limits are not an edge case left to the provider. Every dialect declares a hard ceiling as part of its contract: PostgreSQL at 32,767, SQLite at 999, and MySQL/MariaDB, Oracle, and DuckDB at 65,535. During command materialization, pengdows.crud checks _parameters.Count against _context.MaxParameterLimit. If the limit is exceeded, execution is blocked and an InvalidOperationException is thrown naming both the limit and the database product, before a connection is opened and before a single byte reaches the server.
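The check itself is a comparison made before any I/O. The sketch below assumes a standalone guard; ParameterLimitGuard and its parameter names are hypothetical, and the ceiling is passed in by the caller rather than read from a real dialect object.

```csharp
using System;

// Hypothetical guard; the message format is illustrative only.
public static class ParameterLimitGuard
{
    public static void EnsureWithinLimit(int parameterCount, int maxParameterLimit, string databaseProduct)
    {
        if (parameterCount > maxParameterLimit)
        {
            // Name both the limit and the product so the error is actionable.
            throw new InvalidOperationException(
                $"Command has {parameterCount} parameters; {databaseProduct} allows at most {maxParameterLimit}.");
        }
    }
}
```

Calling this with 1,000 parameters and a ceiling of 999 throws immediately, with no connection involved.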

Dapper expands the list and lets the provider fail. pengdows.crud fails at construction time with a message that tells you exactly what went wrong and on which database.

That said, "enforced" doesn't mean "efficient." A collection of 50,000 IDs still produces a bad query shape regardless of how cleanly the limit is handled. At that scale the right answer is a temp table, a bulk insert, or a join — not an expanded IN-list. pengdows.crud catches the limit violation; it doesn't rewrite your query strategy for you.

Mitigated (Isolated, Not Eliminated)

These categories carry real residual risk. pengdows.crud centralizes and contains exposure, but cannot make external behavior deterministic.

Provider-Specific Quirks and Version Drift

No library eliminates provider bugs. pengdows.crud's RemapDbType() handles type remapping per provider. GuidStorageFormat handles the fact that Oracle, MySQL, and SQL Server all store GUIDs differently. AdvancedTypeRegistry handles provider-specific type edge cases. MakeParameterName() and WrapObjectName() own their respective concerns rather than delegating to callers.

The real win is centralization: when a provider changes behavior, one place needs to change. 11-database Testcontainers integration tests in CI make drift detectable. The TiDB dialect has a comment noting a MySql.Data prepare-statement incompatibility with no version numbers or upstream issue link — that's the visible symptom of this category. Version drift happens; the question is whether you find it in CI or in production.

The accurate claim: provider bugs are isolated and test-detectable, not impossible.

Metadata Caching

Dapper's global static ConcurrentDictionary caches compiled deserializers keyed by arbitrary SQL strings. Two problems: global scope means cross-query contamination, and the key space is unbounded.

pengdows.crud uses a different architecture. SqlContainer parameters use a per-instance custom OrderedDictionary: nothing shared, nothing global. Query and parameter-name caches use BoundedCache inside ConcurrentDictionary<SupportedDatabase, BoundedCache<...>>: the outer dictionary is keyed by a finite enum, and each inner cache uses LRU eviction with a cap of 32 to 512 entries. The metadata registry uses ConcurrentDictionary<Type, TableInfo>, keyed by entity Type, which is finite in a loaded assembly, rather than by arbitrary SQL strings.
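For readers who have not built one, a bounded LRU cache looks roughly like this. LruCache is a hypothetical, lock-based sketch and should not be read as the library's BoundedCache; it only shows why a capped cache cannot grow without limit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical LRU cache; a simple lock-based illustration.
public sealed class LruCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly object _sync = new();
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map = new();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order = new(); // front = most recent

    public LruCache(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count { get { lock (_sync) return _map.Count; } }

    public bool Contains(TKey key) { lock (_sync) return _map.ContainsKey(key); }

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> factory)
    {
        lock (_sync)
        {
            if (_map.TryGetValue(key, out var node))
            {
                // Hit: move the entry to the front.
                _order.Remove(node);
                _order.AddFirst(node);
                return node.Value.Value;
            }

            // Miss: the factory runs under the lock, which keeps the sketch simple.
            var value = factory(key);
            if (_map.Count >= _capacity)
            {
                // Evict the least recently used entry before adding.
                var oldest = _order.Last!;
                _order.RemoveLast();
                _map.Remove(oldest.Value.Key);
            }

            _map[key] = _order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
            return value;
        }
    }
}
```

However many distinct keys are requested, Count never exceeds the capacity passed to the constructor.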

Dapper's problem was global dictionaries keyed by arbitrary query strings. That pattern doesn't exist here. Unbounded growth isn't just "handled" — the key space design removes the growth vector.

The residual risk is operational: TypeMapRegistry entries live for the lifetime of the DatabaseContext instance. If a schema changes during a rolling deploy, cached TableInfo will not reflect it until the process restarts. There is no runtime invalidation. Each DatabaseContext maintains its own isolated registry — there is no cross-context contamination — but within a context, pengdows.crud assumes schema stability for the process lifetime.

Observability

Dapper's logging extensibility is one of its most-requested missing features. pengdows.crud has built-in structured observability. The notable design decision: parameter values are deliberately never logged.

That's the right security default — logging parameter values is how credentials end up in log aggregators and PII ends up in SIEM systems. Command text, timing, and execution metadata are captured. Values stay out of the log.
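One way to make "values stay out of the log" hard to violate is to give the formatter no way to receive a value at all. The sketch below assumes parameter names and DbType are acceptable to log, which is an assumption on top of what the article states; CommandLogFormatter and its output format are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

// Hypothetical formatter; the signature accepts names and types, never values.
public static class CommandLogFormatter
{
    public static string Format(
        string commandText,
        IEnumerable<(string Name, DbType Type)> parameters,
        TimeSpan elapsed)
    {
        var shape = string.Join(", ", parameters.Select(p => p.Name + ":" + p.Type));
        var ms = elapsed.TotalMilliseconds.ToString("F1", CultureInfo.InvariantCulture);
        return $"sql=\"{commandText}\" params=[{shape}] elapsed_ms={ms}";
    }
}
```

Since the method takes no value argument, a value can only reach the log if the caller has already embedded it in the SQL text, which parameterized SQL avoids.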

The tradeoff is real: debugging parameter-specific issues requires reproduction in a test harness, not log inspection. You cannot read a log and see what value was passed. That's the cost of the security boundary, and it's deliberate.

The Accurate Summary

Here's what the audit actually established:

State	Categories
Eliminated	Connection/reader lifetime ownership; parameter naming collisions
Controlled	Cancellation semantics; null/value coercion; IN-list expansion
Mitigated	Provider quirks; version drift; metadata staleness; observability tradeoffs

The strongest claim — and it holds — is this:

pengdows.crud removes caller-induced failure modes. It does not remove provider-induced failure modes.

Dapper's design pushes connection lifetime, parameter naming, SQL construction discipline, and transaction scoping onto the developer. That was a deliberate choice in service of call-site elegance, and it was coherent. pengdows.crud was independently designed around the opposite constraint: those concerns belong to pengdows.crud, not the caller.

Most of the bugs in Dapper's 464-issue backlog exist because the caller was handed responsibility the library didn't keep. When the caller owns connection lifetime, callers leak connections. When the caller names parameters, callers create collisions. When the library provides thin provider abstraction, provider differences become caller bugs.

pengdows.crud owns those responsibilities. So those caller-induced bugs don't have a place to live.

The database is still external. Providers still have bugs. Schema still changes. Those are real risks and this article doesn't pretend otherwise — the Mitigated category exists for exactly that reason.

But Dapper's backlog is not pengdows.crud's backlog. The failure modes are different because the responsibilities were never handed to the caller in the first place.

pengdows.crud is a SQL-first, strongly-typed data access layer for .NET 8+ supporting 12 databases with full connection lifecycle management, explicit parameter construction, and dialect-native SQL generation. NuGet | GitHub

DEV Community