DEV Community

ItsEvilDuck

One ledger, two chains — what I learned about multi-chain payment architecture from a reader's correction

In the previous post in this series I wrote about Base as the payment rail for QuackBuilds, and made an argument about why card-network economics break for certain payment shapes that onchain rails handle natively. The post landed cleanly enough on its own, but a reader’s correction in the comments — and a follow-up post on his own blog — opened up a deeper architectural question that I wasn’t ready to answer at the time and am still not ready to answer now. This post is what I learned from the exchange. It is not a roadmap for what I’m building next, because I’m honestly not sure yet. The architectural reasoning is good enough to be worth writing down regardless, and good enough to apply to whatever direction I eventually move in.
I should be upfront about where I am, since it shapes how to read what follows. I’m relatively new to this field, and the architecture in this post is more carefully considered than it is committed to. The Base side of QuackBuilds is in production. Beyond that, I’m holding the next direction loosely — not because I lack ideas, but because I’d rather wait until I’ve learned more before I publicly commit to a path. The reason I’m writing this post anyway, rather than waiting until I’ve built the next thing, is that the architectural reasoning is the part that benefits from public scrutiny. If the reasoning is wrong, I’d rather hear that now from someone who’s run multi-chain infrastructure in production than discover it whenever I do decide to act on it. Treat this post as me thinking out loud about a class of decisions I’ll eventually face, not as a plan I’ve signed.
Most of what’s useful in this post comes from a commenter named Paul (QBitFlow), who’s running non-custodial payment infrastructure on Ethereum, Solana, and Base in production and corrected me three times across two threads in the most substantive comment exchange I’ve had on this platform. He’s since published his own architectural writeup of the same territory, and I’m going to credit him throughout because the architectural pattern this post describes is his first, not mine.
The mistake I was making, before the exchange that prompted this post, was thinking about multi-chain architecture as a reconciliation problem. The mental model I had was something like: there are two chains, each with its own state, and the work is to keep them synchronized — or at least to keep their views of the world consistent enough that the accounting doesn’t drift. That framing is wrong, and it’s wrong in a way that produces wrong code. If you start from “two chains that need to be reconciled,” you build a system where each chain has its own ledger and the ledgers have to be merged at some boundary. The merge is where the bugs live. The merge is where the audit trail breaks. The merge is where the operational complexity compounds.
The correct framing, which Paul articulated cleanly enough that I’m going to quote it directly, is one ledger, chain-specific handling underneath. The ledger is the source of truth. The chains are implementation details below it. Every transaction the system cares about — regardless of which chain it cleared on, regardless of who initiated it, regardless of whether it was settlement or a refund — writes to the same ledger, in the same accounting format, with the same audit-trail semantics. The chain-specific code lives below the ledger and is responsible for the actual on-chain operations: signing, broadcasting, watching for confirmations, handling reorgs, managing fee dynamics. The ledger doesn’t know any of that. The ledger knows only that a transaction happened, what it was, who it was between, and what amount in what asset cleared. The mental shift this asks for is from horizontal to vertical: chains aren’t side-by-side with glue between them, they’re below an accounting layer that sits above all of them. Above, not between.
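The ledger-first shape can be made concrete with a small sketch. Everything here is hypothetical (the type names, fields, and interfaces are illustrative shapes I'm inventing for this post, not QuackBuilds or QBitFlow code), but it shows where the boundary sits: the ledger records only chain-agnostic facts, and everything chain-specific lives behind a handler interface below it.

```typescript
// The ledger records only chain-agnostic facts. "chain" is a label, nothing more.
type LedgerEntry = {
  id: string;
  kind: "settlement" | "refund";
  from: string;
  to: string;
  asset: string;   // e.g. "USDC"
  amount: bigint;  // smallest unit of the asset
  chain: string;   // which rail it cleared on
  txRef: string;   // opaque reference into the chain handler's world
};

// The chain handler owns signing, broadcasting, confirmation watching, and
// reorg handling: everything the ledger deliberately does not know about.
interface ChainHandler {
  readonly chain: string;
  settle(from: string, to: string, asset: string, amount: bigint): Promise<string>; // resolves to a txRef
}

class Ledger {
  private entries: LedgerEntry[] = [];

  record(entry: LedgerEntry): void {
    this.entries.push(entry);
  }

  // Post-hoc work (reconciliation, reporting, support) queries the ledger,
  // never the chains.
  balance(account: string, asset: string): bigint {
    return this.entries.reduce((sum, e) => {
      if (e.asset !== asset) return sum;
      if (e.to === account) return sum + e.amount;
      if (e.from === account) return sum - e.amount;
      return sum;
    }, 0n);
  }
}
```

The point of the shape is that balance, reconciliation, and reporting queries never touch a chain handler; adding a chain means adding a handler, not changing the ledger.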
That inversion sounds small but it changes almost every downstream design decision. The accounting layer becomes uniform and human-readable, regardless of how many chains are underneath it. The post-hoc work — finance reconciliation, tax reporting, customer support, refund flows — operates on the ledger, not on the chains, which means most of that work is chain-agnostic. New chains can be added without rewriting the accounting layer; you just write a new chain-specific handler that knows how to talk to the new chain and how to report back to the ledger in the existing format. The architectural complexity, which would be quadratic in the number of chains under the reconciliation model, becomes linear under the ledger-first model. That’s the whole point.
Paul’s published writeup of his own multi-chain journey makes the same architectural pattern visible from a different angle, and the principle he distills is one I’m going to be carrying around for years: chain identifiers as first-class values, no hardcoded execution-model assumptions. The distinction he draws is between a codebase that says “we support Ethereum and Solana” — two specific chains baked into the data model, the webhooks, the SDKs, the dashboard — and a codebase that says “we support multiple chains, currently Ethereum and Solana” — chain identifiers as runtime values, chain-specific behaviors isolated behind interfaces, no if EVM else Solana branches scattered through the business logic. The two codebases look superficially similar in their first version. Six months later, when a third chain shows up, they look completely different. The first one needs a refactor; the second one just needs an implementation behind an interface that already exists. Worth noting that the principle doesn’t depend on predicting the multi-chain future correctly: the discipline pays off the moment any chain other than the original one enters the system, whichever chain that turns out to be.
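A sketch of what "chain identifiers as runtime values" looks like in code, with all names hypothetical and deliberately simplified: chain-specific behavior sits behind an interface keyed by a runtime identifier, and the business logic asks a registry rather than branching on the chain itself.

```typescript
type ChainId = string; // "ethereum" | "solana" | "base" | whatever comes next

// Chain-specific behavior lives behind this interface, not in business logic.
interface PaymentRail {
  validateAddress(addr: string): boolean;
  requiredConfirmations(): number;
}

class RailRegistry {
  private rails = new Map<ChainId, PaymentRail>();

  register(chain: ChainId, rail: PaymentRail): void {
    this.rails.set(chain, rail);
  }

  get(chain: ChainId): PaymentRail {
    const rail = this.rails.get(chain);
    if (!rail) throw new Error(`unsupported chain: ${chain}`);
    return rail;
  }

  supported(): ChainId[] {
    return [...this.rails.keys()];
  }
}

// Business logic is chain-agnostic: it asks the registry, never inspects the id.
function checkDestination(registry: RailRegistry, chain: ChainId, addr: string): boolean {
  return registry.get(chain).validateAddress(addr);
}
```

Under this shape, adding a third chain is a new `register` call with a new `PaymentRail` implementation; `checkDestination` and everything like it is untouched.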
The deeper question, which Paul surfaced in a follow-up exchange, is why you’d architect for multiple chains in the first place. The merchant-driven case is the obvious one — your users will eventually want chains you don’t yet support, so you build for capacity to add them. There’s a second flavor I’d been thinking about as substrate-driven — chains added because the internal requirements of the system will eventually need them, even if no user is asking. Paul, correctly, pointed out that both flavors collapse into a deeper category: capability-driven multi-chain. The chain set falls out of the operation set, not out of who’s asking. For a payments rail, that means: accept any stablecoin, settle non-custodially, in seconds, at sub-cent cost when needed. No single chain delivers all four. Different operations would surface different chain capabilities; the design heuristic is to enumerate the operations the system actually needs to perform, identify which capabilities each requires, then ask which chains deliver which capabilities. Who’s asking is then a forcing function on timing, not a flavor. That reframe is the cleanest way to think about chain selection I’ve encountered, and it’s the lens I’d use to evaluate any chain decision I eventually make — whether that’s adding a second chain, sticking with one, or something else entirely.
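The capability-driven heuristic is mechanical enough to sketch. The capability sets below are illustrative guesses, not a real assessment of any chain; the point is the shape of the question, which is matching operations to capabilities rather than picking chains by name.

```typescript
type Capability = "anyStablecoin" | "nonCustodial" | "secondsFinality" | "subCentFees";

// Illustrative data only. Real capability sets shift with protocol upgrades
// and with which stablecoins are actually issued where.
const chainCapabilities: Record<string, Set<Capability>> = {
  ethereum: new Set<Capability>(["anyStablecoin", "nonCustodial"]),
  solana: new Set<Capability>(["nonCustodial", "secondsFinality", "subCentFees"]),
  base: new Set<Capability>(["nonCustodial", "secondsFinality", "subCentFees"]),
};

// Which chains can serve an operation that needs this capability set?
function chainsFor(needed: Capability[]): string[] {
  return Object.entries(chainCapabilities)
    .filter(([, caps]) => needed.every((c) => caps.has(c)))
    .map(([chain]) => chain);
}
```

Under data like this, no single chain satisfies all four capabilities at once, so the chain set falls out of the union of what the operations require, exactly as the capability-driven framing predicts.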
The integration cost lives in four places, and the ledger isn’t one of them. The second thing Paul corrected me on, which was the more important correction because it’s the one that affects how I’d price this kind of work, was the location of the actual integration cost. I’d been mentally budgeting multi-chain work as learning two SDKs and writing two integration paths, and that framing is wrong in a way builders consistently get wrong. The SDKs themselves are good now — that’s the part most people overestimate. The expensive part lives in the translation layer plus the operational surface around it: reorg handling when validators flap, fee anomalies on L2s during congestion, refund flows for overpayments and edge cases, the audit trail that finance and tax both need, the monitoring that catches a chain anomaly before it shows up as a customer complaint. None of that is multi-chain-specific in theory — single-chain systems need most of it too — but in practice the operational cost of a two-rail architecture is dominated by the work of squaring two chains’ worth of finality models, fee dynamics, address formats, event semantics, and reorg behaviors into a single coherent financial view that downstream systems can actually rely on.
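One concrete instance of that translation-layer work, sketched with hypothetical names: squaring two different finality models into a single "is this settled?" answer the ledger can rely on. The confirmation threshold is illustrative, not a recommendation, and note that this is the one place where per-chain branching is legitimate, precisely because it sits below the ledger rather than inside the business logic.

```typescript
// Raw events from two execution models with incompatible finality semantics.
type RawChainEvent =
  | { chain: "evm"; confirmations: number; reorged: boolean }
  | { chain: "solana"; commitment: "processed" | "confirmed" | "finalized" };

// Per-chain finality rules live here; the accounting layer above only ever
// sees the boolean.
function isSettled(ev: RawChainEvent): boolean {
  switch (ev.chain) {
    case "evm":
      // Illustrative depth; real systems tune this per network and reorg history.
      return !ev.reorged && ev.confirmations >= 12;
    case "solana":
      return ev.commitment === "finalized";
  }
}
```

The ledger never learns what a commitment level or a confirmation count is; it learns that a transaction settled. That is the "single coherent financial view" the paragraph above is describing, reduced to its smallest unit.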
The honest implication, which I want to flag because most architecture posts conveniently skip it, is that the cost of going from one chain to two is either weekend-shaped or quarter-shaped, depending on whether the codebase was structured for new chains or for two specific chains. That bimodal framing comes directly from Paul’s published experience — the Base addition to QBitFlow’s existing Ethereum-and-Solana stack was a weekend, and the reason it was a weekend rather than a quarter was almost entirely the day-one discipline of treating chains as runtime values rather than hardcoded assumptions. If you did the upfront work, additional chains are an interface implementation. If you didn’t, additional chains are a refactor. In practice, codebases tend to cluster at the two ends because the discipline either gets applied as a doctrine or doesn’t get applied at all — partial discipline tends to evaporate under the first deadline pressure. The architecture commits you to one of those two futures from very early on, often before you realize you’ve made the choice. Anyone telling you the cost is 2x is averaging across the two paths and producing a number that doesn’t describe either of them.
The third thing from the exchange that shifted my thinking is the architectural principle that I’m going to be quoting back to myself for years. Paul’s framing was: “maximum security and transparency where it counts, minimum on-chain footprint where it doesn’t.” The instinct in crypto-native systems, and one I’d been quietly drifting toward, is to put as much of the application state on-chain as possible — for verifiability, for transparency, for the cypherpunk virtue of cryptographic guarantees. That instinct breaks down fast in production. Fees pile up. Latency suffers. Most of the state doesn’t actually need cryptographic guarantees to be correct; it just needs to be correct, which is a much cheaper property to deliver. The discipline that holds up is putting the security-critical path on-chain — settlement, custody, authorization, anything where trust assumptions matter — and building around it with off-chain accounting and orchestration to keep the cost reasonable. Maximum transparency where it counts. Minimum on-chain footprint where it doesn’t. That principle holds regardless of how many chains are eventually involved, and it’s one of the cleanest ways I’ve encountered to think about what should actually be on-chain — a question I’d been answering by default rather than by design.
There’s a related lesson in Paul’s published reflection on QBitFlow’s first six months in production that I want to surface explicitly, even though it’s the lesson I’m still wrestling with the hardest. He wrote that if they were starting today, they would have shipped their second EVM chain in v1 alongside the original pair, even before merchant demand fully justified it — because the chain-specific code paths were going to need to exist anyway, the upfront cost of adding them while the codebase was small was lower than the cost six months later, and the conversion benefit was genuinely expensive to delay. That’s a strong argument for early commitment. The counter-argument, and the reason I’m not just acting on his lesson immediately, is that his case had merchant demand visible enough to retrospectively validate the discipline. I don’t yet have that kind of visible demand for any specific direction, and I’m new enough to this field that I’d rather learn more before committing to a path the architecture should be optimized around. Holding decisions open isn’t the same as having no plan; it’s a deliberate choice to keep the option space wide while I’m still calibrating my own judgment.
There’s a section of this post I almost didn’t write, because the architectural decisions in it are clean enough that the political dimension feels like a tangent. But Paul named something in our exchange that I think most engineering writing on this topic skips, and that I want to address directly. The framing was: merchant-driven discipline gets validated continuously by user requests; substrate-driven discipline has to survive a year of “why are we paying this complexity tax for a thing nobody asked for.” That’s exactly right, and it’s the version of the politics problem most architecture posts under-describe. The merchant-driven case carries its own justification — every new merchant request is a fresh data point that proves the discipline was correct. The substrate-driven case, and by extension the capability-driven case where the capability isn’t yet being exercised, has no continuous validation. The discipline costs real time and real complexity for a payoff that’s invisible by definition until the moment it pays off, and the larger the eventual payoff, the more invisible it is in the meantime. That’s an unfunded discipline tax. It’s the interval where engineering leaders quietly start dismantling the abstraction layers they don’t see being used. Same architectural conclusion, very different organizational politics. Anyone planning a substrate-driven multi-chain architecture should know this is the political shape of the work — not because the architecture is wrong, but because the discipline has to survive an interval of looking like overhead before it starts looking like foresight.
The catalog-vs-rail asymmetry — a comparison Paul surfaced and which I’ve been thinking of in those terms since — is worth naming here because it sharpens the politics question further. The avoided cost of substrate-driven discipline is structurally larger on the catalog side than on the payment-rail side, because the refactor cost compounds across every app in a catalog that bakes in state-shape assumptions. By the time the substrate-driven case is forcing the issue, you’d be retrofitting many apps instead of one payment rail. The fact that the discipline pays off more in the catalog case is, paradoxically, exactly why it’s harder to justify upfront — the avoided cost is invisible by definition, and the larger the avoided cost, the more invisible it gets. If you’re an engineering leader being asked to grant a discipline budget for a chain abstraction whose payoff isn’t yet visible, this paragraph is permission to grant it anyway. Paul flagged in a follow-up exchange that this dynamic generalizes well past chain architecture — it applies anywhere the discipline budget is paid in a currency the organization doesn’t track until the bill arrives, including observability investment, schema discipline, and type-safety adoption curves. The class of problem is broader than the instance, which is part of why naming it explicitly matters.
What’s left, given all of that, is the synthesis I’m walking away with: a set of architectural principles I’m reasonably confident about and a set of build decisions I’m not yet ready to make. The principles came from Paul, in scattered comments and a published post; the synthesis — the connecting tissue between them and the application to a catalog rather than a payment rail — is mine to commit to or not. Ledger-first architecture, chain identifiers as first-class values, capability-driven chain selection rather than identity-driven, maximum security on-chain where it matters and minimum on-chain footprint where it doesn’t, and an honest accounting of the political cost of substrate-driven discipline. Those are durable. They’ll apply to whichever direction I eventually take QuackBuilds. The build decisions — whether to add a second chain, when, which one, on what timeline — are downstream of decisions about product direction that I haven’t fully made yet, and I’d rather acknowledge that openly here than perform a roadmap I haven’t committed to. The architecture is ready for the decisions when I’m ready to make them.
What I’m taking away from this entire exchange — both the comment threads and Paul’s published writeup — is something larger than the specific architectural pattern, and I want to name it because it’s the kind of meta-lesson that’s easy to miss while you’re inside the work. Public technical writing, when done with even modest seriousness, is a way to access the expertise of people you have no other way to reach. The framings that are now load-bearing in my architectural thinking didn’t come from a paid consultant or a senior engineer at my company or a paper I read; they came from comments and a published post by someone running production infrastructure I didn’t know existed, who took the time to write thoughtful paragraphs into comment boxes because the posts raised questions worth his time. That doesn’t happen if the post is hedged, defensive, or selling something. It happens when the post is honest about what the writer doesn’t know and explicit about where the questions are. The architectural reasoning in this post is meaningfully better than what I would have arrived at without the exchange, and the cost of accessing that improvement was approximately the time I spent writing the original post. This kind of exchange isn’t typical — most public technical writing doesn’t surface contributors of this caliber, and I don’t want to overclaim that the platform reliably produces it — but when it does happen, it is one of the most asymmetric returns on time available to an independent builder.
If you’re a builder reading this and wondering whether writing publicly about your half-formed technical decisions is worth the time, this exchange is one data point in favor. The bar is honesty, not polish. The reward, when it lands, is people you didn’t know you needed finding the work and helping you make it better. Being upfront about where you’re new and where you’re still deciding is part of what makes the exchange productive. Performing more certainty than you have closes off the kind of correction that this post benefited from most.
The next post in this series will probably be whatever direction I do eventually pick, written up after I’ve made enough of the decision to have something concrete to report. Until then, the architecture is sitting here, available for application, and the comments are open if any of the reasoning is wrong.

@itsevilduck 🦆 / quackbuilds.com

Top comments (2)

QBitFlow

One sharpening on the "weekend-shaped or quarter-shaped" line, since it's the one I'd most expect engineers to push back on with their own experience: the bimodal shape is real, but the sequencing matters more than the binary. The expensive chain in our case wasn't the second one to ship — it was the first non-EVM one to implement. We shipped Ethereum and Solana on day one, but Ethereum was the path I worked out completely, and Solana was the one that forced the architectural question: do I bake EVM-vs-Solana into the data model, or do I treat chain as a runtime value? Doing the latter is what made Base a weekend later. The honest version of the bimodal claim is something like: the first non-EVM chain pays for the discipline; every subsequent chain (EVM or otherwise) compounds the return. The trap is that the cost of the discipline arrives before you can validate it with users — you're paying for the abstraction during implementation of chain #2, before the value of having it shows up in chain #3.

ItsEvilDuck

Paul — this is the sharpening the post needed and didn’t have. The “first non-EVM chain pays for the discipline; every subsequent chain compounds the return” framing is more accurate than the bimodal version I had, and it’s also more useful to anyone reading this who hasn’t yet shipped non-EVM. The bimodal claim describes the outcome; your version explains the mechanism — it’s the moment of EVM-vs-non-EVM forcing the data-model question that triggers either the discipline or the shortcut, and once the discipline is paid for, every chain after that is interface implementation rather than architectural rework.
The implication for QuackBuilds specifically, which I’m only now thinking through, is that the question of whether to add a second chain isn’t really about the which — it’s about whether I’m willing to pay the abstraction cost during chain #2 to avoid paying it during chain #3 or #4. That changes the framing of the decision in a useful way. The cost isn’t sunk against a specific second chain; it’s sunk against the architectural shape of the system going forward.
Going to be thinking about this for a while. Thank you for the production-evidence sharpening — the post is meaningfully better with this comment underneath it than it is on its own.

— Luke / @itsevilduck 🦆 quackbuilds.com