DEV Community

Seryl Lns

Building a Rails Engine #16 -- Publishing to RubyGems & Retrospective

Publishing to RubyGems & Retrospective

From bundle gem to gem push: looking back at 14 articles, 20 components, and the lessons learned building a Rails engine from scratch with TDD.

Context

This is the final article in the series where we build DataPorter, a mountable Rails engine for data import workflows. In part 14, we added Dry Run mode -- the last safety net before data touches the database.

We started this series with a question: why do we keep rebuilding the same import workflow in every Rails app? Fourteen articles later, we have a published gem that answers it. This article covers the last mile -- publishing to RubyGems -- then steps back to look at what we built, what we learned, and what we would do differently.

Publishing the gem

The gemspec

The interesting parts of the gemspec are not the metadata -- they are the constraints:

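# data_porter.gemspec (excerpt: summary, authors, and files omitted)
require_relative "lib/data_porter/version" # defines DataPorter::VERSION

Gem::Specification.new do |spec|
  spec.name = "data_porter"
  spec.version = DataPorter::VERSION
  spec.required_ruby_version = ">= 3.2.0"

  # Require MFA on the RubyGems account for privileged operations such as push
  spec.metadata["rubygems_mfa_required"] = "true"

  # Runtime dependencies: lower bounds only, no upper caps
  spec.add_dependency "csv"
  spec.add_dependency "phlex", ">= 1.0"
  spec.add_dependency "rails", ">= 7.0"
  spec.add_dependency "store_model", ">= 2.0"
  spec.add_dependency "turbo-rails", ">= 1.0"
end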

rubygems_mfa_required enforces multi-factor authentication for publishing -- a standard for any serious open-source gem. required_ruby_version at >= 3.2.0 excludes unmaintained Ruby versions. Runtime dependencies are intentionally wide (>= 1.0, >= 7.0) to avoid locking host apps to specific versions.
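
To make the trade-off behind the wide constraints concrete, here is a small sketch using RubyGems' own `Gem::Requirement` to compare an open-ended constraint with a pessimistic one. The host version is purely illustrative and is not taken from DataPorter's test matrix.

```ruby
require "rubygems" # loaded by default; explicit so the snippet stands alone

open_ended  = Gem::Requirement.new(">= 7.0")
pessimistic = Gem::Requirement.new("~> 7.0") # equivalent to >= 7.0, < 8.0

host_rails = Gem::Version.new("8.0.1") # hypothetical host app version

open_ended_ok  = open_ended.satisfied_by?(host_rails)
pessimistic_ok = pessimistic.satisfied_by?(host_rails)

puts "'>= 7.0' accepts #{host_rails}: #{open_ended_ok}"
puts "'~> 7.0' accepts #{host_rails}: #{pessimistic_ok}"
```

The open-ended form lets a host app upgrade Rails without waiting for a new gem release; the price is that a future major version may break the gem without Bundler noticing.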

The spec.files filter excludes dev files (spec/, bin/, .github/) so the published gem only contains production code. Nobody wants to download 2 MB of specs when installing a gem.
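
The filter itself is not shown in this article, so the following is only a sketch of one way to write it. The helper name `packaged_files` and the sample file list are assumptions for illustration; the real gemspec may filter differently.

```ruby
# Hypothetical helper: prefixes match the directories named in the article.
DEV_ONLY_PREFIXES = %w[spec/ bin/ .github/].freeze

def packaged_files(tracked_files)
  tracked_files.reject { |path| path.start_with?(*DEV_ONLY_PREFIXES) }
end

# In a gemspec the input would typically come from Git:
#   spec.files = packaged_files(`git ls-files -z`.split("\x0"))

tracked = %w[
  lib/data_porter.rb
  lib/data_porter/version.rb
  spec/spec_helper.rb
  bin/console
  .github/workflows/ci.yml
  README.md
]

shipped = packaged_files(tracked)
puts shipped
```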

Versioning

DataPorter follows semantic versioning:

  • 0.1.0: first release. The 0.x signals that the API may still evolve.
  • 0.x.y: each new feature increments minor, each bugfix increments patch.
  • 1.0.0: comes when the API is stabilized and battle-tested in production across multiple apps.

The version number lives in a single file:

# lib/data_porter/version.rb
module DataPorter
  VERSION = "0.1.0"
end

One place to update. The gemspec reads it via require_relative. The CHANGELOG references it. The Git tag matches it. No duplication.
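
Keeping the three in agreement is still a manual step unless something checks it. As a sketch, a pre-release check along these lines could catch a stale CHANGELOG or tag. The method name and the assumed CHANGELOG heading layout (`## [x.y.z] - date`) are illustrative, not part of DataPorter.

```ruby
# Hypothetical pre-release check: the version must appear as a CHANGELOG
# heading and the Git tag must be "v" followed by the same version.
def release_consistent?(version:, changelog:, tag:)
  changelog_ok = changelog.match?(/^## \[#{Regexp.escape(version)}\]/)
  tag_ok = tag == "v#{version}"
  changelog_ok && tag_ok
end

version = "0.1.0"
changelog = <<~MD
  # Changelog

  ## [0.1.0] - 2026-02-06
  - Initial release
MD

consistent = release_consistent?(version: version, changelog: changelog, tag: "v0.1.0")
stale_tag  = release_consistent?(version: version, changelog: changelog, tag: "v0.0.9")

puts "matching tag: #{consistent}, stale tag: #{stale_tag}"
```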

The release workflow

# 1. Update version
# lib/data_porter/version.rb -> VERSION = "0.1.0"

# 2. Update CHANGELOG
# CHANGELOG.md -> ## [0.1.0] - 2026-02-06

# 3. Commit, tag, push
git add -A && git commit -m "Release v0.1.0"
git tag v0.1.0
git push origin master --tags

# 4. Build and push
gem build data_porter.gemspec
gem push data_porter-0.1.0.gem

Or, if the Rakefile includes bundler/gem_tasks:

bundle exec rake release

This single command builds, tags, pushes to Git, and pushes to RubyGems -- guaranteeing the tag and the gem stay in sync.

Documentation

A gem without documentation is a gem nobody will use. DataPorter relies on three layers:

The README: entry point. Install in one command (rails generate data_porter:install), a 15-line Target example, the three-step workflow diagram. A developer should understand what the gem does and install it in under 5 minutes.

The CHANGELOG: every release documented with what changed, what was added, what broke. Keep a Changelog format -- a standard the Ruby community knows.

Inline comments: every public method documented with YARD. The DSL is the critical part -- column, sources, csv_mapping, persist need examples, because that is what users will read most.
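
For readers who have not used YARD, this is roughly what a documented DSL method looks like. `ExampleTarget` is a stand-in written for this illustration; the signature of DataPorter's actual `column` method is an assumption here.

```ruby
# Hypothetical stand-in for the Target DSL, only to show the YARD layout.
class ExampleTarget
  class << self
    # Declares an importable column.
    #
    # @param name [Symbol] attribute name on the imported record
    # @param type [Symbol] expected value type, e.g. :string or :email
    # @param required [Boolean] whether a blank value is an error
    # @return [Hash] the stored column definition
    #
    # @example Declare a required email column
    #   column :email, type: :email, required: true
    def column(name, type: :string, required: false)
      columns[name] = { type: type, required: required }
    end

    # @return [Hash{Symbol => Hash}] declared columns keyed by name
    def columns
      @columns ||= {}
    end
  end
end

ExampleTarget.column :email, type: :email, required: true
puts ExampleTarget.columns.inspect
```

The `@example` tag is the part that pays off most for a DSL, since it shows the call exactly as a user would write it.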

What we built

Here is the complete list of components that make up DataPorter, in the order we built them:

| # | Component | Role |
|---|-----------|------|
| 1 | Engine + isolate_namespace | Gem structure, namespace isolation |
| 2 | Configuration DSL | DataPorter.configure, defaults, context_builder |
| 3 | StoreModels (ImportRecord, Error, Report) | Typed JSONB structures without extra tables |
| 4 | TypeValidator | Type validation (email, phone, url) on columns |
| 5 | Target DSL | label, model, columns, sources, persist |
| 6 | Registry | Auto-discovery and resolution of targets |
| 7 | Source::Base + Source::CSV | Source abstraction, CSV parsing with mapping |
| 8 | DataImport model | ActiveRecord, enum status, polymorphic user |
| 9 | Orchestrator | Coordinates parse/import, per-record error handling |
| 10 | RecordValidator | Generic validations (required, type) |
| 11 | ParseJob + ImportJob | Background processing via ActiveJob |
| 12 | Broadcaster + ImportChannel | Real-time progress via ActionCable |
| 13 | 6 Phlex components | StatusBadge, SummaryCards, PreviewTable, ProgressBar, ResultsSummary, FailureAlert |
| 14 | Stimulus controller | Client-side progress bar animation |
| 15 | ImportsController | Dynamic inheritance, 7 actions, Turbo integration |
| 16 | Install generator | Migration, initializer, routes, importers directory |
| 17 | Target generator | Target scaffolding with column parsing |
| 18 | Source::JSON | Import from JSON file or raw text |
| 19 | Source::API | Import from HTTP endpoint with auth and params |
| 20 | Dry Run | Transaction + rollback, enriches records with DB errors |

Twenty components. Each with its specs. Each with an article explaining why it exists and how it works.

Lessons learned

TDD without a dummy app

The most consequential decision of the series: testing the engine without creating a Rails application in spec/dummy/. Instead, a 60-line spec_helper.rb bootstraps in-memory SQLite, configures load paths, and stubs ApplicationController. It works, and it works well -- the full suite runs in under a second.

The unexpected benefit: this constraint forces every component to stay decoupled. If a component needs a router to be tested, that is a signal it is too tightly coupled to the framework. Structural controller tests (verifying inheritance, callbacks, method signatures) felt strange at first. In hindsight, they test exactly what the gem owns -- the wiring -- and leave integration testing to the host app.

The trap to avoid: duplication between the schema in spec_helper.rb and the migration template. If the two diverge, tests pass but the generated migration does not match what was tested.
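
One way to guard against that drift is a spec that compares the two definitions. The sketch below does this on inline strings; in a real suite the inputs would be read from spec_helper.rb and the migration template. The table and column names are invented for the example.

```ruby
# Hypothetical guard: extracts column names from a schema written in the
# ActiveRecord table DSL (lines of the form `t.type :name`).
def declared_columns(source)
  source.scan(/^\s*t\.\w+\s+:(\w+)/).flatten.sort
end

spec_schema = <<~RUBY
  create_table :data_porter_imports do |t|
    t.string :target_key
    t.integer :status
    t.text :records
  end
RUBY

migration_template = <<~RUBY
  create_table :data_porter_imports do |t|
    t.string :target_key
    t.integer :status
    t.jsonb :records
    t.jsonb :report
  end
RUBY

missing_in_spec = declared_columns(migration_template) - declared_columns(spec_schema)
puts "columns missing from the spec schema: #{missing_in_spec.inspect}"
```

The check compares names only, so the text/jsonb difference between SQLite and PostgreSQL does not trigger it.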

StoreModel gotchas

StoreModel is powerful, but it has its subtleties:

Dirty tracking: when you modify an object inside a store_model attribute, ActiveRecord does not detect the change. You can set data_import.records.first.status = "complete" and call save -- nothing gets persisted. The fix: call records_will_change! before modifying, or reassign the entire attribute.

Serialization round-trip: symbol keys become string keys after save/reload. { name: "Alice" } comes back as { "name" => "Alice" }. You need to know this and code accordingly -- either always use string keys, or call symbolize_keys on the way out. DataPorter does the latter in ImportRecord#attributes.
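
The effect is easy to reproduce with plain JSON, which is what the column stores. This sketch uses Ruby's `json` library rather than StoreModel itself, and `transform_keys` as the plain-Ruby counterpart of ActiveSupport's `symbolize_keys` for a flat hash.

```ruby
require "json"

original = { name: "Alice", age: 30 }

# Serializing to JSON and parsing it back mimics a save followed by a reload.
reloaded = JSON.parse(JSON.generate(original))

# Convert keys back to symbols on the way out.
symbolized = reloaded.transform_keys(&:to_sym)

puts reloaded.inspect
puts symbolized.inspect
```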

SQLite vs PostgreSQL: in tests, StoreModel columns are text. In production, they are jsonb. StoreModel handles the difference transparently, but JSONB-specific features (GIN indexes, containment queries such as @>) cannot be tested in SQLite. An acceptable tradeoff for the speed of the feedback loop.

Phlex in an engine: plain vs text

A Phlex-specific trap: to emit a text node inside an element, you must use plain (not text). Earlier Phlex versions had a text method, but it was renamed to plain. If you call text on a recent version, you get a cryptic NoMethodError.

The other subtlety: calling super() in every component's initialize. Phlex requires it, and forgetting it produces silent errors or empty renders.

Testing patterns: controllers, channels, JS

Testing JavaScript from Ruby by reading the file as text and asserting on strings -- it sounds hacky. In practice, it catches the most common bug class in an engine: misalignment between Ruby and JS code. The channel is called DataPorter::ImportChannel in Ruby and "DataPorter::ImportChannel" in JS. If one changes and the other does not, the test fails. For a single 30-line Stimulus file, that beats adding Jest and node_modules to the project.
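
As a sketch of that pattern, the check below asserts that the JS source subscribes to the same channel name the Ruby side uses. The JS body and the helper `subscribes_to?` are stand-ins written for this example; in a real spec the source would come from `File.read` on the engine's Stimulus controller.

```ruby
# Stand-in for the name of the Ruby channel class.
RUBY_CHANNEL_NAME = "DataPorter::ImportChannel"

# Stand-in for the contents of the Stimulus controller file.
js_source = <<~JS
  import { Controller } from "@hotwired/stimulus"

  export default class extends Controller {
    connect() {
      this.subscription = this.consumer.subscriptions.create(
        { channel: "DataPorter::ImportChannel", id: this.idValue },
        { received: (data) => this.update(data) }
      )
    }
  }
JS

# True when the JS subscribes to exactly the given channel name.
def subscribes_to?(js, channel_name)
  js.include?(%(channel: "#{channel_name}"))
end

puts subscribes_to?(js_source, RUBY_CHANNEL_NAME)
```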

Structural controller tests (_process_action_callbacks, instance_method, superclass) form a contract: the gem guarantees the controller has the right shape. The host app guarantees it behaves correctly in context. A clean separation of responsibilities.
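
The shape of such a test can be shown without loading Rails. In this sketch both classes are plain-Ruby stand-ins, and the action list is invented; the Rails-specific `_process_action_callbacks` check is left out because it needs ActionController.

```ruby
# Stand-in for the host application's base controller.
class HostApplicationController; end

# In an engine the parent would come from configuration; here it is fixed.
parent = HostApplicationController
ImportsController = Class.new(parent) do
  def index; end
  def show; end
  def create; end
end

# Structural assertions: inheritance, presence of actions, method signature.
checks = {
  inherits_from_parent: ImportsController.superclass == HostApplicationController,
  defines_actions: %i[index show create].all? { |a| ImportsController.public_method_defined?(a) },
  actions_take_no_args: ImportsController.instance_method(:index).arity.zero?
}

puts checks.inspect
```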

What is next

DataPorter 0.1.0 covers the standard workflow. Here is what could come in future versions:

Batch imports: for 100k+ line files, import in batches of 1000 with insert_all instead of create! per record. This requires rethinking the persist contract -- instead of one record at a time, the target would receive a batch.
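
A batch-oriented contract might look like the sketch below. The method name `import_in_batches` is hypothetical, and a lambda stands in for a call such as `Model.insert_all(batch)` so the example runs without a database.

```ruby
BATCH_SIZE = 1000

# Hypothetical batch contract: `sink` receives an array of attribute
# hashes per call, the way Model.insert_all would.
def import_in_batches(rows, sink, batch_size: BATCH_SIZE)
  rows.each_slice(batch_size).sum do |batch|
    sink.call(batch)
    batch.size
  end
end

calls = []
fake_insert_all = ->(batch) { calls << batch.size }

rows = Array.new(2_500) { |i| { email: "user#{i}@example.com" } }
imported = import_in_batches(rows, fake_insert_all)

puts "imported #{imported} rows in #{calls.size} batches"
```

One design consequence worth noting: insert_all does not instantiate models, so validations and callbacks are skipped, and per-record error reporting would have to happen before the batch is written.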

Streaming progress: replace ActionCable with Server-Sent Events (SSE) for apps that do not need bidirectional WebSocket. Lighter, no Redis dependency.

Custom validators: let targets declare validators with a DSL:

columns do
  column :email, type: :email, required: true, validate: ->(val) {
    "already exists" if User.exists?(email: val)
  }
end
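
Since this DSL does not exist yet, the following is only a sketch of how such lambdas could be evaluated: each returns an error message or nil. The method name `validation_errors` and the column hash shape are assumptions, and an array lookup stands in for `User.exists?`.

```ruby
# Hypothetical runner: collects one message per column whose validator
# returns a non-nil value for the record's value.
def validation_errors(columns, record)
  columns.each_with_object({}) do |(name, options), errors|
    validator = options[:validate]
    next unless validator

    message = validator.call(record[name])
    errors[name] = message if message
  end
end

existing_emails = ["taken@example.com"] # stands in for User.exists?(email: val)

columns = {
  email: { validate: ->(val) { "already exists" if existing_emails.include?(val) } },
  name: {}
}

taken = validation_errors(columns, { email: "taken@example.com", name: "Alice" })
fresh = validation_errors(columns, { email: "new@example.com", name: "Bob" })

puts taken.inspect
puts fresh.inspect
```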

Export: the reverse path. If we can parse and validate records, we can serialize them to CSV/JSON. The Target already has all the information needed (columns, types, labels).
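
As a rough sketch of that reverse path, the declared columns can act as a whitelist when serializing records. The column list, record shape, and helper name are hypothetical; JSON is used here, and CSV would follow the same idea.

```ruby
require "json"

# Hypothetical declared columns and records; internal_id is not declared.
columns = %i[email name]
records = [
  { email: "alice@example.com", name: "Alice", internal_id: 1 },
  { email: "bob@example.com", name: "Bob", internal_id: 2 }
]

# Keep only declared columns, in declared order.
def export_rows(columns, records)
  records.map { |record| columns.to_h { |key| [key, record[key]] } }
end

json = JSON.generate(export_rows(columns, records))
puts json
```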

Final reflection

Building DataPorter was an exercise in discipline as much as code. The method -- Taskmaster for planning, TDD for implementation, one article to document each step -- forces explicit decisions. No "we will figure it out later". Every component exists because a test demands it, and every test exists because a behavior was specified.

The choice to skip the dummy app was a gamble. It paid off: tests are fast, components are decoupled, and the gem is testable without Rails infrastructure. But it has a cost -- some integration bugs will only surface in the host app. That is an accepted tradeoff: the gem tests its wiring, the host app tests its behavior.

StoreModel, Phlex, Stimulus -- each dependency brought its share of surprises. StoreModel's dirty tracking, Phlex's plain vs text, Stimulus's double-dash naming for engines. These gotchas appear in no documentation. They appear when a test fails at 11 PM and you read the gem's source code to understand why. That is the real advantage of TDD: you discover problems in the terminal, not in production.

DataPorter is now a published gem on RubyGems. One bundle add data_porter, one rails generate data_porter:install, a 15-line Target, and any Rails app has a complete import system with preview, validation, real-time progress, and dry run.

That was the plan from the start. It took 16 articles to get there.


This is part 16 of the series "Building DataPorter - A Data Import Engine for Rails". Previous: ERB Views Meet Phlex Components


GitHub: SerylLns/data_porter | RubyGems: data_porter
