Lakshay Tyagi

Posted on May 25 • Originally published at imlakshay08-complete-ruby-on-rails.hashnode.dev

How I Built a Rails App That Syncs Empire Flippers Listings to HubSpot and Google Sheets

#ruby #rails #webdev #api

A Rails integration app that syncs Empire Flippers listings to HubSpot and Google Sheets using Sidekiq, PostgreSQL, and service objects.

The Origin Story

This project started as a coding challenge for a job interview at Empire Flippers — the #1 marketplace for buying and selling online businesses. The challenge was simple on paper:

"Use our Public Listings API as an input data source, store listings in a database, create Deal objects in HubSpot, and run the sync once per day."

What followed was five days of learning Ruby on Rails from scratch, debugging SSL certificates on Windows, fighting libcurl DLLs, pagination loops that ran forever, and ultimately shipping a working integration with 213 real HubSpot deals and a live Google Sheet — all with a full RSpec test suite.

This is that story.

What the App Does

Empire Flippers lists online businesses for sale — Amazon FBA stores, SaaS products, content sites, eCommerce businesses. Their platform is essentially a marketplace for digital assets, similar to a real estate agency but for online businesses.

The challenge was to build a Rails app that:

Fetches all "For Sale" listings from Empire Flippers' public API
Stores listing data in a PostgreSQL database
Creates a corresponding Deal in HubSpot CRM for each listing
Exports all listings to a Google Sheet
Runs the entire sync automatically once per day via background jobs
Never creates duplicate listings or duplicate HubSpot deals — even if the sync runs multiple times

SyncListingsJob (runs daily at midnight via Sidekiq + sidekiq-scheduler)
    ↓
ListingSyncService (orchestrator)
    ├── EmpireFlippersService  →  fetches all listings from EF API (paginated)
    ├── HubspotService         →  creates deals in HubSpot CRM
    └── GoogleSheetsService    →  exports listings to Google Sheet

The Tech Stack

Layer	Technology
Framework	Ruby on Rails 7.1
Database	PostgreSQL
Background Jobs	Sidekiq + sidekiq-scheduler
EF API Client	HTTParty
HubSpot	hubspot-api-client gem
Google Sheets	google-apis-sheets_v4 gem
Testing	RSpec + factory_bot
Environment	dotenv-rails

Architecture: Service Objects

The first design decision was choosing service objects over fat controllers or fat models.

Each service has exactly one responsibility:

# EmpireFlippersService — knows how to talk to the EF API
# HubspotService        — knows how to talk to HubSpot
# GoogleSheetsService   — knows how to talk to Google Sheets
# ListingSyncService    — orchestrates all three, touches nothing else

This matters more than it seems. When the EF API added pagination requirements mid-project, only EmpireFlippersService needed to change. When HubSpot needed a different property format, only HubspotService changed. Nothing else was touched.

This is the Single Responsibility Principle in practice — each class has exactly one reason to change.

The Database Layer: Preventing Duplicates

The Listing model is simple but has two layers of duplicate prevention:

class Listing < ApplicationRecord
  validates :listing_number, presence: true, uniqueness: true
  validates :listing_price, presence: true
end

And at the database level:

add_index :listings, :listing_number, unique: true

Why both? The model validation catches duplicates in Ruby before touching the database. The database index is the last line of defense — even if two requests arrive simultaneously, PostgreSQL rejects the second one.

The database columns:

listing_number              — EF's unique ID for each business
listing_price               — sale price in USD
summary                     — full business description
listing_status              — "For Sale"
hubspot_deal_id             — stored after creating the HubSpot deal
average_monthly_net_profit  — monthly earnings

Fetching Listings: The Pagination Problem

The first version of EmpireFlippersService was simple:

def self.fetch_listings
  response = HTTParty.get(BASE_URL, query: {
    listing_status: "For Sale",
    limit: 100
  })
  response["data"]["listings"]
end

This returned 100 listings. During the interview, the interviewer noted: "There are 213 listings — I want all of them."

The fix required pagination:

def self.fetch_listings
  listings = []
  page = 1

  loop do
    response = HTTParty.get(BASE_URL, query: {
      listing_status: "For Sale",
      limit: 100,
      page: page
    })

    raise "Empire Flippers API error: #{response.code}" unless response.success?

    page_listings = response["data"]["listings"]
    break if page_listings.empty?

    listings.concat(page_listings)
    page += 1
  end

  listings
rescue StandardError => e
  Rails.logger.error "Failed to fetch EF listings: #{e.message}"
  raise
end

How it works: Page 1 returns 100 listings, page 2 returns 100 more, page 3 returns 13, page 4 returns empty — loop breaks. Total: 213.

The raise at the end of the rescue is intentional. Re-raising the error tells Sidekiq the job failed, triggering automatic retries. If the EF API is temporarily down, Sidekiq retries up to 25 times with exponential backoff — no manual intervention required.

The Sync Orchestrator: Idempotent by Design

The ListingSyncService is the heart of the app:

class ListingSyncService
  def self.sync
    listings = EmpireFlippersService.fetch_listings

    listings.each do |listing_data|
      listing = Listing.find_or_create_by(listing_number: listing_data["listing_number"]) do |l|
        l.listing_price = listing_data["listing_price"]
        l.summary = listing_data["summary"]
        l.listing_status = listing_data["listing_status"]
        l.average_monthly_net_profit = listing_data["average_monthly_net_profit"]
      end

      if listing.hubspot_deal_id.nil?
        deal_id = HubspotService.create_deal(listing)
        listing.update(hubspot_deal_id: deal_id)
      end
    end

    GoogleSheetsService.export(Listing.all)
  end
end

Three things worth noting:

find_or_create_by — This is the duplicate prevention for the database. If listing #91258 already exists, return it. If not, create it. The do |l| block only runs for new records.

if listing.hubspot_deal_id.nil? — After saving to the database, check if we already created a HubSpot deal. If yes, skip. If no, create and save the returned deal ID. This prevents duplicate HubSpot deals even if the sync runs ten times.

GoogleSheetsService.export(Listing.all) — We pass all listings from the database, not just new ones. The challenge requires clearing and rewriting the sheet completely each run.

This makes the entire sync idempotent — running it 10 times produces the same result as running it once.

HubSpot Integration: The libcurl Adventure

The HubSpot service itself is straightforward:

class HubspotService
  def self.create_deal(listing)
    client = Hubspot::Client.new(access_token: ENV['HUBSPOT_ACCESS_TOKEN'])

    properties = {
      dealname: "Listing ##{listing.listing_number}",
      amount: listing.listing_price.to_s,
      closedate: (Time.now + 30.days).strftime("%Y-%m-%dT%H:%M:%S.%LZ"),
      description: listing.summary,
      average_monthly_net_profit: listing.average_monthly_net_profit
    }

    response = client.crm.deals.basic_api.create(
      simple_public_object_input_for_create: { properties: properties }
    )

    response.id
  end
end

But getting here on Windows was not straightforward.

The HubSpot gem uses typhoeus internally, which requires libcurl. On Windows, typhoeus looks for libcurl.dll specifically — not the curl executable that ships with Windows 10+.

The fix:

# Download the curl DLL
Invoke-WebRequest -Uri "https://curl.se/windows/dl-8.11.1_2/curl-8.11.1_2-win64-mingw.zip" -OutFile "$env:TEMP\curl.zip"
Expand-Archive "$env:TEMP\curl.zip" -DestinationPath "$env:TEMP\curl"

# Copy to Ruby's bin directory
Copy-Item "$env:TEMP\curl\curl-8.11.1_2-win64-mingw\bin\libcurl-x64.dll" "C:\Ruby31-x64\bin\"

# Create a copy with the exact name typhoeus looks for
Copy-Item "C:\Ruby31-x64\bin\libcurl-x64.dll" "C:\Ruby31-x64\bin\libcurl.dll"

Then SSL verification on Windows required one more fix — a Typhoeus initializer:

# config/initializers/hubspot.rb
require 'typhoeus'

Typhoeus.before do |request|
  request.options[:ssl_verifypeer] = false
  request.options[:ssl_verifyhost] = 0
end

This is a Windows-only workaround. On a Linux production server, neither fix is needed.

Google Sheets Integration: The Permission Problem

The Google Sheets service authenticates via a service account:

def client
  @client ||= begin
    service = Google::Apis::SheetsV4::SheetsService.new
    service.authorization = Google::Auth::ServiceAccountCredentials.make_creds(
      json_key_io: File.open(ENV['GOOGLE_SHEETS_CREDENTIALS_PATH']),
      scope: SCOPE
    )
    service
  end
end

The @client ||= is memoization — create the authenticated client once per object instance, reuse it for all API calls. No re-authenticating on every sheet operation.

The first attempt to create a sheet from the service account returned a 403 PERMISSION_DENIED. The reason: service accounts live in their own Google Drive, separate from your personal Drive. They can't create sheets in your Drive without being explicitly granted access.

The clean solution: create the sheet manually in Google Sheets, share it with the service account email (empire-sync@project-id.iam.gserviceaccount.com), and store the sheet ID in .env. The service account then has Editor access to write data.

def find_or_create_sheet
  sheet_id = GoogleSheetReference.first&.sheet_id

  if sheet_id.nil?
    sheet_id = ENV['GOOGLE_SHEET_ID']
    GoogleSheetReference.create!(sheet_id: sheet_id)
  end

  sheet_id
end

GoogleSheetReference stores the sheet ID in the database — the sheet is created once and reused on every subsequent sync run.

Writing the data:

def write_rows(sheet_id, listings)
  values = [["Listing #", "Listing Price", "Summary"]]

  listings.each do |listing|
    values << [
      listing.listing_number.to_s,
      listing.listing_price.to_s,
      listing.summary
    ]
  end

  value_range = Google::Apis::SheetsV4::ValueRange.new(values: values)

  client.update_spreadsheet_value(
    sheet_id,
    'Sheet1',
    value_range,
    value_input_option: 'RAW'
  )
end

Each run: clear the sheet, write the header row, write one row per listing. Simple and deterministic.

Testing with RSpec: TDD in Practice

Every service has a corresponding spec. The testing philosophy: never hit real APIs in tests. Stub everything external, test your logic in isolation.

RSpec.describe ListingSyncService do
  describe ".sync" do
    let(:fake_listings) do
      [{
        "listing_number" => 12345,
        "listing_price" => 50000,
        "summary" => "A great business",
        "listing_status" => "For Sale"
      }]
    end

    before do
      allow(EmpireFlippersService).to receive(:fetch_listings).and_return(fake_listings)
      allow(HubspotService).to receive(:create_deal).and_return("hs_deal_123")
      allow(GoogleSheetsService).to receive(:export)
    end

    it "syncs listings from Empire Flippers" do
      ListingSyncService.sync
      expect(Listing.count).to eq(1)
      expect(Listing.first.listing_number).to eq(12345)
    end

    it "does not create duplicate listings" do
      ListingSyncService.sync
      ListingSyncService.sync
      expect(Listing.count).to eq(1)
    end

    it "creates a hubspot deal for new listings" do
      ListingSyncService.sync
      expect(HubspotService).to have_received(:create_deal)
      expect(Listing.first.hubspot_deal_id).to eq("hs_deal_123")
    end

    it "does not create hubspot deal if one already exists" do
      ListingSyncService.sync
      allow(HubspotService).to receive(:create_deal)
      ListingSyncService.sync
      expect(HubspotService).to have_received(:create_deal).once
    end
  end
end

The allow(...).to receive(...).and_return(...) pattern is stubbing — replacing the real method with a fake that returns predetermined data. Tests run in milliseconds and never depend on the internet.

The duplicate prevention test is particularly important: run the sync twice, expect only one listing in the database. If find_or_create_by is broken, this test catches it immediately.

The Daily Scheduler

Sidekiq handles background job processing. sidekiq-scheduler adds cron-like scheduling on top:

# config/sidekiq.yml
:scheduler:
  :schedule:
    sync_listings:
      :cron: "0 0 * * *"
      :class: SyncListingsJob

0 0 * * * = minute 0, hour 0, every day = midnight.

The job itself is intentionally thin:

class SyncListingsJob < ApplicationJob
  queue_as :default

  def perform
    ListingSyncService.sync
  end
end

Business logic lives in the service. The job is just a trigger. This makes the service independently testable and independently callable — from the console, from a rake task, from anywhere.

End-to-End Results

After running ListingSyncService.sync in the Rails console:

Listing.count                                    # => 213
Listing.where.not(hubspot_deal_id: nil).count    # => 213

213 listings saved to PostgreSQL. 213 HubSpot deals created with names like "Listing #91258", amounts like $1,293,233, and 30-day close dates. 213 rows in the Google Sheet with headers.

Running the sync a second time:

Listing.count                                    # => 213  (no duplicates)
HubSpot deals created                            # => 0    (all already exist)

Idempotency confirmed.

Key Learnings

1. Service objects make testing trivial. When each service has one job, you stub one thing and test one thing. The alternative — fat models or fat controllers — makes tests brittle and slow.

2. Idempotency is a design decision, not an afterthought. find_or_create_by + hubspot_deal_id nil check = a sync that can run 100 times safely. Design for repeated execution from the start.

3. Windows is a special case. Most Rails documentation assumes Linux. libcurl, SSL certificates, PowerShell aliases — none of these bite you on a Linux server. If you're developing on Windows, budget time for platform-specific debugging.

4. Memoize expensive operations. The @client ||= pattern in GoogleSheetsService creates one authenticated connection per sync run. Without it, every API call would re-authenticate — unnecessary and slower.

5. Re-raise errors in background jobs. Catching an error and logging it silently means Sidekiq thinks the job succeeded. Re-raising means Sidekiq retries automatically. For production background jobs, retries are almost always what you want.

What I'd Add Next

Pagination for listings over 1000 — loop until empty works but could be parallelized with ProcessListingJob workers for scale
Sentry for error alerting — know within minutes when a midnight sync fails
Webhook from EF — instead of polling once per day, react to listing changes in real time
Soft deletes for sold listings — currently sold listings stay in the database indefinitely; they should be marked and excluded from HubSpot

The Bigger Picture

Empire Flippers facilitates the sale of online businesses in the 5 to 8 figure USD range. Every listing in that Google Sheet and every deal in that HubSpot account represents a real business someone built — an Amazon FBA store, a content site, a SaaS product — and is now ready to sell.

Building integrations between systems is what Rails is genuinely good at. Not flashy frontends or complex algorithms — clean, maintainable code that connects APIs together and runs reliably in the background while you sleep.

That's what this project is.

💻 Source Code: github.com/imlakshay08/empire-sync

If you've built HubSpot or Google Sheets integrations with Rails and ran into different issues — I'd love to hear about it in the comments.

DEV Community