I built FirstVibe -- an AI-powered selfie analyzer that gives you a "vibe check." Upload a photo and within 30 seconds you get personality scores, a celebrity lookalike, aura type, dating energy, red/green flags, and a bunch of fun predictions. Think "what do people actually think when they first see you?" powered by Claude's vision capabilities.
It has processed over 6,000 selfie analyses so far. Here is how it works under the hood.
The Stack
- Rails 8.0.4 with Ruby 3.3.0
- PostgreSQL with UUID primary keys everywhere
- Hotwire (Turbo + Stimulus) -- no React, no SPA
- Tailwind CSS -- dark theme, no custom CSS beyond a few animations
- Propshaft + Importmap -- zero Node.js, zero Webpack
- Solid Queue -- DB-backed job queue running inside the Puma process
- Claude Sonnet 4 + Haiku 4.5 via the ruby-anthropic gem (Vision API)
- OpenAI API for AI caricature generation
- Stripe Checkout for payments
- Render.com for deployment
The "no Node.js" part is not an accident. Rails 8 with Importmap and Propshaft means I have zero JavaScript build steps. The entire frontend is Stimulus controllers, Turbo Streams, and ERB templates. It keeps things simple.
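For reference, the zero-build setup is essentially the stock Rails 8 importmap configuration. A sketch (the pin names mirror the Rails defaults; the real app's pins may differ):

```ruby
# config/importmap.rb -- no bundler, no node_modules; the browser
# resolves these pins via an import map at runtime.
pin "application"
pin "@hotwired/turbo-rails", to: "turbo.min.js"
pin "@hotwired/stimulus", to: "stimulus.min.js"
pin "@hotwired/stimulus-loading", to: "stimulus-loading.js"
pin_all_from "app/javascript/controllers", under: "controllers"
```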
Architecture: Progressive Rendering with Parallel API Calls
The core challenge was making AI analysis feel fast. Claude Vision is powerful but not instant -- a single call with an image can take 8-15 seconds. Users staring at a blank loading screen for that long would bounce.
My solution: split the analysis into two parallel API calls and stream results progressively as they complete.
```ruby
class ClaudeVisionService
  DETAILS_MODEL = "claude-haiku-4-5-20251001"

  def analyze_progressive(on_core_complete:, on_details_complete:)
    image_base64 = download_and_encode
    overused = AiOutputFrequency.overused_values(@scan.locale)
    core_prompt = build_prompt(:core, overused)
    details_prompt = build_prompt(:details, overused)

    core_queue = Queue.new
    details_queue = Queue.new

    Thread.new do
      response = call_claude(image_base64, core_prompt)
      core_queue.push([:ok, parse_response(response)])
    rescue => e
      core_queue.push([:error, e])
    end

    Thread.new do
      response = call_claude(image_base64, details_prompt, model: DETAILS_MODEL)
      details_queue.push([:ok, parse_response(response)])
    rescue => e
      details_queue.push([:error, e])
    end

    core_status, core_payload = core_queue.pop
    raise core_payload if core_status == :error
    on_core_complete.call(core_payload)

    details_status, details_payload = details_queue.pop
    if details_status == :error
      on_details_complete.call(nil, details_payload)
    else
      on_details_complete.call(details_payload, nil)
    end
  end
end
```
Call 1 (Core) uses Claude Sonnet and returns the main score, vibe label, tags, first impression, and all five category scores. This is the "above the fold" content -- what users see first.
Call 2 (Details) uses Claude Haiku 4.5 (cheaper, faster) and returns everything else: celebrity match, dating energy, theme song, vibe animal, red/green flags, tips, etc. This is the paywall content, so it can arrive a few seconds later without impacting perceived performance.
When core results arrive, the background job broadcasts a Turbo Stream update immediately:
```ruby
# In AnalyzeScanJob
service.analyze_progressive(
  on_core_complete: ->(core_results) {
    scan.update!(
      results: core_results.merge("_details_pending" => true),
      status: :completed,
      completed_at: Time.current
    )
    Turbo::StreamsChannel.broadcast_update_to(
      "scan_#{scan.id}",
      target: "scan_content",
      partial: "scans/results_card",
      locals: { scan: scan, caricature_loading: true }
    )
  },
  on_details_complete: ->(details_results, error) {
    # Merge into existing results, broadcast again
  }
)
```
The user's browser is subscribed to the Turbo Stream channel, so results appear in real-time without polling. I also run a Stimulus-based polling fallback at 3-second intervals for cases where WebSockets drop.
In parallel with all of this, a third job (GenerateCaricatureJob) kicks off an OpenAI image generation request for the AI caricature.
Result: the user sees their core score and personality breakdown within 8-12 seconds, then detailed results fill in over the next few seconds. It feels snappy.
Image Pipeline
Every uploaded photo gets compressed before storage or AI analysis:
```ruby
class ImageCompressor
  MAX_DIMENSION = 768
  JPEG_QUALITY = 70

  def self.compress(uploaded_file)
    result = ImageProcessing::Vips
      .source(uploaded_file.tempfile)
      .resize_to_limit(MAX_DIMENSION, MAX_DIMENSION)
      .saver(quality: JPEG_QUALITY, strip: true)
      .convert("jpeg")
      .call

    base64 = Base64.strict_encode64(result.read)
    result.close
    [base64, "image/jpeg"]
  end
end
```
Vips is fast and memory-efficient. Resizing to 768px and compressing to 70% JPEG quality reduces the image size significantly (often from 3-5MB to 80-150KB) without meaningful quality loss for AI analysis. The compressed image is cached in Redis for 10 minutes so the Vision API call does not need to download from S3 again.
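The 10-minute cache can be sketched as a TTL-keyed fetch. This `ImageCache` class is a stand-in for `Rails.cache` (which the app backs with Redis), and the key format is an assumption:

```ruby
# Stand-in for Rails.cache.fetch(..., expires_in: 10.minutes).
# The real app uses Redis; this in-memory version shows the semantics.
class ImageCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl:)
    @ttl = ttl
    @store = {}
  end

  # Return the cached value if still fresh, otherwise run the block
  # (the expensive compress + encode) and store the result.
  def fetch(key)
    entry = @store[key]
    return entry.value if entry && entry.expires_at > Time.now

    value = yield
    @store[key] = Entry.new(value, Time.now + @ttl)
    value
  end
end

cache = ImageCache.new(ttl: 10 * 60) # 10 minutes, matching the text
encodes = 0
compress = -> { encodes += 1; "base64-jpeg-data" }

cache.fetch("scan_image:abc123") { compress.call } # compresses once
cache.fetch("scan_image:abc123") { compress.call } # served from cache
# encodes == 1: the second Vision call reuses the encoded image
```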
The upload form itself does client-side resizing to 1000px via a Canvas element in a Stimulus controller, so the server-side compression is a second pass. Belt and suspenders.
JSONB for AI Results
All AI analysis results live in a single JSONB column (scans.results). No rigid schema, no migrations when I add a new analysis field.
```ruby
# Free users see limited results
def visible_results
  return results if is_unlocked?

  free_categories = %w[attractiveness confidence]
  categories_preview = results["categories"]&.each_with_object({}) do |(key, val), hash|
    if free_categories.include?(key)
      hash[key] = { "score" => val["score"], "note" => nil }
    else
      hash[key] = { "score" => nil, "note" => nil }
    end
  end

  {
    "overall_score" => results["overall_score"],
    "vibe_label" => results["vibe_label"],
    "vibe_tags" => results["vibe_tags"],
    "first_impression" => truncate_to_first_sentence(results["first_impression"]),
    "categories" => categories_preview,
    "celebrity_match" => nil,
    "aura_type" => results["aura_type"].present? ?
      { "name" => results["aura_type"]["name"], "hex" => results["aura_type"]["hex"], "why" => nil } : nil,
    # ... everything else nil/locked
  }
end
```
The visible_results method controls the free/paid boundary in one place. Free users see the overall score, vibe label, 2 of 5 category scores (attractiveness and confidence -- the hooks), and a truncated first impression. Everything else is locked behind a $1.99-$2.49 paywall.
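The `truncate_to_first_sentence` helper is referenced above but not shown; here is a minimal plausible implementation (my guess at the behavior, not the app's actual code):

```ruby
# Hypothetical version of the truncate_to_first_sentence helper used by
# visible_results: keep only the first sentence of the AI's impression.
def truncate_to_first_sentence(text)
  return nil if text.nil?

  # Split on whitespace that follows sentence-ending punctuation.
  sentences = text.split(/(?<=[.!?])\s+/)
  sentences.one? ? text : sentences.first
end

truncate_to_first_sentence("Warm, confident energy. The smile carries it.")
# => "Warm, confident energy."
```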
Solving AI Output Repetition
One problem I did not anticipate: Claude would fall into patterns. After a few hundred scans, way too many people were getting "Timothée Chalamet" as their celebrity match or "Golden Hour Aura" as their aura type.
My solution was an AiOutputFrequency tracking system:
```ruby
class AiOutputFrequency < ApplicationRecord
  TRACKED_FIELDS = %w[celebrity_match vibe_label vibe_animal aura_type].freeze
  WINDOW = 100
  THRESHOLD = 0.15

  def self.overused_values(locale)
    overused = {}
    TRACKED_FIELDS.each do |field|
      recent = where(field_name: field, locale: locale)
                 .order(created_at: :desc).limit(WINDOW).pluck(:field_value)
      total = recent.size.to_f
      next if total < 10

      recent.tally.each do |value, count|
        if count / total > THRESHOLD
          overused[field] ||= []
          overused[field] << value
        end
      end
    end
    overused
  end
end
```
Before each analysis, the service checks the last 100 outputs per field per locale. Anything appearing more than 15% of the time gets injected into the prompt as a "DIVERSITY RULE -- do NOT use these recently overused values." It works well. Celebrity matches and vibe labels are now genuinely varied.
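The same tally-and-threshold logic, detached from ActiveRecord so it runs standalone (the sample values below are made up):

```ruby
# Standalone illustration of the 15% overuse threshold over a
# 100-output window, mirroring AiOutputFrequency's semantics.
WINDOW = 100
THRESHOLD = 0.15

def overused(recent_values)
  window = recent_values.last(WINDOW)
  total = window.size.to_f
  return [] if total < 10 # not enough data yet

  # Any value appearing in more than 15% of the window is flagged.
  window.tally.select { |_value, count| count / total > THRESHOLD }.keys
end

recent = ["Timothée Chalamet"] * 20 + ["Zendaya"] * 10 +
         ["Dev Patel"] * 6 + ["Florence Pugh"] * 4

overused(recent)
# => ["Timothée Chalamet", "Zendaya"]  (50% and 25% of the window)
```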
Session-Based Identity (No Auth)
There are no user accounts. No sign-up, no login, no OAuth. Identity is a signed cookie:
```ruby
module VisitorTrackable
  extend ActiveSupport::Concern

  included do
    before_action :ensure_visitor_id
  end

  def ensure_visitor_id
    cookies.signed[:visitor_id] ||= {
      value: SecureRandom.uuid,
      expires: 30.days.from_now,
      httponly: true,
      secure: Rails.env.production?
    }
  end
end
```
This was a deliberate choice. The product is a quick, fun interaction -- forcing account creation before seeing results would tank conversions. The visitor ID links to scans, payments, experiment assignments, and analytics events. If someone pays, they can optionally save an email to access their results later.
Rate Limiting
Rack::Attack handles all rate limiting with a dedicated MemoryStore so counters do not compete with the application cache:
- Scan creation: 3/hour per IP (AI calls are expensive)
- Checkout: 10/hour per IP
- OTP verification: 10/hour per IP
- Admin panel: 5/minute per IP
- General: 60 requests/minute per IP
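A sketch of what that initializer looks like. The throttle names and paths here are assumptions; only the limits come from the list above:

```ruby
# config/initializers/rack_attack.rb -- sketch, not the app's actual file.
# Dedicated store so throttle counters never evict application cache entries.
Rack::Attack.cache.store = ActiveSupport::Cache::MemoryStore.new

Rack::Attack.throttle("scans/ip", limit: 3, period: 1.hour) do |req|
  req.ip if req.post? && req.path == "/scans"
end

Rack::Attack.throttle("checkout/ip", limit: 10, period: 1.hour) do |req|
  req.ip if req.post? && req.path.start_with?("/checkout")
end

Rack::Attack.throttle("req/ip", limit: 60, period: 1.minute) do |req|
  req.ip
end
```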
Cost Optimization: Prompt Caching
Claude's prompt caching (cache_control: { type: "ephemeral" }) saves significant money. The system message and prompt template are cached, so repeated calls only pay for the image input and the unique response. With the system prompt being ~500 tokens and the analysis prompt being ~2,000 tokens, this cuts input token costs meaningfully across thousands of scans.
```ruby
def call_claude(image_base64, prompt, model: nil)
  api_client.call(
    system: [
      { type: "text", text: SYSTEM_MESSAGE, cache_control: { type: "ephemeral" } }
    ],
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt, cache_control: { type: "ephemeral" } },
          { type: "image", source: { type: "base64", media_type: @content_type, data: image_base64 } }
        ]
      }
    ]
  )
end
```
Background Jobs with Solid Queue
Solid Queue is underrated. It is a database-backed job queue that ships with Rails 8, and I run it inside the Puma process (SOLID_QUEUE_IN_PUMA=1). No Redis, no Sidekiq, no separate worker process. On Render.com this means one dyno handles everything -- web requests, background jobs, recurring tasks.
Recurring jobs handle cleanup and automation:
- Clearing finished Solid Queue jobs (hourly)
- Cleaning expired email verifications (daily)
- Purging stale AI frequency records (daily)
- Sending daily growth reports (8am UTC)
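In Solid Queue these recurring tasks live in config/recurring.yml. A sketch with cron-style schedules; the job class names here are hypothetical, not the app's actual classes:

```yaml
# config/recurring.yml (job class names are illustrative)
production:
  clear_finished_jobs:
    command: "SolidQueue::Job.clear_finished_in_batches"
    schedule: "0 * * * *"   # hourly
  clean_expired_verifications:
    class: CleanExpiredVerificationsJob
    schedule: "0 3 * * *"   # daily
  purge_stale_frequencies:
    class: PurgeStaleAiFrequenciesJob
    schedule: "0 4 * * *"   # daily
  daily_growth_report:
    class: DailyGrowthReportJob
    schedule: "0 8 * * *"   # 8am UTC
```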
Deployment
The entire app runs on Render.com with no Docker. Build step: bundle install + assets:precompile + db:prepare. That is it. No container orchestration, no Kubernetes. A single web service with Solid Queue running in-process.
A/B Testing
I built a simple, deterministic A/B testing system. Variant assignment uses SHA256 hashing of "#{visitor_id}:#{experiment_name}" so it is stateless and consistent:
```ruby
hash = Digest::SHA256.hexdigest("#{visitor_id}:#{experiment_name}").to_i(16)
bucket = hash % total_weight
```
Currently testing pricing ($1.99 vs $2.49), paywall headline copy (curiosity vs loss aversion), and paywall teaser styles (hard blur vs gradient fade). All configured in a single Ruby module, no external service needed.
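Fleshed out, the deterministic assignment looks like this. The experiment name and variant weights are illustrative, not the live config:

```ruby
require "digest"

# Stateless A/B assignment: the same visitor + experiment pair always
# hashes to the same bucket, so no assignment table is needed.
def assign_variant(visitor_id, experiment_name, variants)
  total_weight = variants.values.sum
  hash = Digest::SHA256.hexdigest("#{visitor_id}:#{experiment_name}").to_i(16)
  bucket = hash % total_weight

  # Walk the weighted variants until the bucket falls inside one.
  variants.each do |name, weight|
    return name if bucket < weight
    bucket -= weight
  end
end

weights = { "price_199" => 50, "price_249" => 50 } # illustrative experiment
assign_variant("3f2e-visitor-uuid", "pricing", weights)
```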
Results After 6,000+ Scans
Some things I learned:
- Progressive rendering matters. The switch from "wait for everything" to "show core results immediately" reduced bounce rate during loading noticeably.
- Freemium works for impulse products. Showing the score for free and locking the details behind a paywall creates natural curiosity. People want to know their celebrity match.
- AI output diversity requires active management. Without the frequency tracking system, Claude would give 20% of users the same celebrity match.
- No-auth simplicity pays off. The upload-to-results flow takes 30 seconds. Adding a signup step would have killed conversions.
- Rails 8 with Hotwire is genuinely good for this. Real-time updates via Turbo Streams, no JavaScript framework, no build step. The entire JS footprint is Stimulus controllers.
- Prompt caching saves real money. At scale, caching the system message and prompt template with Claude API reduces costs significantly per scan.
What I Would Do Differently
- Start with progressive rendering from day one instead of retrofitting it.
- Build the AI output frequency tracking earlier -- the repetition problem is not obvious until you hit a few hundred scans.
- Use a more structured error recovery system for the details call. Right now, failed detail calls get retried once and then flagged. A circuit breaker pattern would be cleaner.
If you want to try it: firstvibe.app. The basic vibe check is free. Happy to answer questions about any part of the architecture.