
How we moved from Ruby to Go, decreased our costs by 1400%, and improved response times by 500%

By Universe MrBlood ・ 6 min read

As of today (Aug 2019), it's been more than 5 years since I wrote the first lines of Insider's Mobile suite.


I was a fresh graduate, and I had neither the knowledge nor the experience to build big things. I was an iOS engineer, building applications that used parse.com (PaaS), and that was it. Without any knowledge of building a web application, my first challenge was picking the right platform that would stick with me through years of sleepless nights of developing new features and debugging problems (basically, coding :> ).

Criteria:

  1. I needed to develop with high velocity, so the learning curve of the language had to be low.
  2. The community. IMHO, the community is a crucial factor that often gets overlooked: there should be lots of content, docs, and people to help with your problems.
  3. High availability of third-party packages: a crucial item if you are building an MVP that needs to go live, fast.
  4. A modern language that would get us a better talent pool of high-caliber engineers (I'm not going to get into the whole 10x engineer fight here) who stay at the edge of technology.

With these criteria in mind, I decided to go with Ruby and the Rails framework.

RoR was perfect for creating new features: during the bootstrap phase, we needed to build almost a new feature or product every week.

We were using Capistrano to deploy, Unicorn as the HTTP server, and Devise for authentication.

For years, it served its purpose. But then…

PROBLEM:

We had lots of publisher clients, which meant a flash news story would create an enormous traffic spike on our end. Our per-minute request counts could escalate from roughly 30k to 400k within 10 seconds.

This was an entirely different problem to solve.

  • Ruby is not designed to handle applications that need high concurrency. The Rails model dictates that every request occupies a worker process; in our setup, those processes were Unicorn workers.
  • It was taking around 3 minutes for the auto-scaling group to spin up a new worker instance, and by the time the operation completed, the system load had already returned to normal. It was a waste of both time and money.
  • Ruby is not designed to be an asynchronous language. Sure, there were lots of background-worker libraries (Rails 5's native worker support was not even released back then).
  • RoR was a complete memory hog. We had to write custom bash scripts that checked the workers and killed them (and later used https://github.com/kzk/unicorn-worker-killer for literally the same purpose).

Short-term fixes that failed:

We applied lots of patches to handle this problem:

  • Pooling requests coming from the SDK to minimize the number of requests that hit the load balancer, which introduced a data delay. Not good.
  • Scaling up the auto-scaling group BEFORE pushes were delivered to end users, RIGHT after a campaign was created. This produced lots of false positives (which meant burning money), and a new worker node still took too long to become available for this approach to work.

At that point, we knew that we had to leave RoR behind.

We had to find the right tool for this job, where we needed a new language that is async by design.

Solution:

After weeks of research and quick benchmark tests, we decided to rewrite our APIs in Go.

This marked the start of a new era for both our APIs and our wallets.

Over the months, the RoR codebase had become filled with monkey patches. A rewrite would mean a clean slate, free of the patches that were making the codebase almost unreadable and unmaintainable.

The first iteration of the API was not testable and didn't include any unit tests, so the rewrite was a great chance to adopt test-driven development.

Speed: HOLY COW. Even for the most basic operations, the performance difference is insane, which was exhilarating for us and would enable us to create state-of-the-art applications.

A short example

Ruby

require 'benchmark'

numbers = []
size = 100_000_000

Benchmark.bm do |bm|
  bm.report('filling array') do
    for i in 0...size
      numbers.push(i)
    end
    puts " array size: #{numbers.size}"
  end

  bm.report('iterating array') do
    counter = 0
    numbers.each do |number|
      counter += number
    end
    puts " sum of numbers in array: #{counter}"
  end
end

Go

package main

import (
	"fmt"
	"time"
)

func main() {
	size := 100000000
	numbers := make([]int, size)

	fillingArrayStarter := time.Now()
	for i := 0; i < size; i++ {
		numbers[i] = i
	}
	fmt.Printf("array size: %d took: %s", len(numbers), time.Since(fillingArrayStarter))

	iterationStarter := time.Now()
	var counter int
	for _, number := range numbers {
		counter += number
	}
	fmt.Printf("\nsum of numbers in array: %d took: %s", counter, time.Since(iterationStarter))
}

Results


  1. Initializing an array and filling it with 100 million incremental numbers

     Ruby         Go
     8.426792 s   382.182509 ms

Go is 22x faster

  2. Iterating the filled array and summing all the numbers

     Ruby         Go
     4.686981 s   91.622391 ms

Go is 51x faster

Rewrite Process

We chose our first API to rewrite and started the process.

First impressions

  1. RoR has a built-in ORM called ActiveRecord that comes with schema migrations, and these migrations were absolutely amazing for dynamically adding new columns and maintaining multiple dev environments without breaking a sweat. Example:

db:migrate     # runs (single) migrations that have not run yet
db:create      # creates the database
db:drop        # deletes the database
db:schema:load # creates tables and columns within the (existing) database following schema.rb
db:setup       # does db:create, db:schema:load, db:seed
db:reset       # does db:drop, db:setup

For Go, we needed to maintain our own migration structure with custom scripts, which was hard to adopt at the beginning.

  2. Go is DEFINITELY not magical, which makes it
    • easy to read
    • easy to maintain

Let me elaborate;

[Screenshot: the methods available on a Ruby array by default]

In this screenshot, a simple initialized array of integers has dozens of methods by default!

In the Go world, the only magic is how fast it is. You even have to implement a unique function yourself.

### Example: a function to get the unique elements of a slice

func uniq(intSlice []int) []int {
    keys := make(map[int]bool)
    list := []int{}
    for _, entry := range intSlice {
        if _, seen := keys[entry]; !seen {
            keys[entry] = true
            list = append(list, entry)
        }
    }
    return list
}

The bright side of this lack of magic: you will never, ever be surprised by your code. Everything is in your hands, and your hands alone.

The downside: the road of transition from Ruby to Go means writing lots of utility functions.

  3. Deployment is unbelievably easy in Go. You just run go build, and that's it. You get a single executable file that contains all of your source and packages, compiled. And it has a cross compiler by default as well:
GOOS=linux GOARCH=amd64 go build
GOOS=windows GOARCH=amd64 go build
  • As I stated in the problems with Rails, for a new Docker app to become available, Unicorn workers needed to be initialized, which took around 3 minutes. In the Go world, the application was ready to handle traffic in a couple of seconds, which was crucial when we needed to scale up.

Endgame

The results were better than we anticipated.

  • Before GO
    we had (on average) 14 c5.xlarge (8 CPU / 16 GB mem) instances, and on each major push campaign (where the request count increased by 300% in seconds) we had huge outages and latency issues across all of our APIs.

  • After Go

    • 14 c5.xlarge instances decreased to 2 c5.large
    • there was no longer any need to scale up, since average execution times also decreased dramatically
    • the system was written with unit tests, which enabled us to adopt CI and decreased our bug ratio per sprint by around 5% per quarter

    • It was SUPER FUN.

Credit:

I'd also like to thank my very good friend and coworker Cem Sancak for introducing us to the amazing world of Go and being one of the pioneers of this transition.

Discussion


Just so I am following... you converted from a dynamic language and opinionated framework designed for rapid feature execution and programmer happiness to a typed, compiled language that is 22-50x faster but considers uniq too frivolous to include in its standard lib.

Look, I'm happy for you and your team that you identified your bottlenecks and did the right thing to address them, but why did you have to write yet another clickbait "move from Ruby, save 5x on hosting!" blog post? Even though you take great pains to repeatedly say some nice things about the language and framework that got you this far (for free), you make it sound like Ruby was a problem that got solved with Go. It's nowhere near this simple, because designing your application and its API in Go from scratch would have been miserable for all of the reasons we champion in The Rails Doctrine.

Both Ruby and Rails call on libraries written in other languages because they are clearly more performant than Ruby ever will be. JSON parsing and sqlite3 come to mind immediately. How is what you're doing any different, except for the fact that you're recasting this architectural evolution as a "move" instead of optimizing the parts of your infrastructure that need to be compiled in something faster? Go is a great language, but you could have used Java, C++ or Rust, and achieved similar results. And you should!

But don't come on here and give the "Rails is dead" morons another piece of kindling. Rails is kicking ass right now.