DEV Community

dev.to staff for The DEV Team

Posted on

How Have You Refactored or Optimized Code for Improved Performance?

Code optimization and refactoring are crucial for enhancing the efficiency and speed of software. Share your experience of a specific instance where you had to tackle performance issues in code. What steps did you follow to improve it? We're interested to know the outcome of your efforts and any lessons learned along the way.

Follow the DEVteam for more discussions and online camaraderie!

Image by pch.vector on Freepik

Top comments (17)

Shai Almog

I have 3 performance tips:

  • Caching
  • Caching
  • Caching

It's almost always one form or another of caching (assuming it isn't a bug). One of the earliest examples I did of this was in the 80s, when I pre-calculated the results of trig functions and stored them in an array. I could then perform the trig calculations needed for my game at the cost of an array lookup.
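A minimal sketch of the lookup-table idea, in Python. The table size (one entry per whole degree) and the names `SIN_TABLE`, `fast_sin` and `fast_cos` are assumptions for illustration, not the original game code:

```python
import math

TABLE_SIZE = 360  # assumption: one entry per whole degree

# Pay the cost of math.sin once, up front.
SIN_TABLE = [math.sin(math.radians(d)) for d in range(TABLE_SIZE)]


def fast_sin(degrees: int) -> float:
    """Look up sin for a whole number of degrees, wrapping around at 360."""
    return SIN_TABLE[degrees % TABLE_SIZE]


def fast_cos(degrees: int) -> float:
    """cos(x) equals sin(x + 90 degrees), so the same table serves both."""
    return SIN_TABLE[(degrees + 90) % TABLE_SIZE]


if __name__ == "__main__":
    print(fast_sin(30), fast_cos(60))
```

The trade-off is precision: angles are rounded to the table's resolution, which is usually acceptable for game graphics.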

Jon Randy 🎖️ • Edited

I totally relate to this. In the past I've made stuff like my own 3D engine, demoscene effects etc. in super low powered environments. The speed boost from using pre-calculated trig tables (and using plain integer arithmetic wherever possible) was huge!

Shai Almog

That's great!

I remember doing that exact optimization in Turbo Pascal in the early '90s when writing Wolfenstein 3D clones ;-)

Jon Randy 🎖️

Ah... back in the days when developers actually cared about performance 😉

Thomas Werner

About 10 years ago, on earlier Android devices, I worked on image processing apps and a couple of games. Floating point calculations were super slow on Android devices back then, specifically for image processing and video game math. So step one was to convert those floating point calculations to integer calculations, which already gave a pretty decent performance boost. Step two was to rewrite those routines in C and call them using the Java Native Interface.
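Step one can be sketched with 16.16 fixed-point arithmetic. This is an illustration in Python, not the original Java or C code; the format (16 fractional bits) and all function names are assumptions. Python integers will not show the speed gain, so the sketch only shows the arithmetic:

```python
FRACTION_BITS = 16          # assumption: 16.16 fixed-point format
ONE = 1 << FRACTION_BITS    # the value 1.0 in fixed-point form


def to_fixed(value: float) -> int:
    """Convert a float to its fixed-point integer representation."""
    return int(round(value * ONE))


def to_float(fixed: int) -> float:
    """Convert a fixed-point integer back to a float (for display only)."""
    return fixed / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point numbers using only integer operations."""
    return (a * b) >> FRACTION_BITS


def scale_brightness(pixels: list[int], factor_fixed: int) -> list[int]:
    """Scale 8-bit pixel values by a fixed-point factor, clamping at 255."""
    return [min(255, (p * factor_fixed) >> FRACTION_BITS) for p in pixels]


if __name__ == "__main__":
    print(scale_brightness([100, 200], to_fixed(1.5)))
```

The conversion from float happens once, outside the hot loop; inside the loop there is only an integer multiply, a shift and a clamp.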

The next problem was garbage collector activity, which would stall video game animations in particular and hurt game performance. So the optimization trick was to recycle all objects, arrays etc., so nothing would ever be garbage-collected while the game was running. If the game had entities such as enemies, projectiles etc., I would use so-called "pools" for each entity type, retrieve objects from them when needed, and put them back again when done.
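A rough sketch of such a pool, in Python for brevity. The `Pool` and `Projectile` classes are hypothetical; the fallback that allocates when the pool runs dry is a design choice of this sketch, and a real game would size the pool so that it never triggers:

```python
class Projectile:
    """A hypothetical game entity that gets reused instead of reallocated."""

    def __init__(self):
        self.x = 0.0
        self.y = 0.0
        self.active = False


class Pool:
    """Pre-allocates objects up front and hands them out on demand."""

    def __init__(self, factory, size):
        self._factory = factory
        self._free = [factory() for _ in range(size)]
        self.created = size  # total objects ever allocated

    def acquire(self):
        if self._free:
            return self._free.pop()
        # Fallback: the pool was too small, so allocate one more.
        self.created += 1
        return self._factory()

    def release(self, obj):
        # Put the object back so the next acquire() can reuse it.
        self._free.append(obj)


if __name__ == "__main__":
    pool = Pool(Projectile, size=2)
    shot = pool.acquire()
    shot.active = True
    shot.active = False
    pool.release(shot)
    print(pool.created)
```

Callers are responsible for resetting an object's fields before or after reuse, since the pool hands back whatever state the object was left in.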

Sijmen J. Mulder

Pools are great for this sort of thing! I hear that games built with MonoGame use similar approaches, never allocating in the game loop. In C I just use static arrays for this kind of thing, which saves a bunch of work on memory management too. In C++ you can use custom allocators that'll automate this behaviour for you.

Tobias Nickel

A while back, working on a SaaS platform, I implemented something I named at the time "request caching": caching requests instead of the results. Much later I found out this practice is called 'query batching', and it worked just like the DataLoader library from Facebook, except that my solution worked with higher-order functions and could integrate seamlessly with transactions and other contexts like pagination.

By adding this to the project's own ORM, the entire app got a performance boost.

By the way, this was at a time when Node apps were made with callbacks, not even promises.

Also, tcacher still has some advantages over DataLoader, but it could be brought up to date for the age of ES modules.
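To illustrate the general idea of query batching, here is a minimal sketch in Python with `asyncio`. It is not tcacher's or DataLoader's actual API; `BatchLoader`, `fetch_users` and `demo` are hypothetical names. Requests made during the same event-loop tick are collected and resolved by a single batch call:

```python
import asyncio


class BatchLoader:
    """Collects keys requested in one event-loop tick into a single batch."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn   # takes a list of keys, returns {key: value}
        self._pending = {}          # key -> Future waiting for that key
        self._scheduled = False

    def load(self, key):
        loop = asyncio.get_running_loop()
        future = self._pending.get(key)
        if future is None:
            future = loop.create_future()
            self._pending[key] = future
            if not self._scheduled:
                self._scheduled = True
                loop.call_soon(self._dispatch)
        return future

    def _dispatch(self):
        pending, self._pending = self._pending, {}
        self._scheduled = False
        try:
            results = self._batch_fn(list(pending))
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, future in pending.items():
            future.set_result(results[key])


async def demo():
    batches = []

    def fetch_users(ids):
        # Stand-in for one "SELECT ... WHERE id IN (...)" query.
        batches.append(list(ids))
        return {i: f"user-{i}" for i in ids}

    loader = BatchLoader(fetch_users)
    names = await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))
    return names, batches


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo, three `load` calls (two for the same key) end up as one batch call with two distinct keys.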

Bernd Wechner

Ironically, I've not done much performance tuning/refactoring in the 30+ years I've been developing or maintaining software (except perhaps for caching, as Shai Almog says ... caching!).

But my most memorable and lauded work has often been quite the opposite.

That is, actively slowing things down or at least discarding performance as a criterion in order to pursue competing goals (generally, maintainability, the cost or even facility of maintenance).

I'm not the only person to have landed on a project that was a black box because no-one, since implementation, has wanted to touch it. Anyone who's looked at it saw a house of cards, a mysterium of complexity, and fled. The risks of making changes and the costs of a complete replacement were both judged too high ... Just leave it be; if it ain't broke, don't fix it.

But then a rewrite is budgeted, mainly because new hardware is bought with new peripherals, firmware and OS etc... And so this black box needs porting which equates to a rewrite.

A deep analysis of the thing to be rewritten begins the job: teasing it apart and building internal documentation of the old system, an internal spec, all the things lost to time in this legacy system. Then comes the rewrite, but often the main goal this time is not to land here again: to have software that can be maintained and that endures staff turnover. With that goal eclipsing performance, and with a new generation of hardware providing enormous performance gains, a lot of complicated and difficult-to-describe optimisations in the old software on the old hardware are tested against a simpler implementation on the new hardware, scrutinising performance, accuracy and precision (I have always worked in the engineering and science realms). If the new is not significantly slower than the old or, on the new hardware, is still faster, then with its clarity of code, internal documentation and interface specifications, it is the winner.

The result: performance optimisations removed in favour of simpler code that can be maintained into the future and evolve incrementally unlike the black box that just burst from its bubble.

Jeremy French

I once managed a 1000x speed-up on some code. The language had two similar types: lists and arrays.

Lists had some features not available to arrays but were basically wrappers around strings, so any operation on them, such as sorting, involved a lot of string manipulation.

One feature was taking over a minute to run, and it was due to this list processing. Everything inside it could be done with arrays, so I tried changing it, converting to and from lists on either side of the system. I was hoping it would run in less than thirty seconds, but it came in at fractions of a second.

I guess the moral is: know your data types and how they are handled internally.

Ralph Hightower

I was part of a development team that was developing Microsoft ISAPI extensions in C++ for a web application.

  • Move invariant code out of loops. There was a utility function that returned the holidays for a year from a database. It was being called repeatedly while processing rows from a different database query.
  • Minimize database repetitions. There was an internally developed base class that retrieved rows from SELECTs on tables. The base class first performed the query and then, while processing the rows, performed another SELECT for each row that was returned. The base class already had the data, so why SELECT the row again? I rewrote the base SELECT rows class and modified the classes that used it.
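
The first point, hoisting an invariant lookup out of a loop, might look like this. It is sketched in Python rather than the original C++; `HolidayDb` is a hypothetical stand-in that just counts how often the "database" is hit:

```python
from datetime import date


class HolidayDb:
    """Stand-in for the database-backed holiday utility; counts queries."""

    def __init__(self):
        self.queries = 0

    def holidays_for(self, year):
        self.queries += 1
        return {date(year, 1, 1), date(year, 12, 25)}


def count_holiday_rows_slow(db, year, row_dates):
    """Queries the holidays again for every row, like the original code."""
    count = 0
    for row_date in row_dates:
        if row_date in db.holidays_for(year):
            count += 1
    return count


def count_holiday_rows_hoisted(db, year, row_dates):
    """The holiday set does not change per row, so fetch it once."""
    holidays = db.holidays_for(year)
    return sum(1 for row_date in row_dates if row_date in holidays)


if __name__ == "__main__":
    rows = [date(2024, 1, 1), date(2024, 3, 5), date(2024, 12, 25)]
    db = HolidayDb()
    print(count_holiday_rows_hoisted(db, 2024, rows), db.queries)
```

Both functions return the same count; only the number of database round trips differs.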
 
Sijmen J. Mulder

As for actual examples, there are two that come to mind:

  • To temporarily replace Google reverse geolocation ("what city is this coordinate?") we exported data from OpenStreetMap to a file as binary records (e.g. 20 bytes for the name, two 8-byte floats for the coordinates, etc.) and memory-mapped it to C# structs. Then we used a simple LINQ MinBy query checking squared distances. The JIT generated really tight vectorized code for this and it was super fast.
  • Did a POC for a pension fund rewriting some of the policy projection calculations in OpenCL to run on a graphics card. The original C# code wasn't bad but GPU performance was just another level!
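
The fixed-size record approach from the first example can be sketched in Python with the `struct` module. This is not the original C# code: the record layout (`<20sdd`), the sample cities and the function names are assumptions, and an in-memory `bytes` object stands in for the memory-mapped file:

```python
import struct

# Assumed layout: 20-byte name, then latitude and longitude as 8-byte floats.
RECORD = struct.Struct("<20sdd")


def pack_city(name, lat, lon):
    """Encode one city as a fixed-size binary record (name is null-padded)."""
    return RECORD.pack(name.encode("utf-8"), lat, lon)


def nearest_city(buffer, lat, lon):
    """Scan all records and return the name with the smallest squared distance."""
    best = min(
        RECORD.iter_unpack(buffer),
        key=lambda rec: (rec[1] - lat) ** 2 + (rec[2] - lon) ** 2,
    )
    return best[0].rstrip(b"\0").decode("utf-8")


# Hypothetical sample data standing in for the OpenStreetMap export.
CITIES = b"".join([
    pack_city("Amsterdam", 52.37, 4.90),
    pack_city("Rotterdam", 51.92, 4.48),
    pack_city("Utrecht", 52.09, 5.12),
])

if __name__ == "__main__":
    print(nearest_city(CITIES, 52.0, 4.4))
```

Because every record has the same size, the scan needs no parsing beyond reading fields at fixed offsets, which is what makes the layout friendly to tight loops.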
 
Ralph Hightower

Yes.

  • Loop optimization: moving invariant code out of loops. One instance was retrieving holidays from a database table.
  • Eliminate retrieving data twice. There were internally developed base classes for retrieving a single row and for retrieving multiple rows from database tables. The class that retrieved multiple rows first fetched the rows from the table and then, using the rows returned, retrieved each row individually again to create the list. Why do that all over again? You've already got the data.
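
The double retrieval described in the second point can be sketched like this, in Python rather than the original C++. `FakeTable` is a hypothetical stand-in that counts queries:

```python
class FakeTable:
    """Stand-in for a database table; counts how many queries are issued."""

    def __init__(self, rows):
        self._rows = {row["id"]: row for row in rows}
        self.queries = 0

    def select_all(self):
        self.queries += 1
        return [dict(row) for row in self._rows.values()]

    def select_by_id(self, row_id):
        self.queries += 1
        return dict(self._rows[row_id])


def load_rows_twice(table):
    """Fetches every row, then fetches each row again by id (1 + N queries)."""
    return [table.select_by_id(row["id"]) for row in table.select_all()]


def load_rows_once(table):
    """The first query already returned the data, so just use it."""
    return table.select_all()


if __name__ == "__main__":
    table = FakeTable([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
    print(load_rows_once(table), table.queries)
```

With N rows, the first version issues N + 1 queries and the second issues one, for the same result.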
 
grant horwood • Edited

for many years our company did a lot of fix/rescue/up-feature work on other people's projects (usually apis). the first four places i always looked to address performance issues were:

  1. database design. the persistence layer is almost always the most time-intensive component, and most of the time adding a few well-chosen indexes on columns made huge improvements.

  2. heavy loops. lots of devs will put some heavy call, like a query, in a loop. on test data with five or whatever records, it works great, but when live data grows to a thousand records, it becomes an issue. migrate looped selects to joins or a similar set-based strategy. investigate memoization.

  3. caching. there are a lot of caching strategies. generally, i like to start by throwing everything behind cloudfront and go from there. if you have heavy components that are generally static or long-lived data, a good caching strategy can be a massive win.

  4. worker queues. lots of tasks can just be deferred and a worker queue can make a huge improvement. keep an eye on the queues, though, since they're not free. i once had a fix project where the queue was so full that password recovery mails were taking two hours to get sent.
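
As a small illustration of the first item, the sketch below uses Python's built-in `sqlite3` module to show how adding an index changes the query plan. The `orders` table, its columns and the index name are made up for the example, and the exact wording of the plan text varies between SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def query_plan(connection):
    """Return SQLite's description of how it would execute QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index, the plan is a scan of the whole table.
plan_before = query_plan(conn)

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index, the plan searches the index instead.
plan_after = query_plan(conn)

if __name__ == "__main__":
    print(plan_before)
    print(plan_after)
```

Most databases offer an equivalent of `EXPLAIN`, which makes it possible to confirm that an index is actually used rather than assuming it.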

of course, before starting any optimization, it's necessary to figure out where the pain points actually are. spend some time profiling; get actual data on performance by using a profiling tool or home-rolling your own. i wrote an api logging and tracking tool for laravel for our company to do precisely this and it has been very valuable.
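
A minimal example of getting that kind of data, here with Python's standard `cProfile` and `pstats` modules (the laravel tool mentioned above is not shown). `slow_sum` and `handler` are hypothetical functions standing in for real application code:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately plain loop so it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handler():
    """Stand-in for a request handler that calls the slow function."""
    return [slow_sum(20_000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

# Collect the report as text, sorted by cumulative time per function.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()

if __name__ == "__main__":
    print(report)
```

The report lists call counts and time per function, which points at where the time actually goes before any code is changed.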

Siddharth

This is oddly specific but for some reason Array.from is faster than a spread operator on strings in JavaScript.

Vishwas Singh

When I joined my most recent company, I was given the task of improving the performance of a socket server (Socket.IO + Node.js).
At that time we could only handle 2K concurrent users; beyond that, our EC2 instance would go down due to high CPU usage.

We were making an API call whenever a user connected to the socket server. Making lots of parallel API calls resulted in high CPU usage, as establishing a connection for each HTTP call (the TCP three-way handshake) is CPU intensive.

So instead of making API calls, we decided to run the DB queries directly on the socket server, eliminating the REST call.

Then we started queuing requests in a local in-memory queue and processing multiple users' requests in a single batched DB query (one DB query per 500 requests).
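
The queuing step might be sketched as follows. This is Python rather than the original Node.js code, and `BatchingQueue` and `run_batch_query` are hypothetical names; a production version would presumably also flush on a timer so that a partly filled batch does not wait forever:

```python
class BatchingQueue:
    """Buffers requests in memory and runs one batch query per full batch."""

    def __init__(self, run_batch_query, batch_size=500):
        self._run_batch_query = run_batch_query  # called with a list of requests
        self._batch_size = batch_size
        self._pending = []

    def submit(self, request):
        self._pending.append(request)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered as a single batch, if anything."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._run_batch_query(batch)


if __name__ == "__main__":
    sizes = []
    queue = BatchingQueue(lambda batch: sizes.append(len(batch)), batch_size=500)
    for user_id in range(1200):
        queue.submit(user_id)
    queue.flush()
    print(sizes)
```

Here 1200 requests result in three batch queries instead of 1200 individual ones.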

Then we used Redis to horizontally scale these socket servers.

Now we are handling 35K concurrent users and have tested our service to handle 100K concurrent connections per EC2 instance (1 CPU, 2 GB RAM).

Sijmen J. Mulder • Edited
  • Use a profiler
  • Do less work
    • Simplify and cut out the cruft. Avoid unnecessary abstractions or steps.
    • Don't repeat work (e.g. duplicate database queries)
    • Don't ask more than needed (DB queries are again a good example)
    • Preprocess what you can, e.g. take work out of the hot loop and do it once in advance, create lookup tables, etc.
  • Make the work faster
    • Lay out your data in a way that's convenient for the processing (data-oriented programming). Array in, loop, array out tends to play well with how CPUs and memory work.
    • Avoid indirection. That's pointers, virtual methods, string matching, etc.
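
The "array in, loop, array out" shape can be sketched like this. Python's interpreter overhead hides most of the CPU-level benefit, so treat it purely as an illustration of the data layout; the names and the particle example are assumptions:

```python
from array import array


def step_positions(xs, vxs, dt):
    """Array in, loop, array out: advance every position by velocity * dt."""
    return array("d", (x + vx * dt for x, vx in zip(xs, vxs)))


# One contiguous array per field instead of a list of particle objects,
# so the loop walks flat memory rather than chasing a reference per particle.
positions = array("d", [0.0, 1.0, 2.0])
velocities = array("d", [1.0, 0.5, -2.0])

if __name__ == "__main__":
    print(list(step_positions(positions, velocities, 0.5)))
```

In languages like C or C#, the same layout lets the compiler or JIT keep the loop tight and often vectorize it.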
 
Jon Randy 🎖️

Nice read on JS performance optimisation here.