loading...

3 Refactoring Techniques

ijlee2 profile image Isaac Lee Originally published at crunchingnumbers.live ・10 min read

This mess was yours. Now your mess is mine.

- Vance Joy


Hacktoberfest is coming up. If you're new to open source contribution and unsure how to help, may I suggest refactoring code? You can provide a fresh perspective to unclear code and discover ways to leave it better than you found.

There are 3 refactoring techniques that I often practice:

  • Rename things
  • Remove nests
  • Extract functions

Knowing how to apply just these 3 can get you far. I'll explain what they mean and how I used them (or should have used them) in projects.

1. Rename Things

The goal of this technique is to help people communicate through code.

I remember this story from my college professor. She once had inherited code in which variables had been named after baseball players. Why? I can only imagine spite from an unhappy programmer.

If you didn't understand right away what a variable, conditional, or function does, then there's a chance that someone else won't either. Once you do understand what it does and how it interplays with other code, please give it a better name.

a. Variables

A variable name, done right, explains purpose. In general, you will want to prefer fully-spelled words over truncated ones. This removes ambiguity and allows guessing when searching code by text.

Here's a change that I made to whats-new-in-emberland, an app that helps The Ember Times newsletter find who contributed to Ember repos.

// Before
let conListUniq;

// After
let contributorsList;

Had I not mentioned to you that the app finds contributors, I think you would've had a hard time deciphering what "conList" in conListUniq means.

A variable name can also explain type. For example, you can begin the name with is or can to indicate a boolean and pluralize the name to denote an array. Plural name comes in handy when you iterate over the array. You can use the singular noun for the array element.

export function filterMerged(pullRequests) {
  return pullRequests.filter(pullRequest => {
    const { isMadeByUser, isMergedThisWeek } = pullRequest;

    return isMadeByUser && isMergedThisWeek;
  });
}

Notice the naming conventions in the variables pullRequests, pullRequest, isMadeByUser, and isMergedThisWeek.

b. Conditionals

A conditional statement, since it is made up of dynamic values and language-specific syntax, can be tough to digest at once. This is more true for a compound conditional—two or more statements joined by the && or || operator.

To maintain conditionals, try creating temporary variables with a clear name. In general, each statement in a compound conditional should get its own variable. When you read the code aloud, it will sound almost natural.

Some time ago, I added a feature to ember-component-template-colocation-migrator. It runs in command line so I needed to support a couple of flags.

const { argv } = require('yargs');

// Allow passing the flag, -fs (flat) or -ns (nested), to specify component structure
const changeToFlatStructure = argv.f && argv.s;
const changeToNestedStructure = argv.n && argv.s;

let structure = 'flat';

if (changeToFlatStructure) {
  structure = 'flat';

} else if (changeToNestedStructure) {
  structure = 'nested';

}

Creating temporary variables has the benefit of abstraction. If we later decide to use a different library to handle flags, only lines 1-5 would change. The branching logic for structure can remain the same.

c. Functions

In Section 3, we will look at how functions play a critical role in refactoring.

As for naming, I encourage you to start a function's name with a verb. I like to use get or find to indicate a function that retrieves data, and set or update to denote one that changes data.

@action async getContributors() {
  const fetchRequests = this.mergedPRs.map(pullRequest => pullRequest.user);

  let users = await all(fetchRequests);
  users = this.identifyUsers(users);
  users = this.sortUsers(users);

  this.updateContributorsList(users);
}

Notice the use of verbs in getContributors, identifyUsers, sortUsers, and updateContributorsList. Although you don't see their implementation, you may be able to guess what each is supposed to do.

2. Remove Nests

Removing nests is about flattening code structure. By removing indentations that are unnecessary, the ones that remain can clearly show groups of related code.

Since code indentation is a bit of styling choice, you might wonder why removing nests matters. We'll look at nested conditionals and promises to see their drawbacks.

a. Nested Conditionals

Over time, a nested if statement can turn into mess. Business logic changes constantly. Pressured by time, we may add exceptions to allow new logic rather than refactor code in order to find a holistic solution.

The best fictitious example comes from Sandi Metz's 2014 RailsConf talk, All the Little Things. Sandi talks of the Gilded Rose problem. Given this code,

def tick
  if @name != 'Aged Brie' && @name != 'Backstage passes to a TAFKAL80ETC concert'
    if @quality > 0
      if @name != 'Sulfuras, Hand of Ragnaros'
        @quality -= 1
      end
    end
  else
    if @quality < 50
      @quality += 1
      if @name == 'Backstage passes to a TAFKAL80ETC concert'
        if @days_remaining < 11
          if @quality < 50
            @quality += 1
          end
        end
        if @days_remaining < 6
          if @quality < 50
            @quality += 1
          end
        end
      end
    end
  end
  if @name != 'Sulfuras, Hand of Ragnaros'
    @days_remaining -= 1
  end
  if @days_remaining < 0
    if @name != 'Aged Brie'
      if @name != 'Backstage passes to a TAFKAL80ETC concert'
        if @quality > 0
          if @name != 'Sulfuras, Hand of Ragnaros'
            @quality -= 1
          end
        end
      else
        @quality = @quality - @quality
      end
    else
      if @quality < 50
        @quality += 1
      end
    end
  end
end

you are to update tick to handle just 1 more feature and ensure that all tests continue to pass. Where do you even begin?

The key to refactoring tick is to make early exits, also called guard clauses. If you see code that can happen only when a condition is true, you leave immediately where you are (using return, break, or continue) if it evaluates to false.

// Before
function myExample() {
  if (condition) {
    /* Complex code omitted */
  }
}

// After
function myExample() {
  if (!condition) {
    return;
  }

  /* Complex code omitted */
}

Notice that we removed 1 indentation level from the complex code. Imagine you are able to make a few early exits. With each non-exit, the complex code can become simpler and allow other refactors. Moreover, by reading the series of if statements from top to bottom, you know exactly when the next code runs.

Through a series of small refactors, Sandi arrives at the following code. I bet that you can more easily understand and change this code.

def tick
  case name
  when 'normal'
    normal_tick
  when 'Aged Brie'
    brie_tick
  when 'Sulfuras, Hand of Ragnaros'
    sulfuras_tick
  when 'Backstage passes to a TAFKAL80ETC concert'
    backstage_tick
  end
end


def normal_tick
  @days_remaining -= 1
  return if @quality == 0

  @quality -= 1
  @quality -= 1 if @days_remaining <= 0
end


def brie_tick
  @days_remaining -= 1
  return if @quality >= 50

  @quality += 1
  @quality += 1 if @days_remaining <= 0
end


def sulfuras_tick
end


def backstage_tick
  @days_remaining -= 1
  return              if @quality >= 50
  return @quality = 0 if @days_remaining < 0

  @quality += 1
  @quality += 1 if @days_remaining < 10
  @quality += 1 if @days_remaining < 5
end

Let me defer my real-life example to Section 3c. I will show how to update the flags code from ember-component-template-colocation-migrator.

b. Nested Promises

Previously, we saw that a nested if statement can be hard to reason and modify. By the same token, we want to avoid nested promises.

Here's a server code that I had written in my nascent days:

router.get('/my-example', function(req, res) {
  Writer.findOne(/* Query options omitted */)
    .then(user => {
      // Get the user's profile
      const profile = user.dataValues;

      // Serialize the user's stories
      const stories = user.Stories.map(story => ({
        id: story.id,
        title: story.title,
        writer: {
          id: profile.id,
          fullName: profile.fullName
        },
        photos: story.Photos.map(photo => ({
          id: photo.id,
          url: photo.url,
          caption: photo.caption
        }))
      }));

      // Serialize the user's readers
      const readers = user.Readers.map(reader => ({
        readerId: reader.reader_id
      }));

      // Serialize the user's writers
      Reader.findAll(/* Query options omitted */)
        .then(results => {
          const writers = results.map(writer => ({
            writerId: writer.writer_id
          }));

          // Send the user's profile, stories, readers, and writers
          res.json({
            profile,
            stories,
            readers,
            writers
          });
        });
    });
});

The end goal is remarkably simple: Send to a client what we know about the user (lines 35-40). So why does the code feel like mess?

One reason is nested promises. (There's another and we will address it in Section 3a.) With so many indentations, it's difficult to see where code begins and ends, and which variables cross over from one promise to another. In addition, the code assumes no failure points.

Prior to the wide adoption of async and await, we might have used a promise chain to refactor this code. A promise chain isn't without problems, however.

Using async and await, we can rewrite the code as follows:

router.get('/my-example', async function(req, res) {
  try {
    const user = await Writer.findOne(/* Query options omitted */);

    if (!user) {
      throw new Error('Could not find user.');
    }

    // Get user's profile
    const profile = user.dataValues;

    // Serialize user's stories
    const stories = user.Stories.map(story => ({
      id: story.id,
      title: story.title,
      writer: {
        id: profile.id,
        fullName: profile.fullName
      },
      photos: story.Photos.map(photo => ({
        id: photo.id,
        url: photo.url,
        caption: photo.caption
      }))
    }));

    // Serialize user's readers
    const readers = user.Readers.map(reader => ({
      readerId: reader.reader_id
    }));

    // Serialize user's writers
    const results = await Reader.findAll(/* Query options omitted */);
    const writers = results.map(writer => ({
      writerId: writer.writer_id
    }));

    // Send the user's profile, stories, readers, and writers
    res.json({
      profile,
      stories,
      readers,
      writers
    });

  } catch (error) {
    /* Error handling omitted */

  }
});

Notice that profile, stories, readers, and writers are now defined at the same indentation level. This helps us trace the ending of the story that the code tells. In the end, we send data to the client, but where do they come from? Let's scroll up.

3. Extract Functions

Now for the grand finale. At times, you may encounter a function that does 1 thing (this is good) but has many lines of code (likely bad). In fact, you saw one in Section 2b.

The function contains a few key steps that run in sequence. Your goals are to:

  • Identify the key steps
  • Create a function for each step
  • Assign each function a descriptive name

This process of breaking a large function into smaller ones is called extraction. Some of the smaller functions, especially if they don't require talking to an external system (e.g. make an API request or search an index), can now be unit-tested.

If I were to pick the most useful refactoring technique, extraction would be it.

a. Example 1

In Section 2b, we managed to remove nested promises. Let's refactor the code further by extracting functions.

How do you identify the key steps? A good indicator is a comment that was left to describe what the code does. You may even name the function based on the comment.

If I got to rewrite the API, I think it would look something like this:

import { serialize } from '../some-path';

router.get('/my-example', async function(req, res) {
  try {
    const user = await Writer.findOne(/* query options omitted */);

    if (!user) {
      throw new Error('Could not find user.');
    }

    const profile = getProfile(user);
    const stories = serialize({ name: 'Story', data: user.Stories });
    const readers = serialize({ name: 'Reader', data: user.Readers });
    const writers = await getWriters(user);

    res.json({
      profile,
      stories,
      readers,
      writers
    });

  } catch (error) {
    /* Handle error */

  }
});

function getProfile(user) {/* ... */}

async function getWriters(user) {/* ... */}

b. Example 2

In whats-new-in-emberland, I found the model hook, a function that fetches PRs (pull requests) and RFCs (requests for comments), looking like this:

async model() {
  const store = this.store;
  const startOfWeek = this.startOfWeek;

  const projectFetches = CONSTANTS.REPOS.map((repo) => {
    return store.findRecord('github-organization', repo);
  });

  let orgs = await all(projectFetches);

  const prFetches = orgs.map((org) => {
    return fetch(`https://api.github.com/search/issues?q=is:pr+org:${org.id}+created:>=${moment(startOfWeek).format('YYYY-MM-DD')}`, {
      headers: {
        'Authorization': `token ${this.githubSession.githubAccessToken}`,
      },
    })
    .then((response) => response.json())
    .then((pulls) => this.store.pushPayload('github-pull', { githubPull: pulls.items }));
  });

  const rfcFetches = ['ember-cli/rfcs', 'emberjs/rfcs'].map((repo) => {
    return store.query('github-pull', { repo, state: 'all' });
  });

  await all(prFetches);
  let pulls = this.store.peekAll('github-pull').toArray();
  let rfcSets = await all(rfcFetches);

  let mergedPulls = pulls.filter((pull) => {
    return moment(pull.get('mergedAt')) > moment(startOfWeek);
  }).reduce((previousValue, item) => previousValue.concat(item), []);

  let newPulls = pulls.filter((pull) => {
    return moment(pull.get('createdAt')) > moment(startOfWeek) && !pull.get('mergedAt');
  }).reduce((previousValue, item) => previousValue.concat(item), []);

  let newRfcs = rfcSets.map((pulls) => {
    return pulls.filter((pull) => {
      return moment(pull.get('createdAt')) > moment(startOfWeek);
    });
  }).reduce((previousValue, item) => previousValue.concat(item), []);

  let mergedRfcs = rfcSets.map((pulls) => {
    return pulls.filter((pull) => {
      return moment(pull.get('mergedAt')) > moment(startOfWeek);
    });
  }).reduce((previousValue, item) => previousValue.concat(item), []);

  return hash({
    orgs,
    mergedPulls,
    newPulls,
    mergedRfcs,
    newRfcs
  });
}

The key to refactoring model was to extract functions one at a time. From lines 5-19 and 25-26, I understood that model fetches PRs. That's great! I extracted a function. Similarly, from lines 21-23 and 27, I saw that model fetches RFCs. That's yet another extraction.

It was interesting that extracting functions from lines 29-47 (a total of 4 functions) did require knowledge of Ember in order to refactor effectively. In addition to the model hook, Ember provides the setupController hook. It allows us to post-process data from model. For example, we can filter arrays.

I moved lines 29-47 to setupController for better separation of concerns, extracted functions, then further simplified code. In the end, I uncovered this beautiful code:

model() {
  return hash({
    prs: this.fetchPRs(),
    rfcs: this.fetchRFCs()
  });
}

setupController(controller, model) {
  super.setupController(controller, model);

  const { prs, rfcs } = model;

  controller.mergedPRs = filterMerged(prs);
  controller.newPRs = filterNew(prs);
  controller.mergedRFCs = filterMerged(rfcs);
  controller.newRFCs = filterNew(rfcs);
}

c. Example 3

In ember-component-template-colocation-migrator, I extracted a few functions from the main function, execute, before I added a feature. As a result, the feature caused a small, predictable change to execute (lines 9-10 below):

async execute() {
  let templateFilePaths = this.findClassicComponentTemplates();
  templateFilePaths = this.skipTemplatesUsedAsLayoutName(templateFilePaths);
  templateFilePaths = this.skipTemplatesUsedAsPartial(templateFilePaths);

  if (this.structure === 'flat') {
    this.changeComponentStructureToFlat(templateFilePaths);

  } else if (this.structure === 'nested') {
    this.changeComponentStructureToNested(templateFilePaths);

  }

  await this.removeEmptyClassicComponentDirectories();
}

Another example—one that hasn't been done (it's up for grabs for Hacktoberfest!)—is to extract a function from the flags code that we saw earlier:

const { argv } = require('yargs');

function getStructure() {
  const changeToFlatStructure = argv.f && argv.s;
  const changeToNestedStructure = argv.n && argv.s;

  if (changeToFlatStructure) {
    return 'flat';
  }

  if (changeToNestedStructure) {
    return 'nested';
  }

  return 'flat';
}

Notice the early exits, the refactoring technique that we learned in Section 2.

4. Conclusion

You can make an impact on an open source project by refactoring code. By practicing just 3 techniques—rename things, remove nests, and extract functions—you can help new contributors understand code and increase the longevity of the project.

You witnessed a few examples of what code can be like when you take good care of it. I encourage you to apply what you learned and share these techniques with others.

Discussion

pic
Editor guide