Devin Witherspoon

Posted on Nov 29, 2020 • Edited on Jan 5, 2021

Stop Using "data" as a Variable Name

#discuss #javascript #react #codequality

"There are only two hard things in Computer Science: cache invalidation and naming things."

- Phil Karlton

Setting aside cache invalidation, which is indeed difficult, this infamous quote is something that rings in my head whenever I'm having trouble finding the right name for something. Clear naming provides important context whenever someone needs to quickly understand code, whether they're firefighting, debugging, interviewing, or assisting a teammate - I don't have to ask someone what users means, but I do have to ask what data means. While I don't often find the best names, I try to optimize my code for the reader by following some basic rules.

The Rules:

Use Meaningful Prefixes

While these prefixes aren't universal, they are great to establish a shared language within your team. Using them consistently throughout your codebase can make reading comprehension easier.

get, find, fetch for functions that return a value or a Promise that resolves to a value without mutating arguments or itself.
set, update for functions that mutate arguments or the callee for member functions. These functions may also return the updated value or a Promise that resolves to the updated value.
on, handle for event handler functions. My team's convention is that onEvent is passed through props into the component and handleEvent is declared inside the component.
is, should, can for boolean variables and functions with boolean return values.
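
As a rough sketch of how these prefixes might read in practice, here is a small set of account helpers. The function names, the `account` shape, and the `event` shape are all hypothetical and only serve to illustrate the conventions above.

```javascript
// Hypothetical account helpers illustrating the prefix conventions.

// get: returns a value without mutating its argument.
function getAccountBalance(account) {
  return account.balance;
}

// is: returns a boolean.
function isOverdrawn(account) {
  return getAccountBalance(account) < 0;
}

// update: mutates its argument and returns the updated value.
function updateAccountBalance(account, amount) {
  account.balance += amount;
  return account.balance;
}

// handle: an event handler declared where the event is consumed.
function handleDeposit(event) {
  return updateAccountBalance(event.account, event.amount);
}

const account = { balance: -20 };
console.log(isOverdrawn(account)); // true
handleDeposit({ account, amount: 50 });
console.log(getAccountBalance(account)); // 30
console.log(isOverdrawn(account)); // false
```

A reader who knows the convention can tell from the name alone which of these calls are safe to make without side effects.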

Any convention that becomes a standard in your team can help with readability. Make sure to document these in the project README or wiki. Creating custom linters to enforce these would be even more effective.
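
Before writing a custom linter, it may be worth checking whether built-in rules already cover part of the convention. The sketch below uses ESLint's `id-denylist` and `id-length` rules rather than a custom rule; the list of banned names is an assumption for illustration, not a recommendation from any particular team.

```javascript
// .eslintrc.js - a hypothetical starting point; the banned names are this sketch's choice.
module.exports = {
  rules: {
    // Reject identifiers that carry no meaning on their own.
    "id-denylist": ["error", "data", "temp", "payload"],
    // Reject single-character identifiers.
    "id-length": ["error", { min: 2 }],
  },
};
```

Rules like these only catch exact names, so they complement the documented conventions rather than replace them.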

Use Words that Add Meaning

As an example, developers often name variables data by default, but let's examine a couple of its definitions:

"factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation"

"information in digital form that can be transmitted or processed"

These definitions could refer to any variable we process, so they give the reader no information. Let's look at an example that doesn't follow this rule:

function total(data) {
  let total = 0;
  for (let i = 0; i < data.length; i++) {
    total += data[i].value;
  }

  return total;
}

We know this function calculates a total of something, but we're not sure what.

Exceptions

Sometimes your variable could actually contain anything, like a network request response body. Libraries like axios use data, which is a reasonable name in this context. Even in this scenario, the alternative body conveys more meaning and is what the native web API fetch uses in its Response.
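
One way to limit the reach of a generic name is to keep it at the boundary and switch to a specific name as soon as the shape is known. This is a minimal sketch, assuming a hypothetical response whose JSON has a `users` field; the helper name and the stand-in response object are made up for illustration.

```javascript
// Hypothetical helper: `response` is anything with a json() method, such as a fetch Response.
async function getUsersFromResponse(response) {
  const body = await response.json(); // generic name at the boundary
  return body.users;                  // specific name from here on
}

// Stand-in for a real Response so the sketch runs without a network call.
const fakeResponse = { json: async () => ({ users: [{ name: "Ada" }] }) };

getUsersFromResponse(fakeResponse).then((users) => {
  console.log(users.length); // 1
});
```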

Use Full Words

Like everyone else's, the System 1 part of my brain always tells me to take shortcuts to finish my task sooner. When it comes to variable naming, shortcuts often mean abbreviations or single character variable names.

Like before, let's look at a function that doesn't follow the rule:

function totBal(accts) {
  let tot = 0;
  for (let i = 0; i < accts.length; i++) {
    tot += accts[i].bal;
  }

  return tot;
}

We can do some mental gymnastics to guess that accts means accounts and tot is total, but we can't process the code at a glance.

Exceptions

Common industry abbreviations are preferred over their long form (e.g. URL, API, CSS).

Don't Use "Fluff" Words

Container and Wrapper only have meaning in relation to the thing they're containing. The problem is that any component that isn't a base element contains another component. You also end up in the awkward position of naming components MyComponentContainerContainer. The same applies to Wrapper.

Exceptions

In some contexts, these "fluff" words can have significant meaning. A common pattern in React class components is the presentation/container component pattern. Container in this case may indicate a component that manages state on behalf of a presentation component - just make sure you consistently use it for this purpose, or it loses meaning.

Spelling Matters

Misspelling words creates bugs and makes searching your code harder. Typos are easy to ignore, but having the right spelling for everything in your codebase makes a world of difference, especially when attempting global find/replace.

Putting it Together

Applying all the rules at once, we get the following function:

function getAccountsTotalBalance(accounts) {
  let totalBalance = 0;
  for (let accountIndex = 0; accountIndex < accounts.length; accountIndex++) {
    totalBalance += accounts[accountIndex].balance;
  }

  return totalBalance;
}

While accountIndex vs. i might be contentious, the rest of the function should be much clearer. getAccountsTotalBalance fully communicates the intent of the function and the prefix get indicates that it will not result in any mutations. It's worth the code author investing increased effort in exchange for the benefit of the reader. Your future self will appreciate the extra work when they're maintaining the code six months later.

If you're worried about line length, consider using a tool like Prettier to automatically format the code.

Conclusion

The goal of these rules is to bring as much meaning as possible to the code we write for future readers. Find the ones that work for your context, and if a rule is doing more harm than good, change or abandon it. Codifying your team's rules will help communicate your thoughts on the subject and is not meant to bring a hammer down on your teammates.

Please share any other rules you follow when naming variables, functions, classes, etc. or let me know if you disagree with any of the rules here and how you'd change them.

Top comments (54)

Todd Pressley • Nov 29 '20

Thank you for articulating this in your own way and publishing :) This topic reminds me of something a favorite mentor once taught me, when confused about naming a particular function:

"If you're having trouble naming a function, then it's most likely doing too many things."

Years later, when encountering similar issues, I play this back in my head and have found it very useful. It can be expanded to naming just about anything.

Again, love the article!

Devin Witherspoon • Nov 29 '20

Thanks for sharing! That’s a great angle to look at it from. I love that quote, I’ve definitely encountered that scenario many times both as a reviewer and an author.

Given people seem to have appreciated this article, would you mind if I expand on this in the future along the lines of “X reasons why you might be struggling to name something”?

Todd Pressley • Nov 29 '20

That'd be awesome, man! Not at all!

Kelly Brown • Nov 30 '20

Also, stop using "temp" as parts of names. All variables are inherently temporary. They are local in scope. Telling me that it is "temp" adds no new information. There is always a superior name. Take the traditional swap algorithm:

swapValue = a;
a = b;
b = swapValue;

Knowing the variable is used to contain the swapped value is far more useful information than knowing it's a temporary store.

Devin Witherspoon • Nov 30 '20

Great call out! I don’t see this much in production code, but it's super common in interview questions, as well as when people are asking for help. I also like your alternative as a replacement without additional context.

Kelly Brown • Dec 2 '20

I kid you not: I have seen tempData before. Best of both worlds!

Chaitanya Malireddy • Nov 30 '20

In this context, if this were JavaScript, I'd dispense with the 'temp' variable altogether :) Also, often it makes sense to chain operations to avoid having to name intermediate states.

[a, b] = [b,a]; // destructuring assignment

Kelly Brown • Dec 2 '20

To me, it depends on the complexity of the intermediate state. I used to aggressively pack as many operations into a single line as possible, but more recently, I like capturing operations into local variables both for readability and for debugging (easily mouse over variables to see what the answer was). Obviously, this can be taken too far, so it's a judgment call.

Chaitanya Malireddy • Dec 2 '20

Yeah I don't like to pack too much into a clever oneliner either - makes it hard to read and debug. I like chaining methods if I can, like say a bunch of array transformations. But you're right it all depends on what you're trying to do - one has to strike a good balance per usecase.

Kelly Brown • Dec 2 '20

Chain methods are amazing. I can put each call onto its own line, and it becomes a super clear series of steps.
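
To illustrate the one-call-per-line style described here, a small sketch follows. The `accounts` array and its `isActive` and `balance` fields are assumptions made up for the example.

```javascript
// Hypothetical data: the isActive and balance fields are assumptions.
const accounts = [
  { owner: "Ada", balance: 120, isActive: true },
  { owner: "Grace", balance: 80, isActive: false },
  { owner: "Linus", balance: 50, isActive: true },
];

const activeAccountsTotalBalance = accounts
  .filter((account) => account.isActive)           // keep active accounts
  .map((account) => account.balance)               // extract each balance
  .reduce((total, balance) => total + balance, 0); // sum the balances

console.log(activeAccountsTotalBalance); // 170
```

Each line is one step, and none of the intermediate arrays needs a name.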

Etienne Burdet • Nov 30 '20

req and res are two good candidates too! Especially when you start caching, fetching from backend and api… resFromServ, resFromCache, resFromNetwork etc. make things much easier to understand!

Vincent Milum Jr • Dec 23 '20

It is until it isn't. This is one of the big problems with the suggestion to use shorthands: they mean different things to different people.

Coming from a Win32 background, "res" is a "resource", such as icons, images, etc. They're non-code elements compiled into an exe file.

It's easy to get naming conflicts between people when shortening words like this.

Chad Windham • Dec 4 '20

I really like this example, good call out😉

𒎏Wii 🏳️‍⚧️ • Nov 30 '20

I don't like the get prefix at all. Returning a value is the default use of a function, so it shouldn't be part of the function's name.

The word "total" could mean many things. Regarding balance, we can guess that it's the sum, but it's still better to make it clear.

The accountIndex variable should be named i. Generally, one-letter variables are bad, but i, j and a few more are so ubiquitous that they can and should be used. Every competent programmer will know what they mean.

Instead, that loop shouldn't be written like that at all. It looks like C code. Javascript and comparable languages have much nicer ways to iterate over an array and those should be used instead.

A good solution would be either

const sumBalance =
   (accounts) => accounts.reduce((acc, account) => acc+account.balance, 0)

const sumBalance =
   (accounts) => accounts
      .map( (acc) => acc.balance )
      .reduce( (a, b) => a+b, 0 )

Devin Witherspoon • Nov 30 '20

Thanks for sharing! As far as defaults go, I try to always make the implicit into the explicit. For me that means saying get when it only returns a value. Same goes for the i value, I want my reader to do as little work as possible to understand my code. Using i can also result in referencing the wrong value when looping over multidimensional arrays. Someone who is a bit distracted may also have trouble keeping track of i and j simultaneously.

Regarding your point about the loop, I agree. Personally I don’t write loops using indices either, but many people do, and it’s the most universally recognizable for loop format. Changing the format of the loop would have obfuscated the intent of the exercise - showing how having rules or conventions can help us find better names for things.

𒎏Wii 🏳️‍⚧️ • Nov 30 '20

when looping over multidimensional arrays

For two dimensions, i and j are still easy enough to follow. Starting at 3, it's very rare to not have better names resulting from context, like x, y and z for spatial coordinates, etc.

but many people do

They shouldn't. Iterating over data structures with C-style numeric for loops is a much worse habit than calling a variable data. But fair point on the intent of the exercise, sometimes we have to write "bad" code to avoid having to think up a convoluted example just to illustrate a very basic principle.

cubiclesocial • Dec 3 '20

I've never understood why people use i and j for loop variable names. I prefer x, y, and z because they are generally used for looping spatially over an array (columns = x, rows = y, depth = z). Using x, y, and z also mirrors the Cartesian-like planes in mathematics fairly well. That is, anyone with a strong math background will understand x, y, z, and n intuitively while i and j are largely meaningless with i being used for imaginary numbers. i and j and l (lowercase L) are also the thinnest characters in many fonts, making them harder to read.

Mark • Dec 31 '20

Check out this Answer on StackOverflow:
stackoverflow.com/a/4137890/4035952

The answer is mostly because of math and what "i" and "j" stood for, and it's pretty easy to understand how it made its way into code. Luckily these days, simple "for" loops can often be replaced with functional versions, or use "for...of".

Aedan • Dec 2 '20

While working on smaller scripts or just scribbling for myself, I always use i. However, once you work on a massive project with 1000s of lines of code where there are multiple loops and multidimensional arrays going on, and they use i, x, z etc., it gets so confusing. It honestly never hurts to write out a word like accountIndex. It's simply good practice that has no real downside. The argument that it takes longer to write accountIndex than just i is true, however with IDEs having autocomplete this isn't a real issue. Even when writing it manually, it takes maybe 1-2 seconds to write accountIndex. While it might cost you 1-2 seconds now, you'll easily make up for it down the road when you go over that code weeks later and you immediately know what accountIndex refers to, rather than seeing the i for the 50th time. Even just scrolling through the code while working on it makes it so much more visible, saving you seconds here and there.

Josef Jelinek • Dec 3 '20

accountIndex is hard to read because the reader really needs to read it... for i there is instant recognition in the brain.

  for (let accountIndex = 0; accountIndex < accounts.length; accountIndex++) {
    totalBalance += accounts[accountIndex].balance;
  }

vs.

  for (let i = 0; i < accounts.length; i++) {
    totalBalance += accounts[i].balance;
  }

Fortunately, there is a superior alternative (without going into Array methods):

  for (const account of accounts) {
    totalBalance += account.balance;
  }

Mark • Dec 31 '20

Yeah, I think it's important to use words where it makes sense, but sometimes the syntax gets lost in long lines because of longer words. For variables, meaningful names are important, but I agree that for things like loops, it's better to use alternative constructs or functional methods instead! +1 for "for...of"

Kelly Brown • Dec 2 '20

I later noticed that I do occasionally use data as a variable name. The scenario often seems to be passing along an opaque byte array. I suppose I could add a little flavor to it by way of receivedData or httpData or whatever context, but I don't know how to improve on that name when that particular method is not in charge of decoding that data.

Devin Witherspoon • Dec 2 '20

I think that falls well into the exception category. byteStream or byteArray may be helpful for describing the nature of the data, but if there’s not a clear intent to the data then I suppose the next option is to make sure that name for it is as short lived as is reasonable.

Keno Clayton • Dec 3 '20

Good article 👍🏾 I prefer using i for any sort of single dimensional loop, but for multi-dimensional loops, it is important to know exactly which index you're referring to. e.g.

for ( rows = 0; ... )
  for ( columns = 0 ... )

Devin Witherspoon • Dec 3 '20

Yup, to each their own 👍 I tried to acknowledge that particular point as contentious. Just too much dogma around it. The important part is that the project is consistent and it's actually something that the team applies consistently.

Christopher Wray • Nov 29 '20

Wow, totally agree! Another one is “payload”. Why in the world would you use that as a variable? What is the payload expected to be? That is what the variable should be named.

I also really like your definition of accountIndex vs i. Makes way more sense.

Devin Witherspoon • Nov 29 '20

payload is a great example! Super generic, could be anything, all it tells us is it’s probably not metadata. I think it has the same exceptions as data as well.

official_dulin • Apr 29 '21 • Edited

If undefined is a meaningful value, replace it with a variable. E.g. We will do something if scoreType is all.

Bad

if (scoreType === undefined) {
  // do something
}

Good:

const all = undefined;
if(scoreType === all) {
  // do something
}

Aedan • Nov 29 '20

Great article, thank you.

I've recently watched "Clean Code - Uncle Bob" (Lesson 1 and 2, both on YouTube). His conclusion is that whenever any line of code requires comments, it's a failure on the developer's part because good code should contain variable and function names that explain themselves, thus making comments unnecessary (though I disagree on the comment part, because there are some good reasons for comments in code). He mentions that a lot of developers dislike longer names because they make the code look "uncool", bigger in file size, and bloated, and lead to horizontal scrollbars in the IDE (or wrapped lines), and some developers are even under the false impression that longer names lead to slower code. Just like in your article, he recommends using names that are perfectly understandable even if they consist of multiple words. We really need to stop being afraid of using longer names and start using names that are self-explanatory and can potentially work without any additional commentary. Thanks again for the article.

Devin Witherspoon • Nov 29 '20

Thanks for the feedback, I'm glad you appreciated it. I agree with Robert Martin on his points about code communicating intent as much as possible. I wish he focused more on the people behind the code and expressing kindness to each other being as important as getting the code right.

Regarding comments, I personally try to add comments for context that is important and isn't really part of the code - e.g. intended lifespan of the code, maybe even linking to a ticket with an important conversation. For communicating intent - I try to resort to tests as much as possible since they're more inclined to change with the code. I don't consider it a failure to add a comment because it was hubris for me to ever think I didn't need them. I think it's an achievement of self awareness to know where to add proper comments rather than a failure of coding ability.
