Are global variables bad?

By edA‑qa mort‑ora‑y. Originally published at mortoray.com.

We hear it often: "global variables are bad, avoid using them!" But is this actually good advice? Simplified blanket statements are already a bit suspicious, and once we get into the details of this one, it seems to fall apart. In this article, I look at the lifetime and visibility of variables, which we need to make sense of the claim. It may turn out to just be a meaningless statement.

What is a global variable?

The term "global" does not have a consistent definition across languages and architectures. This inconsistency requires us to be pedantic; if I need to issue a judgement about global variables, I have to understand what precisely I'm talking about.

Variables have several properties that define them. Two of the key properties for our discussion are lifetime and visibility. The type of the variable also plays a role, but it's a bit more subtle, so I won't cover it here.

It's important to understand the difference between names and values. Some language constructs can muddy the clarity of lifetime and visibility, such as with pointers.

Lifetime

The lifetime of a variable is how long the value exists (the value is referred to as the "object" in some languages). I don't want to get too deep into this concept, but we need a bit of an overview.

Some of the common lifetimes are:

  • temporary: These values are created by expressions, such as sin( a + 5 ). The resulting value can be passed to another function, or copied into a variable. The intermediate results disappear when no longer needed, or at the end of the statement in some languages.
  • block: Variables declared inside a block-scope (sections with {} or begin/end tags of some kind) tend to have values that exist only within that block.
  • instance: The values defined within a class exist on an instance of the class. This is a kind of child lifetime: when the parent dies so do the children.
  • application: These values share a lifetime with the application. They are created when it starts, or possibly later, and exist until the program terminates.
  • manual: The value's lifetime is controlled explicitly via new and delete operations.
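
A rough sketch of a few of these lifetimes, written in Python purely for illustration. The names `APP_NAME`, `Engine`, `Car`, and `build_and_drop` are invented for this example. Python has function scope rather than block scope and no `new`/`delete`, so the manual lifetime is not shown. The prompt cleanup observed here relies on CPython's reference counting; other implementations may free objects later.

```python
import gc
import weakref

APP_NAME = "demo"  # application lifetime: exists until the interpreter exits


class Engine:
    """Hypothetical class, only here so there is an object to track."""


class Car:
    def __init__(self):
        # instance lifetime: the engine lives as long as its owning Car
        self.engine = Engine()


def build_and_drop():
    car = Car()  # local to this function call
    total = sum([1, 2, 3]) + 5  # the result of sum(...) is a temporary
    # a weak reference lets us observe the engine without keeping it alive
    return weakref.ref(car.engine)


probe = build_and_drop()
gc.collect()  # not needed on CPython, harmless elsewhere
engine_alive = probe() is not None
print(engine_alive)
```

Once `build_and_drop` returns, the `Car` is gone and its `Engine` goes with it, while `APP_NAME` is still around.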

Those are the classic options if you're considering a single executable written in one language. It would seem natural to label "global variables" as those with application lifetime. It can get confusing as we consider software systems comprising more than one program. We have values that persist beyond execution: database and configuration values. Some values exist so long as the host machine is running. In a web app, we have session lifetime: they disappear when the browser is closed. It's not clear what global lifetime means.

Visibility

Visibility is a statement about how we get access to a value. The typical approach is via a variable name. In this limited interpretation there are a few ways to create names:

  • block scope: Names within a block of code are only available in that block of code. Some languages have "function scope" instead of arbitrary block scope.
  • private member scope: Names within a class that are only accessible by member functions.
  • public member scope: Names within a class that are accessible so long as you have an instance of the class.
  • module scope: All the source code within a module sees these names. These may actually be public and private, like member variables.

Some values are "visible" via accessor functions: setters and getters. At a quick glance, having the functions get_a and set_a is roughly equivalent to having a publicly visible variable a. Indeed many languages allow you to write accessor functions directly for a public variable. There's something a bit unsettling here; I'll get back to this.
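
To make the equivalence concrete, here is a minimal Python sketch. The classes `PublicPoint` and `AccessorPoint` and the function `bump` are made up for this example; `property` is Python's built-in way of attaching accessor functions to what looks like a plain attribute.

```python
class PublicPoint:
    def __init__(self):
        self.a = 0  # plain public attribute


class AccessorPoint:
    def __init__(self):
        self._a = 0  # "hidden" by naming convention only

    @property
    def a(self):  # getter
        return self._a

    @a.setter
    def a(self, value):  # setter
        self._a = value


def bump(point):
    # The caller cannot tell which kind of object it was handed.
    point.a = point.a + 1
    return point.a


results = [bump(PublicPoint()), bump(AccessorPoint())]
print(results)
```

Both objects behave the same from the outside, so wrapping a value in trivial accessors does not by itself change how visible it is.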

A variable could also be marked as read-only with a const or final tag. Despite the names, this doesn't necessarily mean the backing object is constant, only that the symbol always points to the same object.
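
A small Python illustration of that distinction, assuming a hypothetical `SETTINGS` dictionary. `typing.Final` marks the name as not to be rebound, but it is only checked by static type checkers, not enforced at runtime, and it says nothing about the object the name points to.

```python
from typing import Final

# The name SETTINGS should never be rebound to another object...
SETTINGS: Final = {"debug": False}


def enable_debug():
    # ...but the dictionary it points to can still be changed in place.
    SETTINGS["debug"] = True


enable_debug()
print(SETTINGS)
```

A type checker would reject `SETTINGS = {}` elsewhere in the module, yet `enable_debug` is perfectly acceptable to it.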

If we were to speak of "global" visibility, perhaps module scope comes the closest. These variables aren't visible everywhere though: usually only within a module, or sometimes exported for public use. The visibility is also only upwards: a user of a module could see names inside it, but the module can't see names from the higher level code.

Correlation and specialness

Some of the lifetimes and visibilities have familiar relationships:

  • a block scope name tends to have a value with block lifetime
  • a module scope name tends to have a value with application lifetime

Though this may be the default, it's not the only option. By using a static, or similar, keyword we can give block scope variables a value with application lifetime. The same keyword can also make member properties that have application lifetime: these are called "class variables".
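
Python has no `static` keyword, so the following is only an approximation of the idea. A class attribute plays the role of a class variable, and a closure stands in for a block scope variable whose value outlives any single call. `Connection`, `make_id_source`, and `next_id` are names invented for this sketch.

```python
class Connection:
    opened = 0  # class variable: one value shared by every instance

    def __init__(self):
        Connection.opened += 1


def make_id_source():
    counter = 0  # visible only inside this function and next_id

    def next_id():
        nonlocal counter
        counter += 1
        return counter

    return next_id


# Bound at module level, so the hidden counter lives as long as the program.
next_id = make_id_source()

Connection()
Connection()
ids = [next_id(), next_id(), next_id()]
print(Connection.opened, ids)
```

The name `counter` has a very narrow visibility, yet its value persists across calls, which is the decoupling of visibility and lifetime described above.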

If we're doing multi-threaded programming, we may also use a "thread local" lifetime. These variables may be visible to an entire module, but each running thread has a distinct value associated with it.
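
A minimal sketch of thread-local storage using Python's standard `threading.local`. The names `context`, `seen`, and `worker` are assumptions made for the example.

```python
import threading

context = threading.local()  # one module-visible name, one value per thread
seen = {}


def worker(label):
    context.user = label  # sets the value for this thread only
    seen[label] = context.user


threads = [threading.Thread(target=worker, args=(name,)) for name in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never assigned context.user, so it has no such value.
main_has_user = hasattr(context, "user")
print(seen, main_has_user)
```

Every thread sees the same name `context`, but the value stored behind it belongs to the thread that set it.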

Back to those accessor functions I mentioned. If access to a value is hidden behind functions, there's no real way the caller can know the lifetime of the backing value. These accessors must play a role in any advice we give on "global" values.
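
As a sketch of why the caller can't know, consider two Python accessors that look the same from the call site. `_started_at` and `get_start_time` are hypothetical names; `get_time` mirrors the kind of function discussed later in this article.

```python
import time

# Application lifetime: computed once, when the module is first loaded.
_started_at = time.time()


def get_start_time():
    # Backed by a long-lived module value.
    return _started_at


def get_time():
    # Backed by nothing in this module; a fresh value on every call.
    return time.time()


print(get_start_time(), get_time())
```

Nothing in the signatures reveals that one value lives for the whole program while the other is a temporary, which is why accessors blur any lifetime-based definition of "global".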

Uhm, so what's good or bad here?

All of this leads back to the original question: are global variables bad? We couldn't answer before because we didn't have a clear definition. Now, armed with our knowledge of lifetime and visibility, I'm not sure we want to come up with a definition.

Should a global variable be defined as one with module scope and application lifetime? Does it matter if it's private to a module? If it's only modified through accessor functions is it still global? Consider that a function like get_time() is the same as a read-only variable accessing the current time, and I find it hard to believe we'd want to say this function is "bad".

What if my program is a micro-service architecture? I have several little programs that start and stop at frequent intervals. Though technically I have many values with an application lifetime, it feels more limited because of how I'm using them. They certainly aren't "global" in my system.

The answer

I don't think I can give a satisfactory definition of "global variable" that has any universal usefulness. That would mean it's somewhat meaningless to ask, "are global variables bad?" The question has to be more nuanced than that. It must refer to the applicability of all lifetimes and visibilities.

Answering the question "What is the applicability of various combinations of name visibility and value lifetime?" would be a long and complicated discussion. In short, I assure you that all combinations have both good and bad uses.

Discussion

Global is difficult to define. The issue with globals is not that they are global, but that they have a shared mutable state. We should be switching from saying global variables are bad to shared mutable state is bad.

 

I want to agree, but I've seen the result of this as well, such "single modifier" principles. You end up with a bunch of code that hides variables only to expose a set_x( value ) { x = value; } function -- missing the fact that it makes x a shared mutable variable.

Though I'd say the notion of ownership and control is important, and should be on a list of essential knowledge for programmers. Just like lifetime and visibility, it plays a major role in where variables are defined.

 

I'd say "global" simply means "not namespaced or otherwise explicitly scoped".

When you look at it like that, they're untethered, untrustworthy and unnecessary.

 

Phrasing the whole issue from the other end, I think one could argue:
It's better to restrict access to mutable data as much as possible.

Otherwise it's easy to lose track of how the state of the application is changing, and hard to reason about the data flow.

Your get_time() example is off, since the point is to not have a mutable global variable. If your alternative variable was NOT Read-Only, you could have hours of debugging at later points in your application. (If the application state would change with multiple calls to get_time(), then it would be bad)

 

It's more that code with unknown side-effects is bad. Globals are just a common path to weird side-effects. If you have a global that's mutable, then this seems fine as long as there are strict rules about how and when it can change.

 

I agree, restricting access to mutable data, in particular having a single owner who modifies the data, is a decent starting point. Of course, it also has nuances, but at least it's not as ambiguous as "global variables".

I'm not off on get_time because the question "are global variables bad?" makes no mention of mutability. It's one of those nuances, which is the point I'm trying to make. A read-only global time variable is something we agree is okay.

 

Blanket statements are bad, because there is always a situation where they don't apply. You have to know when it is OK to make the exception though.
I tell my junior developers to start with the blanket statements: no globals, no gotos, no extern, etc. When they have some experience under their belt they'll know when it's OK to bend the rules they learned.

 

Maybe such simple blanket statements are okay for people just learning programming, but I'd expect my junior programmers to know at least some of the depth behind it.

Indeed, I'm tempted to make this an interview question. Are global variables bad? If a junior programmer can't give a sensible answer (for, against, or neutral) and explain the reasons, I honestly think they are underqualified.

 

The question still requires more information, because it depends on the situation it is in.

Take embedded programming with an "easy" case: a small photo display used for showing electronic convention badges at fan conventions. Yes, I built two versions.

Here, you have a situation where your program is the only thing running. There's no OS because there's no room for it -- you are the firmware and the OS. You got 30K of ROM and maybe 2K of RAM at the minimum. Oh, and you have to track which photo you opened, and some of that RAM can be used as a buffer for the pixels you're reading.

Arduino is embedded programming.

So, yes, global variables are good if used sparingly in this situation.

Now take something two levels up: You're using a Raspberry Pi that's on the International Space Station. You're displaying sensor data and if things get into a warning level, you're flashing a warning. You're an application on the Raspbian OS. You got almost half a gig of RAM.

Here, the need for global variables across your code is... well... next to nil. Plus, most libraries can't access your global variables unless explicitly asked for. The only global you probably will have is the glibc standard "errno" which everything uses. If you're using a language that's not C, all of that is abstracted away into Exceptions, and you can trap those (try/catch/throw).

In this case, global variables have to have a damn good use case for them. Otherwise, they're in lower scope levels, and get more insulated from each other -- they don't become global. They start becoming localized.

I haven't come across a C# program that had global variables (probably because the program itself was wrapped in a class). But I've come across many Arduino sketches that needed them.

 

Yes, context is important. It's what I was hinting at when I mentioned a micro-service architecture, but didn't want to get into too many other options.

I also mentioned briefly that the "type" of the variable is important as well. This can also depend on domain, and threading, but certain types, like fundamentals, are a bit safer as globals than large classes, or pointers.

 

I think you're answering the wrong question. Global variables themselves are not bad. Spamming the global scope is, because you'll have stuff bleeding out of your scope or even worse affect other parts of a system.

 

Again, same problem, define "global" scope. Module scope is often restricted to module. Languages like C++ offer namespace scope as well. Top-level variables in Python don't affect the scope in sub-modules.
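
The Python case can be sketched with an in-memory module. The module name `helper`, the function `read_counter`, and the variable `counter` are all made up for this illustration; `types.ModuleType` and `exec` are used only to avoid needing a second file.

```python
import types

# Build a throwaway module in memory instead of a separate file.
helper = types.ModuleType("helper")
exec(
    "def read_counter():\n"
    "    return counter\n",
    helper.__dict__,
)

counter = 10  # top-level name, but only in *this* module

try:
    helper.read_counter()
    leaked = True
except NameError:
    # helper looks names up in its own module namespace, not ours
    leaked = False

print(leaked)
```

The "global" `counter` here is global to one module only; code in `helper` has to import it explicitly to see it.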

Sure, if you're sticking everything into one scope, that's a problem, but I don't think it matters what scope that is.

 

Same difference. It's not about global variables, it's about being mindful of what to expose.

Well yes, but that's the point of my article. To expose the ideas behind what "global" might mean to draw a learner's attention to the real issues at play.

 

Like many "rules of thumb" in programming, it's talked about to beginners in order to prevent certain practices without properly explaining those concepts we're trying to impart. This, of course, is code that modifies variables that are not passed to it and not "owned" by the object/library that is modifying it. Doing so makes it incredibly hard to debug, manage the program state, and refactor. Beyond that, it further complicates threading and forking.

Is there a place for "global" variables (however you want to define it)? Of course, but it's something that should be wielded with caution and deliberation, and not standard behavior. These variables should be used inside the code that creates them in ways that validate their state/value without assumptions, and on projects that may one day become "mature," you will have to accept that changing the functionality and purpose of the global variables you create will (potentially) have huge consequences on the entire code base. That's not something I expect a beginner to understand, therefore, "Global variables are bad!"

 

I don't like these rules of thumb because they are misapplied and misunderstood. You end up having programmers that think they are following rules but still end up making the mistakes the rule is meant to prevent, and restricting themselves from using something that would help them.

It's why the definition of "global" is so important. If a programmer chooses to narrowly apply it to a specific variable type in their language they may entirely miss the point. For example, they might freely use Windows registry values, DB values, or otherwise to achieve the same thing, thinking it's alright since they don't have a mantra against it, and it isn't violating the "global" variable rule.

 

I look at global variables as variables in a scope you cannot reinitialize. In C it'd be C's global scope (you'll need a new application instance to reinitialize its global variables). In C# it's going to be the scope of a static class (you'll need a new application instance here again). Wikipedia has a better definition than mine 😅

Also, I think it's important to use the term global-variable in the context of the language when discussing its pros and cons.

Anyway according to your post, Application level lifetime would be my definition of global variables.

Something you might be interested in: softwareengineering.stackexchange....

 

Global mutable variables make it much easier to create bugs.

Where global is "mutable variables which are ex machina to the routine", including mutable instance variables.

And easier to create spooky action at a distance bugs, which are always fun to track down.

Throw multithreaded programming into the mix, and bam instant job security.