Adam Nathaniel Davis

In Defense of Defensive Programming

[NOTE: In this article I reference a validation library that I wrote called allow. It's now in an NPM package that can be found here: https://www.npmjs.com/package/@toolz/allow]

My regular readers (both of them) know that I've written a lot about the integrity of values that are passed between different parts of an application. Sometimes, we add manual validations. Sometimes, these values aren't checked at all. Sometimes, we check them at compile time, but we assume they'll be correct at runtime (I'm looking dead at you, TypeScript).

Whatever the approach, I've only recently become aware that the term "defensive programming" is generally used as a pejorative by many programmers. My impression is that "defensive programming" is often interpreted as "jumping through a ridiculous number of hoops to validate data - data that probably doesn't really need to be validated at all." And I don't entirely disagree with this assessment. But I fear some may have become so averse to the idea of defensive programming that they don't recognize the other loopholes they're incorporating into their own code.



Basic Assumptions

Let's ensure that we're all on "the same page" here. I'm sure there are multiple definitions for defensive programming. So, for the sake of this article, this is the definition I'll be using:

Defensive Programming: The practice of treating all inputs to a program as "unknown" - hostile, even. This practice guards such inputs from the main application flow until they've been validated as conforming to the "expected" type/value/format.


I'm focusing on inputs. It would be possible to validate data within the same code block where it was defined. And such a practice would certainly be defensive. But it would also be extreme. And silly.

But inputs represent the strongest case for defensive programming. Because inputs come from... somewhere else. And you don't want this program to depend on the inner workings of another program in order to do its business. You want this program to be a standalone unit. But if this program stands alone, then it must also assume that any input to the program is potentially hostile.



Validation Hell

This is where "defensive programming" becomes a dirty word. When we talk about validating all of our inputs, we fear it will lead to something like this:



const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
  if (isNaN(passAttempts)) {
    console.log('passAttempts must be a number.');
    return;
  }
  if (isNaN(gamesPlayed)) {
    console.log('gamesPlayed must be a number.');
    return;
  }
  if (gamesPlayed === 0) {
    console.log('Cannot calculate attempts-per-game before a single game has been played.');
    return;
  } 
  return passAttempts / gamesPlayed;
}



The function has inputs. And the function shouldn't be aware of where those inputs originated. Therefore, from the perspective of the function, the inputs are all potentially dangerous.

That's why this function already has some significant baggage attached to it. We can't necessarily trust that passAttempts or gamesPlayed are numbers. Because passAttempts and gamesPlayed are inputs to this program. And if we feel the need to program "defensively", we end up stuffing extra validations inside our program.

Honestly, the validations shown above aren't even adequate, as far as I'm concerned. Because, while we're ensuring that the inputs are numbers, we're not validating that they're the right kind of numbers.

Think about this: If we're calculating the pass attempts per game, does it make sense that either value could be negative? Would it make sense if either of them were fractional?? I can't remember the last time a player threw 19.32 passes in a single game. I can't remember the last time a player played in -4 games. And if we want to ensure that our function is truly equipped to always provide the most logical returns, we should also ensure that it is always given the most logical inputs. So if we really wanted to go all-in on defensive programming techniques, we'd add even more validations to ensure that the inputs are non-negative integers.
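If we really did go all-in, the "validation hell" example above might grow into something like this (a sketch - the exact checks and messages are just illustrative):

const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
  // Not just numbers - non-negative integers:
  if (!Number.isInteger(passAttempts) || passAttempts < 0) {
    console.log('passAttempts must be a non-negative integer.');
    return;
  }
  if (!Number.isInteger(gamesPlayed) || gamesPlayed < 0) {
    console.log('gamesPlayed must be a non-negative integer.');
    return;
  }
  if (gamesPlayed === 0) {
    console.log('Cannot calculate attempts-per-game before a single game has been played.');
    return;
  }
  return passAttempts / gamesPlayed;
}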

But who really wants to do all of that?? All we wanted was a simple function that returns the result of passAttempts divided by gamesPlayed, and we ended up with a bloated mess of code. Writing all of those defensive validations feels laborious and pointless.

So how do we avoid the nuisances of defensive programming? Well, here are the approaches (excuses) that I most frequently encounter.


[Image: a forest of trees]

Missing The Forest For The Trees

Is the picture above a bunch of trees? Or is it a single forest? Of course, depending upon your frame of reference, it may be either (or both). But it can be dangerous to assume that the picture above shows no "trees" and only shows a single "forest".

Similarly, what do you see when you look at code like this?



const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
    //...
}

const calculateYardsPerAttempt = (totalYards = 0, passAttempts = 0) => {
    //...
}

const getPlayerName = (playerId = '') => {
    //...
}

const getTeamName = (teamId = '') => {
  //...
}



Is this one program (a "forest")? Or is it a bunch of individual programs ("trees")??

On one hand, they're presented in a single code example. And they all seem related to some kind of central player/team/sport app. And it's entirely possible that these functions will only ever be invoked in a single runtime. So... they're all part of a single program (a "forest"), right??

Well, if we think beyond our overly-simplistic example, the simple fact is that we should always be trying to write our functions as "universally" as possible.

This means that the function might only ever be used in the context of this particular example. But the function also might be referenced dozens of different times across the app. In fact, some functions prove to be so utilitarian that we end up using them across multiple applications.

This is why the best functions operate as standalone, atomic units. They are their own "thing". And as such, they should be able to operate irrespective of the broader app from which they're called. For this reason, I believe, religiously, that:

Every single function is: a program.


Of course, not everyone agrees with me on that front. They argue that each function is a tree. And they only need to worry about the inputs that are provided to their overall program (the forest).

This gives devs a convenient way to avoid the headaches of acid-testing their code. They look at the example above and they say things like, "No one will ever pass a Boolean into getPlayerName() because getPlayerName() is only ever called from within my program and I know that I'll never pass something stupid into it - like a Boolean." Or they say, "No one will ever pass a negative number into calculateYardsPerAttempt() because calculateYardsPerAttempt() is only ever called from within my program and I know that I'll never pass something stupid into it - like a negative number."

If you're familiar with logical fallacies, these counterarguments basically fall under Appeal to Authority. These devs treat the program as the "authority". And they simply assume that, as long as the input is provided from somewhere else within the same program, there will never be any problems. In other words, they say, "The inputs to this function will be fine because 'the program' says they're fine."

And that is fine - as long as your app is minuscule. But as soon as your app grows to the point that it's a "real", robust app, this appeal falls flat. I don't know how many times I've had to troubleshoot code (often... my code), when I realized that something was failing because the wrong "kind" of data was passed into a function - even though the data came from somewhere else inside the same program.

If there are (or will ever be) two-or-more devs on the project, this "logic" is woefully insufficient. Because it relies on the silly idea that anyone else who works on the project will never ever call a function in the "wrong" way.

If the project is (or will ever be) large enough that it's impractical to expect a single developer to have the entire program in their head, this "logic" is, again, woefully insufficient. If an end-user can put ridiculous values in a form field, then it's equally true that another programmer can try to call your function in a ridiculous way. And if the logic inside your function is so brittle that it blows up whenever it receives bad data - then your function sucks.

So before we move on, I want to make this crystal clear: If your excuse for not validating your function inputs is simply to lean on the fact that you know all the ways the function will be called by you in your app, then we really never need to be on the same dev team. Because you don't code in a way that is conducive to team development.



The Testing Shell Game

I've found that many devs don't try to solve the problem of brittle inputs by writing a bunch of defensive code. They "solve" it by writing a metric crap-ton (technical term) of tests.

They'll write something like this:



const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
  return passAttempts / gamesPlayed;
}



And then they shrug off the brittle nature of this function by pointing to the incredible pile of integration tests they wrote to ensure that this function is only ever called in the "right" way.

To be clear, this approach isn't necessarily wrong. But it only shunts the real work of ensuring that the application functions properly onto a set of tests that don't exist at runtime.

For example, maybe calculatePassAttemptsPerGame() is only ever called from the PlayerProfile component. Therefore, we could try to craft a whole series of integration tests that ensure this function is never actually invoked with anything other than the "right" data.
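Such a suite might look something like this (a rough sketch, assuming Jest; the './stats' import and the fixture values are invented purely for illustration):

import { calculatePassAttemptsPerGame } from './stats';

describe('calculatePassAttemptsPerGame()', () => {
  it('returns attempts-per-game for valid inputs', () => {
    expect(calculatePassAttemptsPerGame(96, 3)).toBe(32);
  });

  it('is only ever given numeric stats by PlayerProfile', () => {
    // ...render PlayerProfile with fixture data, assert on its output, etc.
    // However exhaustive these assertions are, none of them exist at runtime.
  });
});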

But this approach is tragically limited.

First, as I've already pointed out, tests don't exist at runtime. They're typically only run/checked prior to a deployment. As such, they are still subject to developer oversight.

And speaking of developer oversight... trying to acid-test this function through integration tests implies that we can think of all the possible ways/places where the function can be called. This is prone to short-sightedness.

It's much simpler (in the code) to include the validations at the point where the data needs to be validated. This means that there are usually fewer oversights when we include the validations directly in-or-after the function signature. So let me spell this out simply:

Tests are great. But they are never a one-for-one replacement for data validation.


Obviously, I'm not telling you to eschew unit/integration tests. But if you're writing a pile of tests just to ensure proper functionality when a function's inputs are "bad", then you're just doing a shell-game with your validation logic. You're trying to keep your application "clean" - by shoveling all of the validation into the tests. And as your application grows in complexity (meaning that: there are more conceivable ways for each function to be called), your tests must keep pace - or you end up with glaring blindspots in your testing strategy.



The TypeScript Delusion

There's a large subset of Dev.to readers who would read this with a cocky smirk and think, "Well, obviously - this is why you use TypeScript!" And for those cocky devs I'd say, "Yeah, ummm... sorta."

My regular readers (both of them) know that I've had some real "adventures" over the last half-year-or-so with TS. And I'm not against TS. But I'm also wary of the over-the-top promises made by TS acolytes. Before you label me as a Grade-A TypeScript Haterrr, lemme be clear about where TS shines.

When you are passing data within your own app, TS is incredibly helpful. So for example, when you have a helper function that's only ever utilized within a given app, and you know that the data (its arguments) only ever emanate from within the app, TS is incredible. You pretty much catch all of the critical bugs that might occur throughout the app whenever that helper function is called.

The utility of this is pretty obvious. If the helper function requires an input of type number and, at any point in the rest of the app, you try to call that function with an argument of type string, TS will immediately complain. If you're using any kind of modern IDE, that also means that your coding environment will immediately complain. So you'll probably know, immediately, when you're trying to write something that just doesn't "work".

Pretty cool, right???

Except... when that data emanates from outside the app. If you're dealing with API data, you can write all the comforting TS type definitions that you want - but it can still blow up at runtime if the wrong data is received. Ditto if you're dealing with user input. Ditto if you're dealing with some types of database inputs. In those cases, you're still resigned to either A) writing brittle functions, or B) adding additional runtime validations inside your function.
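To make this concrete, here's a hypothetical sketch of what can happen when an API response doesn't honor your type definitions (the payload shape is invented for illustration):

// Imagine this payload came back from a fetch() call. The type definitions
// "promise" that passAttempts is a number, but at runtime nothing enforces it:
const response = { passAttempts: undefined, gamesPlayed: 12 };

// No error is thrown - we just silently get garbage:
const attemptsPerGame = response.passAttempts / response.gamesPlayed;
console.log(attemptsPerGame); // NaN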

This isn't some knock on TS. Even strongly-typed OO languages like Java or C# are susceptible to runtime failures if they don't include the proper error handling.

The problem I'm noticing is that far-too-many TS devs write their data "definitions" inside the function signature - or inside their interfaces - and then... they're done. That's it. They feel like they've "done the work" - even though those gorgeous type definitions don't even exist at runtime.

TS definitions are also (severely) limited by the basic data types available in JS itself. For example, in the code shown above, there is no native TS data type that says passAttempts must be a non-negative integer. You can denote passAttempts as a number, but that's a weak validation - one which is still vulnerable to the function being called the "wrong" way. So if you really want to ensure that passAttempts is the "right" kind of data, you'll still end up writing additional, manual validations.



The Try-Catch Hail Mary

There is one more avenue we could explore to avoid defensive programming: the try-catch.

Try-catch obviously has its place in JS/TS programming. But it's quite limited as a tool for defensive programming when it comes to validating inputs. This happens because try-catch is really only meaningful when JS itself throws an error. But when we're dealing with aberrant inputs, there are frequently use-cases where the "bad" data doesn't result in an outright error. It just provides some kind of unexpected/undesired output.

Consider the following example:



const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
  try {
    return passAttempts / gamesPlayed;
  } catch (error) {
    console.log('something went wrong:', error);
  }
}

const attemptsPerGame = calculatePassAttemptsPerGame(true, 48);
console.log(attemptsPerGame); // 0.0208333333



The try-catch is never triggered, because true / 48 doesn't throw an error. JS "helpfully" interprets true as 1 and the function returns the result of 1 / 48.




It's Not That Hard

At this point, for those still reading, you're probably thinking, "Well then... there's no good answer to this. Defensive programming is cumbersome and slow. Other techniques are prone to oversights and failures. So... what's to be done???"

My answer is that defensive programming doesn't need to be so hard. Some people read "defensive programming" as "validate ALL inputs" - and they jump to the conclusion that validating ALL inputs must, by definition, be a nightmare. But that's not the case.

I've written before about how I do runtime validation on ALL of my functions that accept inputs. And for me, it's easy. (If you'd like to read about that, the article is here: https://dev.to/bytebodger/better-typescript-with-javascript-4ke5)

The key is to make the inline validations fast, easy, and concise. No one wants to clutter every one of their functions with 30 additional LoC of validations. But - you don't have to.

To give you a tangible example of my approach, consider the following:



import allow from 'allow';

const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
  allow.anInteger(passAttempts, 0).anInteger(gamesPlayed, 1);
  return passAttempts / gamesPlayed;
}



The entire runtime validation for this function is handled in a single line:

  • passAttempts must be an integer, with a minimum value of 0.
  • gamesPlayed must also be an integer, with a minimum value of 1.

That's it. No TS needed. No fancy libraries. No spaghetti code crammed into every function to manually validate all of the arguments. Just a single call to allow, that can be chained if there are two-or-more arguments expected in the function.

To be absolutely clear, this is not some kind of (long-winded) advertisement for my silly, little, homegrown validation library. I couldn't care less which library you use - or whether you roll your own. The point is that runtime validation doesn't need to be that hard. It doesn't need to be verbose. And it can provide much greater overall security to your app than any kind of compile-time-only tool.
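And if you'd rather roll your own, even a tiny homegrown helper gets you most of the way there. Here's a minimal sketch (hypothetical code - this is not the allow library's actual implementation):

// A minimal, chainable validator - for illustration only:
const check = {
  anInteger: (value, min = -Infinity, max = Infinity) => {
    if (!Number.isInteger(value) || value < min || value > max) {
      throw new Error(`Expected an integer between ${min} and ${max}, but received: ${value}`);
    }
    return check; // enable chaining
  },
};

const calculatePassAttemptsPerGame = (passAttempts = 0, gamesPlayed = 0) => {
  check.anInteger(passAttempts, 0).anInteger(gamesPlayed, 1);
  return passAttempts / gamesPlayed;
}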



The Arrogance of the Entrenched

So should you reconsider any aversions you have to "defensive programming"?? Well, umm... probably not.

I understand that you probably already have a job where you're paid to program. And in that job, you probably already work with other programmers who set all of their coding ideas in stone years ago. They've already allowed those programming bromides to sink deep into their soul. And if you question any of that, you'll probably be shot down - and quietly scorned.

Don't believe me? Just take a look at the article that I linked to above. There was some nice feedback in the comments. But one, umm... "gentleman" decided to respond with nothing but: "Yuck..."

That's it. No constructive feedback. No rational logic. Just: "Yuck..."

And that is basically what soooo much of programming comes down to these days. You could develop a way to do nuclear fusion merely by writing JavaScript code. But someone will come along, with no additional explanation, and just say, "Yuck..."

So... I get it. I really do. Keep writing your TS. And your copious tests. And keep refusing to validate your function inputs. Because that would be "defensive programming". And defensive programming is bad, mmmmkay????

And I'll keep writing applications that are more fault-tolerant, with fewer lines of code.

Top comments (26)

Valts Liepiņš

As for the safety of external input in TypeScript case, I want to suggest a third option - parsing the data.

There are some great articles on validating vs parsing. Validation is like poking the data at certain points of the application to see if it seems alright, but that comes with the drawback of possibly not checking enough, or checking already-validated data over again.

Parsing, on the other hand, is like taking the data and transforming it into a concrete form that other functions can work with. If the data cannot conform to this form, it's erroneous data. Once the data is transformed into this specific datatype, it's safe to assume that it will be correct, since you have ended up describing it with a concrete type, and the type checker can ensure that it will be treated properly by the rest of the code.

Specifically, in TypeScript's case, one would define the acceptable format of the data using an interface. So a parser would be any function that can take raw data (a string, or some other possibly invalid structure) and return an object that fulfills that interface.
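For example, a rough sketch of the idea in plain JS (the names are hypothetical; in TypeScript the return value would be typed with the interface):

// A "parser" in this sense: take raw, untrusted data and either return a
// well-formed player object or refuse by throwing.
const parsePlayer = (raw) => {
  const data = typeof raw === 'string' ? JSON.parse(raw) : raw;
  if (typeof data?.name !== 'string' || !Number.isInteger(data?.passAttempts)) {
    throw new Error('Could not parse a player from the supplied data');
  }
  return { name: data.name, passAttempts: data.passAttempts };
};

// Everything downstream of parsePlayer() can now rely on a known shape.
const player = parsePlayer('{"name": "A. Player", "passAttempts": 300}');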

Sources of inspiration:

Adam Nathaniel Davis

This is quite valid. And I've already noticed a few other comments referring to parsing. I myself have actually gone quite a ways down this road in some of my previous projects, and I think there's a definite time-and-place for this. But I don't think it's a use-everywhere kind of solution.

In TS, you can do this by creating classes that will, essentially, validate the data. And if the data doesn't conform, it throws an error or populates some kind of error state. Then you force the function to require an instance of that class. This is useful - but it still runs into a few problems:

  1. Like everything with TS, it's useless (nonexistent) at runtime.
  2. Dealing with the object that holds our value can sometimes be more annoying than simply having direct access to that value.

Granted, these aren't "deal breakers". But they're something to think about when considering the "parse in TS" approach.

Valts Liepiņš

I'm coming at this idea from Haskell, so I wanted to try expressing it in TypeScript.
This is what I ended up with:

As for runtime safety, it's really up to how well one uses the type system to enforce valid system state. The main constraint here is how expressive the type system is. I can't personally comment on the limits of TypeScript, but from my little codepen experiment, it seems to be pretty capable.

Kimmo Sääskilahti

Thanks for the great article! One thing that came to mind is that in the great Pragmatic Programmer book, the authors say your code should always fail early. So if something unexpected happens, one shouldn't just accept it but raise an exception. Could your example of using allow be interpreted as following their "use asserts in production" advice? What do you think?

For run-time type checking in TS, have you taken a look at the io-ts library? It's nice when the user or server inputs are complex.

Adam Nathaniel Davis

First - yes! "Using asserts in production" is absolutely another way of describing what I'm trying to get across!

And second, no - I haven't checked out io-ts. But I appreciate the heads-up, and I'll definitely give it a look!

Kimmo Sääskilahti

Cool! I also can't resist linking the great Parse, don't validate blog post by Alexis King, though it's most directly concerned with strongly typed languages like Haskell.

Valts Liepiņš

I referenced the exact same blog post in my response!

You might be interested in seeing my attempt at applying those ideas to TypeScript:
dev.to/cipharius/comment/15im8

Mike Bybee

Upon reading the title, I had a feeling TypeScript would come into this, since most of the derogatory mentions of "defensive programming" I've heard (and most utterances of "comments are a code smell") have come from those defending TS against my critiques of it.

Adam Nathaniel Davis

I actually believe that "comments are a code smell". But that's a topic for another article...

Mike Bybee

I think comments can get out of hand, but it's good to comment functions, their params/types/etc., and their returns. Such use of JSDoc with linting (and ironically even ts-lint in VS Code) makes the most common argument for TS a moot point, and JSDoc can be just as handy for generating code docs as javadoc and the other similar tools that inspired it. It is especially rich, however, to hear "code smell" uttered by those who tack on mountains of extra nonstandard syntax just to hack JS into behaving like another language (and do so using a language with a typeof keyword that switches from its own context to JS context depending on where it's written).

Adam Nathaniel Davis

I think we're basically in agreement. I don't know if you saw it, but I basically wrote a whole article talking about how JSDoc is basically... TS. (dev.to/bytebodger/a-jsdoc-in-types...)

So, if your comments are essentially a way of providing type hinting for your IDE (like JSDoc), then yeah, I get that. But most "traditional" comments - the ones where you think you're telling me what your code does - are a "code smell". If I have to read the comments to understand your code, it's crappy code.

Mike Bybee

Mostly. I think a landmark here and there can be helpful, especially for future refactors. Sure, one could always ask, "Why aren't you just abstracting it now?" but I think we've both seen enough premature abstractions and lost edge cases to know better.

Daniel Ziltener

FYI, when a function checks its inputs before processing (and potentially its outputs before returning them), those checks are called contracts.

And user inputs... yeah, checking those isn't "defensive programming" - that's just common sense.

Adam Nathaniel Davis

Totally agree!

huncyrus

Two stories for defensive programming.

1.) Personal story
12 years ago, I wrote a small, check-and-validation-heavy framework in PHP 5 which never trusted anything, paired and hashed everything, and checked everything, always. The service is still running on it, and I haven't touched the code at all in the previous 5 years. So there are a bunch of things in it that are deprecated as of PHP 7. The service gets approximately 8-10 million bot inquiries (DDoS, probes, typical injections, and generic attacks) per year, and because of the heavy security solutions, the service is still up and running and has had zero downtime and zero breaches (which I am proud of).

2.) Company story
Many years ago, when I started to work at the company where I still work, I was the only one who designed, planned, and built everything as defensively as possible in C++/JS/PHP. My colleagues did not like it, so they started to remove most of my modules and applications/services. A few years later, GDPR and a security breach hit the company, because of the weakened security. A few quotes:
    • "You have to trust the input that other internal services give you, because we made them!"
    • "You only need SSL/HTTPS for web security!"
    • "You don't need ACL in the portal!"
    • "You cannot breach an HTTPS connection!"
    • "The perfect load for the server is always around 90%!"
    • "We do not need scalability, we just click on the cloud dashboard for more CPU and memory!"
    • "You do not have to double-sanitize logged-in users' input, we trust our partners after they log in!"

So now everything is on fire, and the company will hire a DevOps/SysOps and security expert company for a huge amount of money to review everything and point out how to fix the infrastructure and the code.
Except my parts. The ones they did not touch are still safe, not compromised, and have been kept updated by going through OWASP and other security lists (since there are a few others as well).

Adam Nathaniel Davis

Awesome stories! And this hits upon a point that I didn't really address in my article. Mainly, that when you do those "acid tests" on inputs, it doesn't just make your code sturdier in the short term. That bulletproof code tends to stay bulletproof for a very long time.

I've seen some very old, very stable mainframe code - the kinda code that was written decades ago and is still running. And that code tends to use a lot of these voracious ("defensive") techniques.

Basti Ortiz

Glad to see this discussion again. I agree with your overall thesis, but I propose a slight adjustment to the assertion that all functions must be treated as their own separate programs.

When every single function treats its inputs as "hostile" entities, I'd be remiss if I didn't mention the performance implications of multiple redundant validations throughout the app.

Personally, this is the main reason why I've been reluctant to validate every single input to every single function. It's like an "itch" of sorts. You can argue that the costs are negligible, but they indeed exist.

Hence my slight adjustment to your assertion. I propose that all validations must be done within a central area of the app, preferably the user-facing side. That way, all defensive tactics may be employed at the site of external input. After this point, one can use TypeScript (for example) to uphold the "contracts" between the central input validator and the application logic.

Moreover, this allows us to focus our validation-based integration tests on that specific area of code. The rest of the codebase can sleep well on the assumption that the central input validator has done its job well. If not, then we only need to worry about changing the code for the central validator. Rinse and repeat until the application is "robust".

To cite an analogy, one can imagine a bouncer at a night club. The bouncer is responsible for "validating" all guests from outside. Once they've been "validated", the internal structure (i.e. the night club) can service the guest on the assumption that the bouncer has done their job well. No validation redundancy required.

In your example in the article, we can apply this technique by creating a class for player statistics. All input would be validated in the constructor. Once all of the assumptions have been validated, then the methods could sleep well on the assumption that nothing can go wrong with the initialized class properties.
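Something like this, roughly (hypothetical names, just to sketch the idea):

class PlayerStats {
  constructor(passAttempts, gamesPlayed) {
    // The "bouncer": all validation happens once, at the door.
    if (!Number.isInteger(passAttempts) || passAttempts < 0) {
      throw new Error('passAttempts must be a non-negative integer');
    }
    if (!Number.isInteger(gamesPlayed) || gamesPlayed < 1) {
      throw new Error('gamesPlayed must be a positive integer');
    }
    this.passAttempts = passAttempts;
    this.gamesPlayed = gamesPlayed;
  }

  attemptsPerGame() {
    // No re-validation needed: nothing invalid gets past the constructor.
    return this.passAttempts / this.gamesPlayed;
  }
}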

Basically, what I'm trying to say is that I wouldn't advocate for an extremely defensive paradigm. We can have the best of both worlds simply by delegating and centralizing all validation logic in the application so that the validation overhead cost is only paid once, preferably at the exact site/s of external input.

Adam Nathaniel Davis

Thanks for the thoughtful reply. And I'm about 99% in agreement with you. I do concur that, if external inputs are properly vetted (by the "bouncer"), then TS multiplies in usefulness.

The only part of your response I'd quibble over is the concern over performance.

Obviously, every single LoC is, on some level, a "hit" to performance. So I'm not going to tell you validating, in the way I've shown above, has absolutely no impact. But such concerns almost always fall under the category of micro-optimizations.

I see these same kind of discussions from those who want to bicker over how much faster a for () loop is compared to an Array.prototype function. (Not saying that you're that person - just saying that these performance "concerns" can lead down the same path.) The simple fact is that the vast majority of all applications - even large, heavily-used applications - will never have the slightest need to worry about for-vs-Array.prototype, or defensive-vs-bouncer validation. And if the app does need to focus on such minute optimizations, the programmers could probably achieve much greater gains by focusing on much larger pieces of the application's general flow.

Nevertheless, none of that is meant as any kind of "rebuttal" against what you've written. You make excellent points.

Basti Ortiz

That is definitely true. I must concede that it does sound a bit like premature optimization on my part. It's really just that "itch" to squeeze out every single CPU cycle, you know? 😅

Adam Nathaniel Davis

Oh, yeah. I'm with you. And I certainly don't speak about "micro-optimizations" in an accusatory manner. I've gone wayyyyy down that road - too many times. We've all been there. Once you start trying to count those CPU cycles, it quickly becomes a bit of an obsession!

theScottyJam

It sort of sounds like if typescript automatically provided runtime assertions with each function, then half of your concerns would be gone (though you still have to do manual checks for things such as distinguishing positive from negative numbers).

All of your arguments in this post seem to hang on the idea that "Every single function is a program", and I'm going to disagree with that root argument.

When you mean every, do you really mean every?

What about functions inside functions?

export function doStuff() {
...
// I assume this function doesn't need to be validated
const result = myArray.map(user => user.name)
...
// nor does this
const extractName = user => user.name
const result2 = myArray.map(extractName)
}

ok, that was a bit silly. But what if we moved them outside? Is our program that much more fragile now?

const extractName = user => user.name

export function doStuff() {
...
const result2 = myArray.map(extractName)
}

All we've done is move the function to a place where anything in this module can call it, instead of only what's inside the doStuff() function. The amount of danger this poses depends on how big the module is - if this module only contained the doStuff() function and the helpers that we pulled out from it, then there's no more danger having them outside the function than inside. It's unclear to me whether you find non-param-validating private functions like this bad or not, so let's take it a step further.

Let's say our module name was _shares.js, and the developers on the team understood this naming convention to mean "don't touch this unless you're inside this package" (or maybe we're working in a language like Java which actually has package-private access-modifiers). And now we start exporting our extractName function for the package to use. How bad this is depends on the size of the package. Having this exported utility function in a really small package is less risky than keeping it private within a ginormous module, so a rule like "all exported functions should validate their params" is a bit of an arbitrary boundary.

We can take it to the next step and make it a public export of an internal company library, or another step to make it a public export for the general public.

In all of these steps, the only thing we're doing is changing how many places have access to this function - the more places, the riskier it is to not validate its inputs. So claiming that "all functions should be stand-alone programs" sounds nice in theory, but in practice, no one's going to add parameter validation to every single function (like the anonymous functions used in .map()), and there's no clear cut way to define the line of when it should and shouldn't happen.

And what's the disadvantage to not validating parameters? Bugs in your program are harder to track down.

I guess what I'm getting at is that there's a balance. if few things call your function (which is the story of most functions in a project), then it's better to keep it lean and focus on readability over helpful errors. As its usage grows, people can always retroactively make improvements to it. If your function gets used all over the place, then put some more helpful errors. In some cases, you might have to add a fair amount of code to really generate good error message for the number of ways people may mess up (even using a library like yours), and you might need to write longer, more explanatory error messages too - that kind of verbosity just isn't appropriate in every single function ever.

Other places that benefit from parameter validation include:

  • Some forms of string interpolation (e.g. generating HTML, or an SQL query - these are attack surfaces, and should be heavily validated).
  • Code that's making persistent changes (e.g. database writes), where it's preferable to have errors happen in advance so that it doesn't leave things in a permanent bad state. In other words, a chunk of code that needs to perform a single, reliable atomic operation using multiple steps.
  • I'm sure other specific scenarios exist too.
Blaine Osepchuk

The amount of defensiveness I employ depends on the context.

For example, little utility scripts I write for my own one-time use see very little defensive programming. It's usually unnecessary. If I encounter an error when I run the script, I'll fix it immediately and continue until it runs successfully and then delete the script.

Whereas extremely complicated, large, long running, multi-programmer, multi-million dollar efforts see extensive defensive programming (especially in the critically important code paths). My experience in these kinds of projects is that the cost of adding defensive programming to the code is easily recovered by the time and frustration I save in testing, responding to bug reports, debugging, patching, and so forth.

So the key, in my opinion, is to know how to program defensively and then constantly evaluate the context of the code I'm writing and apply the right amount of defensiveness for that context. This last point may be where some of the disagreement is coming from in the discussions of this issue. People are working on many different kinds of projects, and what's appropriate for one project isn't necessarily appropriate for another.

Adam Nathaniel Davis

Couldn't agree more.

One of the common threads in my articles is my hatred for universal (mindless) rules - for the sake of rules. So even though I generally advocate for defensive programming - as a default - I would HATE for anyone to adopt it as a mindless dictate. There are absolutely times when defensive programming is simply a waste of effort.

Even the most basic, common, universally-accepted rules of programming should still be secondary to common sense. And defensive programming is no exception.

I just get annoyed when people choose to paint "defensive programming" as a known bad, merely because they can't be bothered to do the work that's necessary to properly validate their programs' functions.

Julien Bouvet

Very very nice article.

I think it's a matter of being pragmatic. Defensive programming saves time on debugging, on security fixes, etc.

It's only a matter of implementation details to provide a simple and clear way to do so.
Your allow is a very good example, and a lot of ways exist to do the same.
I used to implement validator classes in C#, and they were easy to use, crystal clear, and, in the end, had a very limited impact on performance (especially in web tech, where those checks, if properly implemented, represent a marginal cost compared to page generation, image loading, etc.).

Thanks a lot for sharing this with us all :) I will look into your other articles, you just got a new follower :p

Adam Nathaniel Davis

Thank you for the feedback! And, indeed, I agree with pragmatism. I'm not going to "code shame" anyone because they haven't validated every single input on every single function/method/etc. But I do look askance at anyone who presumes that any such validation represents the "dreaded" defensive programming.

Adam Nathaniel Davis

Thank you!