Working on a defect in Leaf, I had a question: should function arguments be reassignable within a function? Are they just like local variables, or should they be treated specially? Making them special would solve my problem in the Leaf compiler, but I don't like making decisions for technical convenience. What is the correct answer?
This is an open question and I'd love to hear your feedback. This article lays out the details and my viewpoint, but I don't reach a conclusion.
Imperative approach
In the languages rooted in imperative programming, like C, C++, Java, and C#, we can freely use function arguments as local variables.
int calc( int a, int b ) {
    a += b;
    b += a;
    return a;
}
I admit that I often write code that does this. It's convenient to reuse an existing variable rather than introducing a new one. I realize it has also gotten me into trouble before, when one piece of code modifies the arguments but code later in the function wasn't expecting that.
float calc( float a, int b ) {
    float result = a;
    if (some_condition_on(a)) {
        b /= 5; //modifies the argument...
        result += b;
    }
    if (some_condition_on(b)) {
        result = alt_calc(a,b); //...which this code may not expect
    }
    return result;
}
Though contrived, it shows that a second section of the code in the function may be relying on unmodified arguments. It's a subtle defect as it requires both conditionals to evaluate to true. Add in more branches that may or may not modify the arguments, and the problem intensifies.
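One way to surface this kind of defect at compile time in C++ is to mark the by-value parameters const. The following is only a sketch: some_condition_on and alt_calc are hypothetical stand-ins with made-up bodies, and scaled_b is a name introduced here for illustration.

```cpp
// Hypothetical stand-ins for the condition and the alternate calculation.
bool some_condition_on(float x) { return x > 0; }
float alt_calc(float a, int b) { return a * b; }

// Const by-value parameters turn any in-place change into a compile error.
float calc(const float a, const int b) {
    float result = a;
    if (some_condition_on(a)) {
        // b /= 5;                   // would not compile: b is const
        const int scaled_b = b / 5;  // a new name leaves the original b intact
        result += scaled_b;
    }
    if (some_condition_on(b)) {
        result = alt_calc(a, b);     // sees the caller's original b
    }
    return result;
}
```

With these stand-ins, calc(2.0f, 10) takes both branches and the second branch works with the unmodified b. The const here is purely a local discipline: it doesn't change how callers see the function.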
In JavaScript, the situation is worse. If I modify a named argument, it also modifies the arguments array (at least in non-strict code).
function hidden_arg( name ) {
    name = "weird"
    console.log(arguments[0])
}
hidden_arg("expected")
That writes weird, not expected.
Functional approach
If we look to a language like Haskell, we see that reassigning variables is, in general, frowned upon (is it even possible?). It's not something fundamental to functional programming though: whether a function reassigns an argument doesn't affect the purity of that function.
A function could, however, modify the value of an argument, and that would certainly ruin the immutability requirement.
This got me thinking that perhaps the requirement should go even further: arguments should also be read-only by default. Consider the code below, where the "values" name is not reassignable (C and C++ are among the few languages where this notation is even possible):
//this prevents reassigning the "values" pointer...
float calc( vector<float> * const values ) {
    (*values)[0] = 1; //...but we can still modify the values
    ...
}
What if the default were also to make everything read-only? (This is the typical C++ syntax for how that is done)
float calc( vector<float> const & values ) {
    values[0] = 1; //error!
    ...
}
This function has a much safer signature. I can call it without worrying that my vector might be accidentally changed on me.
I guess it's unavoidable for this discussion to get deeper into the difference between a name and a value.
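C++ pointers happen to make the two kinds of protection visible separately, since const can apply to the pointer (the name), to the pointee (the value), or to both. A minimal sketch, with function names invented purely for illustration:

```cpp
#include <vector>

// The pointer is const: the name cannot be re-pointed, but the vector can change.
void fixed_name(std::vector<float>* const values) {
    (*values)[0] = 1.0f;   // allowed: modifies the caller's vector (assumed non-empty)
    // values = nullptr;   // error: the name is not reassignable
}

// The pointee is const: the name can be re-pointed, but the vector is read-only.
float fixed_value(const std::vector<float>* values,
                  const std::vector<float>* fallback) {
    if (values->empty()) {
        values = fallback;   // allowed: rebinding the name
    }
    // (*values)[0] = 1.0f;  // error: cannot modify through a pointer to const
    return values->empty() ? 0.0f : values->front();
}

// Both are const: neither rebinding nor modification is possible.
float fixed_both(const std::vector<float>* const values) {
    return values->empty() ? 0.0f : values->front();
}
```

Only the first form changes anything the caller can observe. The other two differ just in what the function body is allowed to do with its own name.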
The default, but not a hard requirement
I'm starting to think that non-reassignable and read-only should be the default. If I want a mutable argument, I can mark it.
float sort( vector<float> mutable & values )
For complex value types that makes a lot of sense. But for service types, like say a file or window handle, it would be inconvenient. At least in Leaf, I have a distinct service type, which could be mutable by default instead. I don't like inconsistency, but sometimes it has to be sacrificed for convenience.
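C++ has no mutable marker for parameters, but a rough approximation of the read-only default with opt-in mutation is to take a const reference everywhere and reserve the non-const reference for functions that announce they will modify their argument. The function names below are assumptions made for this sketch:

```cpp
#include <algorithm>
#include <vector>

// Read-only: the signature promises the caller's vector is left untouched.
float largest(const std::vector<float>& values) {
    return values.empty() ? 0.0f
                          : *std::max_element(values.begin(), values.end());
}

// Mutable by request: the non-const reference plays the role of the marker.
void sort_in_place(std::vector<float>& values) {
    std::sort(values.begin(), values.end());
}
```

The difference from the proposal above is the default: in C++ the author has to remember to write const, whereas a read-only default would make the safe form the one you get by writing nothing.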
Another situation that gives me pause is argument sanitization. For example:
float calc( float a, float b ) {
if (b < 0) {
a = -a;
b = -b;
}
...
}
In this situation, we don't want the remainder of the function to have access to the original arguments. They're intentionally hidden. Cleaning arguments may not be common, but I do it often enough that I'd need to have a solution for it. Perhaps hiding the arguments by a local variable of the same name might work.
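If arguments were not reassignable, the sanitization could be written by binding the cleaned values to new const locals. C++ does not allow a local in the function's outermost block to reuse a parameter name, so this sketch renames the parameters instead. The raw_ prefix, the calc_sanitized name, and the placeholder return value are all assumptions made for illustration:

```cpp
// The parameters carry a raw_ prefix; the cleaned values get the short names.
float calc_sanitized(const float raw_a, const float raw_b) {
    const bool flip = raw_b < 0;
    const float a = flip ? -raw_a : raw_a;
    const float b = flip ? -raw_b : raw_b;
    // From here on only a and b should be used; b is never negative.
    return a * b;  // placeholder standing in for the rest of the function
}
```

This keeps every name single-assignment, but the raw values stay reachable for the rest of the function, so hiding them is left to convention rather than enforced by the compiler.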
Your thoughts?
I'm undecided on what the correct solution is. Current languages and best practices don't appear to give a definite answer yet. This makes it one of those engaging topics in language design.
It's part of my adventure in writing Leaf. I'd be happy to hear your thoughts on the topic as well.
Top comments (13)
Personally, I prefer the functional style. I like it because it sets you up for parallelism without a lot of code changes. I also like that it's declarative, rather than imperative. I'm also wary of mutability in general.
In Rust, they have the mut keyword. Things are immutable by default and you have to explicitly mark something mutable.
Equivalent signatures from Rust are as follows:
And taking them by reference:
For sanitization, you could use variable shadowing inside blocks to explicitly define the scope of the shadowed variable:
Or, even better, you could use pattern matching:
With pattern matching and guard clauses, you could make all of your validation clean and have it cover all of the variants without having lots of nested constructs (like loops and conditions).
Pattern matching is super-powerful. Here's a fizzbuzz example using match:
Thanks for pointing out Rust's immutable by default. It makes me more comfortable taking the same approach. I handle references and shared values a bit differently, but it appears to be an orthogonal concept.
Yes, variable shadowing is definitely an option. The one danger it opens is last-shadowed variables, where something early uses the original name, and something later the new name, but both in the same scope.
Pattern matching looks like a clean approach. I don't always like creating separate functions, but I could always use a local function definition, or combine it with lambdas in the simple cases.
I like languages where you can specify and communicate your intention (using pointers, or in your example, mutable == I will modify your value). If I had to choose, I would clearly make read-only the default: side effects are the root cause of many bugs and are not intuitive in most cases (a function's effect is its return value, not a modification of the input data).
Function parameters should never be re-assignable from within the function. It's just unclean. If the language allows you to assign a default value in case of the absence of a parameter, that's fine, but once the function context opens and the first statement is placed, they should be locked down tight. Don't give people needless room for error.
I think I've been thoroughly convinced this is the right direction.
It still leaves open how to handle argument sanitization, or prep-work. But I think that can be handled by convention, either by hiding, or different variable names.
I prefer Rust's approach of immutability by default, which also applies to function arguments: you have to add mut to be able to change it in the function.
I think I'll be going this way as well now -- I don't have the immutable feature yet, it's on my infinite todo list.
As to normal variables, it'll be easy to introduce, since declarations now use var, implying variable, thus mutable, and functions use defn, implying a literal function. I can also add a let which denotes an immutable value.
I agree that function arguments should be immutable; it's the safer option, but it's true there's no definitive consensus on this.
Languages that have objects and "pass by reference" allow you to modify an object passed as an argument, and with that they also use less memory.
If you have a complex object in memory you can make a function setTheseRelatedFields(object) which might or might not return anything but which operates on the given object by address instead of making a copy for the function.
The other side of the coin is that sometimes it leads to unexpected consequences (side effects).
I usually have a "semi functional" approach even in languages that are not functional, the code tends to be easier to test and read.
So, if I were to design functions I would make the arguments read only BUT with an option to mutate the passed variable (which is indeed a label standing in for something, be it an integer or a hash).
If the argument is a "simple" value, reassigning it leads to an error; if the argument is an array, changing one of the items leads to an error; the same for hashes and so on.
I wonder how much the lack of consensus rests on history though. For languages in the C family it'd be kind of shocking if a new one had different semantics.
But I don't want history holding me back. It does seem the consensus on a "safe default" would be immutable and not-rebindable. It's only a default of course, and there must be a way to mark arguments as mutable.
IMO,
Short answer: no,
Long answer: noooooooooooooooooooooooooooooooooooo,
I only do this on recursive functions in the internal implementation, as an output argument. And I document this as an exception so everybody knows that's not the default way to go, but it was necessary in that case.
EDIT:
Oops, after reading the post in detail, I saw you were asking about how to make the syntax easier in your language.
I'd just say default is not mutable, and if it mutates, I'd either mark it with out or mutable as you said. I always think of mutable arguments as output arguments, a way to give a function an "empty" box (with a "particular shape", if it's not a primitive value, like an instance of a class) that the function is gonna fill for me.
At work, one of our teams is designing a language for some specific use cases, and one of our top goals is "safety" - many of the users in-house will actually be graphics designers and content developers, neither of whom have extensive programming experience. I've brought this article up to the team for consideration, but I think I agree with your assessment: immutable-by-default helps prevent some rather nasty logic errors.
In languages that manage memory for you (e.g. Java, or in general garbage collected runtimes), even if imperative, it's better to leave parameters as they are: the compiler may want to do some optimizations on their passing, and won't be able to if you modify them.
If you have a functional language, those optimizations are probably the norm, and you won't be able to do it at all (thus, you have one less problem to worry about).
If you have a runtime where you manage the memory directly (C, C++, or in embedded/IoT situations) there may be some circumstances where you're better off mutating your parameters; but you should be able to recognize them and use them correctly.
Otherwise, it is much better to leave your parameters alone: that makes the code much easier to reason about.
Part of my motivation for the question was a defect in the Leaf compiler (my language). The fix is relatively simple, but it does imply an efficiency loss for calling functions.
As you say, if I limit what can be done with arguments by default, I gain a lot of flexibility in the compiler.