It's worth learning in order to fully appreciate the wonders of undefined behavior.
Consider the following C program -- what does it do?
This is one of the reasons I write prototypes and tests. I'll try it.
In that case, I think you missed the point -- but I look forward to explaining why your results are wrong. :)
It's funny. First it uses i, then increments i, then uses i as the second parameter.
The result is the same:
Your results are wrong. :)
They're wrong because they show how your implementation happened to implement this undefined behavior, this time; they don't reflect how C works.
C programs are understood in terms of the CAM (C Abstract Machine).
The compiler's job is to build a program that produces the same output as the CAM would for a given program.
The CAM says that a variable can only be read, or read-to-modify, once between two sequence points.
There are no sequence points between the i++ and i+1, so this produces a read/write conflict, which means that the program has undefined behavior in the CAM, and so the compiler can do whatever it wants.
It could crash, or print out 23, 37 or -9, 12, and these would all be equally correct behaviors.
Print 1 and then 2? Genuinely curious where is the undefined behaviour? :)
Ah, I see it now. There is no guarantee the increment will happen before the print. Only before the next sequence point!
The increment must happen before the print, as there is a sequence point between the evaluation of the arguments and the call.
But there are no sequence points between the evaluations of the arguments.
This leads to undefined behavior, per the standard's rule that "Between two sequence points, an object is modified more than once, or is modified and the prior value is read other than to determine the value to be stored."
Thanks for clarifying! That makes more sense.
This isn't quite "undefined behaviour", just weird syntax and one of those moments when you ought to know operator precedence and evaluation order, which is pretty much the same in every language (in some languages with dialects or multiple compilers it may just be more apparent).
Undefined behaviour would be something along the lines of:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <err.h>

int main(void)
{
    const size_t size = 1024 * 1024;
    char *data = malloc(size);
    if (!data) {
        err(1, "malloc");
    }
    // replace with assert if you don't have err.h from libbsd
    memset(data, 0, size);    // write zeroes
    free(data);
    memset(data, 0xff, size); // write ones
    return 0;
}
Interesting. I didn't think it was either, so I read up on it: in most cases the compiler will handle it as you expect, but it doesn't have to according to the spec, which is why it's undefined?
There is no guarantee in the specification for C that the increment of i will be done when you use it as the third argument to printf(). So you could reasonably get 1, 1? I may well have misunderstood though!
I think you're imagining that the operations occur in an unspecified order, as would be the case for

foo(a(), b());
There is a sequence point when a call is executed, so a() and b() occur in some distinct, if unspecified, order.
The program will not have undefined behavior, though it may have unspecified behavior (if it depends on the order of those calls); we can continue to reason about the C Abstract Machine in both cases. Now consider

foo(i, i++);
There is no sequence point between i and i++, so they occur at the same time, leading to a violation of the C Abstract Machine, producing undefined behavior.
We cannot reason about the program from this point onward.
It's undefined behavior of the kind described by the standard: "Between two sequence points, an object is modified more than once, or is modified and the prior value is read other than to determine the value to be stored."
This happens because there is no sequence point between the i++ and the i.
Precedence doesn't come into this.
Here's a more interesting variation on your example.
Can you spot the undefined behavior here? :)

#include <stdlib.h>

int main() {
    char *data = malloc(1);
    if (data) {
        free(data++);
        data++;
    }
}
Ah, I see.
So C literally doesn't define any order on those instructions and it's up to the compiler?
Wouldn't have expected that, though I've seen the example a few times.
Excuse my hasty assumption then please.
First off, I'd really appreciate it if you specified the syntax in the code blocks so syntax highlighting kicks in ;-)
Something along these lines (without the backslashes; this markdown dialect doesn't allow nested fences):

\`\`\`c
int main(void)
{
    return 0;
}
\`\`\`
Actually no I can't see the undefined behaviour in that example.
If I see correctly, in all cases you're only manipulating the pointer. Since free takes the pointer by value, not by reference, it gets a copy of data from before the increment, and the pointer then moves along twice afterwards; either way the pointer ends up invalid.
What am I missing?
Pointers are only well defined for null pointer values or when pointing into or one past the end of an allocated array.
The first increment satisfies this, since it happens before the free occurs.
After the free, the pointer value is undefined and so the second increment has undefined behavior.
But you're not actually using that pointer in the code, so I fail to see how that's undefined behaviour.
An invalid pointer which isn't used still doesn't cause any runtime issues, or is there something about that too in the standards?
The last increment happens when the pointer already holds an indeterminate value, producing undefined behavior.
For example it might behave like a trap representation.
Regardless, the program cannot be reasoned about after this point. :)
This is the same in any language, there are many ways to do something, not all of them are advised.
Understand the language, learn the best practices and fundamentally write decent code.
I realise this is easier said than done, but it is the golden principle we should be adhering to in our products.
But keep in mind that they are not a dogma and should be broken if there is a significant reason for that.
Agreed, if you can justify the need to do something with a particular unconventional approach, then go for it.
BUT it must be well documented, with the emphasis on how it works and that changes must be carefully considered.
Again, it depends. For example, I'm currently working on a personal project (not C but Java). Among the goals of this project is the search for a new style of writing code. I often rewrite code several times to make it easier to read and/or more reliable. When I get code that looks satisfactory, I often discover that it violates one or more Sonar rules (i.e. "best practices"). In the vast majority of cases the considerations behind those rules are no longer valid because the whole approach is different. What I'm trying to say is that "best practices" is a set of compatible rules/guides/considerations, and there might be more than one such set.