Thomas Köppe wrote the proposal P0614R1 to describe a new feature called "Range-based for statements with initializer". This document has been approved as part of the C++20 standard.
The document is pretty straightforward since the feature is quite simple. If you have heard of the if statement with initializer from C++17, then you have probably already guessed what "range-based for statements with initializer" means.
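As a quick refresher, here is a minimal sketch of the C++17 if statement with initializer. The function name lookup and the map it receives are made up for this illustration; only the `if (init; condition)` syntax comes from the standard.

```cpp
#include <map>
#include <string>

// Hypothetical helper, for illustration only: returns the name stored
// under `id`, or "unknown" when the key is absent.
std::string lookup(const std::map<int, std::string>& names, int id) {
    // `it` is declared in the init-statement, so it is visible in the
    // condition and in both branches, but not after the if statement.
    if (auto it = names.find(id); it != names.end()) {
        return it->second;
    } else {
        return "unknown";
    }
}
```

The range-based for loop gets the same treatment in C++20: a declaration in front of the loop header, scoped to the loop.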
To understand this new feature, let's say we want some code that prints all the elements of a collection. The first idea would be to use a range-based for loop. But let's add another requirement: we also want to print the index of each element in the collection. Let's write the code before and after C++20 and compare the two versions.
Before C++20
Using a range-based for loop, the code would look like this:
#include <iostream>
#include <array>
int main() {
    std::array data = {"hello", ",", "world"};
    std::size_t i = 0;
    for (auto& d : data) {
        std::cout << i++ << ' ' << d << '\n';
    }
}
Some people, including me, would argue that i's scope is too large. The variable is still available after the loop, and this may not be desirable. They would then fall back to code like this:
int main() {
    std::array data = {"hello", ",", "world"};
    for (std::size_t i = 0; i < data.size(); ++i) {
        std::cout << i << ' ' << data[i] << '\n';
    }
}
Some people, including me, would argue that this code is more verbose, less generic (since not all collections have a size() member function or support indexing), and expresses its intent less clearly.
After C++20
With C++20, we can now do this:
int main() {
    std::array data = {"hello", ",", "world"};
    for (std::size_t i = 0; auto& d : data) {
        std::cout << i++ << ' ' << d << '\n';
    }
}
Close to perfection 🙂
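To illustrate the point about genericity, here is a small sketch that uses the same loop on a std::forward_list, a container with neither size() nor operator[]. The helper name numbered is invented for this example, and it writes to a string instead of std::cout so the result is easy to inspect.

```cpp
#include <cstddef>
#include <forward_list>
#include <sstream>
#include <string>

// Hypothetical helper, for illustration only: numbers the elements of
// any range, one "index element" pair per line.
template <typename Range>
std::string numbered(const Range& range) {
    std::ostringstream out;
    // `i` is declared in the init-statement: it is scoped to the loop,
    // exactly like `d`.
    for (std::size_t i = 0; const auto& d : range) {
        out << i++ << ' ' << d << '\n';
    }
    return out.str();
}
```

Calling numbered(std::forward_list<std::string>{"hello", ",", "world"}) should give the same three numbered lines as the std::array version, without relying on any member function other than begin() and end().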
A more complex case
The proposal also describes a case of undefined behavior that we are likely to run into when we try to reduce the scope of variables used with range-based for loops. Let's consider this code:
#include <iostream>
#include <vector>
class Foo {
public:
    const auto& items() {
        return data;
    }
private:
    std::vector<const char*> data{"hello", ",", "world"};
};
Foo getFoo() {
    return Foo();
}
int main() {
    for (auto& d : getFoo().items()) {
        std::cout << d << '\n';
    }
}
Something is wrong with this code. Can you guess what?
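getFoo().items() returns a reference to a member of a temporary Foo. That temporary is destroyed before the loop body runs, so the loop iterates over a dangling reference.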
We can run this code through C++ Insights to see how the loop is expanded by the compiler:
const std::vector<const char*, std::allocator<const char*>>& __range1 = getFoo().items();
__gnu_cxx::__normal_iterator<const char* const*, std::vector<const char*, std::allocator<const char*>>> __begin1 = __range1.begin();
__gnu_cxx::__normal_iterator<const char* const*, std::vector<const char*, std::allocator<const char*>>> __end1 = __range1.end();
for (; __gnu_cxx::operator!=(__begin1, __end1); __begin1.operator++()) {
    const char* const& d = __begin1.operator*();
    std::operator<<(std::operator<<(std::cout, d), '\n');
}
- A reference to the vector is saved as __range1.
- Iterators are created from this reference.
- The loop is performed using these iterators.
We can see that the object returned by getFoo() and the vector it contains don't survive the first line. Hence, __range1 is a dangling reference.
I have to be honest: it took me some time to understand the problem here, so I guess I would have run into this UB in a real-life situation... 😒
With C++20, we can write this UB-free code with tight scope:
int main() {
    for (auto foo = getFoo(); auto& d : foo.items()) {
        std::cout << d << '\n';
    }
}
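To make the two lifetimes visible, here is a sketch with an instrumented stand-in for Foo. The Traced class, the log vector and both function names are assumptions made up for this illustration. The first function mirrors the first line of the expanded loop but deliberately never reads the reference, so it stays free of UB; a recent compiler may still warn about the dangling reference, which is exactly the point.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for Foo: records its construction and
// destruction in a log owned by the caller.
class Traced {
public:
    explicit Traced(std::vector<std::string>& log) : log_(log) { log_.push_back("ctor"); }
    Traced(const Traced&) = delete;  // no copies, so the log stays unambiguous
    ~Traced() { log_.push_back("dtor"); }
    const std::vector<int>& items() const { return data_; }
private:
    std::vector<std::string>& log_;
    std::vector<int> data_{1, 2, 3};
};

// Mirrors the first line of the expanded loop. The temporary is
// destroyed at the end of that statement, before anything else runs.
std::vector<std::string> temporary_lifetime() {
    std::vector<std::string> log;
    const auto& range = Traced(log).items();
    (void)range;  // never read: the reference already dangles here
    log.push_back("after first statement");
    return log;  // ctor, dtor, after first statement
}

// The C++20 form: `t` lives until the end of the for statement.
std::vector<std::string> init_statement_lifetime() {
    std::vector<std::string> log;
    for (auto t = Traced(log); const auto& d : t.items()) {
        log.push_back("item " + std::to_string(d));
    }
    log.push_back("after loop");
    return log;  // ctor, item 1, item 2, item 3, dtor, after loop
}
```

In the first log, "dtor" appears before any element could be visited. In the second, it appears only after the last item, and still before the code that follows the loop, which is the tight scope we were after.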
Conclusion
I thought this article would be simple because adding an init-statement to the range-based for loop is easy to understand. In the end, I discovered a subtle and not-so-easy-to-spot UB.