
Lesley Lai

Posted on • Originally published at lesleylai.info

C++ Lambda Tutorial

Lambda expressions were added to C++ in C++11, and they continue to evolve with each version of the C++ standard. Now a core part of the language, lambda expressions let programmers write anonymous functions in C++. This post describes what a lambda is, shows some basic usage, and outlines the benefits.

Basic Usage

Passing a function as a parameter to customize the behavior of another function is a common task in programming. For example, since the conception of the standard algorithms library, many of the algorithms in <algorithm> have accepted an invocable entity as a callback. However, before C++11, the only kinds of invocable entities in C++ were function pointers and function objects. Both require quite a bit of boilerplate, and this cumbersomeness even impeded the adoption of the standard algorithms library in practice.

In the meantime, many programming languages gained support for anonymous functions. Before C++11, C++ imitated such features through metaprogramming; for example, the Boost C++ libraries provided Boost.Lambda. Those metaprogramming hacks were slow to compile and not performant; moreover, they required more boilerplate than one would want. Thus, in C++11, lambda expressions were added as a language feature. The ISO C++ Standard shows a lambda expression used as the comparator of the sort algorithm. [1]

#include <algorithm>
#include <cmath>

void abssort(float* x, unsigned n) {
    std::sort(x, x + n,
        [](double a, double b) {
            return (std::abs(a) < std::abs(b));
        });
}

Inside the function abssort, we pass a lambda to std::sort as the comparator. We can write a normal function to achieve the same purpose:

#include <algorithm>
#include <cmath>

bool abs_less(double a, double b) {
    return (std::abs(a) < std::abs(b));
}

void abssort(float* x, unsigned n) {
    std::sort(x, x + n, abs_less);
}

We still do not know what the strange [] syntax is for; that is our next topic.

Captures

The above example shows the basic usage of lambdas, but lambdas can do more. The main difference between a lambda and a regular function is that a lambda can "capture" state, and the captured values can then be used inside the lambda body. For example, the function below returns a new vector containing the elements of the old vector that are above a certain number.

#include <algorithm>
#include <iterator>
#include <vector>

// Get a new vector<int> with the elements above a certain number in the old vector
std::vector<int> filter_above(const std::vector<int>& v, int threshold) {
    std::vector<int> result;
    std::copy_if(
      std::begin(v), std::end(v),
      std::back_insert_iterator(result),
      [threshold](int x){return x > threshold;});
    return result;
}

// filter_above(std::vector<int>{0, 1, 2, 4, 8, 16, 32}, 5) == std::vector<int>{8, 16, 32}

The above code captures threshold by value. The [] construct is called a capture clause. There are two kinds of capture: capture by value and capture by reference (marked with &). For example, [x, &y] explicitly captures x by value and y by reference. You can also use a default capture: [=] captures by value every variable from the enclosing scope that the lambda body uses, and [&] captures them by reference.
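
To make the difference between the capture modes concrete, here is a minimal sketch. The three function names are made up for illustration; each one creates a lambda, modifies the captured variables afterwards, and returns what the lambda observes.

```cpp
// Mixed capture list: x is copied when the lambda is created,
// y is referenced.
int mixed_capture_result() {
    int x = 1;
    int y = 1;
    auto sum = [x, &y] { return x + y; };
    x = 10;  // not seen by the lambda, which holds its own copy of x
    y = 10;  // seen by the lambda, which refers to the original y
    return sum();  // 1 (copied x) + 10 (referenced y)
}

// Default capture by value: both variables are copied at creation.
int value_capture_result() {
    int x = 1;
    int y = 1;
    auto sum = [=] { return x + y; };
    x = 10;
    y = 10;
    return sum();  // 1 + 1, the later assignments are invisible
}

// Default capture by reference: both variables are referenced.
int reference_capture_result() {
    int x = 1;
    int y = 1;
    auto sum = [&] { return x + y; };
    x = 10;
    y = 10;
    return sum();  // 10 + 10
}
```

The copy in a by-value capture is taken when the lambda expression is evaluated, not when the lambda is called, which is why the later assignments do not affect it.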

A function that stores an environment is called a closure; almost all modern programming languages support closures. However, in all the languages I know except C++, the capture list is implicit: a closure captures all the bindings from the current environment.

We can mimic the behavior of those languages with a default capture by reference ([&]), which captures every variable in the environment that the lambda uses. However, default captures can be dangerous in C++: they lead to dangling references if the lambda outlives a captured object. For example, we can pass a callback to an asynchronous function:

#include <future>
#include <iostream>
#include <string>

auto greeter() {
    std::string name{"Lesley"};

    return std::async([&](){
        std::cout << "Hello " << name << '\n';
    });
}

The above code has undefined behavior, since name may already be destroyed when the asynchronous operation executes. The rule of thumb is to use default capture by reference only when the lambda is short-lived, for example when passing a lambda to an STL algorithm.

The implicit capture strategy works in garbage-collected languages. Rust gets away with implicit capture because of its borrow checker. In contrast, by requiring the programmer to be explicit about ownership, the C++ approach provides more flexibility than its counterparts in other programming languages.
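
One way to avoid the dangling reference is to capture by value, so that the closure owns a copy of the data it needs. The sketch below is an assumption-laden variant rather than a drop-in fix: make_greeter is a hypothetical name, and it returns the closure itself instead of launching it with std::async, which keeps the example free of threads. The same change of capture, [name] instead of [&], applies to the asynchronous version.

```cpp
#include <string>

// Hypothetical variant of greeter(): returns a closure that outlives
// the local variable it was created from.
inline auto make_greeter() {
    std::string name{"Lesley"};
    // [name] copies the string into the closure object, so the
    // closure stays valid after make_greeter() returns.
    return [name] { return "Hello " + name; };
}
```

Calling make_greeter()() yields the greeting even though the local name has been destroyed by then, because the closure carries its own copy.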

What is a Lambda

We have discussed quite a lot of lambda usage so far. However, curious readers may start to wonder: what exactly is a C++ lambda? Is it a primitive language construct like closures in functional languages? Before I talk about the internals of lambdas, I will first talk about a construct that dates back to the C++98 era: function objects.

Function Object

Function objects are normal objects that can be invoked. They are implemented by overloading a class's operator(). Below is our abs_less example as a function object:

#include <algorithm>
#include <cmath>
class abs_less {
public:  // operator() must be accessible to std::sort
  bool operator()(double a, double b) const {
    return (std::abs(a) < std::abs(b));
  }
};

void abssort(float* x, unsigned n) {
    std::sort(x, x + n, abs_less{});
}

Function objects are more flexible than regular functions because they can store data like ordinary objects. Let us implement the previous filter_above example with a function object:

template <typename T>
class GreaterThan {
public:
  GreaterThan(T threshold): threshold_{threshold} {
  }

  bool operator()(const T& other) const noexcept {
    return other > threshold_;
  }

private:
  T threshold_;
};

std::vector<int> filter_above(const std::vector<int>& v, int threshold) {
    std::vector<int> result;
    std::copy_if(std::begin(v), std::end(v), std::back_insert_iterator(result), GreaterThan{threshold});
    return result;
}

I am using class template argument deduction (CTAD) in this snippet. CTAD is a C++17 feature; in earlier versions, we need to write GreaterThan<int>{threshold} with the template parameter int specified.

Going back to lambda

Lambdas in C++ are syntactic sugar for function objects. In other words, the compiler translates lambda expressions into function objects. Through the amazing C++ Insights website, we can see a desugared version of our abssort example:

#include <algorithm>
#include <cmath>

void abssort(float * x, unsigned int n)
{

  class __lambda_6_9
  {
    public: inline /*constexpr */ bool operator()(double a, double b) const
    {
      return (std::abs(a) < std::abs(b));
    }

    ...
  };

  std::sort(x, x + n, __lambda_6_9{});
}

A lambda expression is merely an object of a compiler-generated local class. Thus, C++ lambdas can do a lot of things that anonymous functions in other languages may not be allowed to do. For example, you can inherit from a lambda's type and keep mutable state inside a lambda, though I haven't found much use for either of them.

The compiler generates the types of lambdas; however, there is no way to refer to such types by name through any standard means in a program. Nonetheless, type inference and templates work as usual for those types. We can also use those types explicitly through decltype. Below is an example from cppreference:
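
For the curious, here is a rough sketch of both tricks, assuming C++17. The names make_counter, Overloaded, and make_describer are illustrative only; the Overloaded helper is a commonly seen pattern, not something the language provides.

```cpp
#include <string>

// Mutable state: count is copied into the closure object, and the
// mutable keyword lets operator() modify that copy between calls.
inline auto make_counter() {
    int count = 0;
    return [count]() mutable { return ++count; };
}

// Inheritance: Overloaded derives from every closure type it is given
// and pulls each operator() into a single overload set.
template <typename... Fs>
struct Overloaded : Fs... {
    using Fs::operator()...;
};

// Deduction guide so that Overloaded{lambda1, lambda2} works in C++17.
template <typename... Fs>
Overloaded(Fs...) -> Overloaded<Fs...>;

inline auto make_describer() {
    return Overloaded{
        [](int) { return std::string{"int"}; },
        [](const std::string&) { return std::string{"string"}; },
    };
}
```

Each call to the counter returns the next integer, and the describer picks the lambda whose parameter type matches its argument through ordinary overload resolution.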

auto f = [](int a, int b) -> int
    {
        return a * b;
    };

decltype(f) g = f;

Such anonymous types are called "Voldemort types" in the world of C++ and the D programming language, because they cannot be named directly but code can still use them.
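
A place where decltype on a lambda comes up in practice is a template parameter that expects a comparator type. The following is a small sketch assuming C++17; sorted_by_abs is a hypothetical helper.

```cpp
#include <cstdlib>
#include <set>
#include <vector>

// Returns the values ordered by absolute value, using a lambda as the
// comparator of a std::set.
inline std::vector<int> sorted_by_abs(const std::vector<int>& values) {
    auto abs_less = [](int a, int b) { return std::abs(a) < std::abs(b); };

    // The closure type has no name we can spell, so decltype supplies
    // it. Before C++20 closure types are not default-constructible, so
    // the comparator object is also passed to the constructor.
    std::set<int, decltype(abs_less)> ordered(
        values.begin(), values.end(), abs_less);

    return std::vector<int>(ordered.begin(), ordered.end());
}
```

Note that the set treats two values with the same absolute value as equivalent, so only one of them is kept.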

Capture with an initializer

Now that you understand that a lambda is a function object, you may expect lambdas to store arbitrary values, not just values captured from their local scope. Fortunately, since C++14, a lambda can introduce new variables for use in its body by means of capture with an initializer. [2]

[x = 1]{ return x; }  // returns 1 when invoked
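
As a slightly larger sketch of the same feature, the hypothetical make_tagger below returns a closure that owns a string which never existed as a separate local variable.

```cpp
#include <string>

// Hypothetical helper: prefix is created directly inside the closure
// by the initializer in the capture clause.
inline auto make_tagger() {
    return [prefix = std::string{"[info] "}](const std::string& message) {
        return prefix + message;
    };
}
```

The type of prefix is deduced from its initializer, in the same way as a variable declared with auto.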

Move capture

Rust lambdas can take ownership of values in the environment. C++ lambdas have no special syntax for such move capture, but the generalized capture introduced in C++14 covers this use case:

// a unique_ptr is move-only
auto u = make_unique<some_type>(
  some, parameters
);
// move the unique_ptr into the lambda
go.run( [u=move(u)] {
  do_something_with(u);
});
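
The snippet above relies on names such as go.run and some_type that are not defined here. A self-contained sketch of the same idea, with the hypothetical helper make_value_reader, might look like this:

```cpp
#include <memory>
#include <utility>

// Hypothetical helper: the unique_ptr is moved into the closure, which
// becomes the sole owner of the heap-allocated int.
inline auto make_value_reader() {
    auto value = std::make_unique<int>(42);
    return [value = std::move(value)] { return *value; };
}
```

Because the closure holds a unique_ptr, it is itself move-only: it can be returned or moved around, but it cannot be copied, so it cannot be stored in a std::function, which requires a copyable callable.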

Immediately Invoked Lambda

You can invoke a lambda at the same place where you construct it.

[]() { std::puts("Hello world!"); }(); // Same as what is inside the curly braces

In the world of JavaScript, immediately invoked function expressions are all over the place; JavaScript programmers use them to introduce scopes. C++ does not need this kind of trickery. As a result, C++ programmers are more reluctant to use immediately invoked lambdas. For example, in her talk at CppCon 2018, Kate Gregory raised concerns about the readability of immediately invoked lambdas for people unfamiliar with the idiom.

Nevertheless, if you follow the best practice of declaring as many values const as possible, immediately invoked lambdas do provide an advantage. Some objects require complex construction beyond what a constructor can do: mutations happen only while the object is being built, and once construction is complete, the object is never modified again. If such construction is reusable, then writing a builder class or a factory function is a sane choice. However, if the construction happens only once in the codebase, a lot of people will drop the const qualifier instead. For example, suppose you want to read several lines from stdin into a vector:

std::vector<std::string> lines;
for (std::string line;
     std::getline(std::cin, line);) {
    lines.push_back(line);
}

There seems to be no way to make lines constant, since we need to modify it in the loop. An immediately invoked lambda solves this dilemma. With it, you can have both const and no boilerplate:

const auto lines = []{
    std::vector<std::string> lines;
    for (std::string line;
         std::getline(std::cin, line);) {
        lines.push_back(line);
    }
    return lines;
}();
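
The same idiom also helps when the initial value depends on branching logic. Below is a small sketch that does not need interactive input; describe_sign is a hypothetical function used only for illustration.

```cpp
#include <string>

// Hypothetical example: label is const even though choosing its value
// takes several statements.
inline std::string describe_sign(int n) {
    // [&] is safe here because the lambda is invoked immediately and
    // never outlives n.
    const std::string label = [&] {
        if (n < 0) return std::string{"negative"};
        if (n == 0) return std::string{"zero"};
        return std::string{"positive"};
    }();
    return "n is " + label;
}
```

This is also a case where default capture by reference fits the rule of thumb from earlier, since the lambda is as short-lived as it can be.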

This post was first published on Lesley Lai's Blog.


  1. See [expr.prim.lambda] 

  2. C++14 Language Extensions: Generalized lambda captures 
