DEV Community

Diego Saraiva
Diego Saraiva

Posted on

Introduction to Functional Programming

Introduction to Functional Programming (FP)

Functional Programming (FP) is an old programming paradigm that was born in academia during the 1950s; it stayed restrict tied to that environment for decades. Although it was always a hot topic for scientific researchers, it was never popular in the industry. Instead, imperative languages (procedural and object-oriented) became the mainstream programming languages.

However, in the past few years, we have seen a boost in the popularity of the functional programming paradigm. However, instead of functional programming languages becoming the most popular, something else is happening: the most popular programming languages have started introducing features inspired by functional programming languages. Consequently, we can state that functional programming has burst the academic bubble and now it is among us, albeit in an unexpected way, but life is complicated.

But, what is functional programming after all? Well, answering this question is very complicated because no widely accepted definition exists. But don't panic. The lack of a broadly accepted definition it's quite common when we are talking about programming languages, and being honest it's common in other computer science topics...

People usually tend to define FP through features related to functional concepts in their favourite language. In the FP context, concepts including pure functions, lazy evaluation, pattern matching are frequently mentioned. But defining functional programming in this way is alienate people. The focus shouldn't be on particular language features. We need a better definition to avoid language bias. And I know that many object-oriented programming books apply this strategy to define what is an object-oriented language, but it's out of scope for now, and this isn't the way definitely.

In programming, we keep decomposing a problem until you reach the level of detail that you can deal with, solve each subproblem in turn, and re-compose the solutions bottom-up. There are, roughly speaking, two ways of doing it: by commanding the computer what to do, or by telling it how to do it. This difference originated the terms declarative and imperative programming.

Broadly speaking, Functional Programming is a declarative style of programming in which the main program building blocks are functions as opposed to objects and procedures. A program written in the functional style doesnโ€™t specify the instructions that should be performed to produce the outcome but rather it defines what the outcome is.

Be advised: The way we write a function in the functional paradigm is different from the way we write functions in the procedural style. Functional programming focuses on expressions, while Procedural programming focuses on statements. Don't worry, I know that things can be a bit blurry now. Then, let's consider a small example to illuminate our thoughts: A program to calculate the sum of a list of numbers.

The imperative version of this program is usually described as a sequence of actions ordered in time. Thus we implement this by iterating over the number list and adding the numbers to the accumulator variable. So, we are just describing the sequence of interdependent steps (fine-grained instructions) of how to sum a list of numbers and modifying the state of some conceptual state machine.

int accumulate(const std::vector<int>& numbers) 
{
    int acc = 0;
    for (auto& number : nuumbers)
        acc += number;
}
Enter fullscreen mode Exit fullscreen mode

On the other hand, in the declarative version, you need to define only what a sum of a list of numbers is. The computer knows what to do when itโ€™s required to calculate a sum. One way you can do this is to say that the sum of a list of numbers equals the first element of the list added to the sum of the rest of the list and that the sum is zero if the list is empty. You define what the sum is without explaining how to calculate it.

using iterator = std::vector<double>::iterator;
int accumulate_impl(const iterator begin, const iterator end)
{ 
    return (begin != end) ? *begin + accumulate_impl(begin + 1, end) : 0;
}
int accumulate(const std::vector<int>& numbers)
{
    //
    return accumulate_impl(numbers.cbegin(), numbers.cend());
}
Enter fullscreen mode Exit fullscreen mode

You can notice that the code presented above avoids state and mutable data. It emphasizes the application of functions. Also, we can see that iteration isn't the predominant control structure anymore.

Imperative vs. Declarative programming

This post could be very abstract until this moment, so to clarify the concepts, let's look at a more elaborate but still simple program implemented in the imperative style and its functional equivalent. The main goal here is demonstrate the difference between these two approaches.

Here we go: Imagine that we want to write a function that takes a list of files and calculates the number of lines in each. I assume for simplicity that the last line in the file also ends with a newline character, so we need to count only the number of newline characters in the current file.

Reasoning imperatively, you might implement the solution decomposing it into the following steps :

  1. Open each file.
  2. Define a counter to store the number of lines.
  3. Read the file one character at a time, and increase the counter every time the newline character occurs.
  4. At the end of a file, store the number of lines calculated.

The following algorithm implements the steps listed before. In summary, we are reading the files character by character and counts the number of newline characters. At the end, we salve the results in a vector.

// Calculating the number of lines the imperative way
int count_lines(const std::string& filename)
{
    int new_line_count = 0;
    std::ifstream infile(filename);
    char ch;
    while (infile.get(ch)) 
    {
        if (ch == '\n')
            new_line_count++;
    }
    return new_line_count;
}
std::vector<int> count_lines_in_files(const std::vector<std::string>& files)
{
    std::vector<int> lines_in_each_file(files.size());
    for (const auto& file : files) 
    {
        lines_in_each_file.push_back(count_lines(file));
    }
    return lines_in_each_file;
}
Enter fullscreen mode Exit fullscreen mode

Now, we are moving this code to a more functional one. The first version exploits a set of C++ abstractions, such as use the standard std::count algorithm instead of counting the number of newlines manually and the stream iterators that allow us to treat the I/O streams similarly to ordinary collections like lists and vectors.

//functionalizing the code.... 
int count_lines(const std::string& filename)
{
    std::ifstream infile(filename);
    return std::count(
                   std::istreambuf_iterator<char>(infile),           
                   std::istreambuf_iterator<char>(), '\n');
}
//
std::vector<int> count_lines_in_files(const std::vector<std::string>& files)
{
    std::vector<int> lines_in_each_file(files.size());
    for (const auto& file : files) 
    {
        lines_in_each_file.push_back(count_lines(file)); 
    }
    return lines_in_each_file;
}
Enter fullscreen mode Exit fullscreen mode

When we compare the solutions presented above, we can notice that the main change occurred on the count_lines function.
Now, his only task is to convert its input (the filename) to the type that std::count can understand: a pair of stream iterators. In this way, we are no longer concerned about how to implement the counting new lines algorithm. We are just declaring that we want to count the number of newlines that appears in the given input stream.

It's quite common to use the number of lines of code (LOC) to measure the complexity of a program. Although LOC can perfectly express the difference in the complexity between imperative and functional solutions, I prefer to highlight another benefit from the functional solution: The fewer state variables to worry about. Each program has an implicity conceptual state machine binds to it. The correct management of state changes into a program is one of the most significant sources of software bugs. I believe that by reducing the number of states to keep track, we reducing the number of bugs.

Therefore, the main idea in the functional style is to use
abstractions to express the higher-level intent of an algorithm instead of specifying how to do something.

But, we can do even better. We can convert the entire algorithm to functional style, a.k.a. we are going to specify what should be done, instead of how it should be done. I know that I'm being repeating but it's important to consolidate this principle in our minds.

To start, we need an abstraction that allows us to applies a function to all elements in a collection and collects the results. Fortunately, The STL designed the std::transform algorithm for this. Essentially, the std::transform abstraction applies a function to each element of a range defined by a pair of iterators and write the results somewhere.

std::vector<int> count_lines_in_files(const std::vector<std::string>& files)
{
    std::vector<int> lines_in_each_file(files.size());
    //
    std::transform(
        files.cbegin(), files.cend(), 
        lines_in_each_files.begin(), 
        count_lines                
    );
    return lines_in_each_files;

}
Enter fullscreen mode Exit fullscreen mode

The new implementation of the count_lines_in_files applies the std::transform in each element of the files collection one by one, transforms them using the count_lines function, and stores the resulting values in the lines_in_each_file vector.

This code no longer specifies the algorithm steps that need to be taken, but rather how the input should be transformed in order to get the desired output. But, what's the big deal here? This solution makes the code less prone to errors, since it removes the state variables, and rely on the standard library implementation of the counting algorithm instead of rolling your own.

BONUS TRACK: Range Library - C++20

We can remove the boilerplate code from the previous functional solutions to be considered more readable than the imperative version. We can use the C++ 20 std::ranges library to do it. (I think that we can talk about ranges in near future, ranges concepts will be a big improvement on STL).

int count_lines(const std::string& filename)
{
    std::ifstream in(filename);
    return std::count( 
        std::istreambuf_iterator<char>(in), 
        std::istreambuf_iterator<char>(), '\n');
}

std::vector<int> count_lines_in_files(const std::vector<std::string>& files)
{
    auto rng = files 
                | ranges::views::transform(count_lines);
    return std::vector<int>(rng.begin(), rng.end());
}
Enter fullscreen mode Exit fullscreen mode

This solution is much less verbose than the imperative solution and much more obvious.

References

Top comments (0)