Naveen

Posted on Apr 22, 2017

Introduction to functional programming with C#

#programming #csharp #softwaredevelopment #dotnet

It is no surprise that one of the biggest challenges in the enterprise software development is complexity. Change is inevitable. Especially when a new feature is introduced that adds complexity. This might lead to difficulty in understanding and verify the software. It doesn’t stop there, it increases development time, introduces new bugs and at some point, it is even impossible to change anything in the software without introducing some unexpected behavior or side effects. This may result in slowing down the project and eventually lead to a project failure.

Imperative programming styles like object oriented programming have capabilities to minimize complexity to a certain level when done right by creating abstractions and hiding complexity. An object of a class has certain behavior that can be reasoned without caring about the complexity of the implementation. Classes, when written properly, will have high cohesion and low coupling where the code re-usability is increased while complexity is kept reasonable.

Object oriented programming is in my blood stream, I’ve been a C# programmer most of my life and my thought process always tends to use a sequence of statements to determine how to reach a certain goal by designing class hierarchies, focusing on encapsulation, abstraction, and polymorphism that tend to change the state of the program which actively modifies memory. There’s always a possibility that any number of threads can read a memory location without synchronization. Consider if at least one of them mutates it that leads to a race condition. Not an ideal condition for a programmer who tries to embrace concurrent programming.

However, it is still possible to write an imperative program by writing a complete thread-safe code that supports concurrency. But one has to still reason about performance, which is difficult in a multi-threaded environment. Even if parallelism improves performance, it is difficult to refactor the existing sequential code into parallel code as the large portion of the codebase has to be modified to use threads explicitly.

What’s the solution?

It is worth to consider “Functional programming”. It is a programming paradigm originated from ideas older than the first computers when two mathematicians introduced a theory called lambda calculus. It provided a theoretical framework that treated computation as an evaluation of mathematical functions by evaluating expressions rather than execution of commands and avoiding changing-state and mutating data.

By doing so, it is much easier to understand and reason about the code and most importantly, it reduces the side effects. In addition to that, performing unit testing is much easier.

Functional Languages:
It is interesting to analyze the programming languages that support functional programming such as Lisp, Clojure, Erlang, OCaml and Haskell which have been used in industrial and commercial applications by a wide variety of organizations. Each of them emphasizes different features and aspects of the functional style. ML is a general-purpose functional programming language and F# is the member of ML language family and originated as a functional programming language for the .Net Framework since 2002. It combines the succinct, expressive and compositional style of functional programming with the runtime, libraries, interoperability and object model of .NET

It is worthy to note that many general programming languages are flexible enough to support multiple paradigms. A good example is C#, which has borrowed a lot of features from ML and Haskell although it is primarily imperative with a heavy dependence of mutable state. For example, LINQ which promotes declarative style and immutability by not modifying the underlying collection it operates on.

As I am already familiar with C#, I favor combining paradigms so that I have control and flexibility over choosing the approach that best suits the problem.

Fundamental principles:

Having stated before, Functional programming is programming with mathematical functions. The idea is, whenever the same arguments are supplied, the mathematical function returns the same results and the function’s signature must convey all information about possible input it accepts and the output it produces. By following two simple principles, namelyâ€Š–â€ŠReferential transparency and Function honesty.

Referential Transparency:
Referential transparency, referred to a function, indicates that you can determine the result of applying that function only by looking at the values of its arguments. It means that the function should operate only on the values that we pass and it shouldn’t refer to the global state. Refer to the below example:


public int CalculateElapsedDays(DateTime from)
{
     DateTime now = DateTime.Now;
     return (now - from).Days;
}

This function is not referentially transparent. Why? Today it returns different output and tomorrow it will return another. The reason is that it refers to the global DateTime.Now property.

Can this function be converted into a referentially transparent function? Yes.
How?

By making the function to operate only on the parameters that we passed in.


public static int CalculateElapsedDays(DateTime from, DateTime now) => (now - from).Days;

In the above function, we eliminated the dependency on global state thus making the function referentially transparent.

Function honesty:

Function honesty states that a mathematical function should convey all information about the possible input that it takes and the possible output that it produces. Refer to the below example:


public int Divide(int numerator, int denominator)
{
   return numerator / denominator;
}

Is this function referentially transparent? Yes.

Does it qualify as a mathematical function? May not be.

Reason: The input arguments states that it takes two integers and return an integer as an output. Is it true in all scenarios? Nope.

What happens when we invoke the function like:

var result = Divide(1,0);

Yes, you guessed it right. It will throw “Divide By Zero” exception. Hence, the function’s signature doesn’t convey enough information about the output of the operation.

How to convert this function into a mathematical function?

Change the type of denominator parameter like below

public static int Divide(int numerator, NonZeroInt denominator)
{
    return numerator / denominator.Value;
}

NonZeroInt is a custom type that can hold any integer except zero. Now, we have made this function honest as it conveys all information about the possible input that it takes and the output that it produces.

Despite the simplicity of these principles, functional programming requires a lot of practices that might seem intimidating and overwhelming to many programmers. In this article, I will try to cover some basic aspects of starting with which includes “Functions as first class citizens”, “Higher order functions” and “Pure functions”

Functions as first-class citizens:

When functions are first-class citizens or first-class values, they can be used as input or output for any other functions. They can be assigned to variables, stored to collections, just like values of other type. For example:


Func<int, bool> isMod2 = x => x % 2 == 0;
var list = Enumerable.Range(1, 10);
var evenNumbers = list.Where(isMod2);

The above code illustrates that functions are indeed first-class values because you can assign the function ( x => x % 2 == 0) to a variable isMod2 which in-turn passed as an argument to Where (an extension on IEnumberable)
Treating functions as values is necessary for functional programming style as it gives the power to define Higher-order functions.

Higher-order functions (HOF):

HOFs are functions that take one or more functions as arguments or returns a function as a result or both. All other functions are First-order functions.

Let’s consider the same modulo 2 example in which “list.Where” does filtering to determine which number to be included in the final list based on a predicate provided by the caller. The predicate provided here is isMod2 function and theWhereextension method of IEnumberable is the Higher-order Function. Implementation of Where looks like below


public static IEnumerable<T> Where<T>
   (this IEnumerable<T> ts, Func<T, bool> predicate)
{
   foreach (T t in ts)  â¶
      if (predicate(t)) â·
         yield return t;
}

â¶ The task of iterating over the list is an implementation detail of Where.
â· The criterion determining which items are included is decided by the caller

The Where HOF applies the given function repeatedly to every element in the collection. HOFs can be designed to conditionally apply the function to the collection as well. I’ll leave that for you to experiment. By now you would have realized how concise and powerful Higher-order functions are.

Pure functions:

Pure functions are mathematical functions that follow the two basic principles that we discussed beforeâ€Š–â€ŠReferential transparency and Function honesty. In addition to that, Pure functions don't cause any side effects. Which means, it mutates neither the global state nor input arguments. Pure functions are easy to test and reason about. Since the output is dependent only on input, the order of evaluation isn’t important. These characteristics are very vital for a program to be optimized for parallelization, Lazy evaluation, and Memorization (Caching).

Consider the below example console application that multiplies a list of numbers by 2 and nicely formats that into multiplication table.


// Extensions.cs
public static class Extensions
{
    public static int MultiplyBy2(this int value)â¶
    {
       return value * 2;
    }
}
// MultiplicationFormatter.cs
public static class MultpilicationFormatter
{
    static int counter;

    static string Counter(int val) => $"{++counter} x 2 = {val}"; â·

    public static List<string> Format(List<int> list)
        => list
         .Select(Extensions.MultiplyBy2) â¸
         .Select(Counter) â¸
         .ToList();
}
// Program.cs
using static System.Console;
static void Main(string[] args)
{
     var list = MultpilicationFormatter.Format(Enumerable.Range(1, 10).ToList());
     foreach (var item in list)
     {
        WriteLine(item);
     }
     ReadLine();
}
// Output
1 x 2 = 2
2 x 2 = 4
3 x 2 = 6
4 x 2 = 8
5 x 2 = 10
6 x 2 = 12
7 x 2 = 14
8 x 2 = 16
9 x 2 = 18
10 x 2 = 20

â¶ A pure function
â· An impure function as it mutates the global state
â¸ Pure and impure functions can be applied similarly

The above code executes well without any issue as we are operating only on 10 numbers. What if we want to do the same operation for a bigger set of data especially when the processing is CPU intense? Would it make sense to do the multiplications in parallel? That means the sequence of data can be processed independently.

Certainly, we can use the power of Parallel LINQ (PLINQ) to get parallelization for almost free. How can we achieve this? Simply by adding AsParallel to the list.


public static List<string> Format(List<int> list)
=> list.AsParallel()
         .Select(Extensions.MultiplyBy2)
         .Select(Counter)
         .ToList();

But hold on, pure and impure functions don’t parallelize well together.

What do I mean by that? As you know that Counter function is an impure function. If you have some experience in multi-threading, this will be familiar to you. The parallel version will have multiple threads reading and updating the counter and there’s no locking in place. The program will end up losing some updates which will lead to incorrect counter results like below


1 x 2 = 2
2 x 2 = 4
7 x 2 = 6
8 x 2 = 8
6 x 2 = 10
4 x 2 = 12
9 x 2 = 14
10 x 2 = 16
5 x 2 = 18
3 x 2 = 20

Avoiding state mutation:
One possible way to fix this is to avoid state mutation and running counter. Can we think of a solution to generate a list of counters that we need and map them from the given list of items? Let’s see how:


using static System.Linq.ParallelEnumerable;
public static class MultpilicationFormatter
{
   public static List<string> Format(List<int> list)
   => list.AsParallel()â¶
      .Select(Extensions.MultiplyBy2)
      .Zip(Range(1,list.Count), (val, counter) => $"{counter} x 2 = {val}")â·
      .ToList();
}

Using Zip and Range, we have re-written the MultiplicationFormatter. Zip can be used as an extension method, so you can write the Format method using a more fluent syntax. After this change, Format method is now pure. Making it parallel is just a cherry on top of the cake â¶.This is almost identical to the sequential version except â¶ and â·

Of course, it is not always as easy as this simple scenario. But the ideas that you have seen till now will deploy you into a better position to tackle such issues related to parallelism and concurrency.

This article was originally posted in my Medium blog

Top comments (11)

Fabricio Cabral • Apr 22 '17

Hi Naveen!

I'm sorry, but do you have any reference (papers, books, and so on) to support this affirmation: "Imperative programming styles like object oriented programming"?

Thanks for your attention!

Naveen • Apr 23 '17

Hello Fabricio,

In my opinion, it means different things to different people. It is hard to make a definite statement to say that OOP does or doesn't belong to Imperative programming style.

Having said that, OOP is a way of managing state in an imperative program.

bob • Apr 23 '17

Hi Naveen.

"It is hard to make a definite statement to say that OOP does or doesn't belong to Imperative programming style."

And yet you made one for some reason. :)

OOP is no more imperative than Functional is imperative. If you write impure functions you're not really being functional you're being imperative. If you write mutable objects full of procedures ("sequences of statements") you're not really being OO you're being imperative or procedural.

I'm not sure what was in your blood stream, but it doesn't sound like it was OO. ;)

vedrantrebovic • Nov 20 '17

It seems like a blasphemy here if you mention some language, other than JS, can be used (or even worse: used for FP). Not sure why but many JS devs have this urge to show everyone JS is the best programming language out there (p.s. it isn't). Comments are mostly about the authors and code he wrote. I agree, if you publish some code online, even in blog post, it should be working, but it would be much better if we stick to the topic, rather than discussing the author. It's childish how people try to justify their language of choice by claiming their language is the best, and paradigm supported by that language is the only right one. If you catch yourself saying JS is better than C# (Java, C++...) or vice versa I have bad news for you: you're lousy software developer. If you catch yourself saying FP is better than OOP or vice versa I have bad news for you again: you don't understand neither of them. You're no different to any other extremist finding his holly grail.
First of all FP is around for so long, obsessing about it these days looks like obsessing about invention of wheel.
As a predominantly .Net developer I mostly rely on OOP. But that doesn't mean I don't use FP. Yes, C# supports FP programming style, probably not as good as some pure (pun intended) FP languages, but good enough to help you write better code. And it's been like that for years, with each new language spec supporting FP even more. If you want to criticize C# or any other programming language please write some code in it first. If you truly understand OOP and FP, you will find out you can use both of them. They both have pros and cons. If you manage to get pros column of both to your code, you have a winner.

Simon Sabin • May 31 '17

Your code has a huge flaw in that you aren't outputting what is being calculated. You are outputting the index of the item being multiplied, not the actual value being multiplied.

public static List<string> Format(List<int> list)
    => list.AsParallel()
    .Select(Extensions.MultiplyBy2)
    .Zip(list.AsParallel(), (result, item) => $"{item} x 2 = {result}")
    .ToList();