Jakob Christensen

Posted on Oct 20, 2021 • Edited on Nov 10, 2021

Real life performance optimizations in F#

#dotnet #fsharp #performance

At work we are building our first major F# application. I work at a pension company, and the application is a simulation model to estimate our future liabilities in regards to our policy holders under different scenarios.

The model will run 100 years into the future on at least 100,000 policies for perhaps 6,000 scenaries. So the application needs to be fast!

Bartosz Sypytkowski wrote a great blog on high performance F# code and we deployed some of his advice. Some of it worked and some of it didn't. Remember, when optimizing for performance, you need to test thoroughly whether your changes actually work for the better. One thing that didn't really work for us was changing our types to structs.

Below is a list of what worked for us. It might also work for you 😉

Avoid Single Case Discriminated Unions

This is taken straight out of Bartosz Sypytkowski's blog. It is quite common to wrap your simple types in Single Case Discriminated Unions in F#.

type PolicyId = PolicyId of string

This gives you nice code and better type safety but it kills performance. Instead we settled for unit of measure and typedefs.

[<Measure>] type percent
[<Measure>] type kr // Danish monetary unit

The percent unit is particularly nice as you never again have to ask how to specify it. When you read it with a unit of measure, you intuitively know that you need to put it as 2.0<pct> and not 0.02<pct>. You also know you need to divide by 100.0<pct> every time you use it, otherwise you will end up with a wrong unit.

let savings = 100000.00<kr>
let interestRate = 2.23<pct>
let interestBad = savings * interestRate // <kr*pct>
let interestGood = savings * interestRate / 100.0<pct> // <kr>

Don't use monadic binding

Personally, I am a big fan of Kleisli composition and monadic binds.

Impeachment in a functional way

Jakob Christensen ・ Jan 13 '20

#functional #fsharp

So naturally we started out with a workflow that looked something like this.

let workflow inputParameters policy =
    policy
    |> calculatePremium inputParameters
    >>= calculatePayments
    >>= calculateYield
    >>= calculateNewSavings
    >>= calculateFoo
    >>= calculateBar

Each function had the same type:

calculateFunction: InputParameters -> Policy -> Result<Policy, 'TError>

It was beautiful. It was also slow.

It turned out that almost all of the calculate functions returned Ok and never Error. So we changed the whole thing to simple piping and added a validation function at the end that returns Result. This gave us much better performance:

let workflow inputParameters policy =
    policy
    |> calculatePremium inputParameters
    |> calculatePayments inputParameters
    |> calculateYield inputParameters
    |> calculateNewSavings inputParameters
    |> calculateFoo inputParameters
    |> calculateBar inputParameters
    |> validatePolicy inputParameters

With this we changed the type for each function to:

calculateFunction: InputParameters -> Policy -> Policy

To be fair, I later on found out that we had forgotten to add inline to the >>= operator and that may have had something to do with the bad performance. Which brings me to:

Use inline

Yes, don't forget to add inline to your generic small helper functions and operators 🙄

Only pass needed parameters

As I said above, all our calculate functions had the same type:

calculateFunction: InputParameters -> Policy -> Policy

But we soon found out that not all functions actually used all of InputParameters. InputParameters was quite large and that meant a performance hit for each call. Therefore, we split InputParameters into smaller logically grouped types and changed each calculate function to only take whatever it needed as parameter.

That gave us a small gain on the performance.

Memoize stuff, and consider using Dictionary instead of Map

We added memoization where ever it made sense and it gave us a lot performance wise. Our first try on a memoize function looked like so:

let memoize f =
  let cache = ref Map.empty
  fun x ->
    match (!cache).TryFind(x) with
      | Some res -> res
      | None ->
           let res = f x
           cache := (!cache).Add(x,res)
           res

We found out that using the .NET type Dictionary<,> gave us better performance.

let memoize f =
  let cache = Dictionary<_, _>()
  fun x ->
    let (found,result) = cache.TryGetValue(x)
    if found then
      result
    else
      let res = f x
      cache.[x] <- res
      res

I both implementations originally came from [@dsymetweets] but I have been unable to find the links.

The above memoization works for functions of one parameter only which led us to some "overloads" for functions with more than one parameter. The following functions create memoized versions of functions with 2 or 3 parameters int that they create a tuple from the parameters and memoize based on that tuple:

let memoize2 (f': 'a -> 'b -> 'c) =
    let f = memoize (fun (a, b) -> f' a b)
    (fun a b -> f (a, b))

let memoize3 (f': 'a -> 'b -> 'c -> 'd) =
    let f = memoize (fun (a, b, c) -> f' a b c)
    (fun a b c -> f (a, b, c))

You'd use them like so:

let add x y z = x + y + z
let addMemoized = memoize3 add
let r1 = addMemoized 1 2 3
let r2 = addMemoized 1 2 3 // Result is now cached

This will create extra allocations but we have not found a better way to do it.

Map string types to integers

In our line of business we use a lot of strings to identify business entities, like for example policy numbers and fund IDs. This leads to a lot of comparison of strings which can hurt performance. So we changed all IDs to integers and created dictionaries on the side to look up IDs before and after the calculations where done.

Conclusion

As I said, these optimizations worked for us but they may not work for you. As we start using the model for real, we might even discover that we have to revert some of the things. Specifically, we may one day find out that structs will work better for us.

I don't recommend most of the things we did if your application is a LoB model that does not need to be very high performant. In that case you should value beautiful code over high performance.

The steps above gave us close to 90 percent improvement but it still might not be enough. In that case we still have one more trick up our sleeves: The monster that goes by the name Mutability! :-)

DEV Community