loading...
Cover image for Dirt simple SQL queries in F#

Dirt simple SQL queries in F#

kspeakman profile image Kasey Speakman Updated on ・5 min read

SQL is something many of us have gone to great lengths to try to abstract away. The Repository pattern, ORMs, and Type Providers are examples of abstractions around SQL. But each of these have trade-offs which do not fit every use case.

Today I put forth something that is less of an abstraction and more of a set of helper functions. This has trade-offs too, but it fits my needs very well. See what you think.

What it's like to use

It doesn't try to hide SQL from you, but just tries to get out of your way while you use SQL. Here's how you can define a query.

open QueryHelpers

let listCourses offset limit =
    sql
        """
        SELECT CourseId, CourseName
          FROM Course
        OFFSET @Offset ROWS
         FETCH
          NEXT @Limit ROWS ONLY
        ;
        """
        [
            p "Offset" offset
            p "Limit" limit
        ]

The helper functions in the above example are sql and p. Actually, p (short for parameter) is just a shortcut for making a tuple and boxing📌 the parameter value.

📌 Boxing parameter values

Boxing means converting something to type obj. For primitive types like int, converting it to an obj has a memory/performance penalty. (Because it has to be wrapped in a reference type and allocated by the garbage collector IIRC.)

At the time of this writing, ADO.NET requires boxing parameter values before it will use them. So there is no disadvantage to doing so, and it can prevent type inference problems in our F# code.

Here's how you would run the query.

type Course =
    {
        CourseId : int
        CourseName : string
    }

let query = listCourses request.Offset request.Limit
let coursesAsync = Sql.query<Course> connectString query

// coursesAsync is Async<Course array>

Here, Sql.query<Course> is a function which runs the query and converts each data row into a Course object. Like with most mappers, the property types and names have to match the returned columns.

Note that the query is created separately from the code which executes the query. This is particularly handy for queries which perform updates. I can setup multiple writes ahead of time, even from separate pieces of code, then perform them later in the same transaction.

let deactivateCourse courseId =
    sql "UPDATE ..." [ p "CourseId" courseId ]

let cancelCourseRegistrations courseId =
    sql "DELETE ..." [ p "CourseId" courseId ]

...

let patches =
    [
        deactivateCourse courseId
        cancelCourseRegistrations courseId
    ]

...

Sql.writeBatch connectString patches

// returns Async<unit>

Implementing it

Here is the entire 40-line implementation for the usages above.

namespace Foo.MsSql

open System
open System.Data.SqlClient


type SqlQuery =
    {
        Query : string
        Parameters : (string * obj) list
    }


module QueryHelpers =

    let p name value =
        (name, box value)


    let sql query parameters =
        {
            Query = query
            Parameters = parameters
        }


module Sql =

    open Dapper  // adds QueryAsync and ExecuteAsync


    let query<'T> connectString q =
        async {
            use connection = new SqlConnection(connectString)
            do! connection.OpenAsync() |> Async.AwaitTask
            let! result =
                connection.QueryAsync<'T>(q.Query, dict q.Parameters)
                    |> Async.AwaitTask
            return Array.ofSeq result
        }


    let writeBatch connectString writes =
        async {
            use connection = new SqlConnection(connectString)
            do! connection.OpenAsync() |> Async.AwaitTask
            use transaction = connection.BeginTransaction()
            for write in writes do
                do!
                    connection.ExecuteAsync
                        (write.Query, dict write.Parameters, transaction)
                        |> Async.AwaitTask
                        |> Async.Ignore
            transaction.Commit()
        }

Dapper does the hard work of mapping rows to objects. Dapper is awesome.

This code is only a starting point, but it is pretty open for adding new features. For instance, I have added the following over time which I may cover in a later post:

  • returning multiple data sets from a query
  • more settings like command timeout
  • batching by combining writes into one query different from above by making only one round-trip to database
  • large result set handling returns a sequence that doesn't read into memory until iterated
  • Postgres version
  • parameter types mainly for Postgres jsonb type, which Npgsql does not correctly infer

I did have to add Dapper Type Handlers for option types. But with that, I can map null DB fields to F# Option, and avoid null handling.

What about ...?

Getting only 1 row

Since the data is returned as an array, you can use F# Array functions like Array.tryHead to get the first row if one exists.

Getting a scalar

Dapper supports converting rows to primitives like int. So you can use Sql.query<int> and that will return the int value out of the first column for each row. Then you can use Array functions to get only the first row.

How far can this get you?

I use a few principles which makes this (plus the additions I mention) satisfy all my SQL needs.

First, I separate decisions from side-effects. So my code returns messages representing what decisions and events happened as a result of the user's request. Later in the pipeline, these decisions are turned into side effects like saving to the database or sending emails.

Second, I model the data multiple ways. The first model is the authoritative model (source of truth) on which the others are based. This could be an event log or normalized relational tables. This is where most people stop when using a relational database. However, I additionally create models which are tailor-fit for specific purposes.

For example, I store a Course's data (which has several layers of nested objects) as a JSON object, since it is convenient to load and save it that way for Add/Edit/View scenarios. I have a separate model (table) for CourseSearch which consists of some of the same data, but as relational columns. These columns are used for filtering and sorting and populating a text search index, to enable users to perform Full Text Searches. Both models are useful for their specific purposes. This is Separation of Concerns at the data level. But if I tried to squeeze both into the same mega-model, it would be hard to work with. Because I'd always have to think of the mega-model from two different dimensions at once in order to make the right choices about it. My dimensionally-challenged brain can't do that consistently.

Third, as a rule, I do not have queries which both change data and return data. It's either/or. That way I can't introduce bugs due to unanticipated side effects. As a consequence, I cannot use auto-generated IDs. But in the end that turns out to be ok. Because they complicate other features, like retries. And they have to be abandoned (or outsourced) anyway once you outgrow a single server.

Due to the above, my queries stay relatively simple and targeted for a specific purpose. So a simple SQL API meets all my needs.

for each desired change, make the change easy (warning: this may be hard), then make the easy change

  • Kent Beck

Posted on Oct 3 '17 by:

kspeakman profile

Kasey Speakman

@kspeakman

collector of ideas. no one of consequence.

Discussion

markdown guide
 

This looks really good -- in my attempts I am having trouble with null values. I've tried the linked OptionHandler but doesn't seem to work...

Are you able to provide an example handling Options<string>?

Also, have you considered releasing what you've got as a library?
If not, i might once I can work out the details...

 

Here is what I use to setup the optional type handlers. After that string option and many other optional types are handled automatically.

namespace Foo.MsSql

open System
open Dapper

type OptionHandler<'T>() =
    inherit SqlMapper.TypeHandler<option<'T>>()

    override __.SetValue(param, value) = 
        let valueOrNull = 
            match value with
            | Some x -> box x
            | None -> null

        param.Value <- valueOrNull    

    override __.Parse value =
        if isNull value || value = box DBNull.Value then
            None
        else
            Some (value :?> 'T)


module DapperConfig =

    let RegisterOptionTypes () =
        SqlMapper.AddTypeHandler(new OptionHandler<bool>())
        SqlMapper.AddTypeHandler(new OptionHandler<int>())
        SqlMapper.AddTypeHandler(new OptionHandler<int64>())
        SqlMapper.AddTypeHandler(new OptionHandler<string>())
        SqlMapper.AddTypeHandler(new OptionHandler<Guid>())
        SqlMapper.AddTypeHandler(new OptionHandler<DateTime>())
        SqlMapper.AddTypeHandler(new OptionHandler<single>())
        SqlMapper.AddTypeHandler(new OptionHandler<double>())
        SqlMapper.AddTypeHandler(new OptionHandler<decimal>())

Then as part of app startup, I call:

    DapperConfig.RegisterOptionTypes()

To me this is still a bit too jagged of an edge. Probably the way around having to call something at app startup is to include the initialization as a let statement in the Sql module. But I am not sure if it would get optimized away in a production build, since it will never get referenced.

One other gotcha: The column order in the returned query data must match the property order on F# records. Maybe this is because of the way the F# record constructor is generated, but Dapper cannot figure it out if the returned data is in a different order.

I will setup a github repo for it tonight, time permitting. I can include multi queries too (i.e. SELECT ... ; SELECT ... ;).

 

Created an initial repo. Still need to figure out how to make a NuGet package, add examples, etc.

SlimSql - seems a stupid name, but best I could come up with tonight.

 

Thanks for this!

I had the project on hold that could use this, but am restarting it now.

I currently have bits of code from your blogpost, plus other bits I scrapped together, as I discover things I need.

When it settles down I will try and switch to SlimSql... and perhaps add a PR or two :)

Sure thing. I need to figure out how to execute the startup code so the user doesn't have to wire it into their app startup. I hate that Dapper requires this.

FWIW I am currently calling this as part of opening the connection. The result is a noop since the method checks if the typehandler already exists in the SqlMapper.typeHandlers dictionary.

Looks something like this:

    let registerTypeHandlers() = 
        SqlMapper.AddTypeHandler(OptionHandler<Guid>())
        SqlMapper.AddTypeHandler(OptionHandler<int64>())
        ...

    let dbConnect string = 
        registerTypeHandlers()
        new OracleConnection(string)

PS: Am also forced to use Oracle, so would be nice to make that an option :)

 

Is there any way to avoid boxing when passing int parameters?

 

Sure. You can manually construct the query to include the data as part of the query. That way, you deal with the types without boxing to construct the query. However, this is considered bad practice because it makes SQL Injection attacks possible.

For parameterized queries, this is currently not possible when using ADO.NET (which is used by Dapper). There is an open issue to use parameters without boxing. I'll have to see what implementation they come up with before I know whether/how it will complicate the functions I posted here.

In practice, I have not found the performance to be a problem... considering that IO is typically slower than boxing perf and there is a hard limit to the number of parameters (~2100). Are you just curious or did you have a specific use case where this was a problem?

 

It's generally better to avoid additional work for GC, but in this case fsharp type checking forces writing box even for reference types, which is inconvenient, as params are of this type Parameters : (string * obj) list. I have do this for both ints and strings:
[
p "paramInt" (box paramInt)
p "paramString" (box paramString)
]

It is possible to avoid boxing in F#, but there is no point because the value is going to get boxed anyway by the database client (SqlParameter only takes objects as values). Avoiding boxing in the F# code will just bloat the code to no benefit.

I see, thank you for the clarifications

Hi Kasey, thank you for the useful post.

Is there any problem with or disadvantage to defining p as:
let p name value = (name, (box value))

Which allows you to just type
[ p "paramInt" paramInt ]

Edit: I just re-read the preceding conversation and realized I answered the wrong question. Both yours and @lanayx previously.

No it is not a problem at all for the p helper to box the value, to avoid repeating box every time. And in fact that is exactly how I defined it in my own library. I believe it was just an oversight in the post. I will update the post.

Leaving the original response below.


I created this tiny helper due to a couple of factors.

  1. ADO.NET boxes all parameters values to obj anyway
  2. F# will not auto-upcast the values in the list, which can lead to type inference compile errors.

I recall there being an issue on a dotnet repo about having the option to not box sql parameters. But I am not sure if typed SQL parameters ever made it into a release.

So without the helper, the parameter list looked like this:

[
    "@SomeParam", box someParam
    "@OtherParam", box otherParam
]

Not terrible, but removing characters , box to add p on the front seemed a favorable trade.

That ended up being fortuitous because later, using a tuple was not quite enough. Sometimes it is necessary to declare the database type of the parameter. For example, when the value is null. (With Npgsql also when the column is jsonb.) So the parameter tuple became a full record type.

    type SqlParam =
        {
            Name : string
            Value : obj
            Type : DbType option
        }

p can be changed to work with this, and in a way that does not break existing defined queries. And it will be drastically shorter than spelling out the full SqlParam record.

    let p name value =
        {
            Name = name
            Value = value
            Type = None
        }


    let pTyped name value type_ =
        {
            Name = name
            Value = value
            Type = Some type_
        }

Hi Kasey, thank you for the quick and helpful reply.

Moving from the F# type providers to Dapper has been a real pleasure for me. For the price of giving up the compile time type checking, I now have simple reusable data transfer objects (DTOs) and functions to convert these from domain and back.

Best regards

Roland

 

Nice! I'll have to look into Dapper.

 

Thanks for this! I'm an Elm/OCaml programmer and this really makes me want to learn F#