DEV Community

elfalem for DealerOn Dev

Posted on

Tale of .NET Core and the JSON SQLite Extension

I was working on a .NET Core application and was getting to the stage where data needed to be persisted. I wanted to use SQLite because it's great as an application file format. Some of the data would be stored using JSON. Although SQLite doesn't have built-in functionality for working with JSON, it allows the use of extensions. This is a very powerful feature allowing anyone to customize SQLite to their needs. Things like JSON support (JSON1) and full-text search (FTS3 and FTS4) are implemented in this way.

Although loadable extensions can be statically linked at compile time, they can also be dynamically linked at run time. The latter is very beneficial because it means you don't have to re-compile your own version of SQLite to include an extension and ship it with your application. Instead, you can leverage an existing SQLite instance and load the necessary extensions at run time.

In the case of .NET, there is the System.Data.SQLite.Core NuGet package, the official database and ADO.NET provider. This instance of SQLite doesn't have JSON support enabled by default but I should be able to load the extension as mentioned above. However there were some unexpected challenges and thus began my journey into getting a SQLite extension to load in a .NET core application on a Linux machine.

Initial Attempt

I was working on a Linux Mint machine with .NET SDK version 2.2. After referencing the necessary packages, the first step was to create a connection to the database.

dotnet new console
dotnet add package System.Data.SQLite.Core
dotnet add package Dapper
Enter fullscreen mode Exit fullscreen mode
using(var connection =  new SQLiteConnection("Data Source=:memory:"))
    var result = connection.QueryFirst<string>("select SQLITE_VERSION() AS Version");
Enter fullscreen mode Exit fullscreen mode

This part was straightforward. I was able to connect to an in-memory database and successfully execute a query through Dapper.

The next step was loading the JSON1 extension. This is what the internet says to do to load an extension:

using(var connection =  new SQLiteConnection("Data Source=:memory:"))
    connection.LoadExtension("SQLite.Interop.dll", "sqlite3_json_init");
    var result = connection.QueryFirst<string>("select json(' { \"this\" : \"is\", \"a\": [ \"test\" ] } ')");
Enter fullscreen mode Exit fullscreen mode

The above actually works on Windows (which I'll discuss at the end) but it fails on Linux. It shows an error indicating that the DLL is not found:

Unhandled Exception: System.Data.SQLite.SQLiteException: SQL logic error
The specified module could not be found.
Enter fullscreen mode Exit fullscreen mode

Unfortunately I wasn't able to find a lot of information on the net to help with this issue.

Structure of the NuGet package and SQLite

I started looking into how the NuGet package and SQLite work in hopes of uncovering something that resolves the issue. SQLite is written in C. The NuGet package allows us to interface with it by calling the C functions through C#. Since the C code is native, it's specific to different operating systems and architectures. The NuGet package contains the native SQLite code compiled to different platforms.

Actually, if you directly download System.Data.SQLite.Core from NuGet and extract it (.nupkg files are just archives), there isn't a lot of code. If you open up System.Data.SQLite.Core.nuspec in a text editor you will find that it's a simple wrapper for other packages. It conditionally references various other packages based on the version of .NET framework encountered. For instance, the package Stub.System.Data.SQLite.Core.NetStandard is referenced for .NET Core. This package has a much larger size (3.72 MB vs 11.44 KB for the wrapper) and is actually what contains the code.

Downloading and extracting this package, you'll find a lib/ folder with the C# DLL as well as the runtimes/ folder with platform specific native code as mentioned above.

Enter fullscreen mode Exit fullscreen mode

You can inspect DLLs with nm on Linux. For example: nm --dynamic runtimes/linux-x64/native/SQLite.Interop.dll shows all the symbols in the DLL. This is one way to examine what functions exist. A portion of the output for this command looks like this:

0000000000017028 T sqlite3_hard_heap_limit64
0000000000114ea2 T sqlite3_index_column_info_interop
00000000000d3cc4 T sqlite3_initialize
00000000000d6240 T sqlite3_interrupt
0000000000142a7f T sqlite3Json1Init
0000000000142c9d T sqlite3_json_init
00000000000d2533 T sqlite3_keyword_check
00000000000d2528 T sqlite3_keyword_count
00000000000d24ba T sqlite3_keyword_name
00000000000d556d T sqlite3_last_insert_rowid
0000000000114679 T sqlite3_last_insert_rowid_interop
00000000000d3ca1 T sqlite3_libversion
00000000000d3cae T sqlite3_libversion_number
00000000000d764f T sqlite3_limit
0000000000095810 T sqlite3_load_extension
000000000001b695 T sqlite3_log
000000000001746c T sqlite3_malloc
00000000000174a4 T sqlite3_malloc64
Enter fullscreen mode Exit fullscreen mode

In the above snippet, we find functions that are part of the core SQLite code such as sqlite3_initialize and sqlite3_load_extension. But we also find sqlite3_json_init which is the entry point of the JSON1 loadable extension. It looks like the DLL is sourced from the SQLite Amalgamation. The SQLite Amalgamation is a single file containing all of SQLite proper as well as about half a dozen loadable extensions that are maintained by the SQLite team. Although the extensions are not enabled during compilation, having them bundled makes it easier to distribute and enable them at runtime.

Putting all this together, the structure of the NuGet package is as follows:
SQLite package structure

How SQLite Loads an Extension at Runtime

Using a SQLite loadable extension at runtime entails dynamically linking to the extension code. Whether it's done through C# code (connection.LoadExtension() in the above snippet), the command-line SQLite shell (.load), or other means, ultimately the process involves searching for a DLL and loading it into memory.

I came across a Microsoft Docs page which indicates that SQLite doesn't leverage the .NET Core logic for finding native libraries. It directly calls the platform API to load the DLL. The way to influence this is to modify environment variables (PATH on Windows, LD_LIBRARY_PATH or DYLD_LIBRARY_PATH on Linux). There is sample code linked from the Doc that finds native libraries for the current runtime in a referenced NuGet package. The code notes that on Windows, PATH could be modified with the path of found native libraries after the application starts. However for Linux, the environment variable must be modified before running the application which is cumbersome.

Instead of changing environment variables to modify the DLL search locations, I wanted to see if we can somehow provide an absolute path for the DLL location. Looking at the source for the NuGet package it's clear that, as the Microsoft Docs page noted, the C# code just invokes the native code (sqlite3_load_extension()) directly. Therefore I looked at the SQLite source code. SQLite has the concept of VFS which is a portability layer at the bottom of the SQLite architecture. There are implementations of this abstraction for specific operating systems. Windows and Linux implementations are os_win.c and os_unix.c respectively. The extension loading code eventually relies on VFS implementations to load DLLs. On Linux this is handled by dlopen and it's LoadLibraryW on Windows. I later rediscovered this same information after reading the bottom (section 7) of a SQLite documentation page.

Focusing on dlopen(), the documentation shows that it accepts a full file path. This is great because I can obtain the location of the platform specific native library using the same approach as the Microsoft sample code from earlier. However instead of attempting to modify environment variables, I can provide the direct path to connection.LoadExtension() and load the DLL. Since I'm on a Linux machine, I started by directly hard-coding the path to the native Linux library. If it works, I should be able to easily generalize it to work on other platforms.

I changed the code to the following and ran dotnet publish.

using(var connection =  new SQLiteConnection("Data Source=:memory:"))
    connection.LoadExtension("/<my-project-path>/bin/Debug/netcoreapp2.2/publish/runtimes/linux-x64/native/SQLite.Interop.dll", "sqlite3_json_init");
    var result = connection.QueryFirst<string>("select json(' { \"this\" : \"is\", \"a\": [ \"test\" ] } ')");
Enter fullscreen mode Exit fullscreen mode

I then ran the published DLL from the /bin/Debug/netcoreapp2.2/publish/ folder and it worked. The query that includes the json() function provided by the extension ran successfully and I got the expected result.

I didn't realize it at the time but it becomes important that I ran the DLL from the publish/ folder and not the one that's one level up at /bin/Debug/netcoreapp2.2/.

Confusing Observations

At this point I thought I had a good solution for loading the DLL. I started making other unrelated changes to my program. But when I tried to run it with dotnet run, it wouldn't work. The program abruptly exits when it reaches the line that loads the extension. I tried running the published DLL as before, and that works. But using dotnet run does not. There are no exceptions or errors of any kind that's outputted. Even when debugging (with justMyCode set to false in .vscode/launch.json) there are no signs that anything is wrong. It even indicated an exit code of 0 (i.e. successful termination of program). Yet the program was quitting abruptly. I was very confused and posted a question on Stack Overflow.

Unfortunately I didn't get any answers. One thing I eventually tried was using newer versions of the .NET framework since I was on version 2.2. I changed the project target to 3.1 and tried dotnet run. I got the same behavior of the program terminating without errors when it got to loading the extension. I executed the DLL (from the /bin/Debug/netcoreapp3.1/ folder) directly and this time it terminated in the same place but I got a segmentation fault error. I was more confused because I expected this to work (it seemed like it was working when using version 2.2). I changed the target framework to 5.0 and tried again. I got the same results as 3.1. Then I changed it back to 2.2 and I started getting segmentation fault there as well when executing the DLL directly. This was really puzzling because it was working earlier. We'll get to it later but this was because this time I was executing the DLL from /bin/Debug/netcoreapp2.2/ instead of /bin/Debug/netcoreapp2.2/publish/. However, I didn't realize it at the time.

Segmentation Fault

I turned my attention towards investigating the segmentation fault I was seeing. It means there is improper memory access occurring somewhere. However this can't happen in managed code so it was likely coming from the native C code for SQLite. I used gdb to debug and find the offending line. The following is a snippet of the call stack after my program crashes.

Thread 1 "dotnet" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007fffec630c04 in sqlite3_mutex_enter (p=0x555555859f28) at ../core/sqlite3.c:25905
#2  0x00007fffec6f05c9 in createFunctionApi (db=0x5555559a43c8, zFunc=0x7fffec773d60 "json", nArg=1, enc=2099201, p=0x7fffec99a3cc <aFunc.54932+12>, xSFunc=0x7fffec75ae78 <jsonRemoveFunc>, 
    xStep=0x0, xFinal=0x0, xValue=0x0, xInverse=0x0, xDestroy=0x0) at ../core/sqlite3.c:162077
#3  0x00007fffec6f0708 in sqlite3_create_function (db=0x5555559a43c8, zFunc=0x7fffec773d60 "json", nArg=1, enc=2099201, p=0x7fffec99a3cc <aFunc.54932+12>, 
    xSFunc=0x7fffec75ae78 <jsonRemoveFunc>, xStep=0x0, xFinal=0x0) at ../core/sqlite3.c:162117
#4  0x00007fffec75cb33 in sqlite3Json1Init (db=0x5555559a43c8) at ../ext/json1.c:2591
#5  0x00007fffec75ccbd in sqlite3_json_init (db=0x5555559a43c8, pzErrMsg=0x7fffffffc670, pApi=0x7fffecd215c0 <sqlite3Apis>) at ../ext/json1.c:2623
#6  0x00007fffeca366ae in sqlite3LoadExtension (db=0x5555559a43c8, 
    zFile=0x7fff5807dd98 "/<my-project-path>/bin/Debug/netcoreapp2.2/publish/runtimes/linux-x64/native/SQLite.Interop.dll", zProc=0x7fff5807de60 "sqlite3_json_init", 
    pzErrMsg=0x7fffffffc7c8) at ../core/sqlite3.c:124451
#7  0x00007fffeca36850 in sqlite3_load_extension (db=0x5555559a43c8, 
    zFile=0x7fff5807dd98 "/<my-project-path>/bin/Debug/netcoreapp2.2/publish/runtimes/linux-x64/native/SQLite.Interop.dll", zProc=0x7fff5807de60 "sqlite3_json_init", 
    pzErrMsg=0x7fffffffc7c8) at ../core/sqlite3.c:124484
#8  0x00007fff7d01ca34 in ?? ()
#9  0x4164657a696c6169 in ?? ()
#10 0x0000000094cf77ea in ?? ()
#11 0x00007ffff66f6c38 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/2.2.8/
#12 0x00007fffffffcbb8 in ?? ()
Enter fullscreen mode Exit fullscreen mode

As expected the issue was in SQLite code. I tried looking at the code where the problem was occurring but nothing stood out. Searching the web for segmentation faults related to SQLite I came across this forum post. The answer pointed to multiple instances of SQLite being the issue. Although the forum post was about a python application, I started wondering if there were multiple instances being created in my application as well. It seemed unlikely because I wasn't using threads or doing anything fancy. It was a simple console application. Perhaps something was lingering from earlier runs of my application? I tried restarting VS Code and the entire machine. I tried using an actual file instead of an in-memory database. I tried explicitly enabling extension loading (connection.EnableExtensions(true)) even though it was already enabled. I looked at various compile and connection setup flags in SQLite. None of this helped.

Looking up information about the function where the error is occurring (sqlite3_mutex_enter), I came across a Stack Overflow question about access violation (i.e. segmentation fault) in python. The accepted answer talks about multiple modules which apparently are DLLs on Windows. The solution in this case was invoking sqlite3_initialize() in the SQLite code for the second module. Although it was a different language, it seemed like the same issue I was facing. And it seems to align with the eariler forum post about multiple instances of SQLite being the issue.

Given the above information, I wanted to see if I can reach into the native library and invoke sqlite3_initialize(). This was quite straightforward. I added the following method signature that corresponds to the C function I wanted to invoke and decorated it with DllImport attribute specifying the full path to the DLL.

static extern void sqlite3_initialize();
Enter fullscreen mode Exit fullscreen mode

Then I added a call to this method before loading the extension.

using(var connection =  new SQLiteConnection("Data Source=:memory:"))
    connection.LoadExtension("/<my-project-path>/bin/Debug/netcoreapp2.2/publish/runtimes/linux-x64/native/SQLite.Interop.dll", "sqlite3_json_init");
    var result = connection.QueryFirst<string>("select json(' { \"this\" : \"is\", \"a\": [ \"test\" ] } ')");
Enter fullscreen mode Exit fullscreen mode

After the above changes, the program did not terminate abruptly or lead to a segmentation fault. The JSON extension was loaded and the SQL query that made use of syntax provided by the extension executed successfully.

It was working at this point. But the fix was a workaround. It was clear that there were two different DLLs being loaded. As we saw earlier SQLite.Interop.dll is made from the SQLite Amalgamation. Therefore when the extension is loaded, a second full instance of SQLite was also being loaded. The first instance would be what is referenced directly by the NuGet package and used to establish the database connection and execute queries. I wanted to inspect the location of this first instance. To do so, I printed the location of the assembly that contains the SQLiteConnection class.

Enter fullscreen mode Exit fullscreen mode

and got the output:

Enter fullscreen mode Exit fullscreen mode

This was surprising because I thought it would be referencing the package from the publish/ folder. I changed my code to refer to the same absolute path as above when loading the extension. I also removed the call to sqlite3_initialize():

using(var connection =  new SQLiteConnection("Data Source=:memory:"))
    connection.LoadExtension("/home/<my-user>/.nuget/packages/", "sqlite3_json_init");
    var result = connection.QueryFirst<string>("select json(' { \"this\" : \"is\", \"a\": [ \"test\" ] } ')");
Enter fullscreen mode Exit fullscreen mode

The above worked. The same exact DLL was being referenced so there weren't multiple instances of SQLite.

At this point, I remembered that publishing my program creates a DLL in the publish/ folder. Ever since I encountered the segmentation fault, I was executing the DLL that was one level up. When I executed the DLL inside the publish/ folder, I saw that the path of the assembly containing SQLiteConnection was the following and the program was terminated with a segmentation fault error.

Enter fullscreen mode Exit fullscreen mode

It now made sense why the program worked when I initially hard-coded the full path of the DLL. I executed it from the publish/ folder. Therefore, the NuGet package and the LoadExtension call were referencing the same native library. It meant only one instance of SQLite existed and hence there wasn't an issue. This observation raised a couple of questions.

Questions About .NET SDK

The first question was why there are two seemingly identical DLLs created. Ever since I first worked with .NET Core I've noticed that when publishing a program Foo, both /bin/Debug/netcoreapp2.2/Foo.dll and /bin/Debug/netcoreapp2.2/publish/Foo.dll get created but didn't figure out why. It was clear that the one in the publish/ folder is part of the end result of the publishing process. However, why does the same DLL exist one level up? There were quite a few Stack Overflow questions that asked about this (among other things) but frustratingly none of their answers addressed this particular question. I finally came across one question that directly answered it. I now understand that /bin/Debug/netcoreapp2.2/ is used as the output location when building. The files are then copied to /bin/Debug/netcoreapp2.2/publish/ when publishing. This would have been less confusing if Microsoft defaulted to a different folder as the output location of build artifacts, perhaps /bin/Debug/netcoreapp2.2/build/.

The second question was why when executing the DLL from the build location, it looks for packages in /home/<my-user>/.nuget/packages/ (trivia: this is called a "probing directory". In this context, probing is the process of locating assemblies by the runtime). I was able to find the answer in the documentation for dotnet build. It states that:

For executable projects targeting versions earlier than .NET Core 3.0, library dependencies from NuGet are typically NOT copied to the output folder. They're resolved from the NuGet global packages folder at run time.

So this was a change in newer versions of the framework. I was able to confirm this by outputting the location of the assembly containing SQLiteConnection for the different versions.

Version Build output DLL Publish output DLL
2.2 /home/<my-user>/.nuget/packages/ /<my-project-path>/bin/Debug/netcoreapp2.2/publish/System.Data.SQLite.dll
3.1 /<my-project-path>/bin/Debug/netcoreapp3.1/System.Data.SQLite.dll /<my-project-path>/bin/Debug/netcoreapp3.1/publish/System.Data.SQLite.dll
5.0 /<my-project-path>/bin/Debug/netcoreapp5.0/System.Data.SQLite.dll /<my-project-path>/bin/Debug/netcoreapp5.0/publish/System.Data.SQLite.dll

As can be seen above, newer versions of the framework load dependencies from a location relative to the executing program whether the program was built or published. This makes it easier to consistently locate a native library for my use case.

Another unresolved item was why directly executing the DLL resulted a segmentation fault error but using dotnet run abruptly ends the program with no indication as to the cause. I was curious as to what actually prints the error on the terminal in the first case. In Linux, a segmentation fault is one of the many signals that can be sent from the kernel or between processes. According to this question, if the SIGSEGV signal (i.e. a segmentation fault) is not handled by the program, the program is terminated and the the parent process (i.e. the shell) prints out the message to it's (not the program's, which is already terminated) standard error.

So in the context of my program, it's the Linux shell that's telling me that there's a segmentation fault. I wanted to know how dotnet run executes programs. Looking at the .NET SDK source code, it starts a new process to execute the program. It also reads the standard output and standard error streams from the process it creates. However, I did not see logic dealing with handling signals such as SIGSEGV. I've opened an issue for this.

Behavior on Windows

To gain a broader understanding of items discussed above, I also looked at the behavior on a Windows machine. I used VS Code and not Visual Studio so that that doesn't become a factor. I looked into why simply specifying the DLL name without a path (e.g. SQLite.Interop.dll) works on Windows. This is explained by the search order used when loading DLLs. The first thing that the system checks is if the DLL with the same module name is already in memory. If so, then the already loaded DLL is used and the search stops. This seems to be what's happening in this case. When the application creates the database connection, it's using the native library (since it's an amalgamation containing the core SQLite code as well as the loadable extension). Therefore, the DLL is already loaded by the time we invoke LoadExtension().

To test this further, I made a copy of the DLL in the same folder and renamed it to SQLite.Interop_copy.dll. Then I modified the code to load this renamed DLL (again without specifying the path) and it was not able to find it. The search order is then to look for the DLL in various directories, and lastly in paths listed in the PATH environment variable. The location of the renamed DLL doesn't match any of these and thus it's not found. If I modify the PATH environment variable and add the location, then the renamed DLL is found. This made it possible to see the Windows DLL search heuristic in action.

I also wanted to see if Windows will report a segmentation fault. Using the bash shell within VS Code, I got a segementation fault while using dotnet run as well as when directly executing the DLL. I then switched to using PowerShell in which case the program halted abruptly for both ways of executing it. So it looks like the behavior differs based on the shell I'm using when it comes to Windows.

Lastly, I wanted to see how .NET handles segmentation faults within C# code. I didn't find a way to intentionally cause this with managed code so I had to switch to an unsafe context and write unmanaged code. I declared an integer variable and assign it's memory address to a pointer. I then shift the pointer by an arbitrary value and attempt to access that address as an integer.

    int foo = 70;
    int *pointer = &foo;
    pointer += 500000000;
Enter fullscreen mode Exit fullscreen mode

The above results in the exception: Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. This indicates that there is a difference in how improper memory access is handled when it occurs in unmanaged code as opposed to a native library (which technically is still unmanaged code). If anyone has an explanation of why this is, I'd like to know.

Lessons Learned

This journey was a diversion from the project I was working on but a lot of lessons were learned in the process. I learned more about the .NET build and publish steps and why there are duplicate DLL files in the output directory. I also learned about the the types of things that could be different between .NET SDK versions and the importance of ensuring installed versions are up-to-date. I have a better understanding of how DLLs are loaded in both Linux and Windows. I gained insight into the inner workings of SQLite and the NuGet packages that interface with it to allow it's usage in .NET. Diving into the source code of a very successful project and a complex one like that of a database seems daunting, but SQLite has excellent documentation and it was a rewarding experience.

Furthermore, I got to use gdb to debug a segmentation fault and also learned about how signals in general work in Linux. This was also a good experience at putting together information from various sources and different programming languages to solve a problem. It also demonstrated the effectiveness of using different phrases when searching for solutions. Searching for "segmentation fault" and "access violation" provided different pieces to the puzzle. Finally, this was a practice in perseverance. Not giving up easily when feeling lost is difficult. But at the same time, you gain a powerful sense of accomplishment when things make sense in the end. This image is an apt illustration:

The two states of every programmer: "I am a God" and "I have no idea what I
Image: George Broussard

Top comments (0)