Originally posted at http://erikscode.space/index.php/2019/07/20/how-to-work-with-memory-optimized-objects-in-sql-server/
Good morning everyone, today I'm going to show you how to make memory optimized tables and natively compiled stored procedures. First, I'll talk about why you might want to use these things, then we'll dive straight into how to do it.
Relevant Video
If you don’t feel like reading all of this, I made a video with the same scripts and same general information:
Why Memory Optimized Tables?
So why would you not want to use disk based tables and opt for the fancy in-memory ones? There is really only one answer: speed.
As I'll demonstrate later, in-memory tables can be written to and selected from much faster than disk based tables. This is to be expected, of course, since reading from memory is always faster than reading from disk. But why then shouldn't every table be memory optimized?
Well, roughly speaking, disk based tables are "safer." First of all, you have little control over memory allocation, so it's hard to know how much memory your server will set aside for its other operations.
Also, your server has far less memory than it does disk space. You will overflow memory long before you run out of hard drive on your database server.
In my humblest of opinions, in-memory tables and natively compiled stored procedures are best used when a customer or user of your product has to sit through a long write or select process. I think it’s better to have these operations handled by the optimized tables, then offloaded into disk based storage in a place that doesn’t affect user experience.
I'm definitely no software or database architect though, so let's skip the theory and jump into how we make these things!
What we’re about to do:
We’re going to make a FlowerStore database with six tables. Actually, it’s really three tables, but we’re going to make disk based and memory based ones to compare. Here are the steps we’ll take —
- Create the database and configure it to be able to use memory optimized objects
- Create identical disk based and in-memory tables
- Create identical views that will be used later to show the difference in querying speed
- Create the stored procedures – one normal, one natively compiled
- Demonstrate the difference in speed of the stored procedures by using SSMS
- Demonstrate the difference in speed of querying by using JMeter
If you want my sample code:
For this demonstration, I’ve uploaded the scripts to GitHub:
https://github.com/erik-whiting/SQLServerTutorial-MemoryOptimization
You can either clone the repo or just download and unzip it. We're going to run the scripts in the order they appear in the repo, although a couple of them can be interchanged.
Creating the Database
First things first, we have to actually create our FlowerStore database and tell it to expect memory optimized data. Take a look at 01-CreateDB.sql in the repository. Lines 1 – 21 are normal disk-based stuff, so let's look at lines 23 through 32:
-- Make Database compatible with memory optimized objects
ALTER DATABASE FlowerStore SET COMPATIBILITY_LEVEL = 130;
ALTER DATABASE FlowerStore SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;
-- Add filegroup for memory optimized data
ALTER DATABASE FlowerStore ADD FILEGROUP fs_mem CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE FlowerStore ADD FILE (
    NAME = fs_mem_file,
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\fs_mem'
) TO FILEGROUP fs_mem;
First thing we have to do is set the compatibility level of the database to 130. What does this mean? Level 130 corresponds to SQL Server 2016, and Microsoft recommends at least this level for databases that use the in-memory OLTP features introduced in that version.
Next, we set MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT on line 25. What this does is automatically elevate any transaction that touches a memory optimized table to SNAPSHOT isolation when the session is running at a lower isolation level. Under SNAPSHOT isolation, every transaction works against its own consistent version of the data (memory optimized tables keep row versions in memory), so readers and writers don't block each other.
Next, we add a filegroup with the CONTAINS MEMORY_OPTIMIZED_DATA attribute. We have to add a filegroup with that attribute in order to have a data container that can hold the memory optimized data, which we create on line 29.
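If you want to confirm the filegroup was created correctly, a quick check against the catalog views (my own addition, not part of the repo scripts) looks like this:
-- Should return one row: fs_mem, MEMORY_OPTIMIZED_DATA_FILEGROUP
SELECT name, type_desc
FROM FlowerStore.sys.filegroups
WHERE type = 'FX';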
Creating the Tables
In script 02, there is really nothing to learn because it is basic table building stuff. Please run it though, as we’ll be using the disk based tables to see how much faster the memory tables are.
Check out script 03, where we create the identical tables but in memory. There are a few things to note:
- On line 6, we have to specify a non-clustered index. This has to do with the fact that only index definitions are stored for in-memory tables, not actual index pages (memory tables have no pages at all), so a clustered index makes no sense here.
- Each table has MEMORY_OPTIMIZED = ON and DURABILITY = SCHEMA_AND_DATA. The memory optimized clause is kind of a no-brainer; this is basically how you say "don't be a disk based table." The durability part has to do with how the data is persisted. A minimal sketch of such a table follows this list.
- Schema refers to the structure of the table, like the column names.
- Data refers to the actual data that is inserted into these tables.
- The durability option tells SQL Server which one of these things to persist, but if you try to persist schema only in this script, the script will fail.
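The real definitions are in script 03 in the repo; as a rough sketch (with hypothetical table and column names, not the repo's actual schema), an in-memory table declaration looks something like this:
-- Hypothetical example, not the repo's actual schema
CREATE TABLE dbo.LineItemsMem (
    LineItemID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, -- must be non-clustered
    OrderID INT NOT NULL,
    Quantity INT NOT NULL,
    UnitPrice MONEY NOT NULL
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);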
Creating the Views
The views in scripts 04 and 05 are going to be used to show how much faster the memory tables query, and there's really not much to learn from these scripts (unless you want to see how to make views).
Script 04 creates a view that totals line items in an order, and script 05 creates views that get the grand total for each order.
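As a hypothetical illustration (again, not the repo's exact code), a view in the style of script 05, summing up each order's grand total, might look like:
-- Hypothetical example built on the sketch table above
CREATE VIEW dbo.OrderTotalsMem
AS
SELECT
    OrderID,
    SUM(Quantity * UnitPrice) AS GrandTotal -- grand total per order
FROM dbo.LineItemsMem
GROUP BY OrderID;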
Creating the Stored Procedures
This is probably the most interesting section. If you can’t tell, both these stored procedures are going to take a number from the user and make up some fake data about sales and line items. There is really no actual point to these except for me to show you how much faster the writes will be.
Notice in the first stored procedure, it’s business as usual. We declare our variables and write our inserts. The natively compiled one is a little bit different.
The NATIVE_COMPILATION option is what tells SQL Server to compile the procedure down to native machine code instead of interpreting the T-SQL each time it runs, easy enough. So what's up with the SCHEMABINDING option?
This basically says that the tables this procedure uses cannot be dropped while the procedure exists, and the option is required for native compilation.
You also must use BEGIN ATOMIC after the AS statement. It must be used here, which is odd, because it cannot be used in non-compiled procedures. What it does is wrap the whole procedure body in a single atomic block: if the block succeeds it commits, and if anything fails the whole block rolls back. In other words, it implies the BEGIN TRANSACTION and COMMIT or ROLLBACK lines, so you don't need to write them.
Again, you have to use ISOLATION LEVEL = SNAPSHOT; this is the baseline isolation level for in-memory objects (REPEATABLE READ and SERIALIZABLE are the only stricter alternatives allowed in an atomic block).
Finally, you have to specify the language. See line 34: LANGUAGE = N'us_english'. This is required because natively compiled procedures can't inherit session-level settings at run time, so things like the language (which controls date formats and error messages) have to be fixed when the procedure is compiled.
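Putting these pieces together, the shell of a natively compiled procedure looks roughly like this. This is a sketch with hypothetical names (reusing the sketch table from earlier), not the repo's exact script:
-- Hypothetical sketch, not the repo's actual procedure
CREATE PROCEDURE dbo.makeFakeLineItems
    @Iterations INT
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    DECLARE @i INT = 0;
    WHILE @i < @Iterations
    BEGIN
        -- Insert a fake line item just to generate write load
        INSERT INTO dbo.LineItemsMem (OrderID, Quantity, UnitPrice)
        VALUES (@i, 1, 9.99);
        SET @i = @i + 1;
    END
END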
Let's Test it Out!
Ok! Let’s actually run these things and see what happens. First, let’s populate the database with some random data using the stored procedures. First up, disk based tables with the T-SQL interpreted procedure:
EXECUTE makeLotsOfRecords @Iterations = 100000
So we're making 100,000 line items. Your hardware may vary, but when I ran this, it took 40 seconds to run.
Next up, the compiled procedure that writes to memory tables:
EXECUTE makeLotsOfRecordsMEM @Iterations = 100000
Again, depending on how much juice your personal computer has, your results may vary, but when I ran this, it was done in less than a second. Wow!
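If you want to measure this yourself rather than eyeballing the SSMS status bar, a simple way (my own suggestion, not part of the original scripts) is to turn on timing statistics around the calls:
-- Elapsed times show up in the SSMS Messages tab
SET STATISTICS TIME ON;
EXECUTE makeLotsOfRecords @Iterations = 100000;
EXECUTE makeLotsOfRecordsMEM @Iterations = 100000;
SET STATISTICS TIME OFF;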
Testing Query Speed
Because I was tired of staring at SSMS, I wrote the query test in an Apache JMeter test plan. I won’t get into how to set this up, as that’s beyond the scope of this article, but I’ll describe the results.
What I did was make 2 thread groups, one that queried the summing view for the disk based tables, and the other for the in-memory table querying view. Each group had 200 threads and looped 5 times, for 1000 queries each. Here are the summary results:
One thing I think is interesting is the standard deviation. If you don’t know, this is basically the variability of response times. For the memory tables, the standard deviation is a little more than one-third the average, whereas with the disk tables, the deviation is nearly equal to the average. Crazy.
Conclusion
Today we went over how to configure a database to use memory-optimized objects, how to create in-memory tables, and how to write natively compiled stored procedures. We then tested each thing out to see what kind of performance differences we got.