Mike Richter

Azure Files Transactions Experiments

What are Azure File transactions? Can we measure them? Are all transactions equal? Let's find out!

Background

Azure Files is a file share service from Azure. It comes in several tiers: Premium, Transaction Optimized, Hot and Cool. All tiers charge you based on the amount of data in the file share. If you use any tier besides Premium, you are charged for transactions as well. You can find more details on the Azure Files pricing page.

For Transaction Optimized (previously known as the Standard tier), Write transactions cost $0.015 per 10,000. Straightforward, right?

But what is a transaction, how do you know when one happens, and what causes it? A penny and a half per 10 thousand transactions sounds reasonable, but do you know how many transactions your workload is going to need? Are all transactions equal? If I write a 1 MB file, is that the same as writing a 1 GB file when it comes to transactions?
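To put the per-10,000 price in perspective, here is a minimal sketch of the arithmetic in Python. It assumes the $0.015 per 10,000 Write price quoted above (actual prices vary by region and over time), and the name `write_cost_usd` is my own, not something from Azure.

```python
# Price quoted above for Transaction Optimized writes; treat it as an assumption.
WRITE_PRICE_PER_10K_USD = 0.015


def write_cost_usd(transactions: int) -> float:
    """Cost in USD of a number of Write transactions at the quoted price."""
    return transactions / 10_000 * WRITE_PRICE_PER_10K_USD


# Print the cost for a few illustrative transaction volumes.
for count in (10_000, 1_000_000, 100_000_000):
    print(f"{count:>11,} writes -> ${write_cost_usd(count):.3f}")
```

The price per transaction is tiny, so the bill depends almost entirely on how many transactions a workload generates, which is what the rest of this post tries to measure.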

Existing Documentation

There is some documentation on what storage transactions are, but it doesn't give you a complete picture. To be fair to the Azure team, a complete picture is difficult to give. Azure Files can be used in multiple ways, each of which generates different transactions. For example, you can connect to Azure Files over SMB, NFS, or the REST API. Each protocol affects which transactions happen and how often.

My Questions

I wanted to see if there's a way to reduce transactions and optimize costs. Would writing fewer, larger files reduce transactions and be cheaper than writing many smaller files?

I did some of my own experiments. Here's what I was looking for:

  • I wanted to see what the transaction impact is when I add files of different sizes at different times.
  • I also wanted to see if files with different sizes have an impact on transactions.
  • How could I see transactions per file share within a storage account? In the Azure Portal, you can only see transaction metrics aggregated at the account level. But you can have many file shares per storage account; is there a way to see transactions per file share?

Set Up

I created a General Purpose v2 Storage Account in Azure and called it transactiontests. Within the storage account I created a Transaction Optimized file share and called it fileshare1. The UNC path for the file share is \\transactiontests.file.core.windows.net\fileshare1\

Azure Storage accounts collect diagnostic logs and let you send them to a destination where you can inspect them.
[Image: diagnosticMenuItem]

I enabled Diagnostic Settings to write the logs to an Azure Log Analytics workspace. I used Log Analytics to query for transactions that happened in specific time frames.
[Image: diagnosticSettings]

I created a Windows VM in the same region as my storage account and I mounted the file share to the Windows VM via SMB.

[Image: SMBMount]

I created dummy files of various sizes using this Online Random File Generator.

Now I had everything I needed to run my experiments.

The Experiments

I added files of different sizes to the file share over time, starting one at a time. I added a file, then waited a minute for Diagnostic Settings to ship the logs to the Log Analytics workspace. Then, in the Log Analytics portal experience, I queried to see which transactions had been logged.

StorageFileLogs
| where TimeGenerated > now(-3m)

Here's an example of the output:
[Image: loganalyticssampleoutput]

I kept track of Write transactions since I was adding files to the File share. Any other transactions I just considered 'overhead.'

I added a 1 KB file, a 100 KB file, a 1 MB file, and so on. The table in the Results section lists all the file sizes and the resulting transaction counts.

After adding each file one by one, I added three files of the same size to the file share at once. I was curious to see if there was any decrease in overhead when doing multiple operations at the same time.

Results

Transactions Per File Count and Size

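| # of Files | File Size (MB) | # of Transactions | # of Writes | Writes per MB | Transaction Overhead | Overhead / File |
|---|---|---|---|---|---|---|
| 1 | 0.001 | 34 | 1 | 1 | 33 | 33.0 |
| 1 | 0.100 | 48 | 1 | 1 | 47 | 47.0 |
| 1 | 1.000 | 38 | 4 | 4 | 34 | 34.0 |
| 1 | 5.000 | 48 | 10 | 2 | 38 | 38.0 |
| 1 | 100.000 | 136 | 100 | 1 | 36 | 36.0 |
| 1 | 1024.000 | 1057 | 1024 | 1 | 33 | 33.0 |
| 3 | 0.001 | 53 | 3 | 1 | 50 | 16.7 |
| 3 | 0.100 | 58 | 3 | 1 | 55 | 18.3 |
| 3 | 1.000 | 62 | 12 | 4 | 50 | 16.7 |
| 3 | 5.000 | 85 | 30 | 2 | 55 | 18.3 |
| 3 | 100.000 | 364 | 300 | 1 | 64 | 21.3 |
| 3 | 1024.000 | 3142 | 3072 | 1 | 70 | 23.3 |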
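The two overhead columns are derived from the others: overhead is total transactions minus Writes, and overhead per file divides that by the number of files. Here is a short Python sketch that recomputes them from the numbers in the table above; the variable and function names are my own.

```python
# (files, size_mb, total_transactions, write_transactions), copied from the table above.
rows = [
    (1, 0.001, 34, 1),
    (1, 0.100, 48, 1),
    (1, 1.000, 38, 4),
    (1, 5.000, 48, 10),
    (1, 100.000, 136, 100),
    (1, 1024.000, 1057, 1024),
    (3, 0.001, 53, 3),
    (3, 0.100, 58, 3),
    (3, 1.000, 62, 12),
    (3, 5.000, 85, 30),
    (3, 100.000, 364, 300),
    (3, 1024.000, 3142, 3072),
]


def overhead(total: int, writes: int) -> int:
    """Transactions that were not Writes."""
    return total - writes


for files, size_mb, total, writes in rows:
    extra = overhead(total, writes)
    print(f"{files} x {size_mb:>8.3f} MB: overhead={extra:>2}, per file={extra / files:.1f}")
```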

Breakdown of Transaction Types

| OperationName | Count | % of all |
|---|---|---|
| QueryInfo | 116 | 2% |
| QueryDirectory | 46 | 1% |
| Create | 212 | 4% |
| TreeConnect | 21 | 0% |
| Ioctl | 81 | 2% |
| Close | 103 | 2% |
| SetInfo | 48 | 1% |
| Write | 4560 | 88% |
| TreeDisconnect | 19 | 0% |
Thanks to this tool for converting Excel to Markdown!
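For anyone who wants to check the percentages in the breakdown above, here is a small Python sketch that recomputes each operation's share from the counts. The counts are copied from the breakdown; everything else is illustrative.

```python
# Operation counts copied from the breakdown above.
counts = {
    "QueryInfo": 116,
    "QueryDirectory": 46,
    "Create": 212,
    "TreeConnect": 21,
    "Ioctl": 81,
    "Close": 103,
    "SetInfo": 48,
    "Write": 4560,
    "TreeDisconnect": 19,
}

total = sum(counts.values())
# Share of all logged operations, as a percentage.
shares = {op: 100 * n / total for op, n in counts.items()}

# List operations from most to least frequent.
for op, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{op:<15} {counts[op]:>5} {pct:5.1f}%")
```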

Observations

  • As I suspected, writing files of different sizes does not result in the same number of Write transactions.
  • For large files, it looks like each 1 MB written counts as 1 Write transaction.
  • For some reason, adding a file of exactly 1 MB results in 4 Write transactions, and adding a 5 MB file results in 2 Write transactions per MB (10 in total). I saw this consistently whether I added 1 file at a time or 3.
  • Files smaller than 1 MB result in a minimum of 1 Write transaction.
  • Some overhead can be saved by adding multiple files at a time.
  • The bulk of transactions were Writes. Considering that many of the overhead transactions, like QueryDirectory or TreeConnect, cost even less than Writes, there may not be much cost savings in trying to reduce that overhead even more.
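The observations above suggest a rough rule of thumb: about 1 Write per MB, with a minimum of 1. The Python sketch below encodes that rule and compares it with what I measured. It is only an approximation drawn from these few data points (the function name `estimated_writes` is mine), and it deliberately shows where the rule misses: the 1 MB and 5 MB files.

```python
import math


def estimated_writes(size_mb: float) -> int:
    """Rule of thumb from the observations: about 1 Write per MB, minimum 1."""
    return max(1, math.ceil(size_mb))


# Writes measured for a single file of each size (from the results table).
observed = {0.001: 1, 0.1: 1, 1.0: 4, 5.0: 10, 100.0: 100, 1024.0: 1024}

for size_mb, writes in observed.items():
    guess = estimated_writes(size_mb)
    note = "" if guess == writes else "  <- rule of thumb misses here"
    print(f"{size_mb:>8} MB: estimated {guess:>4}, observed {writes:>4}{note}")
```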

So to answer my original questions at the top of this post, here is what I learned:

Larger files result in more Write transactions. Overhead transactions cost less than Writes and are a small percentage of the overall transactions. Therefore, I do not believe I will save much money by adding fewer, larger files as opposed to many smaller files.
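To make that reasoning concrete, here is a hedged Python sketch that puts a ceiling on what the overhead in my experiments could have cost. It sums the columns of the results table and prices every overhead transaction at the Write rate quoted earlier in the post, which overstates the cost because overhead operations are cheaper than Writes.

```python
WRITE_PRICE_PER_10K_USD = 0.015  # Write price quoted earlier in the post (assumption)

# "# of Transactions" and "# of Writes" columns from the results table.
transactions = [34, 48, 38, 48, 136, 1057, 53, 58, 62, 85, 364, 3142]
writes = [1, 1, 4, 10, 100, 1024, 3, 3, 12, 30, 300, 3072]

total_transactions = sum(transactions)
overhead_transactions = total_transactions - sum(writes)
overhead_share = overhead_transactions / total_transactions

# Pricing overhead at the Write rate overstates its cost, so this is an upper bound.
overhead_cost_ceiling = overhead_transactions / 10_000 * WRITE_PRICE_PER_10K_USD

print(f"Overhead transactions: {overhead_transactions} of {total_transactions} "
      f"({overhead_share:.1%})")
print(f"Upper bound on overhead cost: ${overhead_cost_ceiling:.6f}")
```

Even if every overhead transaction could be eliminated, the savings across all twelve experiments would be a fraction of a cent.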

I also learned that by viewing the diagnostic logs, I can see transactions per file share.
[Image: transactionsperfileshare]
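If you export the log rows and want to count them per share outside of Log Analytics, the grouping could look like the Python sketch below. It assumes each row carries a UNC-style path like the one shown in the Set Up section; the exact column name and path format in your logs may differ. The sample paths, file names, and the second share (fileshare2) are hypothetical.

```python
from collections import Counter


def share_name(unc_path: str) -> str:
    """Return the share segment of a UNC path like \\\\host\\share\\file."""
    parts = [p for p in unc_path.split("\\") if p]
    return parts[1] if len(parts) > 1 else ""


# Hypothetical sample paths; real values would come from the exported log rows.
paths = [
    r"\\transactiontests.file.core.windows.net\fileshare1\small.bin",
    r"\\transactiontests.file.core.windows.net\fileshare1\large.bin",
    r"\\transactiontests.file.core.windows.net\fileshare2\other.bin",
]

# Count one transaction per logged path, grouped by share.
per_share = Counter(share_name(p) for p in paths)
for share, count in per_share.items():
    print(f"{share}: {count} transactions")
```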

Conclusion

I now have a better understanding of how Azure Files transactions happen. Obviously there are more experiments I could do, like reading, modifying, and deleting files. I am reasonably confident that the transaction counts will be similar in those situations as well; for example, deleting large files will take more transactions than deleting small ones.

I hope this has been helpful!
