David Pazdera

Using Azure ExpressRoute for online data transfers

Introduction

Let's imagine this scenario: you have built a hybrid network between your corporate network and Azure using Azure ExpressRoute, and now you would like to leverage this private, high-bandwidth connectivity for various purposes.

The first scenario that comes to mind is data migration: for example, you want to move your 500 TB archive to the cloud.

You have read all about Azure Blob Storage and its tiering model, and you think this can be the right solution.

Since you now have both your "source" and "destination", the next logical step is to figure out "how".

On a high level, you have two options:

  • online data transfer
  • offline data transfer

The first category is represented by tools (AzCopy or the Azure Storage SDK) and Azure services (like Azure Data Factory), and it requires good network bandwidth (and time). The second category is dominated by the Azure Data Box product family.
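To get a feel for the "time" part, a back-of-the-envelope calculation helps. Here is a rough sketch, assuming a fully saturated, dedicated 1 Gbps link with no protocol overhead (which real transfers will not achieve):

```shell
# Rough lower bound for moving 500 TB over a 1 Gbps link
# (decimal units; real-world throughput will be lower)
awk 'BEGIN {
  tb = 500; gbps = 1
  bits = tb * 8 * 10^12          # 500 TB expressed in bits
  secs = bits / (gbps * 10^9)    # seconds at full line rate
  printf "%.1f days\n", secs / 86400
}'
# Prints roughly 46 days at the theoretical maximum
```

In practice you should budget noticeably more, which is exactly why the advisor below asks about your available bandwidth.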

Data Transfer Advisor

If you need help making the right choice, you can leverage a fairly new experience in the Azure portal. When you provision a new storage account (or open an existing one), you will see the "Data Transfer" blade, where you fill in three attributes - estimated data size, available network bandwidth, and transfer frequency - and based on your input you will get a recommendation similar to this one:

DataTransfer

Now you feel confident that the online method with AzCopy is the best option for you, especially since you have this nice 1 Gbps private "pipe" to Azure.

A little catch

So far so good? Well, the moment you start planning your data transfer in detail, you will (most likely) realize that the Azure Storage service exposes several public endpoints; for Blob storage, it will look something like this:
https://cs44deib9068be17x4267xa4b.blob.core.windows.net/

Unless you have configured something called Microsoft peering on your ExpressRoute circuit, all traffic from your network to those public endpoints will be routed over your Internet connection (router) and not via ExpressRoute!
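You can see this for yourself from an on-premises machine. As a quick check (the storage account name below is the illustrative one from above):

```shell
# Resolve the blob endpoint: without Private Link this returns a
# public Azure IP, so traffic to it follows your default Internet route
nslookup cs44deib9068be17x4267xa4b.blob.core.windows.net

# On Windows, a traceroute to that endpoint will show hops through
# your Internet router rather than the ExpressRoute path
tracert cs44deib9068be17x4267xa4b.blob.core.windows.net
```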

Private Link to the rescue

Do not despair. Microsoft launched a new service called Private Link that allows you to create a "private endpoint" representing a specific instance of a PaaS service like Azure Storage and make it available (presented as a network interface card with a private IP) inside your virtual network.

PrivateLink

This has several benefits, like the ability to completely close the public endpoints of your PaaS instance (blocking any access from the Internet) and to make the instance available not only from within your VNet, but also from all peered VNets as well as from your corporate network.

How does this help in our data transfer scenario? A lot. The private endpoint that I will attach to my target Blob storage will have a private IP address belonging to an IP range that is advertised via BGP over my ExpressRoute connection to my corporate network. In other words, if I use a tool like AzCopy or Azure Storage Explorer from a computer inside my network and target such a private endpoint, the connection will utilize my ExpressRoute connection and its bandwidth. This is exactly what we wanted :)

Solution design

This is what our solution will look like in the end.

SolutionDesign

This is what we need to do, assuming that all the networking bits and pieces (virtual network, ExpressRoute circuit with private peering, ExpressRoute gateway) have been provisioned and configured in advance:

First, create a new storage account in the same region as your virtual network. In the 'Networking' part of the wizard, select the 'Private endpoint' option:

Wizard1

Populate all important parameters of the new private endpoint and link it with your VNet:

Wizard2

You can keep the 'Integrate with private DNS zone' option enabled so the wizard also creates a private DNS zone (part of the Azure DNS service offering). However, depending on your scenario, making this private zone available from your on-premises network requires additional configuration and careful planning. We will use a simpler option to make it work for our use case.

Wizard3

Depending on what you provisioned in which resource group, the final result could look like this:

Wizard4
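If you prefer scripting over the portal wizard, the same resources can be provisioned with the Azure CLI. This is a minimal sketch, not the article's exact setup: the resource group, VNet, subnet, and endpoint names are illustrative placeholders, and older CLI versions use `--group-ids` instead of `--group-id`:

```shell
# Create the target storage account in the same region as the VNet
az storage account create \
  --name mystorageaccount \
  --resource-group rg-migration \
  --location westeurope \
  --sku Standard_LRS \
  --kind StorageV2

# Look up its resource ID for the private endpoint connection
SA_ID=$(az storage account show \
  --name mystorageaccount \
  --resource-group rg-migration \
  --query id --output tsv)

# Create the private endpoint for the blob sub-resource inside the VNet
az network private-endpoint create \
  --name pe-mystorageaccount-blob \
  --resource-group rg-migration \
  --vnet-name vnet-hub \
  --subnet snet-endpoints \
  --private-connection-resource-id "$SA_ID" \
  --group-id blob \
  --connection-name pe-conn-blob
```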

End-to-end testing

There is one more step needed to complete the setup. As I mentioned above, name resolution is a complex topic in hybrid DNS scenarios, where in many cases you are bringing your existing DNS to Azure VNets while trying to utilize Azure DNS private zones. I will leave this topic for another article :).

For now, we will use the simplest option available: we will modify the hosts file on our source server (e.g. the machine where our archive is mounted) and add an entry that resolves our blob endpoint to the private IP address of our private endpoint. You can get both values from the private endpoint resource:

PrivateEndpoint

There are many articles describing how to change hosts file, so I won't describe the process here.
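For reference, the entry itself is a single line. This sketch assumes an illustrative private IP of 10.1.2.4; use the actual FQDN and IP shown on your private endpoint resource:

```
# C:\Windows\System32\drivers\etc\hosts (edit as Administrator)
10.1.2.4    cs44deib9068be17x4267xa4b.blob.core.windows.net
```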

After adding that entry, you should be able to verify the resolution using nslookup.
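The check could look like this (name and IP are the illustrative values from above):

```shell
# Should now return the private endpoint's IP (e.g. 10.1.2.4)
# instead of a public Azure address
nslookup cs44deib9068be17x4267xa4b.blob.core.windows.net
```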

What is left is testing the actual upload of the data. Follow these steps:

  • authenticate to Azure using the AzCopy login command. Alternatively, you can use a SAS token (create one in your target storage account and append it to the target URI).
  • upload a directory with files (change both the source drive/folder and the destination URI): azcopy copy 'C:\myDirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive
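Before committing to the full 500 TB, it can be worth validating the achievable throughput over the private path first. AzCopy has a built-in benchmark mode that uploads auto-generated test data and reports the throughput; a sketch with illustrative parameters (authenticate first, or append your SAS token to the URL):

```shell
# Authenticate interactively with Azure AD
azcopy login

# Upload benchmark against the target container; azcopy generates
# (and cleans up) the test data itself and reports throughput
azcopy bench 'https://mystorageaccount.blob.core.windows.net/mycontainer' \
  --file-count 100 --size-per-file 1G
```

If the reported numbers are far below your ExpressRoute bandwidth, revisit the name resolution step above to confirm traffic is actually taking the private path.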

Conclusion

This might look like a lot of work, but provisioning and configuring this handful of resources took me approximately 15 minutes (again, assuming that all the "plumbing" was done beforehand), so it is definitely worth it.
