Comparing the Performance of Azure Table Storage with Other Repositories

I have been using Azure Table Storage (ATS) in a couple of my personal projects, and I just love it. It is simple, its performance is decent, and the storage is quite cheap. A NoSQL key-value store like ATS is just perfect for storing lots of unrelated records such as audit and error entries. In our case, around 70% of our data falls into this category.
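For illustration, here is a hypothetical sketch of what such a record can look like with the .NET storage client (the AuditRecord name and its properties are mine; ATS itself only mandates a PartitionKey and a RowKey):

using System;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical audit entity: partitioned by day, row key time-ordered and unique.
// Beyond PartitionKey/RowKey, the properties are schema-free.
public class AuditRecord : TableEntity
{
    public AuditRecord() { }   // parameterless ctor required by the storage client

    public AuditRecord(string userName, string action)
    {
        PartitionKey = DateTime.UtcNow.ToString("yyyyMMdd");
        RowKey = DateTime.UtcNow.Ticks.ToString("d19") + "-" + Guid.NewGuid();
        UserName = userName;
        Action = action;
    }

    public string UserName { get; set; }
    public string Action { get; set; }
}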

I had the perception that ATS was not that fast, but I did not notice much impact on the performance of the site; in any case, the audit and error reporting operations were asynchronous. Probably the only major drawback during the project was ATS's poor API – I still cannot get over its anemic LINQ support.

During the architecture definition of a new web-based project, I started to consider other options for data storage. This new project required a data model with far more relations between entities – a productive API was key, although I wanted to stick to a NoSQL store for future-proof scalability.

One of the serious options we started to contemplate was MongoDB. I had played with it briefly in the past, but nothing serious. I knew that its LINQ support was phenomenal and that it offered the option of growing to massive scale thanks to sharding and replica sets, but what about performance? How would ATS read/write performance compare against MongoDB, or Azure SQL?
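To give an idea of what that LINQ support looks like, here is a minimal sketch using the 1.x C# driver that was current at the time (the connection string, database, collection and the Employee POCO are all made up for illustration):

using System.Linq;
using MongoDB.Driver;
using MongoDB.Driver.Linq;   // AsQueryable() lives here in the 1.x driver

var client = new MongoClient("mongodb://my-mongo-vm:27017");   // hypothetical endpoint
var employees = client.GetServer()
                      .GetDatabase("hr")
                      .GetCollection<Employee>("employees");   // Employee: a plain POCO

// An ordinary LINQ query; the driver translates it into a native MongoDB query.
var topEarners = employees.AsQueryable()
                          .Where(e => e.Salary > 50000)
                          .OrderByDescending(e => e.Salary)
                          .Take(10)
                          .ToList();

Try doing that against ATS, where most LINQ operators are simply not supported.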

I built a simple MVC 5 application and deployed it in an Azure Web Role (XS). Using the ATS .NET Storage Client Library 2.2, I built a simple page that writes 100 records into an ATS table, and another page that reads back each one of the records in that same table. The average latency of each write and read operation is displayed. I based my application on the tutorials and walkthroughs available from Microsoft. The idea was to build it with the techniques available, without any tuning.
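The measurement itself was nothing fancy. A minimal sketch of the kind of loop involved (connStr and the EmployeeEntity constructor are the same ones used in the insert code shown later):

using System.Diagnostics;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Time 100 individual inserts and compute the average latency.
CloudTable table = CloudStorageAccount.Parse(connStr)
                                      .CreateCloudTableClient()
                                      .GetTableReference("Employees");
table.CreateIfNotExists();

Stopwatch watch = Stopwatch.StartNew();
for (int i = 0; i < 100; i++)
{
    table.Execute(TableOperation.Insert(new EmployeeEntity(i, "emp" + i, 1000.0 + i)));
}
watch.Stop();
double avgWriteMs = watch.ElapsedMilliseconds / 100.0;

// Reads are timed the same way, using point lookups:
// table.Execute(TableOperation.Retrieve<EmployeeEntity>(partitionKey, rowKey));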

I did the same thing with Azure SQL, provisioning the smallest database I could. I used plain old fast ADO.NET DataReaders to implement the operations.
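Nothing exotic there either; the read side is the classic pattern (table and column names are illustrative):

using System.Data.SqlClient;

// Plain ADO.NET: open a connection, run a parameterized query, stream rows with a DataReader.
using (var conn = new SqlConnection(sqlConnStr))   // your Azure SQL connection string
{
    conn.Open();
    using (var cmd = new SqlCommand("SELECT Id, Name, Salary FROM Employees WHERE Id = @id", conn))
    {
        cmd.Parameters.AddWithValue("@id", 42);
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                string name = reader.GetString(1);
                double salary = reader.GetDouble(2);
            }
        }
    }
}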

For MongoDB, I launched an Extra Small VM running CentOS Linux – a 1 GHz CPU, 768 MB RAM machine. MongoDB was deployed with default settings.

Naturally, all these resources were located in the same region (West US). This diagram summarizes the topology:

[Figure: Architecture of the test topology]

And…these are the results:

[Figure: Performance results – average read/write latency per store]

Yes, the performance of plain-vanilla ATS is just disappointing. After some research I found blog posts with similar findings, which showed how to improve performance by turning off the Nagle algorithm before making the calls:

public static void InsertRandomEmployeeData()
{
    // Requires: System.Configuration, System.Net,
    // Microsoft.WindowsAzure.Storage and Microsoft.WindowsAzure.Storage.Table
    string connStr = ConfigurationManager.ConnectionStrings["ConnString"].ConnectionString;

    // For increased perf, turn off the Nagle algorithm before making the calls.
    // http://alexandrebrisebois.wordpress.com/2013/03/24/why-are-webrequests-throttled-i-want-more-throughput/
    ServicePointManager.UseNagleAlgorithm = false;

    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connStr);
    CloudTableClient client = storageAccount.CreateCloudTableClient();
    CloudTable table = client.GetTableReference("Employees");
    table.CreateIfNotExists();

    // Insert a single entity populated with random test data.
    var emp = new EmployeeEntity(GenerateRandomInt(1, 10000), GenerateRandomString(0), GenerateRandomDouble(1.0, 100000.0));
    TableOperation insertOp = TableOperation.Insert(emp);
    table.Execute(insertOp);
}

The performance benefit is impressive – I wonder why this is not the default setting. Note how the read operations were not affected by this change.

The performance of Azure SQL operations was really good (under 10 ms on average), but the winner, as you can see, was MongoDB – impressive, with both operations under 2 ms!

Well, that was an eye-opener. It is pretty obvious what we are going to use for our next projects. Unfortunately, neither Azure nor Amazon Web Services offers a managed MongoDB service at this time, so I would need to set up and maintain my own set of VMs running MongoDB – not a big deal, but something I would need to pay for on top of the storage.

Cheers, see you next time amigos.

Deploying an Entire Environment using Azure and PowerShell

The IaaS capabilities of Azure can be very handy when you need to create temporary development/test environments during the SDLC. Automating the creation and clean-up of these environments can save a lot of time and compute cost ($$).

Azure exposes a PowerShell-based interface to automate all the steps required to do this, and I spent some time researching how to do it properly. You will find many references and blog posts on how to create VMs using the Azure PowerShell API; however, I did not find many updated, accurate references on how to do it for an entire environment – probably because the API has evolved so quickly that those articles are no longer relevant. The results are summarized in the following script, which demonstrates the creation of a standard deployment of an enterprise multi-tier environment (web front-end, application server and database server).

$ErrorActionPreference = "Stop"   # stop on any error

function GetLatestImage($family){
    $images = Get-AzureVMImage `
        | where { $_.ImageFamily -eq $family } `
        | Sort-Object -Descending -Property PublishedDate

    $latestImage = $images[0]
    return $latestImage
}

# Environment variables are defined here:
# ONLY LOWERCASE LETTERS HERE!!
$EnvironmentName = "azrtest"
# Storage account used for the VHDs (create it through the Portal, e.g. vmstorageazrtest)
$StorageAccount = "vmstorage$EnvironmentName"

Write-Host $StorageAccount

$AzurePubSettingsFile = "C:\MyStuff\MyDrop\Dropbox\Personal\Windows Azure MSDN - Visual Studio Ultimate-12-19-2013-credentials.publishsettings"
$VMSize = "Small"
$Location = "Southeast Asia"
$AdminUserName = "admin2K"
$AdminPwd = "password2K"
$OSFamily = "Windows Server 2008 R2 SP1"

$server_A_Name = "WFEServer"
$server_B_Name = "DBServer"
$server_C_Name = "AppServer"

# This must be unique
$CloudServiceName = "vmstorageazrtest"

# If no storage account exists yet, create one here (or remove it):
#New-AzureStorageAccount -StorageAccountName $StorageAccount -Location $Location -Label "azrtest"
#Remove-AzureStorageAccount -StorageAccountName $StorageAccount

# Configure the subscription
Import-AzurePublishSettingsFile $AzurePubSettingsFile
#Get-AzureStorageAccount | Select Label
Set-AzureSubscription -SubscriptionName "Windows Azure MSDN - Visual Studio Ultimate" -CurrentStorageAccount $StorageAccount

# Resolve the latest VM image for the chosen OS family
$Image = GetLatestImage $OSFamily
$ImageName = $Image.ImageName

# Create the Azure Cloud Service
New-AzureService -ServiceName $CloudServiceName -Location $Location

# Create the three VMs. You can create a VM in an existing Windows Azure cloud
# service, or create a new cloud service by using the Location parameter.
New-AzureQuickVM -Windows -ServiceName $CloudServiceName -Name $server_A_Name -ImageName $ImageName -InstanceSize $VMSize -Password $AdminPwd -AdminUsername $AdminUserName -Verbose
New-AzureQuickVM -Windows -ServiceName $CloudServiceName -Name $server_B_Name -ImageName $ImageName -InstanceSize $VMSize -Password $AdminPwd -AdminUsername $AdminUserName -Verbose
New-AzureQuickVM -Windows -ServiceName $CloudServiceName -Name $server_C_Name -ImageName $ImageName -InstanceSize $VMSize -Password $AdminPwd -AdminUsername $AdminUserName -Verbose

 

 

This sample script creates three VMs running “Windows Server 2008 R2 SP1” – a WFE, an App Server and a DB Server – all grouped in the same Azure Cloud Service: $CloudServiceName.

If you plan to use it, make sure you edit the environment variables at the top of the script:

  • The environment name: $EnvironmentName
  • The storage account used to store the VHDs: $StorageAccount
  • Location of the Azure publish settings file: $AzurePubSettingsFile
  • Size of the VMs: $VMSize
  • Location – use Get-AzureLocation for a list of locations: $Location
  • Admin username and password – the local account you will use to remote into the VMs: $AdminUserName and $AdminPwd
  • OS: $OSFamily

To shut down and clean up the VMs created, you can use the following script:

# CleanUp

# Environment variables are defined here:
# ONLY LOWERCASE LETTERS HERE!!
$EnvironmentName = "azrtest"
# Storage account created through the Portal (vmstorageazrtest)
$StorageAccount = "vmstorage$EnvironmentName"

Write-Host $StorageAccount

$AzurePubSettingsFile = "C:\MyStuff\MyDrop\Dropbox\Personal\Windows Azure MSDN - Visual Studio Ultimate-12-19-2013-credentials.publishsettings"

$server_A_Name = "WFEServer"
$server_B_Name = "DBServer"
$server_C_Name = "AppServer"
# This must be unique
$CloudServiceName = "vmstorageazrtest"

# Configure the subscription
Import-AzurePublishSettingsFile $AzurePubSettingsFile
#Get-AzureStorageAccount | Select Label
Set-AzureSubscription -SubscriptionName "Windows Azure MSDN - Visual Studio Ultimate" -CurrentStorageAccount $StorageAccount

# Stop & remove the VMs (-Force avoids the prompt when stopping the last VM in the deployment)
foreach ($serverName in @($server_A_Name, $server_B_Name, $server_C_Name)) {
    $vm = Get-AzureVM -ServiceName $CloudServiceName -Name $serverName
    if ($vm) {
        Stop-AzureVM -ServiceName $CloudServiceName -Name $serverName -Force
        Remove-AzureVM -ServiceName $CloudServiceName -Name $serverName
    }
}

# Remove the Azure Cloud Service, if it still exists
$azureService = Get-AzureService -ServiceName $CloudServiceName
if ($azureService) {
    Write-Host "Cloud service $CloudServiceName found, deleting it..."
    Remove-AzureService -ServiceName $CloudServiceName -Force
}

# Remove the storage account (the vhds container must go first)
Remove-AzureStorageContainer -Name vhds -Force
Remove-AzureStorageAccount -StorageAccountName $StorageAccount

Some aspects are not fully covered in this version of the script:

  • Networking: the VMs will be able to talk to each other, but we currently have no control over the addresses assigned to them.
  • AD: the VMs are created as standalone servers, not joined to an AD domain.

I hope you find this helpful and time-saving.

Beefing-up the Azure Platform

During PDC 2010 in Redmond, WA, Microsoft announced a bunch of improvements to the whole Azure platform, some of them desperately needed:

  • Support for the new Virtual Machine role, in addition to the existing Web and Worker roles. This enables IaaS-style scenarios, where you can build, configure and upload your own Windows Server 2008 R2 VMs as VHDs – quite similar to the AWS model. (Great!!!) In addition, the pricing model for the Windows Azure VM role is the same as the existing pricing model for Web and Worker roles.
  • Enhancements to the Web and Worker roles, with the introduction of Elevated Privileges and Full IIS support!!! – so we can now have multiple IIS sites per Web role and the ability to install IIS modules. (Cool!!)
  • Windows Azure will also provide Remote Desktop functionality, which enables customers to connect to a running instance of their application or service to monitor activity and troubleshoot common problems. So basically, your Azure compute instances are no longer black boxes. (Finally!!!! OMG, I am going to cry…)
  • The introduction of an Extra Small Windows Azure instance – great!!, now you can configure an instance to run low-priority worker roles or admin apps without ruining your budget:
    Compute Instance Size | CPU     | Memory | Instance Storage | I/O Performance | Cost per hour
    Extra Small           | 1.0 GHz | 768 MB | 20 GB            | Low             | $0.05
  • A range of new networking functionality was introduced under the Windows Azure Virtual Network name. Windows Azure Connect (formerly Project Sydney), which enables a simple and easy-to-manage mechanism to set up IP-based network connectivity between on-premises and Windows Azure resources, is the first Virtual Network feature Microsoft will make available as a CTP later this year. With this, you can establish a VPN between your on-premises servers and your cloud machines – much needed for some enterprise scenarios.
  • The Windows Azure portal will also be improved with Silverlight technologies, and with access to new diagnostic information, including the ability to click on a role to see its type and deployment time. (Finally, for god’s sake!!!)
  • A much-needed update to the pretty basic Database Manager for SQL Azure (formerly “Project Houston”) was also announced.

Let’s hope these enhancements are released as soon as possible.

Large DBs in SQL Azure

Recently, one of my customers asked me this question: “Based on the updated SQL Azure plans, the maximum database size is now 50 GB. What if my DB requires more storage?”

The first recommendation would be: measure how your DB is growing, and (if possible) keep only the most relevant information in it – SSIS is a great option for moving all that historic data down to your on-premises servers. Another option is Data Sync. Some good articles on measuring your DB size are listed below (a quick sketch of that size check follows them):

How to Tell If You Are Out of Room – SQL Azure Team Blog – Site Home – MSDN Blogs

Calculating the Size of Your SQL Azure Database
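If I recall correctly, both articles boil down to querying sys.dm_db_partition_stats. Here is a minimal sketch of running that check from .NET (the 8.0/1024 factor converts 8 KB pages to MB; sqlAzureConnStr is your own connection string):

using System;
using System.Data.SqlClient;

// Reserved size of the current SQL Azure database, in MB.
const string sizeQuery =
    "SELECT SUM(reserved_page_count) * 8.0 / 1024 FROM sys.dm_db_partition_stats";

using (var conn = new SqlConnection(sqlAzureConnStr))
{
    conn.Open();
    using (var cmd = new SqlCommand(sizeQuery, conn))
    {
        double sizeMb = Convert.ToDouble(cmd.ExecuteScalar());
        Console.WriteLine("Database size: {0:F2} MB", sizeMb);
    }
}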

Well, according to Microsoft, 50 GB is the maximum size, and if you need more space you will need to partition your data (either horizontally or vertically). Unfortunately, SQL Azure won’t help you much with this, and you will need to make changes in your app logic to handle it. This should be done in your Data Access Layer, and let me warn you: it will not be an easy process to implement. The following articles can give you some insight into the details and limitations of this process (a small sketch of the routing logic follows the links):

SQL Azure Horizontal Partitioning – Part 2 – SQL Azure Team Blog – Site Home – MSDN Blogs

Scaling out with SQL Azure – TechNet Articles – Home – TechNet Wiki
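To make the horizontal option concrete, here is a minimal, hypothetical sketch of the kind of routing logic your Data Access Layer ends up owning (the shard list, the GetConnectionStringForCustomer name and the modulo scheme are purely illustrative):

using System.Collections.Generic;
using System.Data.SqlClient;

// Hypothetical shard map: each SQL Azure database holds a slice of the customers.
static readonly List<string> ShardConnectionStrings = new List<string>
{
    "Server=tcp:shard0.database.windows.net;Database=AppShard0;...",
    "Server=tcp:shard1.database.windows.net;Database=AppShard1;...",
    "Server=tcp:shard2.database.windows.net;Database=AppShard2;..."
};

// Every query must be routed to the right shard. A stable modulo works only
// while the shard count never changes; re-sharding live data is the hard part.
static string GetConnectionStringForCustomer(int customerId)
{
    return ShardConnectionStrings[customerId % ShardConnectionStrings.Count];
}

static SqlConnection OpenConnectionForCustomer(int customerId)
{
    var conn = new SqlConnection(GetConnectionStringForCustomer(customerId));
    conn.Open();
    return conn;
}

Note how queries that span customers (reports, aggregates) now have to fan out across all shards and merge the results in the application – that is the kind of app-logic change I was referring to.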