Hyper Scale Parallel Processing with Azure Batch

Microsoft has released another new service in preview – Azure Batch.  Batch is a job scheduling framework that lets you distribute jobs across dozens, hundreds, or thousands of computers on demand.

Here is a great scenario where Batch would be awesome – we have an insurance client with a proprietary 500-core computing grid, running entirely on premises, that calculates actuarial tables for its various insurance plans.  It refreshes its plans once a month and needs the intense computing power to turn incoming actuals into policy rates.

Imagine that instead of building your own 500-core cluster and paying for proprietary grid frameworks, you could just rent one by the hour.  This is essentially what Microsoft Azure provides with Batch, coupled with an easy-to-use .NET or REST API.

Parallel tasks

The key value proposition with Batch is the same as any cloud service – you pay by the minute and you can dynamically scale the number of cores you need up and down.  Each instance in your resource pool of VMs costs as little as $0.008 / hr – that means you could be running batch jobs across 1,000 virtual machines at $8 / hr.  If you need a high performance compute environment, you can rent a cluster of A8 instances, each with 8 cores and 56 GB of RAM, for $0.0317 / hr – go rent 1,000 for just $31.70 / hr!
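To make the arithmetic concrete, here is a minimal sketch of the cost calculation.  The prices are simply the per-instance figures quoted in this post, not live Azure rates, and the function names are made up for illustration.

```python
from decimal import Decimal


def pool_cost_per_hour(instance_count: int, price_per_instance_hour: str) -> Decimal:
    """Hourly cost of a pool: number of instances times the hourly price of one."""
    # Prices are passed as strings so Decimal keeps them exact (no float rounding).
    return Decimal(price_per_instance_hour) * instance_count


def job_cost(instance_count: int, price_per_instance_hour: str, hours: str) -> Decimal:
    """Total cost of keeping the pool up for a given number of hours."""
    return pool_cost_per_hour(instance_count, price_per_instance_hour) * Decimal(hours)


if __name__ == "__main__":
    # Figures quoted in this post (assumed, not current pricing).
    small = pool_cost_per_hour(1000, "0.008")
    a8 = pool_cost_per_hour(1000, "0.0317")
    print(f"1,000 small instances: ${small:.2f} / hr")
    print(f"1,000 A8 instances:    ${a8:.2f} / hr")
```

With the prices quoted here, the two pools work out to $8 / hr and $31.70 / hr, and job_cost extends the same sum to a run of any length.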

Batch is designed for running large volumes of tasks programmatically where there is an opportunity to go massively parallel with data processing.  Good examples of such scenarios include:

  • Image or video processing
  • 3D rendering
  • Software testing
  • Indexing of files
  • Actuarial analysis
  • Risk modeling
  • Weather analysis
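
What these scenarios share is that the work splits into many independent tasks.  The sketch below shows that shape on a single machine using Python's standard thread pool.  It is only a local analogy and does not use the Batch API; the file-indexing task and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def index_document(text: str) -> dict[str, int]:
    """Hypothetical task: count how often each word occurs in one document."""
    counts: dict[str, int] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def index_all(documents: list[str], workers: int = 4) -> list[dict[str, int]]:
    """Run one independent task per document and collect results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # No task depends on another, so they can run in any order or in parallel.
        return list(pool.map(index_document, documents))


if __name__ == "__main__":
    docs = ["rates and actuals", "actuals drive rates", "weather risk"]
    for doc, index in zip(docs, index_all(docs)):
        print(doc, "->", index)
```

Because no task needs another task's result, the same work could in principle be spread across many machines instead of a few local threads, which is the kind of job Batch is aimed at.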

Creating a new Batch account is easy – you do it right through the Azure portal.

Create a Batch account