New Elastic Database Pools in Azure SQL Targeted to SaaS Developers

Microsoft launched in preview this week a new Elastic Database Pool model specifically designed to support SaaS developers building massively multi-tenant applications.  In these scenarios, SaaS application vendors provision a database per client, so with thousands of clients, each one has its own dedicated Azure SQL database.

The challenge in this scenario is that each database is charged on a per-unit basis with its own pool of resources.  In many cases, databases do not use up their allotted resources, so the SaaS developer is forced to over-provision and is therefore overcharged for each client’s database.

image

With the Elastic Database Pool model, the SaaS developer can now buy a pool of resources that spans thousands of databases.  In a similar way to packing virtual machines on a host to optimize resources, the Elastic Database Pool allows you to pack databases into a shared pool of database resources by averaging the peaks and valleys across many databases.  The efficiency gain depends on the variability of the databases’ usage and the number of databases in the pool.  The ideal targets for the Elastic Database Pool model are applications with hundreds of provisioned databases that are not constantly used and have lots of variability in usage patterns.

Although you pay more per performance unit than with a dedicated database, this can result in significant savings in scenarios where there are a large number of under-utilized database instances, as the pool averages out the usage across all the instances.  If the Elastic Pool runs out of resources, the SaaS developer can easily scale up the entire pool.
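To make the pooling argument concrete, here is a minimal sketch in Python.  It uses made-up usage numbers in unspecified "performance units" (not real Azure SQL pricing or metrics) and compares the capacity you would need if each database were provisioned for its own peak against the capacity a shared pool needs to cover the busiest hour across all databases.  The function names are hypothetical.

```python
import random

def simulate_usage(num_databases, num_hours, seed=0):
    """Per-database hourly usage: mostly idle with occasional bursts (made-up numbers)."""
    rng = random.Random(seed)
    usage = []
    for _ in range(num_databases):
        series = [100 if rng.random() < 0.05 else 5 for _ in range(num_hours)]
        usage.append(series)
    return usage

def dedicated_capacity(usage):
    """Each database is provisioned for its own peak, so the peaks add up."""
    return sum(max(series) for series in usage)

def pooled_capacity(usage):
    """A pool only has to cover the highest combined load in any single hour."""
    return max(sum(hour) for hour in zip(*usage))

usage = simulate_usage(num_databases=200, num_hours=24 * 7)
print("Sum of individual peaks:", dedicated_capacity(usage))
print("Peak of combined load:  ", pooled_capacity(usage))
```

The pooled figure can never exceed the sum of individual peaks, and the gap widens as usage becomes more bursty and less synchronized across databases.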

SharePoint 2016 Preview Release Now Available

Microsoft has made available for download the first preview release of SharePoint 2016.  The SharePoint 2016 IT Preview release, as it has been called, is now available to install.

SharePoint Server 2016 IT Preview has been designed, developed, and tested with the Microsoft Software as a Service (SaaS) strategy at its core. Drawing extensively from that experience, SharePoint Server 2016 IT Preview is designed to help you achieve new levels of reliability and performance and empower users while meeting their demands for greater business mobility.

Microsoft has also provided documents, just posted today, that describe the new features and installation steps for this new version of SharePoint.

Comparing IBM Watson Analytics with Azure ML

In a previous post, I compared the new IBM Watson Analytics with Power BI as a business intelligence and visualization tool.  Watson Analytics also includes a predictive analytics tool, so let’s compare it with Microsoft’s Azure Machine Learning service (Azure ML).

Watson Analytics is a Data Discovery Tool, Azure ML is a Pseudo Development Tool

The first thing you notice when using both tools is the difference in their target audience.  Azure ML is targeted at developers, data scientists and very advanced business users who want to build their own analytics pipelines.  Its user interface is similar to SQL Server Integration Services (SSIS) or BizTalk.  It provides the ability to chain inputs, actions and outputs together into a pipeline and to visualize the data that is being processed along the way.

In contrast, IBM Watson Analytics is trying to take all of that complexity away – you just upload your file and Watson Analytics analyzes your data and tries to provide the best pipeline for you under the covers and present the results.

Using a cleaned up data set of automobile pricing data, here is what a linear regression pipeline looks like in Azure ML.

image

This pipeline uses a linear regression algorithm and a Bayesian linear regression algorithm and compares their accuracy in predicting price from a set of existing features.

In contrast, with IBM Watson Analytics, you just upload your file and it takes care of the rest.

Azure ML is More Transparent and More Flexible

When you create a pipeline in Azure ML, you can pick and choose the algorithms that you want to run against your dataset.  If you understand the differences between linear regression, Bayesian linear regression and decision forest regression, Azure ML is the tool for you.  It also provides good error measurement to compare algorithms for their ability to predict against your dataset.  For each algorithm, you can also set various configuration parameters to tweak the algorithm and hopefully improve your model’s ability to predict reliably.  You can also create specific training sets for machine learning and separate datasets for testing.
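As a rough illustration of what comparing algorithms on a separate test set involves, here is a small sketch in plain Python.  It is not how Azure ML implements its modules: it fits an ordinary least squares line on one feature, uses a "predict the training mean" baseline as the second model, and scores both on held-out rows with mean absolute error and root mean squared error.  The rows are made up for illustration and are not the tutorial dataset.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Made-up rows of (engine size, price); purely illustrative.
rows = [(90, 7000), (100, 8200), (110, 9100), (120, 10500), (130, 11800),
        (140, 12900), (150, 14100), (160, 15600), (170, 16400), (180, 18000)]

# Alternate rows go to separate training and test sets.
train, test = rows[::2], rows[1::2]
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

slope, intercept = fit_line(train_x, train_y)
line_predictions = [slope * x + intercept for x in test_x]
baseline_predictions = [sum(train_y) / len(train_y)] * len(test_y)

for name, preds in [("least squares line", line_predictions),
                    ("mean baseline", baseline_predictions)]:
    print(name,
          "MAE:", round(mean_absolute_error(test_y, preds), 1),
          "RMSE:", round(root_mean_squared_error(test_y, preds), 1))
```

The model with the lower error on the test rows is the better predictor for this dataset, which is the same kind of comparison the error measurements support.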

In contrast, when you upload your file to IBM Watson Analytics, you are trusting IBM to pick the best algorithm for you.  The tool doesn’t show you which algorithms have been run or how they were configured until you start digging into the detail screens:

image

Watson Analytics Provides Guidance on What Drives Prediction

When you upload your dataset to Watson Analytics, it provides this nice visualization to show you the different features and how they influence the predictive ability of the model.

image

The tool also shows fields and how they are correlated.

image

Watson Analytics Provides Insights Into Your Data, But Doesn’t Actually Predict

After viewing these various charts and understanding my dataset, I was interested to see how IBM Watson Analytics performed against Azure ML in actually predicting the price.  However, this feature seems to be missing!

Once you see all the influencers in your dataset, there doesn’t seem to be any way to generate the predictive value.

The closest you can get seems to be a graph that shows the features as they influence the price and the average price for each combination of those features. 

In contrast, Azure ML will populate your dataset with a predicted price for each row.

image

Azure ML Allows for Exporting, IBM Watson Analytics Does Not Export

Once you have your predicted data, you’ll want to export it to either Excel, a database or some other visualization tool.  Azure ML provides many options for exporting data at any stage in the pipeline.

IBM Watson Analytics doesn’t support any exporting options at all.  The export feature is listed as “coming soon”.

image

Azure ML Supports R and Python

Azure ML supports injection of R or Python code into your pipelines for those advanced data scientists who are developing their own algorithms.  This allows for lots of interesting possibilities for transforming, scoring and evaluating data as it is moved through the pipeline.

Watson Analytics has no such feature – as a business-centric tool, it provides no ability to customize at all.

Azure ML Provides the Ability to Publish to a Web Service

Imagine you have done some in depth analysis and built a model that has amazing predictive power.  How do you now share this or monetize it?

Azure ML provides the ability to take your experiment or machine learning model and publish it as a production ready web service.  Using a REST API, your users can then supply inputs and receive a prediction as an output.  You can even take your model and publish to the Azure Marketplace and charge for the model you have developed.
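For a sense of what calling such a web service looks like from the consumer side, here is a hedged sketch using only the Python standard library.  The endpoint URL, API key, payload shape and feature names are all hypothetical placeholders; the real request schema is defined by the service you publish.  The sketch only builds the request and does not send it.

```python
import json
import urllib.request

def build_prediction_request(endpoint_url, api_key, features):
    """Build (but do not send) a JSON POST request for a prediction service."""
    body = json.dumps({"inputs": features}).encode("utf-8")  # hypothetical payload shape
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )

request = build_prediction_request(
    "https://example.invalid/score",      # hypothetical endpoint
    "YOUR-API-KEY",                       # hypothetical key
    {"make": "audi", "horsepower": 110},  # hypothetical feature names
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To call a live service, you would send the request, for example:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```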

Comparing Microsoft Power BI and IBM Watson Analytics

I just received an ad for IBM’s new cloud based BI Tool, IBM Watson Analytics.  Watson became famous initially for beating two champion Jeopardy contestants through its natural language processing. 

However, despite the brand, this IBM Watson Analytics service isn’t really related – it just seems to be a basic data exploration and visualization tool.  Watson Analytics combines aspects of Microsoft’s Azure ML (predictions and data science) and Power BI (visualizations and dashboards) to provide a business user targeted data exploration tool. 

Here is an initial comparison between Power BI and Watson Analytics.  Look for a comparison of Azure ML and Watson Analytics in a future post!

Price Comparison

Both Power BI and Watson Analytics have free versions and paid versions.  IBM has two paid levels, “Personal” at $30.00 US per month and “Professional” at $80.00 US per month.  The differences are based on the amount of data the service will support:

image

Power BI is $9.99 US per user per month and includes 10 GB of data capacity per user.  In addition, there are no limits on the number of rows other than the number of rows per hourly refresh (which is 1 million).

It should be noted that Power BI has no predictive analytics capabilities in itself, whereas IBM Watson Analytics has this built into the tool.  Microsoft has an entirely different product for Machine Learning called Azure ML.  Azure ML has its own pricing structure.

Uploading Your Data

One of Azure ML’s tutorials involves using automobile pricing data.  I created the tutorial experiment using Azure ML and extracted the dataset to a CSV file.  I could then upload the same sample data to Watson Analytics and to Power BI.  I created a “raw” auto data file that has some null values and question marks where data was missing in the original file, and a cleaned up version that has these rows stripped out.

Dropping data into Power BI is easy – you just go to “Get Data” and upload a file.  However, unlike Watson Analytics, you can also pull data from a myriad of other data sources including databases, cloud based services, web pages, REST APIs, etc.  At least with the free version, Watson Analytics limits you to files, and with the paid versions the data sources are still pretty limited (e.g. Box, Dropbox, IBM DB2, IBM dashDB, IBM SQL, and Twitter).

image

With Watson Analytics, you can upload Excel files or CSV files and it does some analysis on the quality of your data. 

image

Ensuring Clean Data

One of the challenges with the raw data file is that it contains blank rows and rows with “?” as the value for price. 

If you upload the raw file to either Power BI or Watson, neither tool sees price as a numeric column because of the “?” in the data.  The column is no longer useful for running averages, totals, etc. and the only thing you can do is count the values (either distinct or not).

In both cases, Watson and Power BI are at least smart enough to not count rows that are blank – for example, if you take an average of 6 rows and 1 row is blank, the average is calculated over the 5 rows that have data.
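The cleaning step that makes the price column usable again can be sketched in a few lines of Python.  This is only an illustration with made-up prices, not what either tool does internally: blanks and “?” placeholders are treated as missing, and the average is taken over the remaining rows.

```python
def parse_price(raw):
    """Return the value as a float, or None for blanks and '?' placeholders."""
    text = raw.strip()
    if text == "" or text == "?":
        return None
    return float(text)

def average_ignoring_missing(raw_values):
    """Average only the rows that contain a usable number."""
    values = [v for v in (parse_price(r) for r in raw_values) if v is not None]
    if not values:
        return None
    return sum(values) / len(values)

# Made-up price column with one blank row and one "?" row.
raw_prices = ["13950", "17450", "", "15250", "?", "17710"]
print(average_ignoring_missing(raw_prices))
```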

Exploring the Data through Natural Language

Once you have uploaded your data, you can ask questions and receive answers through natural language queries.  Both Power BI and Watson provide similar facilities.  However, Watson seems limited to drilling down to a table of values while Power BI will drill down to a single value. 

For example, asking the question “What is the average price for audi” shows the following results in Power BI:

image

In Watson, the same question sends me to a list of average prices by make instead:

image

Creating Dashboards

Both Power BI and Watson have the ability to combine charts into dashboards. 

In Watson, dashboards are called “Assembled Views”.  You create them by picking a layout and then filling it with charts.  There are quite a few layouts including traditional single page layouts, tabbed layouts, a vertical infographic style layout, a slide show, and a time journey layout.

image

Here is a dashboard I created as an Assembled View.

image

Strangely, the layouts don’t enforce a grid, so you can still move items around freely and end up with some awkward designs.  The layouts seem to be more guides than grids.

In Power BI, you have two levels of dashboards – reports which are dataset specific and dashboards which can aggregate widgets from reports.  Reports have more sophistication and support drill down – dashboards provide an easy view while linking to the underlying reports when you drill into an individual widget.

image

Using Power BI Studio, you can create DAX driven formulas, add additional columns, etc. that go beyond just basic dashboarding.  There does not seem to be an equivalent power user tool with IBM Watson Analytics.

One feature that Watson Analytics has that Power BI lacks is the ability to add non-BI information into the dashboard such as external web pages, images, text, etc.  Power BI only allows you to insert text or images.

Watson Analytics also provides some interesting layouts specifically for telling a story and creating timelines.  These allow you to assemble dashboards in a series and page through them.

Sharing Your Dashboard / Report

Power BI only allows you to share dashboards and not reports.  In addition, you can only share within your organizational network – you cannot share with external users at all. 

image

Watson Analytics does not currently support sharing – it says it is “coming soon”.

Azure DocumentDB Now Supports Geospatial Queries

Azure DocumentDB now supports geospatial queries and the storage of geospatial data using the GeoJSON standard.

The GeoJSON standard provides a specification for storing geospatial data such as points, lines, polygons, etc. used for managing data points on a map.

Along with storing locations in your records, Azure DocumentDB provides a set of querying functions to find records based on a radius or polygon boundary. 

GeospatialDistance

GeospatialWithin
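To show what a radius query does conceptually, here is a small Python sketch that stores locations as GeoJSON Points and filters records by great-circle distance.  It runs locally and does not call DocumentDB; the record ids and coordinates are made up, and the haversine formula is my stand-in for whatever distance calculation the service uses.  As far as I know, the corresponding built-in functions in DocumentDB's SQL dialect are ST_DISTANCE and ST_WITHIN, but treat that as an assumption and check the documentation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def point(longitude, latitude):
    """GeoJSON Point: coordinates are ordered [longitude, latitude]."""
    return {"type": "Point", "coordinates": [longitude, latitude]}

def distance_m(a, b):
    """Great-circle (haversine) distance in meters between two GeoJSON Points."""
    lon1, lat1 = map(math.radians, a["coordinates"])
    lon2, lat2 = map(math.radians, b["coordinates"])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def within_radius(records, center, radius_m):
    """Keep the records whose location lies within radius_m of center."""
    return [r for r in records if distance_m(r["location"], center) <= radius_m]

# Made-up records with approximate coordinates.
records = [
    {"id": "seattle", "location": point(-122.33, 47.61)},
    {"id": "redmond", "location": point(-122.12, 47.67)},
    {"id": "portland", "location": point(-122.68, 45.52)},
]
nearby = within_radius(records, point(-122.33, 47.61), 30000)
print([r["id"] for r in nearby])
```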

Microsoft is Consolidating Microsoft Account and Azure AD Authentication Models

In the current Microsoft cloud world, there are two core identity management systems – 1) the Microsoft account and 2) the Azure AD account.  As a user, you are continually asked to choose whether you are a business user or a personal user and then you log in using one of these two paths.

image

For developers building applications against these identity stores, there are two separate APIs to contend with to authenticate the user, enforcing the same forking of the authentication workflow and making it more complex to build applications.

Microsoft has announced that it is working on a consolidated authentication flow that will provide a single user experience and a single API for developers.  The App Model v2.0 is now in preview.

  

Instead of forcing the user to tell us what type of account they are using, the API will figure it out under the covers and authenticate using either account type in a single flow.  In addition, the API for authenticating either Microsoft personal accounts or Azure AD accounts will be the same, making it easier for developers to build external applications that use these accounts as identities.
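From the developer side, a single flow means one sign-in URL regardless of account type.  The sketch below builds a standard OAuth 2.0 authorization request in Python.  It assumes the preview exposes a common v2.0 authorize endpoint at the URL shown, which is my understanding rather than something stated in the announcement; the client id, redirect URI and state value are hypothetical placeholders.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed common endpoint for the v2.0 preview; verify against the official docs.
AUTHORIZE_ENDPOINT = "https://login.microsoftonline.com/common/oauth2/v2.0/authorize"

def build_sign_in_url(client_id, redirect_uri, state):
    """Build one authorization URL for both personal and work accounts."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": "openid",
        "state": state,
    }
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)

url = build_sign_in_url(
    "00000000-0000-0000-0000-000000000000",  # hypothetical client id
    "https://localhost/callback",            # hypothetical redirect URI
    "abc123",                                # hypothetical anti-forgery state
)
print(url)
```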

Azure Backup Size Limit Increased to 54 TB from 1.7 TB

Azure Backup is Microsoft’s cloud backup solution that enables enterprises to back up virtual machine instances, databases, file repositories or any other content to the cloud.  Combined with Azure Site Recovery, Microsoft provides a comprehensive Availability on Demand service that works as a cloud based failover environment for your organization.

Microsoft has just announced an increase in the maximum size for a data source from 1700 GB to 54400 GB (54 terabytes).  For those organizations that have more than 2 TB in a single repository this will be a welcome change and provide more opportunity to leverage the service.

SharePoint 2013 Update Released to Enable Hybrid Search with Office 365

Microsoft has just released the anticipated update for SharePoint 2013 on premise.  Along with fixing some bugs, the key feature for this update is the enablement of improved hybrid search integration with Office 365.

There is also now a hybrid sites feature that consolidates Followed Sites between SharePoint 2013 on premise and Office 365.

image

In the old hybrid search scenario, you could enable searching of on premise content from Office 365 and vice versa.  However, the indexes were still separate from each other, and when you searched, their results were treated as separate.

In the new hybrid search, you can now create a true hybrid index that combines search results from Office 365 and SharePoint on premise in a single index.

image
