Testing Azure CDN Caching

This web site uses Azure CDN to cache content (in addition to WordPress' CDN).  I wanted to see what difference Azure CDN really makes, so I created a specific Azure CDN performance test to see how it impacts Azure Web Site performance.

Creating an Azure Web Site and CDN

To host the test application, I created a new Azure Web Site called cachetest.  In addition, I created an Azure CDN endpoint based on this web site.  Note: it can take an hour for the Azure CDN to successfully cache all your content.

Designing the Basic Test

I created a very basic web application using the ASP.NET MVC Framework (you can find the files on GitHub here).

The application serves a simple page that displays a random number and the current date/time.  In addition, it has a method called “SomeSlowOperation” which calculates the prime numbers from 1 to 100,000 using a deliberately bad algorithm.  Running the page takes about 10-15 seconds under zero load and produces the following output.
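The actual implementation is in the repo; as a rough sketch of the idea (my illustration, not necessarily the repo's code), a full trial-division count with no early exit burns CPU in exactly this way:

// Deliberately inefficient prime counting: full trial division with no
// early exit, so every candidate costs O(n) work.
// Illustrative sketch only -- the repo's code may differ.
public long SomeSlowOperation()
{
    long primeCount = 0;
    for (int candidate = 2; candidate <= 100000; candidate++)
    {
        int divisors = 0;
        for (int divisor = 1; divisor <= candidate; divisor++)
        {
            if (candidate % divisor == 0)
                divisors++;
        }
        if (divisors == 2)   // only 1 and itself divide a prime
            primeCount++;
    }
    return primeCount;
}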

[Screenshot: the test page showing a random number and the date/time the page was generated]

Refreshing the page generates a new random number and updates the date/time at which the page was constructed.

[Screenshot: the same page after a refresh, with a new random number and an updated date/time]

Adding Caching Settings

When you use Azure CDN, you have to make sure you are issuing HTTP Cache-Control headers for both static and dynamic content.

For static content, you can set the cache parameter in the web.config like this:

<system.webServer>
  <staticContent>
    <clientCache cacheControlMaxAge="1.00:00:00" cacheControlMode="UseMaxAge" />
  </staticContent>
  <validation validateIntegratedModeConfiguration="false" />
  <modules>
    <remove name="ApplicationInsightsWebTracking" />
    <add name="ApplicationInsightsWebTracking" type="Microsoft.ApplicationInsights.Extensibility.Web.RequestTracking.WebRequestTrackingModule, Microsoft.ApplicationInsights.Extensibility.Web" preCondition="managedHandler" />
  </modules>
</system.webServer>

This directive means any static content (e.g. scripts, CSS files, images, etc.) is cached by the Azure CDN for one day.

For dynamic content (e.g. our ASP.NET web page), you need to add HTTP cache directives to the page itself.  The easiest spot to put this is in the controller’s action method, like this:

public ActionResult CacheTest()
{
    Response.AddHeader("Cache-Control", "public, max-age=900, s-maxage=900");
    CacheTestModel cacheTestModel = new CacheTestModel();
    ViewBag.Message = "Cache Testing Page.";
    return View(cacheTestModel);
}

The Response.AddHeader method will add the Cache-Control directives that will be read by Azure CDN to dictate when to expire the page. 

NOTE: without these cache directives, Azure CDN caches content for 7 days and once cached there are no tools available to invalidate the cache!  Make sure you have the cache directives working before you start accessing the page through Azure CDN.

Running the Test

To run the test, we simply accessed the page in a regular browser from some test VMs running in Azure, as well as from my local laptop in Toronto.
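If you would rather script the check than eyeball the browser, a quick probe like this works (a minimal sketch, not the original test harness – the URL is a guess based on the site name, and your route may differ):

// minimal sketch: time a request and echo the Cache-Control header
using System;
using System.Diagnostics;
using System.Net.Http;

class CdnProbe
{
    static void Main()
    {
        // hypothetical URL derived from the site name; substitute your
        // origin or CDN endpoint as appropriate
        var url = "http://cachetest.azurewebsites.net/Home/CacheTest";
        using (var client = new HttpClient())
        {
            var timer = Stopwatch.StartNew();
            var response = client.GetAsync(url).Result;  // blocking is fine for a one-off probe
            timer.Stop();
            Console.WriteLine("{0} in {1} ms", response.StatusCode, timer.ElapsedMilliseconds);

            // confirm the max-age directives are actually being served
            Console.WriteLine("Cache-Control: {0}", response.Headers.CacheControl);
        }
    }
}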

The Results

Hitting the page directly without going through Azure CDN produces the expected result – each page load takes 10-15 seconds.  Hitting the same page once cached by Azure CDN takes milliseconds!  However, my tests yielded some interesting findings about when a page is actually cached:

  • It takes 3 hits to the same page before Azure CDN seems to cache it.  The first two loads are slow and re-render the page (you can tell because the random number and the date change).  On the third load the page is served from cache: it is very quick (milliseconds) and the content is now the same on each load.
  • Caches seem to be isolated to particular locations.  For example, my browser running in Toronto is served cached content after 3 page loads, but when I run the same page in a VM in North Central US, there is no cached content and it takes another 3 page loads before that location serves cached content.
  • As a result of the local caches in different locations, the cached content can differ if it’s rendered dynamically.  For example, here are the results for my browser in Toronto vs. a VM running in the US.   Conversely, two VMs running in the same region had the same behavior, i.e. both were served the cached version of the page.  It’s not clear where the regions are located or where their boundaries lie.

[Screenshots: the cached page as served in Toronto vs. the different cached copy served to a US VM]

  • If query-string caching is enabled, then each variation of the query string follows the same rules as above, i.e. you need to load the exact same URL with the same query string three times before caching takes effect.

Conclusions

Azure CDN will definitely take load off your web server, especially in cases where you have significantly complex or expensive page loads.  However, the caching behavior means that Azure CDN may not cache as you expect – each region has its own cache, and pages are only cached after being hit repeatedly in the same region.

Once the page is cached, performance is a lot snappier, especially for pages that take a long time to process.  Make sure you set your cache expiry to be as long as possible to maximize the value of the cache.

Forms on SharePoint Lists (Potential InfoPath Replacement) Cancelled

InfoPath is officially a retired product.  While it will continue to be supported in Office 365 and the next version of SharePoint, there are no expected updates to the product and InfoPath 2013 is the last version on the roadmap.

At the last SharePoint conference, Microsoft suggested it was working on several potential InfoPath replacement options including:

  • Excel Surveys: collecting information via a web form and dropping it into Excel for analysis, charting, etc.
  • Forms on SharePoint Lists: a new upcoming feature which will allow configurable forms based on SharePoint Lists.  Think of this as similar to an Access Form based on an Access Table.
  • Word Form Documents: structured forms created in Word.
  • App Forms: forms generated from Access databases.

Microsoft has now officially cancelled one of these potential options – Forms on SharePoint Lists.

[Screenshot: an example Forms on SharePoint Lists screen]

This is one of the example screens showing what Forms on SharePoint Lists might have looked like if released.

Power BI Preview Adds Alerts, Annotations and Favorites

As part of the new Power BI Preview service and the new preview iOS and Android Power BI apps, Microsoft has now added the ability to create alerts and annotations, as well as to flag favorites.

Alerts

Alerts provide the ability to be notified on your mobile device when data changes, hits a threshold or exceeds a target.

Annotations

Annotations provide the ability to comment on data points.  Annotations are attached to the data point to provide insights, clarification or action items to your team.

Favorites

Favorites allow you to add dashboards, KPIs and data to your own personal “favorites dashboard”.

Integrating WordPress with Azure Search Service

This blog runs on WordPress using the Brandoo WordPress plugin.  One of the key challenges with the Brandoo plugin is that the default search service doesn’t work.  I decided to build my own using Azure WebJobs, Azure Search and the WordPress JSON REST API.  Here are my lessons learned from developing an Azure Search solution.

Note: you can find all the code from the sample solution in GitHub here.

Getting Started

In order to integrate WordPress and Azure Search, the basic flow for data is:

[Diagram: WordPress (JSON REST API) → Azure WebJob → Azure Search index]

In order to pull posts from WordPress, install the JSON REST API plugin found here (or in the plugin gallery). 

To create a custom WebJob, use the latest Azure SDK and Visual Studio 2013.  Once you have installed the Azure SDK, you’ll see a project template for Azure WebJobs. 

To use the Azure Search service, you need to create a search service in Azure.  See this article for directions on how to do this through the Azure Portal.

To access the Azure Search API, you can go through the REST API directly, or you can use the RedDog.Search C# client.  To install the client into your WebJob, run “Install-Package RedDog.Search” from the NuGet Package Manager Console.  This also installs the Newtonsoft JSON.NET library, which we can also use for interacting with the WordPress REST API.

WebJobs Architecture

When you create a WebJob in Visual Studio, it provides the ability to deploy straight to your Azure Web Site.  This works really well.  Alternatively, you can upload it manually as an .exe through the portal.  You can also run your WebJob locally in debug mode which in this case works perfectly because we have no real dependencies on Azure Web Sites to run the job.

The basic components of the architecture are:

  • Program: the main WebJob console app.
  • WordPressJSONLoader: service class responsible for pulling posts from WordPress.
  • WordPressPosts and WordPressPost: value objects representing the loaded collection of WordPress posts and each individual post.
  • AzureSearchIndexer: service class responsible for pushing posts into Azure Search.

Runtime configuration is done through the App.config and/or the Azure Web Sites configuration.  As part of the Azure SDK, you can use the CloudConfigurationManager to get environment settings; it is smart enough to prioritize values in the Azure Web Sites configuration over any settings found locally in the App.config.  If you are running locally, it falls back automatically to the App.config for configuration values.

// load configuration attributes
webSiteURL = CloudConfigurationManager.GetSetting("WebSiteURL");
searchServiceName = CloudConfigurationManager.GetSetting("ServiceName");
searchServiceKey = CloudConfigurationManager.GetSetting("ServiceKey");
indexName = CloudConfigurationManager.GetSetting("IndexName");

Retrieving Posts from WordPress

With the JSON REST API plugin installed, retrieving posts from WordPress is easy – just call the URL www.yourwebsite.com/?json=get_posts.  This will by default retrieve the last 10 posts but you can use filtering parameters and paging to change how many posts you retrieve.
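For example, ?json=get_posts&count=25 asks for up to 25 posts per page, and ?json=get_posts&page=2 fetches the second page (parameter names per the JSON REST API plugin’s documentation – verify against the version you have installed).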

Using the JSON.NET library, you can deserialize the JSON into a JObject, which provides an easy way to pull entities such as posts, comments, etc. out of the returned JSON.

When the JSON REST API is called, it returns 10 posts and the total number of “pages”.  Based on this page count, we can pull all the posts 10 at a time.

public static WordPressPosts LoadAllPosts(string URL)
{
    WordPressPosts wordPressPosts = new WordPressPosts();
    string query = "?json=get_posts";
    WebClient client = new WebClient();
    Stream stream = client.OpenRead(URL + query);
    StreamReader reader = new StreamReader(stream);

    // parse the full JSON response
    var results = JObject.Parse(reader.ReadToEnd());
    var JsonPosts = results["posts"];
    if (JsonPosts != null)
    {
        foreach (var JsonPost in JsonPosts)
        {
            wordPressPosts.Posts.Add(loadPostFromJToken(JsonPost));
        }
    }

    // pull any remaining pages, 10 posts at a time
    if (results["pages"] != null)
    {
        int pages = (int)results["pages"];
        if (pages > 1)
        {
            for (int i = 2; i <= pages; i++)
            {
                query = "?json=get_posts&page=" + i;
                stream = client.OpenRead(URL + query);
                reader = new StreamReader(stream);
                results = JObject.Parse(reader.ReadToEnd());
                JsonPosts = results["posts"];
                foreach (var JsonPost in JsonPosts)
                {
                    wordPressPosts.Posts.Add(loadPostFromJToken(JsonPost));
                }
            }
        }
    }
    return wordPressPosts;
}

In this method, we simply pull out the posts and deserialize these to a collection of WordPressPost objects. 

Running Async Tasks in Console Apps

The RedDog.Search library contains only the new .NET 4.5 async methods.  You need to be careful to wrap these methods so that your console app doesn’t delegate out to them and then end the program prematurely.  The way to achieve this is to create an async method that you execute from your main program and wait on using the Wait() method.

You can then call this method from Main() like this:
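A minimal sketch of the pattern (MainAsync is a hypothetical name, and the AzureSearchIndexer call is illustrative):

static void Main(string[] args)
{
    // block until the async work completes so the process
    // doesn't exit while tasks are still in flight
    MainAsync().Wait();
}

// returning Task rather than void is what makes Wait() possible
static async Task MainAsync()
{
    // illustrative usage: kick off the indexing work described below
    AzureSearchIndexer indexer = new AzureSearchIndexer();
    await indexer.AddPosts();
}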

In addition, make sure that all your async methods return Task instead of void; async void methods can’t be waited on, which can cause your console app to exit prematurely.

Checking for Errors

In the RedDog.Search library, you call all its methods like this:

public async Task CreateIndex()
{
    // check to see if the index exists. If not, then create it.
    var result = await managementClient.GetIndexAsync(Index);
    if (!result.IsSuccess)
    {
        result = await managementClient.CreateIndexAsync(new Index(Index)
            .WithStringField("Id", f => f.IsKey().IsRetrievable())
            .WithStringField("Title", f => f.IsRetrievable().IsSearchable())
            .WithStringField("Content", f => f.IsSearchable().IsRetrievable())
            .WithStringField("Excerpt", f => f.IsRetrievable())
            .WithDateTimeField("CreateDate", f => f.IsRetrievable().IsSortable().IsFilterable().IsFacetable())
            .WithDateTimeField("ModifiedDate", f => f.IsRetrievable().IsSortable().IsFilterable().IsFacetable())
            .WithStringField("CreateDateAsString", f => f.IsSearchable().IsRetrievable().IsFilterable())
            .WithStringField("ModifiedDateAsString", f => f.IsSearchable().IsRetrievable().IsFilterable())
            .WithStringField("Author", f => f.IsSearchable().IsRetrievable().IsFilterable())
            .WithStringField("Categories", f => f.IsSearchable().IsRetrievable())
            .WithStringField("Tags", f => f.IsSearchable().IsRetrievable())
            .WithStringField("Slug", f => f.IsRetrievable())
            .WithIntegerField("CommentCount", f => f.IsRetrievable())
            .WithStringField("CommentContent", f => f.IsSearchable().IsRetrievable()));
        if (!result.IsSuccess)
        {
            Console.Out.WriteLine(result.Error.Message);
        }
    }
}

The result provides a success status and, in the case of an error, some important error details.  Anything that is written to the Console is redirected into the Azure Web Sites log for the WebJob.

Creating an Index

Creating an index is reasonably easy but I found a few gotchas along the way:

  • The key field MUST be a string (I originally tried to use an integer field).
  • Searchable fields MUST be of type string (I originally tried to make a date field searchable). 

If you try to violate the rules, the Index creation process fails and the result returned will be an error.

Adding Posts to an Index

Now that we have our index, we can push posts into the index.

foreach (WordPressPost post in WordPressPosts.Posts)
{
    IndexOperation indexOperation = new IndexOperation(IndexOperationType.MergeOrUpload, "Id", post.Id.ToString())
        .WithProperty("Title", post.Title)
        .WithProperty("Content", post.Content)
        .WithProperty("Excerpt", post.Excerpt)
        .WithProperty("CreateDate", post.CreateDate.ToUniversalTime())
        .WithProperty("ModifiedDate", post.ModifiedDate.ToUniversalTime())
        .WithProperty("CreateDateAsString", post.CreateDate.ToLongDateString())
        .WithProperty("ModifiedDateAsString", post.ModifiedDate.ToLongDateString());
    IndexOperationList.Add(indexOperation);
}
var result = await managementClient.PopulateAsync(Index, IndexOperationList.ToArray());
if (!result.IsSuccess)
    Console.Out.WriteLine(result.Error.Message);

One key gotcha on adding items to the index – date fields must be in universal time (UTC) or you’ll get an error message.   For example, instead of supplying post.ModifiedDate as a DateTime attribute, you need to call post.ModifiedDate.ToUniversalTime() or the index operation will generate an error.

The RedDog.Search PopulateAsync method allows you to batch up multiple IndexOperation objects into a single request.  The maximum the library supports is 1,000 operations or 16 MB per batch.  In our method, we limit each batch to 100 posts to stay well under this limit.

public async Task AddPosts()
{
    // if not previously connected, make a connection
    if (!connected)
        Connect();

    // create the index if it hasn't already been created.
    await CreateIndex();

    // run index population in batches. The RedDog.Search client maxes out at
    // 1000 operations or about 16 MB of data transfer, so we have set the
    // maximum to 100 posts in a batch to be conservative.
    int batchCount = 0;
    List<IndexOperation> IndexOperationList = new List<IndexOperation>(maximumNumberOfDocumentsPerBatch);
    foreach (WordPressPost post in WordPressPosts.Posts)
    {
        batchCount++;

        // create an IndexOperation with the appropriate metadata and supply it
        // with the incoming WordPress post
        IndexOperation indexOperation = new IndexOperation(IndexOperationType.MergeOrUpload, "Id", post.Id.ToString())
            .WithProperty("Title", post.Title)
            .WithProperty("Content", post.Content)
            .WithProperty("Excerpt", post.Excerpt)
            .WithProperty("CreateDate", post.CreateDate.ToUniversalTime())
            .WithProperty("ModifiedDate", post.ModifiedDate.ToUniversalTime())
            .WithProperty("CreateDateAsString", post.CreateDate.ToLongDateString())
            .WithProperty("ModifiedDateAsString", post.ModifiedDate.ToLongDateString())
            .WithProperty("Author", post.Author)
            .WithProperty("Categories", post.Categories)
            .WithProperty("Tags", post.Tags)
            .WithProperty("Slug", post.Slug)
            .WithProperty("CommentCount", post.CommentCount)
            .WithProperty("CommentContent", post.CommentContent);

        // add the index operation to the collection
        IndexOperationList.Add(indexOperation);

        // once we have added the maximum number of documents per batch, push
        // the collection of operations to the index and then reset the
        // collection to start a new batch.
        if (batchCount >= maximumNumberOfDocumentsPerBatch)
        {
            var result = await managementClient.PopulateAsync(Index, IndexOperationList.ToArray());
            if (!result.IsSuccess)
                Console.Out.WriteLine(result.Error.Message);
            batchCount = 0;
            IndexOperationList = new List<IndexOperation>(maximumNumberOfDocumentsPerBatch);
        }
    }

    // push any remaining items that have not yet been added to the index.
    var remainingResult = await managementClient.PopulateAsync(Index, IndexOperationList.ToArray());
    if (!remainingResult.IsSuccess)
        Console.Out.WriteLine(remainingResult.Error.Message);
}

Checking our Index in the Portal

We can verify that we have content in the index by going to the portal and checking out our index:

[Screenshot: the Azure portal showing the newly created index containing 291 documents]

As shown, we have a newly created index with 291 items in it.

Building a Search Portal

Now that we have some content, let’s build a simple search interface using just HTML and JavaScript.  We’ll use the REST APIs to fetch data from the index and display the search results using AngularJS as a framework.

Publishing to Azure Web Sites into a Virtual Application

Our WordPress site has been installed into the root of the Azure Web Site.  When we publish our search pages and JavaScript code, we don’t want them clobbering our existing WordPress site or getting deleted or mangled by mistake if there is an upgrade to WordPress.

Azure Web Sites supports the addition of virtual applications that run in their own sub-directory.  To create one, go into the Configure tab of the Azure Web Site and go to the bottom of the page.  You will see a section called “virtual applications and directories”.  In here, we can create a completely separate application that runs in its own directory, with its own web.config and publishing profile.

[Screenshot: the “virtual applications and directories” section of the Configure tab]

In Visual Studio, you can configure the publishing profile to publish to this new virtual application.

[Screenshot: the Visual Studio publish settings pointing at the virtual application]

Specify the subdirectory in both the Site Name and Destination URL fields.

Fetching the Search Results With AngularJS

Building a search form using AngularJS is ideal for pulling in data from Azure Search because Azure Search returns JSON data by default.  We can simply assign the results to an AngularJS variable and then use the AngularJS framework to display the results dynamically.

We start with a basic search form styled using Bootstrap.  I use the Sparkling theme for my WordPress blog, and this theme already uses Bootstrap as its core CSS framework, so adding some custom HTML that uses the same Bootstrap CSS elements works really well.

[Screenshot: the Bootstrap-styled search form]

The nice thing with using Bootstrap is that if you switch your WordPress theme, as long as it uses Bootstrap (most of them do these days) your search form and results will take on the style of your blog.

If you perform a search with no keywords specified, Azure Search will return ALL documents.  This isn’t something we want, so we have made the keyword a required field and we check that it isn’t blank before submitting.

The submit method for fetching the Azure Search results is the key for pulling in the results from Azure Search.  In building this method, I found a few gotchas to share:

  • Make sure you include the api-version in the request or Azure Search will return an error.
  • The default ordering is by relevance.  In our case, we have also added an option to sort by create date (e.g. $orderby=CreateDate desc).
  • You have to include the api-key in the HTTP header when you send the request.  You can create a query key in the Azure portal instead of using the admin key and exposing it publicly.
  • The returned JSON object’s “value” property contains the search results.

vm.submit = function (item, event) {
    var URLstring;
    if (vm.orderby == "Relevance")
        URLstring = vm.URL + "?search=" + vm.keywords + "&api-version=" + vm.APIVersion;
    else
        URLstring = vm.URL + "?search=" + vm.keywords + "&$orderby=CreateDate desc" + "&api-version=" + vm.APIVersion;

    if (!isEmpty(vm.keywords)) {
        // pass the query key in the api-key header (vm.APIKey is assumed to
        // hold the query key created in the Azure portal)
        var config = { headers: { "api-key": vm.APIKey } };
        var responsePromise = $http.get(URLstring, config);
        responsePromise.success(function (dataFromServer, status, headers, config) {
            vm.results = dataFromServer.value;
            vm.showSearchResults = true;
        });
        responsePromise.error(function (data, status, headers, config) {
            alert("Submitting form failed!");
        });
    } else {
        vm.showSearchResults = false;
        vm.results = [];
    }
}

Displaying the Results

Once we have a JSON object with the search results, displaying them is pretty easy – just use the AngularJS ng-repeat attribute to iterate through the results returned.

<div ng-repeat="result in search.results">
    <a class="h1" href="http://wordpressazuresearchintegration.azurewebsites.net/?p={{result.Id}}">{{result.Title}}</a>
    <div class="h6" ng-bind-html="result.CreateDateAsString | unsafe"></div>
    <div ng-bind-html="result.Excerpt | unsafe"></div>
</div>

One key note is the use of a filter to treat the HTML returned as HTML – by default, AngularJS will HTML-encode it instead of letting it through raw.  In order to change this behaviour, you can add this filter:

angular.module('app').filter('unsafe', function ($sce) {
    return function (val) {
        return $sce.trustAsHtml(val);
    };
});

Using this filter you can then declare the variable as unsafe and it will be allowed through as raw HTML.

Adding a link to the original post is easy – just create an anchor link with the ID of the post.  (You could also use the indexed slug field for friendlier URLs if permalinks are turned on.)

Integrating into WordPress

With the solution published to Azure Web Sites into a Search subdirectory, we can use the published JavaScript files and embed them into our WordPress site.  While a proper WordPress plugin would be ideal, we just added the search.html code into a WordPress page using the out of the box content editor.

Note: when adding HTML to a page using the text editor in WordPress, if you leave any line feeds in, WordPress converts them into <p> tags.  This isn’t what we want with all our JavaScript and AngularJS code.  If you delete all the line feeds and keep all the HTML together, you can avoid this problem.

[Screenshot: the search markup pasted into the WordPress content editor]

The Final Result – Search Results!

Here is the final result – a fully functioning search page that pulls WordPress posts from Azure Search and searches against keywords with the results sorted by either relevance or create date.

[Screenshot: the finished search page with results]

Publishing to Azure Web Sites with Multiple Virtual Applications

Azure Web Sites supports the addition of virtual applications that run in their own sub-directory.  By default, your main web application is in the root directory.  You can create a new application that is isolated to its own sub-directory and with its own web.config.

To create a new virtual application, go to the Configure tab of the Azure Web Site and go to the bottom of the page.  You will see a section called “virtual applications and directories”.  In here, we can create a completely separate application that runs in its own directory, with its own web.config and publishing profile.

[Screenshot: the “virtual applications and directories” section of the Configure tab]

In Visual Studio, you can configure the publishing profile to publish to this new virtual application.  Just make sure to specify the subdirectory in both the Site Name and Destination URL fields.
