Azure SQL Now Supports JSON Storage and Query

Microsoft has just released into General Availability support for storage and retrieval of JSON data.  JSON is a convenient data format used by web sites, JavaScript, REST APIs, etc.  Using the new JSON functions, you can run a query against an Azure SQL database and return a JSON object or you can store JSON entities as values within Azure SQL tables.

The ability to store and retrieve JSON also means you can combine traditional relational data with no-SQL JSON data and use SQL to query both at the same time and to link relational data records and semi-structured JSON data together through queries.

Read More

Using Power BI to Visualize WordPress Blog Posts

Microsofttrends.com runs on WordPress, and I have been working with Power BI to show some analytics on my posting behavior.  Fetching the data can be done using a JSON plugin (I use the JSON API plugin but there is also now a new WordPress REST API plugin) which will provide all your blog posts as a JSON data source.  

Accessing WordPress Blog Posts

The first step is accessing the JSON data – in the latest version of Power BI desktop there is a dedicated JSON data source that you can provide the URL for your JSON feed and it will fetch the data into PowerQuery for further manipulation.  Here is what the results of my query look like in Power BI Desktop.

image

As a table, this just provides the count of the posts but not the posts themselves – not very interesting.  However, if I click on the List of posts then I can drill down into the JSON and fetch the list of posts themselves.

Each post is represented as a record.  You can convert this now to a table by clicking on the “To Table” function and then clicking on the expand icon to the right of column 1.  When you do this you are prompted to select the fields that you want to select as part of your table.

image

If you click Close and Apply this will load all your blog posts into a table. 

Adding Date Filtering and Drill Down

Each post as a posting date.  We can create a calculated table that represents all the possible dates and add in year, month, and day columns that are nicely formatted to support drill down by year, month or day.

Start by creating a calculated table by clicking on “New Table” with the following formula:

DateTable = CALENDAR (MINX(‘Blog Posts’, [date]), NOW())

This will create a table of all the possible dates from the earliest blog post to today.

Now we need to add a bunch of calculated columns to represent year, month, day, day of the week, etc.  Here are some formulas that you can use – each line is a calculated column that you add to the table.

DateTable = CALENDAR (MINX(‘Blog Posts’, [date]), NOW())

Day = Format([Date],”DDDD”)

DayofMonth = DAY([Date])

DayofWeek = WEEKDAY([Date],2)

ISOWeekofYear = IF ((([OrdinalDate] – [DayofWeek] + 10)/7) >= 53, 1, TRUNC(([OrdinalDate] – [DayofWeek] + 10)/7))

LongDay = Format(DateTable[Date], “dddd”)

Month = FORMAT([Date], “MMM”) & ” ” & [Year]

MonthDate = Date(DateTable[Year], DateTable[MonthofYear], 1)

MonthofYear = MONTH([Date])

OrdinalDate = DATEDIFF([Year]&”,1,1″,[Date],DAY)+1

QuarterofYear = ROUNDUP (MONTH([Date])/3,0)

Year = YEAR([Date])

In addition, you will need to change the sort column for columns such as LongDay to the integer or date representations so that your columns sort in the appropriate order and not in alphabetical order.

Once you have your date table set up, you can create a relationship between the date column in the blog posts table and the DateValue column.  Depending on the format of your original date column, you may need a calculated column that reflects the identical format so that the ids match.  In my case, I created a calculated column called “DateOnly” that has the same formula as the DateValue column and linked these together through the relationship management dialog.

image

Once you have this set up, you can now create date driven visualizations such as the number of posts per month or per day of the week!

image

image

Adding Metadata Filtering

One of the key features of a blog are tags and categories that act as metadata for filtering.  The JSON provides these within its structure so can we extract the data and use it for creating some dashboards by category or tag?

If we go back to our original Power Query, we can create a new table that contains a denormalized set of keys that provide the link from the blog post ID to the tag values (the same approach will work for categories, authors, etc.).   When we get the list of posts, we do the same as above and expand the records but in this case we only need two columns – the blog post ID and the tags.

image

Then, we need to expand the list of tags so that we can fetch the tag ID and the title of the tag for each record.

image

Our new table should look something like this:

image

Once we load up this as a table, we can now create some interesting dashboards based on the tag frequency.  For example, here is the frequency of posts per tag:

image

Here is the frequency of tag use by month as a visualization.

image

Here is a comparison of the use of the tag “Azure” with the use of the tag “Power BI” over time.

image

Once you have the data there are lots of possible ways to explore and analyze the data!

Read More

Azure DocumentDB Now Supports Geospatial Queries

Azure DocumentDB will now support Geospatial queries and storing of geospatial data using the GeoJSON standard.

The GeoJSON standard provides a specification for storing geo spatial data such as points, lines, polygons, etc. used for managing data points on a map.

Along with storing locations in your records, Azure DocumentDB provides a set of querying functions to find records based on a radius or polygon boundary. 

GeospatialDistance

GeospatialWithin

Read More