Microsofttrends.com runs on WordPress, and I have been working with Power BI to show some analytics on my posting behavior. Fetching the data can be done using a JSON plugin (I use the JSON API plugin but there is also now a new WordPress REST API plugin) which will provide all your blog posts as a JSON data source.
Accessing WordPress Blog Posts
The first step is accessing the JSON data – in the latest version of Power BI desktop there is a dedicated JSON data source that you can provide the URL for your JSON feed and it will fetch the data into PowerQuery for further manipulation. Here is what the results of my query look like in Power BI Desktop.
As a table, this just provides the count of the posts but not the posts themselves – not very interesting. However, if I click on the List of posts then I can drill down into the JSON and fetch the list of posts themselves.
Each post is represented as a record. You can convert this now to a table by clicking on the “To Table” function and then clicking on the expand icon to the right of column 1. When you do this you are prompted to select the fields that you want to select as part of your table.
If you click Close and Apply this will load all your blog posts into a table.
Adding Date Filtering and Drill Down
Each post as a posting date. We can create a calculated table that represents all the possible dates and add in year, month, and day columns that are nicely formatted to support drill down by year, month or day.
Start by creating a calculated table by clicking on “New Table” with the following formula:
This will create a table of all the possible dates from the earliest blog post to today.
Now we need to add a bunch of calculated columns to represent year, month, day, day of the week, etc. Here are some formulas that you can use – each line is a calculated column that you add to the table.
DateTable = CALENDAR (MINX(‘Blog Posts’, [date]), NOW())
Day = Format([Date],”DDDD”)
DayofMonth = DAY([Date])
DayofWeek = WEEKDAY([Date],2)
ISOWeekofYear = IF ((([OrdinalDate] – [DayofWeek] + 10)/7) >= 53, 1, TRUNC(([OrdinalDate] – [DayofWeek] + 10)/7))
LongDay = Format(DateTable[Date], “dddd”)
Month = FORMAT([Date], “MMM”) & ” ” & [Year]
MonthDate = Date(DateTable[Year], DateTable[MonthofYear], 1)
MonthofYear = MONTH([Date])
OrdinalDate = DATEDIFF([Year]&”,1,1″,[Date],DAY)+1
QuarterofYear = ROUNDUP (MONTH([Date])/3,0)
Year = YEAR([Date])
In addition, you will need to change the sort column for columns such as LongDay to the integer or date representations so that your columns sort in the appropriate order and not in alphabetical order.
Once you have your date table set up, you can create a relationship between the date column in the blog posts table and the DateValue column. Depending on the format of your original date column, you may need a calculated column that reflects the identical format so that the ids match. In my case, I created a calculated column called “DateOnly” that has the same formula as the DateValue column and linked these together through the relationship management dialog.
Once you have this set up, you can now create date driven visualizations such as the number of posts per month or per day of the week!
Adding Metadata Filtering
One of the key features of a blog are tags and categories that act as metadata for filtering. The JSON provides these within its structure so can we extract the data and use it for creating some dashboards by category or tag?
If we go back to our original Power Query, we can create a new table that contains a denormalized set of keys that provide the link from the blog post ID to the tag values (the same approach will work for categories, authors, etc.). When we get the list of posts, we do the same as above and expand the records but in this case we only need two columns – the blog post ID and the tags.
Then, we need to expand the list of tags so that we can fetch the tag ID and the title of the tag for each record.
Our new table should look something like this:
Once we load up this as a table, we can now create some interesting dashboards based on the tag frequency. For example, here is the frequency of posts per tag:
Here is the frequency of tag use by month as a visualization.
Here is a comparison of the use of the tag “Azure” with the use of the tag “Power BI” over time.
Once you have the data there are lots of possible ways to explore and analyze the data!