In Power BI Desktop, we have the ability to create calculated “Columns” and “Measures”. Both work by creating new pseudo columns based on a DAX calculation. What is the difference between the two? The answer lies in when the calculation takes place and how it impacts aggregation of data.
Let’s look at a specific example.
I downloaded a dataset from the Community Health Status Indicators that provides demographic data and health indicators by county in the United States. One of the demographics available is the population broken down by age in 4 categories:
- Under 19
- 19 to 64
- 64 to 85
- 85 and over
Each of these are provided as a % value for the county. In addition there is a total population size for each county provided. What we want is a summary pie chart that looks something like this:
We want to ensure that the % for each age group aggregates properly as we roll up from county to state to country. Here are the steps required that show the difference between a calculated column vs. a calculated measure. The key difference between them is:
- A Calculated Column runs the calculation on each individual row. The calculation happens BEFORE aggregating.
- A Calculated Measure runs the calculation on a collection of rows. Calculations happen AFTER aggregating.
Let’s see the difference in action.
Step #1: Using a Calculated Column for the Population per Age Group
Our dataset only provides % of population for each age group. If you try to roll this up to the state or country, you cannot aggregate the % successfully. If you try SUM as your roll up option it will simply add all the % as if they were numbers – definitely not want we want. If you use AVERAGE to roll up this is a bit better but it doesn’t take into account that some counties are more populous than others, making your average very distorted (for example, the average in Alaska is given equal weight as California).
In order to calculate a true average, we need to translate the county % values into absolute population values so that when we roll them up it takes into account the size of the county.
Because we want the calculation to work on each individual row, we can use a Calculated Column. In this case, we need one for each age group like this:
Age 19 Under Population = [Age_19_Under] / [Total % Age Population] * DEMOGRAPHICS[Population_Size]
Age 19 to 64 Population = [Age_19_64] / [Total % Age Population] * DEMOGRAPHICS[Population_Size]
Age 65 to 84 Population = DEMOGRAPHICS[Age_65_84] / [Total % Age Population] * [Population_Size]
Age 85 and Over Population = [Age_85_and_Over] / [Total % Age Population] * [Population_Size]
These calculated columns represent the absolute number of people for each age bracket per county.
The reason we need the [Total % Age Population] (which is another calculated column) is because in this dataset, the original % values don’t add up to 100%. By dividing the % by the total % we effectively normalize all the % values so they are always out of 100%.
Here are the results for California (keep in mind this is 2010 data):
Step #2: Use a Calculated Measure to Roll Up to State and Country
Now that we have raw numbers for each county, we can use a Calculated Measure to calculated a true % based on the aggregated populates for each county. For each age group, the calculated measure looks like this:
<19 = Sum(DEMOGRAPHICS[Age 19 Under Population]) / Sum(DEMOGRAPHICS[Population_Size])
19 to 64 = Sum(DEMOGRAPHICS[Age 19 to 64 Population]) / Sum(DEMOGRAPHICS[Population_Size])
64 to 85 = Sum(DEMOGRAPHICS[Age 65 to 84 Population]) / Sum(DEMOGRAPHICS[Population_Size])
> 85 = Sum(DEMOGRAPHICS[Age 85 and Over Population]) / Sum(DEMOGRAPHICS[Population_Size])
As you can see, the key difference with a Calculated Measure is the aggregation happens first – the sum function takes the rows that fit the current filtering criteria, adds them together and then uses the total to calculate the % of the total population.