Model the data
- By Daniil Maslyuk
- 11/5/2022
- Skill 2.1: Design a data model
- Skill 2.2: Develop a data model
- Skill 2.3: Create model calculations by using DAX
- Skill 2.4: Optimize model performance
- Chapter summary
- Thought experiment
- Thought experiment answers
Skill 2.3: Create model calculations by using DAX
You used some DAX earlier in the chapter to create calculated tables and calculated columns as well as configure row-level security. In practice, DAX is most often used to create measures in Power BI.
Writing your own formulas is an important skill that allows you to perform much more sophisticated analysis based on your data compared to not using DAX.
In this section, we start by reviewing DAX fundamentals; then we look at CALCULATE, one of the most important functions in DAX, specifically in Time Intelligence or time-related calculations, which we review separately.
DAX can help you replace some columns with measures, allowing you to reduce the data model size. Not all DAX formulas need to be complex, and we review some basic statistical functions in this section as well.
Create basic measures by using DAX
Although many things can be computed by using calculated columns, in most cases it’s preferable to write measures, because they don’t increase the model size. Additionally, some calculations are simply not possible with calculated columns. For example, to calculate a ratio dynamically, you need to write a measure.
As you saw earlier, quick measures already allow you to perform basic calculations without writing DAX yourself. In this section, you start using DAX to build complex measures.
It’s important to understand that Power BI allows you to aggregate columns in visuals without using measures, a practice sometimes called implicit measures. These can be useful when you want to quickly test how a visual might look or to perform a quick analysis on a column. However, it’s always best practice to create explicit measures by using DAX—even with trivial calculations such as SUM. Here are some reasons it’s preferable to create measures yourself:
Implicit measures may provide unexpected results in some cases due to the Summarize by column property. For example, if you have a column that contains product prices and Power BI sets the summarization to SUM, then dragging the column in a visual will not produce meaningful results. Although you can change the summarization in the visual, following this approach means that you need to pay attention to this property every time you use implicit measures.
Explicit measures can be reused in other measures. This is beneficial because you can write less code, which saves time and improves the maintainability of your data model.
Implicit measures cannot leverage inactive relationships.
Implicit measures are not supported by calculation groups.
Measures are different from calculated columns in a few ways. The main difference is that you can see the results of a calculated column immediately after defining the calculation, whereas you can’t see the results of a measure until you use it in a visual. This behavior allows measures to return different results depending on filters and where they’re used.
Another difference between calculated columns and measures is that calculated column formulas apply to each row of a table, whereas measures work on columns and tables, not specific rows. Therefore, measures most often use aggregation functions in DAX.
There are a few ways to create a measure in Power BI Desktop. Here’s one way:
Go to the Report view.
In the Fields pane, right-click a table in which you want to create a new measure.
Select New measure.
Enter the measure formula and press Enter.
You can also create a measure by selecting New measure on the Home ribbon, but you have to make sure you’ve got the right table selected in the Fields pane; otherwise, your measure may not be created in the correct table. If you do create a measure in the wrong table, instead of re-creating the measure you can move it by performing the following steps:
Go to the Report view.
In the Fields pane, select the measure you want to move.
On the Measure tools ribbon, select the table your measure should be stored in from the Home table dropdown list.
For example, to compute the total profit of Wide World Importers, use the following measure formula:
Total Profit = SUM(Sale[Profit])
You can compute total sales, excluding tax, by using the following measure formula:
Total Sales Excluding Tax = SUM(Sale[Total Excluding Tax])
If you want to compute the profit margin percentage, there are two ways of doing it. You could use this:
Profit % = DIVIDE( SUM(Sale[Profit]), SUM(Sale[Total Excluding Tax]) )
However, this approach involves repeating your own code, which is undesirable because formulas become more difficult to maintain. You can avoid this issue if you reference the measures you created previously:
Profit % = DIVIDE( [Total Profit], [Total Sales Excluding Tax] )
When you’re referencing measures, it’s best practice to not use table names in front of them. Unlike column names, measure names are unique; different tables may have the same column names, but it’s not possible to have measures that share the same name.
Another feature of DAX that allows you to avoid repeating yourself is variables. Think of a variable as a calculation within a measure. For instance, if you want to avoid showing zeros in your visuals, you could write a measure as follows:
Total Dry Items Units Sold = IF( SUM(Sale[Total Dry Items]) <> 0, SUM(Sale[Total Dry Items]) )
By using a variable, you can avoid calling SUM twice:
Total Dry Items Units Sold = VAR TotalDryItems = SUM(Sale[Total Dry Items]) VAR Result = IF( TotalDryItems <> 0, TotalDryItems ) RETURN Result
Variables are especially useful when you want to store computationally expensive values, because variables are evaluated no more than once. As you’ll see later in this chapter, you can use many variables within the same formula.
Use CALCULATE to manipulate filters
Earlier in this chapter, you saw that the CALCULATE function can be used to alter relationships when paired with other DAX measures. The USERELATIONSHIP function with CALCULATE can activate inactive relationships, and CROSSFILTER with CALCULATE can change the filter direction.
The CALCULATE function also allows you to alter the filter context under which measures are evaluated; you can add, remove, or update filters, or you can trigger context transition. We cover row context, filter context, and context transition in more detail later in this chapter.
CALCULATE accepts a scalar expression as its first parameter, and subsequent parameters are filter arguments. Using CALCULATE with no filter arguments is only useful for context transition.
Adding filters
CALCULATE allows you to add filters in several formats. To calculate profit for the New England sales territory, you can write a measure that you can read as “Calculate the Total Profit where the Sales Territory is New England”:
New England Profit = CALCULATE( [Total Profit], City[Sales Territory] = "New England" )
Importantly, you’re not limited to using one value per filter. You can calculate profit for New England, Far West, and Plains:
New England, Far West, and Plains Profit = CALCULATE( [Total Profit], City[Sales Territory] IN {"New England", "Far West", "Plains"} )
You can specify filters for different columns at once too, which are combined by using the AND DAX function. For example, you can calculate profit in New England in 2020 that reads as “Calculate the Total Profit where the Sales Territory is New England and the Year is 2020”:
New England Profit 2020 = CALCULATE( [Total Profit], City[Sales Territory] = "New England", 'Date'[Year] = 2020 )
Removing filters
There are several DAX functions that you can use as CALCULATE modifiers to ignore filters, one of which is ALL. ALL can remove filters from:
One or more columns from the same table
An entire table
The whole data model (when ALL is used with no parameters)
For example, you can show profit for all sales territories regardless of any filters on the City[Sales Territory] column:
Profit All Sales Territories = CALCULATE( [Total Profit], ALL(City[Sales Territory]) )
If you create a table that shows the new measure alongside Total Profit by Sales Territory, you get the results shown in Figure 2-20.
FIGURE 2-20 Total Profit and Profit All Sales Territories by Sales Territory
Note that the new measure displays the same value for any sales territory, which is the total of all sales territories combined regardless of sales territory.
Updating filters
When you specify a filter such as City[Sales Territory] = "New England", it’s an abbreviated way that corresponds to the following filter:
FILTER( ALL(City[Sales Territory]), City[Sales Territory] = "New England" )
By adding this filter, you are ignoring a filter by using ALL, and you’re adding a filter at the same time. This allows you to filter for New England regardless of the selected sales territory.
If you create a table that shows Total Profit and New England Profit by Sales Territory, the result should look like Figure 2-21.
FIGURE 2-21 Total Profit and New England Profit by Sales Territory
When you have Sales Territory on rows, each row from the Total Profit column is filtered for a single sales territory and the Total row shows values for all sales territories. In contrast, by using the measure above in the New England Profit column, you are filtering regardless of the current sales territory, showing only the New England Profit.
Context transition
Another important function of CALCULATE is context transition, which refers to transitioning from row context to filter context.
In DAX, there are two evaluation contexts:
Row context This context can be understood as “the current row.” Row context is present in calculated columns and iterators. Iterators are functions that take a table and go row by row, evaluating an expression for each row. For example, FILTER is an iterator; it takes a table, and for each row, it evaluates a filter condition. Those rows that satisfy the condition are included in the result of FILTER.
Filter context This context can be understood as “all applied filters.” Filters can come from slicers, from the Filter pane, or by selecting a visual element. Filters can also be applied programmatically by using DAX.
To review context transition, let’s create a sample table in the data model:
On the Home ribbon, select Enter data.
Enter Sample in the Name box.
Enter the data shown in Figure 2-22.
FIGURE 2-22 Entering data
Select Load.
Now that you have the table, you can add two calculated columns to it to see the effect of context transition:
Go to the Data view.
Select the Sample table in the Fields pane.
Create a calculated column with the following formula:
Sum Number = SUM('Sample'[Number])
Create another calculated column with the following formula:
Calculate Sum Number = CALCULATE(SUM('Sample'[Number]))
The result should look like Figure 2-23.
FIGURE 2-23 Calculated columns in the Sample table
SUM, as an aggregation function, uses filter context. Because there are no filters in the data model—there are no visuals, and you’re not adding any filters by using DAX—SUM aggregates the whole Number column, so the result in the Sum Number column is 6 regardless of the row.
On the other hand, the Calculate Sum Number column uses the same formula as Sum Number, but importantly has been wrapped in CALCULATE. CALCULATE automatically performs context transition, so the result is different from using the SUM function alone. Context transition takes all values from all other columns and uses them as filters. Therefore, for the first row, you aggregate the Number column, where:
Sample[Letter] is A
Sample[Number] is 1
Sample[Sum Number] is 6
Where the sum of 1 is equal to 1, since there’s only one such row that meets these filters, you get 1. Separately for row 2, the sum of 2 equals 2, and for row 3, the sum of 3 equals 3. Context transition can be made even clearer by modifying the Sample table slightly as follows:
On the Home ribbon, select Transform data.
Select the Sample query.
Select the cog wheel in the Source step.
Change the third row to match the second row, as shown in Figure 2-24.
FIGURE 2-24 Modified Sample table
Select OK.
On the Home ribbon of Power Query Editor, select Close & Apply.
If you now look at the Sample table in the Data view, the result will look like Figure 2-25.
FIGURE 2-25 Sample table after update
Although the first row is calculated as you saw in the previous example, the second and third rows are now both showing 4. Intuitively, you could expect to see 2 and 2 in each row, though you’re getting 4 and 4. This is because for each row, due to context transition triggered by CALCULATE, you’re summing the Number column, where
Sample[Letter] is B
Sample[Number] is 2
Sample[Sum Number] is 5
Because there are two such rows, you get 2 + 2 = 4 in both rows.
Implement Time Intelligence using DAX
It is common for business users to want to aggregate metrics—for example, revenue—across time, such as year-to-date revenue for a certain date, or prior-year revenue for the comparable period. Fortunately, DAX has a family of functions, referred to as Time Intelligence, that facilitate such calculations.
All Time Intelligence functions require a calendar table that has a date type column with unique values. If the date column is not part of a relationship, the calendar table must be marked as a date table, which can be done as follows:
Go to the Report or Data view.
Select the calendar table in the Fields pane.
On the Table tools ribbon, select Mark as date table > Mark as date table.
Select the date column from the Date column dropdown list.
Select OK.
Most Time Intelligence functions return tables that can be used as filters in CALCULATE. For example, you can use the DATESYTD function to calculate a year-to-date amount as follows:
Profit YTD = CALCULATE( [Total Profit], DATESYTD('Date'[Date]) )
You can also combine Time Intelligence functions. For example, to calculate year-to-date profit for the previous year, use the following formula:
Profit PYTD = CALCULATE( [Profit YTD], DATEADD('Date'[Date], -1, YEAR) )
Some Time Intelligence functions, such as DATESYTD, can accommodate fiscal years. For example, if you had a fiscal year ending on June 30, you could calculate profit year-to-date for the fiscal year as follows:
Profit FYTD = CALCULATE( [Total Profit], DATESYTD('Date'[Date], "30-6") )
The Total Profit, Profit YTD, Profit PYTD, and Profit FYTD measures can be seen together in Figure 2-26.
FIGURE 2-26 Time Intelligence calculations
Notice how the Profit YTD measure shows the cumulative total profit within each year. The Profit PYTD measure shows the same values as Profit YTD one year before. Profit FYTD shows the cumulative total profit for fiscal years, resetting on July 1 of each year.
Replace implicit measures with explicit measures
It is sometimes possible to replace some numeric columns with measures, which can reduce the size of the data model. In our Wide World Importers example, there are several columns that could be replaced with measures.
For example, the Total Chiller Items and Total Dry Items columns in the Sale table show quantity of chiller and dry items, respectively. Essentially, these columns show filtered quantities depending on whether an item is a chiller or a dry item.
Before you replace the two columns with measures, create the following measure, which you’ll reference and build upon later:
Total Quantity = SUM(Sale[Quantity])
You can now create the following two measures and use them instead of columns:
Total Chiller Items (Measure) = CALCULATE( [Total Quantity], 'Stock Item'[Is Chiller Stock] = TRUE ) Total Dry Items (Measure) = CALCULATE( [Total Quantity], 'Stock Item'[Is Chiller Stock] = FALSE )
If you remove the Total Chiller Items and Total Dry Items columns from the model, you’ll make it smaller and more efficient.
Another example of a column that can be replaced by a measure is Total Including Tax from the Sale table. Since Total Excluding Tax and Tax Amount added together equals Total Including Tax, you can use the following measure instead:
Total Including Tax (Measure) = SUMX( Sale, Sale[Total Excluding Tax] + Sale[Tax Amount] )
Again, once you have the measure, removing the Total Including Tax column would reduce the size of the data model.
Use basic statistical functions
As mentioned previously, it’s best practice to create explicit measures even for basic calculations such as SUM, because you can build upon them to create more complex measures. You’ve already used SUM in our previous examples; here are several other basic statistical measures that are frequently used:
AVERAGE
MEDIAN
COUNT
DISTINCTCOUNT
MIN
MAX
All these functions take a column as a reference and produce a scalar value. In addition, every function except DISTINCTCOUNT has an equivalent iterator function with the X suffix—for instance, SUMX is the iterator counterpart of SUM. Iterators take two parameters: a table to iterate through, and an expression to evaluate for each row. The evaluated results are then aggregated according to the base function; for example, SUMX will sum the results. When you’re learning the difference, it can be helpful to create sample tables similar to the examples shown earlier to visually compare the nuances of the different functions.
Create semi-additive measures
In general, there are three kinds of measures:
Additive These measures are aggregated by using the SUM function across any dimensions. A typical example is revenue, which can be added across different product categories, cities, and dates, as well as other dimensions. Revenue of all months within a year, when added together, equals the total year revenue.
Semi-additive These measures can be added across some but not all dimensions. For example, inventory counts can be added across different product categories and cities, but not dates; if you had five units yesterday and two units today, that doesn’t mean you’ll have seven units tomorrow. On the other hand, if you have five units in Sydney and two units in Melbourne, this means you’ve got seven units in the two cities in total.
Non-additive These measures cannot be added across any dimensions. For instance, you cannot add up the average price across any dimension, because the result would not make any practical sense. If the average sale price in Sydney was $4.50, and it was $3.50 in Melbourne, you cannot say that across both cities, the average price was $8.00 or even $4.00 because the number of units sold could be very different.
In this section, we focus on semi-additive measures. There are several ways to write a semi-additive measure, and the correct way for you depends on your business requirements. Let’s say your business is interested in inventory counts, and you have the data model shown in Figure 2-27.
FIGURE 2.27 Inventory data model
If you have inventory figures for all dates of interest in your data, you can write the following measure:
Inventory Balance = CALCULATE( SUM(Inventory[Balance]), LASTDATE('Date'[Date]) )
In addition to LASTDATE and its sister function FIRSTDATE, there are some DAX functions that can help you retrieve the opening or closing balance for different time periods:
OPENINGBALANCEMONTH
OPENINGBALANCEQUARTER
OPENINGBALANCEYEAR
CLOSINGBALANCEMONTH
CLOSINGBALANCEQUARTER
CLOSINGBALANCEYEAR
The functions that start with CLOSING evaluate an expression for the last date in the period, and the functions that start with OPENING evaluate an expression for one day before the first date in the period. This means that the opening balance for February 1 is the same as the closing balance for January 31.
For example, you can calculate the opening month balance for inventory as follows:
Inventory Opening Balance Month = OPENINGBALANCEMONTH( SUM(Inventory[Balance]), 'Date'[Date] )
The date-based functions listed here only work if you have data for all dates of interest. For example, if you’d chosen to use CLOSINGBALANCEMONTH but your data ends on May 23, 2022, as is the case for sample data, you’d get a blank value for May 2022. For cases such as this, you can use LASTNONBLANKVALUE or FIRSTNONBLANKVALUE as shown here:
Inventory Last Nonblank = LASTNONBLANKVALUE( 'Date'[Date], SUM(Inventory[Balance]) )
This measure will show the latest available balance in the current context.
The Inventory Balance, Inventory Opening Balance Month, and Inventory Last Nonblank measures can be seen in Figure 2-28.
FIGURE 2.28 Inventory measures
Determining which calculation you should use depends on your business requirements—there is no single correct answer that applies to all scenarios. Missing data may mean there’s no inventory, or it may mean that data isn’t captured frequently enough, so the data modeler should understand the underlying data before writing the calculations to ensure the data isn’t represented incorrectly.
Use quick measures
A measure in Power BI is a dynamic evaluation of a DAX query that will change in response to interactions with other visuals, enabling quick, meaningful exploration of your data. Creating efficient measures will be one of the smartest things you can do to build insightful reports. If you’re new to DAX and writing measures, or you’re wanting to perform quick analysis, you have the option of creating a quick measure. There are several ways to create a quick measure:
Select Quick measure from the Home ribbon.
Right-click or select the ellipsis next to a table or column in the Fields pane and select New quick measure. This method may prefill the quick measure form shown next.
If you already use a field in a visual, select the dropdown arrow next to the field in the Values section and select New quick measure. This method also may prefill the quick measure form shown next. If possible, this will add the new quick measure to the existing visualization. You’ll be able to use this measure in other visuals too.
The following calculations are available as quick measures:
Aggregate per category
Average per category
Variance per category
Max per category
Min per category
Weighted average per category
Filters
Filtered value
Difference from filtered value
Percentage difference from filtered value
Sales from new customers
Time Intelligence
Year-to-date total
Quarter-to-date total
Month-to-date total
Year-over-year change
Quarter-over-quarter change
Month-over-month change
Rolling average
Totals
Running total
Total for category (filters applied)
Total for category (filters not applied)
Mathematical operations
Addition
Subtraction
Multiplication
Division
Percentage difference
Correlation coefficient
Text
Star rating
Concatenated list of values
Each calculation has its own description and a list of field wells—you can see an example in Figure 2-29.
FIGURE 2.29 Quick Measures dialog box
For example, by using quick measures, you can calculate average profit per employee for Wide World Importers as follows:
Ensure the Sale table is selected in the Fields pane.
Select Quick measure on the Home ribbon.
From the Calculation dropdown list, select Average per category.
Drag the Profit column from the Sale table to the Base value field well.
Drag the Employee column from the Employee table to the Category field well.
Select OK.
Once done, you can find the new measure called Profit average per Employee in the Fields pane.
If you select the new measure, you’ll see its DAX formula:
Profit average per Employee = AVERAGEX( KEEPFILTERS(VALUES('Employee'[Employee])), CALCULATE(SUM('Sale'[Profit])) )
You can modify the formula, if needed. Reading the DAX can be a great way to learn how measures can be written.