Design a multidimensional business intelligence (BI) semantic model
- 9/7/2017
- Skill 1.1: Create a multidimensional database by using Microsoft SQL Server Analysis Services (SSAS)
- Skill 1.2: Design and implement dimensions in a cube
- Skill 1.3: Implement measures and measure groups in a cube
- Chapter summary
- Thought experiment
- Thought experiment answers
Skill 1.3: Implement measures and measure groups in a cube
Often you continue adding measures and measure groups to a cube after using the Cube Wizard because it is easier to develop and test a cube incrementally rather than trying to add everything at once in the wizard. Another post-wizard task is to configure various properties for measures and measure groups in your cube to control behavior. Among other reasons, you can use properties to control how values aggregate from detail to summary levels or to apply a format string for better legibility.
Design and implement measures, measure groups, granularity, calculated measures, and aggregate functions
The cube designer provides a lot of functionality for developing your cube beyond the basic configuration defined by using the Cube Wizard. You can use it to add new measures and configure properties for those measures in addition to bringing in multiple measures in a bulk for a new measure group. You can also use it to adjust the granularity of relationships between measure groups and dimensions. Not only can you define measures from which values are retrieved from a source fact table, but you can also define calculated measures, which are measures derived by defining expressions that operate on fact-sourced measures. Last, an important step in cube development is ensuring that SSAS applies the appropriate aggregate function to each measure, which you also configure in the cube designer.
Measures
The “Develop a cube” section of Skill 1.1 described how to add measures to a cube in bulk by using the Cube Wizard. You can also add measures manually to an existing cube or change properties for measures by using the cube designer.
To add a new measure to an existing cube, perform the following steps:
Double-click the Wide World Importers DW.cube file in Solution Explorer, right-click the Sale measure group in the Measures pane on the Cube Structure page of the cube designer, and select New Measure.
In the current cube, all numeric columns that represent measures already exist in the cube, but if you had omitted a measure based on one of these columns when using the Cube Wizard, you could select it in the Source Columns list in the New Measure dialog box, and add it to your cube.
Another option is to create a distinct count measure based on a key column in the fact table. For example, let’s say that you want to include the ability to show the distinct count of stock items sold to each customer or sold on a given day. To do this, select the Show All Columns check box in the New Measure dialog box, select Distinct Count in the Usage drop-down list, select Stock Item Key in the Source Column list, as shown in Figure 1-51, and click OK.
FIGURE 1-51 Selection of a measure in the New Measure dialog box
Rather than appearing in the Sales measure group, Stock Item Key Distinct Count appears in a newly added measure group named Sale 1 because distinct count measures are managed differently from other measures by SSAS. Because the calculation of a distinct count value at query time is expensive for SSAS to perform, SSAS places each distinct count measure into its own measure group. You can rename the measure group by right-clicking it and typing a new name.
After you create a measure, or after adding several measures by using the Cube Wizard, you should review measure properties to ensure that each measure behaves as expected when business users browse the cube. You can review a measure’s properties by selecting the measure in the Measures pane of the cube designer, and then scrolling through the Properties window. Each measure has the following set of properties that you can configure:
AggregateFunction Determines how SSAS aggregates detail rows from the fact table to summary values. The most commonly used AggregateFunction property is Sum. The other AggregateFunction values are explained later in this section.
DataType Inherits from the data type for the measure column in the source fact table.
DisplayFolder Specifies the folder in which to display the measure in the client application. You type in the name of a new folder or select the name of an existing folder in the property’s drop-down list. Using a display folder is helpful for logically organizing measures when the measure group contains a lot of measures.
MeasureExpression Defines an MDX expression that resolves the value for a measure at the leaf level before aggregation. A common reason to implement a measure expression is to multiply a sales value by an exchange rate prior to summing up the sales values.
Visible Determines whether the client application displays the measure. If you change this property to False, you can still reference the measure in MDX expressions, which is useful when the base measure is not required for analysis.
Description Allows you to provide additional information about a measure for display in a client application.
FormatString You can select a format string from a list of possible formats, such as Percent or Currency, or type in a user-defined format, such as #,# to display an integer value with a thousands separator. You can use any valid Microsoft Visual Basic custom format for numeric values as a format string.
Name Provides a name for display in client applications or for reference in MDX expressions.
Source Binds the measure to a specific table and column in the DSV.
Currently, when you browse the cube, the default properties for each measure are in effect. If you browse the cube in the cube designer, you cannot see formatting. Although no formatting is set yet, you can set up a baseline query by using Excel to browse the cube. To do this, perform the following steps:
Open the Browser tab of the cube designer, and then select Analyze In Excel on the Cube menu.
Click Enable in the Microsoft Security Notice message box.
Next, in the PivotTable Fields list, select the following measures: Profit, Quantity, Sale Count, Tax Rate, Total Excluding Tax, and Unit Price. Scroll down in the PivotTable Fields List to locate and select the Buying Group check box in the Customer dimension.
The results of these selections display in a pivot table, as shown in Figure 1-52. Notice none of these values are formatted for better legibility and currently you can sum the values in each row of a column to produce the Grand Total value in the bottom row.
FIGURE 1-52 A pivot table that displays unformatted measures
The three most common changes that are necessary to produce desired results in a cube are to change the AggregateFunction, FormatString, and Name properties of a measure. The AggregateFunction property affects the totals that appear for dimension members in the query as well as the grand total. The FormatString property makes it easier to view the individual values in the query results, whereas the Name property should be a user-friendly name that clearly communicates what the value represents using terms that business users recognize.
To observe the effect of changing the properties of these measures, return to the Cube Structure page of the cube designer, and update the property values as shown for each measure listed in Table 1-4.
TABLE 1-4 New property values for selected measures
Measure |
Property Name |
Property Value |
Quantity |
FormatString |
#,# |
Unit Price |
AggregateFunction |
None |
FormatString |
Currency | |
Tax Rate |
AggregateFunction |
None |
Total Excluding Tax |
FormatString |
Currency |
Name |
Sales Amount Without Tax | |
Tax Amount |
FormatString |
Currency |
Profit |
FormatString |
Currency |
Total Including Tax |
FormatString |
Currency |
Name |
Sales Amount With Tax | |
Total Dry Items |
FormatString |
#,# |
Name |
Dry Item Count | |
Total Chiller Items |
FormatString |
#,# |
Name |
Chiller Item Count | |
Sale Count |
FormatString |
#,# |
Stock Item Key Distinct Count |
FormatString |
#,# |
Name |
Stock Item Distinct Count |
Deploy the project, and then switch to Excel. On the PivotTable Analyze tab of the ribbon, click Refresh to update the pivot table with the cube changes. Notice the new formatting of some measures, the missing values for other measures, and missing measures, as shown in Figure 1-53.
FIGURE 1-53 A pivot table that displays formatted measures
Each measure that you renamed no longer appears in the pivot table and must be added back again manually. Add Sales Amount With Tax and Sales Amount Without Tax to the pivot table, and then add Tax Amount so that you can validate the Sales Amount With Tax value in a row by highlighting the Sales Amount Without Tax and Tax Amount columns in the same row, and checking the Sum in the status bar at the bottom of the window, as shown in Figure 1-54.
FIGURE 1-54 Validation of Sales Amount With Tax values by summing Sales Amount Without Tax and Tax Amount
All measures that display with values in the pivot table have the AggregateFunction value set to Sum. These values reflect the sum of all rows in the fact table for the column associated with the respective measure with a grouping by Buying Group. After the AggregateFunction property for Tax Rate and Unit Price is changed to None, the values in the pivot table for those measures no longer display for those measures because SSAS skips the aggregation step when retrieving results for the query. Those values at the row level are still accessible to SSAS for MDX calculations when requested, but summing the tax rate or the unit price for a sale is meaningless for analyzing sales. You can view the values for these two measures at the detail level by double-clicking a cell, such as E2, to open a new Excel sheet that displays the row-level detail in the fact table for the selected cell, as shown in Figure 1-55 in which some columns and rows have been hidden to focus on the columns of interest.
FIGURE 1-55 Transaction detail for the N/A buying group in a new Excel sheet
If you switch back to the original sheet containing the pivot table, remove all measures except Sale Count by dragging each measure individually out of the Values pane at the bottom of the PivotTables Fields List. In addition, remove Buying Group from the Rows pane. Then select the Stock Item Distinct Count measure and State Province (from the City dimension) to add them to the pivot table, a portion of which is shown in Figure 1-56.
FIGURE 1-56 A pivot table showing a standard measure, Sale Count, and a distinct count measure, Stock Item Distinct Count
Here you can see the number of sales transactions as the Sale Count value, which counts each row in the fact table and sums it by State Province. However, the Stock Item Distinct Count reflects the distinct count of stock items sold to customers in a specific State Province. If you scroll to the bottom of the pivot table to locate the Grand Total, you find the Sale Count Grand Total is 228,266 and the Stock Item Distinct Count Grand Total is 228. That means there are a total of 228.266 rows in the fact table, but 228 distinct stock item keys in the same table.
Most of the time, measures are additive. That is, regardless of which dimension you include in the query with the measure, the aggregation of the measure by any of the dimension’s attributes is a value that accurately adds up. AggregateFunction property values like Sum or Count are additive measures. You can also define nonadditive measures semiadditive measures as described in more detail in the “Aggregate functions” and “Define semi-additive behavior” sections of this chapter.
Measure groups
Remember from Skill 1.1 that a measure group is a collection of measures that come from the same fact table. A cube can contain multiple measure groups. Ideally, these measure groups share one or more dimensions in common to facilitate analysis and reduce the potential of confusing users. If there are no overlapping dimension, consider creating separate cubes for each measure group.
Before you can add a measure group to a cube, a corresponding table must exist in the DSV. To review how to add a measure group (and later how to work with semi-additive measures), let’s add another fact table to the DSV by performing the following steps:
Double-click Wide World Importers DW.dsv in Solution Explorer.
Right-click anywhere in the diagram pane of the DSV designer, select Add/Remove Tables, select Stock Holding (Fact) in the Available Objects list, click the > button, and then click OK. The table is added to the DSV with a primary key defined as Stock Holding Key and a relationship added between the Stock Holding table, and the Stock Item table based on the Stock Item Key column.
The Stock Holding table is similar to an inventory table that shows a quantity for each stock item among other information. However, a typical inventory table also includes a date column to show the quantity of each stock item on different dates. Inventory can be tracked yearly, quarterly, monthly, or even daily, depending on business requirements. To better explore semi-additive behavior later in this chapter, you can replace the Stock Holding table with a named query that simulates changes in inventory over time by performing the following steps:
Right-click the Stock Holding table in the diagram, point to Replace Table, and select With New Named Query.
Replace the SELECT statement in the Create Named Query dialog box with the statement shown in Listing 1-10.
LISTING 1-10 Named query to simulate changes in stock item inventory over time
SELECT CONVERT(date, '2013-01-01') as DateKey, [Stock Item Key], [Quantity On Hand], [Bin Location], [Reorder Level], [Target Stock Level] FROM Fact.[Stock Holding] UNION ALL SELECT CONVERT(date, '2013-06-01') as DateKey, [Stock Item Key], [Quantity On Hand] - [Stock Item Key] AS [Quantity On Hand], [Bin Location], [Reorder Level], [Target Stock Level] FROM Fact.[Stock Holding] AS t;
A surrogate key column is not necessary in a fact table, so the named query eliminates that column along with other extraneous columns so that the results of the named query are simplified as much as possible. In addition, an arbitrary date column is added and a new value is computed for the quantity on hand for the second time period to force a difference in inventory between time periods.
Before clicking OK, be sure to type Stock Holding in the Name box at the top of the dialog box if it is not currently correctly named.
Press CTRL+S to save the DSV with the new named query.
Create a relationship between the Stock Holding and Date tables. Select DateKey in the Stock Holding named query diagram and drag it to the Date column in the Date table.
Now you are ready to use the new named query as a new measure group. Double-click the Wide World Importers DW.cube file in Solution Explorer, right-click in the Measures pane of the cube designer, and select New Measure Group. In the New Measure Group dialog box, select Stock Holding, and click OK.
In the Measures pane, expand the Stock Holding measure group to review the added measures: Quantity On Hand, Reorder Level, Target Stock Level, and Stock Holding Count.
All numeric columns that are not used in relationships in the DSV are assumed to be measures and added as a group when you create the new measure group. If you do not want to include a measure, such as Stock Holding Count, right-click the measure in the Measures pane, select Delete, and then click OK to confirm the deletion.
Set the FormatString property for each of the remaining three measures to #,#.
Each measure group has many different properties available to configure. For the most part, you can keep the default values. Chapter 4 explains more about the properties that affect aggregation design, file placement, and partition processing. For now, let’s consider only the following properties:
IgnoreUnrelatedDimension Determines whether SSAS displays a top-level aggregation value for all members in an unrelated dimension, as shown in Figure 1-57. The default is True.
FIGURE 1-57 Query results showing quantity on hand by buying group
Name Provides a name for display in client applications or for reference in MDX expressions. You can rename a measure group at any time. For example, consider changing the Sale 1 measure group name to Stock Item Sales Distinct Count.
On the Dimension Usage page, select the Stock Holding measure group, and then, in the Properties window, change the IgnoreUnrelatedDimension property’s value to False.
Deploy the project, open the Browser page of the cube designer, click Reconnect, and set up a query to view Quantity On Hand by Buying Group.
The following message displays: No Rows Found. Click To Execute The Query. Because no relationship exists between the measure and the dimension, and the IgnoreUnrelatedDimension property is False, SSAS has no results to display in the query.
Granularity
Normally, the granularity attribute is the one that is set as the key in the dimension, but there can be modeling scenarios in which it is a different attribute. To further clarify granularity, consider a situation in which you have a Date dimension that has Day, Month, and Year attributes, and you have a Sale fact table that includes a DayKey column that you associate with the Day attribute as a regular relationship. In the Date dimension, the Month attribute has two key columns, Month and Year. Let’s say you also have a Forecast fact table in which forecasted sales by month and year is stored, as shown in Figure 1-58.
FIGURE 1-58 Fact tables with different granularity sharing a common dimension
To configure the dimension usage properly for the hypothetical Forecast measure group and the Date dimension, create a regular relationship between them and set the granularity attribute to Month. When you define this granularity attribute, the Dimension Columns list in the Define Relationships dialog box displays the two key columns, Month and Year, for the selected granularity attribute and then you can map those two columns to the Month and Year measure group columns.
Calculated measures
An important feature of SSAS is the ability to define calculated measures. Whereas reporting against a relational database requires you to define business logic for calculations such as profit margins in a report, you can add this business logic to a cube and thereby ensure that every query that returns profit margin performs the calculation consistently. SSAS stores only the calculated measure definition and performs the calculation at query time. However, SSAS does cache the query results to improve performance for subsequent queries requesting the same calculation, which Chapter 4 explains in more detail.
To review the steps necessary to add a calculated measure, let’s add a simple calculated measure to the Wide World Importers DW cube to compute profit margin:
Open the cube designer, and then click the Calculations tab.
Click the Form View button in the cube designer toolbar, and then click the New Calculated Member button in the cube designer toolbar.
A calculated measure is a member of the Measures dimension that is calculated by using an MDX expression. You can also create calculated members for other dimensions in the cube, as explained in Chapter 3.
Type [Profit Margin Percent] in the Name box.
In the Expression box, type the following MDX expression:
[Measures].[Profit]/[Measures].[Sales Amount Without Tax]
Chapter 3 explains the principles of MDX expressions and its syntax rules in more detail. This expression is a simple example of computing the profit margin percentage by dividing the total profit by the total sales amount. SSAS automatically aggregates the numerator separately from the aggregation of the denominator and then performs the division. If you add a filter to a query, SSAS applies the filter to the numerator and denominator aggregations separately and then performs the division.
In the Format String drop-down list, select “Percent” to apply a format string.
Deploy the project to update your multidimensional database.
Check the calculation by opening the Browser tab of the cube designer, clicking Reconnect, and dragging Profit Margin Percent from the Measures folder in the metadata pane to the query window. The result is 0.497687049958665. (The cube browser in SSDT does not apply the format string.)
This calculation is functionally equivalent to executing the following T-SQL statement in SSMS, which yields 0.497687:
SELECT SUM(Profit) / SUM([Total Excluding Tax]) AS ProfitMarginPercent FROM Fact.Sale;
Aggregate functions
By default, each new measure you create is assigned Sum as the AggregateFunction property value. By far, this is the most common type of aggregation that you use when developing a cube. However, SSAS supports the use of other aggregate functions, some of which are additive like the Sum function while others are semiadditive and nonadditive. A semiadditive function performs an aggregation across some dimension, but not necessarily all dimension, which is described in more detail in the next section. A nonadditive function does not aggregate at all, but performs a specific type of calculation. Table 1-5 lists all aggregate functions supported by SSAS.
TABLE 1-5 Aggregate functions
Aggregate Function |
additivity |
Description |
Sum |
Additive |
Sums the values for each child member. |
Count |
Additive |
Counts the number of child members. |
Min |
Semiadditive |
Returns the lowest value for child members. |
Max |
Semiadditive |
Returns the highest value for child members. |
DistinctCount |
Nonadditive |
Performs a count of unique child members. |
None |
Nonadditive |
Ignores aggregation and returns a value only for child members requested in the query |
ByAccount |
Semiadditive |
Applies the aggregate function applicable to the current member of a dimension with its type set to Accounts. |
AverageOfChildren |
Semiadditive |
Sums the total of the child members and then divides the result by the count of non-empty child members. |
FirstChild |
Semiadditive |
Returns the value of the first child member. |
LastChild |
Semiadditive |
Returns the value of the last child member. |
FirstNonEmpty |
Semiadditive |
Returns the value of the first child member that is not empty. |
LastNonEmpty |
Semiadditive |
Returns the value of the last child member that is not empty. |
Define semi-additive behavior
Semi-additive aggregation is important when you need to analyze values from point-in-time fact tables, such as inventory. In this case, you add together at the values across one dimension, but choose a point-in-time value or average value when aggregating across the Date dimension. For example, if you have inventory counts for July and December, but you want to return the inventory count for the year, you typically want the last value of the year, which is associated with December. Another common scenario for semi-additivity is financial reporting. You add together revenue and expenses from month to month when viewing quarterly or yearly data, but you use the last value for assets or liabilities.
Let’s explore an example of semi-additive behavior by using the Stock Holding measure group added in the previous section. Before we make any changes, open the Browser tab in the cube designer to review the behavior of the measures when keeping the default values for the AggregateFunction property. Expand the Measures folder, expand Stock Holding, and then add Quantity On Hand, Reorder Level, and Target Stock Level to the query window. Next expand Date and add Date.Calendar to the query window to add all three levels of the hierarchy—Calendar Year, Calendar Month, and Date, as shown in Figure 1-59.
FIGURE 1-59 Query results showing selected measures by attributes of the Date dimension on the Browser page of the cube designer
Currently, the aggregate function applied to each of these measures is the Sum function. Therefore, the resulting values represent a total for all stock items, which is not the desired result. Let’s change the query to focus on a single stock item to help see measure behavior more clearly. Add a filter by selecting Stock Item in the Dimension drop-down list above the query window and Stock Item in the Hierarchy drop-down list. In the Filter Expression drop-down list, expand All, select the USB missile launcher (Green) check box, and click OK. Now you can see two separate inventory counts for January and June, as shown in Figure 1-60.
FIGURE 1-60 Query results for a single stock item
If you remove the Calendar Month and Date attributes from the query, the query returns a single row, as shown in Figure 1-61. The measure values reflect the sum of CY2013-Jan and CY2013-Jun aggregated values for each measure.
FIGURE 1-61 Query results for a single stock item for Calendar Year CY2013
This aggregation behavior is incorrect because the measures should reflect the value as of the last inventory date instead of the sum. To change this behavior, perform the following steps:
Open the Cube Structure page of the cube designer and set the AggregateFunction property for each of these measures to LastNonEmpty.
Deploy the project and then return to the Browser in the cube designer.
Click Reconnect in the toolbar, and then add the same three measures to the query window, add Calendar Year from the Date dimension, and set the filter for Stock Item to USB missile launcher (Green) to produce the results shown in Figure 1-62.
FIGURE 1-62 Corrected query results for a single stock item for Calendar Year CY2013