Book: Power BI: 3 in 1- Comprehensive Guide of Tips and Tricks to Learn the Functions of Power BI+ Simple and Effective Strategies+ Advanced Guide to Learn the Advanced Realms of Power BI
Previous: Chapter Three: Introduction to DAX
Next: Chapter Five: Power BI and Power Query (M Language)

 

 

Previously, we saw that a very simple DAX expression, such as a calculated field (measure), allowed us to analyze in a pivot table the total sales, the sales by store and year, the sales to individual customers by store and year, and so on; we could have continued producing dozens of reports based on that single calculated field.

But you will also sometimes find that a DAX expression does not return the value you expect, and much of the blame lies with the contexts. Therefore, we are going to study them next, since knowing them and how they behave is essential to understanding any DAX function.

Contexts allow for dynamic analysis, where the results of an expression vary according to the selections made in the row, column, and filter areas. We must know them and use them effectively, both to avoid getting wrong results and to write efficient expressions.

We have three types of contexts: row context, query context, and filter context.

Row context. It applies to the elements of the row in question, the "current row." It is used, among other places, in calculated columns. It covers the row being calculated and the rows of other tables related to that row.

It behaves like Excel formulas applied inside a table, which automatically refer to values in the same row. If the table is related to others, you can access any value in those tables for the rows related to the current row. To access these related tables, we have the RELATED function explained above.
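As a quick illustration, the following calculated column sketch (hypothetical: it assumes a Sales table related to a Product table that has a [Category] column) relies on the row context to look up a value from the related table:

```dax
// Hypothetical calculated column added to the Sales table.
// For the current row, RELATED follows the relationship and
// returns the category of the product sold in that row.
=RELATED ( Product[Category] )
```

In each row of the Sales table, the row context determines which related Product row is read.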

There are other types of cases for which DAX has a series of functions that iterate, in a kind of internal loop, the calculations on a table. For the time being, it is sufficient to understand the above and to lay down a good foundation.

Query context. It evaluates a dataset for a specific cell. It refers to the subset of data that is implicitly retrieved for a formula, and it allows elements to be added to the filter, row, and column areas.

When you place a calculated field or a column with numerical values in the value area of a pivot table, the Power Pivot engine examines the column and row headers, slicers, and report filters to determine the context. Power Pivot then performs the necessary calculations to fill each cell of the pivot table. The dataset retrieved in this way is the query context of each cell.

For a calculated field (measure), it is the context defined by the row and column, plus the filters applied from slicers and the filter areas.

Let's look at a simplified sales example, with a very small number of rows and columns, to understand it better. To analyze the sales, we have placed the dates in the rows, the calculated field "Sum of the amount" (an implicit calculated field) in the values, and we have added two slicers, one for Customer and another for Product.

In the table, you can see the cells read by the Power Pivot engine (shown in light blue and white) to obtain the value that appears calculated in the pivot table on the right.

Filter context. It is applied on top of the row and query contexts to generate exceptions and allows us to vary the evaluated cells according to our needs. That is, it lets us reduce or extend the row context and the query context using additional filters applied in DAX functions.

There are many cases in which we must apply additional filter contexts. To understand this, we are going to work through a simple example in which we calculate the sales ratio of each client with respect to the total sales of the company. For this, we define the following calculated fields:

TotalImport:=SUM([Amount])

TotalCompanyImport:=CALCULATE([TotalImport];ALL(Sales2))

RatioSales:=DIVIDE([TotalImport];[TotalCompanyImport])

Note that "TotalCompanyImport" is always 656. Later on, we will study the CALCULATE and ALL functions, as well as other functions that allow us to apply filter contexts.

As you can see, we have applied the good practice of creating calculated fields with simple calculations and reusing them. It is preferable to do this with the three calculated fields shown above rather than with the following single expression:

RatioSales:=DIVIDE(SUM([Amount]);
CALCULATE(SUM([Amount]);ALL(Sales2)))

Beyond being simpler to understand, this approach favors reuse: if we do it as initially proposed, we can reuse the calculated fields "TotalImport" and "TotalCompanyImport" in any other calculation we need, and if there were an error, correcting it in one place would automatically correct all the calculated fields that include them.

Once we have laid the foundations, we can learn any DAX function and apply it correctly. Here are some frequently used ones, which are also often used in combination with other existing functions.

We have several aggregation functions ending in "X," such as SUMX, COUNTX, AVERAGEX, MINX, and MAXX. These functions iterate row by row over the table we pass as the first parameter, evaluate the expression indicated as the second parameter for each row, and finally apply the corresponding aggregation to the results.

Let's look at something that seems very simple to use a priori: an expression that we can use in calculated columns (remember, row context) but that is not allowed in calculated fields (measures).

In the following table, we have "Quantity" and "SalesAmount," but not "NetPrice." This is not a problem: it is a simple division, to which we will also apply rounding to four digits.
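The calculated column could be written as a simple division plus rounding; this is a sketch using the column names from the example:

```dax
// Calculated column "NetPrice": unit price of the current row,
// rounded to four decimal places.
=ROUND ( [SalesAmount] / [Quantity]; 4 )
```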

If we later need to know the total sum of "NetPrice" we can create the next calculated field:

TotalNetPrice:=SUM([NetPrice])

Now let's put ourselves in the situation where we need to make that calculation directly in a calculated field (measure), dispensing with the calculated column to save disk space and memory; therefore, we go to the measure grid and create it:

NetXPrice:=ROUND([SalesAmount]/[Quantity];4)

But we come up against this error:

Semantic Error: The value of "SalesAmount" cannot be determined.

Well, what we really want is a calculated field that gives us the sum, so we include the SUM function as follows:

TotalNetPrice:=
SUM(ROUND([SalesAmount]/[Quantity];4))

And again we find an error, which is different this time:

Semantic Error: The SUM function only accepts a column reference as an argument.

Can't we really make a calculated field that adds up the result of an operation? We can, but not with the SUM function; for that, we must use the SUMX function.

TotalNetPrice:=SUMX('Sales'; ROUND([SalesAmount]/[Quantity];4))

Be very careful about making incorrect calculations, which we tend to do when starting out in the DAX language. Consider the following calculated field:

TotalMalPrice:=
ROUND(SUM([SalesAmount])/SUM([Quantity]); 4)

What this second case does is first sum "SalesAmount" over all the rows, then sum "Quantity" over all the rows, and finally divide both results.

If we now compare the results of the three calculations in the pivot table, the calculation error made in the last case is abysmal.

This is an extreme case, but keep in mind that as you create more sophisticated calculated fields (measures), you may get grand totals and subtotals that do not equal the sum or average of their detail rows. In these cases, the aggregation functions ending in "X" will help you.

The following table shows the syntax of these functions and a brief description. For more details, you can see the product's help.

SUMX(<Table>; <Expression>)
Returns the sum of an expression evaluated for each row of a table.

COUNTX(<Table>; <Expression>)
Counts the rows for which the evaluated expression returns a nonblank value.

AVERAGEX(<Table>; <Expression>)
Calculates the average of an expression evaluated for each row of the table.

MINX(<Table>; <Expression>)
Evaluates the expression for each row of the table and returns the smallest numeric value.

MAXX(<Table>; <Expression>)
Evaluates the expression for each row of the table and returns the largest numeric value.

 

The ALL function is used to remove filters from the context. It has two syntaxes:

ALL(<Table>)

ALL(<Column1>;<Column2>;...;<ColumnN>)

You cannot combine both options. If a table is indicated, it removes all filters from the specified table. If one or more columns are indicated, it removes the filters from the specified columns, keeping the filters applied to the rest of the table's columns.
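For example (a sketch based on the Sales2 table used in this chapter):

```dax
// Removes every filter applied to the whole table.
ALL ( Sales2 )

// Removes only the filters on the [Product] column,
// keeping those applied to the other columns of Sales2.
ALL ( Sales2[Product] )
```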

This function is not used alone but is used as a parameter for other functions to change the set of results on which other calculations will be performed.

The ALLEXCEPT function is the opposite in terms of which context filters are removed; that is, it keeps the context filters applied to the specified columns and removes all the others from the table. Its syntax is:

ALLEXCEPT(<Table>;<Column1>;<Column2>;...;<ColumnN>)

Everything explained for the ALL function also applies to it.

Let's see an example of when to use one or the other since, in many cases, it is mostly a matter of writing less. If we have a table with 15 columns and we need to apply ALL on 14 of them, it is more practical to use ALLEXCEPT, passing only the one column we do not want ALL applied to. The difference shows up if a new column is later added to the table: in the first case, ALL would keep applying to the 14 columns indicated as parameters and would now leave 2 columns out, while in the second case, ALLEXCEPT would apply to 15 columns (the new one included) and would still leave out only the column we passed as a parameter.
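As a sketch, assuming the Sales2 table also has a [Customer] column, keeping only the customer filters would look like this:

```dax
// Removes the filters from every column of Sales2
// except [Customer], whose filters are preserved.
ALLEXCEPT ( Sales2; Sales2[Customer] )
```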

The FILTER function allows us to make a filter on a table. Its syntax is:

FILTER(<Table>;<Filter>)

<Table>: the table to be filtered. Note that it can either be a table directly or an expression that returns a table as a result.

<Filter>: a Boolean expression, that is, one whose evaluation results in True or False. For example: [AmountSale] > 0, [Product]="P2", ...

This function is not used alone but is used as a parameter for other functions to change the set of results on which calculations will be performed.

The FILTER function can be used as a parameter in functions such as CALCULATE, COUNTROWS, SUMX, etc.

If you know the SQL language, you will notice that it has certain similarities with the WHERE clause.

Let's see an example of use:

FILTER(Sales2;[Product]="P2")

Later on, we will see additional examples using FILTER as a parameter for other functions that we will study.

The CALCULATE function evaluates an expression in a context that has been modified by the filters passed to it as parameters. Its syntax is:

CALCULATE(<Expression>;<Filter1>;<Filter2>;...;<FilterN>)

All the "Filter" parameters work as if they were inside an AND() function; that is, the expression will only apply to the rows that satisfy all the filters.

As this definition may tell you little at this point, let's put it in words more familiar to any Excel user. CALCULATE is like the SUMIF() function, but with its power and flexibility increased almost inexhaustibly. What does this really mean? Instead of applying only a "SUM," CALCULATE applies any DAX expression we pass as its first parameter, and instead of a single "IF" condition, it applies all the filters we pass as parameters. Remember also that "filter" here does not only mean selecting a subset of the current values: it is a switch of filter context, so at a given moment it can even select a set of rows larger than the currently filtered one.

We are going to show a very basic example emulating the SUMIF() function, which, although not very useful in itself, will help us better understand the CALCULATE function:

SumImportP2:=CALCULATE(SUM([Amount]); Sales2[Product]="P2")

SumImportP2:=CALCULATE(SUM([Amount]); FILTER(Sales2;[Product]="P2"))

The two expressions above are equivalent. We have used an addition expression and applied a filter so that it only adds up the amounts of the product "P2".

Let's take up the example we saw earlier when we explained the Filter Context, and now we will be able to understand it in detail:

TotalImport:=SUM([Amount])

TotalCompanyImport:=CALCULATE([TotalImport];ALL(Sales2))

We have used a sum expression and applied a filter context so that, instead of applying the filters defined by the user in the pivot table, it always applies to all the rows of the table 'Sales2', since the ALL() function here means "return all the rows of the table 'Sales2' without filtering."

Next, we will explain the keys to how CALCULATE operates. To do this, we must understand how the filters behave: the "Filter" parameters change the context of the pivot table, replacing it with the filter context.

Let's look at a third example of the use of the CALCULATE function. In this case, it is something that has a certain complexity and poor performance in relational database systems, but here it is done very easily and performs very well. We need to know the accumulated sales from the beginning of time up to a given date; for this, we use the following DAX expression:

Accumulated Sales:=CALCULATE([TotalSales]; FILTER(ALL(Date); Date[DateSK] <= MAX(Sales[DateSK]) ) )

Most of today's analytical systems incorporate several advanced time-intelligence features. Power Pivot and DAX are no exception; they incorporate them as well. This set of functions allows us to manipulate periods, which is a very common requirement in BI.

There is a simple but important requirement that we must take into account: to apply them, we need to have a column whose data type is "Date."

The time intelligence functions allow us to manipulate and compare periods. We are not going to study them one by one since, either by their name or because they also exist in Excel, they are very simple to look up in the help and start using.

There is another group of functions that returns a set of dates; PARALLELPERIOD, which we will use next, is one of them.

Here is an example that will help us understand the usefulness of this type of function. It is very common, when working with a period, to want to see the result obtained in the same period of the previous year; for this, one of the functions we can use is PARALLELPERIOD:

TotalSales:=SUM([AmountSales])
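A previous-year measure could then be sketched as follows (assuming, as in the chapter's other examples, a date table named Date with a [Date] column related to the sales table):

```dax
// Evaluates [TotalSales] over the equivalent period
// shifted back one year.
TotalSalesPreviousYear:=
CALCULATE (
    [TotalSales];
    PARALLELPERIOD ( Date[Date]; -1; YEAR )
)
```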

And finally, we'll see a third group that evaluates an expression over a period of time:

TOTALYTD(Expression, DateColumn, Filter) -- Year

TOTALQTD(Expression, DateColumn, Filter) -- Quarter

TOTALMTD(Expression, DateColumn, Filter) -- Month

Etc.

Let's see an example of the use of TOTALYTD and TOTALQTD, which allow us to accumulate sales within the current year or quarter, resetting to zero automatically for each new period.

TotalSales:=SUM([AmountSales])

TotalCurrentYear:=TOTALYTD([TotalSales]; Date[Date])
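The quarterly accumulation follows exactly the same pattern; this sketch assumes the same Date[Date] column:

```dax
// Accumulates [TotalSales] within the current quarter,
// resetting at the start of each new quarter.
TotalCurrentQuarter:=TOTALQTD ( [TotalSales]; Date[Date] )
```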

In the following image, you can see the result obtained:

Here are some examples which allow us to define an analysis report of the average ticket in our company.

These are the calculations we have used:

TotalSales:=SUM(Sales[AmountSales])

TotalQuantity:=SUM([Quantity])

NoOperations:=COUNTROWS(VALUES(Sales[Ticket]))

AverageTicket(Amount):=DIVIDE([TotalSales]; [NoOperations])

AverageTicket(Quantity):=DIVIDE([TotalQuantity]; [NoOperations])

Accumulated Sales:=CALCULATE([TotalSales]; FILTER(ALL(Date); Date[Date] <= MAX(Date[Date])))

As you can understand, it is not enough to do the calculations in DAX. Once we get to this point, we have to be able to present them with the greatest possible visual impact. We will go deeper into these visualization techniques later on, learning the possibilities of the tools at our disposal and showing multiple examples of use.

If we have previously worked on the development of data models using Power Pivot or SSAS Tabular, when we have to do this same task from Power BI, we will surely notice the lack of an important feature: the possibility of reviewing the results of the measures we create in the model from an Excel pivot table.

Take as an example the data model based on the Population Register outlined in Chapter 7 of this book, on the design of a population data model.

That chapter set out the creation of a set of measures for the model, which allowed us to carry out the population analysis by the absolute number of inhabitants and through a series of structure indicators. For the example that concerns us at the moment, let us keep the Total Population measure and the structure indicators by age, assuming that we have not yet added them to the model.

Total Population =
COUNTROWS ( PopulationCensus )

Youth Population =
CALCULATE ( [Total Population], Age[AgeID] >= 0 && Age[AgeID] <= 14 )

Adult Population =
CALCULATE ( [Total Population], Age[AgeID] >= 15 && Age[AgeID] <= 64 )

Major Population =
CALCULATE ( [Total Population], Age[AgeID] >= 65 )

Let's start by adding the Total Population and Youth Population measures.

Then, using the means offered by Power BI, if we wanted to verify the validity of the measures, we would use the Card and Multi-row card visuals to observe the total values, or the Table and Matrix visuals to analyze the numbers disaggregated by the elements of some of the model's lookup tables.

However, this procedure will seem somewhat cumbersome if we already have experience in the development of multidimensional cubes or tabular models, where, simply by opening Excel and connecting to the model, we already have a pivot table with which to check the measures we have created.

For this reason, we propose a technique explained by its author in his blog, where, in addition to Power BI, the use of Excel and DAX Studio is required, with all the elements working in a coordinated way to achieve our goal.

If you usually work with tabular models, whether from SSAS Tabular, Power Pivot, or Power BI, this utility cannot be missing from your toolbox, not only for the technique we will explain below but for the large number of features it offers, which will surely make your work a little easier.

First, we will download DAX Studio from its website and proceed to install it on our machine.

Then we will open the Power BI file in which we are developing our data model, and to which we have already added the Total Population and Youth Population measures.

Next, we will run DAX Studio, which, as a first step, will ask us to connect to a valid data source. In our case, having already opened a Power BI file, DAX Studio will detect this situation, offering to connect with it.

Regarding the other connection modes, if it is an SSAS Tabular data source, it is very straightforward, since we only have to enter the server and instance name.

Finally, the connection to a Power Pivot model is made directly in the Excel file that contains the model, from the Add-ins tab, by clicking on the DAX Studio option.

Returning to the current scenario, Power BI acts as an SSAS server, which we can check from the Windows Task Manager.

This service is what DAX Studio connects to in order to access the contents of the data model, which allows us, among other features, to execute DAX queries, obtain information about query execution times, and run Dynamic Management Views (DMVs). It also gives us a very important piece of information: the connection address of the service, shown in the lower right part of the window.

The next step will be to open Excel and establish a connection against the Analysis Services data source represented by the Power BI file, entering in the window of the data connection wizard the address of the service obtained from DAX Studio.

Accepting the rest of the steps in this wizard will result in a pivot table, connected to the Power BI model, from which it will be easier to review our measures.

Let us now continue with the development of the model by adding the two structure indicator measures still pending: Adult Population and Major Population.

Now we will return to Excel and select the Refresh option, located in the Data group of the PivotTable Tools | Analyze tab, which will update the contents of the pivot table, incorporating the new measures.

Reconnect with the Data Model in a New Work Session

It is important to mention that, after finishing our work session, closing Power BI, DAX Studio, and Excel, when reopening Power BI, a new address will be generated for the SSAS service, which we can verify when running DAX Studio. Due to this, if we try to validate the measures of the model using the same Excel file with which we had previously worked, we will get an error.

Accepting this warning window will open a connection window in which we can re-enter the new address. After confirming this and the next step, we will reconnect successfully from Excel, and the pivot table will be operational again.

Get Model Measurements from DAX Studio

Another small but useful trick of DAX Studio lies in the possibility of obtaining a list of the measures that make up our model, with diverse information, including their source code. To do this, in the query panel, we will write the following statement.

SELECT * FROM $SYSTEM.MDSCHEMA_MEASURES

We can also drag and drop this dynamic management view from the DMV tab located in the left pane of the DAX Studio window.

Then we will click on the Run option (or press F5), obtaining in the results panel all the information related to the model measurements.

The MDSCHEMA_MEASURES view returns a large number of columns, of which we may only need a few very specific ones, such as the name of the measure and the expression used to create it. We also obtain a row corresponding to the default measure of the model, which we can exclude using the MEASURE_AGGREGATOR column. Therefore, we can refine the previous query as follows.

SELECT
    MEASURE_NAME, EXPRESSION
FROM $SYSTEM.MDSCHEMA_MEASURES
WHERE MEASURE_AGGREGATOR = 0

The Power BI model we are using for our example consists of a single table of data and measures, but if we develop more complex models, in which several measure tables intervene, it is advisable to add the MEASUREGROUP_NAME column to the previous query, which reports the name of the table to which each measure belongs.
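With that column added, the refined query would look like this:

```sql
SELECT
    MEASUREGROUP_NAME, MEASURE_NAME, EXPRESSION
FROM $SYSTEM.MDSCHEMA_MEASURES
WHERE MEASURE_AGGREGATOR = 0
```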

 

