DANIEL JONES
Business intelligence software is capable of turning large volumes of data into analytical and summary dashboards. They are configurable and capable of automatically executing the routine of data collection, treatment1, analysis, and visualization. The term BI began to be used for technology in mid-1958. From the initial letters of business intelligence, it was tied to software capabilities of transforming the way much information was consumed in Hans Peter Luhn's article " ." The computer scientist, who was currently working for IBM, described BI software as automatic systems designed to propagate information to various sectors of the economy. His research, among the developments of more than 80 patents, was initially applied in IBM analysis software.
Nowadays, there are many business intelligence software in the world that provides the basis for the functionalities that software of this category should offer, such as data analysis through calculations using different languages, connection to external sources in various ways and data visualization with specific graphics and with its own design and configuration. Given all of its similarities and differences, one of the most relevant in Power BI is the learning curve.
Perhaps because of the strategy of making some of its resources available in Excel before its release (Power Pivot and Power Query), but certainly not only because of that but because of the very way it was logically architected, the learning of languages and the process of building projects in Power BI is considered by us one of the fastest and most consistent, enabling results at a higher speed than many software in the same category.
Research that supports and shows some of the reasons why we, as business intelligence experts, chose Power BI as the leading BI software is published by Gartner, a leading global company focused on research and consulting in several areas of information technology. Every year an extensive report and graph summarizing it, called the "Magic Quadrant of Analytics and Business Intelligence Platforms," are published with analysis of BI software, being analyzed 15 critical aspects involving infrastructure, data management, analysis and content creation, discovery and sharing, ease of use and return on investment. Power BI is identified as "Microsoft" and is in the top direct quadrant, i.e., the market leaders today.
After downloading, open Power BI Desktop and you will observe each numerical legend commented below.
Power Query Editor is a free add-on created by Microsoft in 2013 for Excel 2010 and 2013 and later also included in Power BI. It is the intuitive tool that allows you to transform, maintain, manipulate, consolidate, merge, enrich, and do much more with your data.
It acts like the ETL (Extract, Transform and Load) tool, extracting data from nearly any source, transforming it and then loading it elsewhere - in this case, Power BI or Excel. The Power Query Editor interface is separate from Power BI, and its window opens when you click Edit Queries. The function language M is its base, coming from the term data mashup, i.e., obtaining and combining data.
There are over 600 functions in M language that, when combined, provide comprehensive variation for data collection, processing, and joining. Its main concept is in the automation of this whole process, being necessary to connect only once with the data source and treat it so that the next updates follow the same pattern. Among the categories of functions available, we have:
Data retrieval | Date and time | Expressions |
Binary | Date, time and time zone | Lists |
Combination | Duration | Lines |
Date | Error handling | Logic |
Replacement | Record | Numerical |
Type | Uri | Text |
Separation | Table | Weather |
In Power BI, we have a separate window, exclusive for data processing. Look below at each of the regions commented on to begin familiarizing yourself with your organization.
Another way to import data into Power BI is by using the Insert Data option, located next to getting Data. In the window shown below, you can copy data from other locations and paste it in, as well as declare it (by typing it in the columns and rows). As this data has no connection to external sources, it will be static and normally only used for fast data entry purposes that need not be changed.
When connecting with data in Power Query, lines of code of language M will be automatically created with the configuration of the connection made. As a Power BI user, you do not need to know the M language to use Power Query. Its interface was developed with the end-user in mind and is filled with buttons that automate transformations, processing, and data creation.
When we need to present a report at a meeting or do some kind of analysis, it is not uncommon to search for the data in various sources such as spreadsheets, text files, databases, or reports exported from systems. Then you have to start processing that data. When done in Excel, for example, this work is done by copying from here and pasting their several formulas in Excel to extract a part of the text, convert the data correctly, macro in VBA to remove unwanted rows, join information from one report with another to this information
Instead of filtering by a manually typed text, we will choose the filter by a parameter. Note that we could have created a new parameter in this window, too, by choosing the last option. Then, just select the parameter in the list next to it and click OK.
The column will be immediately filtered for the value that is stored in the Country parameter. Note the formula in Power Query: each [Country] = Country, i.e., each item in the Country column that matches the text entered in the Country parameter.
We can modify the current parameter by clicking on it and writing, for example, "CA" in the Current Value box. Afterward, just press Enter on the keyboard to save this new text in the parameter.
Before starting the technical studies in DAX, it is important to emphasize the concept of data modeling. This type of project usually involves databases and how to delimit them, obtain them, transform them, and model them. Data modeling simplifies a natural complexity of data and big data and can be defined as a logical process of relationships between different data, enhancing its possible links to meet and respond to business needs.
Data in modeling projects are usually presented in tables. Some of them have detailed operation data, de facto calls (such as sales orders, production orders, invoices). They are the result of pro- cessation of business and usually store numbers that can be calculated and summarized into scalar values that will be distributed in the contexts of visual filters in Power BI.
The fact tables correspond to a specific event and should contain only columns related to that event. For example, sales data for a company can have the granularity daily or even temporal (with the time of purchase) and contain information about the sale, such as quantity, values, method of payment, product purchased, the customer who bought that product and the seller who sold it. It is irrelevant to insert, for example, the date of birth of the customer in this table, since here we are talking about sales facts and not customers.
Other tables in the data model will have descriptive and unique data, the so-called dimension tables. Basically, they describe why, when, where, who, what, and how the events in the tables were recorded. They are able to group several facts into specific attributes or sub-dimensions, categorizing them and making it possible to filter through these similar descriptions among several records in the fact table. In addition to this attribute, they are essential for making the "connection" between fact tables, since they contain descriptions that may be common to them, through the relation of the similar record (in the same row). Size tables can be created from data standardization techniques, which will be explored in the next chapter.
Note that in the customer registration table above, we have the location of this customer in the columns Country, Province, and City, and the employee (seller) responsible for it. These four columns do not need to compose the sales fact table because we will have more than one record (row) of the same customer, repeating the information of the four columns several times and using hardware resources without need. Instead of having all the information in just one table, two have been created: one fact and another dimension, and both need to have a primary and foreign key to connect. In this case, the key (column) that connects the tables is the Client Code.
DAX is an expression language used to perform calculations and queries on created data models, being very similar to Excel functions, however, with a particularly different concept. DAX is an acronym for Data Analysis Expressions. Some experts have given the evolution of the MDX (Multidimensional Expressions) language.
Before we start exploring the language, let's first get to know the platforms where she is present. Besides Power BI, it's also in:
Created by the SSAS team of experts, with the Power Pivot, it is possible to analyze more than 20 million lines inside Excel, something completely impossible to execute previously. And all this, without harming performance, by storing the data and calculations in computer memory. Donald Farmer (2010), one of the top product managers at the time of Power Pivot development, considers the technology to be a new type of Analysis Service designed to put the power of SSAS calculation in the hands of an Excel user.
Due to the popularity of Excel and the robustness of a solution used in SSAS, the DAX language was quickly spread and accepted after the release of Power BI. Its functions are classified into ten categories. Between them:
The tables are stored using xVelocity in-memory technology, also called VertiPaq, which compresses the data in the best possible compression according to its variation, redundancy, classification, and type. Some important features should be observed in Power BI:
Rank Products = RANKX(ALLSELECTED(dCategoriaProduct);[Sum Sales])
The previous expression classified the categories from 1 to 9, according to their ranking based on sales. We use the ALLSELECTED function to consider the ex-suit filter next to the table.
Now let's use this ranking for the accumulated sum of each category. With the help of the CALCULATE function, we will calculate the sales according to the return table of the TOPN function.
Sum Products = CALCULATE([Sum Sales];
TOPN([Rank Products]; ALL(dCategoryProduct);[Sum Sales]; DESC
)
)
In the previous chapters, we have taken the necessary steps to reach this moment: the visualization and the application of the data. We go through the data collection and processing, we load and model it, then we create the necessary calculations, all this process to show coherent, consistent, clean data that answer business questions. Power BI offers dozens of visuals between columns, bars, and stacked, horizontal and vertical, line, area, and combination, ribbon, cascade, and scatter graphics.
Pizza charts, thread, and treemap. Real geographical maps, choleretic and area maps. Funnel, indicators, cards, tables, matrices and segmenters.
Not only those, but there is also a Marketplace with hundreds of visuals created by companies and experts and validated by Microsoft. Most are free, and some have subscription services, like one of the most famous: ArcGIS Maps. And you also have your own environment to create visuals and program yourself in the R language.
Marketplace visuals are installed per Power BI file. This means that when you open a new file, only the default looks will be displayed, and if you constantly use a custom look, you must add it again (or use Power BI template files).
The choice of the correct look, the fields to be displayed on it, the colors, and their elements are determining factors for a data visualization project. Its main objective should be to provide the fastest and simplest reading that meets the informational needs of its customers. Some principles can be used as a guide for building your reports.
When we analyze the total of two figures, for example, sales in 2016 and 2017, we do not immediately know the influences that these figures have suffered to be positive or negative in relation to the previous one, but there are ideal visuals to avoid this composition.
The first of these is the cascade graph, which aims to show the variation between two or more points. For example, if a cascade graph shows the year 2016 and 2017 and the variation in performance over the target. We can interpret that X and Y had great positive participation, but A and B compensated negatively, resulting in 117 thousand in 2016 to 118 thousand in 2017. Looking at these two figures in isolation, we can think that there were only 1,000 sales, but in fact, there was a large internal variation in the country segment.
The waterfall graph has an interesting feature that allows you to use it for many categories. For example, if this company operated in more than 40 countries, we would have 40 columns on the chart, and this would certainly hinder its reading and interpretation.
To get around this problem, you can format the look (by clicking on the "ink roller" symbol) and choose the Split formatting to show only one or more values of those with higher representativity and still add the rest in a category called Other . In the following example, only two of the most representative values are shown.
After performing the previous procedure, to run an automatic filter on the tooltip according to the axis of the visual (where this tooltip will be configured), it is important to select "Keep all filters" in the panel below Filters, called Tooltip (it will only be displayed after defining the page as Tooltip).
After this procedure, go to the main report page and select the look where you want this tooltip to appear. For example, a column graph containing sales values by country was selected. In the Format, the Look panel, under Tool Tip, enable the report page and select the page name with the tooltip just below. In this example, it is called "Details." When performing this procedure, by positioning the mouse over Y, the mini-report will appear with the data filtered for Y.
With the panel displayed, we will navigate to the page created with the order details and click Add. The current stage of the page will be "saved," considering current filters and displays. The name of the indicator created will be "Requests."
With the indicator created, after clicking the "View Orders" button, in the Action configuration, we must inform that the type of it will be the Indicator and that the indicator will be the "Orders."
The values in the axis of the visuals will be classified according to the standard order of the column. If it is formatted as text, it will be rated from A-Z, but if it is numeric or date, from the smallest to the largest. It is very common to have to change the classification from one text column to another than the alphabetical one. The nation of your classification can be modified in modeling.
Power BI Service is a cloud platform used to publish and share reports and dashboards, enabling you to access them over the Internet, configure your automatic update, manage shared workspaces, measure the usability of published reports and dashboards, and many other features.
Your access is via app.powerbi.com, and a user and password will be required to access your workspace. Its interface is completely different from Power BI Desktop because they are not the same version. In fact, they complement each other with several different usabilities. However, they also have similarities, and for example, you can create reports, connect to (limited) databases, and create visuals. Despite these similarities, it is not possible to perform data processing in Power Query or create measurements with DAX.
Its interface can be divided as follows:
Reports and shared dashboards can be viewed in the application to cellular Power BI normally, without limitations or need for special sharing.
To view or remove the access given to users, in the same panel, click on Access and then remove the access by clicking on the options of each user who has access. This setting can also be performed by clicking on the options in this report's dataset and choosing the Manage Permissions option.
A third way accessible only for Pro accounts is to insert reports into SharePoint1. Its use is recommended for those who already use SharePoint in the organizational environment or will start using it for Power BI reports. To start it, the first step is to get the report link to SharePoint (different from the web publishing link), in the File menu 'Insert in SharePoint Online.'
Power BI has applications for Android, Windows, and iOS smartphones, and you can connect to your Power BI account (free or Pro) and consume the created or shared relations (Pro accounts only). They can be found in the official store of each platform (https://powerbi.microsoft.com/mobile/). With them, you can have quick access to reports, receive comparison notifications, update reports, share annotations with others, and typically use filters and pages.
Annotations can be made on the graphs, and you can create callsigns with symbols, use the free brush, and add annotations with text.
As explained in the chapter on Power BI Service, there are two types of data collection: reports and panels. In the Power BI application, it is possible to visualize both, but the panel tends to a better visualization because it automatically organizes the visuals on top of each other, while the report displays exactly the layout developed in Power BI Desktop (horizontal). And while fully functional, its visibility can get too small for smaller screens.
One of the most used software in the world is Excel. And the reason for this is quite clear: its resources and flexibility are extraordinary. Many analyses are performed in Excel, and if structured correctly, we encourage their use for the purposes that the software was created. We see a real movement of many analyses that were previously performed in Excel now being transferred to Power BI. We evaluate that this change occurs naturally since the language of the two software is similar, and their integration is and has been increasingly consolidated.
Power BI is a business intelligence software with all the layers that the BI process needs. Precisely because of its objective of usability, it has its business intelligence stages very well defined and with the flexibility in the necessary measure. Excel, on the other hand, is a spreadsheet. Its flexibility is one of the highest, depending a lot on its user. Both can be used to analyze data, but there are differences in productivity, storage, availability, management, and sharing.