Real-time streaming lets you stream data and update dashboards in real time. Any visual or dashboard created in Power BI can display and update real-time data. Sources of streaming data include social media posts, factory sensors, service usage metrics, and more. In fact, any device or service that collects or transmits time-sensitive data can act as a source.
Here you will learn how to set up real-time streaming. Let us first look at the types of real-time datasets that are designed for display on dashboards and tiles.
There are three types of real-time dataset: the push dataset, the streaming dataset, and the PubNub streaming dataset.
Let us discuss the three datasets one by one.
With a push dataset, data is pushed into the Power BI service. When the dataset is created, the service automatically creates a new database to store the data. Because this underlying database keeps storing the data as it arrives, you can build reports on it. These reports and their visuals behave like any other Power BI report: you can create alerts, pin visuals to dashboards, and so on.
Once a report has been created on a push dataset, any of its visuals can be pinned to a dashboard in Power BI. Pinned visuals update in real time whenever the data changes, because the dashboard triggers a tile refresh each time new data arrives in the service.
Couple of points to note here are:
With a streaming dataset, data is also pushed into the Power BI service, but with one important difference from a push dataset: Power BI stores the data only in a temporary cache, which expires quickly. The cache is used only to display visuals that have a transient sense of history, such as a line chart whose data expires after 60 minutes.
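The idea of a cache that only holds the last 60 minutes of data can be pictured with a toy model. The sketch below is only an illustration of the concept, not how Power BI implements its cache; the class name `TransientCache` and its methods are hypothetical, and timestamps are assumed to arrive in increasing order.

```python
from collections import deque


class TransientCache:
    """Toy model of a time-limited cache (hypothetical, for illustration only)."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds  # 60 minutes by default
        self._points = deque()          # (timestamp_seconds, value), oldest first

    def add(self, timestamp, value):
        """Store a new point, then drop points older than the time window."""
        self._points.append((timestamp, value))
        while self._points and timestamp - self._points[0][0] > self.ttl_seconds:
            self._points.popleft()

    def values(self):
        """Return the values that are still inside the time window."""
        return [value for _, value in self._points]


cache = TransientCache(ttl_seconds=3600)
cache.add(0, 1)
cache.add(1800, 2)
cache.add(3601, 3)     # the point from t=0 is now older than one hour
print(cache.values())  # [2, 3]
```

Anything that has fallen out of the window is gone for good, which is why a streaming dataset on its own cannot feed historical reports.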
A streaming dataset has no underlying database, so you cannot build report visuals on the data flowing in from the stream. For the same reason, report features such as data filtering and Power BI visuals are not available.
The only way to visualize a streaming dataset is to add a tile to a dashboard and use the streaming dataset as a “custom streaming data” source. Custom streaming tiles are optimized to display real-time data quickly. There is very little latency between the data being pushed into the Power BI service and the visual being updated, because the data does not need to be written to or read from a database.
In practice, streaming datasets are best used when it is critical to minimize the latency between data being pushed and data being visualized. As a best practice, push the data in a format that can be visualized as is, without any further aggregation. Examples of such data are temperatures and pre-calculated averages.
With a PubNub streaming dataset, the Power BI web client uses the PubNub SDK to read directly from an existing PubNub data stream, so no data is stored in the Power BI service. Because a PubNub streaming dataset has no underlying database, you cannot build report visuals on the data flowing in from the stream, and report features such as data filtering and Power BI visuals are not available.
To visualize a PubNub dataset, add a tile to your Power BI dashboard and configure a PubNub data stream as its source. Tiles based on a PubNub streaming dataset are optimized to display real-time data quickly. There is very little latency between the data being published and the visual being updated, because Power BI is connected directly to the PubNub data stream.
Now that we have covered the real-time datasets, let us look at how to push data into them. There are three methods for pushing data into a dataset: the Power BI REST APIs, the streaming dataset UI, and Azure Stream Analytics.
Now let’s understand them one by one.
The Power BI REST APIs can be used to create push datasets and streaming datasets and to send data to them. When you create a dataset through the API, the defaultMode flag specifies whether the dataset is a push dataset or a streaming dataset. If the flag is not set, the dataset defaults to a push dataset.
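As a rough sketch, the body of a create-dataset request might be assembled as below. The dataset, table, and column names are hypothetical, and the casing of the mode values (`Push`, `PushStreaming`) and the data type names are assumptions that should be checked against the current REST API reference.

```python
import json


def build_dataset_definition(name, default_mode="Push"):
    """Build a create-dataset request body; the table layout is a made-up example."""
    return {
        "name": name,
        "defaultMode": default_mode,  # assumed casing; the dataset is push if omitted
        "tables": [
            {
                "name": "Readings",   # hypothetical table name
                "columns": [
                    {"name": "timestamp", "dataType": "DateTime"},
                    {"name": "temperature", "dataType": "Double"},
                ],
            }
        ],
    }


definition = build_dataset_definition("FactorySensors", default_mode="PushStreaming")
print(json.dumps(definition, indent=2))
```

The resulting JSON would be sent as the body of the dataset creation request; only the construction of the body is shown here.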
If defaultMode is set to pushStreaming, the dataset is both a push dataset and a streaming dataset, which gives you the benefits of both. Note that with pushStreaming, the restrictions of both dataset types apply to a request: for a streaming dataset the request must be smaller than 15 KB, whereas for a push dataset it must be smaller than 16 MB. When a request is within both limits, it succeeds for both. If a request exceeds the 15 KB streaming limit but is still under the 16 MB push limit, it succeeds and the data is updated in the push dataset, but any streaming tiles on the dataset fail temporarily.
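The size rules for a pushStreaming dataset can be sketched as a small check. This is a minimal illustration that treats both limits as upper bounds (15 KB for streaming, 16 MB for push) and takes 1 KB as 1024 bytes; the function name, the return labels, and the exact boundary handling are assumptions.

```python
STREAMING_LIMIT_BYTES = 15 * 1024    # 15 KB, assuming 1 KB = 1024 bytes
PUSH_LIMIT_BYTES = 16 * 1024 * 1024  # 16 MB


def classify_request(size_bytes):
    """Describe the expected outcome of a request to a pushStreaming dataset."""
    if size_bytes >= PUSH_LIMIT_BYTES:
        return "rejected"            # too large even for the push dataset
    if size_bytes >= STREAMING_LIMIT_BYTES:
        return "push only"           # data is stored, streaming tiles fail temporarily
    return "push and streaming"      # within both limits


print(classify_request(2_000))             # push and streaming
print(classify_request(20 * 1024))         # push only
print(classify_request(17 * 1024 * 1024))  # rejected
```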
Once the dataset has been created, you can push data into it with the PostRows REST API. All requests to the REST APIs are secured with Azure AD OAuth.
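A PostRows call could be prepared roughly as follows. The sketch only builds the request and does not send it; the dataset ID, table name, and row fields are placeholders, and obtaining the Azure AD access token is outside its scope.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_post_rows_request(dataset_id, table_name, rows, access_token):
    """Build (but do not send) a POST request that adds rows to a push dataset table."""
    url = f"{API_ROOT}/datasets/{dataset_id}/tables/{table_name}/rows"
    body = json.dumps({"rows": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",  # Azure AD OAuth token
        },
    )


request = build_post_rows_request(
    dataset_id="<dataset-id>",                       # placeholder
    table_name="Readings",                           # hypothetical table name
    rows=[{"temperature": 21.5}],                    # hypothetical row
    access_token="<access-token>",                   # placeholder
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```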
In the Power BI service, you can also create a dataset through the streaming dataset UI by choosing the API option.
When setting up the new streaming dataset, you can enable the “Historic data analysis” slider, which has a significant effect on the dataset, as shown in the image below.
The “Historic data analysis” slider is disabled by default, which makes the dataset a streaming dataset only. When you enable it, the dataset becomes both a streaming dataset and a push dataset. Note that streaming datasets created this way do not require Azure AD authentication. Instead, the dataset owner receives a URL containing a row key, which authorizes the requester to push data into the dataset.
Power BI can be added as an output of Azure Stream Analytics (ASA) so that you can visualize data streams in real time. Let us see how this works.
ASA uses the Power BI REST APIs to create its output data stream to Power BI, with defaultMode set to pushStreaming so that the dataset can use the features of both push and streaming datasets. ASA also sets another flag, retentionPolicy, to basicFIFO. With this setting, the database behind the push dataset stores up to 200,000 rows (2 lakh). Once that limit is reached, rows are dropped in first-in, first-out (FIFO) order: each new row causes the oldest stored row to be removed, and the same happens for every row that follows.
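The first-in, first-out retention behaviour can be illustrated with a bounded queue. This is only a sketch of the policy, not of how the service stores rows; the helper name is hypothetical, and a tiny limit of 3 rows stands in for the 200,000-row (2 lakh) limit so the effect is easy to see.

```python
from collections import deque


def make_fifo_store(max_rows=200_000):
    """Return a store that silently drops its oldest row once it is full."""
    return deque(maxlen=max_rows)


store = make_fifo_store(max_rows=3)  # small limit for demonstration only
for row_id in range(1, 6):
    store.append({"id": row_id})     # rows 1 and 2 are dropped as rows 4 and 5 arrive

print([row["id"] for row in store])  # [3, 4, 5]
```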
Note: If the query results from ASA reach Power BI very quickly (for example, one or two results every second), ASA starts batching those outputs into a single request. The request size may then exceed the streaming tile limit, and the streaming tiles will fail to render. To avoid this, slow down the rate of output going to Power BI. For example, instead of emitting a maximum value every second, emit the maximum over a 10-second window.
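The effect of aggregating over 10-second windows can be sketched in plain Python. In ASA itself this would be expressed in the query; the function below is a hypothetical stand-in that groups `(timestamp_seconds, value)` readings into fixed 10-second windows and keeps one maximum per window.

```python
def max_per_window(readings, window_seconds=10):
    """Reduce (timestamp_seconds, value) pairs to one maximum per fixed window."""
    windows = {}
    for timestamp, value in readings:
        # Start of the window this reading falls into, e.g. 19 -> 10 for 10 s windows.
        start = int(timestamp // window_seconds) * window_seconds
        if start not in windows or value > windows[start]:
            windows[start] = value
    return sorted(windows.items())


readings = [(0, 1.0), (3, 5.0), (9.5, 2.0), (10, 4.0), (19, 7.0), (25, 3.0)]
print(max_per_window(readings))  # [(0, 5.0), (10, 7.0), (20, 3.0)]
```

Six readings become three outputs, so far fewer results reach Power BI per second.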
You have just learned the three types of real-time dataset and how to push data into them. Now let us see how to set up real-time streaming datasets. To get started with real-time streaming in Power BI, there are two options:
Whichever option you choose, you need to set up streaming data in Power BI. To do so, open a new or existing dashboard, select “Add a tile,” and then choose “Custom Streaming Data.” If streaming data is not set up yet, select “Manage data” to get started.
On the page that opens, enter the endpoint of your streaming dataset in the text box if you have already created one. If you have not created a streaming dataset yet, click the (+) icon in the top-right corner of the screen to see the options for creating one.
After clicking the (+) icon, you will see two options, PubNub and API (the Power BI REST API). Choose the one that suits you best.
Let us look at both options one by one.
Because PubNub streaming is integrated with Power BI, you can use your existing low-latency PubNub data streams in Power BI, or create new ones. Select PubNub and then click Next. A new window will appear.
PubNub data streams are often high volume and are not always suitable, in their original form, for storage and historical analysis. To use Power BI for historical analysis of PubNub data, you need to aggregate the raw PubNub stream first and then send it to Power BI. One way to do this is with Azure Stream Analytics (ASA).
Recent improvements to the Power BI REST API make it a good choice for real-time streaming. First, select API in the “New streaming dataset” window. A new window opens with the entries you need to fill in so that Power BI can connect to and use your endpoint.
To have Power BI store the data sent through this data stream, enable the “Historic data analysis” option. You can then use the collected data for reporting and analysis.
Once the data stream has been created, you are given a REST API URL endpoint. Your application can call this URL with POST requests to push data into the streaming dataset you created. The request body of each POST request must match the sample JSON shown in the Power BI user interface. For example, wrap your JSON objects in an array.
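A request to the endpoint URL could be prepared along these lines. The sketch only builds the request; the URL is a placeholder for the one Power BI gives you, and the field names are hypothetical and must be replaced by those of your own dataset.

```python
import json
import urllib.request


def build_push_request(push_url, records):
    """Build (but do not send) a POST request whose body is a JSON array of records."""
    if isinstance(records, dict):
        records = [records]  # wrap a single JSON object in an array
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        push_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


PUSH_URL = "https://example.invalid/your-push-url"  # placeholder, not a real endpoint
request = build_push_request(PUSH_URL, {"sensor": "a", "temperature": 21.5})
print(request.data.decode("utf-8"))  # [{"sensor": "a", "temperature": 21.5}]
# To actually send it: urllib.request.urlopen(request)
```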
Let us see how real-time streaming works with the help of an example that uses a publicly available PubNub stream. Follow the steps below: