Issues with Azure Stream Analytics + Power BI Real-Time Streaming for IoT Hot-Path Analytics

Event Hubs + Azure Stream Analytics + Power BI real-time streaming is the approach Microsoft recommends for IoT hot-path analytics. Together, these services provide a simple and efficient way to implement streaming analytics. However, I ran into several issues with this approach when designing hot-path analytics solutions for IoT projects.

1. Azure Stream Analytics does not support dynamic reference data joins

Azure Stream Analytics can only join static reference data stored in Azure Blob storage. The reference data file is loaded when a Stream Analytics job starts, and updates arrive only through an ETL process that periodically transforms the reference data and copies it to Blob storage. That can cause serious problems for IoT projects that need frequent reference data updates. For example, an equipment hire or industrial vehicle rental business may charge customers based on the usage that IoT devices record, and the equipment or vehicles can move from customer to customer frequently. If the streaming analytics solution does not pick up a change of customer in the reference data promptly, the business cannot measure usage accurately for each customer.
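To make the billing problem concrete, here is a minimal Python sketch that compares usage totals computed against a stale ownership snapshot with totals computed against up-to-date ownership. It does not use Stream Analytics at all; the device name, customer names, hand-over hour and helper functions are all hypothetical and only illustrate the effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    device_id: str
    hour: int            # hour of day the reading was taken
    usage_hours: float   # usage reported by the device for that hour


# Hypothetical ownership history: (effective_from_hour, customer).
# The excavator moves from customer-A to customer-B at hour 12.
OWNERSHIP = {"excavator-01": [(0, "customer-A"), (12, "customer-B")]}


def owner_at(device_id, hour):
    """Return the customer holding the device at the given hour."""
    current = None
    for effective_from, customer in OWNERSHIP[device_id]:
        if effective_from <= hour:
            current = customer
    return current


def bill(events, snapshot_hour=None):
    """Sum usage per customer.

    With snapshot_hour set, ownership is frozen at that hour, which mimics
    reference data that was loaded once and never refreshed.
    """
    totals = {}
    for event in events:
        lookup_hour = event.hour if snapshot_hour is None else snapshot_hour
        customer = owner_at(event.device_id, lookup_hour)
        totals[customer] = totals.get(customer, 0.0) + event.usage_hours
    return totals


# One hour of usage reported for each hour from 08:00 to 15:00.
events = [UsageEvent("excavator-01", hour, 1.0) for hour in range(8, 16)]

stale = bill(events, snapshot_hour=0)  # reference data loaded at job start
fresh = bill(events)                   # ownership looked up per event

print("stale:", stale)
print("fresh:", fresh)
```

With the stale snapshot, all eight hours are charged to the first customer, even though the second customer held the machine for half of them.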

2. Limitations of Power BI real-time streaming

Azure Stream Analytics outputs a data stream to a Power BI streaming dataset through the Power BI REST APIs, which allows report authors to build real-time dashboards. However, the limitations of Power BI streaming datasets could prevent the approach from being adopted in many use cases. First, if Azure Stream Analytics produces rapid output to Power BI (Microsoft defines “rapid” as once or twice per second), the output is batched into a single request, which may cause the request size to exceed the streaming tile limit. Second, the default retentionPolicy for Power BI is basicFIFO, which supports 200,000 rows. When the 200,000-row limit is reached, rows are dropped in FIFO fashion (I found a Power BI REST API endpoint for changing the setting to none and tried it, but it did not work for me):

POST https://api.powerbi.com/v1.0/myorg/datasets?defaultRetentionPolicy={None| basicFIFO}
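As a rough sketch of what a call to this endpoint could look like, the Python below builds the request without sending it. The dataset name, table, columns, `PushStreaming` mode and token placeholder are my assumptions for illustration, and the sketch does not settle whether the setting has any effect on a dataset that Stream Analytics creates and manages itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_ROOT = "https://api.powerbi.com/v1.0/myorg/datasets"


def build_create_dataset_request(access_token, retention_policy="basicFIFO"):
    """Build (but do not send) a request that creates a dataset with the
    given default retention policy."""
    if retention_policy not in ("None", "basicFIFO"):
        raise ValueError("retention_policy must be 'None' or 'basicFIFO'")

    query = urlencode({"defaultRetentionPolicy": retention_policy})
    url = f"{API_ROOT}?{query}"

    # Hypothetical dataset definition, for illustration only.
    body = {
        "name": "iot-hot-path",
        "defaultMode": "PushStreaming",
        "tables": [
            {
                "name": "telemetry",
                "columns": [
                    {"name": "deviceId", "dataType": "String"},
                    {"name": "reading", "dataType": "Double"},
                ],
            }
        ],
    }

    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "<access-token>" is a placeholder; a real call needs an Azure AD token.
req = build_create_dataset_request("<access-token>", retention_policy="None")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it, given a valid access token.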

The limits on output rate and streaming dataset size may not be a big problem for a PoC deployment with a small number of IoT devices. However, real-world IoT projects can manage thousands of machines, each equipped with tens of sensors. In that case, the dataset holds only about a dozen rows per sensor, far too few to build most types of charts in Power BI.
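The arithmetic behind that estimate can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes each row holds one reading from one sensor and that rows are spread evenly across sensors; the fleet sizes are illustrative, not measurements.

```python
FIFO_ROW_LIMIT = 200_000  # basicFIFO retention limit mentioned above


def rows_per_sensor(machines, sensors_per_machine, row_limit=FIFO_ROW_LIMIT):
    """Rows retained per sensor if rows are spread evenly across sensors."""
    total_sensors = machines * sensors_per_machine
    return row_limit // total_sensors


# A small PoC: 10 machines with 10 sensors each.
poc_rows = rows_per_sensor(machines=10, sensors_per_machine=10)

# A production-sized fleet: 1,000 machines with 20 sensors each.
fleet_rows = rows_per_sensor(machines=1_000, sensors_per_machine=20)

print("PoC rows per sensor:  ", poc_rows)
print("Fleet rows per sensor:", fleet_rows)
```

At one reading per sensor per second, the production-sized fleet in this example would retain only about ten seconds of history per sensor.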

3. Power BI dashboards do not support filtering

Dynamic streaming flow (animation) is the selling point of the Azure Stream Analytics + Power BI real-time streaming approach, and for many projects it is a must-have. However, dynamic streaming flow can only be visualised on Power BI dashboards; the feature is not supported in Power BI reports. The problem is that Power BI dashboards do not support filtering. That means that if you want the dynamic streaming flow visualisation, you either have to create one dashboard for each machine or put all machines into a single dashboard. Obviously, neither is practical for most real-world IoT projects that need to manage more than a hundred machines.

4. Only a limited set of Power BI visuals supports streaming datasets

At the moment, only a very limited set of Power BI visuals (Card, Line chart, Clustered bar chart, Clustered column chart and Gauge) supports streaming datasets.
