This is the second part of the blog series demonstrating how to build an end-to-end ADF pipeline that extracts data from Azure SQL DB/Azure Data Lake Store and loads it into a star-schema data warehouse database, with consideration of SCD (slowly changing dimensions) and incremental loading. Introduction & Preparation Build ADF pipeline for dimensional tables … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 2)
Category: Data Platform & Lakehouse
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 1)
This blog series demonstrates how to build an end-to-end ADF pipeline that extracts data from Azure SQL DB/Azure Data Lake Store and loads it into a star-schema data warehouse database, with consideration of SCD (slowly changing dimensions) and incremental loading. The final pipeline will look as follows: The machine cycle records will be loaded from the csv … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 1)
Trigger Azure Analysis Service Processing in Azure Data Factory
There is one important feature missing from Azure Data Factory. In SSIS, at the end of the ETL process, the SSAS processing task can be run to process the cube as soon as the new data has been transformed and loaded into the data warehouse. However, Azure … Continue reading Trigger Azure Analysis Service Processing in Azure Data Factory
Issues with Azure Streaming Analytics + Power BI Real-Time Streaming for IoT Hot-Path Analytics
Event Hub + Azure Streaming Analytics + Power BI Real-Time Streaming is the approach Microsoft recommends for IoT hot-path analytics. Combining these technologies provides a simple and efficient way to implement streaming analytics. However, I ran into some issues with this approach when designing hot-path analytics solutions for IoT projects. Azure Streaming Analytics does not support … Continue reading Issues with Azure Streaming Analytics + Power BI Real-Time Streaming for IoT Hot-Path Analytics
Handling Across-Day Cycle Issue in Daily Usage Analysis using U-SQL
When analysing the daily usage of a machine that can run across days, we need to split the duration of a single machine running cycle across the correct days. As the example below shows, the second machine (D00002) runs across two days and the third machine (D00003) runs across three days. To analyse the daily usage of … Continue reading Handling Across-Day Cycle Issue in Daily Usage Analysis using U-SQL
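As a rough illustration of the splitting idea described in this excerpt, here is a minimal sketch in Python (not the U-SQL used in the post itself). The function name `split_cycle_by_day` and the sample timestamps are hypothetical, and the sketch assumes naive (timezone-free) datetimes with days cut at midnight.

```python
from datetime import datetime, timedelta


def split_cycle_by_day(start, end):
    """Split one running cycle [start, end) into per-day segments.

    Returns a list of (day, segment_start, segment_end) tuples, cut at midnight.
    Assumes naive datetimes; this is an illustrative sketch only.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    segments = []
    cursor = start
    while cursor < end:
        # Midnight that closes the day the cursor is currently in.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        segment_end = min(next_midnight, end)
        segments.append((cursor.date(), cursor, segment_end))
        cursor = segment_end
    return segments


# Hypothetical cycle that runs across three calendar days.
for day, seg_start, seg_end in split_cycle_by_day(
    datetime(2017, 1, 1, 22, 0), datetime(2017, 1, 3, 2, 0)
):
    print(day, seg_end - seg_start)
```

Summing the segment durations per day then gives the daily usage, and the per-day segments always add up to the original cycle length.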
Generate Device Cycle Records from Raw Telemetry Message using Azure Data Lake Analytics
The raw telemetry data collected from IoT sensors is normally event-based, e.g., a "Device On" message when the device starts to run and a "Device Off" message when the device stops. One common data preprocessing task is to transform the raw "On/Off" telemetry data into device cycle records with the start time, end time and … Continue reading Generate Device Cycle Records from Raw Telemetry Message using Azure Data Lake Analytics
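To make the "On/Off" pairing concrete, here is a minimal sketch in Python rather than in Azure Data Lake Analytics, so it is not the post's own implementation. The function name `build_cycles`, the tuple layout of the events and the handling of unmatched messages are all assumptions made for illustration; only the start and end time of each cycle are produced.

```python
from datetime import datetime


def build_cycles(events):
    """Pair "Device On"/"Device Off" messages into cycle records.

    events: iterable of (device_id, timestamp, message) tuples (assumed layout).
    An "Off" without a preceding "On" is skipped, and a repeated "On" keeps
    the earliest start. Both rules are assumptions of this sketch.
    """
    open_starts = {}  # device_id -> start time of the cycle currently open
    cycles = []
    for device_id, ts, message in sorted(events, key=lambda e: (e[0], e[1])):
        if message == "Device On":
            open_starts.setdefault(device_id, ts)
        elif message == "Device Off" and device_id in open_starts:
            start = open_starts.pop(device_id)
            cycles.append({"device_id": device_id, "start": start, "end": ts})
    return cycles


# Hypothetical telemetry messages, deliberately out of order.
sample = [
    ("D00001", datetime(2017, 1, 1, 9, 30), "Device Off"),
    ("D00001", datetime(2017, 1, 1, 8, 0), "Device On"),
    ("D00002", datetime(2017, 1, 1, 7, 0), "Device Off"),  # no matching On
    ("D00002", datetime(2017, 1, 1, 10, 0), "Device On"),
    ("D00002", datetime(2017, 1, 1, 11, 0), "Device Off"),
]
for cycle in build_cycles(sample):
    print(cycle["device_id"], cycle["start"], cycle["end"])
```

Sorting by device and timestamp first means the pairing logic only ever needs to remember one open cycle per device.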
Workaround for Building Azure Data Warehouse using Visual Studio
When building an Azure Data Warehouse, I found that Visual Studio SSDT SQL projects do not support Azure Data Warehouse. This makes data warehouse development painful, because it loses the source control support and clean code organisation that SSDT SQL projects offer. Fortunately, I have found a trick to … Continue reading Workaround for Building Azure Data Warehouse using Visual Studio
Setup an Azure Dev VM for Testing Power BI + SQL Server 2016 Integration
This blog post walks through the steps to set up an Azure dev VM for testing Power BI + SQL Server 2016 integration: provision the Azure VM using the SQL Server 2016 developer image; configure the SSAS/SSRS services; install development software; install sample databases; integrate SSRS and Power BI Service; install and configure Personal Power BI Gateway (Optional) … Continue reading Setup an Azure Dev VM for Testing Power BI + SQL Server 2016 Integration

