Category: Azure Data Platform

SSIS in Azure #2 – Deploy SSIS Packages to Azure-SSIS Integration Runtime in ADF V2

In the first blog post of the SSIS in Azure series, I demonstrated how to create SSIS packages that move data in the cloud, using a common use case: periodically ingesting data from Azure SQL Database into Azure Data Lake Store. In the pre-ADF V2 era, we could only deploy SSIS packages …

Anomaly Detection with Azure Stream Analytics

Anomaly detection is a very common use case in IoT-related deployments. A new ANOMALYDETECTION operator was recently added to Azure Stream Analytics and is currently in public preview. The ANOMALYDETECTION operator detects anomalies based on Exchangeability Martingales (EM), which support online testing of the exchangeability of a sequence of event values. When the distribution of the sequence …

SSIS in Azure #1 – Periodically Ingesting Data from SQL Database into Azure Data Lake using SSIS

*The source code created for this blog post is located here. The low cost, schema-less and large column attributes of Azure Data Lake Store, along with the large number of supported analytic engines (e.g., Azure Data Lake Analytics, Hive and Spark), make it a perfect store-everything repository for enterprise data. We can offline the copies of business …

Azure Stream Analytics Patterns & Implementations

Thanks to the increased popularity of IoT and social networks, streaming analytics has become a hot topic and has attracted growing attention in the data analytics community. Many people (e.g., this and this) believe streaming analytics is the future and will take over the use cases that are traditionally targeted by batch-oriented analytics. Azure …

End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 4)

This is the last part of the blog series demonstrating how to build an end-to-end ADF pipeline for data warehouse ELT. The parts of the series are: (1) Introduction & Preparation; (2) Build ADF pipeline for dimension tables ELT; (3) Build ADLA U-SQL job for incremental extraction of machine cycle data; (4) Build ADF pipeline for fact table ELT. In the previous part we created …

End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 3)

This is the third part of the blog series demonstrating how to build an end-to-end ADF pipeline for data warehouse ELT. This part describes how to build an ADLA U-SQL job for incremental extraction of machine cycle data from Azure Data Lake Store and goes through the steps for scheduling and triggering the …

End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 2)

This is the second part of the blog series demonstrating how to build an end-to-end ADF pipeline for extracting data from Azure SQL DB/Azure Data Lake Store and loading it into a star-schema data warehouse database, with consideration of SCD (slowly changing dimensions) and incremental loading. The parts of the series are: (1) Introduction & Preparation; (2) Build ADF pipeline for dimensional tables …

End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 1)

This blog series demonstrates how to build an end-to-end ADF pipeline for extracting data from Azure SQL DB/Azure Data Lake Store and loading it into a star-schema data warehouse database, with consideration of SCD (slowly changing dimensions) and incremental loading. The final pipeline will look as follows: The machine cycle records will be loaded from the CSV …

Trigger Azure Analysis Service Processing in Azure Data Factory

There is one important feature missing from Azure Data Factory. In SSIS, at the end of the ETL process, when the new data has been transformed and loaded into the data warehouse, the SSAS processing task can be run to process the cube immediately after the new data has flowed into the data warehouse. However, Azure …

Issues with Azure Streaming Analytics + Power BI Real-Time Streaming for IoT Hot-Path Analytics

Event Hub + Azure Streaming Analytics + Power BI Real-Time Streaming is the approach Microsoft recommends for IoT hot-path analytics. The combination of those technologies provides a simple and efficient way to implement streaming analytics. However, I ran into some issues with this approach when designing hot-path analytics solutions for IoT projects. Azure Streaming Analytics does not support …