First, I need to clarify that this blog post discusses only ADF Mapping Data Flow, not the whole ADF service. I am not going to challenge ADF’s role as a superb orchestration service in the Azure data ecosystem. In fact, I love ADF. At the control flow level, … Continue reading Why I Prefer Hand-Coded Transformations over ADF Mapping Data Flow
Tag: Azure Data Factory
Configuration-Driven Azure Data Factory Pipelines
In this blog post, I will introduce two configuration-driven Azure Data Factory pipeline patterns I have used in previous projects: the Source-Sink pattern and the Key-Value pattern. The Source-Sink pattern is primarily used for parameterising and configuring data movement activities, with the source location and sink location of the data movement configured in a … Continue reading Configuration-Driven Azure Data Factory Pipelines
Execute R Scripts from Azure Data Factory (V2) through Azure Batch Service
Introduction: One requirement I have recently been working on is to run R scripts for some complex calculations in an ADF (V2) data processing pipeline. My first attempt was to run the R scripts using Azure Data Lake Analytics (ADLA) with the R extension. However, two limitations of the ADLA R extension stopped me from adopting this … Continue reading Execute R Scripts from Azure Data Factory (V2) through Azure Batch Service
SSIS in Azure #3 – Schedule and Monitor SSIS Package Execution using ADF V2
*The source code created for this blog post can be found here. In the previous blog posts in the SSIS in Azure series, we created an SSIS package that periodically ingests data from Azure SQL Database to Azure Data Lake Store and deployed the package to the Azure-SSIS Integration Runtime. Up to this point, we have … Continue reading SSIS in Azure #3 – Schedule and Monitor SSIS Package Execution using ADF V2
SSIS in Azure #2 – Deploy SSIS Packages to Azure-SSIS Integration Runtime in ADF V2
In the first blog post of the SSIS in Azure series, I demonstrated how to create SSIS packages to move data in the cloud, using a common use case that periodically ingests data from Azure SQL Database to Azure Data Lake Store. In the pre-ADF V2 era, we could only deploy SSIS packages … Continue reading SSIS in Azure #2 – Deploy SSIS Packages to Azure-SSIS Integration Runtime in ADF V2
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 4)
This is the last part of the blog series demonstrating how to build an end-to-end ADF pipeline for data warehouse ELT. Parts: Introduction & Preparation; Build ADF pipeline for dimension tables ELT; Build ADLA U-SQL job for incremental extraction of machine cycle data; Build ADF pipeline for fact table ELT. In the previous part we created … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 4)
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 3)
This is the third part of the blog series demonstrating how to build an end-to-end ADF pipeline for data warehouse ELT. This part describes how to build an ADLA U-SQL job for incremental extraction of machine cycle data from Azure Data Lake Store and goes through the steps for scheduling and triggering the … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 3)
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 2)
This is the second part of the blog series demonstrating how to build an end-to-end ADF pipeline for extracting data from Azure SQL DB/Azure Data Lake Store and loading it into a star-schema data warehouse database, with consideration of SCD (slowly changing dimensions) and incremental loading. Parts: Introduction & Preparation; Build ADF pipeline for dimensional tables … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 2)
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 1)
This blog series demonstrates how to build an end-to-end ADF pipeline for extracting data from Azure SQL DB/Azure Data Lake Store and loading it into a star-schema data warehouse database, with consideration of SCD (slowly changing dimensions) and incremental loading. The final pipeline will look as follows: The machine cycle records will be loaded from the CSV … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 1)
Trigger Azure Analysis Service Processing in Azure Data Factory
There is one important feature missing from Azure Data Factory. In SSIS, at the end of the ETL process, once the new data has been transformed and loaded into the data warehouse, the SSAS processing task can be run to process the cube immediately. However, Azure … Continue reading Trigger Azure Analysis Service Processing in Azure Data Factory