Anomaly detection is a very common use case in IoT-related deployments. A new ANOMALYDETECTION operator has recently been added to Azure Stream Analytics and is currently in public preview. The ANOMALYDETECTION operator detects anomalies based on Exchangeability Martingales (EM), which support online testing of the exchangeability of a sequence of event values. When the distribution of the sequence … Continue reading Anomaly Detection with Azure Stream Analytics
Author: Linxiao Ma
Power Query – Extract Multiple Tags Stored in a Single Text Field
Problem: It is not rare to see multiple attributes stored in a single text field, especially in tagging-enabled applications where an unfixed number of tags may be associated with an article or post. Those tags are often stored in a single text field with a delimiter to separate them. When reporting, we often … Continue reading Power Query – Extract Multiple Tags Stored in a Single Text Field
Power Query – Parameterised Files Loading from Azure Data Lake Store within a Given Date Range
Power BI now supports data load from Azure Data Lake Store. We can connect to a folder in the Azure Data Lake Store and load all files from that folder. However, we often don't want to or aren't able to load all the files in the Azure Data Lake Store folder into Power BI due to … Continue reading Power Query – Parameterised Files Loading from Azure Data Lake Store within a Given Date Range
SSIS in Azure #1 – Periodically Ingesting Data from SQL Database into Azure Data Lake using SSIS
*The source code created for this blog post is located here. The low cost, schema-less and large column attributes of Azure Data Lake Store, along with the large number of supported analytic engines (e.g., Azure Data Lake Analytics, Hive and Spark), make it a perfect store-everything repository for enterprise data. We can offline the copies of business … Continue reading SSIS in Azure #1 – Periodically Ingesting Data from SQL Database into Azure Data Lake using SSIS
Scaffolding Azure Machine Learning Experiments
*Please download the source code here. Microsoft has released the public preview of its newest data science service, Azure Machine Learning, which contains a collection of components to support end-to-end machine learning solutions. The Azure Machine Learning Workbench and the Azure Machine Learning Experimentation service are the two main components offered to machine learning practitioners … Continue reading Scaffolding Azure Machine Learning Experiments
Exploratory Data Analysis in Python
I have written a Jupyter notebook describing exploratory data analysis using Python, as shown below:
Azure Stream Analytics Patterns & Implementations
Thanks to the increased popularity of IoT and social networks, streaming analytics has become a hot topic and attracted more and more attention in the data analytics community. Many people (e.g., this and this) believe streaming analytics is the future and will take over the use cases that are traditionally targeted by batch-oriented analytics. Azure … Continue reading Azure Stream Analytics Patterns & Implementations
Building Power BI Memory Usage Dashboard using DMV
As the VertiPaq engine used in Power BI is an in-memory data analytics engine, the key to optimising your Power BI report performance is to reduce the memory usage of your data model. A smaller data model not only increases the data scan speed but also allows you to process a larger dataset with the … Continue reading Building Power BI Memory Usage Dashboard using DMV
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 4)
This is the last part of the blog series demonstrating how to build an end-to-end ADF pipeline for data warehouse ELT: (1) Introduction & Preparation; (2) Build ADF pipeline for dimension tables ELT; (3) Build ADLA U-SQL job for incremental extraction of machine cycle data; (4) Build ADF pipeline for fact table ELT. In the previous part we created … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 4)
End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 3)
This is the third part of the blog series demonstrating how to build an end-to-end ADF pipeline for data warehouse ELT. This part describes how to build an ADLA U-SQL job for incremental extraction of machine cycle data from Azure Data Lake Store and goes through the steps for scheduling and triggering the … Continue reading End-to-End Azure Data Factory Pipeline for Star Schema ETL (Part 3)