This blog post takes a deep dive into the Azure Event Hubs Connector for Apache Spark, the open-source streaming data source connector for integrating Azure Event Hubs with Spark Structured Streaming. The Azure Event Hubs Connector implements the Source and Sink traits with the EventHubSource and the EventHubSink for receiving streaming data from or writing streaming data … Continue reading Spark Structured Streaming Deep Dive (4) – Azure Event Hub Integration
Tag: Real-Time Analytics
Spark Structured Streaming Deep Dive (3) – Sink
This blog post discusses another main component of the Spark Structured Streaming framework, the Sink. As the KafkaSink will be covered when discussing the Spark-Kafka integration, this blog post will focus on ForeachBatchSink, ForeachWriterTable, FileStreamSink and DeltaSink. Spark Structured Streaming defines the Sink trait, which represents the interface for external storage systems that can collect the results … Continue reading Spark Structured Streaming Deep Dive (3) – Sink
Spark Structured Streaming Deep Dive (2) – Source
As mentioned in the last blog post, which discussed the execution flow of Spark Structured Streaming queries, the Spark Structured Streaming framework consists of three main components: Source, StreamExecution, and Sink. The source interfaces defined by the Spark Structured Streaming framework abstract the input data stream from the external streaming data sources and standardise the interaction patterns … Continue reading Spark Structured Streaming Deep Dive (2) – Source
Spark Structured Streaming Deep Dive (1) – Execution Flow
Starting with this blog post, I will be writing about stream processing, focusing on Spark Structured Streaming, Kafka, Flink and the Kappa architecture. This is the first blog post of the Spark Structured Streaming deep dive series. It digs into the underlying, end-to-end execution flow of Spark streaming queries. Firstly, let's have a look … Continue reading Spark Structured Streaming Deep Dive (1) – Execution Flow
Anomaly Detection with Azure Stream Analytics
Anomaly detection is a very common use case in IoT-related deployments. A new ANOMALYDETECTION operator has recently been added to Azure Stream Analytics and is currently in public preview. The ANOMALYDETECTION operator detects anomalies based on Exchangeability Martingales (EM), which support online testing of the exchangeability of a sequence of event values. When the distribution of the sequence … Continue reading Anomaly Detection with Azure Stream Analytics
Azure Stream Analytics Patterns & Implementations
Thanks to the increased popularity of IoT and social networks, streaming analytics has become a hot topic and has attracted more and more attention in the data analytics community. Many people (e.g., this and this) believe streaming analytics is the future and will take over the use cases that are traditionally targeted by batch-oriented analytics. Azure … Continue reading Azure Stream Analytics Patterns & Implementations
Issues with Azure Streaming Analytics + Power BI Real-Time Streaming for IoT Hot-Path Analytics
Event Hub + Azure Streaming Analytics + Power BI Real-Time Streaming is the approach recommended by Microsoft for IoT hot-path analytics. The combination of these technologies provides a simple and efficient way to implement streaming analytics. However, I did run into some issues with this approach when designing hot-path analytics solutions for IoT projects. Azure Streaming Analytics does not support … Continue reading Issues with Azure Streaming Analytics + Power BI Real-Time Streaming for IoT Hot-Path Analytics




