Tag: Data Quality / dqops

dqops Data Quality Rules (Part 2) – CFD, Machine Learning

The previous blog post introduced a list of basic data quality rules developed for my R&D data quality improvement initiative. Those rules are fundamental and essential for detecting data quality problems. However, they have existed for a long time and are neither innovative nor exciting. More importantly, those …

Data Quality Improvement – Rule-Based Data Quality Assessment

As discussed in the previous blog posts in my Data Quality Improvement series, the key to successful data quality management is continuous awareness of, and insight into, how fit your data is for its use in your business. Data quality assessment is the core and possibly the most challenging activity in the data quality management process. …

Data Quality Improvement – Conditional Functional Dependency (CFD)

To fulfil the promise I made before, I dedicate this blog post to the topic of Conditional Functional Dependency (CFD). I give it a whole post because CFD is one of the most promising constraints for detecting and repairing inconsistencies in a dataset. The use of CFD …

Data Quality Improvement – Data Profiling

This is the second post of my Data Quality Improvement blog series. It discusses the data profiling tasks that I think are relevant to data quality improvement use cases. Anyone who has ever worked with data must already have done some sort of data profiling, either using a commercial …

Data Quality – 80:20 Rule and 1:10:100 Rule

I came across two data quality rules on Martin Doyle's blog today. Martin Doyle is a data quality improvement evangelist and an industry expert on CRM. I found that, to a certain extent, those rules provide theoretical support for some of my ideas on data quality improvement. 80:20 Rule The 80:20 …

Data Quality Improvement – Set the Scene Up

In this blog series I plan to write about data quality improvement from a data engineer's perspective. I plan for the series to cover not only data quality concepts, methodologies, and procedures, but also case studies of the architectural designs of some data quality management platforms and a deep dive into the technical details of implementing a data …

Our Data Quality is Good, Nothing Breakdown

Boss: Our data quality is good, nothing breakdown
IT: Our data quality is good, there are some unimportant known issues, but all under control
BI Developer: All data is from source systems, the quality should be good. Hey, look, how cool is the dashboard I built
Business Users: Once again, those reports don't make sense …

What Makes Me Become a Data Quality Enthusiast

Data Quality is Important Most of the time I don't think I am an absolutist; however, I have become more and more certain that data quality is the root of all evil. Not only should a bigger portion of project time be allocated to data quality management, but also a type of lean, agile and …