“What is data management?” I guess many people (myself included) would answer “Em… data management is managing data, right?” while silently thinking, “What a stupid question!”
However, if I were asked this question in a job interview, I had better provide a somewhat longer answer, such as the DAMA definition cited below, if I could ever memorise it.
“Data Management is the development, execution, and supervision of plans, policies, programs, and practices that deliver, control, protect, and enhance the value of data and information assets throughout their lifecycles.” (DAMA-DMBOK)
If the interviewers asked me to elaborate further, it could be a challenge, as there are so many facets of data management. It requires many interdependent functions, each with its own goals, activities, and responsibilities, and even a data management professional finds it difficult to keep track of all the components and activities involved.
Fortunately, DAMA developed the DMBOK framework, which organises the data management knowledge areas in a structured form and enables data professionals to understand data management comprehensively.
I have recently been diving into the heavy DAMA DMBOK book (“heavy” in the literal sense: the book weighs 1.65 kg!). I recommend that all data professionals give this book a read. It not only connects the dots in your knowledge network to give you a comprehensive understanding of data management, but also provides a common language for communicating in the data world (instead of just nodding and smiling in a meeting when you hear some data jargon).
The DAMA DMBOK framework defines 11 functional areas of data management.
As you can see from the DAMA Wheel above, Data Governance is at the heart of all the data management functional areas. Data governance provides direction and oversight for data management to ensure the secure, effective, and efficient use of data within the organisation. The other functional areas include:
- Data Architecture – defines the structure of an organisation’s logical and physical data assets and data management processes through the data lifecycle
- Data Modelling and Design – the process of discovering, analysing, representing and communicating data requirements in the form of data models
- Data Storage and Operations – the process of designing, implementing, and supporting data storage
- Data Security – ensuring data is accessed and used properly, with data privacy and confidentiality maintained
- Data Integration and Interoperability – the process of designing and implementing data movement and consolidation within and between data sources
- Document and Content Management – the process of managing data stored in unstructured media
- Reference and Master Data – the process of maintaining the core critical shared data within the organisation
- Data Warehousing and Business Intelligence – the process of planning, implementing and controlling the processes for managing decision-support data
- Metadata – managing the information about the data in the organisation
- Data Quality – the process of ensuring data is fit for use.
Based on the functional areas defined by the DAMA Wheel, Peter Aiken developed the DMBOK pyramid, which defines the relationships between those functional areas.
At the top of the DMBOK pyramid sits the golden function, the one that adds the most value for the business. However, the pyramid also reveals that data analytics is just a very small part of an organisation’s data system. To make data analytics workable, many other functions need to work together seamlessly to build the foundation.
As the DMBOK pyramid shows, data governance sits at the bottom and forms the foundation for the whole data system. Data architecture, data quality and metadata make up another layer of logical foundation on top of data governance. The next layer up includes data security, data modelling & design, and data storage & operations. On top of that layer, the data integration & interoperability function moves and consolidates data to enable the functions in the layer above: data warehousing/business intelligence, reference & master data, and documents & content. From this layer upwards, the functions start to be business-facing. That also means the business cannot see the functions that need to run underneath.
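To make the layering concrete, here is a minimal sketch that writes the pyramid down as an ordered list, bottom to top, following the description above. The list layout and the helper name `functions_beneath` are my own illustrative choices, not something defined by DMBOK.

```python
from __future__ import annotations

# Layers of the DMBOK pyramid, bottom to top, as described in the text.
# The data structure itself is an assumption made for illustration.
PYRAMID_LAYERS = [
    ["data governance"],
    ["data architecture", "data quality", "metadata"],
    ["data security", "data modelling & design", "data storage & operations"],
    ["data integration & interoperability"],
    [
        "data warehousing/business intelligence",
        "reference & master data",
        "documents & content",
    ],
    ["data analytics"],
]


def functions_beneath(function_name: str) -> list[str]:
    """Return every function that sits in a layer below the given one."""
    for index, layer in enumerate(PYRAMID_LAYERS):
        if function_name in layer:
            return [name for lower in PYRAMID_LAYERS[:index] for name in lower]
    raise ValueError(f"unknown function: {function_name}")


if __name__ == "__main__":
    # List everything the business does not see underneath analytics.
    for name in functions_beneath("data analytics"):
        print(name)
```

Asking what lies beneath data analytics returns every other function in the pyramid, which is exactly the point about analytics being only a small visible part of the whole system.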
The practical contribution of the DMBOK pyramid is to reveal the logical progression of steps for constructing a data system. It defines four phases in an organisation’s data management journey:
Phase 1 (Blue layer) – An organisation’s data management journey starts with purchasing applications that include database capabilities. That means data modelling & design, data storage, and data security are the first functions to be put in place.
Phase 2 (Orange layer) – The organisation starts to feel the pain of bad data quality. To improve data quality, reliable metadata and a consistent data architecture are essential.
Phase 3 (Green layer) – As data quality, metadata and data architecture develop, the need for structural support for data management activities starts to appear, which reveals the importance of a proper data governance practice. At the same time, data governance enables the execution of strategic initiatives, such as data warehousing, document management, and master data management.
Phase 4 (Red layer) – Well-managed data enables advanced analytics.
The four phases of the DMBOK pyramid seem to make sense. However, in my own experience, organisations often take different routes, mostly rushing into high-profile initiatives such as data warehousing, BI, and machine learning when no data governance, data quality, metadata, etc. is in place.
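As a rough sketch of what such a shortcut skips, the four phases can be written as a small lookup and checked against whatever an organisation already has in place. The dictionary layout, the function names, and the rule that only earlier phases count as prerequisites are my own simplifying assumptions, not part of DMBOK.

```python
from __future__ import annotations

# The four phases as listed above; the structure is an illustrative assumption.
PHASES = {
    1: {"data modelling & design", "data storage", "data security"},
    2: {"data quality", "metadata", "data architecture"},
    3: {
        "data governance",
        "data warehousing",
        "document management",
        "master data management",
    },
    4: {"advanced analytics"},
}


def phase_of(function_name: str) -> int:
    """Return the phase number in which a function appears."""
    for phase, functions in PHASES.items():
        if function_name in functions:
            return phase
    raise ValueError(f"unknown function: {function_name}")


def missing_foundations(target: str, in_place: set[str]) -> dict[int, list[str]]:
    """List functions from earlier phases that are not yet in place.

    Assumption: only phases before the target's own phase are treated
    as prerequisites.
    """
    gaps = {}
    for phase in range(1, phase_of(target)):
        missing = sorted(PHASES[phase] - in_place)
        if missing:
            gaps[phase] = missing
    return gaps


if __name__ == "__main__":
    # An organisation jumping to data warehousing with only storage and security.
    print(missing_foundations("data warehousing", {"data storage", "data security"}))
```

For an organisation that rushes into data warehousing with only data storage and data security in place, the check reports data modelling & design from Phase 1 and all of Phase 2 as gaps.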