Mastering Master Data Management
What is it really, and is it important?
Circumstances: The problem with the concepts that shape the notion of Master Data Management (MDM) is that someone just had to make an acronym for it. That led to it becoming a buzzword (buzzronym?), and then the software providers got involved to productize solutions around it. Naturally, the term “MDM” then grew ambiguous, meaning something a little different each time it is presented. Then confusion creeps in. That’s where we are.
Don’t get me wrong, there are a lot of important and good software solutions in the data management space. It’s a large and important marketplace and software category. But it’s annoying when the buzzword overuse community (the buzzerati) can’t stick to a simple and relevant definition.
Definition: Let’s have the simplest possible definition of MDM. Here goes: “Within an integrated marketplace and any enterprise in which many applications (referring to their databases) track similar kinds of data, MDM becomes an important means to uniquely identify records of common data objects (data with similar table sets) representing identical real world “things” (instances of people, accounts, entities, objects and processes), along with an attempt to standardize and map key reference data that describe the essential nature of the items being tracked. MDM is a necessary effort, best suited to a framework to support application integration, data comprehension and trust in business information.”
Key MDM Components: Okay, that wasn’t simple. But the concept is. The thing to keep in mind is that the idea of master data is self-evident and uninteresting in a world where one mega/monolithic system controls all data within a single database. It is because we live in a best-of-breed world and have to deal with many digital data stores that MDM is a relevant topic. At root, the MDM practice is four things, and it applies to structured data.
1. Identify core data objects that have representation across several business applications (like customers and prospects, addresses, contacts, vendors, partners, contracts, sold assets, products, production lots, production units, fixed assets, campaigns, quotes, orders, invoices, payments, work orders, cases, incidents, employees, job positions, job roles, applied work units, interaction channels such as social media, etc.)
2. Determine common unique and natural keys to allow deterministic matching of records, and maintain those keys across processes (like serial numbers, URIs and unique names for records)
3. Identify the most salient data taxonomies and agree to a common way to describe the data across the different applications (like pick lists)
4. Determine a method of data governance and maintenance, so all development, customizations, integrations, and implementations are using the right data definitions and structures in the right way across all systems; and so that the data is reflective of reality.
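To make components 2 and 3 a little more concrete, here is a minimal Python sketch of deterministic matching on a natural key plus a pick-list mapping to a shared taxonomy. Everything in it is an assumption for illustration: the `SourceRecord` shape, the `serial_no` key, the system names and the status values are hypothetical, and a real effort would agree on these with the data owners.

```python
from dataclasses import dataclass

# Hypothetical mapping from each application's local pick-list value
# to an agreed common taxonomy value (component 3).
STATUS_MAP = {
    ("crm", "Active Customer"): "ACTIVE",
    ("erp", "A"): "ACTIVE",
    ("crm", "Former Customer"): "INACTIVE",
    ("erp", "I"): "INACTIVE",
}


@dataclass(frozen=True)
class SourceRecord:
    system: str     # which application holds the record
    local_id: str   # that application's own primary key
    serial_no: str  # assumed natural key shared across systems (component 2)
    status: str     # local pick-list value


def normalize_key(value: str) -> str:
    """Normalize a natural key so trivial formatting differences do not block a match."""
    return value.strip().upper().replace("-", "")


def build_master_index(records):
    """Group records from all systems by their normalized natural key."""
    index = {}
    for rec in records:
        index.setdefault(normalize_key(rec.serial_no), []).append(rec)
    return index


def common_status(rec):
    """Translate a local pick-list value to the shared taxonomy, or None if unmapped."""
    return STATUS_MAP.get((rec.system, rec.status))


records = [
    SourceRecord("crm", "0015000", "sn-1001 ", "Active Customer"),
    SourceRecord("erp", "88213", "SN1001", "A"),
    SourceRecord("erp", "88214", "SN2002", "I"),
]
index = build_master_index(records)
for key, matches in index.items():
    print(key, [(m.system, m.local_id, common_status(m)) for m in matches])
```

The match here is exact after normalization, which is what keeps it deterministic. Unmapped pick-list values come back as `None` so that someone with governance authority can decide how to classify them rather than having the code guess.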
In the old days of heavier waterfall-type application deployments, and especially in pre-cloud and even pre-thin-client systems, the DBA was the person who had more natural control over master data. But today we value loose coupling and innovation and speed to market; we’re agile. So both data and computing are proliferated and distributed.
In truth, most companies do not have enough discipline around MDM; and when there is a need to deal with it, the data is rationalized for “fit” at the time of a system deployment. Then the need becomes acute. But the solution approaches become a little more pragmatic with an attitude of “just-enough” and “just-in-time”. This isn’t really a strategy or a framework.
Advice: Here are the topics I like to focus on when we consider how MDM is useful to our clients.
1. When new systems are being architected, can the senior architects get together and discuss if and how data is to be conformed to common structures with common attributes? And since that topic is on the table, which system is the “system of record” (SOR) for a kind of data at any point in time along the business processes? If a certain system is the master (the SOR), can it serve data to other systems that need the data? Can it support data distribution in either API style or ETL style?
2. In an integrated enterprise architecture, where SORs are established (or at least are valued and in play), how can an edge system request data from the SOR using natural or unique keys to find-and-bind, and “get” the data across to the consuming system? This is important whenever one system holds reliable data, and the other system is used in a way that a new record must be created…but the new record partially replicates data that is already held elsewhere. Is there a shared context that allows for matching?
Consider a CRM that has the notion of a customer (and has bound to instantiated customer IDs from the ERP/billing system). But another application is being used by a customer service team, and that system receives a call or email from a customer needing service. The customer care system must ask the customer for a context point, perhaps a customer ID, their name and address, or email account. Then the customer care system must ask the CRM master DB to find the record and produce some information that perpetuates the process. The CRM delivers information as needed. This information may be “sacred” and therefore read-only to the edge system. But the new issue or case record may need to be “filed” back to the CRM. The customer service case ID and other case type data will be stored in CRM as read-only, while the case record in the customer service system is being updated over the lifecycle of the case. CRM then talks to ERP and sends it a service transaction, such as a work order or credit memo.
In this example, the edge systems may have master data control over the process records, and CRM has control over the customer ID and stable identity information. But they all need to share master data and utilize commonalities for account, contact, address, case, orders, shipments, invoices, etc.
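The find-and-bind flow in this example can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names, the `find` and `file_case` methods, and the case ID format are all hypothetical, and in-memory dictionaries stand in for what would really be API calls between systems.

```python
from types import MappingProxyType


class CrmSystemOfRecord:
    """Hypothetical stand-in for the CRM acting as SOR for customer identity."""

    def __init__(self):
        self._customers = {}  # customer_id -> customer fields
        self._by_email = {}   # normalized email -> customer_id
        self.case_refs = {}   # customer_id -> case IDs filed back by edge systems

    def add_customer(self, customer_id, name, email):
        self._customers[customer_id] = {
            "customer_id": customer_id, "name": name, "email": email,
        }
        self._by_email[email.strip().lower()] = customer_id
        self.case_refs[customer_id] = []

    def find(self, customer_id=None, email=None):
        """Find by unique key (customer ID) or by a natural key (email)."""
        if customer_id is None and email is not None:
            customer_id = self._by_email.get(email.strip().lower())
        record = self._customers.get(customer_id)
        # Hand back a read-only copy: the data is "sacred" to the edge system.
        return MappingProxyType(dict(record)) if record else None

    def file_case(self, customer_id, case_id):
        """Store only a reference to the case; the edge system owns the case itself."""
        self.case_refs[customer_id].append(case_id)


class CustomerCareSystem:
    """Hypothetical edge system that masters case records but not customers."""

    def __init__(self, crm):
        self.crm = crm
        self.cases = {}
        self._next_number = 1

    def open_case(self, summary, customer_id=None, email=None):
        customer = self.crm.find(customer_id=customer_id, email=email)
        if customer is None:
            raise LookupError("no matching customer in the system of record")
        case_id = f"CASE-{self._next_number:04d}"
        self._next_number += 1
        # Bind the new case to the SOR's customer ID instead of copying identity data.
        self.cases[case_id] = {
            "customer_id": customer["customer_id"],
            "summary": summary,
            "status": "open",
        }
        self.crm.file_case(customer["customer_id"], case_id)
        return case_id


crm = CrmSystemOfRecord()
crm.add_customer("C-100", "Ada Lopez", "ada@example.com")
care = CustomerCareSystem(crm)
print(care.open_case("Unit will not power on", email=" Ada@Example.com "))
```

The design choice worth noticing is that the case record stores only the customer ID from the SOR, and the CRM stores only the case ID from the edge system. Each side keeps mastery of its own data and holds a reference to the other.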
3. Do you have an internal authority to apply taxonomical “types” or categories to master data? For CRM, this is often relationship types, relationship status, business marketplace categories, record sources, territory assignments. The right information workers need to assign the right classifications and manage this over time. Establishing taxonomies can get a little ideological among and between the participants…I caution you not to get too hung up on the whole process.
4. Do you know when a record should be “retired”? When will you inactivate the record to keep it out of the active data set, without undermining BI by deleting historical data? Again, this needs some administrative authority. My mind goes to the movie where John Wick is classified as “incommunicado”. Who does that?
5. When do you replicate data across applications vs. when do you use API calls to transfer data in specific data sets on-demand for limited use in the consuming system?
6. How do you perpetually maintain data quality and integrity? Do you even have a limited definition of a complete record for key data sets? How do you eliminate duplicates? How do you track changes to reference data? How do you spot defects and remediate them?
7. And importantly, do you have a change framework, where you can keep an eye on MDM while moving your applications ever forward? How do you preserve system responsiveness and agility, and prevent your systems approaches from becoming byzantine and rutted in procedural malaise for the sake of MDM? Are you working to prevent over-reacting to the need for a right-sized MDM?
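A few of the questions above (what counts as a complete record, how to spot duplicates, how to retire a record without losing history) lend themselves to small, boring checks. Here is a minimal Python sketch under assumptions: the required field list, the use of email as the duplicate key, and the `active` and `retired_on` fields are hypothetical choices, not a standard.

```python
from datetime import date

# Assumed "limited definition of a complete record" for customer data.
REQUIRED_FIELDS = ("customer_id", "name", "email")


def missing_fields(record):
    """Return the required fields that are absent or blank in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def find_duplicates(records, key="email"):
    """Group customer IDs that share the same normalized key value."""
    seen = {}
    for rec in records:
        value = str(rec.get(key) or "").strip().lower()
        if value:
            seen.setdefault(value, []).append(rec.get("customer_id"))
    return {value: ids for value, ids in seen.items() if len(ids) > 1}


def retire(record, on):
    """Mark a record inactive instead of deleting it, so history stays available for BI."""
    return {**record, "active": False, "retired_on": on.isoformat()}


def active_only(records):
    """Filter to the active data set; records without the flag count as active."""
    return [r for r in records if r.get("active", True)]


customers = [
    {"customer_id": "C-100", "name": "Ada Lopez", "email": "ada@example.com"},
    {"customer_id": "C-101", "name": "A. Lopez", "email": "ADA@example.com "},
    {"customer_id": "C-102", "name": "", "email": "sam@example.com"},
]
print([(c["customer_id"], missing_fields(c)) for c in customers])
print(find_duplicates(customers))
customers[2] = retire(customers[2], date(2024, 1, 31))
print([c["customer_id"] for c in active_only(customers)])
```

Checks like these only flag candidates. Deciding which duplicate survives, or whether a record really should be retired, still belongs to whoever holds the administrative authority discussed above.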
Do You Need Advice? If this paper on Master Data Management brings up more questions than answers, and you are aware that data alignment problems are undermining your BI initiatives or everyday business process success, it might be worthwhile to seek some assistance. Contact us at 920-428-5669; firstname.lastname@example.org.