How to measure master data quality: the six numbers that tell you the truth

A master data quality scorecard

Completeness

94%

Validity

88%

Uniqueness

97%

Consistency

76%

Timeliness

61%

Accuracy

82%

The point is not the average. It is seeing which number is red before someone in finance does.

Almost nobody measures master data quality until a number in a report is wrong and someone goes hunting for the cause. By then the question has changed. It is no longer “what is our data quality.” It is “how long has this been broken, and what did it cost.”

Most teams I have worked with could not answer either. Not because they were careless, but because “data quality” had never been a number in their world. It was a feeling. The supplier file felt messy. The customer list felt full of duplicates. Feelings do not survive a meeting with a controller who wants to know why spend is off by 4%. A number does.

“Bad data quality” is too vague to act on

Say the supplier data is bad. Bad how? Missing tax IDs is one problem. The same supplier entered three times is a different problem. Records nobody has touched since 2019 is a third. Each has a different fix, a different owner, and a different cost. Lumping them under one word means you can argue about it forever and never fix any of them.

The whole point of measuring is to turn the argument into a query. Once “the data is messy” becomes “18% of active suppliers are missing a valid VAT number,” the conversation stops being about opinions and starts being about a list of 340 records and who is going to fix them.

The six dimensions worth measuring

You do not need a framework with forty metrics. Six dimensions cover almost everything that goes wrong with master data. Five of them you can compute against data you already hold. The sixth you cannot, and pretending otherwise is how scorecards start lying.

Dimension	The question it answers	How you compute it
Completeness	Are the mandatory fields filled in?	Records with all required attributes non-null, over the total.
Validity	Does each value match its allowed format or set?	Records whose values pass a format check or lookup, over the total.
Uniqueness	How many records describe the same real thing?	Records that fall into a duplicate cluster, over the total.
Consistency	Do the master and the source systems agree?	Records where a shared field disagrees across systems, over the matched total.
Timeliness	How stale is the record?	Records not verified or changed within your freshness window, over the total.
Accuracy	Does the value match the real world?	Needs an external source. Sample and verify; you cannot compute it internally.

Notice the gap between the first five and the last. Accuracy is the one everyone asks for and almost nobody has. You can know a record is complete, valid, unique, consistent, and fresh, and still have it be wrong, because the company moved last quarter and nobody told you. Accuracy needs the outside world. The other five just need a query.

Computing them in SQL Server

Here is the part that surprises people: the queries are not hard. They are the kind of thing any DBA writes before lunch. Most teams lack these numbers for one reason, and it has nothing to do with difficulty. Nobody was ever asked to run them. The examples below are illustrative, against a plain Supplier table. Swap in your own columns.

Completeness

Count the records where every mandatory field is present, over the total. Decide what counts as mandatory first; the query is the easy part.

-- Completeness: % of active suppliers with all required fields
SELECT
    CAST(100.0 * SUM(CASE
        WHEN VatNumber IS NOT NULL
         AND CountryCode IS NOT NULL
         AND PaymentTerms IS NOT NULL
        THEN 1 ELSE 0 END) / COUNT(*) AS DECIMAL(5,1)) AS CompletenessPct
FROM dbo.Supplier
WHERE IsActive = 1;

Validity

A value can be present and still be nonsense. Validity checks that each value matches an allowed set or format. The cheapest version is a lookup against a reference table; the next cheapest is a pattern check.

-- Validity: country codes that are not in the ISO reference list
SELECT
    CAST(100.0 * SUM(CASE
        WHEN iso.Code IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*) AS DECIMAL(5,1)) AS ValidCountryPct
FROM dbo.Supplier s
LEFT JOIN dbo.IsoCountry iso ON iso.Code = s.CountryCode
WHERE s.IsActive = 1;

Uniqueness

Duplicates are the dimension people feel the most and measure the least. You will never catch every duplicate with a query, but a normalized key on name plus postcode finds the obvious ones and gives you a number to track.

-- Uniqueness: suppliers sharing a normalized name + postcode
WITH Normalized AS (
    SELECT
        SupplierId,
        LOWER(REPLACE(REPLACE(Name, ' ', ''), '.', '')) AS NameKey,
        PostalCode
    FROM dbo.Supplier
    WHERE IsActive = 1
)
SELECT NameKey, PostalCode, COUNT(*) AS Copies
FROM Normalized
GROUP BY NameKey, PostalCode
HAVING COUNT(*) > 1
ORDER BY Copies DESC;

Consistency and timeliness follow the same shape. Consistency joins the master to a source extract and counts where a shared field disagrees. Timeliness counts records whose LastVerifiedAt is older than your freshness window. None of this is clever. It is just work nobody had assigned.

What “good” actually looks like

The instinct, once you have numbers, is to chase 100% everywhere. Do not. A blended quality score across all fields is one of the least useful numbers you can produce, because it averages a field that has to be perfect with a field nobody cares about and lands somewhere in the middle that tells you nothing.

Set a threshold per field per dimension instead. The real skill in measuring quality is deciding what you are allowed to ignore.

Field	Dimension	Target	Why
VAT / tax ID	Completeness + Validity	100%, it is legally required	A wrong or missing tax ID fails an invoice and can fail an audit.
Country code	Validity	100% against the ISO list	Cheap to enforce, breaks reporting and shipping when wrong.
Supplier name + postcode	Uniqueness	Under 1% duplicate clusters	Duplicates split spend and hide negotiating leverage.
Payment terms	Consistency	Master and ERP agree 100%	A mismatch pays the wrong terms on real money.
Secondary contact	Completeness	70% is fine	Nice to have. Not worth a single steward hour to chase.

A good program is not one with a high average. It is one where every number that is allowed to be low is low on purpose, and someone signed off on that.

The scorecard nobody opens

I have seen more dead data quality dashboards than live ones. Someone builds a beautiful page of gauges, presents it once, and then it sits untouched while the numbers rot. A metric that nobody is accountable for is not a metric. It is decoration.

The fix is the same boring discipline that works for every operational number. Each metric gets an owner, a threshold, and a cadence. When the number crosses the threshold, something happens: a record lands in a steward’s queue, an email goes out, a line turns red in the Monday review. The measurement only matters if a falling number triggers an action. Otherwise you are just keeping score while the game gets away from you.

Why this is the part MDS never helped with

MDS could hold the data. It gave you almost nothing for measuring its quality. If you wanted a completeness number, you wrote the T-SQL yourself, ran it by hand, and pasted the result into a spreadsheet. The measurement existed only as long as the person who wrote the query stuck around. When they moved teams, the number went with them, and quality went back to being a feeling.

That is the gap a modern MDM should close. Validity should not be a query you remember to run; it should be a rule on the attribute that blocks the bad value at entry and reports the violation rate on its own. The audit trail should make timeliness free, because every record already knows when it last changed and who changed it. Measurement should be a property of the platform, not a side project that depends on one person’s memory.

For more on the prevention side, see the earlier post on data quality rules that actually work, and on what poor quality costs once it slips through, how to measure what bad master data costs you.

What to do this week

Do not start a quality program. Start with one number. Pick your most painful domain, pick completeness because it is the easiest to compute, write the query, and run it. You now have a baseline, which is more than most teams ever get.

Next week, add validity on one field that matters, and put both numbers somewhere a human will see them. Once two numbers exist and someone is watching them, the rest is just repetition. The hard part was never the SQL. It was deciding that quality is a number, and then being willing to look at it.

Common questions

What are the dimensions of master data quality?

Six cover almost everything: completeness, validity, uniqueness, consistency, timeliness, and accuracy. The first five you can compute with SQL against your own data. Accuracy needs an external reference, which is why most teams quietly skip it.

How do you measure data completeness?

It is the percentage of records where every mandatory field is filled. The hard part is deciding which fields are actually mandatory for that domain. The count itself is one GROUP BY.

What is a good master data quality score?

There is no single good number. Set a threshold per field per dimension. A VAT number should be 100% complete and valid. An optional field can sit at 70%. A good program is one where every number that is low is low on purpose.

Can you measure data accuracy automatically?

Not from inside your own database. Accuracy means the value matches the real world, which only an external source can confirm: VAT validation, an address service, a credit bureau, or a person on the phone. Measure the five you can compute, sample the sixth.

Put a number on your own data

Primentra enforces validity as rules on the attribute, tracks who changed every record and when, and runs on your SQL Server so the queries above work against the same database. The 60-day trial runs on your real data, which is long enough to get a baseline on a domain that has been bothering you for years.

Start free trial →Try the demo →