What are examples of reference data?

Common examples of reference data include: country and region codes (ISO 3166), currency codes (ISO 4217), product categories and hierarchies, business unit and cost center lists, status codes (Active/Inactive/Deprecated), industry classification codes (SIC, NAICS), units of measurement, and payment terms. These values are used as dropdowns and foreign keys across multiple systems.

Why shouldn't reference data be managed in spreadsheets?

Spreadsheets lack audit trails, approval workflows, access control, and system integration capabilities needed for reference data governance. When reference data lives in spreadsheets, there is no single source of truth — multiple copies drift apart, nobody knows who changed what or when, and syncing updates to downstream systems requires manual effort or fragile scripts.

How do you manage reference data in SQL Server?

Reference data in SQL Server can be managed using a dedicated MDM tool like Primentra that stores reference data in your existing SQL Server instance. This provides a web-based UI for data stewards to manage values, approval workflows to prevent unauthorized changes, full audit trail, and automatic integration with downstream systems through SQL views. The alternative — storing reference data in loose tables without governance — leads to the same problems as spreadsheets.

What is reference data? (And why you shouldn't manage it in a spreadsheet)

country_codes.xlsxLast saved: ???

	A: Code	B: Name	C: Region
1	US	United States	North America
2	USA	United States of America	N. America
3	DE	Germany	Europe
4	DEU	germany	EU
5	NL	Netherlands	Europe
6	NL	The Netherlands	EMEA
7	GB	United Kingdom	Europe
8	UK	Great Britain	EMEA / Europe

4 conflicts found — 3 naming conventions — no change history — who owns this file?

app.primentra.com/countries

Countries8 recordsLast modified: 28 Feb 2026 by admin

Code	Name	Region	Status
DE	Germany	Europe	Active
GB	United Kingdom	Europe	Active
NL	Netherlands	Europe	Active
US	United States	North America	Active

ISO 3166 standard · One source of truth · Full audit trail · Approval workflow enabled

Reference data is the collection of lookup values your systems depend on but nobody owns. Country codes, currency codes, product categories, status flags, cost centers, units of measurement.

Every organization has reference data. Most manage it in spreadsheets, or hardcoded in application logic. That setup works until a JOIN fails because someone typed "USA" instead of "US" and your ETL pipeline silently drops 14,000 rows on a Tuesday night.

This post covers what reference data is, how it differs from master data and transaction data, the warning signs that yours is falling apart, and what to do about it.

Reference data, master data, transaction data: what's the difference?

These three terms get confused constantly. The distinction matters because each type needs different governance.

Reference data is a set of permissible values used to classify other data. The dropdowns in your system: country codes, currency codes, product categories, payment terms. These values change rarely, maybe a few times per year, and are shared across multiple systems.

Master data describes your core business entities: customers, products, suppliers, employees. It changes more often. A customer moves, a product gets discontinued, a supplier comes onboard. Master data is what your transactions reference.

Transaction data records what happens. Orders, invoices, shipments, payments. High volume, constantly generated. An invoice references a customer (master data), a currency (reference data), and payment terms (reference data).

	Reference data	Master data	Transaction data
What is it?	Lookup values, classification codes	Core business entities	Business events
Examples	Country codes, currencies, categories	Customers, products, suppliers	Orders, invoices, payments
Change frequency	Rarely (quarterly or less)	Regularly (daily/weekly)	Constantly (every transaction)
Volume	Low (tens to thousands)	Medium (thousands to millions)	High (millions+)
Who changes it?	IT or data admin (ideally)	Data stewards, business users	Systems, automatically
Risk if wrong	Broken joins, bad reports, compliance gaps	Wrong shipments, duplicate records	Incorrect financials

Reference data gets the least attention because it's small and boring. Nobody gets promoted for maintaining country codes. But when it's wrong, everything downstream breaks. A misspelled country code cascades through every system that references it. If you want to understand where reference data fits in the broader MDM landscape, what is master data management covers how these pieces relate.

Five signs your reference data is out of control

If you manage databases or ETL pipelines, you've probably seen at least three of these. They tend to show up together.

1. Duplicate codes are breaking your JOINs

You have "US" and "USA" and "United States" in three different systems. Your ETL job joins on country code and silently drops the rows that don't match. The quarterly report shows revenue 8% lower than expected. Two days later you trace it back to a code mismatch.

This is the most common symptom. Duplicate reference values multiply across every downstream system. Cleaning up after the fact doesn't help if the duplicates keep entering.

2. No audit trail: who changed "Active" to "Inactive" last Tuesday?

A payment term got changed. A product category got renamed. A business unit code got deleted. When was it changed? Who authorized it? What was the previous value?

If your reference data lives in a spreadsheet on a shared drive, or in a SQL table someone updates with ad-hoc INSERT statements, you have no answers. Compliance asks for change history during an audit. You offer a shrug and a backup from six months ago.

3. Manual sync scripts between systems

Someone wrote a PowerShell script three years ago that copies country codes from the ERP to the CRM every night. It runs as a scheduled task on a server under someone's desk. Nobody knows exactly what it does. The person who wrote it left the company.

If you have custom sync scripts for reference data, it means there's no single source of truth. Multiple copies drift apart over time. Every manual sync point is a place where data gets corrupted, delayed, or lost.

4. No ownership: everyone edits, nobody maintains

Finance adds cost center codes. Sales adds region codes. Product adds category codes. IT adds country codes. Nobody coordinates.

Without ownership, reference data degrades through accumulation. Values get added but never retired. Naming conventions drift. You end up with "EMEA", "Europe", "EU", and "Europe (incl. UK)" all meaning roughly the same thing in the same column. People care. They just don't know it's their job.

5. Stale values that should've been retired years ago

Your country code list still has "Yugoslavia". Your product categories include a line that was discontinued in 2019. Your status codes have both "Cancelled" and "Canceled" and "CANCEL".

Stale reference data creates confusion and bad reports. Nobody removes old values because they're afraid of breaking something. Without a way to soft-delete or deprecate values (keeping history intact while hiding them from new data entry), old values just accumulate.

What proper reference data management looks like

You don't need a twelve-month project or a consulting firm. Four things, applied incrementally.

Start with one entity

Pick your worst offender. For most organizations, it's country codes or product categories, something shared across systems where inconsistencies cause the most pain. Import the current values into a governed store. Clean up the duplicates. Establish the canonical set. One entity, one afternoon.

Assign ownership

Every reference data entity needs an owner. A specific person, not a team, who is responsible for accuracy and completeness. Country codes might sit with the data architect. Product categories with the product manager. Cost centers with the finance controller. Ownership doesn't mean they do all the work. It means they approve changes and keep the list clean.

Add an approval workflow

Once you have ownership, add a gate. Changes to reference data shouldn't go live immediately. The owner reviews first. This single step eliminates the most common problems: duplicate values, typos, unauthorized changes, values added without context.

An approval workflow also gives you a built-in audit trail for free. Every change is recorded: who proposed it, who approved it, when it went live.

Build out gradually

Once the first entity is running cleanly, add the next. Most organizations have between 10 and 30 reference data entities that matter. Migrate them one at a time. Each one you bring under governance reduces risk incrementally.

After a few months, you have a single source of truth for all your reference data. Data stewards use a web UI. Downstream systems query SQL views. The spreadsheets on shared drives can finally be deleted.

Why spreadsheets don't work for reference data

"Don't use spreadsheets" is easy advice to give and hard to act on. Spreadsheets are fine for lots of things. Reference data governance is not one of them.

What you need	Spreadsheet	Governed MDM tool
Single source of truth	Multiple copies on shared drives	One database, one UI, one truth
Change history	Who saved last? Check file properties.	Full audit log: who, what, when
Approval before changes go live	Anyone with file access can edit	Built-in approval workflow
Role-based access	File-level permissions at best	Per-entity, per-field access control
Integration with downstream systems	Export CSV, email it, hope for the best	SQL views, direct DB access
Prevent duplicate values	Conditional formatting (maybe)	Enforced unique constraints
Deprecate old values safely	Delete the row and pray	Soft-delete: hide from new entries, keep history

A spreadsheet is a tool for analysis, not a system of record. When you use it as a system of record for reference data, you accept every governance gap in the table above. At 10 records, that's fine. At 500 records across 15 entities feeding three downstream systems, it's a liability.

Reference data management for SQL Server teams

If your organization runs on SQL Server, your reference data probably already lives there, scattered across lookup tables in various databases. Maybe dbo.Countries, dbo.ProductCategories, ref.CurrencyCodes. The data is there, but the governance isn't.

What's missing is governance: a UI for data stewards to manage values without writing SQL, approval workflows, an audit trail. The data stays in your SQL Server. You add a management layer on top. The models, entities, and attributes post explains how Primentra structures this data model.

That's what we built Primentra to do. It runs on your existing SQL Server instance, no cloud dependency, no separate database infrastructure. You define reference data entities in a web UI, import existing values, assign owners, enable approval workflows. Downstream systems access the governed data through SQL views. Every change has an audit trail and nothing goes live without review.

Frequently asked questions

What is the difference between reference data and master data?

Reference data is a set of permissible values used to classify other data — country codes, currencies, status flags. It changes rarely and is shared across systems. Master data describes core business entities like customers, products, and suppliers. It changes more frequently and is tied to specific business processes. Both need governance, but reference data is simpler to manage and a good starting point for organizations new to MDM.

What are common examples of reference data?

Country and region codes (ISO 3166), currency codes (ISO 4217), product categories and hierarchies, business units, cost centers, payment terms, units of measurement, industry classification codes (SIC, NAICS), and status codes (Active, Inactive, Deprecated). Any list of values used as a dropdown or foreign key across multiple systems qualifies as reference data.

How do I start managing reference data properly?

Start with one entity — usually the one causing the most pain (country codes and product categories are common first picks). Import the current values into a governed system, clean up duplicates, assign an owner, and enable an approval workflow. Once the first entity is stable, add more. Most organizations can bring their critical reference data under governance within a few months without a major project.

Can I manage reference data in SQL Server without a separate tool?

You can store reference data in SQL Server tables, but without a governance layer you lack audit trails, approval workflows, role-based access control, and a UI for non-technical users. Tools like Primentra add governance on top of your existing SQL Server — the data stays in your database, you just gain the controls needed to manage it properly.

Managing reference data in spreadsheets or bare SQL tables? The live demo has a few pre-configured reference data entities you can poke around in. No signup required.

Questions about migrating your reference data? Reach out, happy to walk through your situation.