Back to blog
PrimentraPrimentra
·May 27, 2026·9 min read

Master data identifiers: the choice you cannot reverse

Home/Blog/Master data identifiers: the choice you cannot reverse

The same master record, three ways to identify it

Business key
SKU-AC-204
Looks correct, breaks the first time the business reorganizes.
Surrogate INT
4827
Small, fast, readable. Fine when one MDM creates every record.
UUID
7c3a8e2f...
Bigger index. Survives multi-environment and multi-writer creation.

Of all the decisions you make in the first month of an MDM project, none is harder to undo than the format of the master identifier. The data model can be refactored after launch. Validation rules and approval workflow are things teams cheerfully rebuild a year later. The ID is the one thing every downstream system embeds in a join, a foreign key, or an integration log. Once it is out the door, it is out forever.

This is one of the reasons MDS migrations hurt. Teams who picked the wrong identifier in 2014 are living inside the consequence right now, and the migration is the moment when the consequence becomes visible.

The three choices, briefly

There are three practical options for the primary identifier on a master record. Each is a defensible choice in some situation and a quiet disaster in some other.

A business key is something that already means something to a human. An ISO country code. A SAP supplier number. A SKU. Easy to type into a query and easy to recognize on a printout.

A surrogate integer is something the database made up. An auto-increment INT or BIGINT with no meaning outside its own table. Small, fast on lookups, and small enough that the index fits in memory on almost any server you would buy.

A UUID is something the application made up. 128 bits of randomness, globally unique, generated by anyone without coordination. Bigger, uglier, and immune to a lot of problems the other two are not.

PropertyBusiness keySurrogate INTUUID
Size on diskVariable4 or 8 bytes16 bytes
Readable in a queryYesYesAwkward
Survives source-system renumberNoYesYes
Survives merger of two MDM instancesNoNo without renumberingYes
Generated outside the databaseYesNoYes
Survives test-to-prod promotionSometimesNoYes

Why the business key is the mistake almost everyone makes first

The pull toward business keys is strong, and the reason it is strong is that for the first six months they look like the obvious right answer. You can read the value. You can find a record by typing the number. Stewards do not have to translate between two columns. Everyone agrees on what the field means.

Then a few things start happening. None of them are exotic. They happen at every organization of a certain size eventually.

The eventWhat happens to your master ID
AcquisitionThe acquired company has its own SAP with its own supplier numbers. Supplier 4001 now means two suppliers.
RenumberingProduct team aligns SKUs to a new category structure. Twelve months of master records have to be re-keyed.
ReuseSource system retires supplier 4001 in March. Finance reuses 4001 for a new supplier in September. Six months of postings point at the wrong master.
SplitA country splits, two-letter ISO code retires. Values you had on file in 2018 no longer exist as valid codes.
RebrandA product line gets renamed, and the customer-facing SKU prefix changes. Now your "primary key" disagrees with every label printed in the warehouse.

A business key as a primary identifier turns every one of these into a migration. Every integration that joined on the old value has to be updated. Every report. Every external file format. Every contract that quoted the supplier number on a printed page.

I am not saying business keys should disappear. They are useful. People need them. They just should not be the join key.

Surrogate INT or UUID, when does each one win

With business keys ruled out as the primary identifier, the real choice is between a surrogate INT and a UUID. Both work. They win in different situations.

Surrogate INTs are smaller, faster on lookups, and easy on the eyes. A junior analyst can paste 4827 into a query and find the supplier. The index is half the size of the equivalent UUID index, which matters more than people think on a table that joins to every transaction your warehouse touches.

UUIDs cost you those things, and buy you others. Two systems can generate identifiers at the same time without colliding. A record created in a test environment keeps its identifier when promoted to production, which sounds small until you have spent a Friday afternoon translating IDs in a deployment script. If you ever merge two MDM instances after an acquisition, UUIDs make the merge a copy and INTs make it a re-keying project.

For most mid-market MDMs that run on a single SQL Server and create records in one place, the INT is fine. If your roadmap includes federated authoring, multiple regional instances, or external systems that need to mint master IDs and push them in, start with the UUID. Half a kilobyte of extra index per thousand records is cheap compared to the alternative.

The hybrid almost every long-lived MDM ends up with

Every MDM that survives ten years uses the same pattern. The master identifier is a surrogate, either an INT or a UUID. The business keys live as attributes on the record, or in a cross-reference table next to it. Stewards can search on them. Reports can filter on them. The application can render them on screen. They simply are not the join key.

This is the design that lets business keys exist without ruining you. When SAP renumbers, the cross-reference row moves. The master ID stays where it was. Every downstream consumer keeps working because nothing they joined on actually changed.

If you have not seen the post on cross-reference tables, that is the structure that makes the hybrid work. It is the difference between an MDM that holds up through a decade of source-system changes and one that has to be partially rebuilt the first time SAP merges two instances.

In practice the data model looks something like this. Master record on the left, business keys on the right, joined through a cross-reference table that owns the mapping:

-- The master table: one row per real-world entity, surrogate key only
CREATE TABLE master_supplier (
    supplier_id   INT IDENTITY(1,1) PRIMARY KEY,
    legal_name    NVARCHAR(255) NOT NULL,
    tax_id        NVARCHAR(64),
    created_utc   DATETIME2 DEFAULT SYSUTCDATETIME()
);

-- The cross-reference: source system identifiers live here, not on the master
CREATE TABLE supplier_xref (
    xref_id          INT IDENTITY(1,1) PRIMARY KEY,
    supplier_id      INT NOT NULL REFERENCES master_supplier(supplier_id),
    source_system    NVARCHAR(64) NOT NULL,   -- e.g. SAP_UK, SAP_US, NETSUITE
    source_id        NVARCHAR(128) NOT NULL,  -- the business key in that system
    active           BIT NOT NULL DEFAULT 1,
    deactivated_utc  DATETIME2 NULL,
    UNIQUE (source_system, source_id, active)
);

What MDS got right, and what most MDS migrations get wrong

MDS actually had a sensible split. Every entity carried both an integer Id and a Code field. The Id was the primary key. The Code was the business identifier, intended for humans to recognize.

The mistake a lot of teams made was treating Code as the real identifier anyway. Subscriber views were keyed on Code. Downstream integrations joined on Code. Audit reports referenced Code. When the business changed and Code had to change with it, every one of those consumers broke at the same time.

Migrating off MDS is the moment to undo this. The new MDM is a fresh chance to put the surrogate back where it belongs as the primary key, and to demote Code to a queryable attribute that downstream consumers can read but not depend on. If you carry the same join pattern into the next platform, you are buying yourself the same problem in a different color.

A small number of mistakes that show up over and over

A few specific patterns to avoid, from projects that ran for years and then had to walk things back:

Storing the master ID as a string when it is really an integer, to keep your options open. The flexibility is theoretical. The cost of string comparison on every join is real, every day, forever.

Reusing integer IDs after a hard delete. The next consumer that loads the old ID for a historical lookup gets the wrong record back, silently. Soft-delete or skip the number forever.

Letting the application format the ID for display in ways that make it look meaningful. SUP-00004827 looks tidier than 4827 until someone in finance assumes the leading zeros are significant and writes a report that filters on the wrong string.

Exposing the surrogate publicly in URLs or APIs without thinking about what it leaks. An auto-increment ID tells the world how many records you have and the order they were created in. For some domains that is fine. For others it is information you did not want to give away.

What this looks like in Primentra

Primentra uses an integer master identifier by default and keeps business keys as attributes on the record, where downstream consumers can read them without joining on them. Source system IDs go into the cross-reference table rather than onto the master itself, so renumbering a source does not touch the master ID.

If you are migrating from MDS, the import wizard preserves your old Code values as attributes while the new master IDs are assigned fresh. That gives you a clean break from the pattern of joining on Code, without losing the values stewards have spent years recognizing.

The data lives in your own SQL Server. The IDs are real integers in a real table. You can paste them into a query, embed them in a foreign key, or hand them to a consumer system, and they will mean the same thing tomorrow.

Related reading

The identifier choice is half the story. The other half is the table that maps every source system ID to one master, which is what the cross-reference post covers. If you are still working out how to load the first version of the master data, the post on the initial data load walks through what to do with the legacy keys you inherit. And if you have not picked your first domain yet, how to pick your first MDM domain is the place to start before any of this matters.

Common questions

Should I use a business key as the primary identifier?

No. Business keys look correct for the first six months, then break the first acquisition, the first renumber, the first reuse, or the first country code retirement. Use a surrogate INT or a UUID as the master ID and keep business keys as queryable attributes.

Surrogate INT or UUID?

INT if records are only ever created in one MDM instance, on one database. UUID if records can be created in multiple environments or systems, or if you need test-to-production promotion to preserve identifiers. For most on-premise mid-market MDMs, an INT is fine.

Why are identifiers so hard to change later?

Because every downstream system embeds the ID. Once it is out the door to twelve consumers, changing the format means coordinating a rewrite across all twelve. The cost of getting it right on day one is hours. The cost of changing it later is a year of cleanup.

Can I migrate the master ID after go-live?

You can run dual IDs in parallel and translate at every integration boundary, but most projects that start this never finish it. One downstream consumer always slips through and the dual system becomes permanent.

Get the identifier decision right on day one

Primentra ships with a surrogate master ID and the cross-reference table already wired in, so business keys live where they belong and the join key holds steady through the things that change. Runs on your own SQL Server, where the IDs are integers in a table you can query directly. The 60-day trial runs on your real source systems, which is long enough to find out whether the model holds up against the keys you actually have.

Start free trial →Try the demo →

More from the blog

Your MDM is only as good as its cross-reference table9 min readBuild vs buy for master data management: rolling your own is how you become the next abandoned MDS10 min readHow to measure master data quality: the six numbers that tell you the truth10 min read

Ready to migrate from Microsoft MDS?

Join the waitlist and be the first to try Primentra. All features included.

Download Free TrialTry DemoCompare MDM tools