Primentra · March 2, 2026 · 14 min read

Inside the Staging Scheduler: How Primentra Automates Batch Processing


Primentra's staging tables let external systems — SSIS packages, ETL pipelines, raw SQL inserts — deposit data for review before it touches production. But somebody has to hit the "Process" button. For a handful of entities, that's fine. When you have thirty entities with data arriving from six different sources around the clock, manual processing stops being practical.

That's why we built the staging scheduler. It runs inside SQL Server itself — no external services, no message queues, no additional infrastructure. One SQL Agent job, one dispatcher stored procedure, and a per-entity configuration table. This post walks through every layer of the system: the tables, the dispatch loop, the concurrency guards, the trigger evaluation logic, and the logging. If you want to understand how staging rows actually flow into Primentra, the SQL Server staging table load strategies post covers the ingestion side.

Everything below is from our actual codebase, not pseudocode.

System architecture

SQL Agent job (fires every 60 seconds)
→ usp_Scheduler_Dispatch
  → APPLOCK guard
  → Overlap check
  → Zombie recovery
  → Trigger evaluation
  → usp_Staging_ProcessBatch
State tables: SchedulerConfigurations, StagingBatches, SchedulerLog

The data model

Two tables hold the entire scheduler state. SchedulerConfigurations stores one row per entity with the processing mode, schedule parameters, trigger thresholds, overlap flags, and run tracking. SchedulerLog is the audit trail — every time the dispatcher fires, skips, completes, errors, or releases a zombie, it writes a row here.

CREATE TABLE SchedulerConfigurations (
    Id                      INT IDENTITY(1,1) PRIMARY KEY,
    EntityId                INT NULL REFERENCES Entities(Id),

    -- Mode: 'manual' | 'scheduled' | 'triggered'
    ProcessingMode          NVARCHAR(20) NOT NULL DEFAULT 'manual',

    -- Scheduled mode settings
    ScheduleType            NVARCHAR(20) NULL,      -- 'interval'|'daily'|'weekly'|'monthly'
    IntervalMinutes         INT NULL,
    RunTimes                NVARCHAR(200) NULL,     -- Comma-separated HH:mm
    RunDays                 NVARCHAR(100) NULL,     -- Mon,Tue,...
    RunDaysOfMonth          NVARCHAR(100) NULL,     -- 1,15,...

    -- Trigger mode settings
    TriggerOnNewRows        BIT NOT NULL DEFAULT 0,
    TriggerRowThreshold     INT NULL,
    TriggerIdleMinutes      INT NULL,
    TriggerDebounceSeconds  INT NOT NULL DEFAULT 60,

    -- Overlap guard
    IsRunning               BIT NOT NULL DEFAULT 0,
    RunStartedAt            DATETIME2 NULL,
    RunStartedBy            NVARCHAR(50) NULL,

    -- Schedule tracking
    NextScheduledRun        DATETIME2 NULL,
    LastCompletedAt         DATETIME2 NULL,
    LastTriggerCheck        DATETIME2 NULL,

    IsEnabled               BIT NOT NULL DEFAULT 1,
    CONSTRAINT UQ_SchedulerConfig_EntityId UNIQUE (EntityId)
);

A few things to notice. ProcessingMode is the top-level switch: manual means the dispatcher ignores this entity entirely. RunStartedBy tags who started the batch — "Schedule", "Manual", or a trigger description like "Trigger: 500 rows (threshold 500)". The UNIQUE constraint on EntityId guarantees one config row per entity.
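For concreteness, here is what config rows might look like. This is an illustrative sketch, not rows from a real deployment; the entity IDs and threshold values are made up:

```sql
-- Illustrative only: enroll entity 42 in interval scheduling,
-- firing every 15 minutes.
INSERT INTO SchedulerConfigurations
    (EntityId, ProcessingMode, ScheduleType, IntervalMinutes, IsEnabled)
VALUES
    (42, 'scheduled', 'interval', 15, 1);

-- A triggered entity: fire at 1,000 pending rows, or after rows
-- have sat unprocessed for 30 minutes, whichever comes first.
INSERT INTO SchedulerConfigurations
    (EntityId, ProcessingMode, TriggerRowThreshold, TriggerIdleMinutes)
VALUES
    (43, 'triggered', 1000, 30);
```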

The log table is intentionally flat:

CREATE TABLE SchedulerLog (
    Id              INT IDENTITY(1,1) PRIMARY KEY,
    EntityId        INT NOT NULL,
    EventTime       DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
    EventType       NVARCHAR(30) NOT NULL,
        -- 'fired' | 'skipped' | 'error' | 'completed' | 'zombie'
    TriggerSource   NVARCHAR(100) NULL,
    Message         NVARCHAR(500) NULL,
    RowsQueued      INT NULL,
    DurationMs      INT NULL,
    BatchId         INT NULL
);

Five event types cover every state transition. fired means the dispatcher decided to start a batch. skipped means the overlap guard stopped it. zombie means a batch was stuck for over 30 minutes and got forcibly released. DurationMs tracks how long each batch took, so you can spot performance regressions before they become outages.
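For example, a simple query against the log surfaces the slowest recent batches (the 7-day window and TOP 20 are arbitrary choices):

```sql
-- Slowest completed batches in the last 7 days, worst first.
SELECT TOP (20)
    EntityId,
    EventTime,
    TriggerSource,
    RowsQueued,
    DurationMs
FROM SchedulerLog
WHERE EventType = 'completed'
  AND EventTime >= DATEADD(DAY, -7, SYSUTCDATETIME())
ORDER BY DurationMs DESC;
```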

The heartbeat: SQL Agent

The scheduler has no daemon, no Windows service, no cron. It is a single SQL Agent job with two steps:

SQL Agent job: Primentra_Scheduler

Step 1: EXEC usp_Scheduler_Dispatch
Evaluate all entity schedules, fire batches. On failure: continue to step 2.

Step 2: EXEC usp_Maintenance_Cleanup
Purge old audit log, scheduler log, staging imports. On failure: quit.

Schedule: freq_subday_type = 4, freq_subday_interval = 1 (every 60 seconds, 24/7)

Step 1 always proceeds to step 2 regardless of outcome (on_fail_action = 3). A dispatcher failure should not prevent cleanup from running. Step 2 reads retention settings from the AppSettings table and purges old data accordingly — scheduler log, audit log, error log, staging imports.

Why SQL Agent instead of a standalone service? Because it is already running on every SQL Server instance. No additional binaries to deploy, no service accounts to configure, no containers to manage. If SQL Server is up, the scheduler is running.
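A job like this can be scripted with the standard msdb procedures. The following is a sketch of what such a setup script might look like, not the installer's exact code; the database name PrimentraDB is a placeholder:

```sql
USE msdb;

EXEC sp_add_job @job_name = N'Primentra_Scheduler';

-- Step 1: dispatch. on_fail_action = 3 means "go to the next step",
-- so a dispatcher failure never blocks cleanup.
EXEC sp_add_jobstep
    @job_name          = N'Primentra_Scheduler',
    @step_name         = N'Dispatch',
    @subsystem         = N'TSQL',
    @database_name     = N'PrimentraDB',   -- placeholder name
    @command           = N'EXEC usp_Scheduler_Dispatch;',
    @on_success_action = 3,                -- go to next step
    @on_fail_action    = 3;                -- go to next step

-- Step 2: cleanup. Defaults quit the job whatever the outcome.
EXEC sp_add_jobstep
    @job_name      = N'Primentra_Scheduler',
    @step_name     = N'Cleanup',
    @subsystem     = N'TSQL',
    @database_name = N'PrimentraDB',
    @command       = N'EXEC usp_Maintenance_Cleanup;';

-- Every minute, 24/7: freq_subday_type 4 = unit of minutes.
EXEC sp_add_jobschedule
    @job_name             = N'Primentra_Scheduler',
    @name                 = N'Every60Seconds',
    @freq_type            = 4,   -- daily
    @freq_interval        = 1,
    @freq_subday_type     = 4,   -- minutes
    @freq_subday_interval = 1;   -- every 1 minute

-- Attach the job to the local server so it actually runs.
EXEC sp_add_jobserver @job_name = N'Primentra_Scheduler';
```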

The dispatch loop

usp_Scheduler_Dispatch is the core of the system. Every 60 seconds, SQL Agent calls it. Here is what happens, in order:

Dispatch flow per cycle

1. APPLOCK_TEST: is another dispatcher running? If so, exit immediately.
2. Snapshot all enabled entities into #DispatchQueue.
3. For each entity in the queue:
4. IsRunning = 1 and started more than 30 minutes ago? Release the zombie, log it, and continue evaluating.
5. IsRunning = 1 and started less than 30 minutes ago? Log a skip and jump to the next entity.
6. Count pending rows in stg.[TableName].
7. Evaluate the mode (scheduled or triggered).
8. Conditions met? EXEC usp_Staging_ProcessBatch.
9. Log the result (fired / completed / error).

Step 1: The APPLOCK guard

SQL Agent runs on a fixed schedule. If a dispatch cycle takes 90 seconds and the interval is 60 seconds, the next cycle starts while the first one is still processing. Without protection, you get two dispatchers evaluating the same entities at the same time — double-firing batches, corrupted overlap flags, race conditions on IsRunning.

The fix is sp_getapplock — SQL Server's application-level advisory lock:

-- Fast-fail: if another dispatcher is already running, exit immediately.
-- APPLOCK_TEST returns 0 when the lock is NOT available.
IF APPLOCK_TEST('public', 'Primentra_Scheduler_Dispatch',
                'Exclusive', 'Session') = 0
BEGIN
    PRINT 'Dispatcher already running — skipping this cycle.';
    RETURN;
END;

-- Acquire the lock with zero timeout (fail immediately, don't wait).
-- APPLOCK_TEST and sp_getapplock are two separate steps, so another
-- session can still win the race in between; check the return code.
DECLARE @LockResult INT;
EXEC @LockResult = sp_getapplock
    @Resource    = 'Primentra_Scheduler_Dispatch',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Session',
    @LockTimeout = 0;

IF @LockResult < 0
    RETURN;  -- Lost the race; another dispatcher holds the lock.

Two important details. First, APPLOCK_TEST is a non-blocking pre-check, so the common case (another cycle still running) exits cheaply before sp_getapplock is ever called. The test and the acquire are still two separate steps, which is why the acquire uses a zero timeout: if another session wins the race in between, sp_getapplock fails immediately instead of queuing behind the lock. Second, the lock is session-scoped: the dispatcher releases it explicitly at the end of the cycle, and if the session dies first, SQL Server drops the lock automatically on disconnect. No manual cleanup required after a crash.

Step 2: Building the dispatch queue

The dispatcher snapshots all enabled, non-manual entities into a temp table before processing. This prevents issues with config changes mid-cycle — if someone switches an entity from "scheduled" to "manual" while the dispatcher is running, the cursor has already captured the snapshot and will finish processing the current state cleanly.

INSERT INTO #DispatchQueue (...)
SELECT sc.EntityId, e.Name, sc.ProcessingMode, sc.ScheduleType, ...
FROM SchedulerConfigurations sc
INNER JOIN Entities e ON e.Id = sc.EntityId
WHERE sc.IsEnabled = 1
  AND sc.ProcessingMode IN ('scheduled', 'triggered')
  AND sc.EntityId IS NOT NULL;

Step 3: Per-entity processing with error isolation

Each entity is processed inside its own TRY/CATCH. If entity A throws an error during batch processing, entity B still runs. The error handler clears the IsRunning flag (so the entity is not permanently locked) and recalculates NextScheduledRun for scheduled entities — without this, a failed scheduled entity would re-fire every 60 seconds instead of waiting for its next scheduled time.

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        -- ... overlap guard, trigger evaluation, ProcessBatch ...
    END TRY
    BEGIN CATCH
        DECLARE @ErrMsg NVARCHAR(500) = ERROR_MESSAGE();
        BEGIN TRY
            -- Clear IsRunning so entity isn't stuck forever
            UPDATE SchedulerConfigurations
            SET IsRunning = 0, RunStartedAt = NULL
            WHERE EntityId = @EntityId;

            -- For scheduled entities: recalculate next run
            -- (prevents re-firing every 60s on persistent errors)
            IF @ProcessingMode = 'scheduled'
            BEGIN
                EXEC usp_Scheduler_CalculateNextRun ...
                    @NextRun = @ErrNextRun OUTPUT;
                UPDATE SchedulerConfigurations
                SET NextScheduledRun = @ErrNextRun
                WHERE EntityId = @EntityId;
            END;

            INSERT INTO SchedulerLog (EntityId, EventType, Message)
            VALUES (@EntityId, 'error', @ErrMsg);
        END TRY
        BEGIN CATCH
            -- Swallow secondary errors — main loop must continue
        END CATCH;
    END CATCH;
    NextEntity:
    FETCH NEXT FROM EntityCursor INTO ...
END;

The nested TRY/CATCH inside the error handler is deliberate. If the SchedulerLog INSERT itself fails (disk full, constraint violation), the dispatcher still moves to the next entity. You do not want error logging to crash the error handler.

Overlap guard and zombie recovery

APPLOCK prevents two dispatchers from running at the same time. But it does not prevent the dispatcher from firing a second batch for the same entity while the first batch is still running. That is the job of the IsRunning flag.

Overlap guard decision tree

- IsRunning = 0: proceed to trigger evaluation.
- IsRunning = 1 and RunStartedAt more than 30 minutes ago: release the zombie, log a 'zombie' event, and continue evaluation.
- IsRunning = 1 and RunStartedAt within the last 30 minutes: log 'skipped' and move on to the next entity.

The zombie timeout is 30 minutes. If IsRunning = 1 and RunStartedAt is more than 30 minutes old, the dispatcher assumes the batch crashed without cleaning up. It resets the flag and logs a zombie event:

IF @IsRunning = 1
BEGIN
    IF @RunStartedAt < DATEADD(MINUTE, -30, SYSUTCDATETIME())
    BEGIN
        -- Zombie: batch stuck for over 30 minutes — release and log
        UPDATE SchedulerConfigurations
        SET IsRunning = 0, RunStartedAt = NULL
        WHERE EntityId = @EntityId;

        INSERT INTO SchedulerLog (EntityId, EventType, Message)
        VALUES (@EntityId, 'zombie',
            '[' + @EntityName + '] Zombie batch released after '
            + '30-minute timeout.');
        SET @IsRunning = 0;
        -- Fall through: zombie released, now evaluate whether to fire
    END
    ELSE
    BEGIN
        -- Live running batch — skip to avoid overlap
        INSERT INTO SchedulerLog (EntityId, EventType, Message)
        VALUES (@EntityId, 'skipped',
            '[' + @EntityName + '] Batch already running (overlap guard).');
        GOTO NextEntity;
    END;
END;

After releasing a zombie, the dispatcher falls through to the trigger evaluation. If conditions are still met (rows are still pending), it fires a new batch. If the zombie was caused by a transient issue — a deadlock, a tempdb autogrow — the system self-heals without human intervention.

Processing modes

Each entity has one of three modes. The dispatcher only evaluates entities in scheduled or triggered mode.

Manual

User clicks "Process" in the UI. The dispatcher ignores this entity.

Scheduled

Fire at fixed times. Four schedule types: interval (every N minutes), daily (at HH:mm), weekly (specific days), monthly (specific dates).

Triggered

Fire when data conditions are met. Three trigger types: row threshold, idle timeout, new row detection with debounce.

Scheduled mode

When mode is scheduled, the dispatcher checks if NextScheduledRun has passed. If it has: set IsRunning = 1, log a "fired" event, call usp_Staging_ProcessBatch, then recalculate the next run time using usp_Scheduler_CalculateNextRun.

The CalculateNextRun procedure handles all four schedule types. For interval: add N minutes to the last run time (or now, if never run). For daily: walk each comma-separated RunTimes value and find the next future HH:mm slot today or tomorrow. For weekly: scan the next 7 days, match against RunDays, and apply each RunTimes slot. For monthly: scan up to 32 days, match against RunDaysOfMonth.

| Schedule type | Config fields | Example |
| --- | --- | --- |
| interval | IntervalMinutes | Every 15 minutes |
| daily | RunTimes | 06:00, 12:00, 18:00 |
| weekly | RunDays + RunTimes | Mon, Wed, Fri at 02:00 |
| monthly | RunDaysOfMonth + RunTimes | 1st and 15th at 03:00 |
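The interval branch is the simplest to illustrate. Here is a hedged sketch of how that calculation might look; the real usp_Scheduler_CalculateNextRun handles all four types, and whether it skips missed slots after downtime is an assumption on our part:

```sql
-- Sketch of the 'interval' branch: next run = last run + N minutes.
-- Sample values are illustrative.
DECLARE @LastRun  DATETIME2 = '2026-03-02 05:50:00',
        @Interval INT       = 15,
        @Now      DATETIME2 = SYSUTCDATETIME(),
        @NextRun  DATETIME2;

-- Anchor to now when the entity has never run before.
SET @NextRun = DATEADD(MINUTE, @Interval, ISNULL(@LastRun, @Now));

-- If the computed slot is already in the past (server was down,
-- batch overran), move to the next future slot rather than firing
-- a backlog of catch-up runs. This skip behavior is an assumption.
IF @NextRun < @Now
    SET @NextRun = DATEADD(MINUTE, @Interval, @Now);
```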

Triggered mode

Triggered mode is data-driven. Before evaluating triggers, the dispatcher counts pending rows in the entity's staging table and measures the age of the oldest row:

DECLARE @QueueSQL NVARCHAR(500) = N'
    SELECT @PendingRows  = COUNT(*),
           @OldestRowAge = ISNULL(
               DATEDIFF(MINUTE, MIN(CreatedAt), SYSUTCDATETIME()), 0)
    FROM stg.' + QUOTENAME(@SafeName) + N'
    WHERE ImportStatus = 0;';
EXEC sp_executesql @QueueSQL,
    N'@PendingRows INT OUTPUT, @OldestRowAge INT OUTPUT',
    @PendingRows OUTPUT, @OldestRowAge OUTPUT;

Dynamic SQL is necessary here because each entity has a different staging table name. QUOTENAME prevents SQL injection. ImportStatus = 0 filters for unprocessed rows only.

Then three conditions are evaluated in priority order:

1. Row threshold: @PendingRows >= @TriggerRowThreshold

Enough rows have accumulated to justify a batch. Good for high-volume feeds that deposit thousands of rows per hour.

2. Idle timeout: @PendingRows > 0 AND @OldestRowAge >= @TriggerIdleMinutes

Rows have been sitting unprocessed for too long. Catches the case where a slow trickle of rows never hits the threshold.

3. New rows + debounce: @TriggerOnNewRows = 1 AND @PendingRows > 0 AND debounce elapsed

Any new rows trigger processing, but with a configurable debounce window (default 60 seconds) to avoid firing on every single INSERT.

Priority matters: if both the threshold and idle timeout conditions are true, the threshold wins. The TriggerSource value written to the log and to the batch record reflects which condition actually fired, making it easy to trace why a batch ran.

The debounce mechanism deserves a closer look. When TriggerOnNewRows = 1, the dispatcher checks if TriggerDebounceSeconds have elapsed since LastTriggerCheck. It updates LastTriggerCheck on every cycle regardless of whether it fires. The default debounce is 60 seconds — this prevents a rapid INSERT loop from spawning a batch on every dispatcher cycle.
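Put together, the evaluation reduces to an IF/ELSE IF chain, so only the highest-priority matching condition fires. This is a sketch using the column names from the schema above; the sample input values and the exact wording of the idle and new-row TriggerSource strings are illustrative:

```sql
-- Inputs: queue stats from the dynamic COUNT plus the entity's
-- config row. Sample values here are illustrative.
DECLARE @PendingRows            INT = 1200,
        @OldestRowAge           INT = 12,     -- minutes
        @TriggerRowThreshold    INT = 1000,
        @TriggerIdleMinutes     INT = 30,
        @TriggerOnNewRows       BIT = 0,
        @TriggerDebounceSeconds INT = 60,
        @LastTriggerCheck       DATETIME2 = NULL,
        @ShouldFire             BIT = 0,
        @TriggerSource          NVARCHAR(100);

-- Priority 1: row threshold wins when several conditions are true.
IF @TriggerRowThreshold IS NOT NULL
   AND @PendingRows >= @TriggerRowThreshold
    SELECT @ShouldFire = 1,
           @TriggerSource = 'Trigger: ' + CAST(@PendingRows AS NVARCHAR(20))
               + ' rows (threshold '
               + CAST(@TriggerRowThreshold AS NVARCHAR(20)) + ')';
-- Priority 2: idle timeout.
ELSE IF @TriggerIdleMinutes IS NOT NULL
   AND @PendingRows > 0
   AND @OldestRowAge >= @TriggerIdleMinutes
    SELECT @ShouldFire = 1, @TriggerSource = 'Trigger: idle timeout';
-- Priority 3: any new rows, debounced so a rapid INSERT loop
-- cannot fire a batch on every dispatcher cycle.
ELSE IF @TriggerOnNewRows = 1
   AND @PendingRows > 0
   AND (@LastTriggerCheck IS NULL
        OR DATEDIFF(SECOND, @LastTriggerCheck, SYSUTCDATETIME())
           >= @TriggerDebounceSeconds)
    SELECT @ShouldFire = 1, @TriggerSource = 'Trigger: new rows (debounced)';
```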

Logging

The scheduler logs five event types. Every state transition writes a row to SchedulerLog:

| EventType | When | Extra data |
| --- | --- | --- |
| fired | Conditions met, batch starting | TriggerSource, RowsQueued |
| completed | Batch finished successfully | DurationMs, TriggerSource |
| skipped | Overlap guard blocked a new batch | RunStartedAt of the blocking batch |
| error | Per-entity or top-level error | ERROR_MESSAGE() |
| zombie | Batch stuck >30 min, forcibly released | Original RunStartedAt |

Log cleanup is handled by usp_SchedulerLog_Cleanup, called as part of the maintenance step. It reads the retention period from AppSettings (default: 30 days) and deletes older rows. The index on (EntityId, EventTime DESC) keeps both the paginated UI queries and the cleanup DELETE fast.
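The shape of that cleanup is straightforward. This is a sketch under two assumptions of ours: that AppSettings is a key/value table, and that the setting is named SchedulerLogRetentionDays; the batched DELETE is a design choice, not necessarily what ships:

```sql
-- Read retention from AppSettings, defaulting to 30 days.
-- Key name and table shape are assumptions for illustration.
DECLARE @RetentionDays INT =
    ISNULL((SELECT TRY_CAST([Value] AS INT)
            FROM AppSettings
            WHERE [Key] = 'SchedulerLogRetentionDays'), 30);

DECLARE @Cutoff DATETIME2 = DATEADD(DAY, -@RetentionDays, SYSUTCDATETIME());

-- Delete in chunks so the purge never holds a long-running lock.
WHILE 1 = 1
BEGIN
    DELETE TOP (5000) FROM SchedulerLog
    WHERE EventTime < @Cutoff;
    IF @@ROWCOUNT = 0 BREAK;
END;
```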

The UI presents this log in a paginated table with color-coded event types. DBAs can also query SchedulerLog directly — it is a regular table, not hidden behind an abstraction.

Server-side config validation

The save procedure validates that mode-specific fields are actually supplied. If someone sends a direct API call with ProcessingMode = 'scheduled' but no ScheduleType, the SP rejects it:

IF @ProcessingMode = 'scheduled' AND @ScheduleType IS NULL
    THROW 50001,
        'ScheduleType is required when ProcessingMode is scheduled', 1;

IF @ProcessingMode = 'triggered'
   AND @TriggerOnNewRows = 0
   AND @TriggerRowThreshold IS NULL
   AND @TriggerIdleMinutes IS NULL
    THROW 50001,
        'At least one trigger must be configured', 1;

When switching modes, stale fields are cleared. If an entity switches from scheduled to manual, the old ScheduleType, RunTimes, and trigger fields are nulled out. This prevents a user from toggling back to "scheduled" months later and accidentally inheriting a stale configuration.
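A sketch of that field-clearing logic, using the column list from the schema above (the exact save-procedure body is an assumption):

```sql
-- When saving in 'manual' mode, null out everything mode-specific
-- so a later switch back cannot inherit stale settings.
UPDATE SchedulerConfigurations
SET ProcessingMode      = 'manual',
    ScheduleType        = NULL,
    IntervalMinutes     = NULL,
    RunTimes            = NULL,
    RunDays             = NULL,
    RunDaysOfMonth      = NULL,
    TriggerOnNewRows    = 0,
    TriggerRowThreshold = NULL,
    TriggerIdleMinutes  = NULL,
    NextScheduledRun    = NULL
WHERE EntityId = @EntityId;
```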

The test button

Before enabling a schedule, you probably want to know what would happen. usp_Scheduler_TestRun simulates one dispatcher cycle for a single entity. It evaluates all the same conditions — overlap guard, zombie check, queue stats, schedule evaluation, trigger conditions — but never fires a batch. Instead, it returns a result set:

SELECT
    @WouldFire      AS WouldFire,       -- BIT
    @Reason         AS Reason,          -- Human-readable explanation
    @PendingRows    AS PendingRows,     -- Current queue depth
    @OldestRowAge   AS OldestRowAge,    -- Minutes since oldest row
    @NextScheduledRun AS NextScheduledRun;

The Reason field explains exactly why a batch would or would not fire. "Next run not due yet. NextScheduledRun: 2026-03-02 06:00:00" or "Row threshold met: 1,200 rows >= threshold 1,000". When configuring a schedule for the first time, this test-before-you-commit workflow prevents accidental mass-processing.
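Invocation is a plain EXEC. The parameter name here is inferred from the result-set shape above; the actual signature may differ:

```sql
-- Dry-run the dispatcher logic for one entity; fires nothing.
-- (@EntityId parameter name is assumed; entity 42 is illustrative.)
EXEC usp_Scheduler_TestRun @EntityId = 42;
```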

Putting it all together

Complete invocation chain

SQL Agent fires every 60s
Step 1: EXEC usp_Scheduler_Dispatch
→ APPLOCK_TEST (fast-fail if locked)
→ sp_getapplock (acquire exclusive)
→ Snapshot entities into #DispatchQueue
→ FOR EACH entity:
→ Overlap guard + zombie recovery
→ Count pending rows (dynamic SQL)
→ Evaluate scheduled or triggered mode
→ EXEC usp_Staging_ProcessBatch (if conditions met)
→ Log result to SchedulerLog
→ Recalculate NextScheduledRun
→ sp_releaseapplock
Step 2: EXEC usp_Maintenance_Cleanup

Seven stored procedures, two tables, one SQL Agent job, zero external dependencies. The entire scheduler lives inside SQL Server. The application server exposes the configuration and log via REST endpoints for the UI, but it has no role in the dispatch logic itself. If the app server goes down, the scheduler keeps running. To see how the data model that the scheduler processes is structured, the models, entities, and attributes post covers the foundational concepts.

The safety layers are worth summarizing:

| Layer | Scope | Purpose |
| --- | --- | --- |
| APPLOCK | Dispatcher-level | Prevents two dispatcher cycles from running at the same time |
| IsRunning | Entity-level | Prevents a second batch for the same entity while one is in progress |
| Zombie timeout | Entity-level | Releases batches stuck for over 30 minutes |
| TRY/CATCH isolation | Entity-level | One entity's error does not stop other entities from processing |
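Those layers are easy to monitor. A quick health check counts events per type over the last day (the 24-hour window is an arbitrary choice); a sudden rise in skipped or zombie rows is usually the first sign that batches are running long:

```sql
-- Event counts per type for the last 24 hours.
SELECT EventType, COUNT(*) AS Events
FROM SchedulerLog
WHERE EventTime >= DATEADD(HOUR, -24, SYSUTCDATETIME())
GROUP BY EventType
ORDER BY Events DESC;
```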

Run it on your SQL Server

Everything described in this post ships with Primentra out of the box. The SQL Agent job is created during installation, the scheduler configuration is built into the entity management UI, and all seven stored procedures deploy with the setup script. The 60-day trial includes full access to all scheduling modes, trigger configuration, and the scheduler log.

