Primentra's staging tables let external systems — SSIS packages, ETL pipelines, raw SQL inserts — deposit data for review before it touches production. But somebody has to hit the "Process" button. For a handful of entities, that's fine. When you have thirty entities with data arriving from six different sources around the clock, manual processing stops being practical.
That's why we built the staging scheduler. It runs inside SQL Server itself — no external services, no message queues, no additional infrastructure. One SQL Agent job, one dispatcher stored procedure, and a per-entity configuration table. This post walks through every layer of the system: the tables, the dispatch loop, the concurrency guards, the trigger evaluation logic, and the logging. If you want to understand how staging rows actually flow into Primentra, the SQL Server staging table load strategies post covers the ingestion side.
Everything below is from our actual codebase, not pseudocode.
The data model
Two tables hold the entire scheduler state. SchedulerConfigurations stores one row per entity with the processing mode, schedule parameters, trigger thresholds, overlap flags, and run tracking. SchedulerLog is the audit trail — every time the dispatcher fires, skips, completes, errors, or releases a zombie, it writes a row here.
CREATE TABLE SchedulerConfigurations (
Id INT IDENTITY(1,1) PRIMARY KEY,
EntityId INT NULL REFERENCES Entities(Id),
-- Mode: 'manual' | 'scheduled' | 'triggered'
ProcessingMode NVARCHAR(20) NOT NULL DEFAULT 'manual',
-- Scheduled mode settings
ScheduleType NVARCHAR(20) NULL, -- 'interval'|'daily'|'weekly'|'monthly'
IntervalMinutes INT NULL,
RunTimes NVARCHAR(200) NULL, -- Comma-separated HH:mm
RunDays NVARCHAR(100) NULL, -- Mon,Tue,...
RunDaysOfMonth NVARCHAR(100) NULL, -- 1,15,...
-- Trigger mode settings
TriggerOnNewRows BIT NOT NULL DEFAULT 0,
TriggerRowThreshold INT NULL,
TriggerIdleMinutes INT NULL,
TriggerDebounceSeconds INT NOT NULL DEFAULT 60,
-- Overlap guard
IsRunning BIT NOT NULL DEFAULT 0,
RunStartedAt DATETIME2 NULL,
RunStartedBy NVARCHAR(50) NULL,
-- Schedule tracking
NextScheduledRun DATETIME2 NULL,
LastCompletedAt DATETIME2 NULL,
LastTriggerCheck DATETIME2 NULL,
IsEnabled BIT NOT NULL DEFAULT 1,
CONSTRAINT UQ_SchedulerConfig_EntityId UNIQUE (EntityId)
);

A few things to notice. ProcessingMode is the top-level switch: manual means the dispatcher ignores this entity entirely. RunStartedBy tags who started the batch — "Schedule", "Manual", or a trigger description like "Trigger: 500 rows (threshold 500)". The UNIQUE constraint on EntityId guarantees one config row per entity.
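To make the configuration concrete, here is what a triggered-mode row might look like. The EntityId value is purely illustrative:

```sql
-- Illustrative only: process entity 42 whenever 500 unprocessed rows
-- accumulate, using the default 60-second debounce.
INSERT INTO SchedulerConfigurations
    (EntityId, ProcessingMode, TriggerOnNewRows, TriggerRowThreshold, TriggerDebounceSeconds)
VALUES
    (42, 'triggered', 0, 500, 60);
```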
The log table is intentionally flat:
CREATE TABLE SchedulerLog (
Id INT IDENTITY(1,1) PRIMARY KEY,
EntityId INT NOT NULL,
EventTime DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
EventType NVARCHAR(30) NOT NULL,
-- 'fired' | 'skipped' | 'error' | 'completed' | 'zombie'
TriggerSource NVARCHAR(100) NULL,
Message NVARCHAR(500) NULL,
RowsQueued INT NULL,
DurationMs INT NULL,
BatchId INT NULL
);

Five event types cover every state transition. fired means the dispatcher decided to start a batch. skipped means the overlap guard stopped it. zombie means a batch was stuck for over 30 minutes and got forcibly released. DurationMs tracks how long each batch took, so you can spot performance regressions before they become outages.
The heartbeat: SQL Agent
The scheduler has no daemon, no Windows service, no cron. It is a single SQL Agent job with two steps: step 1 runs the dispatcher, and step 2 runs maintenance.
Step 1 always proceeds to step 2 regardless of outcome (on_fail_action = 3). A dispatcher failure should not prevent cleanup from running. Step 2 reads retention settings from the AppSettings table and purges old data accordingly: scheduler log, audit log, error log, staging imports.
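A job like this can be created with the standard msdb procedures. The sketch below shows the shape; the job name, step commands, and cleanup procedure call are illustrative, not the shipped installer script:

```sql
USE msdb;
EXEC dbo.sp_add_job @job_name = N'Primentra Scheduler';  -- name is illustrative

EXEC dbo.sp_add_jobstep
    @job_name         = N'Primentra Scheduler',
    @step_name        = N'Dispatch',
    @subsystem        = N'TSQL',
    @command          = N'EXEC dbo.usp_Scheduler_Dispatch;',
    @on_success_action = 3,   -- 3 = go to the next step
    @on_fail_action    = 3;   -- go to the next step even on failure

EXEC dbo.sp_add_jobstep
    @job_name  = N'Primentra Scheduler',
    @step_name = N'Maintenance',
    @subsystem = N'TSQL',
    @command   = N'EXEC dbo.usp_SchedulerLog_Cleanup;';  -- cleanup call is a sketch

EXEC dbo.sp_add_jobschedule
    @job_name            = N'Primentra Scheduler',
    @name                = N'Every minute',
    @freq_type           = 4,   -- daily
    @freq_interval       = 1,
    @freq_subday_type    = 4,   -- repeat in minute units
    @freq_subday_interval = 1;  -- every 1 minute

EXEC dbo.sp_add_jobserver @job_name = N'Primentra Scheduler';
```

The important flag is @on_fail_action = 3 on step 1, which is what makes cleanup run even when a dispatch cycle errors out.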
Why SQL Agent instead of a standalone service? Because it is already running on every SQL Server instance. No additional binaries to deploy, no service accounts to configure, no containers to manage. If SQL Server is up, the scheduler is running.
The dispatch loop
usp_Scheduler_Dispatch is the core of the system. Every 60 seconds, SQL Agent calls it. Here is what happens, in order:
Step 1: The APPLOCK guard
SQL Agent runs on a fixed schedule. If a dispatch cycle takes 90 seconds and the interval is 60 seconds, the next cycle starts while the first one is still processing. Without protection, you get two dispatchers evaluating the same entities at the same time — double-firing batches, corrupted overlap flags, race conditions on IsRunning.
The fix is sp_getapplock — SQL Server's application-level advisory lock:
-- Fast-fail: if another dispatcher is already running, exit immediately.
-- APPLOCK_TEST returns 0 when the lock is NOT available.
IF APPLOCK_TEST('public', 'Primentra_Scheduler_Dispatch',
'Exclusive', 'Session') = 0
BEGIN
PRINT 'Dispatcher already running — skipping this cycle.';
RETURN;
END;
-- Acquire the lock with zero timeout (fail immediately, don't wait)
EXEC sp_getapplock
@Resource = 'Primentra_Scheduler_Dispatch',
@LockMode = 'Exclusive',
@LockOwner = 'Session',
@LockTimeout = 0;

Two important details. First, APPLOCK_TEST is a non-blocking check that runs before we attempt to acquire, so the common "another dispatcher is running" case exits cleanly. Second, the lock is session-owned: SQL Server releases it automatically when the session ends (a SQL Agent job step's session ends when the step finishes), so no manual cleanup is required even if the procedure crashes mid-cycle.
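If you reuse this pattern in a longer-lived session, note that a session-owned applock persists until the session disconnects, so you may want to release it explicitly:

```sql
-- Optional explicit release for sessions that outlive the dispatch cycle.
-- (A SQL Agent step's session ends when the step completes, so the
-- dispatcher itself can rely on automatic cleanup.)
EXEC sp_releaseapplock
    @Resource  = 'Primentra_Scheduler_Dispatch',
    @LockOwner = 'Session';
```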
Step 2: Building the dispatch queue
The dispatcher snapshots all enabled, non-manual entities into a temp table before processing. This prevents issues with config changes mid-cycle — if someone switches an entity from "scheduled" to "manual" while the dispatcher is running, the cursor has already captured the snapshot and will finish processing the current state cleanly.
INSERT INTO #DispatchQueue (...)
SELECT sc.EntityId, e.Name, sc.ProcessingMode, sc.ScheduleType, ...
FROM SchedulerConfigurations sc
INNER JOIN Entities e ON e.Id = sc.EntityId
WHERE sc.IsEnabled = 1
AND sc.ProcessingMode IN ('scheduled', 'triggered')
AND sc.EntityId IS NOT NULL;

Step 3: Per-entity processing with error isolation
Each entity is processed inside its own TRY/CATCH. If entity A throws an error during batch processing, entity B still runs. The error handler clears the IsRunning flag (so the entity is not permanently locked) and recalculates NextScheduledRun for scheduled entities — without this, a failed scheduled entity would re-fire every 60 seconds instead of waiting for its next scheduled time.
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
-- ... overlap guard, trigger evaluation, ProcessBatch ...
END TRY
BEGIN CATCH
DECLARE @ErrMsg NVARCHAR(500) = ERROR_MESSAGE();
BEGIN TRY
-- Clear IsRunning so entity isn't stuck forever
UPDATE SchedulerConfigurations
SET IsRunning = 0, RunStartedAt = NULL
WHERE EntityId = @EntityId;
-- For scheduled entities: recalculate next run
-- (prevents re-firing every 60s on persistent errors)
IF @ProcessingMode = 'scheduled'
BEGIN
EXEC usp_Scheduler_CalculateNextRun ...
@NextRun = @ErrNextRun OUTPUT;
UPDATE SchedulerConfigurations
SET NextScheduledRun = @ErrNextRun
WHERE EntityId = @EntityId;
END;
INSERT INTO SchedulerLog (EntityId, EventType, Message)
VALUES (@EntityId, 'error', @ErrMsg);
END TRY
BEGIN CATCH
-- Swallow secondary errors — main loop must continue
END CATCH;
END CATCH;
NextEntity:
FETCH NEXT FROM EntityCursor INTO ...
END;

The nested TRY/CATCH inside the error handler is deliberate. If the SchedulerLog INSERT itself fails (disk full, constraint violation), the dispatcher still moves to the next entity. You do not want error logging to crash the error handler.
Overlap guard and zombie recovery
APPLOCK prevents two dispatchers from running at the same time. But it does not prevent the dispatcher from firing a second batch for the same entity while the first batch is still running. That is the job of the IsRunning flag.
The zombie timeout is 30 minutes. If IsRunning = 1 and RunStartedAt is more than 30 minutes old, the dispatcher assumes the batch crashed without cleaning up. It resets the flag and logs a zombie event:
IF @IsRunning = 1
BEGIN
IF @RunStartedAt < DATEADD(MINUTE, -30, SYSUTCDATETIME())
BEGIN
-- Zombie: batch stuck for over 30 minutes — release and log
UPDATE SchedulerConfigurations
SET IsRunning = 0, RunStartedAt = NULL
WHERE EntityId = @EntityId;
INSERT INTO SchedulerLog (EntityId, EventType, Message)
VALUES (@EntityId, 'zombie',
'[' + @EntityName + '] Zombie batch released after '
+ '30-minute timeout.');
SET @IsRunning = 0;
-- Fall through: zombie released, now evaluate whether to fire
END
ELSE
BEGIN
-- Live running batch — skip to avoid overlap
INSERT INTO SchedulerLog (EntityId, EventType, Message)
VALUES (@EntityId, 'skipped',
'[' + @EntityName + '] Batch already running (overlap guard).');
GOTO NextEntity;
END;
END;

After releasing a zombie, the dispatcher falls through to the trigger evaluation. If conditions are still met (rows are still pending), it fires a new batch. If the zombie was caused by a transient issue — a deadlock, a tempdb autogrow — the system self-heals without human intervention.
Processing modes
Each entity has one of three modes. The dispatcher only evaluates entities in scheduled or triggered mode.
Manual: the user clicks "Process" in the UI. The dispatcher ignores this entity.
Scheduled: fire at fixed times. Four schedule types: interval (every N minutes), daily (at HH:mm), weekly (specific days), monthly (specific dates).
Triggered: fire when data conditions are met. Three trigger types: row threshold, idle timeout, new row detection with debounce.
Scheduled mode
When mode is scheduled, the dispatcher checks if NextScheduledRun has passed. If it has: set IsRunning = 1, log a "fired" event, call usp_Staging_ProcessBatch, then recalculate the next run time using usp_Scheduler_CalculateNextRun.
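Condensed, that sequence looks roughly like the sketch below. The parameter names on the two procedure calls are assumptions about the shipped signatures, not verbatim code:

```sql
-- Simplified sketch of the scheduled-mode fire path.
IF @NextScheduledRun IS NOT NULL AND @NextScheduledRun <= SYSUTCDATETIME()
BEGIN
    -- Claim the entity so overlapping cycles skip it.
    UPDATE SchedulerConfigurations
    SET IsRunning = 1,
        RunStartedAt = SYSUTCDATETIME(),
        RunStartedBy = 'Schedule'
    WHERE EntityId = @EntityId;

    INSERT INTO SchedulerLog (EntityId, EventType, TriggerSource)
    VALUES (@EntityId, 'fired', 'Schedule');

    EXEC usp_Staging_ProcessBatch @EntityId = @EntityId;  -- parameter name assumed

    EXEC usp_Scheduler_CalculateNextRun                   -- parameter names assumed
        @EntityId = @EntityId,
        @NextRun  = @NextRun OUTPUT;

    -- Release the entity and record the run.
    UPDATE SchedulerConfigurations
    SET IsRunning = 0,
        RunStartedAt = NULL,
        LastCompletedAt = SYSUTCDATETIME(),
        NextScheduledRun = @NextRun
    WHERE EntityId = @EntityId;
END;
```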
The CalculateNextRun procedure handles all four schedule types. For interval: add N minutes to the last run time (or now, if never run). For daily: walk each comma-separated RunTimes value and find the next future HH:mm slot today or tomorrow. For weekly: scan the next 7 days, match against RunDays, and apply each RunTimes slot. For monthly: scan up to 32 days, match against RunDaysOfMonth.
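The daily case, for example, can be expressed as a set operation over the comma-separated RunTimes list. This is a standalone sketch under the schema above, not the shipped procedure body:

```sql
-- Sketch: next 'daily' run from RunTimes = '06:00,18:00'.
-- Build today's slot for each HH:mm; if it is already past, use tomorrow's.
-- STRING_SPLIT requires SQL Server 2016+.
DECLARE @RunTimes NVARCHAR(200) = N'06:00,18:00';
DECLARE @Now      DATETIME2     = SYSUTCDATETIME();
DECLARE @NextRun  DATETIME2;

SELECT TOP (1)
       @NextRun = CASE WHEN Slot > @Now THEN Slot
                       ELSE DATEADD(DAY, 1, Slot) END
FROM (
    -- Midnight today plus the minutes-since-midnight of each HH:mm value.
    SELECT DATEADD(MINUTE,
                   DATEDIFF(MINUTE, CAST('00:00' AS TIME),
                            CAST(LTRIM(value) AS TIME)),
                   CAST(CAST(@Now AS DATE) AS DATETIME2)) AS Slot
    FROM STRING_SPLIT(@RunTimes, N',')
) AS Slots
ORDER BY CASE WHEN Slot > @Now THEN Slot
              ELSE DATEADD(DAY, 1, Slot) END;
```

The weekly and monthly cases follow the same idea, with an extra filter on the candidate dates before picking the earliest slot.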
Triggered mode
Triggered mode is data-driven. Before evaluating triggers, the dispatcher counts pending rows in the entity's staging table and measures the age of the oldest row:
DECLARE @QueueSQL NVARCHAR(500) = N'
SELECT @PendingRows = COUNT(*),
@OldestRowAge = ISNULL(
DATEDIFF(MINUTE, MIN(CreatedAt), SYSUTCDATETIME()), 0)
FROM stg.' + QUOTENAME(@SafeName) + N'
WHERE ImportStatus = 0;';
EXEC sp_executesql @QueueSQL,
N'@PendingRows INT OUTPUT, @OldestRowAge INT OUTPUT',
@PendingRows OUTPUT, @OldestRowAge OUTPUT;

Dynamic SQL is necessary here because each entity has a different staging table name. QUOTENAME prevents SQL injection. ImportStatus = 0 filters for unprocessed rows only.
Then three conditions are evaluated in priority order:
1. @PendingRows >= @TriggerRowThreshold: enough rows have accumulated to justify a batch. Good for high-volume feeds that deposit thousands of rows per hour.
2. @PendingRows > 0 AND @OldestRowAge >= @TriggerIdleMinutes: rows have been sitting unprocessed for too long. Catches the case where a slow trickle of rows never hits the threshold.
3. @TriggerOnNewRows = 1 AND @PendingRows > 0 AND debounce elapsed: any new rows trigger processing, but with a configurable debounce window (default 60 seconds) to avoid firing on every single INSERT.
Priority matters: if both the threshold and idle timeout conditions are true, the threshold wins. The TriggerSource value written to the log and to the batch record reflects which condition actually fired, making it easy to trace why a batch ran.
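Written as a plain IF/ELSE chain, the priority becomes explicit. A simplified sketch with assumed local variable names:

```sql
DECLARE @Fire BIT = 0,
        @TriggerSource NVARCHAR(100);

-- 1. Row threshold wins when configured and met.
IF @TriggerRowThreshold IS NOT NULL
   AND @PendingRows >= @TriggerRowThreshold
BEGIN
    SET @Fire = 1;
    SET @TriggerSource = CONCAT('Trigger: ', @PendingRows,
                                ' rows (threshold ', @TriggerRowThreshold, ')');
END
-- 2. Then idle timeout: pending rows older than the allowed age.
ELSE IF @TriggerIdleMinutes IS NOT NULL
     AND @PendingRows > 0
     AND @OldestRowAge >= @TriggerIdleMinutes
BEGIN
    SET @Fire = 1;
    SET @TriggerSource = CONCAT('Trigger: idle ', @OldestRowAge, ' min');
END
-- 3. Finally, any-new-rows with the debounce window elapsed.
ELSE IF @TriggerOnNewRows = 1
     AND @PendingRows > 0
     AND DATEDIFF(SECOND, ISNULL(@LastTriggerCheck, '1900-01-01'),
                  SYSUTCDATETIME()) >= @TriggerDebounceSeconds
BEGIN
    SET @Fire = 1;
    SET @TriggerSource = 'Trigger: new rows (debounce elapsed)';
END;
```

Whichever branch fires is the TriggerSource recorded in the log and on the batch.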
The debounce mechanism deserves a closer look. When TriggerOnNewRows = 1, the dispatcher checks if TriggerDebounceSeconds have elapsed since LastTriggerCheck. It updates LastTriggerCheck on every cycle regardless of whether it fires. The default debounce is 60 seconds — this prevents a rapid INSERT loop from spawning a batch on every dispatcher cycle.
Logging
Every state transition writes a row to SchedulerLog, tagged with one of the five event types: fired, skipped, completed, error, or zombie.
Log cleanup is handled by usp_SchedulerLog_Cleanup, called as part of the maintenance step. It reads the retention period from AppSettings (default: 30 days) and deletes older rows. The index on (EntityId, EventTime DESC) keeps both the paginated UI queries and the cleanup DELETE fast.
The UI presents this log in a paginated table with color-coded event types. DBAs can also query SchedulerLog directly — it is a regular table, not hidden behind an abstraction.
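For example, a quick health check against the columns defined above (the query itself is just an illustration):

```sql
-- Last 24 hours: error and zombie counts, plus the slowest batch, per entity.
SELECT EntityId,
       SUM(CASE WHEN EventType = 'error'  THEN 1 ELSE 0 END) AS Errors,
       SUM(CASE WHEN EventType = 'zombie' THEN 1 ELSE 0 END) AS Zombies,
       MAX(DurationMs) AS SlowestBatchMs
FROM SchedulerLog
WHERE EventTime >= DATEADD(HOUR, -24, SYSUTCDATETIME())
GROUP BY EntityId
ORDER BY Errors DESC, SlowestBatchMs DESC;
```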
Server-side config validation
The save procedure validates that mode-specific fields are actually supplied. If someone sends a direct API call with ProcessingMode = 'scheduled' but no ScheduleType, the SP rejects it:
IF @ProcessingMode = 'scheduled' AND @ScheduleType IS NULL
THROW 50001,
'ScheduleType is required when ProcessingMode is scheduled', 1;
IF @ProcessingMode = 'triggered'
AND @TriggerOnNewRows = 0
AND @TriggerRowThreshold IS NULL
AND @TriggerIdleMinutes IS NULL
THROW 50001,
'At least one trigger must be configured', 1;

When switching modes, stale fields are cleared. If an entity switches from scheduled to manual, the old ScheduleType, RunTimes, and trigger fields are nulled out. This prevents a user from toggling back to "scheduled" months later and accidentally inheriting a stale configuration.
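The clearing can live in the same UPDATE that saves the configuration. A sketch of the idea (the shipped save procedure may be structured differently):

```sql
-- Null out mode-specific fields that don't apply to the mode being saved.
UPDATE SchedulerConfigurations
SET ProcessingMode      = @ProcessingMode,
    ScheduleType        = CASE WHEN @ProcessingMode = 'scheduled'
                               THEN @ScheduleType ELSE NULL END,
    IntervalMinutes     = CASE WHEN @ProcessingMode = 'scheduled'
                               THEN @IntervalMinutes ELSE NULL END,
    RunTimes            = CASE WHEN @ProcessingMode = 'scheduled'
                               THEN @RunTimes ELSE NULL END,
    TriggerRowThreshold = CASE WHEN @ProcessingMode = 'triggered'
                               THEN @TriggerRowThreshold ELSE NULL END,
    TriggerIdleMinutes  = CASE WHEN @ProcessingMode = 'triggered'
                               THEN @TriggerIdleMinutes ELSE NULL END
WHERE EntityId = @EntityId;
```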
The test button
Before enabling a schedule, you probably want to know what would happen. usp_Scheduler_TestRun simulates one dispatcher cycle for a single entity. It evaluates all the same conditions — overlap guard, zombie check, queue stats, schedule evaluation, trigger conditions — but never fires a batch. Instead, it returns a result set:
SELECT
@WouldFire AS WouldFire, -- BIT
@Reason AS Reason, -- Human-readable explanation
@PendingRows AS PendingRows, -- Current queue depth
@OldestRowAge AS OldestRowAge, -- Minutes since oldest row
@NextScheduledRun AS NextScheduledRun;

The Reason field explains exactly why a batch would or would not fire: "Next run not due yet. NextScheduledRun: 2026-03-02 06:00:00" or "Row threshold met: 1,200 rows >= threshold 1,000". When configuring a schedule for the first time, this test-before-you-commit workflow prevents accidental mass-processing.
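Invocation is a single call; the parameter name here is an assumption about the shipped signature:

```sql
-- Dry-run the dispatcher's decision for one entity; no batch is fired.
EXEC usp_Scheduler_TestRun @EntityId = 42;  -- EntityId 42 is illustrative
```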
Putting it all together
Seven stored procedures, two tables, one SQL Agent job, zero external dependencies. The entire scheduler lives inside SQL Server. The application server exposes the configuration and log via REST endpoints for the UI, but it has no role in the dispatch logic itself. If the app server goes down, the scheduler keeps running. To see how the data model that the scheduler processes is structured, the models, entities, and attributes post covers the foundational concepts.
The safety layers are worth summarizing:
APPLOCK: one dispatcher cycle at a time.
IsRunning: one batch per entity at a time.
Zombie timeout: stuck batches forcibly released after 30 minutes.
TRY/CATCH isolation: one entity's failure never stops the others.

Run it on your SQL Server
Everything described in this post ships with Primentra out of the box. The SQL Agent job is created during installation, the scheduler configuration is built into the entity management UI, and all seven stored procedures deploy with the setup script. The 60-day trial includes full access to all scheduling modes, trigger configuration, and the scheduler log.