Data Freshness: SLAs, Monitoring, and Stale Data

June 25, 2026·Francisco Ferreira·9 min read

Data freshness is how recently a dataset was updated relative to its expected update interval. A fresh table received new data within the expected window; a stale one has not, even if the pipeline reported success and the dashboard loaded without error. A table with a 1-hour update rhythm that has not changed in 4 hours is stale: not because anything looks wrong, but because the expected rhythm was broken. That is the most dangerous property of stale data: it is invisible. Queries run, dashboards render, and the numbers look exactly as they looked last week, because they are last week's numbers.

Why stale data stays hidden until it drives a wrong decision

Pipeline success does not mean data freshness. A DAG that completes with a green checkmark can write zero rows if the upstream API returned an empty response. A batch job can finish without error and ingest records from a cache that stopped updating six hours ago. The orchestrator logs say "success." The destination table says nothing, because it has not changed.

The canonical stale-data failure pattern:

1. A source system changes: API format shift, credentials rotation, or schema update.
2. The pipeline runs and completes with zero rows written, or with wrong rows written.
3. The downstream metric holds its prior value or flatlines.
4. No alert fires, because no table-level threshold was breached.
5. A stakeholder asks about the plateau in a meeting, hours or days later.

Detection speed is the only variable that separates a 10-minute incident from a 72-hour one.

Freshness vs. latency vs. timeliness

Three terms that describe related but distinct properties of data pipelines:

Term	What it measures	Failure example
Freshness	Gap between now and the last table update, vs. the expected interval	Hourly table has not updated in 4 hours
Latency	Time from source event to when it is queryable at the destination	Purchase takes 45 minutes to appear in the warehouse
Timeliness	Whether data was available by a required business deadline	Revenue table did not update before the 9 AM finance review

A table can be fast (low latency) but stale (the pipeline stopped three hours ago). A table can be timely (it updated before 9 AM) while showing events from six hours ago. These are different failures, and each requires its own check.
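To make the distinction concrete, here is a minimal sketch that computes all three properties for one hypothetical morning. The function names and timestamps are illustrative assumptions, not part of any particular tool.

```python
from datetime import datetime, timedelta


def freshness_gap(last_update: datetime, now: datetime) -> timedelta:
    """Freshness: how long ago the table last changed."""
    return now - last_update


def latency(event_time: datetime, queryable_at: datetime) -> timedelta:
    """Latency: delay between a source event and its arrival in the table."""
    return queryable_at - event_time


def is_timely(available_at: datetime, deadline: datetime) -> bool:
    """Timeliness: was the data available by the business deadline?"""
    return available_at <= deadline


# Hypothetical day for an hourly table with a 9 AM review.
now = datetime(2026, 6, 25, 11, 30)
last_load = datetime(2026, 6, 25, 8, 30)     # the pipeline has not run since
newest_event = datetime(2026, 6, 25, 2, 30)  # newest event in that load
deadline = datetime(2026, 6, 25, 9, 0)

gap = freshness_gap(last_load, now)       # 3 hours: stale for an hourly table
lag = latency(newest_event, last_load)    # 6 hours between event and load
on_time = is_timely(last_load, deadline)  # True: loaded before 9 AM
```

In this example the table is timely, yet it is both stale and showing six-hour-old events, which is why each property needs its own check.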

The data freshness SLA framework

A freshness SLA is a written expectation: this table must update at least every X minutes and be available with current data by Y. Setting one requires knowing the table's normal rhythm and when its staleness causes a business problem.

This table gives starting SLAs for the five most common table types. Adjust for your specific pipeline behavior.

Table type	Normal update rhythm	Freshness SLA	Check frequency	Alert threshold
Events / clickstream	Continuous	30 minutes	Every 15 min	Last row > 45 min ago
Payments / transactions	Hourly	2 hours	Every 30 min	Last row > 2.5 hours ago
Daily aggregates (revenue, DAU)	Nightly batch	25 hours	Every 2 hours	Not updated by 8 AM
CRM sync (customers, accounts)	Every 4–6 hours	8 hours	Every 2 hours	Last row > 10 hours ago
Reference / dimension tables	Weekly or on-change	8 days	Daily	Not updated in 10 days

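One rule of thumb: set the SLA at 1.5–2× the expected update interval, not 1.0×. A table that updates every hour should alert at 90–120 minutes, not 61 minutes. False positives from normal variation train teams to ignore real alerts. The daily and weekly rows above use a smaller relative margin (25 hours, 8 days), so apply the multiplier mainly to tables that update hourly or more often.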
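The multiplier arithmetic can be sketched in a few lines. The helper name `alert_threshold` and its default multiplier are assumptions chosen for illustration.

```python
from datetime import timedelta


def alert_threshold(expected_interval: timedelta, multiplier: float = 1.5) -> timedelta:
    """Return the staleness gap at which an alert should fire."""
    if multiplier < 1.0:
        # A multiplier below 1.0 would alert before the next update is even due.
        raise ValueError("multiplier must be at least 1.0")
    return expected_interval * multiplier


hourly = timedelta(hours=1)
low = alert_threshold(hourly)        # 90 minutes
high = alert_threshold(hourly, 2.0)  # 120 minutes
```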

Four monitoring checks that cover freshness failures

Freshness monitoring is not a single check. It is a layered approach where each layer catches a different failure mode.

Last-updated timestamp check. Compare the most recent updated_at or created_at value against the current time. If the gap exceeds the SLA, alert. This is the baseline check: fast to implement, but blind to tables that lack a reliable timestamp column.
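A minimal sketch of the timestamp check, using an in-memory SQLite database so it runs anywhere. The `payments` table, the ISO-8601 text timestamps, and the choice to treat an empty table as stale are all assumptions; a real warehouse would use its own timestamp type and connection.

```python
import sqlite3
from datetime import datetime, timedelta


def is_stale(conn, table, ts_column, sla, now):
    """Return True when the newest timestamp is older than the SLA allows."""
    # Table and column names cannot be bound as parameters, so this sketch
    # assumes they come from trusted configuration, never from user input.
    (latest,) = conn.execute(f"SELECT MAX({ts_column}) FROM {table}").fetchone()
    if latest is None:
        return True  # assumption: an empty table counts as stale
    return now - datetime.fromisoformat(latest) > sla


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, updated_at TEXT)")
conn.executemany(
    "INSERT INTO payments (updated_at) VALUES (?)",
    [("2026-06-25T07:15:00",), ("2026-06-25T08:00:00",)],
)

sla = timedelta(hours=2)
print(is_stale(conn, "payments", "updated_at", sla, datetime(2026, 6, 25, 9, 30)))  # False
print(is_stale(conn, "payments", "updated_at", sla, datetime(2026, 6, 25, 11, 0)))  # True
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test and avoids mixing time zones by accident.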

Row count window check. Count rows inserted in the last N minutes. If zero rows arrived in a window that normally sees 500, something stopped. This is the silent-success check: it catches pipelines that complete without error but write no data, which the timestamp check can miss, for example when in-place updates to existing rows keep updated_at current.
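The row count window check can be sketched the same way. This example assumes a hypothetical `payments` table with a `created_at` column stored as ISO-8601 text in one consistent format, so that string comparison matches time order; the expected count of 1,200 is taken as given rather than learned.

```python
import sqlite3
from datetime import datetime, timedelta


def rows_in_window(conn, table, ts_column, window, now):
    """Count rows whose timestamp falls in the interval (now - window, now]."""
    start = (now - window).isoformat()
    end = now.isoformat()
    # Identifiers are assumed to come from trusted configuration.
    (count,) = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {ts_column} > ? AND {ts_column} <= ?",
        (start, end),
    ).fetchone()
    return count


def silent_success(rows_seen, expected_rows):
    """Alert when a window that normally sees activity received nothing."""
    return expected_rows > 0 and rows_seen == 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO payments (created_at) VALUES (?)",
    [("2026-06-24T01:10:00",), ("2026-06-24T01:40:00",), ("2026-06-24T01:55:00",)],
)

window = timedelta(minutes=60)
seen = rows_in_window(conn, "payments", "created_at", window, datetime(2026, 6, 24, 3, 5))
print(seen, silent_success(seen, expected_rows=1200))  # 0 True
```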

Pipeline heartbeat check. Monitor the pipeline itself, not just its destination table. If a scheduled job did not start in its expected window, or started but is running 3× longer than normal, you have an upstream signal before the staleness appears downstream.
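One way to sketch the heartbeat check, assuming the orchestrator can report a start and finish time for each run. The `JobRun` record, the status strings, and the 3× `slow_factor` default are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class JobRun:
    started_at: Optional[datetime] = None
    finished_at: Optional[datetime] = None


def heartbeat_status(run, window_start, window_end, typical_duration, now, slow_factor=3.0):
    """Classify a scheduled run as waiting, missed_start, running_long, or ok."""
    started_in_window = (
        run.started_at is not None and window_start <= run.started_at <= window_end
    )
    if not started_in_window:
        # No start yet: only a problem once the start window has closed.
        return "missed_start" if now > window_end else "waiting"
    # For a run still in progress, measure elapsed time up to now.
    elapsed = (run.finished_at or now) - run.started_at
    if elapsed > typical_duration * slow_factor:
        return "running_long"
    return "ok"


start = datetime(2026, 6, 24, 2, 0)
end = datetime(2026, 6, 24, 2, 15)
typical = timedelta(minutes=20)

never_started = heartbeat_status(JobRun(), start, end, typical, datetime(2026, 6, 24, 2, 30))
stuck = heartbeat_status(
    JobRun(started_at=datetime(2026, 6, 24, 2, 5)), start, end, typical,
    datetime(2026, 6, 24, 3, 10),
)
```

Here `never_started` is "missed_start" and `stuck` is "running_long", because 65 minutes of runtime exceeds three times the typical 20.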

Metric-level freshness check. Upstream tables can look fresh while a specific column stops receiving valid values, and the computed metric becomes wrong. Watch the metric output directly. A business metric monitor catches what table-level freshness checks miss.
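A simple way to watch the metric output directly is to flag a flatline: a computed value that has stopped moving across consecutive readings. This is only one possible signal, and the function name, `min_repeats`, and `tolerance` parameters are assumptions for illustration.

```python
def is_flatlined(values, min_repeats=3, tolerance=0.0):
    """Return True when the last `min_repeats` readings are effectively identical."""
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    if len(values) < min_repeats:
        return False  # not enough history to judge
    tail = values[-min_repeats:]
    return max(tail) - min(tail) <= tolerance


# Hypothetical hourly revenue readings.
moving = [4120.0, 4385.5, 4410.0, 4502.25, 4619.0]
stuck = [4120.0, 4385.5, 4410.0, 4410.0, 4410.0]

print(is_flatlined(moving))  # False
print(is_flatlined(stuck))   # True
```

A metric that legitimately stays constant (for example, overnight) would need a larger `min_repeats` or a schedule-aware rule to avoid false alarms.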

Before and after: what a freshness alert actually prevents

A fintech company runs its payments pipeline nightly, completing around 3 AM. The CFO reviews revenue at 9 AM.

Without freshness monitoring: On Tuesday night, a dependency service is temporarily unavailable from 2:30 to 3:15 AM. The pipeline job starts and completes successfully. Zero payment records are written. The log shows "0 rows processed." Not an error, just an empty result. Wednesday morning, the revenue dashboard shows Tuesday's value flat at Monday's. The CFO asks the data team to investigate. By 10:30 AM they trace it to Tuesday's empty load: more than seven hours after the failure, during which reports and decisions referenced wrong data.

With freshness monitoring: At 3:05 AM, an alert fires: "payments table received 0 rows in the last 60 minutes. Expected 1,200 based on Tuesday-night baseline. Pipeline returned success. Likely: dependency service failure. Diagnostic query attached." The on-call engineer investigates at 3:10 AM, triggers a backfill, and resolves the issue before 4 AM. The 9 AM review shows correct revenue.

The pipeline is identical in both scenarios. The only difference is the freshness check, which added minutes of setup and removed hours of incident response.

What freshness monitoring does not catch

A table can be completely fresh (updated 4 minutes ago, within SLA) while containing wrong data. A freshness check confirms the pipeline ran. It does not confirm the data is correct.

The most common gap: an upstream schema change causes a key column to populate as null for 70% of rows. The table's updated_at is current. The row count is normal. The freshness check passes. The null rate is 70% and rising, but that check was never configured.
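The missing check in this scenario is small. Below is a minimal null-rate sketch against an in-memory SQLite table; the `orders` table, the `customer_id` column, and the 5% threshold are hypothetical.

```python
import sqlite3


def null_rate(conn, table, column):
    """Return the fraction of rows where `column` is NULL (0.0 for an empty table)."""
    # COUNT(column) skips NULLs, so the difference from COUNT(*) is the null count.
    # Identifiers are assumed to come from trusted configuration.
    total, nulls = conn.execute(
        f"SELECT COUNT(*), COUNT(*) - COUNT({column}) FROM {table}"
    ).fetchone()
    return nulls / total if total else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
rows = [(None,)] * 7 + [(101,), (102,), (103,)]
conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", rows)

rate = null_rate(conn, "orders", "customer_id")
max_null_rate = 0.05  # hypothetical threshold
should_alert = rate > max_null_rate
```

With seven of ten rows null, `rate` is 0.7 and `should_alert` is true, even though the table would pass both the timestamp and row count checks.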

This is why data observability covers several signals, not one. Freshness tells you the pipeline ran. Null rate tells you whether what it wrote is complete. Row count anomaly tells you whether the volume is normal. Schema drift tells you whether the structure changed. Each check catches a different class of failure.

For tables that feed critical business metrics, monitor all four. For reference tables that change rarely, freshness alone is often sufficient.

How to start freshness monitoring without a data engineering team

The minimum viable freshness setup requires a read-only database connection and five decisions:

Pick five critical tables. The tables your core business metrics depend on. Not every table: just the ones where staleness causes a business problem within hours.
Confirm the actual update rhythm. Query the last 30 days of updated_at timestamps to see the real pattern. A "daily" table that actually runs at 2 AM, 6 AM, and noon needs a different SLA than one that runs once at midnight.
Define "stale enough to matter." This is a business decision. If the revenue table is an hour late but no one reviews it until 9 AM, the SLA should be "not updated by 8 AM," not "not updated in 61 minutes."
Route alerts where the team already works. Slack for async teams, PagerDuty for on-call. An alert requiring login to a separate tool at 3 AM will not be acted on.
Use a learned baseline, not a fixed threshold, for variable tables. Fixed thresholds on cyclical data fire false positives on weekends. A baseline that accounts for day-of-week and hour patterns eliminates the noise that trains teams to ignore real alerts.
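The learned baseline in the last step can be sketched as a median row count per weekday and hour. This is an illustrative approach, not a description of how any specific product learns baselines; the function names and the `min_fraction` cutoff are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def build_baseline(history):
    """Map (weekday, hour) to the median row count seen in that slot."""
    buckets = defaultdict(list)
    for ts, count in history:
        buckets[(ts.weekday(), ts.hour)].append(count)
    return {slot: median(counts) for slot, counts in buckets.items()}


def is_anomalously_low(baseline, ts, observed, min_fraction=0.5):
    """Alert when volume falls below a fraction of what this slot normally sees."""
    expected = baseline.get((ts.weekday(), ts.hour))
    if not expected:
        return False  # no history, or a slot that is normally quiet
    return observed < expected * min_fraction


# Hypothetical history: busy Tuesdays and silent Sundays at 2 AM.
history = [
    (datetime(2026, 6, 2, 2), 1180), (datetime(2026, 6, 9, 2), 1200),
    (datetime(2026, 6, 16, 2), 1220), (datetime(2026, 6, 23, 2), 1210),
    (datetime(2026, 6, 7, 2), 0), (datetime(2026, 6, 14, 2), 0),
    (datetime(2026, 6, 21, 2), 0), (datetime(2026, 6, 28, 2), 0),
]
baseline = build_baseline(history)

tuesday_alert = is_anomalously_low(baseline, datetime(2026, 6, 30, 2), 0)  # True
sunday_alert = is_anomalously_low(baseline, datetime(2026, 7, 5, 2), 0)    # False
```

The same zero-row observation alerts on a Tuesday and stays quiet on a Sunday, which is the weekend false positive a fixed threshold would have produced.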

See where your data quality stands today: the free 2-minute health check grades it A–F across freshness and four other dimensions, no account required. Or compare the data observability tools available in 2026 to find the right fit for your stack. Tabkeel connects read-only in under two minutes and starts monitoring freshness immediately. The Free plan monitors 10 tables, no card required.

Frequently asked questions

What is data freshness?

Data freshness is how recently a dataset was updated relative to its expected update interval. A table is fresh when it received new data within the expected window; it is stale when it has not, even if the pipeline reported success and no errors were logged.

What is a data freshness SLA?

A data freshness SLA is a written expectation of how often a table must be updated and by when it must be available. For example: "the payments table must update at least hourly and contain data from the last 2 hours." SLAs should be set at 1.5–2× the expected interval to avoid false positives from normal variation.

What is stale data?

Stale data is data that has not been updated within its expected freshness window. Queries still run, dashboards still load, but the numbers reflect a state of the world that no longer exists. The most dangerous stale data passes all standard quality checks while being hours or days out of date.

How do you detect stale data automatically?

Two reliable checks: (1) compare the most recent timestamp in the table against the current time and alert when the gap exceeds the SLA; (2) count rows inserted in a rolling window and alert when zero rows arrive in a period that normally sees activity. The second check catches the silent-success failure (the pipeline runs without error but writes nothing) that timestamp checks miss.

What is the difference between data freshness and data latency?

Data latency is the time between an event occurring and that event appearing in the destination. Data freshness is whether the data updated within its expected interval, regardless of pipeline speed. A fast pipeline still produces stale data if it stopped running. A slow pipeline produces fresh data if it ran on schedule.