Data Freshness: SLAs, Monitoring, and Stale Data
Data freshness is how recently a dataset was updated relative to its expected update interval. A fresh table received new data within the expected window; a stale one has not, even if the pipeline reported success and the dashboard loaded without error. The most dangerous property of stale data is that it is invisible: queries run, dashboards render, and the numbers look exactly as they looked last week, because they are last week's numbers.
Why stale data stays hidden until it drives a wrong decision
Pipeline success does not mean data freshness. A DAG that completes with a green checkmark can write zero rows if the upstream API returned an empty response. A batch job can finish without error and ingest records from a cache that stopped updating six hours ago. The orchestrator logs say "success." The destination table says nothing, because it has not changed.
The canonical stale-data failure pattern:
- A source system changes: API format shift, credentials rotation, schema update
- The pipeline runs and completes with zero rows written, or wrong rows written
- The downstream metric holds its prior value or flatlines
- No alert fires, because no table-level threshold was breached
- A stakeholder asks about the plateau in a meeting, hours or days later
Detection speed is the only variable that separates a 10-minute incident from a 72-hour one.
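One way to shorten detection is to stop treating a zero-row load as a success. The sketch below is a minimal illustration, not a feature of any particular orchestrator: `LoadResult`, `EmptyLoadError`, and `assert_rows_written` are hypothetical names, and the `min_rows` default is an assumption you would tune per table.

```python
from dataclasses import dataclass


class EmptyLoadError(RuntimeError):
    """Raised when a load finishes without error but writes too few rows."""


@dataclass
class LoadResult:
    # Hypothetical summary a pipeline step could return after loading.
    table: str
    rows_written: int


def assert_rows_written(result: LoadResult, min_rows: int = 1) -> LoadResult:
    """Turn a silent zero-row 'success' into an explicit failure."""
    if result.rows_written < min_rows:
        raise EmptyLoadError(
            f"{result.table}: wrote {result.rows_written} rows, "
            f"expected at least {min_rows}"
        )
    return result
```

Calling `assert_rows_written` as the last step of a load makes the run fail visibly when it writes too few rows, so the orchestrator's existing failure alerting fires. It does not replace table-level checks, because it only sees runs that actually execute.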
Freshness vs. latency vs. timeliness
Three terms describe related but distinct properties of data pipelines:
| Term | What it measures | Failure example |
|---|---|---|
| Freshness | Gap between now and the last table update, vs. the expected interval | Hourly table has not updated in 4 hours |
| Latency | Time from source event to when it is queryable at the destination | Purchase takes 45 minutes to appear in the warehouse |
| Timeliness | Whether data was available by a required business deadline | Revenue table did not update before the 9 AM finance review |
A table can be fast (low latency) but stale (the pipeline stopped three hours ago). A table can be timely (it updated before 9 AM) while showing events from six hours ago. These are different failures, and each requires its own check.
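The three properties can be expressed as three separate calculations. This is a rough sketch under simple assumptions: all timestamps are naive `datetime` values in the same time zone, and the function names are illustrative rather than taken from any tool.

```python
from datetime import datetime, timedelta


def freshness_gap(now: datetime, last_table_update: datetime) -> timedelta:
    """Freshness: how long ago the table last changed."""
    return now - last_table_update


def is_stale(now: datetime, last_table_update: datetime,
             expected_interval: timedelta) -> bool:
    """Stale when the gap since the last update exceeds the expected interval."""
    return freshness_gap(now, last_table_update) > expected_interval


def latency(event_time: datetime, queryable_at: datetime) -> timedelta:
    """Latency: time from the source event to when it can be queried."""
    return queryable_at - event_time


def is_timely(available_at: datetime, deadline: datetime) -> bool:
    """Timeliness: was the data available by the business deadline?"""
    return available_at <= deadline
```

Each function takes different inputs, which is the practical reason the checks cannot stand in for one another: freshness needs the last update time, latency needs event-level timestamps, and timeliness needs a deadline.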
The data freshness SLA framework
A freshness SLA is a written expectation: this table must update at least every X minutes and be available with current data by Y. Setting one requires knowing the table's normal rhythm and when its staleness causes a business problem.
This table gives starting SLAs for the five most common table types. Adjust for your specific pipeline behavior.
| Table type | Normal update rhythm | Freshness SLA | Check frequency | Alert threshold |
|---|---|---|---|---|
| Events / clickstream | Continuous | 30 minutes | Every 15 min | Last row > 45 min ago |
| Payments / transactions | Hourly | 2 hours | Every 30 min | Last row > 2.5 hours ago |
| Daily aggregates (revenue, DAU) | Nightly batch | 25 hours | Every 2 hours | Not updated by 8 AM |
| CRM sync (customers, accounts) | Every 4–6 hours | 8 hours | Every 2 hours | Last row > 10 hours ago |
| Reference / dimension tables | Weekly or on-change | 8 days | Daily | Not updated in 10 days |
One consistent rule: set the SLA at 1.5–2× the expected update interval, not 1.0×. A table that updates every hour should alert at 90–120 minutes, not 61 minutes. False positives from normal variation train teams to ignore real alerts.
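The 1.5–2× rule is simple arithmetic. A minimal sketch, assuming the expected interval is known as a `timedelta`; the function name and the range check are illustrative choices:

```python
from datetime import timedelta


def alert_threshold(expected_interval: timedelta,
                    multiplier: float = 1.5) -> timedelta:
    """Return the staleness gap at which to alert.

    The multiplier is kept within the 1.5-2x range so that normal
    run-to-run variation does not trigger the alert.
    """
    if not 1.5 <= multiplier <= 2.0:
        raise ValueError("multiplier should be between 1.5 and 2.0")
    return expected_interval * multiplier
```

For an hourly table, `alert_threshold(timedelta(hours=1))` gives 90 minutes and a multiplier of 2.0 gives 120 minutes, matching the range above.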
Four monitoring checks that cover freshness failures
Freshness monitoring is not a single check. It is a layered approach where each layer catches a different failure mode.
Compare the most recent updated_at or created_at value against the current time. If the gap exceeds the SLA, alert. This is the baseline check: fast to implement, but blind to tables that lack a reliable timestamp column.
Before and after: what a freshness alert actually prevents
A fintech company runs its payments pipeline nightly, completing around 3 AM. The CFO reviews revenue at 9 AM.
Without freshness monitoring: On Tuesday night, a dependency service is unavailable from 2:30 to 3:15 AM. The pipeline job starts and completes successfully, but writes zero payment records. The log shows "0 rows processed": not an error, just an empty result. On Wednesday morning, the revenue dashboard shows Tuesday's revenue unchanged from Monday's. The CFO asks the data team to investigate. By 10:30 AM they trace it to the empty load, more than seven hours after the failure, during which reports and decisions referenced wrong data.
With freshness monitoring: At 3:05 AM, an alert fires: "payments table received 0 rows in the last 60 minutes. Expected 1,200 based on Tuesday-night baseline. Pipeline returned success. Likely: dependency service failure. Diagnostic query attached." The on-call engineer investigates at 3:10 AM, triggers a backfill, and resolves the issue before 4 AM. The 9 AM review shows correct revenue.
The engineering effort is identical in both scenarios. The freshness check added minutes of setup and removed hours of incident response.
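The alert in the second scenario compares the rows received in a window against what that window normally receives. The sketch below shows one way such a comparison could work. It is an illustration under assumptions, not a description of any product's algorithm: the baseline is the median row count for the same weekday and hour, and `min_fraction` is a hypothetical tuning knob.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median
from typing import Dict, Iterable, Optional, Tuple


def build_baseline(
    hourly_counts: Iterable[Tuple[datetime, int]]
) -> Dict[Tuple[int, int], float]:
    """Median row count per (weekday, hour) slot from historical windows."""
    buckets = defaultdict(list)
    for window_start, row_count in hourly_counts:
        buckets[(window_start.weekday(), window_start.hour)].append(row_count)
    return {slot: median(counts) for slot, counts in buckets.items()}


def check_volume(baseline: Dict[Tuple[int, int], float],
                 window_start: datetime,
                 observed_rows: int,
                 min_fraction: float = 0.5) -> Optional[str]:
    """Return an alert message, or None if volume looks normal.

    Also returns None when there is no history for this slot,
    because there is nothing to compare against.
    """
    expected = baseline.get((window_start.weekday(), window_start.hour))
    if expected is None:
        return None
    if observed_rows < expected * min_fraction:
        return f"received {observed_rows} rows, expected about {expected:.0f}"
    return None
```

With three prior weeks of history near 1,200 rows for the same slot, a window that receives 0 rows produces an alert message, while a window with 1,150 rows does not.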
What freshness monitoring does not catch
A table can be completely fresh (updated 4 minutes ago, within SLA) while containing wrong data. A freshness check confirms the pipeline ran. It does not confirm the data is correct.
The most common gap: an upstream schema change causes a key column to populate as null for 70% of rows. The table's updated_at is current. The row count is normal. The freshness check passes. The null rate is 70% and rising, but that check was never configured.
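A null-rate check closes this gap. A minimal sketch, assuming rows are available as dictionaries (for example, a sample fetched from the table); the function names and the `max_rate` threshold are illustrative:

```python
from typing import Any, Iterable, Mapping


def null_rate(rows: Iterable[Mapping[str, Any]], column: str) -> float:
    """Fraction of rows where `column` is missing or None."""
    rows = list(rows)
    if not rows:
        return 0.0  # no rows: volume checks, not this one, should flag it
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / len(rows)


def null_rate_breached(rows: Iterable[Mapping[str, Any]], column: str,
                       max_rate: float) -> bool:
    """True when the null rate exceeds the configured ceiling."""
    return null_rate(rows, column) > max_rate
```

In the schema-change example, a key column that is null in 7 of every 10 rows yields a rate of 0.7, which breaches any reasonable ceiling even though the freshness check passes.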
This is why data observability covers five signals, not one. Freshness tells you the pipeline ran. Null rate tells you whether what it wrote is complete. Row count anomaly tells you whether the volume is normal. Schema drift tells you whether the structure changed. Each check catches a different class of failure.
For tables that feed critical business metrics, monitor all of these signals. For reference tables that change rarely, freshness alone is often sufficient.
How to start freshness monitoring without a data engineering team
The minimum viable freshness setup requires a read-only database connection and five decisions:
- Pick five critical tables. The tables your core business metrics depend on. Not every table: just the ones where staleness causes a business problem within hours.
- Confirm the actual update rhythm. Query the last 30 days of updated_at timestamps to see the real pattern. A "daily" table that actually runs at 2 AM, 6 AM, and noon needs a different SLA than one that runs once at midnight.
- Define "stale enough to matter." This is a business decision. If the revenue table is an hour late but no one reviews it until 9 AM, the SLA should be "not updated by 8 AM," not "not updated in 61 minutes."
- Route alerts where the team already works. Slack for async teams, PagerDuty for on-call. An alert requiring login to a separate tool at 3 AM will not be acted on.
- Use a learned baseline, not a fixed threshold, for variable tables. Fixed thresholds on cyclical data fire false positives on weekends. A baseline that accounts for day-of-week and hour patterns eliminates the noise that trains teams to ignore real alerts.
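For the "confirm the actual update rhythm" step, the pattern can be read from the gaps between distinct update timestamps. This is a rough sketch that assumes the timestamps have already been fetched into a list; `summarize_rhythm` and its output keys are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def update_gaps(timestamps: List[datetime]) -> List[timedelta]:
    """Gaps between consecutive distinct update times, oldest first."""
    ordered = sorted(set(timestamps))
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def summarize_rhythm(timestamps: List[datetime]) -> Optional[Dict[str, object]]:
    """Summarize how often and at which hours a table really updates."""
    gaps = sorted(update_gaps(timestamps))
    if not gaps:
        return None  # fewer than two distinct updates: no rhythm to measure
    return {
        "median_gap": gaps[len(gaps) // 2],  # upper median for even counts
        "max_gap": gaps[-1],
        "run_hours": sorted({ts.hour for ts in timestamps}),
    }
```

For a table that runs at 2 AM, 6 AM, and noon, the summary shows three run hours and a longest normal gap of 14 hours (noon to 2 AM). The longest normal gap, not a nominal "daily" label, is the interval the SLA should be based on.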
Frequently asked questions
What is data freshness?
Data freshness is how recently a dataset was updated relative to its expected update interval. A table is fresh when it received new data within the expected window; it is stale when it has not, even if the pipeline reported success and no errors were logged.
What is a data freshness SLA?
A data freshness SLA is a written expectation of how often a table must be updated and by when it must be available. For example: "the payments table must update at least hourly and contain data from the last 2 hours." SLAs should be set at 1.5–2× the expected interval to avoid false positives from normal variation.
What is stale data?
Stale data is data that has not been updated within its expected freshness window. Queries still run, dashboards still load, but the numbers reflect a state of the world that no longer exists. The most dangerous stale data passes all standard quality checks while being hours or days out of date.
How do you detect stale data automatically?
Two reliable checks: (1) compare the most recent timestamp in the table against the current time and alert when the gap exceeds the SLA; (2) count rows inserted in a rolling window and alert when zero rows arrive in a period that normally sees activity. The second check catches the silent-success failure (the pipeline runs without error but writes nothing) that timestamp checks miss.
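Both checks can be sketched against a read-only connection. The example below uses Python's built-in `sqlite3` module and assumes timestamps are stored as integer epoch seconds; the table name, column name, and function names are illustrative, and a warehouse would need its own driver and timestamp handling.

```python
import sqlite3
from typing import Optional


def seconds_since_last_update(conn: sqlite3.Connection, table: str,
                              ts_column: str, now_epoch: int) -> Optional[int]:
    """Check 1: gap between now and the newest timestamp in the table."""
    # Identifiers are interpolated into the SQL, so pass only trusted names.
    row = conn.execute(f"SELECT MAX({ts_column}) FROM {table}").fetchone()
    latest = row[0]
    if latest is None:
        return None  # empty table: no timestamp to compare
    return now_epoch - latest


def rows_in_window(conn: sqlite3.Connection, table: str, ts_column: str,
                   now_epoch: int, window_seconds: int) -> int:
    """Check 2: number of rows that arrived in the rolling window."""
    row = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {ts_column} > ?",
        (now_epoch - window_seconds,),
    ).fetchone()
    return row[0]
```

An alert would fire when the first value exceeds the SLA in seconds, or when the second returns zero for a window that normally sees activity.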
What is the difference between data freshness and data latency?
Data latency is the time between an event occurring and that event appearing in the destination. Data freshness is whether the data updated within its expected interval, regardless of pipeline speed. A fast pipeline still produces stale data if it stopped running. A slow pipeline produces fresh data if it ran on schedule.