Refresh Policy Design & Scheduling

A continuous aggregate is only as useful as the freshness of the data it serves. This guide addresses one focused problem: how to design and schedule the background policy that decides which time buckets get re-materialized, how often, and with what safety margin, so that dashboards stay current without the scheduler fighting your ingestion path for CPU and I/O. Raw telemetry must ingest at low latency, yet analytical queries expect pre-aggregated rollups that lag reality by seconds, not hours. The refresh policy is the control surface between those two demands: getting its start_offset, end_offset, and schedule_interval right is the difference between a self-healing pipeline and one that silently serves stale numbers. This page sits inside the broader continuous aggregate refresh lifecycle and assumes you already have a continuous aggregate in place.

Prerequisites

Before registering any policy, confirm the environment can actually run background jobs and that the target view is a real continuous aggregate rather than a plain materialized view.

TimescaleDB 2.10 or newer on PostgreSQL 14+ (the policy_refresh_continuous_aggregate proc and end_offset semantics assume this baseline).
timescaledb.max_background_workers is set high enough to cover every refresh, compression, and retention job that runs concurrently — a starved worker pool is the most common cause of a policy that “exists but never runs”.
max_worker_processes is large enough to hold the background workers plus your application’s parallel query workers.
The target object was created with CREATE MATERIALIZED VIEW ... WITH (timescaledb.continuous) and buckets on time_bucket() — see materialized view architecture for the DDL contract a policy depends on.
The bucket column is TIMESTAMPTZ. Naive TIMESTAMP columns make offset arithmetic ambiguous across DST boundaries.
A connection role with USAGE on the schema and ownership of the aggregate, since add_continuous_aggregate_policy writes to the job catalog.

Verify the extension and worker headroom in one pass:

sql

-- Confirm version and that background workers are actually available.
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
SHOW timescaledb.max_background_workers;
SELECT count(*) AS running_workers
FROM pg_stat_activity
WHERE backend_type = 'TimescaleDB Background Worker Scheduler';
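The worker-headroom prerequisites can be turned into a rough budget before touching postgresql.conf. The sketch below is a minimal illustration, not a TimescaleDB API: the function names are hypothetical, and the extra slots for one scheduler per database and one launcher are assumptions you should check against your deployment.

```python
def required_background_workers(refresh_jobs: int, compression_jobs: int,
                                retention_jobs: int, spare: int = 1) -> int:
    """Worst case: every policy fires at once, plus a few spare slots."""
    return refresh_jobs + compression_jobs + retention_jobs + spare


def required_worker_processes(background_workers: int, parallel_workers: int,
                              databases: int = 1) -> int:
    """Background workers + parallel query workers + one scheduler per
    database + one launcher (the last two terms are assumptions)."""
    return background_workers + parallel_workers + databases + 1


if __name__ == "__main__":
    bgw = required_background_workers(refresh_jobs=4, compression_jobs=2,
                                      retention_jobs=1)
    print("timescaledb.max_background_workers >=", bgw)
    print("max_worker_processes >=",
          required_worker_processes(bgw, parallel_workers=8))
```

Compare the two numbers against SHOW timescaledb.max_background_workers and SHOW max_worker_processes; if either setting is lower, the pool can starve under concurrent policies.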

Step-by-Step Implementation

The steps below map directly to the refresh-window diagram above: you first decide where the window’s trailing edge (start_offset) and leading edge (end_offset) sit, then register the job that sweeps that window on every schedule_interval.

1. Choose the refresh cadence

Cadence is a workload decision, not a default. Minute-grain rollups feeding an alerting pipeline want a tight schedule_interval (1–5 minutes); daily or monthly summaries can refresh hourly or nightly without any dashboard noticing. Reserve full recomputes for backfills and corrections — for the mechanics of that trade-off, work through incremental vs full refresh strategies before you settle on a number. As a rule, incremental policies should be your default and the schedule_interval should never be shorter than the p95 runtime of a single refresh, or jobs queue behind each other.
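The p95 rule can be expressed as a small guard in a deploy script. This is a minimal sketch: choose_schedule_interval and its safety_factor are hypothetical names, and the p95 runtime is assumed to come from your own measurements (for example from last_run_duration samples).

```python
from datetime import timedelta


def choose_schedule_interval(desired: timedelta, p95_runtime: timedelta,
                             safety_factor: float = 1.5) -> timedelta:
    """Return the desired interval unless it is shorter than the measured
    p95 refresh runtime padded by safety_factor, in which case return the
    padded runtime instead."""
    floor = p95_runtime * safety_factor
    return max(desired, floor)


if __name__ == "__main__":
    # A refresh that takes 20 s at p95 can comfortably run every minute.
    print(choose_schedule_interval(timedelta(minutes=1), timedelta(seconds=20)))
    # A refresh that takes 90 s at p95 cannot; the interval is lengthened.
    print(choose_schedule_interval(timedelta(minutes=1), timedelta(seconds=90)))
```

The safety factor is a judgment call; anything above 1.0 leaves a gap between runs so that a slow refresh does not immediately collide with the next one.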

2. Size the window offsets

start_offset sets how far back the window reaches on each run; end_offset sets how close to now the window stops. The trailing edge must cover the widest expected out-of-order arrival (late-uplinking IoT gateways, batched edge writes) so that late rows still land in a bucket the policy will revisit. The leading edge must sit at least one full bucket behind wall-clock so the policy never materializes a bucket that is still receiving data.
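To see what the two edges mean for a single run, the sketch below computes the window for a given wall-clock time and lists the buckets that lie entirely inside it. It is an illustration under stated assumptions: the helper names are hypothetical, buckets are assumed to align to the Unix epoch (equivalent to the default alignment for minute-grain widths), and only buckets fully inside the window are treated as refreshed.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def refresh_window(now, start_offset, end_offset):
    """Trailing and leading edge of the window swept by one run."""
    return now - start_offset, now - end_offset


def complete_buckets(window_start, window_end, bucket_width):
    """Start times of buckets lying entirely in [window_start, window_end)."""
    # Ceil-divide to find the first bucket boundary at or after window_start.
    n = -((EPOCH - window_start) // bucket_width)
    bucket = EPOCH + n * bucket_width
    buckets = []
    while bucket + bucket_width <= window_end:
        buckets.append(bucket)
        bucket += bucket_width
    return buckets


if __name__ == "__main__":
    now = datetime(2024, 3, 10, 12, 7, 30, tzinfo=timezone.utc)
    start, end = refresh_window(now, timedelta(minutes=20), timedelta(minutes=5))
    for b in complete_buckets(start, end, timedelta(minutes=5)):
        print(b.isoformat())
```

With these illustrative numbers the window is 11:47:30 to 12:02:30, so only the 11:50 and 11:55 buckets are complete; the 12:00 bucket is still in flight and waits for a later run.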

3. Register the policy idempotently

Treat policy creation as declarative infrastructure. Drop any prior policy first, then add the intended one, so a redeploy always converges to the same offsets rather than erroring or stacking duplicates. This exact pattern underpins the focused walkthrough on automatic refresh policies for 5-minute intervals.

sql

-- Idempotent policy deployment for a TimescaleDB continuous aggregate.
-- Dropping first (if_exists => true turns "policy not found" into a notice
-- instead of an error) guarantees a clean reset of the offsets before
-- re-registering under CI/CD.
SELECT remove_continuous_aggregate_policy('sensor_telemetry_5m_agg', if_exists => true);

SELECT add_continuous_aggregate_policy(
    'sensor_telemetry_5m_agg',
    start_offset      => INTERVAL '20 minutes',  -- trailing edge: covers late arrivals; window spans >= 2 buckets
    end_offset        => INTERVAL '5 minutes',   -- leading edge: one full 5-minute bucket behind now
    schedule_interval => INTERVAL '5 minutes',   -- how often the worker sweeps the window
    if_not_exists     => true
);

The start_offset sets the trailing edge far enough back that late-arriving rows are re-materialized, while the end_offset keeps the leading edge behind the bucket that upstream producers are still writing. Together they give deterministic watermark progression under TimescaleDB’s native job scheduler.

4. Set the first run explicitly (optional)

By default the scheduler picks the initial start time. When you need refreshes to land on clean wall-clock boundaries (top of the minute, top of the hour) — which makes dashboards and alerts far easier to reason about — pin initial_start and, for multi-region fleets, timezone:

sql

SELECT add_continuous_aggregate_policy(
    'sensor_telemetry_5m_agg',
    start_offset      => INTERVAL '20 minutes',
    end_offset        => INTERVAL '5 minutes',
    schedule_interval => INTERVAL '5 minutes',
    initial_start     => date_trunc('hour', now()) + INTERVAL '1 hour',
    timezone          => 'UTC',
    if_not_exists     => true
);
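When the deploy tooling computes initial_start on the client side rather than in SQL, the same "next clean boundary" logic can be sketched in Python. The helper name next_boundary is hypothetical, and boundaries are assumed to be whole multiples of the step counted from the Unix epoch in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def next_boundary(now: datetime, step: timedelta) -> datetime:
    """First instant strictly after `now` that falls on a multiple of `step`."""
    elapsed = now - EPOCH
    return EPOCH + (elapsed // step + 1) * step


if __name__ == "__main__":
    now = datetime(2024, 3, 10, 14, 23, 45, tzinfo=timezone.utc)
    print(next_boundary(now, timedelta(hours=1)).isoformat())    # top of next hour
    print(next_boundary(now, timedelta(minutes=5)).isoformat())  # next 5-minute mark
```

For a one-hour step this matches the date_trunc('hour', now()) + INTERVAL '1 hour' expression used above, including the case where now already sits exactly on the hour.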

5. Guard the deploy with a pre-flight check

For fleets that span regions, a DST transition produces a 23-hour or 25-hour local day, which distorts hourly buckets around the change. A short Python pre-flight in the deploy pipeline catches an imminent UTC-offset change so you can widen start_offset or defer the deploy until the transition has passed, and then confirms that the policy is present in the job catalog. This slots naturally into an infrastructure-as-code pipeline:

python

import zoneinfo
from datetime import datetime, timedelta
import psycopg
from psycopg.rows import dict_row


def validate_refresh_window(db_uri: str, agg_view: str, target_tz: str = "UTC") -> bool:
    """Flag DST transitions in the next 24h and confirm the policy is registered."""
    tz = zoneinfo.ZoneInfo(target_tz)
    now = datetime.now(tz)

    # A DST transition shows up as a change in UTC offset across the window.
    # Sample hourly points from now through now + 24h inclusive (25 samples),
    # so a change in the final hour is not missed, and flag any offset change.
    offsets = {(now + timedelta(hours=i)).utcoffset() for i in range(25)}
    if len(offsets) > 1:
        print(f"Warning: DST transition within 24h for {target_tz}; "
              "widen start_offset or defer the deploy.")
        return False

    with psycopg.connect(db_uri, row_factory=dict_row) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT 1
            FROM timescaledb_information.continuous_aggregates ca
            JOIN timescaledb_information.jobs j
              ON j.hypertable_name = ca.materialization_hypertable_name
             AND j.proc_name = 'policy_refresh_continuous_aggregate'
            WHERE ca.view_name = %s
            """,
            (agg_view,),
        )
        if not cur.fetchone():
            raise RuntimeError(f"Policy for {agg_view} not found.")

    return True

Configuration Parameters Reference

Every argument to add_continuous_aggregate_policy maps to one edge or timing of the refresh window. The recommendations below assume a high-frequency IoT rollup feeding operational dashboards.

Parameter	Type	Recommended value	Effect
`start_offset`	`INTERVAL` (or `BIGINT` for integer time)	2–3× the worst-case late-arrival window	Trailing edge: how far back each run re-materializes. Too small drops late rows; too large wastes CPU re-scanning settled buckets.
`end_offset`	`INTERVAL` / `BIGINT`	≥ 1 bucket width (e.g. `2 minutes` for 1-min buckets)	Leading edge: keeps the in-flight bucket out of the window so it is never materialized half-full.
`schedule_interval`	`INTERVAL`	≥ p95 refresh runtime; 1–5 min for real-time	How often the job runs. Shorter than a single refresh’s duration causes jobs to queue.
`initial_start`	`TIMESTAMPTZ`	Next clean wall-clock boundary	First execution time; pin it to align runs with minute/hour boundaries.
`timezone`	`TEXT`	`'UTC'` unless buckets are local	Anchors `initial_start` and schedule math across DST for region-local aggregates.
`if_not_exists`	`BOOLEAN`	`true` in automation	Makes registration idempotent so redeploys converge instead of erroring.

A useful sizing intuition for the trailing edge:

$$\text{start\_offset} \geq \text{bucket\_width} + \max(\text{late\_arrival\_delay})$$

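If your buckets are 5 minutes wide and gateways can uplink up to 4 minutes late, the formula gives a floor of 9 minutes. The window between start_offset and end_offset must also span at least two full buckets, so with a 5-minute end_offset the practical minimum is 15 minutes; a start_offset of 20 minutes leaves comfortable headroom while keeping each run cheap.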
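The sizing rule can be checked mechanically. The sketch below combines the late-arrival floor from the formula with a second floor that keeps at least two buckets between the window edges; that two-bucket requirement reflects TimescaleDB's policy validation as understood here and should be verified against your version. The function name is hypothetical.

```python
from datetime import timedelta


def min_start_offset(bucket_width: timedelta, max_late_arrival: timedelta,
                     end_offset: timedelta) -> timedelta:
    """Smallest start_offset satisfying both sizing constraints."""
    # Floor 1: reach back one bucket plus the worst-case arrival delay.
    late_floor = bucket_width + max_late_arrival
    # Floor 2 (assumed validation rule): window must span two full buckets.
    window_floor = end_offset + 2 * bucket_width
    return max(late_floor, window_floor)


if __name__ == "__main__":
    floor = min_start_offset(bucket_width=timedelta(minutes=5),
                             max_late_arrival=timedelta(minutes=4),
                             end_offset=timedelta(minutes=5))
    print("start_offset must be at least", floor)
```

Whichever floor is larger wins; in the 5-minute example the two-bucket constraint dominates, and the late-arrival term only takes over once gateways lag by more than a bucket or two.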

Integration With Adjacent Features

A refresh policy never runs in isolation — it shares the background worker pool and the chunk lifecycle with several neighbours:

Asynchronous execution. Refresh jobs are dispatched through the same scheduler that runs compression and retention. When many policies fire at once they contend for workers; how that queue drains is covered in asynchronous execution and queue management, and the specific tuning for large historical sweeps in incremental refresh performance tuning for large datasets.
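Retention alignment. Schedule retention to run after the refresh pass and keep the raw hypertable’s drop_after comfortably larger than start_offset. If retention drops raw chunks that a later refresh window still covers, the refresh recomputes those buckets from the now-missing rows and the aggregate loses them too. Keep the two policies decoupled and staggered, mapping business SLAs to drop_after windows as described in TTL policy mapping and enforcement under the wider data retention and lifecycle automation strategy.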
Compression. Once a bucket is settled and unlikely to be refreshed, its underlying chunk becomes a compression candidate; the trade-offs live in columnar compression models for high-frequency telemetry. Never let start_offset reach back into already-compressed chunks unless you have enabled compressed-chunk refresh, or the job will error.
Chunk boundaries. Windows that align to time-based chunk partitioning touch fewer chunks per run and lock less. Misaligned offsets fan a single refresh across extra chunks and inflate I/O.
Query-time gap filling. Because the policy stores only real buckets, sparse-data presentation stays a query concern — see creating continuous aggregates with time_bucket_gapfill.
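The chunk-boundary point can be made concrete by counting how many chunks a refresh window overlaps. This is a rough sketch: chunks_touched is a hypothetical helper, and chunks are assumed to be fixed-width intervals aligned to the Unix epoch, which may not match the actual chunk ranges of your hypertable.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunks_touched(window_start: datetime, window_end: datetime,
                   chunk_interval: timedelta, origin: datetime = EPOCH) -> int:
    """Number of fixed-width chunks overlapped by [window_start, window_end)."""
    if window_end <= window_start:
        return 0
    first = (window_start - origin) // chunk_interval
    # Subtract one microsecond because the window's end is exclusive.
    last = (window_end - origin - timedelta(microseconds=1)) // chunk_interval
    return last - first + 1


if __name__ == "__main__":
    utc = timezone.utc
    day = timedelta(days=1)
    # A 15-minute window straddling midnight touches two daily chunks...
    print(chunks_touched(datetime(2024, 1, 1, 23, 50, tzinfo=utc),
                         datetime(2024, 1, 2, 0, 5, tzinfo=utc), day))
    # ...while the same width aligned after midnight touches one.
    print(chunks_touched(datetime(2024, 1, 2, 0, 0, tzinfo=utc),
                         datetime(2024, 1, 2, 0, 15, tzinfo=utc), day))
```

For real chunk ranges, the timescaledb_information.chunks view reports range_start and range_end per chunk and is the better source than an assumed origin.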

Performance Validation

After a policy is live, prove it is running on schedule and materializing the ranges you expect. TimescaleDB exposes the job and its statistics through system views.

Confirm the policy exists and inspect its configured offsets:

sql

SELECT j.job_id,
       j.schedule_interval,
       j.config ->> 'start_offset' AS start_offset,
       j.config ->> 'end_offset'   AS end_offset,
       ca.view_name
FROM timescaledb_information.jobs j
JOIN timescaledb_information.continuous_aggregates ca
  ON ca.materialization_hypertable_name = j.hypertable_name
WHERE j.proc_name = 'policy_refresh_continuous_aggregate';

Check execution health — success counts, failures, and the last run’s duration — through job_stats:

sql

SELECT job_id,
       last_run_status,
       last_successful_finish,
       total_runs,
       total_failures,
       last_run_duration
FROM timescaledb_information.job_stats
WHERE job_id IN (
    SELECT job_id FROM timescaledb_information.jobs
    WHERE proc_name = 'policy_refresh_continuous_aggregate'
);

Measure freshness directly by comparing the newest materialized bucket against wall-clock — this is your true refresh lag:

sql

SELECT now() - max(bucket) AS refresh_lag
FROM sensor_telemetry_5m_agg;

If refresh_lag drifts well beyond end_offset + schedule_interval, the policy is falling behind and warrants investigation before dashboards go stale.
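A monitoring job can turn that rule of thumb into an explicit threshold. The sketch below is one way to do it, under assumptions: the bucket column is taken to store each bucket's start time, so the bound adds up to two bucket widths on top of end_offset + schedule_interval (one for bucket alignment, one because the lag is measured from the bucket's start), and the function names and the slack parameter are hypothetical.

```python
from datetime import timedelta


def lag_upper_bound(end_offset: timedelta, schedule_interval: timedelta,
                    bucket_width: timedelta) -> timedelta:
    """Largest lag expected when runs are on time and data flows continuously."""
    return end_offset + schedule_interval + 2 * bucket_width


def is_falling_behind(refresh_lag: timedelta, end_offset: timedelta,
                      schedule_interval: timedelta, bucket_width: timedelta,
                      slack: timedelta = timedelta(minutes=5)) -> bool:
    """True when the observed lag exceeds the bound plus a tolerance."""
    bound = lag_upper_bound(end_offset, schedule_interval, bucket_width)
    return refresh_lag > bound + slack


if __name__ == "__main__":
    five = timedelta(minutes=5)
    print(is_falling_behind(timedelta(minutes=18), five, five, five))
    print(is_falling_behind(timedelta(minutes=40), five, five, five))
```

Feed refresh_lag from the query above; a sparse data source that simply has no recent rows will also trip this check, so pair it with an ingestion-side freshness signal.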

Troubleshooting

Policy is registered but last_run_status is never Success. The worker pool is exhausted. Raise timescaledb.max_background_workers (and max_worker_processes above it), then restart PostgreSQL, since both settings only take effect at server start. Confirm with the running_workers query from the prerequisites.

ERROR: cannot refresh compressed chunk (or refresh silently skips old ranges). The start_offset reaches into a chunk that compression already sealed. Either shorten start_offset so it stops before the compression boundary, or enable refresh of compressed regions. Keep the refresh window and the compression orderby/age policy from overlapping.

Dashboards show stale numbers even though the job reports success. The window is too tight for late data: rows arrived after their bucket left the window. Widen start_offset to cover the true worst-case arrival delay, and confirm the diagnosis against troubleshooting stale continuous aggregates in production.

total_failures climbs but the job keeps retrying. A transient lock or deadlock against ingestion. The scheduler backs off and retries automatically; for deterministic control, wire in explicit recovery via error handling and retry mechanisms or a bespoke handler as in handling refresh failures with custom PL/pgSQL triggers.

Runs overlap and queue up. schedule_interval is shorter than a single refresh’s runtime. Compare last_run_duration against the interval and lengthen the interval, or narrow start_offset so each run does less work.

Frequently Asked Questions

Can two continuous aggregates safely share the same refresh schedule?

Yes, but they share the background worker pool. Two policies firing on the same schedule_interval contend for workers and can serialize behind each other. Stagger their initial_start by a fraction of the interval, or ensure timescaledb.max_background_workers covers both plus any compression and retention jobs that overlap.
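Staggering can be computed rather than hand-picked. The following sketch spreads policies evenly across one schedule interval; staggered_starts is a hypothetical helper, and the second view name in the demo is made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def staggered_starts(base: datetime, schedule_interval: timedelta,
                     policy_names: list[str]) -> dict[str, datetime]:
    """Assign each policy an initial_start offset by an equal share of the
    schedule interval, so no two policies fire at the same instant."""
    step = schedule_interval / len(policy_names)
    return {name: base + i * step for i, name in enumerate(policy_names)}


if __name__ == "__main__":
    base = datetime(2024, 3, 10, 12, 0, tzinfo=timezone.utc)
    starts = staggered_starts(base, timedelta(minutes=5),
                              ["sensor_telemetry_5m_agg",
                               "sensor_telemetry_1h_agg"])  # second name is hypothetical
    for name, ts in starts.items():
        print(name, ts.isoformat())
```

Each timestamp can then be passed as initial_start when registering the corresponding policy.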

What happens to data that arrives inside the end_offset gap?

Rows landing in the still-forming bucket (newer than now - end_offset) are simply not materialized yet. On the next run, once that bucket falls inside the window, it is materialized in full. Nothing is lost — the end_offset only delays materialization of the newest bucket, it never discards data.
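The delay introduced by end_offset can be estimated ahead of time. The sketch below finds the first scheduled run whose leading edge has moved past the end of a given bucket; it assumes runs fire exactly on a fixed grid starting at first_run, that start_offset reaches back far enough to include the bucket, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def first_materializing_run(bucket_start: datetime, bucket_width: timedelta,
                            end_offset: timedelta, first_run: datetime,
                            schedule_interval: timedelta) -> datetime:
    """Earliest scheduled run at which the bucket lies fully inside the window."""
    # The bucket is eligible once (run_time - end_offset) >= bucket end.
    eligible_at = bucket_start + bucket_width + end_offset
    if eligible_at <= first_run:
        return first_run
    # Ceil-divide the wait by the schedule interval to land on the run grid.
    runs_to_wait = -((first_run - eligible_at) // schedule_interval)
    return first_run + runs_to_wait * schedule_interval


if __name__ == "__main__":
    utc = timezone.utc
    five = timedelta(minutes=5)
    bucket = datetime(2024, 3, 10, 12, 0, tzinfo=utc)
    # Runs on the :00/:05/:10 grid pick the 12:00 bucket up at 12:10.
    print(first_materializing_run(bucket, five, five,
                                  datetime(2024, 3, 10, 12, 0, tzinfo=utc), five))
```

Shifting the run grid by two minutes moves the first materialization of the same bucket to 12:12, which is why aligning initial_start with bucket boundaries keeps the delay at its minimum.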

Should start_offset ever be NULL?

A NULL start_offset removes the trailing edge, so every run considers the aggregate’s entire history instead of a bounded late-arrival window. That is appropriate only for small aggregates or deliberate periodic full refreshes. For high-frequency telemetry, any late write or backfill deep in history pulls old buckets into the next run and can saturate the worker pool; keep start_offset bounded to the late-arrival window instead.

How do I force an immediate refresh outside the schedule?

Run CALL refresh_continuous_aggregate('sensor_telemetry_5m_agg', now() - INTERVAL '1 hour', now()); for an ad-hoc manual pass over a chosen range. This runs in your session, not in a background worker, and is the right tool for backfills and one-off corrections, not for steady-state freshness.

Does changing the offsets re-materialize historical buckets?

No. Altering start_offset only changes the window that future runs sweep. Buckets already materialized under the old offsets stay as they were until a run’s window covers them again. To correct history after widening the window, trigger a manual refresh_continuous_aggregate over the affected range.

Materialized View Architecture & Syntax — the DDL contract every refresh policy depends on.
Incremental vs Full Refresh Strategies — how to decide what a policy recomputes.
Asynchronous Execution & Queue Management — how refresh jobs share the scheduler.
Error Handling & Retry Mechanisms — recovering from failed refresh runs.
Setting up Automatic Refresh Policies for 5-Minute Intervals — a focused walkthrough of the pattern on this page.
