Configuring Space Partitions for Multi-Tenant Time-Series

Choosing number_partitions for a multi-tenant hypertable is a sizing decision, not a guess. The count should match your parallel write and I/O capacity, not your tenant count: space partitioning hashes tenant_id into a small, fixed set of partitions, so writes fan out across chunks, tenant-scoped reads prune, and compression and retention operate on chunks that each hold a stable subset of tenants. This page gives you a deterministic way to pick that number, a worked IoT example, the edge cases that break the rule, and the catalog queries that confirm the setting took effect. It sits under Space Partitioning for Multi-Tenant IoT and assumes you have already decided that a second partitioning dimension is warranted.

Input Profiling: What to Measure First

The partition count is derived from hardware and workload facts, not from how many customers you have. Gather these before touching DDL:

Parallel I/O channels: the number of independent devices the table’s tablespace spans (NVMe namespaces, RAID members, or EBS volumes). This is the ceiling on useful write fan-out.
Concurrent write workers: how many ingestion connections insert into distinct tenants during the same peak second (gateway consumers, Kafka sink workers, batch loaders).
Peak tenant concurrency on read: how many tenants run dashboard queries simultaneously; this sets how much pruning benefit you actually harvest.
Tenant cardinality and skew: total tenants, and the ratio of the busiest tenant’s row rate to the median. Hash partitioning cannot rescue a single whale tenant.
Chunk footprint: your current chunk_time_interval sizing, because the space dimension multiplies the chunk count per interval by number_partitions.

Record these into a short profile. The two that drive the formula are I/O channels and write workers; the rest govern the edge-case checks later.
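One way to keep the profile reviewable is a small record type. The sketch below is illustrative only: the class name, the field names, and the sample per-tenant row rates are assumptions made for this example, not part of any TimescaleDB API.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical container for the five inputs listed above."""
    io_channels: int               # independent devices under the tablespace
    write_workers: int             # concurrent ingestion connections at peak
    read_tenants_peak: int         # tenants querying dashboards at the same time
    tenant_row_rates: tuple        # rows/second per tenant (sample values)
    chunk_time_interval_days: int  # current time-dimension sizing

    @property
    def tenant_count(self) -> int:
        return len(self.tenant_row_rates)

    @property
    def skew_ratio(self) -> float:
        """Busiest tenant's row rate divided by the median tenant's rate."""
        return max(self.tenant_row_rates) / median(self.tenant_row_rates)


# Sample values only; replace them with measurements from your own system.
profile = WorkloadProfile(
    io_channels=8,
    write_workers=6,
    read_tenants_peak=12,
    tenant_row_rates=(2000, 2500, 3000, 3500, 4000),
    chunk_time_interval_days=7,
)
print(profile.tenant_count, round(profile.skew_ratio, 2))
```

Keeping skew as a derived property rather than a stored field means it cannot drift out of sync with the measured row rates.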

The Calculation

Space partitioning divides every time interval into number_partitions hash buckets on the space column. Each additional partition multiplies catalog rows, autovacuum targets, and per-query planning work, so the goal is the smallest count that saturates parallel writes without inflating metadata. The deterministic rule:

$$
N_{\text{partitions}} = \operatorname{clamp}\bigl(\operatorname{round\_even}(\min(C_{io},\, W_{write})),\; 4,\; 32\bigr)
$$

where $C_{io}$ is the number of parallel I/O channels and $W_{write}$ is the number of concurrent write workers. Round up to an even value so hash distribution stays balanced, floor at 4 (below that the fan-out rarely pays for its overhead), and cap at 32 (beyond that, per-interval chunk explosion costs more than the parallelism returns). Critically, existing chunks are never re-hashed: set_number_partitions() changes the count only for chunks created after the call, so treat number_partitions as effectively fixed at hypertable creation and size it once, deliberately.

Profile input	Symbol	Recommended value	Effect if too high
Parallel I/O channels	$C_{io}$	one per independent device	Idle partitions, wasted planning
Concurrent write workers	$W_{write}$	measured at peak second	Lock contention shifts to catalog
number_partitions	$N$	clamp(round_even(min), 4, 32)	Chunk explosion, vacuum pressure
chunk_time_interval	n/a	sized separately by time	If too short: multiplies total chunk count
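The rule is small enough to encode directly, which also makes it easy to keep in a deployment script. The following is a minimal sketch of the formula as stated above; the function name and its keyword arguments are hypothetical, chosen for this example.

```python
def number_partitions(io_channels: int, write_workers: int,
                      floor: int = 4, cap: int = 32) -> int:
    """clamp(round_even(min(C_io, W_write)), floor, cap)."""
    if io_channels < 1 or write_workers < 1:
        raise ValueError("both inputs must be positive integers")
    bottleneck = min(io_channels, write_workers)
    rounded = bottleneck + (bottleneck % 2)  # round odd values up to even
    return max(floor, min(rounded, cap))


print(number_partitions(8, 6))    # 6
print(number_partitions(5, 16))   # 6  (5 rounds up to 6)
print(number_partitions(2, 3))    # 4  (floor applies)
print(number_partitions(64, 48))  # 32 (cap applies)
```

Taking the minimum first reflects the reasoning in the text: the smaller of the two drivers is the bottleneck, so partitions beyond it would sit idle.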

Apply it idempotently so a redeploy always converges to the same layout:

sql

-- Idempotent base table for multi-tenant telemetry.
CREATE TABLE IF NOT EXISTS telemetry.raw_metrics (
    time        TIMESTAMPTZ NOT NULL,
    tenant_id   UUID        NOT NULL,
    device_id   VARCHAR(64) NOT NULL,
    metric_name VARCHAR(32) NOT NULL,
    value       DOUBLE PRECISION NOT NULL,
    metadata    JSONB
);

-- Convert to a hypertable with a second (space) dimension only if not already one.
-- number_partitions is the computed N; existing chunks are never re-hashed, so size it before this call.
DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM timescaledb_information.hypertables
        WHERE hypertable_name = 'raw_metrics' AND hypertable_schema = 'telemetry'
    ) THEN
        PERFORM create_hypertable(
            'telemetry.raw_metrics',
            'time',
            partitioning_column => 'tenant_id',
            number_partitions   => 6,               -- computed N, see worked example
            chunk_time_interval => INTERVAL '7 days',
            create_default_indexes => false
        );
    END IF;
END $$;

-- Tenant-scoped composite index so the planner prunes by partition then seeks by time.
CREATE INDEX IF NOT EXISTS idx_raw_metrics_tenant_time
    ON telemetry.raw_metrics (tenant_id, time DESC);

If you define a primary key or unique constraint on this hypertable it must include both the time column and tenant_id, since every unique index has to contain all partitioning columns.

Worked Example: A 12-Node Gateway Fleet

Take a realistic industrial IoT platform: 40 tenants, 8,000 devices, ingesting roughly 120,000 rows/second at peak. The write path is 6 Kafka sink workers writing concurrently, each pinned to a tenant shard. Storage is a striped volume spanning 8 NVMe namespaces.

Plug the two drivers in: $C_{io} = 8$, $W_{write} = 6$, so $\min(8, 6) = 6$. Six is already even and inside the 4–32 band, so $N_{\text{partitions}} = 6$. Note what did not enter the calculation: the 40 tenants and 8,000 devices are irrelevant to the count. If you had naively set number_partitions => 40 to “match tenants”, every 7-day interval would produce up to 40 chunks instead of 6, a 6.7× increase in catalog rows, autovacuum targets, and planning overhead, with no extra write parallelism beyond the 6 workers actually inserting.
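To see how that per-interval difference accumulates, the chunk count can be projected over a retention window. This is a rough sketch: the 90-day window is an assumed value that does not come from the example, and the function name is hypothetical. The result is an upper bound, since a hash bucket that receives no rows in an interval creates no chunk.

```python
import math


def total_chunks(partitions: int, window_days: int, interval_days: int) -> int:
    """Upper bound on chunks: one per hash bucket per time interval."""
    intervals = math.ceil(window_days / interval_days)
    return partitions * intervals


WINDOW_DAYS = 90    # assumed retention window, not from the worked example
INTERVAL_DAYS = 7   # chunk_time_interval used on this page

sized = total_chunks(6, WINDOW_DAYS, INTERVAL_DAYS)
naive = total_chunks(40, WINDOW_DAYS, INTERVAL_DAYS)
print(sized, naive, round(naive / sized, 1))
```

The ratio between the two layouts stays at 40/6 regardless of the window; only the absolute number of chunks the catalog must track grows with retention.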

With $N = 6$, each time interval fans out across six hash buckets. The six sink workers spread their inserts over six chunks per interval instead of one, which reduces the contention a time-only table would concentrate on a single chunk. A dashboard filtering WHERE tenant_id = $1 hashes to exactly one partition, so the planner discards five-sixths of every interval's chunks before it reads a page. Those chunks then flow cleanly into downstream lifecycle jobs: per-partition columnar compression models and tenant-scoped TTL policy enforcement both operate on chunks that hold the same stable subset of tenants in every interval.

Edge Cases and When to Deviate

The formula assumes even hash distribution and independent tenants. These conditions break it:

Whale tenants. If one tenant emits an order of magnitude more than the median, hashing still routes it to a single partition — that partition becomes the hot chunk you were trying to eliminate. Isolate the whale into its own hypertable or a dedicated tablespace instead of raising number_partitions.
Low tenant count. With fewer tenants than partitions, some hash buckets stay empty every interval, wasting the metadata they cost. If tenants are permanently below your computed $N$, drop $N$ to the tenant count or skip space partitioning entirely.
Cross-tenant analytical scans. Queries that aggregate across all tenants gain nothing from pruning and pay the multi-chunk planning tax. If those dominate, keep $N$ at the 4 floor.
Tiny chunks. If your time-based chunk partitioning interval is already short, splitting each interval's data across $N$ partitions can push chunks below the ~25 MB range where per-chunk overhead dominates. Widen chunk_time_interval before adding a space dimension.
UUID vs. text keys. A high-cardinality space key spreads well; a low-cardinality one (e.g. region codes) can collide badly under hashing — see chunk indexing on high-cardinality tags for how key choice interacts with index placement.
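Two of these checks, the whale tenant and the low tenant count, are easy to automate against a measured profile. The sketch below is illustrative: the function names are hypothetical, the 10x threshold is one reading of "an order of magnitude", and the empty-bucket estimate assumes each tenant lands in a uniformly random bucket, which only approximates the real hash function.

```python
from statistics import median


def expected_empty_buckets(tenants: int, partitions: int) -> float:
    """Expected number of hash buckets that receive no tenant at all,
    assuming each tenant lands in a uniformly random bucket."""
    return partitions * (1 - 1 / partitions) ** tenants


def edge_case_warnings(row_rates, partitions, whale_ratio=10.0):
    """Return human-readable warnings for the whale and low-tenant cases."""
    warnings = []
    skew = max(row_rates) / median(row_rates)
    if skew >= whale_ratio:
        warnings.append(f"whale tenant: busiest is {skew:.0f}x the median")
    if len(row_rates) < partitions:
        warnings.append(
            f"only {len(row_rates)} tenants for {partitions} partitions"
        )
    return warnings


# 40 tenants over 6 buckets: an empty bucket is very unlikely.
print(round(expected_empty_buckets(40, 6), 3))
# 4 tenants over 6 buckets: several buckets are expected to stay empty.
print(round(expected_empty_buckets(4, 6), 2))
# Sample row rates (rows/second) with one dominant tenant.
print(edge_case_warnings([100, 120, 90, 5000], partitions=6))
```

The median is used instead of the mean so that the whale itself does not inflate the baseline it is compared against.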

Verification

Confirm the space dimension exists with the count you intended. timescaledb_information.dimensions exposes one row per dimension; the space dimension reports num_partitions:

sql

-- Confirm the configured space-partition count.
SELECT dimension_number, column_name, num_partitions
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'raw_metrics'
  AND hypertable_schema = 'telemetry'
ORDER BY dimension_number;

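Then confirm the fan-out is real by counting the chunks each interval produced. The count should equal number_partitions once every hash bucket has received rows (it can be lower for a sparse interval); a count that stays at 1 suggests the space dimension is not taking effect: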

sql

-- Chunk count per time interval should equal number_partitions.
SELECT range_start, count(*) AS chunks_in_interval
FROM timescaledb_information.chunks
WHERE hypertable_name = 'raw_metrics'
  AND hypertable_schema = 'telemetry'
GROUP BY range_start
ORDER BY range_start DESC
LIMIT 3;

Finally, validate pruning at query time. EXPLAIN a tenant-scoped predicate and confirm that excluded partitions are simply absent from the plan — only the chunks for the hashed partition appear as children of the Append node:

sql

EXPLAIN (ANALYZE, BUFFERS)
SELECT time, value
FROM telemetry.raw_metrics
WHERE tenant_id = 'a1b2c3d4-e5f6-7890-abcd-ef1234567890'
  AND time > NOW() - INTERVAL '3 days';

If the plan still scans every chunk, the predicate is not selective on the space column — check that the query filters tenant_id directly rather than through a join or a function wrapper that the planner cannot fold into constraint exclusion.

Related

Space Partitioning for Multi-Tenant IoT: the parent guide covering when and why to add a space dimension.
Best Practices for Chunk Indexing on High-Cardinality Tags: index placement for tenant and device keys.
How to Calculate Optimal chunk_interval for IoT Sensor Data: size the time dimension that this count multiplies.
Compression Models for High-Frequency Telemetry: how tenant-aligned chunks compress.
TTL Policy Mapping & Enforcement: retention on space-partitioned chunks.
