total{debug} total{debug}
>_ Python · Database

A self-tuning write throttle for batch Postgres jobs

A self-tuning write throttle for batch Postgres jobs

I have a background job that writes a lot of rows. It chunks a big import into batches and upserts each batch into Postgres. On its own that’s fine. The problem is that it doesn’t run on its own: the same database is serving live traffic and a handful of other workers at the same time. When my import goes full tilt, everyone else feels it.

The naive fix is to drop a sleep() between batches and call it a day. I did exactly that, and it bugged me for weeks, because a fixed sleep is wrong in both directions at once. So I replaced it with a small feedback controller that tunes itself, and it’s one of those changes that’s far simpler than it sounds. This is how it works and why I’d reach for it again.

Why a fixed sleep is the wrong tool

Say you settle on time.sleep(0.5) between chunks. You’ve now baked in a single number that has to be right for every situation, and it never is.

When the database is idle overnight, that half-second does nothing useful. It just makes a job that could finish in five minutes take fifteen, for no benefit to anyone.

When the database is genuinely under load, half a second might not be anywhere near enough, and your import keeps piling pressure on at exactly the wrong moment.

And the killer: the “right” number changes the moment anything else does. Scale the database up a tier and your sleep is now too conservative. Scale it down, or let two of these jobs run at once, and it’s too aggressive. You end up tuning a magic constant by hand every time the environment shifts, which is precisely the kind of toil I try to design out.

What I actually want is simple to state: go full speed when the database is bored, back off when it’s busy, and never make me re-tune it when the hardware changes.

Use latency as the control signal

Here’s the insight that makes the whole thing work. I don’t need to measure CPU, connection counts, or replication lag to know whether the database is struggling. I can read it straight off the thing I’m already doing: how long each chunk’s write takes.

Per-chunk latency is a brilliant signal because it folds in everything I care about, for free:

  • A bigger or faster database does my write in less time, so latency drops.
  • A loaded database does it more slowly, so latency rises.
  • Pressure from other workers shows up here too. If they’re hammering the database, my queries get slower as well, and that’s exactly the signal I want to react to.

So instead of guessing at a sleep duration, I pick a latency budget: a target time per chunk that represents “this is a healthy pace”. If a chunk comes back under budget, the database has room to spare and I don’t sleep at all. If it comes back over budget, I sleep in proportion to how far over I am. The budget is a stable number tied to acceptable latency, not to the size of the box, so it keeps working when the box changes.

The controller

The heart of this is a single pure function. No database handle, no clock, no hidden state: you hand it the latest measurement and the previous smoothed value, and it hands you back a sleep duration and the new smoothed value. That’s deliberate, because a pure function is trivial to test, and the gnarly part of any throttle is the decision logic, not the plumbing.

1
2
3
4
5
6
7
8
def next_chunk_backoff(observed_ms, ema_ms, *, target_ms, max_sleep_ms, factor, alpha=0.5):
    """Return (sleep_seconds, new_ema_ms). Proportional backoff on EMA-smoothed
    per-chunk latency; zero when under the target budget."""
    new_ema = observed_ms if ema_ms is None else (alpha * observed_ms + (1 - alpha) * ema_ms)
    overage = new_ema - target_ms
    if overage <= 0:
        return 0.0, new_ema
    return min(max_sleep_ms, factor * overage) / 1000.0, new_ema

There are only three ideas in there:

  • Smooth the measurement. Raw per-chunk timings are jittery. One chunk hits a cold cache, the next is instant, and if I reacted to every spike I’d oscillate between sprinting and crawling. An exponential moving average (the ema_ms term) damps that down, so the controller responds to a sustained trend rather than a single unlucky chunk. alpha sets how reactive it is: higher means it trusts the latest reading more.
  • Back off proportionally. If the smoothed latency is over budget, I sleep by factor times the overage. Twice as far over budget means twice as long backing off. It’s the proportional term of a classic controller, and nothing more clever is needed here.
  • Cap it. max_sleep_ms stops a pathological reading from parking the job for a minute. Backoff is bounded no matter how ugly the latency gets.

The nicest property is what happens at normal load: the overage is zero or negative, so the function returns a 0.0 sleep and gets out of the way. The throttle is completely inert until the database is actually under pressure, which is exactly when you want it to wake up.

Wiring it into the loop

The controller is pure, so something stateful has to call it. That part is short. Time each chunk, feed the measurement in, sleep for whatever comes back:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
import time

def run_import(chunks, *, target_ms=50.0, max_sleep_ms=500.0, factor=4.0):
    ema_ms = None
    total_throttle_s = 0.0
    throttled = 0

    for chunk in chunks:
        start = time.perf_counter()
        write_chunk(chunk)          # the bulk upsert, committed here
        observed_ms = (time.perf_counter() - start) * 1000.0

        sleep_s, ema_ms = next_chunk_backoff(
            observed_ms, ema_ms,
            target_ms=target_ms, max_sleep_ms=max_sleep_ms, factor=factor,
        )
        if sleep_s > 0:
            throttled += 1
            total_throttle_s += sleep_s
            time.sleep(sleep_s)

    return {"throttle_seconds": total_throttle_s, "throttled_chunks": throttled, "final_ema_ms": ema_ms}

Two details in there matter more than they look.

Sleep outside the transaction. By the time I sleep, write_chunk has already committed. The connection is idle and holding no locks, so napping on it blocks nobody. If you sleep with a transaction still open you’re holding locks while doing nothing, which is the opposite of being a good citizen.

Don’t close the connection to “save resources” during the sleep. An idle connection adds essentially no load, whereas reconnecting on every chunk is genuinely expensive and would dwarf any saving. Keep the connection, just don’t use it for a moment.

Make it observable

A throttle you can’t see is a throttle you can’t trust. The first question anyone asks is “is this thing even doing anything?”, and you want an answer in the logs, not a shrug. That’s why run_import returns the total time spent throttling, how many chunks were throttled, and the final smoothed latency.

Emit those as structured fields at the end of a run and you can actually reason about it: a job that throttled for 90 seconds across 400 chunks was clearly running alongside something heavy, while one that reports zero throttling sailed through an idle window at full speed. When you come to tune target_ms or factor, those numbers are the difference between informed adjustment and superstition.

Test the logic, not the sleeping

Because the decision lives in a pure function, the tests are boring in the best way. No database, no waiting on real time, just inputs and expected outputs:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
def test_under_budget_does_not_throttle():
    sleep_s, ema = next_chunk_backoff(
        20.0, None, target_ms=50.0, max_sleep_ms=500.0, factor=2.0,
    )
    assert sleep_s == 0.0
    assert ema == 20.0


def test_over_budget_backs_off_proportionally():
    sleep_s, _ = next_chunk_backoff(
        150.0, 150.0, target_ms=50.0, max_sleep_ms=500.0, factor=2.0,
    )
    # 100ms over budget, factor 2 -> 200ms -> 0.2s
    assert sleep_s == 0.2


def test_backoff_is_capped():
    sleep_s, _ = next_chunk_backoff(
        1000.0, 1000.0, target_ms=50.0, max_sleep_ms=300.0, factor=2.0,
    )
    assert sleep_s == 0.3


def test_ema_smooths_a_single_spike():
    # one slow chunk after a steady-low run shouldn't trigger a full reaction
    sleep_s, ema = next_chunk_backoff(
        200.0, 20.0, target_ms=50.0, max_sleep_ms=500.0, factor=2.0,
    )
    assert ema == 110.0
    assert sleep_s == 0.12

The control logic is covered without ever calling time.sleep or touching Postgres. If I want to test the wiring in run_import, I patch time.sleep and write_chunk and assert on the returned telemetry. The slow, flaky parts stay out of the test suite entirely.

When this is and isn’t worth it

I reach for this whenever a batch job shares a database with anything latency-sensitive and I can’t predict the load ahead of time. It’s a handful of lines, it has no dependencies, and it adapts to environments I haven’t thought of yet. That’s a good trade.

I wouldn’t bother if the job owns its database outright, or runs in a maintenance window where nothing else cares. A throttle that never needs to fire is just noise. And if you need hard guarantees rather than polite backoff, this isn’t that: it’s cooperative, not enforced. For real isolation you’d be reaching for connection pool limits, statement timeouts, or Postgres-side resource controls.

But for the common case, a shared database and a job that should know its place, a few lines of proportional feedback beat a hand-tuned sleep() every time. The job runs flat out when it can, eases off when it should, and I never have to touch that magic number again.