June 14, 2026 - David Oyinbo

On the Topic of Making Simple Useful Things (A Distributed Counter)

Counting things across multiple servers sounds trivial. It is not.

RustDistributed SystemsRedisEngineeringCounters

The Basic Problem

On a single server, counting is barely a problem. Something happens, you bump a number in memory, and you move on. The trouble starts the moment there is a second server.

The thing people forget is that an increment isn't one operation. It's three: read the current value, add one, write it back. On one machine those three steps run back to back and nobody notices. On two machines they interleave. Both servers read 41, both add one, both write 42. You did two increments and the counter only moved by one. The update didn't go anywhere. It was quietly overwritten.

The usual fix is to hand the problem to something that can serialise the writes for you, and that something is almost always Redis. Redis is single threaded. It runs one command at a time, full stop, and it gives you operations like HINCRBY that fold the read, the add, and the write into a single atomic step and hand you the new value back. There is no gap for a second writer to slip into.

For most things, that is the entire story. Wrap the increment in a small Lua script so the read and the write travel together, point every server at the same Redis, and the count stays correct no matter how many servers you are running. You can stop here and be fine.

Accuracy vs Throughput

The catch is the network. Every increment is now a round trip to Redis, and on a lightly loaded system you will never feel it. Start counting page impressions or API calls, things that happen thousands of times a second, and a Redis call per event becomes the most expensive thing your code does.

The way out is to stop writing on every increment. Keep a small buffer in memory, a map from key to the delta you owe Redis, and add to it locally. Then on a fixed interval, flush the whole thing in one pipelined batch and start over. Redis traffic drops by orders of magnitude, and inc stops being a network call. It becomes an in memory atomic add, which finishes in well under a microsecond.

You do not get that for free, of course. You have traded immediate consistency for eventual consistency. Increment a counter on server A, immediately read it from server B, and B can hand you a stale total because it has not flushed yet. Reads stay consistent within a single process, since you just add the local buffer to whatever Redis already has, but across servers there is a window, up to one flush interval wide, where the totals disagree. For analytics that is a perfectly fine trade. For billing or inventory it absolutely is not.

When a Global Counter Is Not Enough

The two patterns above cover a surprising amount of ground. There is one class of problem they do not touch, though, and it took me a while to see why.

Say you want to know how many WebSocket connections are open across the whole cluster right now. A global counter seems perfect: increment when a connection opens, decrement when it closes, read the total whenever you like. And it works, right up until a server crashes. Server A dies with 500 connections open. Those connections are gone, but nobody got to run the 500 decrements. The global count is now 500 too high, and it stays that way forever, because the only process that knew about those connections is no longer running.

You can try to paper over this. Reset the count on startup, and you race every other server that is also starting up. Run a periodic reconciliation job, and now you have an extra moving part plus a window where the number is just wrong. I tried both in my head and neither one sat right.

Owning a Slice

The cleaner idea is to stop sharing one number. Give every server its own slice of the counter instead.

Each process gets a UUID when it starts. When it increments, it writes to a key that belongs to it and nobody else, and it bumps the shared cumulative in the same atomic step. Because no two processes ever write to the same slice, there is nothing to coordinate. Each one is the sole authority on its own contribution, and the cumulative is just kept in lockstep as the sum of every live slice. No locks, no fighting over a single field.

It also changes what a read can tell you. Every read comes back with two numbers: the cluster wide cumulative and this instance's own slice. More often than not that is exactly the pair you want, the global total on a dashboard while still knowing how much of it this particular box is responsible for.

The Dead Instance Problem

Slicing fixes the coordination problem. It does not fix the dead instance problem, not yet. If server A dies, its slice is still sitting in Redis, and the cumulative still counts it.

What I settled on is heartbeats, but not the version you might expect. There is no TTL key per instance. Instead there is a single sorted set per counter, where each member is an instance's UUID and its score is the last time that instance was seen. Every operation an instance runs, every inc, every get, anything that touches Redis, also stamps its own score with the current time. So liveness rides along on work the instance was already doing. As long as a process keeps doing anything at all, it keeps itself alive in that set.

The cleanup is the part I am happiest with: it is not a background job at all. It is a side effect of ordinary operations. Whenever any instance touches a counter, the script first range scans the sorted set for members whose last seen timestamp is older than the threshold. For each dead instance it finds, it subtracts that instance's contribution back out of the cumulative, deletes its slice, and drops it from the set. The survivors clean up after the dead, collectively, with no coordinator and no scheduled task. A crashed server's contribution just evaporates the next time anyone looks.

How long older than the threshold means is up to you, dead_instance_threshold_ms, thirty seconds by default. Turn it down and you react to crashes faster but start punishing instances that merely went quiet for a moment over a flaky network. Thirty seconds has been a sane middle for everything I have thrown at it.

Coordinating a Reset

One problem left. Suppose you want to reset a counter to a specific value across the whole cluster, set the global total to 100. Each instance still has its own slice lying around, so the next time anyone reads and the cumulative gets reconciled against those slices, your clean 100 turns back into garbage.

I solved this with an epoch, a version number per counter, living in Redis, that marks the current era. The coordinating operations, set and del, bump the epoch when they run. Every instance remembers the epoch it last saw locally, and before it does anything it checks whether Redis has moved on. If the epoch changed, the instance zeroes its own slice before touching the counter. After a reset, every instance independently realises it is in a new era and starts from a clean baseline. No broadcast, no instance ever has to talk to another directly.

Recovery from a network blip falls out of the same mechanism for free. If a server drops off and comes back holding increments it never managed to flush, it should only replay them if the epoch has not changed. If it has, the buffered data belongs to a previous era and gets thrown away. One check, and recovery is safe without a separate reconciliation protocol bolted on top.

The Lax Layer, Again

The buffering trick from the basic counter applies here too. If you are tracking something that fires thousands of times a second, you still do not want a Redis call per event just because the counter happens to be instance aware. So the lax instance aware counter buffers inc and dec locally the same way the plain lax counter does. The only difference is that when it flushes, the writes land in this instance's slice instead of a shared key.

The warm path never leaves memory. inc, dec, get, and set_on_instance all read and write the local buffer and return immediately. The operations that have to be globally consistent, set, del, and clear, flush whatever is pending first and then hand off to the strict implementation. So the latency promise has a footnote: warm path calls finish in well under a microsecond, but anything that must be true across the whole cluster pays for a Redis round trip. That is the honest version, and I would rather state it than hide it.

The flushing itself runs on a Tokio task that holds a weak reference back to the counter. When you drop the counter, the weak reference cannot upgrade, the task notices on its next tick, and it shuts itself down. There is nothing to stop manually and nothing to leak.

I Built This in distkit

I needed every one of these behaviours for a realtime platform I have been building. Rather than let the implementation rot inside that one project, I pulled it out into a standalone Rust crate: distkit.

It ships four counter types, StrictCounter, LaxCounter, StrictInstanceAwareCounter, and LaxInstanceAwareCounter, and each one maps cleanly onto a decision from this post. The crate is #![forbid(unsafe_code)], with 194 tests and 77 runnable doc examples keeping me honest. It also pulls in trypema behind a feature flag, since the same platform that needed counting also needed rate limiting, and the two tend to show up together.

The API stays out of your way. You build the options once, then call inc, get, and friends. Here is the setup every example shares:

use distkit::{DistkitRedisKey, counter::{StrictCounter, LaxCounter, CounterOptions, CounterTrait}};

let client = redis::Client::open("redis://127.0.0.1/")?;
let conn = client.get_connection_manager().await?;

let prefix = DistkitRedisKey::try_from("my_app".to_string())?;
let options = CounterOptions::new(prefix, conn);
let key = DistkitRedisKey::try_from("page_views".to_string())?;

A strict counter, where every read reflects the latest write:

let counter = StrictCounter::new(options.clone());

counter.inc(&key, 1).await?;
let total = counter.get(&key).await?; // hits Redis, always exact

A lax counter, where increments stay in memory and flush on an interval:

let counter = LaxCounter::new(options);

counter.inc(&key, 1).await?;          // local, returns immediately
let total = counter.get(&key).await?; // local view, no Redis hit

And an instance aware counter, where every call hands back both the cluster total and this instance's own slice:

use distkit::icounter::{
    InstanceAwareCounterTrait,
    StrictInstanceAwareCounter, StrictInstanceAwareCounterOptions,
};

let counter = StrictInstanceAwareCounter::new(
    StrictInstanceAwareCounterOptions::new(prefix, conn),
);

let (total, mine) = counter.inc(&key, 1).await?;
let (total, mine) = counter.dec(&key, 1).await?;

If you have a similar problem and you would rather not reimplement any of the above, the crate is on crates.io.

Related Logs

February 2, 2024

Error Handling in Rust Where Bugs Go to Take a Vacation

Safeguarding Software Reliability with Rust's Error Handling Mechanisms - exploring Result and Option enums, the ? operator, custom error types, and panic! for robust error management

rusterror-handlingprogramming

View log

February 2, 2024

Overview of The Proxy Design Pattern

A simplified overview of the proxy design pattern with examples in Java - exploring structural design patterns that provide surrogates or placeholders for other objects

design-patternssoftware-developmentcoding

View log

April 21, 2024

Understanding the Client-Server Model in Distributed Computing

Exploring the fundamental architecture in distributed computing that facilitates efficient allocation of tasks and workloads between clients and servers through network communication

distributed-computingclient-servernetworking

View log

On the Topic of Making Simple Useful Things (A Distributed Counter)

The Basic Problem

Accuracy vs Throughput

When a Global Counter Is Not Enough

Owning a Slice

The Dead Instance Problem

Coordinating a Reset

The Lax Layer, Again

I Built This in distkit

Related Logs

Error Handling in Rust Where Bugs Go to Take a Vacation

Overview of The Proxy Design Pattern

Understanding the Client-Server Model in Distributed Computing

Let's build something together