Mesh math: one packet, twenty thousand devices · Hoppr Blog

A packet drops into a Bluetooth mesh of N devices. How far does it go? How fast? What keeps it from flooding forever? This post walks through the arithmetic Hoppr uses, with numbers, on a whiteboard, in the order they actually matter.

If you have heard the phrase "gossip protocol" and nodded politely, the goal here is to give you enough grounding that the defaults stop looking arbitrary. TTL equals seven, dedup cache holds a thousand IDs, sync envelope is four hundred bytes. Those numbers are not tuned by vibe. They fall out of a few lines of algebra that a developer can work in their head.

The reach formula

In a mesh of N devices where each node has average degree d, a packet propagates in O(log_d N) hops. Under an asynchronous flood at uniform density, the expected number of devices reached after h hops is bounded by a simple branching expression.

E[R(h)]  =  min( N ,  d · (d−1)^(h−1) )

The intuition is that the origin radio reaches d peers. Each of those peers has its own d neighbours, but one of them is the node that just sent to it, so only d minus one are new. Recursion up to h hops gives the product. The min with N is there because you cannot reach more devices than exist.
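The bound is easy to evaluate by hand, but a few lines of Python make it quicker to try other densities. This is a minimal sketch, not Hoppr code: the function names reach_bound and hops_to_saturate are made up for illustration, and the 15-hop ceiling is taken from the maximum TTL the post mentions.

```python
from typing import Optional


def reach_bound(n: int, d: int, h: int) -> int:
    """Evaluate min(N, d * (d - 1)^(h - 1)) for a mesh of n devices."""
    if n < 0 or d < 1 or h < 1:
        raise ValueError("need n >= 0, d >= 1, h >= 1")
    return min(n, d * (d - 1) ** (h - 1))


def hops_to_saturate(n: int, d: int, max_hops: int = 15) -> Optional[int]:
    """Smallest hop count at which the bound reaches n, or None if it never does."""
    for h in range(1, max_hops + 1):
        if reach_bound(n, d, h) >= n:
            return h
    return None


# N is set large here so the min() does not clip the branching term.
print(reach_bound(1_000_000, 5, 7))  # 20480
```

Note that with d equals 2 the branching term never grows, so hops_to_saturate returns None for any crowd larger than two. That is the thin-corridor case where reach depends on chain length, not branching.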

A field example worked out

Assume five peers per radio on average at BLE range, so d equals 5, and the default TTL of h equals 7. Plug it in. The bound evaluates to 5 times 4 to the power of 6, which is 20,480 unique devices in the reachable set. A festival crowd of eight thousand people saturates by the seventh hop, since the bound is only 5,120 at six. A protest with forty thousand people packed close enough that each phone sees around eight neighbours has a bound of 8 times 7 to the power of 4, or 19,208, at five hops, and saturates at six.

Field: d=5, h=7 → E[R] = 20,480 devices
Field: d=8, h=5 → E[R] = 19,208 devices

These are upper bounds, not guarantees. Real radios miss packets, nodes move, some peers are asleep. The arithmetic is for sizing, not for promises. What it tells you is that seven hops is enough reach for any realistic crowd, and that pushing the TTL higher buys almost nothing in the cases you care about.

Why TTL equals 7 and not 15

Each hop adds roughly 15 to 20 ms of latency on BLE once you account for scan windows, connection intervals, and the responder's own relay decision. Seven hops therefore puts delivery latency between roughly 105 and 140 ms, with 120 ms as the midpoint. That is fast enough that a message feels immediate. TTL equals 15 is supported by the packet format and reserved for very sparse topologies, for example a long thin corridor of campers where each device sees only one or two neighbours and reach is bottlenecked by chain length, not branching.

The adaptive relay module watches local node degree. When the degree exceeds the high-density threshold, the relay stops forwarding every unseen packet and falls back to probabilistic forwarding with p equals 0.3. That is the point where radio contention starts to hurt more than additional reach helps. The trade is intentional. Once the crowd is dense enough to saturate in five hops, the sixth and seventh hops are not extending the mesh, they are burning battery on redundant transmissions.
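The relay decision described above fits in one small function. This is a sketch under stated assumptions: the post does not give the numeric high-density threshold, so HIGH_DENSITY_THRESHOLD below is a placeholder, and the function name and parameters are hypothetical. Only the 0.3 forwarding probability comes from the text.

```python
import random

HIGH_DENSITY_THRESHOLD = 12  # placeholder value; the real threshold is not stated in the post
FORWARD_PROBABILITY = 0.3    # from the post


def should_relay(ttl, already_seen, local_degree,
                 rng=random.random, threshold=HIGH_DENSITY_THRESHOLD):
    """Decide whether this node rebroadcasts a packet it just received."""
    if already_seen or ttl <= 0:
        return False  # duplicates and expired packets are never relayed
    if local_degree <= threshold:
        return True   # low density: plain flooding
    # High density: forward only a fraction of unseen packets.
    return rng() < FORWARD_PROBABILITY
```

Passing rng as a parameter is just a convenience for testing: swap in a fixed function and the probabilistic branch becomes deterministic.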

Deduplication: LRU, not bloom

There is a common misreading of academic mesh papers that leads implementers to reach for a bloom filter. Hoppr does not. Each device keeps a bounded LRU cache of the last 1000 message IDs, with a five-minute expiry on every entry. Lookups are O(1). No false positives, no hash-function tuning, no bloom-filter math to explain to auditors when they ask why a message got silently dropped.

The size budget is trivial. A thousand IDs times sixteen bytes each is sixteen kilobytes of RAM per device. On a phone with gigabytes of memory this is noise. The five-minute expiry is important: without it, a device that sits in a dense mesh for an hour would accumulate enough stale IDs to start evicting fresh ones before they aged out of the network. Five minutes covers the time a packet can plausibly still be in flight somewhere in the mesh, and is short enough that the cache never needs to grow.
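A bounded LRU with expiry is a few lines on top of an ordered dictionary. The sketch below is an illustration, not Hoppr's implementation: the class name DedupCache and its method are hypothetical, and it assumes that seeing an ID again refreshes its timestamp, which the post does not specify.

```python
import time
from collections import OrderedDict


class DedupCache:
    """Bounded LRU of message IDs with a per-entry expiry."""

    def __init__(self, capacity=1000, ttl_seconds=300.0, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen = OrderedDict()  # message ID -> time last seen, oldest first

    def seen_before(self, msg_id):
        """Record msg_id; return True if it was already cached and unexpired."""
        now = self._clock()
        # Entries are ordered by last-seen time, so expired ones sit at the front.
        while self._seen:
            _, last_seen = next(iter(self._seen.items()))
            if now - last_seen < self._ttl:
                break
            self._seen.popitem(last=False)
        duplicate = msg_id in self._seen
        self._seen[msg_id] = now          # assumption: a repeat refreshes the entry
        self._seen.move_to_end(msg_id)
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the least recently seen ID
        return duplicate
```

A relay would call seen_before on every incoming packet and drop the packet when it returns True. Both the membership check and the eviction are O(1).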

The sync envelope

When two peers reconnect after being out of radio range, they cannot replay every message they might have missed. The bandwidth and battery cost is too high and most of it would be redundant. Instead each peer exchanges a 400-byte Golomb-Coded Set that summarises the IDs it has seen. The other side diffs its own set against the summary and transmits only the missing messages.

A GCS packs sorted hashes at a target false-positive rate. For n elements at false-positive rate p, the cost works out to approximately log2(1/p) plus 1/ln(2) bits per element. At p equals 0.01, that is about 8.1 bits per element. Four hundred bytes of payload carries roughly 395 IDs of state, which is enough to reconcile a day of disconnection in one bundle smaller than a single TCP segment.
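The sizing arithmetic can be checked for any false-positive rate and envelope size. This sketch simply evaluates the approximation quoted above. The function names are hypothetical, and it assumes the whole envelope is available for coded IDs, with no header overhead.

```python
import math


def gcs_bits_per_element(p):
    """Approximate GCS cost per element: log2(1/p) + 1/ln(2) bits."""
    return math.log2(1.0 / p) + 1.0 / math.log(2.0)


def ids_per_envelope(envelope_bytes, p):
    """How many IDs fit in an envelope of the given size at false-positive rate p."""
    return int(envelope_bytes * 8 // gcs_bits_per_element(p))


if __name__ == "__main__":
    p = 0.01
    print(f"{gcs_bits_per_element(p):.2f} bits per element")
    print(f"{ids_per_envelope(400, p)} IDs in a 400-byte envelope")
```

Halving p costs exactly one more bit per element, so the trade between envelope capacity and missed detections is easy to reason about.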

Good enough is the right target here. The 1 percent false-positive rate means that about one missed message in a hundred is not detected during the gossip reconcile. Those get picked up on the next reconcile or delivered redundantly through another peer on the mesh. The layering is forgiving.
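To make the reconcile step concrete, here is a small Golomb-coded set in Python. It is a sketch under several assumptions that are not from the Hoppr protocol: it uses the Golomb-Rice variant with a power-of-two parameter (k equals 7, so p is near 1/128, not exactly 0.01), SHA-256 as the hash, and a plain list of bits where a real envelope would use packed bytes. All function names are hypothetical.

```python
import hashlib


def hash_to_range(msg_id, modulus):
    """Map a message ID to a roughly uniform integer in [0, modulus)."""
    digest = hashlib.sha256(msg_id).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def gcs_encode(ids, k):
    """Return (bits, n): Golomb-Rice coded deltas of the sorted hashes."""
    n = len(ids)
    modulus = n << k  # n * 2^k, giving a false-positive rate near 2^-k
    values = sorted({hash_to_range(m, modulus) for m in ids})
    bits, prev = [], 0
    for v in values:
        delta, prev = v - prev, v
        bits.extend([1] * (delta >> k))  # quotient in unary
        bits.append(0)                   # unary terminator
        bits.extend((delta >> s) & 1 for s in range(k - 1, -1, -1))  # k-bit remainder
    return bits, n


def gcs_decode(bits, k):
    """Recover the sorted hash values from the coded bit list."""
    values, pos, prev = [], 0, 0
    while pos < len(bits):
        q = 0
        while bits[pos] == 1:
            q += 1
            pos += 1
        pos += 1  # skip the terminator
        r = 0
        for _ in range(k):
            r = (r << 1) | bits[pos]
            pos += 1
        prev += (q << k) | r
        values.append(prev)
    return values


def ids_to_send(my_ids, remote_bits, remote_n, k):
    """IDs this peer holds whose hashes are absent from the remote summary."""
    if remote_n == 0:
        return list(my_ids)
    remote = set(gcs_decode(remote_bits, k))
    modulus = remote_n << k
    return [m for m in my_ids if hash_to_range(m, modulus) not in remote]
```

An ID the remote peer already holds always hashes into its summary, so it is never resent. An ID the remote peer lacks can collide with a summarised hash and be skipped, which is the false-positive case described above.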

Adaptive forwarding

At low density, every relay forwards every unseen packet. The mesh behaves like pure flooding and reach is maximised. At high density, relays forward with probability p, where p decreases as local degree increases. This is the classic BLE mesh managed-flood-with-relay-pruning approach. It trades a small amount of reach for very large battery savings at crowded events, and it is what keeps a phone in your pocket during a ten-thousand-person protest from draining in an hour.
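The post does not publish the actual probability curve, so the following is purely a hypothetical shape that has the stated property: p is 1 at low degree and falls as local degree rises. The function name, the target_forwarders parameter, and its default of 4 are all invented for illustration.

```python
def forward_probability(local_degree, target_forwarders=4.0):
    """Hypothetical curve: aim for a fixed expected number of forwarding neighbours."""
    if local_degree <= target_forwarders:
        return 1.0  # sparse neighbourhood: behave like pure flooding
    # Dense neighbourhood: p shrinks in proportion to 1 / degree.
    return target_forwarders / local_degree


for degree in (2, 4, 8, 16, 40):
    print(degree, forward_probability(degree))
```

The design idea behind a curve of this shape is that the expected number of neighbours that rebroadcast stays roughly constant as the crowd gets denser, instead of growing with it.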

The probability curve is not magic. It is tuned so that expected coverage stays above 95 percent at the densities where the default would otherwise light every radio continuously. The tuning was done analytically against the reach formula above, not against any particular field test, because the formula is tight enough to trust at design time.

Where it breaks

The honest answer: the single point of failure is geographic. A mesh of a hundred devices clustered in one building has no tether to the world outside that building. No amount of TTL helps you reach a phone that is not in the radio cell.

This is why Hoppr has a second layer. The Nostr fallback over Tor handles the "I need to reach someone not in this radio cell" case. It is slower, it depends on an internet connection somewhere, it has its own privacy properties. But the two layers together give you the thing you actually want, which is local-first messaging with a global escape hatch that does not need you to decide in advance which one to use.

Closing

You can read the whole protocol in one afternoon. There are no secret constants, no Byzantine agreement, no consensus algorithm. It is gossip with a bounded dedup cache and a tiny reconciliation frame. The intuition is the same as water finding every crevice in a rock. Given enough radios and enough hops, a message reaches every device it can reach. The math is there to tell you how long that takes.

Published by the Hoppr team. Mirrored as a NIP-23 long-form event (kind 30023) on the Hoppr publication key. Questions: hello@hoppr.chat.