Sending data over sound is older than most of the internet. Modems that carried data as audio tones over phone lines date back to the late 1950s. What changed in the last decade is that every consumer device now carries a microphone and a speaker that can comfortably reach the upper edge of the audible range, and the processing power to encode and decode payloads in real time.

This post unpacks the moving parts of an ultrasonic exchange the way Beeping does it.

The frequency choice

Human hearing tops out around 20 kHz for children and drops to roughly 14-16 kHz for most adults. Standard consumer hardware (phones, laptops, smart speakers) can both emit and pick up audio reliably up to about 22 kHz, just under the 22.05 kHz Nyquist limit of the 44.1 kHz sample rate that came out of the CD era.

Beeping's working band sits between 18 kHz and 22 kHz. That gives us four kilohertz of bandwidth that almost no one can hear, and that almost every device manufactured in the last fifteen years can faithfully reproduce.

Pick lower and you bother humans. Pick higher and you fall off the hardware response curve.
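
The arithmetic behind that upper bound is short enough to write down. The sketch below only checks that a band fits under the Nyquist limit of a given sample rate; the function names are made up for this post and are not part of any Beeping API.

```python
SAMPLE_RATE_HZ = 44_100
BAND_LOW_HZ = 18_000
BAND_HIGH_HZ = 22_000


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def band_fits(sample_rate_hz: int, low_hz: int, high_hz: int) -> bool:
    """True if the whole band lies at or below the Nyquist limit."""
    return 0 < low_hz < high_hz <= nyquist_hz(sample_rate_hz)


if __name__ == "__main__":
    print(nyquist_hz(SAMPLE_RATE_HZ))                            # 22050.0
    print(band_fits(SAMPLE_RATE_HZ, BAND_LOW_HZ, BAND_HIGH_HZ))  # True
    print(band_fits(32_000, BAND_LOW_HZ, BAND_HIGH_HZ))          # False
```

At 44.1 kHz the band has only 50 Hz of headroom at the top, which is why going any higher is not an option on commodity hardware.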

Encoding: bits riding the carrier

Once you have a band, you need a way to map ones and zeros onto sound. The mechanism Beeping uses is a flavour of frequency-shift keying (FSK):

1. Pick N discrete tones inside the working band.
2. Each tone represents a chunk of bits (a "symbol"). N tones carry log2(N) bits per symbol: two tones give you 1 bit, four give you 2, eight give you 3.
3. The transmitter plays a quick succession of those tones; the receiver listens, runs an FFT on each window, and reads off which tone was loudest.
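
To make the mapping concrete, here is a minimal sketch of the bits-to-tones step. The even spacing, the MSB-first bit order and the function names are assumptions made for illustration; the post does not say how Beeping actually lays out its tones.

```python
def tone_frequencies(n_tones: int, low_hz: float = 18_000.0,
                     high_hz: float = 22_000.0) -> list[float]:
    """Evenly spaced tones, each centred in its own slice of the band."""
    step = (high_hz - low_hz) / n_tones
    return [low_hz + step * (i + 0.5) for i in range(n_tones)]


def bits_per_symbol(n_tones: int) -> int:
    """log2(n_tones); n_tones must be a power of two."""
    if n_tones < 2 or n_tones & (n_tones - 1):
        raise ValueError("n_tones must be a power of two >= 2")
    return n_tones.bit_length() - 1


def bytes_to_symbols(data: bytes, n_tones: int) -> list[int]:
    """Split the payload into k-bit chunks, MSB first, zero-padding the tail."""
    k = bits_per_symbol(n_tones)
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % k)
    return [int(bits[i:i + k], 2) for i in range(0, len(bits), k)]


if __name__ == "__main__":
    tones = tone_frequencies(4)
    symbols = bytes_to_symbols(b"\x1b", 4)   # 0x1b = 00 01 10 11
    print(tones)
    print(symbols)
    print([tones[s] for s in symbols])       # frequency to play for each symbol
```

Each symbol index is simply a position in the tone list, so the transmitter's job reduces to playing `tones[s]` for every `s` in turn.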

A larger tone set means more bits per symbol and faster transmission, but also tighter spacing between frequencies, which leaves less margin for noise and makes decoding harder. We tune that knob conservatively: Beeping prioritises "works in a noisy room with cheap speakers" over "highest possible bitrate".
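
A rough way to see the trade-off is to compare tone spacing against the frequency resolution of the receiver's FFT, which is one over the window length. The sketch below assumes a 20 ms symbol purely for illustration; it is not Beeping's real symbol length, and the helper names are hypothetical.

```python
import math

BANDWIDTH_HZ = 4_000.0     # the 18-22 kHz working band
SYMBOL_SECONDS = 0.02      # assumed symbol length, for illustration only


def tone_spacing_hz(n_tones: int) -> float:
    """Gap between neighbouring tones when they share the band evenly."""
    return BANDWIDTH_HZ / n_tones


def fft_bin_width_hz(symbol_seconds: float) -> float:
    """Frequency resolution of an FFT over one symbol-length window."""
    return 1.0 / symbol_seconds


def raw_bitrate(n_tones: int, symbol_seconds: float) -> float:
    """Bits per second before framing and error-correction overhead."""
    return math.log2(n_tones) / symbol_seconds


if __name__ == "__main__":
    bin_width = fft_bin_width_hz(SYMBOL_SECONDS)
    for n in (2, 4, 16, 64):
        bins_apart = tone_spacing_hz(n) / bin_width
        rate = raw_bitrate(n, SYMBOL_SECONDS)
        print(f"{n:>2} tones: {bins_apart:6.2f} FFT bins apart, {rate:5.0f} bit/s raw")
```

Under these assumptions, going from 4 tones to 64 triples the raw bitrate but shrinks the gap between neighbouring tones from 20 FFT bins to 1.25, which leaves almost no room for frequency drift or noise.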

Why error correction matters

The world is loud. A coffee shop is loud. A laptop fan is loud. A speaker emitting a tone you can't hear into a microphone listening through a phone case is, at best, a noisy channel. Without correction, even a 5% bit error rate destroys a payload.
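
To put a number on that: if each bit flips independently with probability p, an n-bit payload arrives untouched with probability (1 - p)^n. Independent errors are a simplifying assumption (real acoustic noise tends to arrive in bursts), but the estimate is still instructive.

```python
def p_payload_intact(bit_error_rate: float, n_bits: int) -> float:
    """Chance that every bit survives, assuming independent bit errors."""
    return (1.0 - bit_error_rate) ** n_bits


if __name__ == "__main__":
    for n_bits in (16, 64, 128):
        chance = p_payload_intact(0.05, n_bits)
        print(f"{n_bits:>3}-bit payload at 5% BER: {chance:.2%} chance of arriving intact")
```

At a 5% bit error rate, a 64-bit payload gets through clean less than 4% of the time.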

Beeping wraps the symbol stream in a forward error correction (FEC) layer: every payload includes redundancy bytes that let the receiver detect and repair a few flipped bits without asking the sender to retransmit. That's the difference between working only in a quiet lab and working reliably on a conference floor.
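
This post doesn't say which code Beeping uses, so the sketch below swaps in the classic Hamming(7,4) code as a stand-in. It is bit-oriented rather than byte-oriented and only repairs one flipped bit per 7-bit codeword, but it shows the core idea: add parity on the way out, use it to locate and fix an error on the way in.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: list[int]) -> list[int]:
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3   # 1-based index of the bad bit, 0 if none
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    sent = hamming74_encode([1, 0, 1, 1])
    received = list(sent)
    received[4] ^= 1                         # the channel flips one bit
    print(hamming74_decode(received))        # recovers [1, 0, 1, 1]
```

The price is bandwidth: 7 bits on the air for every 4 bits of payload. More redundancy buys more tolerance, which is the same conservative trade as the tone count.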

The end-to-end shape of an exchange

Here's what a Commlink-style send actually does, in order:

1. Frame the payload: prepend a short preamble the receiver can lock onto, append FEC bytes.
2. Encode the frame as a sequence of FSK symbols.
3. Synthesize the audio: each symbol becomes a brief sine burst at the chosen frequency, smoothly windowed to avoid clicks.
4. Emit through the device speaker.
5. Receive through the listener's microphone.
6. Window the incoming audio into chunks matching the symbol length, run an FFT, identify which tone was loudest in each window.
7. Decode the symbol stream back into bits.
8. Unwrap the FEC layer, fix small errors, validate the payload.
9. Surface the result to the application.

Steps 1-4 are the encoder. Steps 5-9 are the decoder. The transport layer in between is the air in the room.
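
The core of that pipeline (encode, synthesize, window, detect, decode) fits in a short standard-library sketch. Several things here are assumptions rather than Beeping's real parameters: the four tone frequencies, the 20 ms symbol, and the function names. The preamble and FEC layers are left out, the receiver is assumed to be already aligned to symbol boundaries, and the Goertzel algorithm stands in for a full FFT because it measures energy at just the handful of frequencies we care about.

```python
import math

SAMPLE_RATE = 44_100
SYMBOL_SAMPLES = 882                                   # 20 ms per symbol (assumed)
TONES_HZ = [18_500.0, 19_500.0, 20_500.0, 21_500.0]    # 4 tones, 2 bits per symbol


def to_symbols(data: bytes) -> list[int]:
    """Split each byte into four 2-bit symbols, most significant pair first."""
    return [(byte >> shift) & 0b11 for byte in data for shift in (6, 4, 2, 0)]


def from_symbols(symbols: list[int]) -> bytes:
    """Inverse of to_symbols; a trailing partial byte is dropped."""
    out = bytearray()
    for i in range(0, len(symbols) - 3, 4):
        a, b, c, d = symbols[i:i + 4]
        out.append(a << 6 | b << 4 | c << 2 | d)
    return bytes(out)


def synthesize(symbols: list[int]) -> list[float]:
    """One Hann-windowed sine burst per symbol, so bursts start and end at zero."""
    samples = []
    for symbol in symbols:
        freq = TONES_HZ[symbol]
        for n in range(SYMBOL_SAMPLES):
            window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (SYMBOL_SAMPLES - 1))
            samples.append(window * math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def goertzel_power(chunk: list[float], freq_hz: float) -> float:
    """Signal energy at a single frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq_hz / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in chunk:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect(samples: list[float]) -> list[int]:
    """Cut the audio into symbol-length windows and pick the loudest tone in each."""
    symbols = []
    for start in range(0, len(samples) - SYMBOL_SAMPLES + 1, SYMBOL_SAMPLES):
        chunk = samples[start:start + SYMBOL_SAMPLES]
        powers = [goertzel_power(chunk, f) for f in TONES_HZ]
        symbols.append(powers.index(max(powers)))
    return symbols


if __name__ == "__main__":
    audio = synthesize(to_symbols(b"hi"))
    print(from_symbols(detect(audio)))
```

In a real exchange the list of samples would go to the speaker and come back from a microphone, and the preamble from step 1 is what lets the receiver find where the first symbol starts.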

Why this matters

Bluetooth, Wi-Fi and NFC all need pairing, network configuration or near-contact proximity. Sound just propagates. Anything within acoustic range (a TV, a phone in a pocket, a smart speaker on the counter) can participate without prior setup.

That's a different kind of primitive. It's not always the right tool, but when "no pairing, any device, broadcast-shaped" is the requirement, ultrasonic data exchange is hard to beat.

If you want to see it actually work end-to-end, the Commlink quest walks you through it in five minutes. Same algorithm, real devices, your own payload.