BPM Detection & Beat-Synced Visuals: A Music Video Maker's Guide
BPM detection is the difference between a music video that looks reactive and one that feels reactive. Frequency-band reactivity (bass → glow, mid → rotation) responds to whatever is playing right now; beat synchronization snaps visual events to the song's underlying grid. Strong music visualizers tend to combine the two: band reactivity for moment-to-moment energy, beat sync for structural punch.
This guide explains how automatic BPM detection works, how Shimga uses it, and what to do when the algorithm gets it wrong.
What "BPM Detection" Actually Means
Beats per minute. 120 BPM = one beat every 0.5 seconds. The detector's job is to look at a raw audio file and return:
- A BPM number (often a fractional value like 124.5 rather than a whole number).
- A list of beat timestamps — the actual times where beats fall.
Returning just the BPM isn't enough. If a song is 120 BPM, you still need to know whether beat one is at 0.04 seconds or 0.32 seconds. Different starting offsets produce visibly different beat-snapped visuals.
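To make the offset point concrete, here is a minimal sketch of how a beat grid could be derived from a BPM value and a first-beat time. The function name and signature are illustrative assumptions, not Shimga's API.

```typescript
// Hypothetical helper: builds beat timestamps (in seconds) from a tempo and
// the time of the first beat. Not taken from Shimga's code.
function buildBeatGrid(bpm: number, firstBeatSec: number, durationSec: number): number[] {
  if (!(bpm > 0)) throw new RangeError("bpm must be positive");
  const beatInterval = 60 / bpm; // seconds per beat: 120 BPM gives 0.5 s
  const beats: number[] = [];
  // Multiply by the beat index instead of accumulating, so rounding error
  // does not build up over a long song.
  for (let i = 0; ; i++) {
    const t = firstBeatSec + i * beatInterval;
    if (t > durationSec) break;
    beats.push(t);
  }
  return beats;
}

// Same tempo, different offsets: two grids that never line up.
const early = buildBeatGrid(120, 0.04, 2); // ≈ 0.04, 0.54, 1.04, 1.54
const late = buildBeatGrid(120, 0.32, 2); // ≈ 0.32, 0.82, 1.32, 1.82
```

Both grids have the same spacing, but every beat in the second lands 0.28 seconds after its counterpart in the first, which is easily visible on screen.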
How the Algorithm Works (the Short Version)
Most in-browser BPM detection algorithms follow roughly the same recipe, applied in order:
- Downsample audio from 44.1 kHz to ~22 kHz. Halves the work.
- Low-pass filter — keep only frequencies below ~200 Hz. Beats live in the bass (kicks, low toms).
- Onset detection — break the filtered audio into short frames, measure energy in each, and look for sudden jumps. Each jump is a potential beat.
- Autocorrelation — slide the onset-strength signal against itself at various lags (corresponding to 60-200 BPM). The lag that produces the highest correlation is the tempo.
- Parabolic interpolation — refine the integer-frame lag to a sub-frame value, giving you a more accurate BPM than the frame rate alone supports.
- Find the first beat — within the first ~2 beats of audio, pick the strongest onset. That's beat 1; the rest of the grid follows from BPM.
Total runtime: ~2-5 seconds for a 3-minute song.
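The recipe above can be sketched in a few dozen lines. This is a simplified illustration, not Shimga's implementation: it assumes the input is already mono and downsampled, uses a one-pole low-pass as a stand-in for a proper filter, and all names (estimateTempo, TempoEstimate, frameSize) are hypothetical. Like any bare autocorrelation, it can lock onto half or double the true tempo.

```typescript
interface TempoEstimate {
  bpm: number;
  firstBeatSec: number;
}

// Sketch: low-pass -> frame energy -> onset strength -> autocorrelation
// -> parabolic refinement -> first-beat pick.
function estimateTempo(samples: Float32Array, sampleRate: number, frameSize = 512): TempoEstimate {
  // One-pole low-pass with a ~200 Hz cutoff (assumed stand-in filter).
  const alpha = 1 - Math.exp((-2 * Math.PI * 200) / sampleRate);
  const frameCount = Math.floor(samples.length / frameSize);
  const energy = new Float32Array(frameCount);
  let lp = 0;
  for (let i = 0; i < frameCount * frameSize; i++) {
    lp += alpha * (samples[i] - lp);
    energy[Math.floor(i / frameSize)] += lp * lp;
  }

  // Onset strength: keep only upward jumps in energy between frames.
  const onset = new Float32Array(frameCount);
  for (let f = 1; f < frameCount; f++) {
    onset[f] = Math.max(0, energy[f] - energy[f - 1]);
  }

  // Autocorrelation over lags that correspond to 200 BPM down to 60 BPM.
  const framesPerSec = sampleRate / frameSize;
  const minLag = Math.max(1, Math.floor((60 / 200) * framesPerSec));
  const maxLag = Math.min(Math.ceil((60 / 60) * framesPerSec), frameCount - 1);
  const corr = (lag: number): number => {
    let sum = 0;
    for (let f = lag; f < frameCount; f++) sum += onset[f] * onset[f - lag];
    return sum;
  };
  let bestLag = minLag;
  let bestScore = -Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    const score = corr(lag);
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }

  // Parabolic interpolation through the peak and its two neighbours.
  const a = corr(bestLag - 1);
  const c = corr(bestLag + 1);
  const denom = a - 2 * bestScore + c;
  const shift = denom < 0 ? (0.5 * (a - c)) / denom : 0;
  const refinedLag = bestLag + shift;
  const bpm = (60 * framesPerSec) / refinedLag;

  // First beat: strongest onset within the first two beat periods.
  const searchEnd = Math.min(frameCount, Math.ceil(2 * refinedLag));
  let firstFrame = 0;
  for (let f = 1; f < searchEnd; f++) {
    if (onset[f] > onset[firstFrame]) firstFrame = f;
  }
  return { bpm, firstBeatSec: firstFrame / framesPerSec };
}
```

The first-beat time is only as precise as one frame (about 23 ms at 22.05 kHz with 512-sample frames), which is one reason a manual offset control is useful.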
Why It Sometimes Gets It Wrong
BPM detection is solved for "regular" music — drum-driven hip hop, EDM, rock, pop. It struggles on:
- Half-time / double-time confusion. A 75 BPM trap beat with a kick on every other downbeat will often be detected as 150 BPM, or vice versa. Both answers are "correct" in the sense that one tempo is exactly double the other and both grids line up with the audio, but only one matches the producer's intent.
- Free time / rubato. Acoustic ballads, classical music, ambient — anything without a steady kick — defeats the autocorrelation step. The detector returns a confident-looking number that's completely wrong.
- Heavy syncopation. Drum-and-bass with snares-on-the-and beats can fool onset detection.
- Unusual intros. If the first 5 seconds don't sound like the rest of the track (a vocal drop before any drums, for example), the first-beat finder can land on the wrong onset.
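For the half-time / double-time case, one common mitigation is to fold the detected tempo into a preferred range, or to offer the half and double readings as alternatives. The sketch below shows both; the function names and the 70 BPM default are assumptions chosen for illustration, and a range preference is only a guess at the producer's intent, not a way to recover it.

```typescript
// Hypothetical helper: the detected tempo plus its half- and double-time
// readings, limited to the range the detector searches (60-200 BPM).
function tempoCandidates(bpm: number, minBpm = 60, maxBpm = 200): number[] {
  return [bpm / 2, bpm, bpm * 2].filter((t) => t >= minBpm && t <= maxBpm);
}

// Hypothetical helper: fold a tempo by octaves into [lo, 2 * lo).
// The default lower bound of 70 BPM is an assumption, not a standard.
function foldTempo(bpm: number, lo = 70): number {
  if (!(bpm > 0) || !(lo > 0)) throw new RangeError("bpm and lo must be positive");
  let t = bpm;
  while (t < lo) t *= 2;
  while (t >= lo * 2) t /= 2;
  return t;
}

const folded = foldTempo(150); // 75: the trap example read as half-time
const options = tempoCandidates(150); // [75, 150]
```

Folding cannot help with free-time material: there the detected number is not an octave error but simply meaningless.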
How to Fix a Wrong BPM in Shimga
If the studio detects a wrong BPM:
- Open the timeline. Look at the beat-grid lines overlaid on the waveform.
- If the grid is exactly twice or half what it should be, manually edit the BPM in the timeline header (click the BPM badge). Halving 150 → 75 fixes the most common case.
- If the grid is correct in tempo but offset (every beat is 0.2 seconds too early), use the "First Beat Offset" field. Shimga shifts the entire grid by that amount.
- If the song genuinely has no steady tempo, turn off Beat Snap in the timeline and place reactive events manually using keyframes.
Using the Beat Grid Creatively
Once Shimga has a correct BPM, you can:
- Snap layer-add times to beats. Hold Shift while dragging a new layer onto the timeline — its start time snaps to the nearest beat.
- Step through beats with J / L keys. Skip backward/forward exactly one beat at a time. Way faster than scrubbing.
- Set keyframes on beats. Drop a keyframe, hit L to advance one beat, drop another. The interpolation between them now spans exactly one beat, which is about as musical as timing gets.
- Use the BPM for camera moves. Even on still-image backgrounds, a tiny scale pulse on every fourth beat (downbeat) feels like a documentary cut.
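Snapping and beat-stepping both reduce to simple arithmetic on the grid. The sketch below is an illustration of that arithmetic under assumed names (snapToBeat, stepBeat); how Shimga actually handles off-grid playhead positions is not documented here, so the "move to the adjacent grid line" behaviour is an assumption.

```typescript
// Hypothetical helper: nearest beat time to an arbitrary timeline position.
function snapToBeat(timeSec: number, bpm: number, firstBeatSec: number): number {
  const interval = 60 / bpm;
  const index = Math.max(0, Math.round((timeSec - firstBeatSec) / interval));
  return firstBeatSec + index * interval;
}

// Hypothetical helper: move to the next (direction 1) or previous
// (direction -1) beat. From an off-grid position this lands on the
// adjacent grid line; from an on-grid position it moves one full beat.
function stepBeat(timeSec: number, bpm: number, firstBeatSec: number, direction: 1 | -1): number {
  const interval = 60 / bpm;
  const pos = (timeSec - firstBeatSec) / interval; // position measured in beats
  const eps = 1e-6; // tolerance so that "on the beat" survives rounding error
  const index = direction === 1 ? Math.floor(pos + eps) + 1 : Math.ceil(pos - eps) - 1;
  return firstBeatSec + Math.max(0, index) * interval;
}

const snapped = snapToBeat(1.3, 120, 0.04); // ≈ 1.54
const nextBeat = stepBeat(snapped, 120, 0.04, 1); // ≈ 2.04
const prevBeat = stepBeat(snapped, 120, 0.04, -1); // ≈ 1.04
```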
Why Run BPM Detection in a Web Worker
Three minutes of mono audio at 44.1 kHz is ~7.9 million samples. The autocorrelation loop alone runs ~3 million iterations per BPM lag tested. Doing that on the main thread blocks the UI for several seconds: the user clicks "Upload Audio" and the studio freezes. Shimga runs it in a Web Worker, so the main thread keeps painting at 60 fps while detection happens in parallel. The result arrives via postMessage and updates the timeline live.
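A rough sketch of the worker pattern follows. The message shapes, the createWorkerHandler helper, and the file name are all hypothetical; the browser-only wiring is shown as comments so the logic itself stays runnable anywhere. The Worker constructor form with new URL and import.meta.url assumes a bundler or module-worker setup.

```typescript
// Sample count from the text: 180 s of mono audio at 44.1 kHz.
const samplesInThreeMinutes = 180 * 44100; // 7,938,000

// Hypothetical message shapes, not Shimga's actual protocol.
interface DetectRequest {
  samples: Float32Array;
  sampleRate: number;
}
interface DetectResult {
  bpm: number;
  firstBeatSec: number;
}
type Detector = (samples: Float32Array, sampleRate: number) => DetectResult;

// Builds the worker's message handler. The detector and the reply function
// are injected, which keeps the handler testable outside a real worker.
function createWorkerHandler(detect: Detector, post: (result: DetectResult) => void) {
  return (event: { data: DetectRequest }): void => {
    const { samples, sampleRate } = event.data;
    post(detect(samples, sampleRate));
  };
}

// Inside the worker file (browser only, myDetector is a placeholder):
//   self.onmessage = createWorkerHandler(myDetector, (r) => self.postMessage(r));
//
// On the main thread (browser only, updateTimeline is a placeholder):
//   const worker = new Worker(new URL("./bpm-worker.ts", import.meta.url), { type: "module" });
//   worker.onmessage = (e) => updateTimeline(e.data);
//   // Passing samples.buffer in the transfer list moves the audio to the
//   // worker instead of copying ~30 MB of samples.
//   worker.postMessage({ samples, sampleRate }, [samples.buffer]);
```

Transferring the buffer detaches it on the main thread, so keep a separate copy if the waveform display still needs the samples.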
What "Beat Sync" Looks Like in a Music Video
The cleanest beat sync is usually invisible. Examples:
- A particle burst that fires on every fourth beat (the downbeat). One every 2 seconds at 120 BPM. Not noticeable individually, but the eye learns to expect them.
- A scale pulse on every beat at 4% — too small to draw attention but enough to feel rhythmic.
- Lyric line transitions snapped to the start of the next measure. Lines don't pop in mid-bar; they appear on beat 1.
- Camera shake on a sub-bass drop (an explicit big-energy moment) only.
Avoid: a different visual event on every beat. That's strobing, not sync.
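The pulse and downbeat examples above can be driven from a single "where am I in the beat" calculation. This is a minimal sketch under assumptions: 4 beats per bar, the first detected beat treated as a downbeat, and a quadratic ease-out chosen arbitrarily. The names beatStateAt and pulseScale are hypothetical.

```typescript
interface BeatState {
  beatIndex: number; // 0 for the first beat of the grid
  phase: number; // 0 on the beat, approaching 1 just before the next one
  isDownbeat: boolean;
}

// Hypothetical helper: position within the beat grid at a given time.
// Assumes the first detected beat is beat 1 of a bar.
function beatStateAt(timeSec: number, bpm: number, firstBeatSec: number, beatsPerBar = 4): BeatState | null {
  const pos = (timeSec - firstBeatSec) / (60 / bpm);
  if (pos < 0) return null; // before the grid starts
  const beatIndex = Math.floor(pos);
  return { beatIndex, phase: pos - beatIndex, isDownbeat: beatIndex % beatsPerBar === 0 };
}

// Hypothetical helper: scale factor that jumps to 1 + amount on each beat
// and eases back to 1 before the next (4% by default, as in the example).
function pulseScale(timeSec: number, bpm: number, firstBeatSec: number, amount = 0.04): number {
  const state = beatStateAt(timeSec, bpm, firstBeatSec);
  if (state === null) return 1;
  const decay = (1 - state.phase) ** 2; // quadratic ease-out (arbitrary choice)
  return 1 + amount * decay;
}

const onDownbeat = beatStateAt(2.0, 120, 0); // beat index 4: a downbeat
const midBeatScale = pulseScale(2.25, 120, 0); // halfway through the beat
```

A renderer would call pulseScale once per frame for the subtle pulse and fire the particle burst only when isDownbeat flips to true, which keeps bursts two seconds apart at 120 BPM.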
Try beat-synced visuals now
Upload audio and the BPM appears in the timeline; beat snap is on by default.
Open Shimga Studio →