Why SETI candidates appear as slanted lines in spectrograms, and how integration along those lines pulls weak signals out of noise.
Prerequisite: Foundations (spectrograms, noise, basic signal concepts)
This chapter explains why narrowband SETI candidates appear as slanted lines in spectrograms, and how a de-Doppler search integrates power along those lines to pull weak signals out of noise. You already know what a spectrogram is: power as a function of time (rows or columns) and frequency (the other axis). Here we connect motion, frequency shift, and the brute-force algorithm that MitraSETI-style pipelines refine later with faster structures.
When an ambulance approaches you, the siren sounds higher in pitch; after it passes, it sounds lower. The siren's mechanical vibration frequency at the vehicle is unchanged. What changes is how often the compressed air peaks reach your ear: motion along the line of sight stretches or squeezes the spacing between wave crests that actually arrive.
That is the Doppler effect: observed frequency depends on relative radial velocity (motion toward or away from you), not only on what the source emits.
For speeds much less than the wave speed c (sound in air, or the speed of light for radio), a good linear approximation is:
fobserved = fsource × (1 − v / c)
Here v is the radial velocity, positive if the source is receding and negative if approaching (sign conventions vary; with the opposite convention the formula reads 1 + v / c). Approaching motion increases observed frequency (blueshift for light); receding motion decreases it (redshift).
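As a quick numerical check of the low-speed approximation, here is a minimal Python sketch. It assumes the convention that v is positive for a receding source, so the factor is (1 − v / c); the function name and the 1.42 GHz carrier are illustrative choices, not values taken from any pipeline.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def observed_frequency(f_source_hz, v_radial_ms, wave_speed_ms=C_LIGHT):
    """Low-speed Doppler approximation; v_radial_ms > 0 means the source recedes."""
    return f_source_hz * (1.0 - v_radial_ms / wave_speed_ms)


f_src = 1.42e9  # Hz, an illustrative carrier frequency (assumption)
shift_receding = observed_frequency(f_src, +465.0) - f_src
shift_approaching = observed_frequency(f_src, -465.0) - f_src
print(f"receding at 465 m/s:    {shift_receding:+.1f} Hz")
print(f"approaching at 465 m/s: {shift_approaching:+.1f} Hz")
```

A radial speed of 465 m/s shifts this carrier by roughly 2.2 kHz in either direction, which is enormous compared with the Hz-scale channels used in narrowband searches.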
Electromagnetic waves obey the same idea. A transmitter on a spacecraft, a radar echo, or a hypothetical beacon on an exoplanet all have a rest-frame frequency. The antenna on Earth measures a different frequency when source and observer move relative to each other along the line of sight.
SETI often imagines a narrow carrier or comb of lines at some fsource. Everything that changes the radial velocity between that source and Earth changes fobserved over time.
For software engineers, a useful mental model is resampling: radial motion acts like a slowly varying rate mismatch between the source clock and the receiver clock along the line of sight. You do not need general relativity to build intuition for a single night of observing; classical Doppler plus known ephemerides gets you most of the way. Precision work (comparing candidates across days) eventually pulls in barycentric corrections: you express frequencies in a frame tied to the solar system's center of mass so that Earth's own motion does not masquerade as an intrinsic source drift.
A fixed frequency in the source's frame is not fixed at the telescope unless every relative motion is constant—and it never is for long.
A point on the equator moves at roughly 465 m/s due to Earth's spin. That velocity vector projects onto the line of sight to a star or galaxy and changes as Earth turns. So the radial velocity toward a distant source drifts throughout a night.
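Earth orbits the Sun at about 30 km/s. Over weeks and months this orbital motion changes the absolute Doppler offset far more than rotation does, although within a single night rotation produces the larger line-of-sight acceleration and therefore the larger drift rate.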
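To put rough numbers on these two contributions, the sketch below treats each motion as uniform circular motion and takes the worst case in which the full centripetal acceleration a = v² / r lies along the line of sight, giving a drift of about f × a / c. The radii, the 1.42 GHz carrier, and the function name are illustrative assumptions; real geometry (latitude, declination, hour angle) only reduces these bounds.

```python
C_LIGHT = 299_792_458.0     # m/s
EARTH_RADIUS_M = 6.378e6    # equatorial radius (approximate)
ORBIT_RADIUS_M = 1.496e11   # roughly 1 au


def max_drift_hz_per_s(f_hz, speed_ms, radius_m):
    """Upper bound on Doppler drift from circular motion: f * (v^2 / r) / c."""
    centripetal_accel = speed_ms ** 2 / radius_m
    return f_hz * centripetal_accel / C_LIGHT


f_obs = 1.42e9  # Hz, illustrative observing frequency (assumption)
rotation = max_drift_hz_per_s(f_obs, 465.0, EARTH_RADIUS_M)
orbit = max_drift_hz_per_s(f_obs, 30_000.0, ORBIT_RADIUS_M)
print(f"rotation bound: {rotation:.3f} Hz/s")
print(f"orbit bound:    {orbit:.3f} Hz/s")
```

Under these assumptions the rotation bound comes out several times larger than the orbital one, because drift depends on acceleration rather than speed.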
If the "beacon" sits on a planet, that planet orbits its star with an unknown orbital speed and phase. The star may have its own motion. All of these add vectorially to what the telescope sees.
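Over a single recording (seconds to minutes), the cleanest first-order model is often a constant drift rate:
f(t) ≈ f0 + d × t
where f0 is the observed frequency at the start of the recording and d is the drift rate in Hz/s.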
Typical magnitudes discussed for habitable-zone contexts are often in the ballpark of ±0.01 Hz/s to ±4 Hz/s, depending on band, duration, geometry, and whether you include only Earth rotation or full barycentric corrections. The exact number matters less for intuition than the fact that drift is normal for a celestial narrowband line.
Over short snippets (a few seconds), drift_rate × duration may be smaller than one channel width, so the line appears to stay in a single channel until you zoom out or use finer resolution. Over tens of seconds to minutes, the same Hz/s accumulates into many channels of walk, which is exactly the regime where de-Doppler integration pays off.
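The comparison between drift_rate × duration and channel width is easy to tabulate. In this small sketch the 0.2 Hz/s drift and the 3 Hz channel width are made-up round numbers chosen only to show the two regimes.

```python
def channels_walked(drift_hz_per_s, duration_s, channel_width_hz):
    """How many channel widths a constant drift crosses during a recording."""
    return abs(drift_hz_per_s) * duration_s / channel_width_hz


DRIFT = 0.2          # Hz/s, illustrative
CHANNEL_WIDTH = 3.0  # Hz, illustrative

for duration in (5.0, 60.0, 300.0):
    walk = channels_walked(DRIFT, duration, CHANNEL_WIDTH)
    regime = "stays within one channel" if walk < 1.0 else "walks across channels"
    print(f"{duration:6.0f} s -> {walk:6.2f} channels ({regime})")
```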
If the transmitter and receiver share the same rotating, orbiting frame—like a terrestrial interferer fixed to Earth's surface—there is no differential Doppler between source and dish. The line stays at one frequency (aside from equipment drift, which is usually slow).
A candidate that shows zero drift (a ridge that stays in one frequency channel for the whole recording) is therefore more consistent with RFI than with a distant celestial beacon, though it is not proof by itself.
Think of the spectrogram as a time–frequency plane: time runs horizontally, frequency runs vertically (or the axes may be swapped in software—the geometry is the same).
A constant drift rate draws a straight line through that plane.
Higher frequency is "up"; time advances to the right.
Positive drift (frequency increases with time): line slopes up and right.
Negative drift (frequency decreases with time): line slopes down and right.
Zero drift: horizontal line (same channel over time), often a flag for terrestrial RFI in SETI-style reasoning.
In any single time step, a weak carrier spreads over a few bins and sits near the noise floor. You might not see a convincing peak. The eye catches structure only when many time steps are viewed together—and even then, the diagonal smear can be faint.
If you sum (or average coherently in more advanced setups) power along the correct trajectory, energy from the signal adds constructively along that path, while noise tends to average down. That is the core idea of de-Doppler search: integrate along candidate lines in the time–frequency plane.
Real pipelines work on bins, not continuous lines. Picture a toy spectrogram in which rows are frequency channels (0 at bottom) and columns are time; · marks a noise-dominated cell and * marks a cell the weak signal passes through.
The trajectory is the set of (time, channel) pairs the integrator visits for a trial d starting at channel 1 at t = 0. Interpolation (linear, sinc, or nearest-neighbor) decides how to read values between exact bin centers when d does not land on integers. Nearest-neighbor is fast but can bias scores; better pipelines use sub-bin interpolation so drift is not artificially quantized worse than the instrument already is.
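A toy grid of this kind can be generated with a few lines of Python. This is only a sketch: the drift of 0.5 channels per time step, the grid size, and the helper names are assumptions, and the path uses nearest-neighbour rounding (round half up) rather than sub-bin interpolation.

```python
import math


def trajectory(start_channel, drift_ch_per_step, n_steps):
    """Nearest-neighbour (time, channel) pairs for a constant drift."""
    return [(t, math.floor(start_channel + drift_ch_per_step * t + 0.5))
            for t in range(n_steps)]


def render(n_channels, n_steps, cells):
    """ASCII grid: rows are channels with 0 at the bottom, columns are time."""
    hit = set(cells)
    rows = []
    for ch in range(n_channels - 1, -1, -1):
        rows.append(" ".join("*" if (t, ch) in hit else "·"
                             for t in range(n_steps)))
    return "\n".join(rows)


path = trajectory(start_channel=1, drift_ch_per_step=0.5, n_steps=8)
grid = render(n_channels=6, n_steps=8, cells=path)
print(grid)
```

Because the drift is half a channel per step, the rounded path stays in most channels for two consecutive time steps, which is the quantization effect that sub-bin interpolation is meant to soften.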
Goal: try a grid of trial drift rates d and starting frequency channels f, integrate power along each corresponding line, and mark high signal-to-noise trajectories as detections.
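Raw dynamic range is harsh. A common robust preprocessing step rescales each value x using the median and MAD of the surrounding data:
z = (x − median) / (1.4826 × MAD)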
MAD is the median absolute deviation; the factor 1.4826 makes it comparable to standard deviation for normal noise. Exact choices vary by implementation, but the intent is the same: stabilize comparisons before summing along paths.
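A minimal sketch of median/MAD normalization using only the standard library is shown here. Which block of data the statistics are taken over (per channel, per time step, or per sub-band) is an implementation choice, so the function simply takes a flat list; the function name is hypothetical.

```python
import statistics


def robust_normalize(values):
    """Rescale values to (x - median) / (1.4826 * MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero; cannot normalize a constant block")
    return [(v - med) / scale for v in values]


raw = [1.0, 2.0, 3.0, 4.0, 100.0]  # one bright outlier among ordinary values
z = robust_normalize(raw)
print([round(v, 2) for v in z])
```

The outlier barely affects the median or the MAD, so it ends up with a very large normalized value while the ordinary samples stay within a couple of units of zero. A mean and standard deviation computed on the same data would be dragged toward the outlier.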
Choose a list of drift rates d spanning the physically plausible range (symmetric positive and negative).
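For a discrete spectrogram, each trial pair (f, d) defines one path that visits a single (possibly fractional) channel per time step.
Concretely, if k(t) is the (possibly fractional) channel index at time step t:
k(t) = f + d × t
with f the starting channel and d converted to channels per time step.
(Adjust signs if your convention defines positive drift as decreasing frequency.) At each t, sample the spectrogram at k(t), e.g. by linear interpolation between floor(k) and ceil(k), to get a value x(t). Accumulate:
S = x(0) + x(1) + … + x(Nt − 1)
then form a detection statistic such as S / √Nt or a variant that also accounts for per-step weights.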
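The single-path integration can be sketched as follows. The assumptions are labeled in the code: the spectrogram is a list of per-time-step spectra indexed as spectrogram[t][channel], the path is k(t) = start_channel + drift × t with drift in channels per time step, and a path that walks off the band is simply truncated. The function names are hypothetical.

```python
import math


def sample_linear(spectrum, k):
    """Linearly interpolate one spectrum at fractional channel k (None if off band)."""
    lo = math.floor(k)
    frac = k - lo
    hi = lo + 1 if frac > 0 else lo
    if lo < 0 or hi >= len(spectrum):
        return None
    return (1.0 - frac) * spectrum[lo] + frac * spectrum[hi]


def path_score(spectrogram, start_channel, drift_ch_per_step):
    """Sum along k(t) = start_channel + drift * t and return S / sqrt(Nt)."""
    total, n_used = 0.0, 0
    for t, spectrum in enumerate(spectrogram):
        x = sample_linear(spectrum, start_channel + drift_ch_per_step * t)
        if x is None:  # assumption: stop once the path leaves the band
            break
        total += x
        n_used += 1
    return total / math.sqrt(n_used) if n_used else 0.0


# Noise-free toy input: a line starting at channel 1 that climbs one channel per step.
spec = [[0.0] * 6 for _ in range(4)]
for t in range(4):
    spec[t][1 + t] = 1.0

aligned = path_score(spec, start_channel=1, drift_ch_per_step=1.0)
misaligned = path_score(spec, start_channel=1, drift_ch_per_step=0.0)
print(aligned, misaligned)
```

The aligned trial collects the signal at every step, while the zero-drift trial from the same starting channel only catches it once.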
If the integrated score exceeds a threshold, record (f, d, t_span, SNR, …) as a candidate.
The inner loop is embarrassingly parallel over f and d, but memory bandwidth and cache locality dominate at scale: you sweep the spectrogram many times unless you restructure access, which is what the Taylor tree in the next chapter does.
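Put together, the naive search is three nested loops. The sketch below is a toy under stated assumptions, not any pipeline's actual implementation: it uses nearest-neighbour sampling, skips paths that leave the band, assumes noise already normalized to unit variance, and runs on synthetic Gaussian noise with one injected line. All names and parameter values are illustrative.

```python
import math
import random


def brute_force_search(spectrogram, drifts_ch_per_step, threshold):
    """Score every (start channel, drift) path; return those above threshold."""
    n_steps = len(spectrogram)
    n_channels = len(spectrogram[0])
    hits = []
    for start in range(n_channels):            # trial starting channel f
        for drift in drifts_ch_per_step:       # trial drift rate d
            total = 0.0
            in_band = True
            for t in range(n_steps):           # walk the path through time
                ch = math.floor(start + drift * t + 0.5)
                if not 0 <= ch < n_channels:
                    in_band = False
                    break
                total += spectrogram[t][ch]
            if in_band:
                score = total / math.sqrt(n_steps)
                if score > threshold:
                    hits.append((start, drift, score))
    return hits


random.seed(0)
N_STEPS, N_CHANNELS = 16, 64
spec = [[random.gauss(0.0, 1.0) for _ in range(N_CHANNELS)]
        for _ in range(N_STEPS)]
for t in range(N_STEPS):
    spec[t][5 + t] += 4.0  # injected line: start channel 5, +1 channel per step

drifts = [i * 0.5 for i in range(-4, 5)]  # -2.0 ... +2.0 channels per step
hits = brute_force_search(spec, drifts, threshold=8.0)
best = max(hits, key=lambda h: h[2])
print(best)
```

The three loops are independent across start and drift, which is what makes the search easy to parallelize, but every trial re-reads the same spectrogram cells from scratch.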
Nearby detections in (frequency, drift) space are clustered or non-max suppressed so one physical line does not yield hundreds of duplicate hits.
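One simple way to do this is greedy non-maximum suppression: keep the strongest hit, discard weaker hits that fall within a small window around it, and repeat. The sketch below assumes hits are (channel, drift, score) tuples; the window sizes and the function name are illustrative, and real pipelines may cluster differently.

```python
def suppress_duplicates(hits, channel_radius, drift_radius):
    """Greedy non-max suppression over (channel, drift, score) tuples."""
    kept = []
    for hit in sorted(hits, key=lambda h: h[2], reverse=True):
        channel, drift, _ = hit
        # Keep the hit only if it is outside the window of every stronger kept hit.
        if all(abs(channel - k[0]) > channel_radius
               or abs(drift - k[1]) > drift_radius
               for k in kept):
            kept.append(hit)
    return kept


raw_hits = [
    (100, 0.5, 20.0),   # strongest hit on one physical line
    (101, 0.5, 12.0),   # neighbouring channel, same drift
    (100, 0.6, 11.0),   # same channel, neighbouring drift
    (400, -1.0, 9.0),   # a separate line elsewhere in the band
]
kept = suppress_duplicates(raw_hits, channel_radius=2, drift_radius=0.2)
print(kept)
```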
A single bright RFI burst can also create aliases at wrong drifts if the model is imperfect; conservative pipelines therefore combine de-Doppler scores with RFI masks, kurtosis gates, or multiple observations of the same sky location. This chapter stays focused on the geometry of drift; later chapters cover how MitraSETI decides which high-SNR blobs survive scrutiny.
Rough operation count: Nf × Nd × Nt inner-loop contributions (starting channels × drift trials × time steps).
For Breakthrough Listen–style resolutions, orders of magnitude can look like Nf = 1,048,576, Nt = 16, Nd ≈ 300, which is on the order of 5 × 10^9 inner-loop contributions per file for the naive triply nested structure. That is slow at scale, which motivates algorithmic acceleration (the next chapter).
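The arithmetic behind that estimate is a single product of the three loop sizes quoted above:

```python
n_f = 1_048_576  # frequency channels
n_t = 16         # time steps
n_d = 300        # drift trials (approximate)

inner_loop_ops = n_f * n_t * n_d
print(f"{inner_loop_ops:,} contributions (~{inner_loop_ops:.1e})")
```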
Voyager 1 transmits near 8.4 GHz (band-dependent; exact channel matters for plotting). From Earth, the smooth drift attributed to Earth's rotation is often quoted at roughly 0.287 Hz/s for such geometry (illustrative; always compute for your epoch and pointing).
On a spectrogram, Voyager does not look like a bright stripe pinned to one channel. It is a faint diagonal whose slope encodes that drift.
You can reproduce the lesson on any stable narrowband transmitter whose line-of-sight velocity changes smoothly—satellites, planetary spacecraft, or calibrated lab sources—provided your spectrogram cadence and resolution resolve the drift across Nt.
The pedagogical point: the signal is a line in 2D, not a point in 1D. Searching only per-channel FFTs without drift matching leaves most astrophysical or deep-space narrowband energy under-integrated.
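Let:
Δf = frequency resolution (Hz per channel)
Δt = time per spectrum (s)
Nt = number of time steps
T = Nt × Δt = total duration (s)
A natural drift step in Hz/s that moves the signal by one channel over the full duration is:
drift_step = Δf / T
That sets the grid spacing in drift rate: finer steps cost more Nd and more compute.
The maximum drift you care about in Hz/s, call it max_drift_rate, maps to a span in channels of roughly:
span_channels ≈ max_drift_rate × T / Δf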
(Up to factors of order unity depending on whether you use half-steps or full channel quantization.) This is how you connect physical Hz/s limits to how many drift trials you need and how wide a frequency walk you must allow per trajectory so you do not "lose" the line off the edge of the band.
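These relations turn directly into a drift grid. The sketch assumes the drift step is one channel width divided by the total duration and that the grid is symmetric about zero. The resolution values are round numbers picked so the arithmetic is exact, not the settings of any particular instrument, and the function name is hypothetical.

```python
import math


def drift_grid(channel_width_hz, step_time_s, n_steps, max_drift_rate):
    """Return (drift step in Hz/s, list of trial drifts, channel span at max drift)."""
    duration = n_steps * step_time_s
    drift_step = channel_width_hz / duration        # one channel over the full span
    n_side = math.ceil(max_drift_rate / drift_step)  # trials on each side of zero
    trials = [i * drift_step for i in range(-n_side, n_side + 1)]
    span_channels = max_drift_rate * duration / channel_width_hz
    return drift_step, trials, span_channels


step, trials, span = drift_grid(channel_width_hz=2.0, step_time_s=16.0,
                                n_steps=16, max_drift_rate=4.0)
print(f"drift step: {step} Hz/s")
print(f"number of trials: {len(trials)}")
print(f"channel span at max drift: {span}")
```

Widening the drift range or lengthening the recording both increase the number of trials, and the channel span tells you how much margin a path needs at the band edges.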
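Assume per-time-step noise contributions are roughly independent with zero mean and similar variance σ². If you sum Nt of them, the variances add, so the noise sum has standard deviation σ × √Nt, while a signal that adds the same amount A at every step grows as A × Nt.
So define something proportional to:
score = S / (σ × √Nt)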
Then noise-only paths have scores of order 1, while a real aligned narrowband ridge produces a larger positive outlier. That is why brute-force de-Doppler works: it separates coherent accumulation along the correct line from incoherent wandering of noise.
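A small Monte Carlo experiment makes the √Nt scaling concrete. It assumes independent unit-variance Gaussian noise per step and a weak constant signal of 0.5 per step; both are illustrative choices, and the function name is hypothetical.

```python
import math
import random
import statistics


def summed_noise_std(n_steps, n_trials, rng):
    """Empirical standard deviation of sums of n_steps unit-variance noise samples."""
    sums = [sum(rng.gauss(0.0, 1.0) for _ in range(n_steps))
            for _ in range(n_trials)]
    return statistics.pstdev(sums)


rng = random.Random(42)
SIGNAL_PER_STEP = 0.5  # well below the per-step noise level of 1.0

for n_steps in (4, 16, 64):
    noise_std = summed_noise_std(n_steps, 4000, rng)
    signal_sum = SIGNAL_PER_STEP * n_steps
    print(f"Nt={n_steps:3d}  noise std={noise_std:5.2f}  "
          f"sqrt(Nt)={math.sqrt(n_steps):5.2f}  "
          f"signal/noise={signal_sum / noise_std:5.2f}")
```

The measured noise spread tracks √Nt while the summed signal grows linearly, so the ratio improves by about a factor of two each time Nt is quadrupled.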
Equivalently, you may see implementations that use a matched filter or normalized cross-correlation viewpoint; the √(Nt) factor is the same statistical normalization in disguise.
If steps are not perfectly independent (spectra overlap in time, or preprocessing introduces correlation), the effective scaling deviates slightly from √(Nt). Pipeline designers calibrate thresholds on real data or simulations so false-alarm rates stay under control. The concept—coherent vs incoherent growth—remains the reason line integration is a principled detector for drifting narrowband energy.
The nested loops recompute trajectories that overlap heavily. Two adjacent trial drift rates d and d + ε visit almost the same set of pixels, shifted slightly. A naive implementation re-sums from scratch every time, throwing away shared work.
The Taylor tree (next chapter) exploits that redundancy by organizing partial sums so that families of drifts share intermediate results, driving complexity down toward O(N log N)-style scaling in the number of drift hypotheses instead of a flat O(Nd × Nt × Nf) blow-up.
Picture two nearby drifts d and d′ = d + δ. For early time steps, the two paths through the grid share the same integer cells or differ by at most one bin. Brute force re-reads those cells for every d; a tree-based method stores partial sums along time and reuses them when only the tail of the trajectory diverges. That is the same algorithmic story as dynamic programming or prefix structures: pay once, query many related hypotheses.
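The overlap between neighbouring trials can be measured directly. This sketch counts how many time steps two nearest-neighbour paths with the same starting channel share. The drift values are illustrative, and this only quantifies the redundancy; it does not implement the tree itself.

```python
import math


def path_cells(start, drift_ch_per_step, n_steps):
    """Channel visited at each time step, using round-half-up to the nearest bin."""
    return [math.floor(start + drift_ch_per_step * t + 0.5)
            for t in range(n_steps)]


def shared_fraction(drift_a, drift_b, n_steps, start=0):
    """Fraction of time steps at which two paths land in the same cell."""
    a = path_cells(start, drift_a, n_steps)
    b = path_cells(start, drift_b, n_steps)
    return sum(x == y for x, y in zip(a, b)) / n_steps


N_STEPS = 16
d = 0.5                      # channels per step
d_next = d + 1.0 / N_STEPS   # next trial: one extra channel over the full span
overlap = shared_fraction(d, d_next, N_STEPS)
print(f"shared cells: {overlap:.0%}")
```

In this example the two paths are identical for the first half of the recording and still coincide at several later steps, so a brute-force loop reads most of those cells twice.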
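You should leave this chapter with three anchors:
Relative motion makes a celestial narrowband signal drift, so it traces a slanted line in the time–frequency plane rather than sitting in one channel.
Summing power along the correct line grows the signal in proportion to Nt while noise grows only as √Nt, which is what lifts weak signals above the noise.
Brute-force search over every (f, d) pair costs on the order of Nf × Nd × Nt operations, and the heavy overlap between neighboring paths is the redundancy the Taylor tree removes.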
When you are ready, open 03 – Taylor Tree Algorithm for the efficient engine.