sadda.dsp¶

Pure-function DSP toolkit. Every function takes NumPy float32 audio and a sample rate, and returns NumPy arrays or dataclass results. No corpus dependency. Stability tier: STABLE.

sadda.dsp — foundational DSP toolkit.

Pure-function API over NumPy float32 arrays. Window functions, STFT, spectrogram, intensity, and the relocated f0 from Phase 0 all live here. Stability tier: STABLE (per the 2026-05-18 Python API surface DEVLOG entry).

The top-level sadda.f0 stays as a Phase-0 back-compat alias for the same function.

Source: python/sadda/dsp/__init__.py:1

FormantFrame ¶

One frame of formant output. Variable-length frequencies / bandwidths per frame — frames where the LPC root-finder didn't return enough valid roots in the formant range are honestly empty rather than NaN-padded.

Source: crates/python/src/lib.rs:3764

__doc__ `class-attribute` ¶

__doc__ = "One frame of formant output. Variable-length `frequencies` /\n`bandwidths` per frame — frames where the LPC root-finder didn't return\nenough valid roots in the formant range are honestly empty rather than\nNaN-padded."

__module__ `class-attribute` ¶

__module__ = 'sadda._native'

__sadda_stability__ `class-attribute` ¶

__sadda_stability__ = 'stable'

bandwidths `property` ¶

bandwidths

Bandwidths in Hz, co-indexed with frequencies.

frequencies `property` ¶

frequencies

Formant frequencies in Hz, ascending.

time_seconds `property` ¶

time_seconds

Time at the centre of the analysis frame, in seconds.

__repr__ `method descriptor` ¶

__repr__()

Return repr(self).

hann `builtin` ¶

hann(n)

Hann window of length N (the n argument): w[k] = 0.5 * (1 - cos(2π k / (N-1))) for sample index k = 0..N-1.

Source: crates/python/src/lib.rs:3587

hamming `builtin` ¶

hamming(n)

Hamming window of length N (the n argument): w[k] = 0.54 - 0.46 * cos(2π k / (N-1)) for sample index k = 0..N-1.

Source: crates/python/src/lib.rs:3594

blackman `builtin` ¶

blackman(n)

Blackman window of length N (the n argument): w[k] = 0.42 - 0.5*cos(2π k / (N-1)) + 0.08*cos(4π k / (N-1)) for sample index k = 0..N-1.

Source: crates/python/src/lib.rs:3602
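
The three cosine-sum windows above share one form and differ only in their coefficients. The following is a minimal pure-Python sketch of the documented formulas, for illustration only: `cosine_window` is a hypothetical helper, not part of sadda, and the length-1 convention is an assumption. The library functions themselves return NumPy arrays.

```python
import math


def cosine_window(length, a0, a1, a2=0.0):
    """Evaluate a0 - a1*cos(2*pi*k/(N-1)) + a2*cos(4*pi*k/(N-1)) for k = 0..N-1."""
    if length == 1:
        return [1.0]  # assumed convention; avoids dividing by N-1 = 0
    denom = length - 1
    return [
        a0
        - a1 * math.cos(2.0 * math.pi * k / denom)
        + a2 * math.cos(4.0 * math.pi * k / denom)
        for k in range(length)
    ]


# Coefficients taken from the formulas documented above.
hann_ref = cosine_window(5, 0.5, 0.5)
hamming_ref = cosine_window(5, 0.54, 0.46)
blackman_ref = cosine_window(5, 0.42, 0.5, 0.08)
```

With the N-1 denominator the windows are symmetric: the first and last samples are equal, and for odd lengths the middle sample is 1.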

gaussian `builtin` ¶

gaussian(n, sigma)

Gaussian window of length n with standard deviation sigma (in samples).

Source: crates/python/src/lib.rs:3609
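
A minimal sketch of a Gaussian window with sigma expressed in samples. It assumes the window is centred on the middle of the frame, at (n-1)/2, which the text above does not state. `gaussian_window` is a hypothetical name used for illustration.

```python
import math


def gaussian_window(length, sigma):
    """exp(-0.5 * ((k - centre) / sigma)**2) for k = 0..length-1."""
    centre = (length - 1) / 2.0  # assumption: symmetric about the frame centre
    return [math.exp(-0.5 * ((k - centre) / sigma) ** 2) for k in range(length)]


w_gauss = gaussian_window(7, 1.5)
```

A larger sigma gives a wider window in time and a narrower main lobe in frequency.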

kaiser `builtin` ¶

kaiser(n, beta)

Kaiser window of length n with shape parameter beta.

Source: crates/python/src/lib.rs:3616
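
A minimal sketch of the textbook Kaiser window, w[k] = I0(beta * sqrt(1 - (2k/(N-1) - 1)^2)) / I0(beta), where I0 is the zeroth-order modified Bessel function of the first kind. That sadda uses exactly this definition is an assumption. `bessel_i0` and `kaiser_window` are hypothetical helper names.

```python
import math


def bessel_i0(x, terms=40):
    """I0(x) from its power series: sum over m of ((x/2)**m / m!)**2."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (x / (2.0 * m)) ** 2
        total += term
    return total


def kaiser_window(length, beta):
    if length == 1:
        return [1.0]  # assumed convention for a single-sample window
    denom = length - 1
    norm = bessel_i0(beta)
    return [
        bessel_i0(beta * math.sqrt(1.0 - (2.0 * k / denom - 1.0) ** 2)) / norm
        for k in range(length)
    ]


w_rect = kaiser_window(5, 0.0)    # beta = 0 reduces to a rectangular window
w_kaiser = kaiser_window(5, 8.6)  # larger beta tapers the edges more strongly
```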

stft `builtin` ¶

stft(samples, frame_size, hop_size, *, window=None)

Short-time Fourier transform of a real-valued 1-D float32 signal.

Returns the complex spectrogram with shape (n_frames, n_freq_bins), where n_freq_bins = frame_size // 2 + 1 (the non-redundant half of the spectrum for real input). If window is omitted, a Hann window of length frame_size is used (matching the default window of scipy.signal.stft).

Source: crates/python/src/lib.rs:3629
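
The output shape can be illustrated with a naive pure-Python STFT. This is a sketch under stated assumptions, not the library's implementation: it applies no padding (so only fully covered frames are produced), it uses a symmetric Hann window by default, and `stft_sketch` is a hypothetical name.

```python
import cmath
import math


def stft_sketch(samples, frame_size, hop_size, window=None):
    if window is None:
        # Assumption: symmetric Hann, following the hann formula in this module.
        window = [
            0.5 * (1.0 - math.cos(2.0 * math.pi * k / (frame_size - 1)))
            for k in range(frame_size)
        ]
    n_freq_bins = frame_size // 2 + 1
    # Assumption: no padding, so a frame exists only where it fits entirely.
    if len(samples) >= frame_size:
        n_frames = 1 + (len(samples) - frame_size) // hop_size
    else:
        n_frames = 0
    frames = []
    for t in range(n_frames):
        start = t * hop_size
        chunk = [samples[start + k] * window[k] for k in range(frame_size)]
        # Direct DFT of the windowed frame, keeping bins 0..frame_size//2.
        frames.append([
            sum(
                chunk[k] * cmath.exp(-2j * math.pi * b * k / frame_size)
                for k in range(frame_size)
            )
            for b in range(n_freq_bins)
        ])
    return frames


# A cosine that completes exactly 4 cycles per 16-sample frame.
signal = [math.cos(2.0 * math.pi * 4 * n / 16) for n in range(64)]
X = stft_sketch(signal, frame_size=16, hop_size=8)
magnitudes = [abs(v) for v in X[0]]
```

Here `X` has 7 frames of 9 bins each, and the largest magnitude in every frame falls in bin 4.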

spectrogram `builtin` ¶

spectrogram(samples, frame_size, hop_size, *, window=None)

Power spectrogram of a real-valued signal: |X|² of the STFT, in shape (n_freq_bins, n_frames). If window is omitted, a Hann window of length frame_size is used.

Source: crates/python/src/lib.rs:3679
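
A minimal sketch of the |X|² step and the documented axis order. It takes STFT output laid out as (n_frames, n_freq_bins) and returns power laid out as (n_freq_bins, n_frames). `power_spectrogram` is a hypothetical helper operating on nested lists rather than NumPy arrays.

```python
def power_spectrogram(stft_frames):
    """stft_frames[frame][bin] (complex) -> power[bin][frame] (float)."""
    n_frames = len(stft_frames)
    n_bins = len(stft_frames[0]) if n_frames else 0
    return [
        [abs(stft_frames[t][b]) ** 2 for t in range(n_frames)]
        for b in range(n_bins)
    ]


stft_values = [[1 + 0j, 3 + 4j], [0 + 2j, 1 - 1j], [0j, 2 + 0j]]  # 3 frames x 2 bins
power = power_spectrogram(stft_values)  # 2 bins x 3 frames
```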

intensity `builtin` ¶

intensity(audio, *, frame_size_seconds=0.03, hop_seconds=0.01)

Per-frame intensity over an Audio: returns (times, rms, db_fs) as three NumPy arrays. times is float64 seconds at frame centres; rms is float32 linear amplitude; db_fs is float32 dB relative to digital full-scale (clamped to -200 dB on silence). dB-SPL (Praat convention) arrives in a later slice once microphone calibration is wired through.

Source: crates/python/src/lib.rs:3731
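
A minimal sketch of the per-frame RMS and dBFS computation with the -200 dB clamp. It assumes full scale corresponds to an amplitude of 1.0, so that dBFS = 20 * log10(rms); the exact reference level used by the library is not stated above. `frame_intensity` is a hypothetical helper.

```python
import math


def frame_intensity(frame, floor_db=-200.0):
    """Return (rms, db_fs) for one non-empty frame of samples."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    # log10(0) is undefined, so digital silence goes straight to the floor.
    db_fs = 20.0 * math.log10(rms) if rms > 0.0 else floor_db
    return rms, max(db_fs, floor_db)


rms_full, db_full = frame_intensity([1.0, -1.0, 1.0, -1.0])
rms_half, db_half = frame_intensity([0.5, -0.5, 0.5, -0.5])
rms_zero, db_zero = frame_intensity([0.0, 0.0, 0.0, 0.0])
```

Under this assumption, halving the amplitude lowers the level by about 6.02 dB, and silent frames report the floor instead of negative infinity.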

f0 `builtin` ¶

f0(audio, *, frame_size_seconds=0.03, hop_size_seconds=0.01, min_freq_hz=75.0, max_freq_hz=500.0)

Estimates f0 over an Audio via time-domain autocorrelation.

Returns (times, frequencies) as a 2-tuple of NumPy arrays: times is float64 in seconds, frequencies is float32 in Hz.

Source: crates/python/src/lib.rs:3563 · impl: crates/engine/src/pitch.rs:278
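
The idea behind time-domain autocorrelation pitch estimation can be sketched on a single frame: search the lags that correspond to the allowed frequency range and convert the best lag to Hz. This is an illustrative sketch only. It has no normalisation, peak interpolation, or voicing logic, and the implementation in pitch.rs may differ. `autocorr_f0` is a hypothetical name.

```python
import math


def autocorr_f0(frame, sample_rate, min_freq_hz=75.0, max_freq_hz=500.0):
    """Return the frequency whose lag maximises the raw autocorrelation."""
    min_lag = int(sample_rate / max_freq_hz)
    max_lag = min(int(sample_rate / min_freq_hz), len(frame) - 1)
    best_lag, best_r = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    # No positive correlation in range: report 0.0 (assumed convention).
    return sample_rate / best_lag if best_lag else 0.0


sample_rate = 8000
# One 30 ms frame (240 samples) of a 200 Hz sine.
frame = [math.sin(2.0 * math.pi * 200.0 * i / sample_rate) for i in range(240)]
estimate = autocorr_f0(frame, sample_rate)
```

For this frame the best lag is 40 samples, which corresponds to 8000 / 40 = 200 Hz.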

voiced_pitch ¶

voiced_pitch(audio, *, params: Optional[PitchParams] = None, frame_size_seconds: float = 0.03, hop_size_seconds: float = 0.01, min_freq_hz: float = 75.0, max_freq_hz: float = 500.0, method: str = 'boersma', voicing_threshold: float = 0.45)

Estimate f0 with a voicing decision; returns (times, frequencies, voicing) as three NumPy arrays.

Either pass method (one of autocorrelation, windowed_autocorrelation, boersma (default), yin, pyin, or swipe) with the common analysis keywords, or pass params= as a PitchParams (from a preset, optionally edited with .replace(...)). When params is given, it fully determines the computation and the other keywords are ignored.

Source: python/sadda/dsp/__init__.py:164
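
The precedence rule (params wins, keywords are ignored) can be sketched with a stand-in dataclass. Everything here is hypothetical: `PitchParamsSketch` and `resolve_params` are illustrative names, the field list is a guess based on the keywords above, and `dataclasses.replace` stands in for the `.replace(...)` method that the text attributes to PitchParams.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PitchParamsSketch:  # hypothetical stand-in, not sadda's PitchParams
    method: str = "boersma"
    min_freq_hz: float = 75.0
    max_freq_hz: float = 500.0
    voicing_threshold: float = 0.45


def resolve_params(params: Optional[PitchParamsSketch] = None, **keywords):
    """Mirror the documented rule: a given params object ignores keywords."""
    if params is not None:
        return params
    return PitchParamsSketch(**keywords)


preset = PitchParamsSketch(method="yin")
edited = replace(preset, max_freq_hz=400.0)  # copy with one field changed
from_params = resolve_params(params=edited, method="swipe", max_freq_hz=600.0)
from_keywords = resolve_params(method="pyin")
```

The same pattern applies wherever a function accepts both params= and individual analysis keywords.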

formants ¶

formants(audio, *, params: Optional[FormantsParams] = None, frame_size_seconds: float = 0.025, hop_seconds: float = 0.01, n_formants: int = 5, pre_emphasis: float = 0.97, lpc_order: Optional[int] = None, method: str = 'burg', max_bandwidth_hz: float = 1000.0, min_frequency_hz: float = 50.0)

Per-frame formants via LPC + root-finding; returns a list of FormantFrame.

Either pass method (burg (default) or autocorrelation) with the analysis keywords, or pass params= as a FormantsParams (from a preset, optionally edited with .replace(...)). When params is given, it fully determines the computation and the other keywords are ignored.

Source: python/sadda/dsp/__init__.py:78
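
After LPC root-finding, each complex root is usually converted to a frequency and a bandwidth and then filtered. The sketch below uses the standard conversions f = angle(z) * fs / (2π) and B = -ln|z| * fs / π. Whether sadda applies exactly these formulas and these inclusive thresholds is an assumption, and `root_to_formant`, `select_formants`, and `pole` are hypothetical helpers.

```python
import cmath
import math


def root_to_formant(root, sample_rate):
    freq = cmath.phase(root) * sample_rate / (2.0 * math.pi)
    bandwidth = -math.log(abs(root)) * sample_rate / math.pi
    return freq, bandwidth


def select_formants(roots, sample_rate, n_formants=5,
                    min_frequency_hz=50.0, max_bandwidth_hz=1000.0):
    kept = []
    for root in roots:
        if root.imag <= 0.0:
            continue  # keep one root from each complex-conjugate pair
        freq, bandwidth = root_to_formant(root, sample_rate)
        if freq >= min_frequency_hz and bandwidth <= max_bandwidth_hz:
            kept.append((freq, bandwidth))
    kept.sort()  # ascending frequency
    return kept[:n_formants]


def pole(freq, bandwidth, sample_rate):
    """Build a test root with a known frequency and bandwidth."""
    radius = math.exp(-math.pi * bandwidth / sample_rate)
    return cmath.rect(radius, 2.0 * math.pi * freq / sample_rate)


sample_rate = 10000
roots = [
    pole(1500.0, 120.0, sample_rate),
    pole(500.0, 80.0, sample_rate),
    pole(500.0, 80.0, sample_rate).conjugate(),  # mirror root, skipped
    pole(20.0, 50.0, sample_rate),               # below min_frequency_hz
    pole(2500.0, 3000.0, sample_rate),           # bandwidth too wide
]
formant_list = select_formants(roots, sample_rate)
no_formants = select_formants([pole(20.0, 50.0, sample_rate)], sample_rate)
```

When every root is rejected the result is an empty list, which matches the variable-length, unpadded frames described for FormantFrame.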

mfcc ¶

mfcc(audio, *, params: Optional[MfccParams] = None, frame_size_seconds: float = 0.025, hop_seconds: float = 0.01, n_mels: int = 40, n_mfcc: int = 13, f_min: float = 0.0, f_max: Optional[float] = None, method: str = 'librosa')

Mel-frequency cepstral coefficients, shape (n_frames, n_mfcc).

Two ways to specify the computation:

By named method (default): method is one of "librosa" (default), "kaldi", or "praat" — each a faithful reproduction of that reference (see MfccMethod). The other keyword args set the common analysis parameters.
By full parameter set: pass params= as an MfccParams (from a preset, optionally edited with .replace(...)). When params is given, it fully determines the computation and the method / n_mels / frame_size_seconds / … keywords are ignored.

f_max defaults to the Nyquist frequency (sample_rate / 2).

Source: python/sadda/dsp/__init__.py:248
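
The final stage of an MFCC pipeline is a discrete cosine transform of each frame of log-mel energies, keeping the first n_mfcc coefficients. The sketch below uses an orthonormal DCT-II on one frame. Scaling and liftering conventions differ between the librosa, Kaldi, and Praat references, so treat the normalisation as an assumption. `dct2_ortho` is a hypothetical helper.

```python
import math


def dct2_ortho(values, n_out):
    """First n_out coefficients of the orthonormal DCT-II of values."""
    n = len(values)
    out = []
    for k in range(n_out):
        total = sum(
            v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


log_mel_frame = [-3.0] * 40           # one frame with n_mels = 40 flat energies
coeffs = dct2_ortho(log_mel_frame, 13)  # n_mfcc = 13
```

A spectrally flat frame puts all of its energy into coefficient 0; the higher coefficients describe the shape of the spectral envelope.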

log_mel_whisper `builtin` ¶

log_mel_whisper(audio, *, n_fft=400, hop_length=160, n_mels=80, target_frames=None)

Whisper-exact log-mel spectrogram, shape (n_frames, n_mels).

Byte-faithful to OpenAI Whisper's encoder front end (Slaney mel, power STFT with a periodic Hann window, log10 + clamp, global dynamic-range floor, (+4)/4 normalisation). Expects 16 kHz mono for Whisper fidelity. target_frames pads/trims the audio so the result has exactly that many frames (Whisper uses 3000 for 30 s); None keeps the natural length.

Source: crates/python/src/lib.rs:4035
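
The log10 + clamp, dynamic-range floor, and (+4)/4 steps can be sketched on a small matrix of mel power values. The constants 1e-10 (clamp) and 8.0 (range below the global maximum) are assumed from Whisper's published reference front end and are not stated in the text above. `whisper_log_scale` is a hypothetical helper.

```python
import math


def whisper_log_scale(mel_power):
    """mel_power: rows of non-negative mel energies -> normalised log-mel."""
    # log10 with a clamp so that zero energy does not produce -infinity.
    log_spec = [[math.log10(max(v, 1e-10)) for v in row] for row in mel_power]
    # Global floor: nothing may sit more than 8 log10 units below the maximum.
    floor = max(max(row) for row in log_spec) - 8.0
    return [[(max(v, floor) + 4.0) / 4.0 for v in row] for row in log_spec]


scaled = whisper_log_scale([[1.0, 1e-4], [0.0, 1e-12]])

# Frame count for 30 s of 16 kHz audio at the default hop_length of 160.
frames_30s = 30 * 16000 // 160
```

In this example the loudest value maps to 1.0, and everything at or below the floor maps to -1.0.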
