sadda.ml¶

ONNX-model inference over audio — bundled Silero VAD plus a generic embedding-extraction harness for wav2vec2-style (waveform) and Whisper-style (log-mel) encoders. PROVISIONAL tier.

ONNX Runtime is loaded at runtime, not linked into the wheel. With pip install "sadda[ml]" the wheel auto-discovers the installed onnxruntime package at import time and sets ORT_DYLIB_PATH; the desktop-app bundles ship the runtime as a sidecar so it just works. Without ORT available, these calls raise a clear "ONNX Runtime not available" error rather than crashing — see the 2026-05-28 ORT-sidecar packaging DEVLOG entry.
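The discovery step described above can be sketched roughly as follows. This is an illustration only, not sadda's actual code: the helper names `find_runtime_library` and `configure_ort` and the shared-library filename patterns are assumptions. Only the `onnxruntime` package name and the `ORT_DYLIB_PATH` variable come from the text.

```python
import importlib.util
import os
from pathlib import Path
from typing import Optional

# Assumed filename patterns for the ONNX Runtime shared library per platform.
_LIB_PATTERNS = ("libonnxruntime*.so*", "libonnxruntime*.dylib", "onnxruntime*.dll")


def find_runtime_library(package: str = "onnxruntime") -> Optional[Path]:
    """Return a shared library found inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package not installed (or not a package directory)
    for location in spec.submodule_search_locations:
        for pattern in _LIB_PATTERNS:
            for candidate in sorted(Path(location).rglob(pattern)):
                return candidate  # first match wins
    return None


def configure_ort(package: str = "onnxruntime") -> bool:
    """Set ORT_DYLIB_PATH if a library is found; report whether ORT is usable."""
    if os.environ.get("ORT_DYLIB_PATH"):
        return True  # an explicit setting is left untouched
    library = find_runtime_library(package)
    if library is None:
        return False
    os.environ["ORT_DYLIB_PATH"] = str(library)
    return True
```

A caller could turn a `False` result into the "ONNX Runtime not available" error at call time instead of failing at import.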

Downloading models (`hf://`)¶

load_model("hf://<org>/<name>/<file>[@<rev>]") fetches a model from HuggingFace into the local cache and runs it (unverified passthrough — prefer a curated sadda/… id when one exists). pip install "sadda[download]" is the convenient install (it pulls ONNX Runtime so a downloaded model is immediately runnable).
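To make the id shape concrete, here is a minimal sketch of how an `hf://<org>/<name>/<file>[@<rev>]` string could be split into its parts. `HfRef` and `parse_hf_id` are hypothetical names, and the sketch assumes the file path itself never contains `@`; sadda's real parser may differ.

```python
from typing import NamedTuple, Optional


class HfRef(NamedTuple):
    org: str
    name: str
    file: str            # path of the file inside the repo, may contain "/"
    rev: Optional[str]   # None when no "@<rev>" suffix is given


def parse_hf_id(model_id: str) -> HfRef:
    """Split hf://<org>/<name>/<file>[@<rev>] into its components."""
    prefix = "hf://"
    if not model_id.startswith(prefix):
        raise ValueError(f"not an hf:// id: {model_id!r}")
    # Assumption: "@" only appears as the revision separator.
    body, sep, rev = model_id[len(prefix):].partition("@")
    parts = body.split("/", 2)  # org, name, then the rest is the file path
    if len(parts) != 3 or not all(parts) or (sep and not rev):
        raise ValueError(f"expected hf://<org>/<name>/<file>[@<rev>]: {model_id!r}")
    org, name, file = parts
    return HfRef(org, name, file, rev or None)
```

For example, `parse_hf_id("hf://onnx-community/silero-vad/onnx/model.onnx")` yields org `onnx-community`, name `silero-vad`, file `onnx/model.onnx`, and no revision.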

sadda never touches the network unless you opt in. The fetch is compiled into the wheel but stays dormant until you set the environment variable SADDA_ALLOW_NETWORK=1; without it, an hf:// cache miss raises a clear "network access is disabled" error. Cached models and local:// / sadda/… ids always work offline. Authenticate to private or gated repos with HF_TOKEN. The desktop app does not compile this in — the GUI is network-free by construction.

import os
os.environ["SADDA_ALLOW_NETWORK"] = "1"      # explicit opt-in

import sadda.ml
m = sadda.ml.load_model("hf://onnx-community/silero-vad/onnx/model.onnx")
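The opt-in rule (cached models always resolve, a cache miss needs `SADDA_ALLOW_NETWORK=1`) can be illustrated with a small stand-alone sketch. Everything except the environment variable name and the wording of the error is hypothetical: `NetworkDisabledError`, `resolve`, the dict-based cache, and the `fetch` callback are stand-ins rather than sadda's API.

```python
import os
from typing import Callable, Mapping, MutableMapping, Optional


class NetworkDisabledError(RuntimeError):
    """Raised on a cache miss when the user has not opted in to network use."""


def network_allowed(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when SADDA_ALLOW_NETWORK is exactly "1"."""
    env = os.environ if env is None else env
    return env.get("SADDA_ALLOW_NETWORK") == "1"


def resolve(
    model_id: str,
    cache: MutableMapping[str, str],
    fetch: Callable[[str], str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a cached path, fetching only if the environment opts in."""
    if model_id in cache:
        return cache[model_id]  # cache hits never touch the network
    if not network_allowed(env):
        raise NetworkDisabledError(
            "network access is disabled; set SADDA_ALLOW_NETWORK=1 to allow downloads"
        )
    cache[model_id] = fetch(model_id)
    return cache[model_id]
```

Checking the cache before the environment variable is what keeps offline use of already-downloaded models working without any opt-in.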

Voice activity detection (bundled)¶

vad ¶

vad(audio, *, model_path: Optional[str] = None)

Run Silero VAD over audio.

Returns (times, speech_probs) as NumPy arrays — one entry per ~32 ms window (the audio is mono-mixed and resampled to 16 kHz). Uses the bundled model unless model_path points at another ONNX VAD model. Raises if ONNX Runtime isn't available.

Source: python/sadda/ml/__init__.py:155
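As a rough picture of the returned time axis, the sketch below assumes 512-sample windows (512 / 16000 = 0.032 s, which matches the "~32 ms" figure), that each time is a window start, and that a trailing partial window is dropped. None of these three details is confirmed by the text, and `mix_to_mono` and `window_times` are hypothetical helpers, not sadda functions.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000    # Hz, from the text
WINDOW_SAMPLES = 512    # assumption: 512 / 16000 = 0.032 s per window


def mix_to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame (one inner sequence per sample frame)."""
    return [sum(frame) / len(frame) for frame in frames]


def window_times(
    n_samples: int,
    sample_rate: int = SAMPLE_RATE,
    window_samples: int = WINDOW_SAMPLES,
) -> List[float]:
    """Start time in seconds of each full window over n_samples of audio."""
    n_windows = n_samples // window_samples  # assumption: partial tail is dropped
    return [i * window_samples / sample_rate for i in range(n_windows)]
```

Under these assumptions one second of 16 kHz audio gives 31 windows, so `times` and `speech_probs` would each have 31 entries.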

speech_segments ¶

speech_segments(audio, *, threshold: float = 0.5, model_path: Optional[str] = None)

Speech regions in audio as (start_seconds, end_seconds).

Runs vad, then merges consecutive windows whose probability is >= threshold. Uses the bundled model unless model_path is given.

Source: python/sadda/ml/__init__.py:167
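The merge step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's implementation: `merge_windows` is a hypothetical name, each time is taken to be a window start, and a segment is taken to end one window length after the start of its last window.

```python
from typing import List, Sequence, Tuple


def merge_windows(
    times: Sequence[float],
    probs: Sequence[float],
    window_s: float,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive windows with prob >= threshold into segments."""
    segments: List[Tuple[float, float]] = []
    start = None  # start time of the run currently being built, if any
    end = 0.0
    for t, p in zip(times, probs):
        if p >= threshold:
            if start is None:
                start = t           # a new run begins at this window
            end = t + window_s      # extend the run to the end of this window
        elif start is not None:
            segments.append((start, end))  # run ended at the previous window
            start = None
    if start is not None:
        segments.append((start, end))      # close a run that reaches the end
    return segments
```

With half-second windows, times `[0.0, 0.5, 1.0, 1.5, 2.0]` and probabilities `[0.1, 0.5, 0.9, 0.2, 0.7]` merge into `[(0.5, 1.5), (2.0, 2.5)]`. A probability exactly equal to the threshold counts as speech.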

Model resolution + embeddings¶

load_model ¶

load_model(id)

Resolve a model by id, returning a Model.

id is one of: "sadda/<name>[@version]" (curated registry, falling back to the bundled set), "local://<path>" (a model directory with a model.toml, or a bare model file), or "hf://<org>/<name>/<file>[@<rev>]" (unverified HuggingFace passthrough; a cache miss requires SADDA_ALLOW_NETWORK=1). The returned model exposes .vad(audio) and .embeddings(audio), plus .id / .version / .kind / .title / .license / .weights_checksum metadata.

Source: python/sadda/ml/__init__.py:126
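The three id forms can be told apart by prefix. The sketch below shows one way to do that; `classify_model_id` is a hypothetical helper, the returned tuple shape is an assumption, and the version string in the example is made up for illustration.

```python
from typing import Optional, Tuple


def classify_model_id(model_id: str) -> Tuple[str, str, Optional[str]]:
    """Return (scheme, target, version) for a sadda/, local:// or hf:// id."""
    if model_id.startswith("local://"):
        # Target is a filesystem path: a model directory or a bare model file.
        return ("local", model_id[len("local://"):], None)
    if model_id.startswith("hf://"):
        # Target keeps any "@<rev>" suffix; splitting it is left to the fetcher.
        return ("hf", model_id[len("hf://"):], None)
    if model_id.startswith("sadda/"):
        name, _, version = model_id.partition("@")
        return ("registry", name, version or None)
    raise ValueError(f"unrecognised model id: {model_id!r}")
```

For instance, `classify_model_id("sadda/silero-vad@1.0")` returns `("registry", "sadda/silero-vad", "1.0")`, while the same id without a suffix returns `None` for the version.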

install_model ¶

install_model(src_dir, *, root=None)

Install a model directory (a model.toml + its files) into the store by copying it in. This is how the bundled set seeds the cache and where a fetched model lands. Returns the installed Model.

Source: python/sadda/ml/__init__.py:140

get_model ¶

get_model(id, version, *, root=None)

Returns the model with this id and version from the store (the per-user cache by default, or an explicit root), or None if no such model is installed.

Source: python/sadda/ml/__init__.py:148
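The copy-in and look-up behaviour of the two store functions can be pictured with the following sketch. The on-disk layout `<root>/<id>/<version>/` is purely an assumption made for illustration, as are the helper names `install_dir` and `lookup`; the real functions read the id and version from model.toml and return Model objects rather than paths.

```python
import shutil
from pathlib import Path
from typing import Optional


def install_dir(src_dir: Path, root: Path, model_id: str, version: str) -> Path:
    """Copy a model directory into the store and return its new location."""
    src_dir = Path(src_dir)
    if not (src_dir / "model.toml").is_file():
        raise FileNotFoundError(f"{src_dir} has no model.toml")
    # Hypothetical layout: <root>/<id>/<version>/ (ids may contain "/").
    dest = Path(root) / model_id / version
    shutil.copytree(src_dir, dest, dirs_exist_ok=True)  # copy, never move
    return dest


def lookup(root: Path, model_id: str, version: str) -> Optional[Path]:
    """Return the installed directory for id + version, or None if absent."""
    dest = Path(root) / model_id / version
    return dest if (dest / "model.toml").is_file() else None
```

Because installation copies rather than moves, the source directory stays intact, and passing an explicit root makes it easy to exercise the store against a temporary directory.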

Model ¶

A model resolved from the registry by load_model.

Source: crates/python/src/ml.rs:59

__doc__ `class-attribute` ¶

__doc__ = 'A model resolved from the registry by [`load_model`].'

__module__ `class-attribute` ¶

__module__ = 'sadda._native.ml'

id `property` ¶

id

Resolvable id (e.g. "sadda/silero-vad").

Source: crates/python/src/ml.rs:67

kind `property` ¶

kind

Model kind (vad, embedding, …).

Source: crates/python/src/ml.rs:77

license `property` ¶

license

SPDX license id, if declared.

Source: crates/python/src/ml.rs:87

title `property` ¶

title

Human-readable title.

Source: crates/python/src/ml.rs:82

version `property` ¶

version

Version.

Source: crates/python/src/ml.rs:72

weights_checksum `property` ¶

weights_checksum

Weights checksum (sha256:…), if declared.

Source: crates/python/src/ml.rs:92
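A declared checksum can be compared against the weights file with the standard library alone. The sketch assumes the value is the lowercase hex SHA-256 digest of the file prefixed with `sha256:`; the text only shows the `sha256:…` prefix, so the encoding is an assumption, and `sha256_tag` and `weights_match` are hypothetical helpers.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_tag(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks and return it as "sha256:<hex digest>"."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large weight files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def weights_match(path: Union[str, Path], declared: str) -> bool:
    """Compare a file's digest with a declared checksum, ignoring hex case."""
    return sha256_tag(path) == declared.strip().lower()
```

A typical use would be `weights_match(weights_file, model.weights_checksum)` after checking that a checksum was declared at all, where `weights_file` is the path to the model file on disk.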

__repr__ `method descriptor` ¶

__repr__()

Return repr(self).

embeddings `method descriptor` ¶

embeddings(audio)

Runs this model as an embedding extractor over audio, returning a (frames, dims) float64 NumPy array. The input is shaped per the model's declared representation (waveform / log_mel). Errors unless ONNX Runtime is available.

Source: crates/python/src/ml.rs:114

vad `method descriptor` ¶

vad(audio)

Runs this model as a VAD over audio → (times, speech_probs). Errors unless it's a vad model and ONNX Runtime is available.

Source: crates/python/src/ml.rs:99
