Machine Learning Researcher - Audio

Protege

Completely Remote · Full Time · Engineering & Architecture

Posted 1 month ago

This role is no longer accepting applications.

Job description

Responsibilities

Research audio data quality for machine learning by investigating how signal properties and dataset composition affect downstream model training.
Develop new metrics, benchmarks, and evaluation frameworks to measure audio quality in ways that predict ML model performance.
Characterize speech datasets by analyzing acoustic properties such as effective bandwidth, spectral energy, noise, and codec artifacts.
Build workflows for segment-level quality evaluation to detect localized degradation in diarized or segmented speech regions.
Design and run targeted evaluations connecting audio quality issues to downstream behaviors in ASR, TTS, and speaker modeling.
Translate research findings into reproducible filtering rules, quality gates, and scalable evaluation infrastructure.
Collaborate with ML researchers, data engineers, and operations teams to communicate the value of audio data assets.

Requirements

PhD, or a Master’s degree plus 4+ years of industry experience, in machine learning, audio signal processing, or speech technology.
Proven experience designing and running data evaluations, audio analyses, or benchmarks.
Strong understanding of speech/audio signal properties, including sampling rates, codecs, spectrograms, and perceptual quality.
Experience developing or evaluating metrics and measurement frameworks for ML systems or audio signal analysis.
Ability to connect low-level signal properties to downstream machine learning behavior and model robustness.
Proficiency in moving between research exploration and production implementation of scalable tools.
Excellent technical communication skills and a high degree of ownership.

Preferred Qualifications

Experience with ASR, TTS, speaker modeling, self-supervised speech models, or multimodal audio models.
Experience developing evaluation frameworks specifically for training data.
Publications or open-source contributions in speech, audio ML, or data-centric AI.
Experience studying the relationship between dataset quality and downstream model performance.

About the Company

Protege is building a platform to solve the biggest unmet need in AI: access to high-quality training data. We facilitate the secure, efficient, and privacy-centric exchange of AI training data, helping ambitious teams power their models with the best possible signals. We are a lean, fast-moving, high-trust team of builders obsessed with velocity and impact.

Skills & tools

Machine Learning, Audio Signal Processing, Speech Technology

What the team is looking for

01 PhD or Master's with 4+ years experience
02 Experience in ML, audio signal processing, or speech technology
03 Strong understanding of speech/audio signal properties
04 Experience designing data evaluations or benchmarks
05 Ability to implement research into scalable tools

Job details

Work model: Completely Remote
Commitment: Full Time
Category: Engineering & Architecture
Posted: 1 month ago
