Agent Evals Specialist

Coconut

Completely Remote, Full Time, Customer Service

Posted 1 month ago

This role is no longer accepting applications.

Job description

Responsibilities

Review AI agent outputs side-by-side with source material to verify accuracy
Evaluate what the AI agent created, changed, or omitted
Score tasks using a rubric covering accuracy, coverage, organization, and rule adherence
Write detailed, specific feedback regarding mistakes to drive AI performance improvements

Requirements

Strong written English with the ability to read dense technical content for extended periods
High level of consistency in scoring and evaluation
Ability to provide clear, specific, and actionable feedback
Proficiency with Slack and internal platforms
Familiarity with Markdown

Preferred Qualifications

Prior experience as an AI trainer, tutor, or evaluator (e.g., Outlier, DataAnnotation, xAI, Surge, Mercor, Invisible, Toloka)
Background in technical writing, editing, QA, translation, paralegal work, or research assistance

Benefits

Competitive salary and 13th-month pay
12 days of Paid Time Off (PTO) and 12 paid US holidays
Maternity and paternity leave
Comprehensive healthcare and life insurance
Mental health support and wellness resources
Milestone gifts, birthday treats, and team experiences such as virtual town halls and regional meetups

About the Company

Coconut connects founders and businesses with top-tier remote talent while prioritizing meaningful, reliable partnerships. We believe success happens when everyone wins: clients achieve their goals, and our virtual professionals find purpose and growth in their work.

Skills & tools

AI, Quality Assurance

What the team is looking for

Use this list as a quick fit check before you apply.

1. Strong written English
2. Ability to read dense technical content
3. Markdown familiarity
4. Slack proficiency
5. Attention to detail

Job details

Work model: Completely Remote
Commitment: Full Time
Category: Customer Service
Posted: 1 month ago