Agent Evals Specialist

Coconut

Completely Remote | Full Time | Customer Service
Posted 2 days ago

Job description

Responsibilities

  • Review AI agent outputs side-by-side with source material to verify accuracy
  • Evaluate what the AI agent created, changed, or omitted
  • Score tasks using a rubric covering accuracy, coverage, organization, and rule adherence
  • Write detailed, specific feedback regarding mistakes to drive AI performance improvements

Requirements

  • Strong written English with the ability to read dense technical content for extended periods
  • High level of consistency in scoring and evaluation
  • Ability to provide clear, specific, and actionable feedback
  • Proficiency with Slack and internal platforms
  • Familiarity with Markdown

Preferred Qualifications

  • Prior experience as an AI trainer, tutor, or evaluator (e.g., Outlier, DataAnnotation, xAI, Surge, Mercor, Invisible, Toloka)
  • Background in technical writing, editing, QA, translation, paralegal work, or research assistance

Benefits

  • Competitive salary and 13th-month pay
  • 12 days of Paid Time Off (PTO) and 12 paid US holidays
  • Maternity and paternity leave
  • Comprehensive healthcare and life insurance
  • Mental health support and wellness resources
  • Milestone gifts, birthday treats, and team experiences like virtual town halls and regional meetups

About the Company

Coconut connects founders and businesses with top-tier remote talent while prioritizing meaningful, reliable partnerships. We believe success happens when everyone wins: clients achieve their goals, and our virtual professionals find purpose and growth in their work.

Skills & tools

AI | Quality Assurance

What the team is looking for

Use this list as a quick fit check before you apply.

  1. Strong written English
  2. Ability to read dense technical content
  3. Markdown familiarity
  4. Slack proficiency
  5. Attention to detail