
Agent Evals Specialist
Coconut
Completely Remote | Full Time | Customer Service
Posted 2 days ago
Job description
Responsibilities
- Review AI agent outputs side-by-side with source material to verify accuracy
- Evaluate what the AI agent created, changed, or omitted
- Score tasks using a rubric covering accuracy, coverage, organization, and rule adherence
- Write detailed, specific feedback regarding mistakes to drive AI performance improvements
Requirements
- Strong written English with the ability to read dense technical content for extended periods
- High level of consistency in scoring and evaluation
- Ability to provide clear, specific, and actionable feedback
- Proficiency with Slack and internal platforms
- Familiarity with Markdown
Preferred Qualifications
- Prior experience as an AI trainer, tutor, or evaluator (e.g., Outlier, DataAnnotation, xAI, Surge, Mercor, Invisible, Toloka)
- Background in technical writing, editing, QA, translation, paralegal work, or research assistance
Benefits
- Competitive salary and 13th-month pay
- 12 days of Paid Time Off (PTO) and 12 paid US holidays
- Maternity and paternity leave
- Comprehensive healthcare and life insurance
- Mental health support and wellness resources
- Milestone gifts, birthday treats, and team experiences like virtual town halls and regional meetups
About the Company
Coconut connects founders and businesses with top-tier remote talent while prioritizing meaningful, reliable partnerships. We believe success happens when everyone wins: clients achieve their goals, and our virtual professionals find purpose and growth in their work.
Skills & tools
AI, Quality Assurance
What the team is looking for
Use this list as a quick fit check before you apply.
- Strong written English
- Ability to read dense technical content
- Markdown familiarity
- Slack proficiency
- Attention to detail

Job details
- Work model: Completely Remote
- Commitment: Full Time
- Category: Customer Service
- Posted: 2 days ago