
Senior Site Reliability Engineer
PlayOn
Completely Remote | Full Time | Engineering & Architecture
Posted Today
Job description
Responsibilities
- Contribute to system observability by implementing and improving metrics, alerting, and dashboards.
- Develop automation, tooling, and monitoring solutions to support high service availability.
- Partner with application and quality engineering teams to implement best practices in reliability and release automation.
- Drive operational excellence through proactive incident prevention, blameless postmortems, and capacity planning.
- Participate in on-call rotations to support critical services and ensure rapid response to incidents.
- Define SLIs and SLOs for core user flows to align the team on performance and availability standards.
Requirements
- Solid experience in Python for automation, tooling, and data-driven operational tasks.
- Proficiency in at least one of the following: Java, C++, or Go.
- Strong understanding of Linux systems and cloud infrastructure (AWS, GCP, or Azure).
- Experience with modern deployment practices including Docker, Kubernetes, and Terraform.
- Proficiency with CI/CD pipelines, version control, and automated testing frameworks.
- Experience with observability tools such as Prometheus, Grafana, ELK, or Datadog.
- Proven ability to translate Critical User Journeys into actionable SLA/SLO metrics.
Preferred Qualifications
- Experience writing or maintaining end-to-end or integration tests for distributed systems.
- Background in performance testing, capacity planning, or chaos engineering.
- Contributions to internal developer tooling or reliability-focused frameworks.
- Familiarity with AI-augmented development tools like Claude or Codex.
Benefits
- Multiple medical insurance plans, plus dental, vision, life, and disability insurance.
- Company equity (stock options).
- Open PTO policy.
- 401(k) plan with company match.
- Employee Emergency Fund.
About the Company
PlayOn is where high school sports come to life. Through GoFan, NFHS Network, and MaxPreps, we provide fans, parents, and communities with the technology to stay connected to the moments that matter most in high school athletics. Backed by KKR, we build the infrastructure that powers ticketing, streaming, and fundraising for schools nationwide.
Skills & tools
Python, AWS, Kubernetes, Terraform, Docker, Prometheus, Grafana
What the team is looking for
Use this list as a quick fit check before you apply.
- 01 Python proficiency
- 02 Java, C++, or Go proficiency
- 03 Linux systems expertise
- 04 Cloud infrastructure (AWS, GCP, or Azure)
- 05 Docker and Kubernetes
- 06 Terraform
- 07 CI/CD pipelines
- 08 Observability tools (Prometheus, Grafana, ELK, or Datadog)
- 09 SLA/SLO definition

Job details
- Work model: Completely Remote
- Commitment: Full Time
- Category: Engineering & Architecture
- Posted: Today