
Staff Site Reliability Engineer
Blink Health
Job description
Responsibilities
- Establish and evolve SRE best practices across the organization, including reliability principles, error budgets, incident response, postmortems, and operational readiness standards.
- Define and drive observability strategy for system health, performance, and reliability, including SLIs/SLOs, alerting quality, dashboards, and service health indicators.
- Design and implement software-driven solutions within the infrastructure domain, automating manual processes and eliminating operational complexity and toil.
- Act as a technical leader and force multiplier, helping set priorities and influencing decision-making across core cloud infrastructure, reliability tooling, and platform architecture.
- Take ownership of large, ambiguous initiatives, driving them from concept to delivery while aligning stakeholders across engineering, security, and product.
- Combine deep knowledge of software development, infrastructure, and security to improve platform resilience, scalability, performance, and compliance.
- Proactively identify systemic risks and reliability gaps, recommending and leading platform upgrades and architectural improvements before they become incidents.
- Partner with engineering teams to improve developer workflows, tooling, and operational maturity, increasing productivity while reducing cognitive load.
- Provide technical mentorship, architecture guidance, and high-quality design and code reviews for engineers across infrastructure and product teams.
- Lead by example in documentation and knowledge sharing, ensuring systems and processes are well-understood and not dependent on individual ownership.
- Participate in and help mature incident response, escalation practices, and post-incident learning across the organization.
Requirements
- Bachelor’s or Master’s degree in Computer Science or equivalent practical experience.
- 7+ years of experience in site reliability engineering, infrastructure engineering, or platform engineering roles, with demonstrated impact at scale.
Preferred Qualifications
- Expert-level, methodical troubleshooting across the entire stack, from application to kernel to network.
- Strong command-line proficiency and deep expertise in Linux systems and operating system fundamentals.
- Advanced understanding of networking concepts including load balancing, proxies, DNS, TCP/IP, NAT, and service-to-service communication.
- Experience working across multiple languages (e.g., Python, Go, Bash) and troubleshooting application stacks built on frameworks such as React.
- Strong track record of automating repetitive and complex operational work to reduce toil and increase reliability.
- Ability to design and build internal tools (Python or Go) that standardize and scale engineering practices.
- Comfortable operating in an agile environment, with disciplined testing and quality practices.
- Deep experience with cloud platforms (AWS preferred, GCP/Azure acceptable), particularly managed services and production-grade architectures.
- Strong expertise in Kubernetes and container orchestration (EKS, Helm), including lifecycle management and operational best practices.
- Proven experience designing and implementing observability systems, including metrics, logging, tracing, dashboards, and alerting.
- Deep understanding of container technologies, security scanning, secrets management, dynamic configuration, and microservices architectures.
- Familiarity with service meshes and advanced traffic management concepts.
- Experience designing and maintaining company-wide IaC codebases using tools such as Terraform, Pulumi, CloudFormation, or Ansible.
- Ability to think holistically about infrastructure design, cost, reliability, security, and long-term maintainability.
About the Company
Blink Health is the fastest-growing healthcare technology company that builds products to make prescriptions accessible and affordable to everybody. Our two primary products – BlinkRx and Quick Save – remove traditional roadblocks within the current prescription supply chain, resulting in better access to critical medications and improved health outcomes for patients.
BlinkRx is the world’s first pharma-to-patient cloud that offers a digital concierge service for patients who are prescribed branded medications. Patients benefit from transparent low prices, free home delivery, and world-class support on this first-of-its-kind centralized platform. With BlinkRx, never again will a patient show up at the pharmacy only to discover that they can’t afford their medication, their doctor needs to fill out a form for them, or the pharmacy doesn’t have the medication in stock.
We are a highly collaborative team of builders and operators who invent new ways of working in an industry that historically has resisted innovation. Join us!
Skills & tools
What the team is looking for
Use this list as a quick fit check before you apply.
- Bachelor's Degree
- 7+ Years Experience

Job details
- Work model: Completely Remote
- Commitment: Full Time
- Category: Information Technology
- Posted: 1 month ago