Stryker
Staff Site Reliability Engineer
Description
**What You Will Do**
+ Own and maintain highly available production systems, lead incident response (P1/P2), conduct RCA/PIRs, and drive improvements to reliability, performance, and operational excellence.
+ Design, build, and manage scalable cloud infrastructure on AWS using Terraform, with strong ownership of Kubernetes (EKS), networking, security, and platform resilience.
+ Develop and optimize CI/CD and GitOps pipelines using GitLab CI and ArgoCD, while automating operational processes to improve efficiency and consistency.
+ Manage observability and on-call operations through tools such as PagerDuty/Zenduty, Prometheus, Grafana, ELK, and Datadog, ensuring actionable monitoring and effective alert management.
+ Collaborate with global engineering, security, and product teams; contribute to cloud architecture and compliance initiatives (SOC2, ISO27001); create operational documentation; and mentor team members.
**What You Will Need**
**Required...