Staff/Principal Site Reliability Engineer, ASA (Remote Eligible)

This job is no longer active

We are looking for an experienced engineer to help design and build the next generation of infrastructure to support Okta's Advanced Server Access product line. This is an exciting opportunity to help build something special from the ground up and to create infrastructure that will scale our business over the coming years.

Job Duties and Responsibilities

  • Build an application hosting platform that makes deploying and running our software painless and reliable.
  • Create tooling and automation that maximize the effectiveness of our Engineering teams and minimize manual effort.
  • Drive initiatives to evolve our current platform, increasing efficiency and keeping it in line with current standards and best practices.
  • Respond to production incidents and determine how we can prevent them in the future.
  • Support a 24x7 online environment as part of an on-call rotation.
  • Be a subject matter expert and partner with our team at Amazon Web Services (AWS).
  • Develop and maintain technical documentation, runbooks, and procedures.

Qualifications for the role:

  • You must possess a deep Linux systems engineering background with strong TCP/IP networking knowledge.
  • You must have experience building infrastructure with code that is deployed and updated as part of a CI/CD pipeline.
  • The ideal candidate has at least six years of experience supporting large-scale production workloads, including 3+ years of experience working with infrastructure built in AWS (or other comparable providers).
  • You should have experience building infrastructure on Kubernetes. The ideal candidate has at least 3 years of experience building and scaling production Kubernetes clusters on Amazon EKS.
  • You have real-world experience running a modern web stack in production, including HTTP tiers such as HAProxy or Envoy, application tiers that run as Docker workloads, and data storage tiers utilizing PostgreSQL, ElastiCache, DynamoDB, etc.
  • You have exposure to PCI, FedRAMP, SOC2, or other compliance programs.
  • You have 3+ years of experience automating systems and infrastructure via Terraform, Ansible, or Chef.
  • You can write good-quality code in at least one modern programming language and use Git for source control. We use Go primarily.
  • You have excellent written and oral communication skills, with the ability to influence others.

Okta is rethinking the traditional work environment, providing our employees with the flexibility to be their most creative and successful versions of themselves, no matter where they are located. We enable a flexible approach to work, meaning that for roles where it makes sense, you can work from the office or from home, regardless of where you live. Okta invests in the best technologies and provides flexible benefits and collaborative work environments and experiences, empowering employees to work productively in the setting that best suits their needs. Find your place at Okta: https://www.okta.com/company/careers/.

 

Okta is an Equal Opportunity Employer