Site Reliability Engineer


about 4 years old

This job is no longer active

Our Opportunity:

Chewy is looking to hire a Site Reliability Engineer at our Boston, MA location. Site Reliability Engineers are a cross between system and software engineers who are responsible for all operational aspects of Chewy’s e-commerce platform. The team is responsible for designing, building, monitoring, and maintaining the infrastructure of our internet-facing and internal services. We're looking for engineers who want to develop and maintain infrastructure software and scale Chewy’s technology stack. Come help us build a bigger and better Chewy as a Site Reliability Engineer. You will be part of a small family within Chewy that has a huge impact on our incredible growth. Ideal candidates will be able to discuss complex technical concepts with a diverse audience across all areas of the organization. They will remain calm under pressure and always strive to add structure to high-pressure, fast-paced tasks or projects.

What You'll Do:

  • Focus on service stability and reliability by working with application owners to set SLOs, error budgets, and backup and disaster recovery (DR) strategies
  • Define application monitoring and alerting strategy
  • Perform capacity planning and production readiness assessment
  • Embed with product teams during the design and requirements phase of new product development through to initial production launch
  • Identify requirements for other operational teams (release engineering, automation, etc.) during application development phase
  • Be a technology and DevOps evangelist for the rest of the company
  • Participate in an on-call rotation for level 3 support escalations

What You'll Need:

  • At least 5 years of experience working in an SRE role or similar
  • Hands-on experience with orchestration and system configuration tools such as Ansible, Puppet, Chef, Terraform, etc.
  • Expertise in building and maintaining highly available applications, including redundancy, failover, scalability, monitoring, and performance
  • Strong experience with virtualization, monitoring and automation
  • Software development experience (both scripting and “programming” languages)
  • Experience working with the open source community (troubleshooting, patch submission, etc.)
  • 5+ years of demonstrated Linux system administration experience
  • Experience with CI tools such as Bamboo, Jenkins, Hudson
  • Ability to organize, troubleshoot, and continuously learn
  • Previous experience working within controls such as SOX, PCI, etc.
  • This position may require travel

If you have a disability under the Americans with Disabilities Act or similar law, or you require a religious accommodation, and you wish to discuss potential accommodations related to applying for employment at our company, please contact [email protected].

Chewy’s Privacy Policy, which describes the information we collect from job applicants and how we use it, is available here: Chewy Privacy Policy (https://www.chewy.com/app/content/privacy).