Senior Site Reliability Engineer


about 2 years old

This job is no longer active

About us

LogicMonitor is the leading fully automated, cloud-based infrastructure monitoring and observability platform for enterprise IT and managed service providers.

We love going to work and think you should too. We are customer-obsessed, work as one agile team, and strive to be better every day while building trust. These are our core values. So it's no surprise that we work hard and genuinely have fun working with each other as we expand our global presence and achieve record-breaking success.

This position can be remote, offering you the flexibility to work out of your home full-time. You'll have easy access to and support from your manager and frequent video meetings to keep you plugged into your team. If you are traveling to the area, we invite you to take advantage of our space if you would like to work in an office environment.

LogicMonitor is an equal opportunity employer. We’re committed to creating an inclusive environment for all our employees, where different backgrounds and perspectives are valued and encouraged - regardless of race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation. We encourage all people to come as they are.

We operate with integrity, esteem diversity, and treat each other fairly and with respect. We strive to find our own versions of personal and professional harmony through community building and holistic growth. We hear time and time again that our awesome people are a huge part of why LMers chose LogicMonitor, love their teams, and choose to stay.

To learn more about life at LogicMonitor, check out our Careers Page.

What you'll do

This is a once-in-a-lifetime opportunity to be part of an organization with an outstanding product, operation, and culture. We are seeking an experienced Senior Site Reliability Engineer who is ready to advance to the next level. Take a leading role in the reliability and continued expansion of the LogicMonitor platform. Manage a global network of hybrid cloud computing services. Provide guidance in organizing, securing, and automating our systems. Work with developers to drive operational improvements within our platform and increase our reliability. This role provides plenty of opportunity to make your mark at LogicMonitor.

Here's a closer look at the role

  • Maintain uptime of LogicMonitor's SaaS-based service and drive technical/process enhancements to improve uptime
  • Design and deploy new infrastructures
  • Write code to automate various aspects of infrastructure maintenance and deployments
  • Support Development and work closely with developers to drive operational and architecture/design changes
  • Own, manage, and execute large and technically complex projects across teams
  • Lead by example in providing good documentation and thorough runbooks

What you'll need

  • 4+ years of experience working as an SRE or in a systems administration role
  • Configuration management tools such as Puppet, Chef, or Ansible
  • Virtualization and container technologies (Docker, Kubernetes, etc.)
  • Programming and scripting (Python/shell/Go)
  • Source code management tools (Git)
  • Service Oriented Architecture and High Availability systems


Residents of California, click here to view our California Applicant Privacy Notice.