Senior DevOps Engineer (Prometheus/Grafana)

See more jobs from Binance

about 1 month old

Responsibilities:

  • Design, implement, and manage comprehensive monitoring solutions to ensure the high availability and performance of our microservices infrastructure and applications.
  • Utilize advanced monitoring tools and scripting to automate the monitoring of our cloud environments, focusing on AWS.
  • Develop and maintain robust logging and alerting mechanisms to identify and mitigate potential issues proactively.
  • Collaborate with the infrastructure team to integrate monitoring solutions into the CI/CD pipeline, ensuring seamless deployments and operations.
  • Conduct performance analysis, capacity planning, and scalability testing to ensure our systems meet current and future demands.
  • Lead incident response and troubleshooting efforts, utilizing monitoring data to quickly resolve operational issues.

Requirements:

  • Minimum of 5 years of hands-on experience with Kubernetes, Elasticsearch, Prometheus, Grafana, and AWS, with a strong emphasis on monitoring and observability in cloud-native environments.
  • Proficiency in programming languages (such as Python, Go, or Rust) for automating monitoring tasks.
  • Experience with infrastructure as code (IaC) tools and a strong understanding of CI/CD principles, including experience with Docker and Kubernetes for container orchestration.
  • Deep knowledge of monitoring tools (such as Prometheus, Grafana, or the ELK stack) and monitoring strategies for large-scale environments.
  • Proven track record in managing and troubleshooting large-scale distributed systems, with an emphasis on performance tuning and optimization.
  • Excellent problem-solving skills, with a focus on delivering high-quality, reliable, and scalable infrastructure solutions.
  • Strong communication and teamwork skills, with the ability to work effectively in a fast-paced, collaborative environment.