Who We Are
Generate Biomedicines, Inc. is a Flagship-backed, privately held biotechnology company on a mission to reimagine drug discovery as a process of dynamic, data-driven generation. We pursue this audacious vision because we believe in the unique and revolutionary power of generative biology to radically transform the lives of billions, with an outsized opportunity for patients in need. Generate will succeed by constantly turning innovative ideas into methods, technologies, and products that solve some of the most difficult challenges in developing medicines. We are seeking collaborative, relentless problem solvers who share our passion for impact to join us!
Generate was founded by Flagship Pioneering. Flagship Pioneering conceives, creates, resources, and develops first-in-category life sciences companies to transform human health and sustainability. Since its launch in 2000, the firm has applied a unique hypothesis-driven innovation process to originate and foster more than 100 scientific ventures, resulting in over $30 billion in aggregate value. The current Flagship ecosystem comprises 37 transformative companies, including Moderna Therapeutics (NASDAQ: MRNA), Rubius Therapeutics (NASDAQ: RUBY), Indigo Agriculture, and Sana Biotechnology.
Position Summary
We are seeking a creative and motivated MLOps Data Engineer to build the machine learning platform required to achieve our ambitious goals. As part of the MLOps group, you will work across the stack to develop, test, deploy, and maintain ML-based software solutions. The successful candidate will work closely with ML scientists, computational biologists, and informatics/IT engineers to implement a scalable platform that rapidly advances our scientific programs.
Key responsibilities:
- Design, develop, and refine infrastructure for Generate Biomedicines’ ML platform, enabling rapid model development, training, and evaluation at scale
- Deploy and monitor cutting-edge ML models and algorithms developed by the ML team
- Implement large-scale MLOps and data ETL pipelines in a distributed computing environment
- Develop solutions for efficiently performing ML model experimentation and tuning
- Establish data engineering processes and best practices for data scientists using our ML platform
- Maintain awareness of commercial tools and innovations, and champion new solutions internally
Qualifications:
- 2+ years of experience working in a DevOps or data engineering role using cloud-based infrastructure such as AWS, GCP, or Microsoft Azure
- Well versed in machine learning fundamentals; experience with deep learning frameworks is desired
- Proficiency in Python and strong object-oriented design skills coupled with a solid understanding of data structures and algorithms
- Familiarity with workflow orchestration tools such as Airflow, Luigi, or Prefect is desired
- Familiarity with Kubernetes or other container orchestration tools in a production setting
- Demonstrated self-motivation and willingness to dive into complicated data engineering challenges
- Strong technical communication skills and the ability to work in a fast-paced environment