Senior Site Reliability Engineer, Developer Productivity
CoreWeave is a specialized cloud provider focused on GPU-accelerated use cases including VFX, AI/ML, Batch Processing and Real Time Experiences. We support countless AI/ML services in the text-to-image, NLP and broader AI/ML space, reducing clients’ infrastructure management requirements with our Kubernetes-based serverless GPU cloud offerings.
Job Description
About the Role:
The Developer Productivity Team functions as the lubricant that keeps CoreWeave’s gears of innovation turning fast and friction-free. This team is responsible for the development, integration, and operation of platforms central to the engineering experience, with the ultimate objective of enabling engineers across CoreWeave to do more, and to do it better. Central to the Developer Productivity Team’s mission is standardizing the SDLC and software development processes across the company, ensuring a seamless and efficient experience for all of CoreWeave’s engineers. This role offers a unique opportunity to work at large scale and make a tangible impact on developer efficiency and productivity.
Engineers on this team discover and remove friction across CoreWeave’s engineering teams through the development of boilerplate, integrations, and automation, and through the operation of shared platforms.
We are seeking a Site Reliability Engineer who can help us execute on the mission of making developers’ lives easier. This individual will work with a team of 6-8 mixed-specialization engineers and have the opportunity to work on the full gamut of rewarding challenges that come with the business of building a cloud in a communicative, supportive, and high-performing environment. As a member of the Developer Productivity Team, you would have the opportunity to:
- Design and implement services and tools to reduce friction and toil in the lives of our engineering and operations teams.
- Streamline repetitive tasks and eliminate bottlenecks to improve development velocity with automated workflows and processes.
- Partner with developers to understand their pain points and develop tailored solutions that enhance their productivity.
- Champion best practices and advocate for new tools and technologies to drive ongoing productivity gains.
- Tackle complex issues related to build systems, testing frameworks, code analysis, and other developer tooling.
- Enable and evangelize the practice of reliability engineering across CoreWeave’s engineering teams.
- Grow, change, invest in your teammates, be invested-in, share your ideas, listen to others, be curious, have fun, and most importantly, be yourself.
Wondering if you’re a good fit? We believe in investing in our people, and value candidates who can bring their own diverse experiences to our teams – even if you aren’t a 100% skill or experience match. Here are some qualities we’ve found compatible with our team. If a portion of this resonates with you, we’d love to talk.
Minimum Qualifications
- You have 5+ years of experience in the software or infrastructure engineering industry.
- You enjoy helping your colleagues achieve more with less effort.
- Experience with Python, Go, or another programming or scripting language.
- Experience containerizing applications and/or using Kubernetes to manage deployments.
- Experience with Git.
- Experience with Linux shell scripting and/or navigating a *nix-based operating system.
- Experience creating and maintaining GitHub Actions to automate workflows.
- You have experience deploying services in production and are interested in learning reliability-at-scale engineering concepts such as the different types of testing, progressive deployments, error budgets, role observability, and fault-tolerant design.
- You have experience refining the SDLC, doing code reviews, and providing technical support.
- You’re excited about being part of a team with diverse perspectives and backgrounds that believes in tackling challenges, growing hand in hand, and winning together.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $175,000 to $210,000. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.