Posted on Dec 12, 2024

Senior Infrastructure Engineer, Metal Dev (NYC)

New York City
Mid-Senior ICs
Engineering, IT
CoreWeave
Private
101-250
Software, Security & Developer Tools

CoreWeave is a specialized cloud provider focused on GPU-accelerated use cases including VFX, AI/ML, batch processing, and real-time experiences. We support countless AI/ML services in the text-to-image, NLP, and broader AI/ML space, reducing clients' infrastructure management requirements with our Kubernetes-based serverless GPU cloud offerings.

Job Description

About this Role:

CoreWeave is seeking a Senior Software Engineer with experience in infrastructure development to join our dynamic team based in NYC. This role involves working closely with GPU servers, building infrastructure services using Go and Python, and developing gRPC and REST APIs consumed by our Kubernetes orchestration layer. We are looking for an engineer who thrives in a highly dynamic and collaborative environment. 

Responsibilities:

  • Design, develop, and maintain server management services using Go and Python.
  • Develop and maintain gRPC and REST APIs for interaction with Kubernetes (K8s) orchestration layers and other infrastructure consumers.
  • Collaborate with upstream open-source communities, including Go and Redfish-based projects.
  • Document hardware automation workflows and processes.
  • Create CI/CD pipelines for server hardware compliance tests.
  • Develop and maintain hardware/firmware management, data collection, and reporting services.
  • Automate all aspects of the server hardware lifecycle.
  • Address production service escalations.
  • Collaborate with cross-functional teams to define service requirements, specifications, and system architecture.

Requirements:

  • B.S. in Computer Science or a related field, or equivalent work experience.
  • 5+ years of experience in software engineering with a focus on infrastructure development.
  • Proficiency in Go and Python.
  • Strong experience with gRPC and REST API development.
  • Familiarity with Kubernetes (K8s) and container orchestration.
  • Experience with GPU servers is highly desirable.
  • Strong analytical and problem-solving skills with attention to detail.
  • Strong communication skills and the ability to work well in a team environment.
  • Prior experience with Prometheus / Grafana.
  • Experience deploying containerized applications using Kubernetes.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $175,000 to $210,000. Pay is based on a number of factors, including market location, and may vary depending on job-related knowledge, skills, and experience.

In-Person