Senior IT Site Reliability Engineer
At Hudson River Trading (HRT) we are mathematicians, computer scientists, statisticians, physicists and engineers. We research and develop automated trading algorithms using advanced mathematical techniques. We have built one of the world's most sophisticated computing environments, and our researchers are at the forefront of innovation in the world of algorithmic trading.
Job Description
Hudson River Trading (HRT) is looking for a Senior IT Site Reliability Engineer to join our growing IT Solutions Delivery team. This team is responsible for developing and maintaining the corporate productivity stack for the entire firm, both on-prem and in the cloud. As a Senior IT SRE, you will ensure the availability and reliability of systems within this stack and grow our engineering practice in alignment with the firm’s larger engineering organization.
This role requires a deep Linux operating system and application administration skill set, proficiency in Python, and solid experience with configuration management/IaC. Successful candidates should also have exceptional organizational, communication, and project management skills, as well as the ability to troubleshoot complex technical issues.
Responsibilities
- Manage on-premises containerized web services
- Automate and troubleshoot a broad range of technical infrastructure
- Design and operate secure, reliable systems
- Develop and implement monitoring solutions to ensure high system uptime and reliability; utilize tools to detect and resolve issues proactively
- Document system architecture, processes, and best practices
- Break down complexity, iterate, and communicate progress to a wide variety of leads and stakeholders
- Assist with the administration of DHCP and DNS for both on-premises and external systems and applications
Qualifications
- 5+ years of experience in site reliability engineering or related disciplines
- Proficiency with Python
- Experience managing and monitoring containerized infrastructure
- Experience working with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD
- Expert experience with IaC and configuration management tools such as Terraform, SaltStack, Chef, Puppet, or Ansible
- Nice-to-haves:
  + Experience building and operating systems on cloud platforms (e.g. AWS, Azure, GCP)
  + OpenLDAP or other directory services management expertise
  + Atlassian Data Center administration experience (on-prem)
  + Web development experience
Annual base salary range of $150,000 to $250,000. Pay (base and bonus) may vary depending on job-related skills and experience. A sign-on and discretionary performance bonus may be provided as part of the total compensation package, in addition to company-paid medical and/or other benefits.