Posted on Apr 9, 2025

Site Reliability Engineering (SRE) Specialist

Seattle
Entry level
Engineering
Alibaba Cloud
5001+
Computer Software

Established in September 2009, Alibaba Cloud develops highly scalable cloud computing and data management services providing large and small businesses, financial institutions, governments and other organizations with flexible, cost-effective solutions to meet their networking and information needs. A business of Alibaba Group, one of the world’s largest e-commerce companies, Alibaba Cloud operates the network that powers Alibaba Group’s extensive online and mobile commerce ecosystem and sells a comprehensive suite of cloud computing services to support sellers and other third-party entities participating in this ecosystem. 

Follow us:

Twitter: www.twitter.com/alibaba_cloud

Facebook: https://www.facebook.com/alibabacloud/

Job Description

Elastic Compute Service (ECS) is a core product of Alibaba Cloud. The Elastic Compute team is dedicated to building world-leading cloud computing infrastructure. As a key component of Alibaba Cloud's self-developed Apsara operating system, ECS provides full-stack computing resources covering virtual machine instances, container services, and heterogeneous computing clusters.

Through technological innovation and product optimization, the Alibaba Cloud Elastic Compute team continuously drives advancements in cloud computing technologies, delivering high-quality computing services to users worldwide. Our goal is not only to support enterprises in achieving elastic scalability but also to deeply empower infrastructure innovation in the new era. Our mission is to build an intelligent foundation of "Computing as a Service," enabling developers and businesses to concentrate on breakthroughs without worrying about the complex engineering implementations from chips to clusters.

SRE Team:

The Alibaba Cloud Elastic Compute Service (ECS) SRE (Site Reliability Engineering) team is a critical force in ensuring system stability and reliability. The SRE team focuses on guaranteeing the high availability, high performance, and robust stability of ECS products through technical expertise and innovation.

The Alibaba Cloud ECS SRE team is not only a core technical safeguard but also a driver of technological innovation and continuous optimization. By leveraging technical capabilities and collaborative teamwork, we ensure the stability and reliability of ECS products, safeguarding global customers' businesses. Additionally, we are committed to advancing cloud computing technologies through knowledge sharing and industry collaboration.

Joining the Alibaba Cloud ECS SRE team offers the opportunity to engage in the development and optimization of world-leading cloud computing technologies, while growing alongside a passionate and creative team.

  • Responsible for the delivery and operation/maintenance of various clusters, and participate in the architecture design and construction of the infrastructure operation platform.
  • Establish and optimize operation/maintenance service systems to achieve product stability and SLA goals.
  • Develop delivery standards, document maintenance specifications, and enhance daily work efficiency through tool platforms.
  • This position involves on-call responsibilities, requiring timely customer response within Service Level Agreement (SLA) timeframes, driving issue resolution and improving customer experience.

Qualification:

  • 5+ years of operation and maintenance (O&M) experience in IT, internet, or cloud computing industries.
  • Proficient in Linux operating systems and mainstream protocols (e.g., TCP/IP), with solid hands-on experience in troubleshooting OS and network issues.
  • Familiar with containerization and orchestration technologies such as Kubernetes, Slurm, and LSF.
  • Ability to analyze and document technical issues systematically, develop tools/systems to optimize workflows, and improve operational efficiency through automation and platform-based solutions.
  • Strong self-driven learning capabilities, excellent communication skills, and experience leading cross-team projects. Results-driven and action-oriented, with a commitment to excellence.

The pay range for this position at commencement of employment is expected to be between $133,200/year and $219,600/year. However, base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience.

If hired, employee will be in an “at-will position” and the Company reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.

Alibaba U.S. based full time regular employees have access to medical, dental, and vision insurance, a 401(k) plan and basic life insurance, and wellbeing benefits like FSA, subject to the terms and conditions of the applicable plans then in effect. U.S. based employees are also eligible to receive up to 12 paid holidays, accrue up to 15 paid vacation days for this position, and receive up to 72 hours paid sick time (front-loaded) per calendar year.

On-site