Senior Site Reliability Engineer
Movable Ink is a software company that provides marketers with technology and expert services to create unique customer experiences.
Job Description
As one of our Site Reliability Engineers, you will be 100% hands-on with both infrastructure and software development. We operate a multi-region, active-active content serving platform that serves upwards of 8 billion requests daily with a mixture of ingenuity, attention to detail, off-the-shelf components, and custom software. Come and help us scale to 16 billion requests per day and beyond.
Responsibilities:
- Improve the tooling and automation of our infrastructure to minimize manual work, increase performance, and decrease the frequency and severity of incidents
- Build, maintain, and support core applications
- Build and operate our core internal observability platform
- Monitor our systems for capacity and performance, and troubleshoot issues
- Partner with the rest of the SRE team and our service engineering teams to ensure smooth, continued delivery of our service to clients
Qualifications:
- Experience in Site Reliability or Software Engineering: building and maintaining scalable, resilient services, building the tooling and automation to manage those services, and investigating system and application metrics to diagnose and resolve performance issues.
- 4+ years of experience as an SRE or Software Engineer, with a focus on cloud platforms. We use AWS.
- Experience building and operating large-scale observability platforms. We use Prometheus, Thanos, Loki, and Tempo.
- Experience with, and willingness to operate in, an on-call environment: evaluating and improving monitoring and alerting systems, and developing runbooks to investigate and debug issues. Every member of the SRE team does a week-long on-call rotation every 5 to 6 weeks.
- Strong experience with infrastructure-as-code tools. We use Terraform and Chef.
- Strong experience operating Kubernetes and running workloads on it. We use EKS.
- Familiarity with one or more high-level programming languages and a willingness to learn. We use NodeJS, Golang, Ruby, Python, and Bash/shell scripting.
- Linux experience (Ubuntu/Debian)
The base pay range for this position is $165,000 - $195,000 yearly. The base pay offered may vary depending on job-related knowledge, skills, and experience. Stock options and other incentive pay may be provided as part of the compensation package, in addition to a full range of medical, financial, and/or other benefits, depending on the position ultimately offered.