Posted on Jan 9, 2025

Senior Data Engineer

USA
Mid-Senior ICs
Data Science + Analytics
Arity
Public
251-1000
Software, Security & Developer Tools

Founded by The Allstate Corporation in 2016, Arity is a mobility data and analytics company focused on improving transportation. We collect and analyze enormous amounts of data, using predictive analytics to build solutions with a single goal in mind: to make transportation smarter, safer, and more useful for everyone.

Job Description

Founded by The Allstate Corporation in 2016, Arity is a data and analytics company focused on improving transportation. We collect and analyze enormous amounts of data, using predictive analytics to build solutions with a single goal in mind: to make transportation smarter, safer, and more useful for everyone. At the heart of that mission are the people that work here—the dreamers, doers and difference-makers that call this place home. As part of that team, your work will showcase both your intelligence and your creativity as you tackle real-world problems and put your talents towards transforming transportation. That’s because at Arity, we believe work and life shouldn’t be at odds with one another. After all, we know that your unique qualities give you a unique perspective. We don’t just want you to see yourself here. We want you to be yourself here. Arity is committed to supporting an inclusive and diverse environment where you can thrive and learn from others.

We are seeking a highly skilled and experienced Senior Data Engineer with extensive professional experience in full-stack data engineering, including hands-on expertise in the development of large-scale data platforms and machine learning pipelines. In this role, you’ll design, develop, and optimize scalable data and ML workflows to support our growing telematics business needs. As a key member of the Data Analytics Engineering Team, you’ll enable data-driven decision-making by building robust pipelines, efficient architectures, and impactful ML solutions.

Key Responsibilities:

  • Design, build, and maintain end-to-end data and machine learning pipelines to support analytics, reporting, and AI-driven applications.
  • Develop and optimize scalable ETL/ELT processes to extract, transform, and load data from diverse sources into a cloud-based platform.
  • Architect and manage data storage solutions within the data platform (e.g., data lakes, warehouses, and marts) to enable advanced analytics and machine learning.
  • Implement and manage ML pipelines, building feature pipelines and deploying models.
  • Collaborate with data scientists, analysts, and software engineers to integrate data products into business workflows.
  • Ensure data quality, consistency, and governance by implementing robust monitoring, validation, and alerting mechanisms.
  • Lead the adoption of new cloud-native technologies to streamline and enhance data and ML operations.
  • Mentor junior data engineers, fostering a culture of innovation and knowledge-sharing within the team.

Qualifications:

  • Bachelor’s degree in Computer Science, Data Science, Software Engineering, Mathematics, Statistics, or a related field. A Master’s degree is preferred.
  • 5+ years of professional experience in data engineering, including end-to-end pipeline development and cloud integration.
  • Proven experience with machine learning workflows, including data preparation, feature engineering, and model deployment.
  • Proficiency in programming languages such as Python, Scala, or Java, with an emphasis on ML libraries like TensorFlow, PyTorch, or Scikit-learn.
  • Strong knowledge of data processing frameworks (e.g., Apache Spark, Flink, Beam) and real-time data streaming (e.g., Kafka, Kinesis).
  • Hands-on expertise with AWS or GCP ecosystems, including tools like:
    • AWS: SageMaker, Redshift, Glue, Athena, EMR
    • GCP: BigQuery, Vertex AI, Dataflow, Dataproc
  • Solid understanding of relational and non-relational database systems (SQL, NoSQL).
  • Experience with data orchestration tools (e.g., Airflow, Prefect, dbt) and CI/CD practices.
  • Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes) is a plus.

Preferred Skills:

  • Experience with geospatial data, such as trajectories.
  • Experience deploying machine learning models into production environments with monitoring and optimization strategies.
  • Familiarity with cloud security and compliance best practices for data and ML workflows.
  • Proficiency in BI tools (e.g., Tableau, Looker, Power BI) and data visualization.
  • Strong understanding of MLOps practices and tools for automated ML lifecycle management.

Relevant certifications in cloud platforms (e.g., AWS Data Engineer, Machine Learning Engineer; Google Cloud Professional ML Engineer, Professional Data Engineer) are a plus.

Skills

Business Data Analytics, Computer Science, Data Analytics, Data Science, Machine Learning, Predictive Modeling, Python (Programming Language)

Compensation

Compensation offered for this role is $85,600.00 – $152,650.00 annually and is based on experience and qualifications.

The candidate(s) offered this position will be required to submit to a background investigation.

Joining our team isn’t just a job — it’s an opportunity. One that takes your skills and pushes them to the next level. One that encourages you to challenge the status quo. And one where you can impact the future for the greater good.  

You’ll do all this in a flexible environment that embraces connection and belonging. And with the recognition of several inclusivity and diversity awards, we’ve proven that Allstate empowers everyone to lead, drive change and give back where they work and live. 

USA, Cassandra, Hadoop, Java, JavaScript, Kotlin, MariaDB, MongoDB, Python, R, React, Redux, Scala, Spark, Spring, SQL, Hive, Kafka, Scikit, GitHub, Salesforce, Jira, Data Science + Analytics, Remote