Apex Systems

DevOps Engineer (Remote)

Remote

Apex Systems, the 2nd largest IT staffing firm in the nation, is seeking an experienced DevOps Engineer to join our client's team. This is a W2 contract position slated for 6 months, with the possibility of extension/conversion, and is FULLY REMOTE (PST hours).

**Must be comfortable sitting on Apex Systems' W2**

Job Description:

We are on a mission to connect every member of the global workforce with economic opportunity, and that starts right here. Talent is our number one priority, and we make sure to apply that philosophy to our customers and to our own employees alike. Explore cutting-edge technology and flex your creativity. Work with and learn from the best. Push your skills higher. Tackle big problems. Innovate. Create. Write code that makes a difference in professionals' lives.

Gobblin is a distributed data integration framework that was born at our client and was later released as an open-source project under the Apache Software Foundation. Gobblin is a critical component of the client's data ecosystem and is the main bridge between its different data platforms, allowing efficient data movement between our AI, analytics, and member-facing services. Gobblin utilizes and integrates with the latest open-source big data technologies, including Hadoop, Spark, Presto, Iceberg, Pinot, ORC, Avro, and Kubernetes. Gobblin is a key piece of the client's data lake, operating at a massive scale of hundreds of petabytes.

Our latest work involves integrations with cutting-edge technologies such as Apache Iceberg, enabling near-real-time ingestion of data from various sources into our persistent datasets, which in turn support complex and highly scalable query processing for various business-logic applications serving machine-learning and data-science engineers. Furthermore, we play an instrumental role in the client's transformation from on-prem deployments to Azure cloud-based environments. This transformation has prompted a massive modernization and rebuilding effort for Gobblin, transforming it from a managed set of Hadoop batch jobs into an agile, auto-scalable, real-time streaming-oriented PaaS with user-friendly self-management capabilities that will boost productivity for our customers. This is an exciting opportunity to take part in shaping the next generation of the platform.
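As a rough illustration of the near-real-time ingestion pattern described above (a generic sketch, not Gobblin's actual implementation), the snippet below appends Kafka micro-batches to an Apache Iceberg table with PySpark Structured Streaming. The topic, table, and catalog names are hypothetical, and the Kafka and Iceberg runtime packages are assumed to be on the Spark classpath.

```python
# Illustrative sketch only: near-real-time ingestion of a Kafka stream into an
# Apache Iceberg table with Spark Structured Streaming. Topic, table, and
# catalog names are hypothetical; the iceberg-spark-runtime and
# spark-sql-kafka packages are assumed to be on the classpath.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-ingestion-sketch")
    # Assumes a Hadoop-type Iceberg catalog named "demo" backed by a local path.
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# Read raw events from a hypothetical Kafka topic.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "member-events")
    .load()
    .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
)

# Append micro-batches to an existing Iceberg table about once a minute, making
# new data queryable downstream with near-real-time latency.
query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .trigger(processingTime="1 minute")
    .option("checkpointLocation", "/tmp/checkpoints/member-events")
    .toTable("demo.db.member_events")  # table assumed to be created beforehand
)
query.awaitTermination()
```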

What is the Job

You will be working closely with development and site reliability teams to better understand their challenges in areas such as:

  • Increasing development velocity of data management pipelines by automating testing and deployment processes
  • Improving the quality of data management software without compromising agility

You will create and maintain fully automated CI/CD processes across multiple environments and make them reproducible, measurable, and controllable for data pipelines that handle petabytes of data every day. With your skills as a DevOps engineer, you will also influence the broader teams and cultivate a DevOps culture across the organization.
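To make "reproducible, measurable, and controllable" concrete, here is a minimal, hypothetical example of the kind of automated gate a CI stage might run before promoting a data pipeline to the next environment; the dataset, columns, and artifact name are placeholders, not the client's actual tooling.

```python
# Hypothetical CI gate: validate a dataset's schema against its data contract
# before the deployment stage is allowed to run. Dataset name, columns, and the
# staging_schema.json artifact are illustrative placeholders.
import json
import sys

# Expected output schema for a hypothetical "member_events" dataset.
EXPECTED_SCHEMA = {
    "member_id": "long",
    "event_type": "string",
    "event_ts": "timestamp",
}


def check_schema(actual_schema):
    """Return a list of human-readable schema violations (empty if compliant)."""
    errors = []
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in actual_schema:
            errors.append(f"missing column: {column}")
        elif actual_schema[column] != expected_type:
            errors.append(
                f"type mismatch for {column}: "
                f"expected {expected_type}, got {actual_schema[column]}"
            )
    return errors


if __name__ == "__main__":
    # In a real workflow the schema would come from the staging environment's
    # catalog; here it is read from a JSON artifact produced by an earlier step.
    with open("staging_schema.json") as f:
        actual = json.load(f)

    violations = check_schema(actual)
    if violations:
        print("Schema check failed:")
        for violation in violations:
            print(f"  - {violation}")
        sys.exit(1)  # a non-zero exit fails this CI stage and blocks deployment

    print("Schema check passed; safe to promote.")
```

Checks like this give a pipeline a repeatable pass/fail signal, which is what makes automated promotion between environments safe to measure and control.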

Why it matters

CI/CD for big data management pipelines has long been a challenge for the industry, and it is becoming more critical as we evolve our tech stack into the cloud age (Azure). With infrastructure shifts and data lake features being developed and deployed at an ever-faster pace, our integration and deployment processes must evolve to ensure the highest quality and fulfill customer commitments. The reliability of our software greatly influences the analytical workloads and decision-making processes of many company-wide business units, and the velocity of our delivery plays a critical role in turning the process of mining insights from a massive-scale data lake into an easier and more efficient developer-productivity paradigm.

What You’ll Be Doing

  • Work collaboratively in an agile, CI/CD environment
  • Analyze, document, implement, and maintain CI/CD pipelines/workflows in cooperation with the data lake development and SRE teams
  • Build, improve, and maintain CI/CD tooling for data management pipelines
  • Identify areas for improvement for the development processes in data management teams
  • Evangelize CI/CD best practices and principles
Technical Skills

  • Experienced in building and maintaining successful CI/CD pipelines
  • Self-driven and independent
  • Experience with Java, Scala, Python, or another programming language
  • Great communication skills
  • Master of automation

Years of Experience

  • 5+

Preferred Skills

  • Proficient in Java/Scala
  • Proficient in Python
  • Experienced in working with:
    • Big Data environments: Hadoop, Kafka, Hive, YARN, HDFS, Kubernetes (K8s)
    • ETL pipelines and distributed systems