Location – Carlsbad, California (the candidate would start working remotely and, once the quarantine is lifted, would have to relocate and work on site)

Must-have skills: ETL, SQL, PySpark

· 7 to 9 years of working experience in data integration and pipeline development with data warehousing.

· 5+ years of experience with data integration in the Apache Spark, EMR, Glue, Kafka, and similar ecosystems.

· 5+ years of strong real-world experience in Python development, especially in PySpark.

· Design, develop, test, deploy, maintain, and improve data integration pipelines.

· Experience in Python and common Python libraries.

· Strong analytical experience with databases: writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.

· Strong experience with source control systems such as Git and Bitbucket, and with build and continuous integration tools such as Jenkins.

· Experience with continuous integration and continuous deployment (CI/CD).

· Databricks, Airflow, and Apache Spark experience is a plus.

· Experience with databases (PostgreSQL, Redshift, MySQL, or similar)

· Exposure to ETL tools such as Informatica or similar.

· BS/MS degree in CS, CE or EE.

Job Type: Contract

Benefits:

401(k)

Schedule:

Monday to Friday

Work Remotely:

Temporarily due to COVID-19
