Title – Data Engineer / PySpark Developer
Type of engagement – Long-term contract opportunity (at least 1 year)
Location – Carlsbad, California (the candidate would start remotely and, if and when the quarantine is lifted, would have to relocate and work on site)
Must-have skills: ETL, SQL, PySpark
· 7 to 9 years of working experience in data integration and pipeline development with data warehousing.
· 5+ years of experience in data integration within the Apache Spark, EMR, Glue, Kafka, etc. ecosystems.
· 5+ years of strong real-world experience in Python development, especially in PySpark.
· Design, develop, test, deploy, maintain, and improve data integration pipelines.
· Experience in Python and common Python libraries.
· Strong analytical experience with databases: writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
· Strong experience with source control systems such as Git and Bitbucket, and with build and continuous integration tools such as Jenkins.
· Experience with continuous integration and continuous deployment (CI/CD).
· Databricks, Airflow, and Apache Spark experience is a plus.
· Experience with databases (PostgreSQL, Redshift, MySQL, or similar).
· Exposure to ETL tools, including Informatica and others.
· BS/MS degree in CS, CE, or EE.
Job Type: Contract