Software Engineers required to build data pipelines on Cloudera using Hadoop, Spark, Python, PySpark, Parquet, Avro, Hive and Hue, with an understanding of and focus on sourcing and transforming data, establishing governance, and building flexible frameworks to transform and aggregate data.
Familiarity with the Cloudera toolset desirable (Hue, Hive, Impala, Cloudera Data Science Workbench, HDFS, Avro, Parquet).
Python, Spark, PySpark, GitLab - development languages and code repository
Agile - to align with the team's current way of working and promote iterative development
Confluence, JIRA, SharePoint - agile project tooling