
Data Engineer

Company:

A real estate and construction company at the forefront of the UK real estate market since 1992, with a strong track record of growth, including the award-winning development project at Wembley Park.

Job Purpose:
To architect and build the data platform that supports the data and analytical needs of customers, business colleagues and shareholders, underpinning data-led decision making. The role holder is expected to understand and follow best practices in data warehouse design and data pipeline engineering, and to have experience delivering business intelligence and analytical solutions that demonstrably meet business needs.

Key Accountabilities:
* Build data pipelines: Designing, building and maintaining data pipelines will be the data engineer's primary responsibility.
* Evolve data architectures: Working closely with the wider technology team, the data engineer will identify and implement the most appropriate architectures (at both a technology and canonical level).
* Educate and train: The data engineer will be responsible for proposing appropriate (and innovative) data ingestion, preparation, integration and operationalization techniques to best address the business's data requirements.
* Deliver Business Value: Working as part of a small team to identify and deliver data & insight solutions which add value to the business.
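The pipeline work described above can be sketched in miniature. The example below is illustrative only, not the company's actual stack: it uses the Python standard library, and all dataset, schema and table names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw feed; in practice this would come from a source system.
RAW_CSV = """unit_id,scheme,price_gbp
101,Wembley Park,450000
102,Wembley Park,515000
103,Other Scheme,390000
"""

def extract(text):
    """Extract: parse raw CSV rows into dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and drop records with missing prices."""
    return [
        (int(r["unit_id"]), r["scheme"], int(r["price_gbp"]))
        for r in rows
        if r["price_gbp"]
    ]

def load(rows, conn):
    """Load: upsert into a simple warehouse-style fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_unit_price "
        "(unit_id INTEGER PRIMARY KEY, scheme TEXT, price_gbp INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fact_unit_price VALUES (?, ?, ?)", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
avg = conn.execute(
    "SELECT AVG(price_gbp) FROM fact_unit_price WHERE scheme = 'Wembley Park'"
).fetchone()[0]
print(avg)  # average price for the Wembley Park scheme -> 482500.0
```

In a production platform each stage would be a separately orchestrated, monitored step (for example, activities in an Azure Data Factory pipeline), but the extract/transform/load shape is the same.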

Qualifications/Experience/Skills:
* Strong experience with data management architectures such as data warehouses, data lakes and data hubs, and with supporting processes such as data integration, governance and metadata management.
* Strong ability to design, build and manage data pipelines for data structures, encompassing data transformation, data models, schemas, metadata and workload management.
* Strong experience building and optimizing data pipelines, pipeline architectures and integrated datasets over large, heterogeneous datasets using traditional data integration technologies. These should include ETL/ELT, data replication/CDC, message-oriented data movement, and API design and access, as well as emerging ingestion and integration technologies such as stream data integration, CEP and data virtualization.
* Basic experience working with data governance, data quality and data security teams, and specifically with information stewards and privacy and security officers, to move data pipelines into production with the appropriate data quality, governance and security standards and certification. Ability to build quick prototypes and to translate them into data products and services in a diverse ecosystem.
* Demonstrated success working with large, heterogeneous datasets to extract business value using popular data preparation tools.
* Demonstrated success designing, implementing and managing end-to-end data solutions using Microsoft's Azure Synapse Analytics platform and associated technologies such as Azure Data Factory, streaming analytics and Power BI.
* Strong experience with SQL.
* Knowledge of advanced analytics tools and of object-oriented/functional scripting languages such as R, Python, Java, C++ and Scala.
* Knowledge of open-source and commercial message queuing technologies such as Azure Service Bus, stream data integration technologies such as Apache NiFi, and stream processing technologies such as Apache Kafka.
* Strong experience with DevOps capabilities such as version control, automated builds, testing and release management, using tools like Git, Jenkins, Puppet and Ansible.
* Basic experience working with data science teams to refine and optimize data science and machine learning models and algorithms.
* Demonstrated success working with both IT and the business to integrate analytics and data science output into business processes and workflows.
* Basic experience with popular data discovery, analytics and BI tools such as Tableau, Qlik and Power BI for semantic-layer-based data discovery.
* A basic understanding of popular open-source and commercial data science platforms and languages such as KNIME, Alteryx, Python and R is a plus but not required.
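The data replication/CDC experience asked for above often comes down to a watermark-based incremental load. The sketch below shows that pattern in minimal form; it is an assumption-laden illustration, and all table and column names are hypothetical.

```python
import sqlite3

# Illustrative only: copy changed rows from a source table to a target
# table, tracking a high-water mark on an updated_at column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders (id INTEGER, amount INTEGER, updated_at TEXT);
CREATE TABLE target_orders (id INTEGER PRIMARY KEY, amount INTEGER, updated_at TEXT);
INSERT INTO source_orders VALUES
  (1, 100, '2024-01-01'),
  (2, 200, '2024-01-02'),
  (3, 300, '2024-01-03');
""")

def incremental_load(conn, watermark):
    """Copy only rows changed since the last high-water mark."""
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM source_orders WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    conn.executemany(
        "INSERT OR REPLACE INTO target_orders VALUES (?, ?, ?)", rows
    )
    conn.commit()
    # Advance the watermark to the newest change seen in this batch.
    return max((r[2] for r in rows), default=watermark)

wm = incremental_load(conn, "2024-01-01")  # picks up ids 2 and 3 only
count = conn.execute("SELECT COUNT(*) FROM target_orders").fetchone()[0]
print(count, wm)  # -> 2 2024-01-03
```

Log-based CDC tools capture changes from the database transaction log instead of a timestamp column, but the consumer-side merge logic is broadly similar.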