Position: Platform Lead
Location: London/Hybrid (2 days a week onsite)
Salary: £70,000 - £85,000 + Benefits
The role:
The Technology Operations team design, develop and operate all infrastructure used for the business's technology across our data centres, public cloud, and offices. This encompasses multiple domains: compute, storage, connectivity, end-user computing, and monitoring tools. It covers both third-party and internally developed applications, and the infrastructure that supports the wider group.
Responsibilities:
- Contributes to the planning and implementation of infrastructure maintenance and updates. Implements agreed infrastructure changes and maintenance routines.
- Designs, develops and operates the infrastructure platforms that allow development teams and business colleagues to consume infrastructure.
- Ensures that incidents are handled according to agreed procedures.
- Facilitates recovery, following resolution of incidents. Documents and closes resolved incidents.
- Analyses the causes of incidents, informs service owners to minimise the probability of recurrence, and contributes to service improvement.
- Provides technical expertise to enable the correct application of operational procedures.
- Assists with the specification, development, research and evaluation of service standards.
- Applies these standards to resolve or escalate issues and gives technical briefings to staff members.
- Carries out agreed system software maintenance tasks. Automates routine system administration tasks to specifications using standard tools and basic scripting.
- Monitors, measures and reports on infrastructure load, performance and security events. Identifies operational issues and contributes to their resolution.
- Uses infrastructure management tools to determine load and performance statistics. Configures tools and/or creates scripts to automate the provisioning, testing and deployment of new and changed infrastructure. Maintains operational procedures and checks that they are executed following agreed standards.
Key Skills:
- Linux and Windows server operating systems
- Networking Infrastructure, Routing, Switching and Security
- Cloud Operations (preferably AWS)
- VMware (or similar hypervisor)
- Veeam or other backup technologies
- Observability delivery through Sumo Logic or similar
- Monitoring delivery through Prometheus or similar
- Advanced PowerShell or Python scripting
- Cloudflare WAF or similar
- Load balancing and autoscaling through F5 or Cloud
- Automation experience with Ansible
- Private cloud deployment is a plus
- Kubernetes deployment is a plus
