An amazing global investment client of ours, located in Central London, is looking for a Site Reliability Engineer to join their team on a permanent basis. This is a rare opportunity, and the package offered for this role is up to £300k depending on skills and experience.

ABOUT THE COMPANY

The company is a leading provider of alternative investment solutions with approximately $63 billion of assets under management (“AUM”) and over 550 employees worldwide, across locations including London, New York, Singapore and Hong Kong. One of their founding beliefs is that technology and data are at the core of the business, allowing them to build and maintain cutting-edge hardware and software solutions. The technology team is lean and has a culture that encourages interaction across all areas of the business on a global scale. Their aim is to use the best tool for the job, so there is the opportunity to learn constantly and use modern technologies. Their teams strive to push boundaries and think innovatively, creating an environment that is fast-paced, dynamic and successful.

ABOUT THE ROLE

They are looking for an enthusiastic Site Reliability Engineer to join the SRE team in London. The team is central to the business because it is responsible for the technology that underpins everything the company does, so you will have a direct impact on the company's success. From scaling for the huge volumes of data that drive their research process to improving the reliability and speed of a rapidly evolving application estate, there is a relentless focus on automation and efficiency at scale.

The company's engineers own their varied technology stack end to end and are in constant search of incremental improvements, new technologies and ways of working to evolve their platform and give them a competitive edge. They are looking for people who want to find unique solutions for optimising efficiency and performance in a setting where engineers are key enablers.

The ideal candidate will be passionate about improving reliability and removing toil by identifying opportunities for automation and building platforms to make the systems more “reliable by default”.

Responsibilities:

Evangelise the SRE mindset and implement best practices across the environment
Understand the business and find ways to measure and enhance resilience across the application estate
Eliminate the toil that emerges with complex, distributed systems by automating where possible
Work both as an individual contributor and collaboratively to find new ways of improving the reliability, availability, security and performance of the infrastructure
Accelerate the migration strategy to more cloud-native, distributed applications
Improve productivity and developer experience through automation and interface improvements in local toolchains, IDEs and CI/CD

Requirements:

Expert level scripting / coding skills in one or more languages (Python / Golang etc.)
Expert knowledge of observability systems (Prometheus / ELK / Jaeger / OpenTelemetry / Service Meshes etc.)
Experience with configuration management tools (Ansible / Puppet / Kapitan / Terraform)
Experience with distributed data platforms (Kafka / Flink / Airflow)
Comfortable using cloud-native and containerisation technologies (Kubernetes / Docker)
Good Linux systems knowledge (experience with RHEL desirable)
Broad knowledge across network technologies, server virtualisation and storage
Self-starter, able to quickly pick up concepts, implement new ideas and think outside the box
Focused on improving system reliability, availability, security, and performance through testing, automation, and standardisation
Ability to articulate the "why" behind best practices in simple terms
Ability to build positive and collaborative relationships with colleagues across teams and geographies

PERKS & BENEFITS

Food & Beverage: Complimentary breakfast and lunch for all employees plus on-site coffee bars and a wide variety of healthy snacks.
Annual Discretionary Bonuses: Reflecting firm and individual performance.
Cycle to Work Initiative: Green loan scheme which employees are able to use for the purchase of bicycles.
Employee Referral Programme: Bonus for each successful hire in the month your referral joins the company.
Global Office Design: They aim to create a cohesive environment, regardless of region. Office spaces are designed so that everyone feels connected no matter where they are located.
Pension Scheme: Generous pension and retirement savings plans.
Carbon Offset Programme: The company offsets its CO2 emissions annually and aims to sustainably source all office materials.
Physical and Mental Fitness: Health and wellness benefits include an on-site gym and classes (LDN and NYC), gym subsidies in other regions, access to mental health support, and subscriptions to mindfulness platforms.
Charity Donation Matching: Generous charity matching scheme and ample opportunities to become involved in the community. They offer charity of the year awards in each region and encourage employees to submit causes they're passionate about.
Enhanced Caregiver Leave: Enhanced, flexible primary and secondary caregiver leave.
Sabbatical: Generous sabbatical after you've been with the company for 8 years and every 4 years after that.
Annual Training Allowance: Encourages personal and professional development. This allowance may be used towards conferences, seminars, and training courses, which supplement extensive on-site training materials.
Health and Life Insurance: Range of healthcare benefits to help you manage your personal, physical and emotional wellbeing.

Location:	London
Discipline:	Software Design and Application Development, IT Infrastructure, IT Infrastructure & Support
Job type:	Permanent
Salary:	Package is up to £300k depending on skills and experience

Site Reliability Engineer (Applications)
