A leading trading firm based in Singapore is seeking a talented Site Reliability Engineer (SRE) to join its global SRE team on a permanent basis.
Key Responsibilities
Execute scheduled updates and deployments effectively.
Conduct in-depth analysis and resolution of performance-related issues.
Respond promptly to emergencies and operational incidents.
Ensure the continuous functionality of the firm's Linux-based trading infrastructure while addressing daily operational requirements.
Contribute to automated solutions for server provisioning, configuration, and monitoring that manage thousands of servers.
Oversee critical core services such as DHCP, LDAP, DNS, and NFS across on-premises, hosted data center, and public cloud environments.
Work closely with the Trading and Core Engineering teams to support operations and service delivery.
Participate in a rotational on-call schedule, which may include early-morning and weekend shifts, to deliver timely support.
Skills and Experience
Basic proficiency in Python and Bash scripting.
Strong proficiency in managing production environments on Linux.
Familiarity with automation and monitoring toolsets.
Familiarity with Intel-based server hardware and its components.
Comprehensive understanding of operating system principles, particularly Linux internals.
Understanding of cloud services and architectural solutions.
Experience in server-side networking, including knowledge of network protocols and configurations.
Proven ability to design, build, and troubleshoot complex systems.
Strong analytical abilities with a methodical approach to addressing technical challenges.
Benefits
Signing bonus
Paid annual leave
Sick leave / Childcare leave
Healthcare and dental insurance
Housing search assistance
Rental deposit coverage