Site Reliability Engineer

EMBL-EBI - European Bioinformatics Institute

Hinxton, United Kingdom

Your role

EMBL-EBI is at the forefront of research and services in the life sciences. Our IT & Technical Services department's Operations team plays a crucial role in supporting this mission by maintaining and developing a range of critical services.

We are seeking a dynamic and experienced Senior Site Reliability Engineer to join our small IT Operations team.

This role is essential for ensuring the availability, reliability and efficiency of EMBL-EBI’s service portfolio, which supports our scientific and research communities. This team has a broad remit, so a background in running, maintaining and enhancing a wide range of services in a medium-sized Enterprise environment would be advantageous.

Your role will include

  • Identity Management: Actively participate in upgrading and standardising our Authentication and Authorisation Infrastructure for managing thousands of internal and virtual accounts. Experience with Active Directory, Entra ID, or Red Hat IDP would be advantageous.
  • Email Systems: Initially focus on understanding, upgrading, and maintaining our email systems, including Postfix, Cyrus, Roundcube, and Mailman. Experience of migrating to and running O365 mail would be an advantage.
  • Service Management: Jointly support and develop services such as Transfer Services, software-defined object storage, authentication and authorisation infrastructure, and the Request Tracker ticketing system.
  • Monitoring Systems: Help maintain and evolve the distributed Check_mk monitoring system and support the development and growth of a wider monitoring strategy.
  • Orchestration Infrastructure: Become fluent in the orchestration infrastructure (Gerrit, Foreman, RPM repositories, Puppet) used to deploy, update and maintain over 3,000 servers.
  • Core Modules Maintenance: Maintain and improve core Puppet modules such as NTP and SSSD, and manage templates for new Red Hat OS versions.
  • Documentation: Provide thorough documentation and Standard Operating Procedures to support our Service Desk team and enhance user service experiences.

You may also have

  • Education: A degree (or equivalent level of qualification) in a relevant technical subject.
  • Experience: Proven experience in systems management and operations, infrastructure engineering, or site reliability engineering.
  • Technical Skills: Strong proficiency with Linux at scale, email systems (Postfix, Cyrus, Roundcube, Mailman, O365), and orchestration tools (Puppet, Foreman).
  • Problem-Solving: Excellent problem-solving skills and a proactive attitude towards continuous improvement.
  • Team Player: Ability to work collaboratively in a multicultural, multi-disciplinary team and a willingness to share knowledge and learn from others.
  • Initiative: A self-starter who can take responsibility for tasks and can bring the team along.

You have

  • At least 5 years of hands-on experience with Linux production systems in on-prem and potentially cloud hosting environments.
  • Experience with Postfix mail systems.
  • Experience with orchestration and automation.
  • A strong sense of responsibility and ethics.
  • Experience with 389 Directory Server or OpenLDAP.
  • Puppet expertise.
  • Comfort with tcpdump, strace and log parsing at scale.
  • Experience reviewing Python, Bash and Puppet code created by other team members.
  • Experience taking on high-level responsibilities and guiding others.

You might also have

  • A desire to contribute to scientific research from an IT perspective.
  • Experience within a high-performing team.
  • ITSM experience (we use the ServiceNow platform).
