Site Reliability Engineer/Senior Reliability Engineer
EMBL-EBI - European Bioinformatics Institute
Hinxton, United Kingdom
We’re seeking a skilled individual to join our Applications Group as a Site Reliability Engineer and contribute to the success of our applications portfolio. Within the Applications Group, the Web Applications Platform Team is responsible for providing the platforms on which all EBI web services are hosted.
A couple of years ago, the web hosting service started shifting to a container-based model, and our goal is to accelerate and consolidate this trend. Our web services are very popular among the scientific community, averaging over 3,000 million requests per month.
Working closely with other IT groups such as Infrastructure and Operations, the person in this position will help design, implement and administer the future platform on which all our scientific web services will run.
Duties & Responsibilities
In this role you will:
- Be responsible for building and maintaining the following environments:
  - Web hosting platform based on Kubernetes, where users can deploy web applications, along with the following ecosystem:
    - Infrastructure and application monitoring based on Prometheus;
    - Web analytics platform, currently based on Elasticsearch;
    - CI/CD tools like GitLab.
- Drive automation and change to simplify management and operations and to increase efficiency;
- Ensure documentation meets the required standard;
- Drive SRE best practices.
The person in this position will contribute directly to the above-mentioned projects and tasks, and will help the team move forward with production automation. They will also help and guide other team members with the daily prioritisation of tasks.
You have (Requirements)
- Bachelor's degree or higher in computer science or a related discipline, or demonstrable equivalent experience. The role would be suitable for a Unix/Linux systems administrator with a good understanding of web hosting, Kubernetes, and CI/CD;
- At least 3 years of experience in the design, implementation and operation of large-scale web hosting platforms;
- Experience managing public-facing production services;
- Experience working with Agile methodologies;
- 3 years of experience with automated deployment/configuration methods (e.g. Ansible, Puppet, Terraform);
- Solid experience in Kubernetes deployment and administration in public or private cloud;
- Strong Linux administration skills, ideally with RHEL or a RHEL clone;
- Solid skills in automation tools like Jenkins, Rundeck, or similar;
- Hands-on experience using Git in CI/CD and infrastructure-as-code workflows;
- Solid skills in at least one programming language, ideally Python;
- Experience with methodologies for infrastructure monitoring;
- Solid interpersonal and written English communication skills;
- Proven ability to work well in a team, building positive relationships and sharing knowledge;
- Ability to plan and prioritise workloads.
You might also have (Desirable)
- Experience with cloud technologies, including Google or AWS certification;
- Experience with Web Security best practices (OWASP).
Behaviours we value in our team
- You will possess strong communication skills, with the ability to manage multiple priorities and deadlines, and will be collaborative and effective in multidisciplinary, international teams.
- You will be a technical expert in your area, willing to share knowledge and keep up with trends.