Deutsche Telekom IT Solutions HU
Linux Site Reliability Engineer (Remote)
Linux Site Reliability Engineer | Deutsche Telekom IT Solutions HU | Hungary
Job Description
We are seeking a skilled and motivated Linux Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in Linux system administration, automation, and cloud infrastructure, with a passion for building reliable and scalable systems. You will collaborate with development and operations teams to ensure our services are highly available, performant, and fault-tolerant.
Responsibilities
- Onboarding of New Customers: Ensure smooth deployment and operational readiness, document processes, and provide initial support during the transition.
- System Administration: Manage, monitor, and optimize Linux servers in production and development environments. Identify and resolve bottlenecks in application and system performance.
- Automation: Develop and maintain infrastructure automation using tools like Ansible, Terraform, or similar. Create and maintain hardening and washing scripts (Ansible).
- Performance Optimization: Diagnose and resolve performance bottlenecks at the OS, application, and network levels. Analyze system demands and plan for scaling.
- Incident Management: Lead efforts to quickly resolve production incidents, conduct post-mortems, and implement solutions to prevent future occurrences.
- Scalability: Work on infrastructure scalability and reliability for high-traffic services.
- Collaboration: Partner with development teams to create CI/CD pipelines and integrate reliability practices into the development lifecycle. Coordinate changes with operations teams.
- Security: Ensure system security through best practices in access control, patch management, and system hardening.
Qualifications
- Operating Systems: Extensive experience with Linux distributions like RHEL, CentOS, or Ubuntu
- Scripting: Proficiency in scripting languages like Bash, Python, or Ruby for automation
- Cloud Expertise: Familiarity with cloud platforms like AWS, Azure or GCP and containerization technologies like Docker or Kubernetes
- Infrastructure as Code (IaC): Hands-on experience with tools such as Terraform, Ansible, or Chef
- Networking: Solid understanding of networking protocols, DNS, load balancers, and firewalls
- Version Control: Experience with Git or similar version control systems
- Web Servers & Middleware: Strong skills in configuring and managing Apache, Tomcat, JBoss, and NGINX for production environments
- Problem-Solving: Strong troubleshooting and debugging skills
- Communication: Strong communication and teamwork abilities for cross-functional work. At least intermediate English proficiency
- Mindset: A mindset for optimizing and enhancing systems iteratively
Nice to Have / Preferred Skills and Experience
- Exposure to high-availability architectures and disaster recovery strategies
- Certifications: RHCE, AWS Certified SysOps Administrator, or equivalent
- Knowledge of monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or Datadog
- Experience with WebSphere
- German language knowledge
Additional Information
- Please note that remote work is available only within Hungary due to European taxation regulations.