Lead Site Reliability Engineer (Remote)

Salary: Competitive
Job Type: Full-time
Experience: Senior Level

EPAM Systems

Lead Site Reliability Engineer | EPAM Systems | Poland

We are seeking a highly skilled Lead Site Reliability Engineer to join our team.

The ideal candidate will have a strong background in software engineering and systems engineering, with a focus on reliability and scalability in cloud environments, specifically Azure.

Responsibilities

  • Design, implement, and maintain highly available and scalable systems across multi-region Azure cloud architectures
  • Ensure disaster recovery plans are in place and tested regularly
  • Configure and enhance monitoring and alerting processes using Prometheus, Grafana, Alertmanager, and OpsGenie
  • Develop dashboards to visualize system performance and reliability metrics
  • Utilize Terraform for infrastructure provisioning and management
  • Implement best practices for continuous deployment and infrastructure changes
  • Work closely with the development team to support ongoing development efforts
  • Communicate with the customer’s DevOps team to clarify requirements and collaborate on implementations
  • Enhance release management and CI/CD processes using Jenkins
  • Improve system security based on recommendations from the security team
  • Write and test runbooks to streamline operational tasks and incident response
  • Manage and optimize services running on Kubernetes and in Docker/Linux environments
  • Handle data persistence using Cosmos DB (Mongo API & SQL API) and MS SQL Server
  • Work with messaging systems like RabbitMQ, Kafka, and EventHub
  • Utilize Azure Networking for secure and efficient communication

Requirements

  • 5+ years of experience as a DevOps or SRE engineer
  • Proven experience with multi-region Azure cloud architectures
  • Proficiency in Kubernetes and containerization technologies
  • Strong knowledge of Cosmos DB (both Mongo API & SQL API) and MS SQL Server
  • Familiarity with monitoring tools such as Prometheus, Grafana, Alertmanager, and OpsGenie
  • Experience with .NET Core and ASP.NET Core applications
  • Competency in Docker and Linux environments
  • Expertise in Terraform for infrastructure as code
  • Experience with CI/CD tools
  • Solid understanding of Azure Networking concepts
  • Excellent communication skills, both verbal and written
  • Strong self-motivation and ability to self-manage tasks and projects

Nice to have

  • Experience with Azure IoT Hub and EventHub

We offer

  • We gather like-minded people:
    • Engineering community of industry professionals
    • Friendly team and enjoyable working environment
    • Flexible schedule and opportunity to work remotely within Poland
    • Chance to work abroad for up to 60 days annually
    • Relocation within our 50+ offices
  • We provide growth opportunities:
    • Outstanding career roadmap
    • Leadership development, career advising, soft skills, and well-being programs
    • Certification (GCP, Azure, AWS)
    • Unlimited access to LinkedIn Learning, Get Abstract, O’Reilly, Cloud Guru
    • Language classes in English and Polish for foreigners
  • We cover it all:
    • Stable income (Employment Contract or B2B)
    • Participation in the Employee Stock Purchase Plan
    • Benefits package (health insurance, multisport, shopping vouchers)
    • Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
    • Referral bonuses
    • Corporate, social and well-being events
  • Please note:
    • The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview.
    • We will contact selected candidates only.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

