
RemoteWorker US
Sr Site Reliability Engineer (Remote)
Join a team that puts its People First! Since 1889, First American (NYSE: FAF) has held an unwavering belief in its people. They are passionate about what they do, and we are equally passionate about fostering an environment where all feel welcome, supported, and empowered to be innovative and reach their full potential. Our inclusive, people-first culture has earned our company numerous accolades, including being named to the Fortune 100 Best Companies to Work For list for eight consecutive years. We have also earned awards as a best place to work for women, diversity and LGBTQ+ employees, and have been included on more than 50 regional best places to work lists. First American will always strive to be a great place to work, for all. For more information, please visit .
What We Do
We are looking for a Senior Site Reliability Engineer to support the reliability of First American’s mission-critical software systems. This transformative role involves automating IT infrastructure tasks and driving SRE best practices, tools, and processes. The ideal candidate should exhibit a growth mindset and proactively monitor and respond to incidents for optimal user experience.
What You’ll Do
Maintain and improve reliability of core software systems.
Prioritize customer satisfaction in all efforts.
Continuously learn and adapt to new technologies and methodologies.
Collaborate effectively with stakeholders and other Engineers.
Quickly respond to changes and resolve issues.
Take accountability for issue resolution and prevention.
Utilize automation tools to streamline processes and minimize manual intervention.
What You’ll Bring (At Least 5-7 Years’ Experience)
Bachelor’s degree in Computer Science, Information Technology, or equivalent education and experience.
Expertise in application performance monitoring, observability, and proactive alert correlation, including monitoring containers and failure-based alerting.
Skilled in defining service level objectives, measuring service level indicators, and setting up error budgets.
Strong understanding of SRE practices: incident response, change/release management, capacity planning, infrastructure automation, elastic environments, chaos engineering and blameless postmortems.
Successful in improving CI/CD pipelines and build/release processes.
Experienced in creating an SRE adoption framework and onboarding procedures.
Technology Stack
Cloud Computing Platform: AWS (Lambda, EC2, ECS, EKS, Fargate, RDS, S3, DynamoDB, SQS)
Monitoring and Logging Tool(s): AppDynamics, Splunk, ELK Stack, Prometheus, AWS CloudWatch/X-Ray
Networking Technology: Protocols, Load Balancers, Firewalls
Programming: C# .NET, PowerShell, Python, YAML
Code Repos: Azure Repos, GitHub
Infrastructure as Code: Terraform
Automation Tools: Jenkins, Chef, Puppet
Pay Range: $87,945 – $182,655 Annually
This hiring range is a reasonable estimate of the base pay range for this position at the time of posting. Pay is based on a number of factors which may include job-related knowledge, skills, experience, business requirements and geographic location.
What We Offer
By choice, we don’t simply accept individuality – we embrace it, we support it, and we thrive on it! Our People First Culture celebrates diversity, equity and inclusion not simply because it’s the right thing to do, but also because it’s the key to our success. We are proud to foster an authentic and inclusive workplace For All. You are free and encouraged to bring your entire, unique self to work. First American is an equal opportunity employer in every sense of the term.
Based on eligibility, First American offers a comprehensive benefits package including medical, dental, vision, 401k, PTO/paid sick leave and other great benefits like an employee stock purchase plan.