Job Description
We have a position for a Sr Software Engineer (Site Reliability Engineering) with one of our clients in Atlanta, GA for an initial contract duration of 6 months.
No third-party candidates will be considered for this position.
This opening is for a Senior Site Reliability Engineer (SRE) on the Manheim Logistics SRE team. The SRE team is tasked with designing and maintaining AWS infrastructure and deployment pipelines for Manheim Logistics’ 15+ development teams. The team has standardized on a Docker-based infrastructure solution and is adding functionality to support new development team requests and architectural patterns such as Lambda, Step Functions, and Fargate. The SRE team has a strong focus on IaC with Terraform and on best practices such as least-privilege access and proactive monitoring and alerting. This role will work directly with a release train and help with IaC and SRE activities such as improving monitoring and alerting, defining an error budget, and assisting with DevSecOps.
As a Senior Site Reliability Engineer at Cox Automotive, you will:
- Bring strong automation experience in testing, deploying, monitoring, etc.
- Take complex problems and come up with a technically reasonable solution
- Work with and define SLOs, error budgets, etc.
- Have innate curiosity about how things work
- Design and assist in the authoring of software tools that reliably manage application delivery & performance
- Design and assist in the setup and maintenance of application monitoring and alerting
- Engage with engineering teams to ensure best practices are implemented
- Improve predictability and reliability of software releases, workflows, and operating software
- Reduce mean time to recovery (MTTR) by helping troubleshoot, monitor, alert, and automate recovery
Qualifications:
- Bachelor’s degree in Computer Science or related field and 3-5 years of work experience
- Expertise in software development, with architecture/solutioning experience
- Strong background with Terraform
- Experience with AWS technologies, especially ECS and Lambda
- Experience with monitoring/observability tools such as New Relic, Splunk, and PagerDuty
- Experience with agile development, continuous integration, and automated testing
- Solid written communication, problem-solving, and process management skills
Preferred Skills:
- Broad AWS platform skills including Cognito, WAF, ElastiCache (Redis), Elasticsearch, SNS, SQS, S3, and Systems Manager
- Experience automating Terraform at scale
- Experience with database server infrastructure (RDS, MySQL, Postgres, etc.)
- .NET Core development experience
- GitHub Actions
- Experience with GitHub, Docker, and Linux administration
Required Skills
Docker, Terraform, Linux, AWS, GitHub Actions
Preferred Skills
Monitoring, Python, .NET, New Relic, Splunk