Description & Requirements
Site Reliability Engineer – Service Quality team, Battlefield Online
Foundational Technology
Electronic Arts creates next-level entertainment experiences that inspire players and fans around the world. Here, everyone is part of the story. Part of a community that connects across the globe. A place where creativity thrives, new perspectives are invited, and ideas matter. A team where everyone makes play happen.
Motive is a creative studio with offices in Montréal. We believe in the power of diversity and welcome game creators from all backgrounds to collaborate with us as we unlock the potential for the future of Battlefield!
We’re always pushing to be at the forefront of creative entertainment - blending digital art, design, and technology to push boundaries. Our collaborative culture is fueled by passion, driving innovation and making a positive difference for our players and community.
At Motive, your ideas matter. We offer an inclusive space where you can thrive, be yourself, and grow alongside a team dedicated to making a meaningful impact on the world of gaming.
We’re all-in on the future and our most ambitious Battlefield yet. Want to be part of something special? Read on.
The Role
We are seeking an SRE with outstanding capability for running and scaling massive online service infrastructure. You will be at the heart of our operations, safeguarding the reliability, performance, and scalability of our game services. Your expertise in managing distributed, complex environments will be vital as you focus on service observability, alerting, 24/7 on-call support, operational security and cost optimization, all while ensuring uninterrupted gaming experiences at scale.
You will leverage your deep knowledge of Kubernetes, infrastructure as code with Jsonnet and Terraform, along with automated certificate management to deliver robust, high-availability autoscaling environments. The ability to anticipate, identify, and resolve challenges in large systems will be central to your success in this role.
You will work in a small distributed team that collaborates to create solutions for the Battlefield Franchise, using modern technologies and frameworks deployed to cloud-based infrastructure. You will work with multiple existing systems, some developed here at Battlefield Foundational Tech and some developed externally, which requires collaborating with many different teams within EA. You will report to a Development Director.
You will work in a hybrid model, spending 3 days a week at the office in Montreal.
RESPONSIBILITIES:
Operate and maintain our large-scale online game portfolio, ensuring exceptional uptime, secure environments and seamless player experiences across global infrastructure.
Design, implement, and manage observability solutions—monitoring, logging, and alerting—capable of supporting vast, distributed systems.
Participate in a 24/7 on-call rotation, leading incident response for our high-traffic services and driving continuous improvement based on root cause analysis.
Track, analyze, and optimize infrastructure and service costs across Battlefield Online’s expansive cloud ecosystem.
Automate infrastructure management using Jsonnet, Terraform and custom tools, enabling efficient scaling and rapid deployment of services.
Collaborate closely with engineering, operations, and product teams to enhance the quality, reliability, and scalability of our services.
Analyze feature designs and propose technical solutions for implementing them.
SKILLS:
Strong analytical skills to troubleshoot and solve complex technical challenges.
Excellent teamwork and communication skills for working in a cross-functional, globally distributed environment.
Linux administration experience for container orchestration platforms.
Experience with networking configuration and maintenance in public cloud environments.
Experience implementing data and infrastructure security best practices.
REQUIREMENTS:
5+ years of experience managing distributed, scalable, resilient, high-performing systems.
5+ years of relevant experience with public cloud services (preferably AWS), including design, implementation, and operational support of critical systems.
Experience with container workload technologies such as Kubernetes, Helm, and Docker.
Experience with several DevOps tools and methodologies, including infrastructure as code and GitOps.
Experience with monitoring/observability systems such as Prometheus, Grafana and Datadog.
Experience with continuous integration and delivery, using pipeline automation systems such as Jenkins, GitLab, and GitHub.
Experience developing automated solutions with at least one of the following languages: Python, Ruby, Go.
Demonstrated expertise in operating system and network security fundamentals for publicly accessible services hosted on Linux servers.
Experience implementing network resources in a public cloud context, including DNS, subnetting, route tables, NAT, and firewalls.
NICE TO HAVE:
Direct experience operating massive online game infrastructures or similar high-demand digital platforms.
Experience working in a multi-team, distributed development environment supporting large-scale projects.
Background in automating operational tasks and continuously improving service delivery pipelines for expansive online systems.