Description & Requirements
The EA Experiences group (XO) is dedicated to ensuring great experiences for our growing communities centered around our world-renowned brands, including fan favorites like Apex, Battlefield, EA SPORTS FC, Madden NFL, and The Sims, to name a few. We're a multi-functional group with world-class expertise in building fandoms, driving interactive storytelling, and positioning our franchises at the center of the broader entertainment ecosystem. We inspire, connect, and engage fans through culturally relevant content, intentionally architected journeys across channels, and meaningful fan care. Our goal is to provide valuable, easy experiences that fans love – in our games, around our games, and through innovative adjacent experiences – to grow and enrich how fans experience EA as we shape the future of entertainment.
To empower more players and fans in new and amazing ways, we need more innovators to join our world-class team. The future of entertainment is interactive, and you can help lead that future by growing and enriching how hundreds of millions of people (and counting) find joy and belonging, forge friendships, and celebrate their lived experiences through the work we do every single day, together.
AI Infrastructure Engineer
As the AI Infrastructure Engineer, you will design and operate AI-powered, cloud-native infrastructure that is self-monitoring, self-healing, and secure by default. You will lead the implementation of AIOps practices, build infrastructure as software using frameworks like AWS CDK, and establish standards for CI/CD, observability, traceability, and DevSecOps. Working across systems, you will enable reliable event-driven integrations and ensure continuous validation of system health, performance, and security. This is a hands-on leadership role combining deep technical execution with architectural ownership, driving scalable, autonomous, and resilient platform capabilities across the organization.
You will play a key role in transforming our infrastructure from traditional DevOps to AI-driven, autonomous operations. You will define how systems are built, integrated, secured, and operated—enabling teams to move faster while maintaining high reliability, visibility, and control at scale.
Responsibilities
Lead the design and implementation of AI-powered DevOps (AIOps) capabilities, including anomaly detection, predictive alerting, and automated remediation
Build and operate self-monitoring and self-healing systems using event-driven automation
Architect and implement infrastructure as software using programmable IaC frameworks (AWS CDK preferred)
Develop reusable infrastructure patterns, shared libraries, and platform standards across teams
Establish end-to-end observability and traceability across services, pipelines, and data flows
Design and govern CI/CD pipelines that provision, validate, and deploy infrastructure with embedded security and compliance controls (DevSecOps)
Define and implement security best practices, including policy enforcement, identity management, and continuous validation of system posture
Design and support event-driven integration patterns across internal systems, ensuring reliable communication and signal propagation
Define SLIs/SLOs and lead incident response, postmortems, and continuous reliability improvements
Mentor engineers and influence architecture decisions across teams
Qualifications
7+ years of experience in DevOps, Site Reliability Engineering (SRE), platform engineering, or systems engineering
Strong expertise in AWS across compute, storage, networking, and IAM
Experience designing and operating systems supporting both traditional services and AI/ML workloads
Advanced experience with Infrastructure as Code, with emphasis on AWS CDK and reusable infrastructure patterns
Deep experience with observability tools (Datadog, Prometheus, Grafana, OpenTelemetry) and distributed systems debugging
Strong experience building CI/CD pipelines with integrated infrastructure provisioning, testing, and security controls
Experience with event-driven architectures (Kafka, SNS/SQS) and system integrations
Proficiency in Python, TypeScript, or similar programming languages
Strong understanding of security principles, including least privilege, identity controls, and secure system design
Nice to Have
Experience with SageMaker, Bedrock, and AgentCore, as well as Kubernetes (EKS) and container orchestration at scale
Experience with Data Lakehouse platforms (e.g., Databricks) and data pipeline integration
Experience building internal developer platforms or shared infrastructure frameworks
Experience creating custom CDK constructs or platform tooling
Familiarity with policy-as-code and continuous compliance frameworks
Exposure to real-time systems or high-scale consumer platforms
Background in gaming, media, or large-scale consumer ecosystems