AI Infrastructure Engineer - EA Experiences - 214202 - Electronic Arts

General Information

Locations: Galway, Ireland

Role ID

214202

Worker Type

Regular Employee

Studio/Department

Marketing

Flexible Work Arrangement

Hybrid

Description & Requirements

Electronic Arts creates next-level entertainment that inspires players and fans around the world. Here, everyone is part of the story. Part of a community that connects across the globe. A place where creativity thrives, new perspectives are welcome, and ideas always matter. A team where everyone contributes to the game.

EA Experiences group (XO) is dedicated to ensuring great experiences for our growing communities centered around our world-renowned brands, including fan-favorites like Apex, Battlefield, EA SPORTS FC, Madden NFL and The Sims, just to name a few. We're a multi-functional group, with world-class expertise building fandoms, driving interactive storytelling, and positioning our franchises at the center of the broader entertainment ecosystem. We inspire, connect, and engage fans through culturally relevant content, intentionally architected journeys across channels, and meaningful fan care. Our goal is to provide valuable, easy experiences that fans love – in our games, around our games, and through innovative adjacent experiences to grow and enrich how fans experience EA as we shape the future of entertainment.

To empower more players and fans in new and amazing ways, we need more innovators to join our world-class team. The future of entertainment is interactive, and you can help lead that future by growing and enriching how hundreds of millions of people (and counting) find joy and belonging, forge friendships, and celebrate their lived experiences through the work we do together every day.

You will be the hands-on AI Infrastructure Engineer for our AI and machine learning platform, reporting to the Director, Agentic Solutions. You will design, build, and operate the cloud foundation our models and production AI agents run on, going deep in AWS to make the platform reliable, secure, and cost-effective at scale. You'll bring MLOps and AIOps together: the training, serving, and monitoring infrastructure teams build on, with MLflow-based experiment tracking, model registry, and pipelines on one side, and self-monitoring, self-healing systems on the other. You'll architect and ship the CI/CD, observability, and infrastructure-as-code standards that the rest of XO builds on, and you'll still go deep in the code when the work calls for it. You will define requirements, rapidly prototype, iterate with stakeholders, and establish reusable architectures, standards, and patterns using the latest AI engineering methodologies, models, tools, and platforms. You're creative, innovative, self-motivated, and team-first, equally strong at problem-solving and collaborating across product, data, security, IT, and engineering teams. You will build scalable ML and AI pipelines that let teams spend more time on high-value, creative, and strategic work. You will be a hybrid worker, collaborating with teams 3 days a week from the office; international travel to collaborate with global teams is an added bonus.

Responsibilities

Own the MLOps platform: build and operate the platform teams use to train, track, version, and deploy models, with MLflow for experiment tracking, model registry, and lineage.
Run the ML pipelines: design and operate training, validation, and deployment pipelines, including automated retraining when data or model performance drifts.
Serve models at scale: stand up real-time and batch inference infrastructure, including GPU-backed and LLM serving, and make the calls on hosted versus self-managed serving.
Monitor models in production: put drift detection, data quality checks, and performance tracking in place, with alerts that trigger action.
Drive AIOps: build self-monitoring, self-healing systems on event-driven automation, with anomaly detection, predictive alerting, and automated remediation.
Architect infrastructure as software: implement programmable IaC (AWS CDK preferred) plus reusable patterns, shared libraries, and platform standards across teams.
Establish observability and traceability: make services, pipelines, models, and data flows visible end to end.
Govern CI/CD and continuous training: design pipelines with security and compliance controls built in (DevSecOps and MLSecOps).
Secure the platform: enforce least privilege, identity management, and continuous validation across infrastructure, models, and data.
Own reliability: define SLIs/SLOs, run incident response and postmortems, and continuously improve reliability.
Partner and mentor: work with teams across XO, guide engineers, and shape architecture decisions.

Your Qualifications

7+ years designing, building, and operating production-grade infrastructure and platforms, with strong software engineering, security, and reliability best practices.
Hands-on MLOps experience is the core of this role: building and operating ML platforms with experiment tracking, model registry, and automated training and deployment pipelines (MLflow, or equivalents such as Kubeflow or SageMaker).
Deep, hands-on AWS experience across compute and serverless (Lambda, ECS/Fargate, containers), storage, networking (VPC), IAM, observability and telemetry (CloudWatch, tracing, structured logging), and secrets management; experience with SageMaker and Amazon Bedrock is a strong plus.
Experience running AIOps practices: anomaly detection, predictive alerting, automated remediation, and self-healing systems built on event-driven automation.
Strong infrastructure-as-code and CI/CD experience (CDK preferred; Terraform or CloudFormation), with a track record of building for reliability, scale, and cost efficiency.
Experience with ML pipeline orchestration (Airflow, Kubeflow, SageMaker Pipelines, or Step Functions) and model serving and inference (SageMaker, Bedrock, KServe, Seldon, or Triton).
Experience with model and data monitoring, including drift detection and data quality.
Strong Python skills; working knowledge of at least one additional language (TypeScript/Node.js, Go, Java, or C#).
Deep experience with observability tools (Datadog, Prometheus, Grafana, OpenTelemetry) and debugging distributed systems.
Solid grasp of the ML lifecycle, from training and evaluation through deployment, monitoring, and retraining.
Experience navigating the legal, ethical, and security implications of AI, including data privacy, IP, and safety, and translating policy into engineering controls.
Ability to thrive working both collaboratively and independently, with excellent creative, critical-thinking, and problem-solving skills and a demonstrated ability to articulate complex technical concepts clearly.
LLMOps experience (serving and fine-tuning LLMs, vector databases, and RAG infrastructure), feature stores (Feast, Tecton, or SageMaker Feature Store), GPU and accelerator infrastructure, Kubernetes (EKS), or Data Lakehouse platforms (e.g., Databricks) is beneficial.
Experience working in a gaming company or large-scale consumer platform is beneficial.

About Electronic Arts

We're proud of our extensive portfolio of games and experiences, our locations around the world, and the opportunities EA offers. We value adaptability, resilience, creativity, and curiosity. From leadership that brings out your potential to creating space for learning and experimentation, we empower you to do great work and pursue opportunities for growth.

We take a holistic approach to our benefits programs, emphasizing physical, emotional, financial, career, and community wellness to support a balanced life. Our packages are tailored to local needs and may include healthcare coverage, mental well-being support, retirement savings, paid time off, family leave, complimentary games, and more. We nurture environments where our teams can always bring their best.

Electronic Arts is an equal opportunity employer. All employment decisions are made without regard to race, color, ethnicity, national origin, sex, gender identity or expression, sexual orientation, age, genetic information, religion, disability, medical condition, marital or family status, veteran status, or any other characteristic protected by law. We will also consider applications from qualified individuals with criminal records in accordance with applicable law. EA makes workplace accommodations for employees and applicants with disabilities as required by applicable law.