Sergio García

AI Systems Developer | LLMs, RAG & Distributed Systems
sergiogglez@gmail.com | Spain · Remote (EU time zone), ES | LinkedIn | GitHub

AI Systems Developer with a background in Physics, specializing in building production-grade AI systems that combine large language models, distributed infrastructure, and real-world integrations. Experienced across MLOps, DevOps, and applied AI, with a strong focus on system design, modeling, and scalable architectures in complex environments.

Experience

AI Systems Developer (LLMs & RAG)

Capitole
Dec 2024

Design and development of production-grade AI systems leveraging Large Language Models (LLMs) to improve knowledge access and productivity in large-scale enterprise environments, supporting thousands of internal users.

  • Designed and implemented RAG and agent-based systems integrating enterprise tools and data sources (via MCP and APIs)
  • Built semantic retrieval pipelines using vector-based search for large-scale documentation and codebases
  • Developed multi-agent architectures for task orchestration, coordination, and decomposition
  • Addressed LLM context limitations by designing scalable strategies using sub-agents and context compaction
  • Extended agent frameworks with custom support for handoffs and interoperability (A2A protocol)
  • Implemented observability and tracing for LLM and agent-based systems in production environments
  • Integrated multiple LLM providers (OpenAI, Anthropic, Gemini) across different use cases

MLOps Developer (ML Platform & Model Serving)

Capitole
Apr 2023 - Nov 2024

Development of a production-grade MLOps platform enabling reliable deployment, serving, and lifecycle management of Machine Learning models within a large-scale enterprise PaaS environment (e-commerce), supporting high-throughput, low-latency inference workloads across multiple customer-facing channels worldwide.

  • Designed scalable model serving solutions supporting TensorFlow, ONNX, Scikit-learn, and other frameworks
  • Implemented REST and gRPC inference endpoints compliant with the V2 Inference Protocol
  • Optimized deployments on OpenShift for latency, concurrency, and resource efficiency (load testing and performance tuning)
  • Built CI/CD workflows for automated model promotion using GitHub Actions and Azure Machine Learning
  • Implemented monitoring and observability using Prometheus and Grafana for model serving workloads

Research & Development Software Developer (MLOps)

Atos
Mar 2022 - Apr 2023

Worked on European-funded research projects (FlexiGroBots, Grapevine), focusing on ML infrastructure, workflow orchestration, and cloud-native systems.

  • Managed Kubernetes clusters and a multi-user Kubeflow platform for ML experimentation
  • Developed Kubeflow pipelines supporting model serving and ML lifecycle workflows
  • Contributed to a Cloudify plugin for cross-platform (HPC + Cloud) workload orchestration
  • Coordinated technical work across project partners and contributed to project reviews

Software Developer (DevOps)

Ericsson
Jan 2021 - Mar 2022

DevOps-focused software development for 5G Core network systems, contributing to the deployment and reliability of cloud-native microservices in a telecom environment.

  • Deployed and managed microservices in distributed environments using Docker and Kubernetes
  • Developed and maintained CI/CD pipelines and development infrastructure
  • Automated build, deployment, and release processes
  • Validated the integration of third-party components to ensure system stability

Research & Development Software Developer

Alisys
Nov 2019 - Jan 2021

Applied AI development across speech, computer vision, and conversational systems, focusing on real-world integration and early-stage production feasibility.

  • Developed speech and speaker recognition systems using embedding-based models, and evaluated their performance
  • Designed a custom vector similarity search layer (cosine similarity) on top of MongoDB
  • Built real-time pipelines for processing large-scale telephone audio data
  • Integrated speech-to-text and text-to-speech services with intent-based conversational systems (RASA)
  • Developed conversational AI prototypes for public administration services

Global R&D Junior Researcher

ArcelorMittal
Jul 2017 - Jul 2019

Applied research in additive manufacturing combining computational modelling, simulation, and data infrastructure.

  • Designed and operated distributed data and compute infrastructure (Hadoop and Linux clusters)
  • Performed multi-scale simulations of additive manufacturing processes
  • Conducted molecular dynamics studies on graphene nanocoatings for steel
  • Prototyped an electromagnetic atomization system for steel powder production
  • Contributed to R&D projects (~€180k), including planning and technical execution

Global R&D Intern

ArcelorMittal
Jan 2017 - Jul 2017

Supported research activities in additive manufacturing, contributing to simulation tasks and technical documentation within industrial R&D projects.

Research Intern

IMDEA Materials
Jul 2014 - Oct 2014

Carried out a computational (Monte Carlo Dynamics) study of neutron-radiation damage in materials designed for future nuclear fusion reactors.

Skills & Tools

AI Systems
Agent-based systems • Large Language Models (LLMs) • Retrieval-Augmented Generation (RAG) • Vector search • A2A protocol • MCP (Model Context Protocol) • Embedding-based models
MLOps
MLServer • Kubeflow • Azure Machine Learning • Model lifecycle management
Data & Databases
MongoDB • Redis • Weaviate
Cloud & Distributed Systems
Kubernetes • OpenShift • Helm • Docker • Azure
Observability & Monitoring
Prometheus • Grafana • OpenTelemetry • Phoenix
Programming & Tools
Python • Bash • Linux • Git • CI/CD (GitHub Actions) • Agile Methodologies

Education

UNED (National Distance Education University)

Master's degree in Complex Systems Physics
Jan 2014 - Jan 2016

University of Oviedo

Bachelor's degree in Physics
Jan 2009 - Jan 2014

Languages

Spanish: Native
English: Professional (Cambridge C1)