Scaling Edge Operations at OneFootball with Edge Stack


Introduction: The Challenge of Scale
OneFootball is a media company serving 10 million monthly active users and delivering over 10TB of daily content. With a workload peaking at 6,000 requests per second (rps), we needed a Kubernetes-based API Gateway and Ingress solution that was not only scalable and reliable but also cost-effective and easy to maintain.
Our previous infrastructure relied on multiple cloud-based load balancers, causing unnecessary complexity, high operational overhead, and inflated costs. By migrating to Edge Stack API Gateway, we drastically reduced infrastructure costs, improved observability, and streamlined API management—all while maintaining a small SRE team.
This post explores our migration journey, the challenges we encountered, and the key benefits Edge Stack brought to OneFootball’s engineering ecosystem.
Why We Needed a Better API Gateway
Before Edge Stack, OneFootball’s architecture relied on:
✔ Over 50 microservices running in production
✔ Applications written in multiple languages (Golang, PHP, Python, and Node.js)
✔ Cloud-based load balancers (ELBs) for each service
✔ A CDN to handle media-heavy traffic
While this setup worked, it introduced several challenges:
- Excessive Load Balancers: Every new service required a new ELB and CDN configuration, leading to operational overhead and high costs.
- Scalability Issues: Traffic spikes—especially during events like the Cristiano Ronaldo transfer to Juventus—exposed limitations in our infrastructure.
- Lack of Observability: With distributed logs across multiple ELBs, debugging issues was difficult and time-consuming.
- Maintainability Concerns: SSL certificate renewals, DNS configurations, and monitoring each ELB individually became a growing engineering burden.
We needed a solution that could reduce complexity, increase visibility, and improve deployment velocity—all while being Kubernetes-native.
Why We Chose Edge Stack API Gateway
After evaluating multiple Kubernetes Ingress solutions, we chose Edge Stack API Gateway, which is built on the open source Envoy Proxy.
Key Factors That Led to Our Decision:
Native Kubernetes Integration: Configuration is managed via Kubernetes annotations, making it easy to deploy and declaratively manage API traffic.
Cost Reduction: We reduced our load balancers from ~100 to just 4, saving $2000+ per year in cloud costs.
Enhanced Observability: Integrated Prometheus monitoring provided powerful insights for our small SRE team.
Maintainability: Decoupling cluster settings from application delivery enabled faster feature rollout with minimal overhead.
Performance & Reliability: Envoy-powered Edge Stack provided low latency, traffic control, and dynamic routing without performance bottlenecks.
API Gateway Features: Beyond Ingress, Edge Stack offered traffic shadowing, request transformation, authentication, and rate limiting out of the box.
Migrating to Edge Stack: Our Process
Transitioning from load balancer-based routing to Edge Stack API Gateway required careful planning to minimize service disruptions.
Step 1: Creating a Centralized Helm Chart Repository
- We moved all configurations to a dedicated Helm chart repository, separating application logic from deployment configurations.
- This allowed us to version control API Gateway settings independently, making it easy to track changes across 40+ services.
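A centralized chart repository of this kind might be laid out as follows. This is a hypothetical sketch, not OneFootball's actual repo; directory and file names are illustrative:

```
charts/
  scores-api/              # one chart per microservice (hypothetical name)
    Chart.yaml
    values.yaml            # routing prefix, traffic weights, shared defaults
    values-production.yaml # per-environment overrides
    templates/
      deployment.yaml
      service.yaml         # Service carrying the Edge Stack Mapping annotation
```

Keeping the gateway settings in versioned chart templates means every routing change lands as a reviewable diff rather than an ad hoc load balancer tweak.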
Step 2: Defining Edge Stack Mappings for Each Service
- Edge Stack’s Kubernetes-native mappings allowed us to define traffic routing within service annotations, simplifying API management.
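In the annotation-based configuration style Edge Stack (then Ambassador) supported, a routing rule could ride along on the Service object itself. A minimal sketch, with a hypothetical service name and prefix:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: scores-api              # hypothetical service name
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v1
      kind: Mapping
      name: scores_api_mapping
      prefix: /scores/          # external path routed to this service
      service: scores-api       # in-cluster Service to forward to
spec:
  selector:
    app: scores-api
  ports:
    - port: 80
      targetPort: 8080
```

Because the Mapping lives next to the Service it routes to, adding a new endpoint no longer means provisioning a new ELB, just shipping one manifest.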
Step 3: Gradual Canary Migration with Weighted Traffic Routing
- To minimize risk, we leveraged weighted traffic routing (via Helm and Edge Stack) to gradually shift traffic from ELBs to Edge Stack.
- Example rollout:
  1. Deploy the API mapping
  2. Route 1% of traffic through Edge Stack
  3. Validate performance and monitoring
  4. Incrementally increase to 100% of traffic
  5. Remove the old ELB configuration
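Edge Stack's Mapping resource supports a weight field that expresses this kind of gradual shift declaratively. A sketch with hypothetical names (the initial ELB-to-Edge-Stack split itself would typically be handled at the DNS layer):

```yaml
# Canary Mapping: send ~1% of /scores/ traffic to the new backend.
# The default, unweighted Mapping for the same prefix receives the rest.
apiVersion: ambassador/v1
kind: Mapping
name: scores_canary
prefix: /scores/
service: scores-api-canary
weight: 1                  # raise gradually: 1 -> 10 -> 50 -> 100
```

Bumping the weight in the Helm values file and redeploying makes each increment an auditable, easily reversible change.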
Step 4: Standardizing Observability & Monitoring
- Integrated Prometheus monitoring with Edge Stack’s Envoy stats exporter, providing centralized logs, success rates, and failure analysis.
- Benefits:
- Unified monitoring for all API traffic
- Automatic alerting on failure spikes
- Faster debugging via centralized access logs
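Scraping those stats can be as simple as pointing Prometheus at Edge Stack's admin endpoint, which exposes Envoy metrics in Prometheus format. A sketch; the Service name and namespace are assumptions, though port 8877 with path /metrics is the usual admin default:

```yaml
# prometheus.yml fragment (sketch)
scrape_configs:
  - job_name: edge-stack
    metrics_path: /metrics          # Envoy stats in Prometheus format
    static_configs:
      - targets:
          - ambassador-admin.ambassador.svc:8877   # assumed Service/namespace
```

With a single scrape target replacing per-ELB metrics, success rates and latency for every service are visible on one dashboard.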
Key Benefits After Migration
Eliminated Complexity
- Consolidated 100+ ELBs into just 4
- Simplified SSL certificate management
- Streamlined DNS configuration
Reduced Costs
- Eliminated $2000+ per year in cloud expenses
- Optimized API gateway performance with fewer resources
Improved Observability & Debugging
- Integrated Edge Stack + Prometheus provided real-time insights
- Centralized API logs, making debugging much easier
Faster Deployments & Feature Rollout
- Kubernetes-native configuration enabled faster API updates
- Traffic shadowing & canary releases allowed safe production testing
Enhanced Security & Performance
- Fine-grained rate limiting & authentication
- Future-ready for mutual TLS (mTLS) and service mesh integration
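For illustration, Edge Stack models rate limiting as a declarative RateLimit resource. A sketch only: the name, label value, and numbers are hypothetical, and a matching labels stanza on the corresponding Mapping is also required to attach requests to this limit:

```yaml
apiVersion: getambassador.io/v3alpha1
kind: RateLimit
metadata:
  name: scores-api-limit                     # hypothetical name
spec:
  domain: ambassador
  limits:
    - pattern: [{generic_key: scores-api}]   # matches a label set on the Mapping
      rate: 100                              # requests allowed per unit
      unit: minute
```

Limits defined this way live in the same Helm charts as the Mappings they protect, so traffic policy ships through the same review pipeline as routing changes.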
Migrating to the Edge Stack API Gateway was a transformative move for OneFootball’s engineering team. By reducing infrastructure complexity, enhancing observability, and significantly cutting costs, we created a scalable and resilient API ecosystem capable of handling massive traffic spikes without compromising performance.
This journey highlighted the importance of choosing a Kubernetes-native solution that aligns with our goals for scalability, maintainability, and ease of deployment. It also demonstrated that strategic infrastructure changes can drive significant business value, even with a lean SRE team.
Looking ahead, we plan to explore advanced Edge Stack features like mutual TLS (mTLS) and deeper service mesh integrations to further enhance our API security and performance. We hope our story inspires other teams navigating similar challenges in their path toward simplified and scalable infrastructure solutions.