Benchmarking HTTP/3 Performance with Ambassador Edge Stack and Envoy Proxy
HTTP/3 Benchmarking Setup
The Server
The Client
The Network
The Results: High vs Low Latency (Europe -> US vs US -> US)
Small site performance:
Medium site performance:
Large site performance:
The Results: High Latency (Europe -> US)
Summary & Conclusion
Explore real-world benchmarking results between HTTP/3 vs. HTTP/2 and HTTP/1
With HTTP/3 being supported by all the major web browsers, including Chrome, Firefox, and Edge, and the official HTTP/3 RFC being finalized this month, now is the time that organizations are beginning a widespread rollout of this protocol. As leaders in the implementation of the HTTP/3 spec, the Envoy Proxy and Google teams have been working on rolling this out for quite some time and have seen considerable performance improvements. We were keen to explore the real-world results and have run a series of benchmarks with Edge Stack, which contains Envoy Proxy at its core.
This post compares the performance between HTTP/3 and its predecessors HTTP/2 and HTTP/1.1. When comparing such performance characteristics there are a lot of different factors that can influence the results. This is especially true when comparing UDP-based protocols (like HTTP/3) vs TCP-based protocols (like HTTP/2 and HTTP/1.1). Networks and load balancers may not prioritize these types of traffic the same way, and so a theoretically superior protocol design may perform worse if not all networks involved are tuned to handle the traffic appropriately.
HTTP/3 Benchmarking Setup
Some factors that we are able to consider and accommodate in this benchmark:
- web site size: how many total bytes are transferred
- web site chattiness: how many round trips are required to transfer the bytes
- latency of the network between client and server
- congestion of the network between client and server
- HTTP protocol negotiated between client and server
- TLS protocol negotiated between client and server
Not all of these factors are easy to control for, but this benchmark does the following:
The Server
Located in a GCP us-central region. Three different websites (“Small”, “Medium ”, and “Large”) are used to provide a representative range of content size and chattiness:
- The Small size/chattiness site is 1.5MB total consisting of 11 @ 1.4K round trips and 11 @ 141K round trips.
- The Medium size/chattiness site is 7MB total consisting of 51 @ 1.4K round trips and 51 @ 141K round trips.
- The Large size/chattiness site is 14MB total consisting of 101 @ 1.4K round trips and 101 @ 141K round trips.
All content is served with a custom backend deployed behind Ambassador Edge-Stack 3.0. The backend sets cache-control headers in order to ensure no content is cached by the client.
Ambassador Edge-Stack is configured to serve the exact same backend via two different hostnames. One of these hosts is configured to serve TLS v1.2 and the other host is configured to serve TLS v.1.3. Both hosts will serve HTTP 1.1, 2, or 3.
The Client
In order to accurately assess real-world performance, the Chrome web browser was used via Puppeteer scripting. Each page load was measured as the point when all round trips for the given site were completed.
The HTTP and TLS protocol versions were captured at the client to ensure the expected outcome of protocol negotiations.
Each tested configuration was sampled 10, 25, and 50 times with pauses between in order to get a representative set of samples of the network conditions at that time.
The Network
For a baseline comparison, the client is tested from two different locations, one in us-central and one in europe-west in order to get a low latency measurement and a high latency measurement.
Tests were performed at two different times during the day (t0 and t1) in order to account for any potential effect of global congestion trends, e.g. browsing during lunch, streaming after dinner, etc.
The Results: High vs Low Latency (Europe -> US vs US -> US)
These first graphs show how each protocol combination fared under different site sizes with small (US) and large (Europe) latencies. As expected, the higher the latency, the more difference there is in performance across different protocol versions.
Small site performance:
Medium site performance:
Large site performance:
The Results: High Latency (Europe -> US)
Zooming in on performance for high latency connections, there is definitely a big improvement going from HTTP/1.1 (H1) to either HTTP/2 (H2) or HTTP/3 (H3). Between H2 and H3, performance is close, but H2 seems to edge out H3 in most of the trials.
The one circumstance where H3 and H2 performance seems to converge is when both are using TLS v1.3 against the largest/chattiest site see (t0-tls13-europe-large and t1-tls13-europe-large). In trial t0-tls13-europe-large, they are roughly equal, and in trial t1-tls13-europe-large, H3 seems to outperform H2. This suggests there may be at least some network conditions that favor the H3 protocol.
Summary & Conclusion
Both HTTP/2 and HTTP/3 quite handily outperform HTTP/1.1, but between HTTP/2 and HTTP/3, it’s a much closer call. Much of the motivating factors for HTTP/3 are for specific use cases, such as lossy connections with mobile/cell phone applications, the Internet of Things (IoT), or emerging markets. If performance is important to your use case, you can get benefits from using HTTP/3, and you can also rest assured that you can use this without much of a penalty (even if worse case the connection falls back to HTTP/2).
It’s worth mentioning that with average packet loss on the Internet clocking in at 2%, investing in HTTP/3 support for your application can provide a forward-looking capability to handle any increased geographic-specific degradation in the network.
Only a limited subset of proxies, ingresses, and API gateways currently support a complete and well-tested implementation of HTTP/3, such as Ambassador Edge Stack, which is powered by Envoy Proxy. With the simple configuration required to enable HTTP/3 support, we recommend that platform teams experiment with and deploy this technology now.