The Problems with Annotation-based configuration

Cindy Mullins

October 10, 2024

I’m Cindy Mullins, the Community Manager here at Ambassador. Our Community Corner segments on LinkedIn feature deep dives into common questions we get in our Community about our products: Edge Stack, Telepresence, and Blackbird.

In this segment we’re talking about Edge Stack Kubernetes API Gateway configuration: specifically, annotations vs Kubernetes Custom Resource Definitions (or CRDs).

CRDs vs Annotation-based configuration

The word ‘annotations’ in the context of Kubernetes can mean a couple of different things. For example, there are human-readable service annotations that add context to your deployment: things like the service creator or owner, the build number, release, or image information or pointers to logging, monitoring, or audit repositories. These are useful meta tags and generally something we recommend at Ambassador.
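As an illustration, informational annotations on a Service might look like the sketch below. The `a8r.io/*` keys are an assumption based on Ambassador's annotation convention, and the Service name, owner, and URLs are placeholders made up for this example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: quote                        # hypothetical service name
  annotations:
    # Human-readable metadata only; none of these change how traffic is routed.
    a8r.io/owner: "platform-team"                          # placeholder owner
    a8r.io/repository: "https://example.com/acme/quote"    # placeholder URL
    a8r.io/logs: "https://example.com/logs/quote"          # placeholder URL
    a8r.io/runbook: "https://example.com/runbooks/quote"   # placeholder URL
spec:
  selector:
    app: quote
  ports:
    - port: 80
      targetPort: 8080
```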

But what I really want to address today is what you might call “annotation-based configuration,” or functional annotations. Annotation-based configuration is an approach most Kubernetes users learn early on, because it was the standard way to configure things before Custom Resource Definitions, or CRDs, came along.

The way these annotations work is that you first define a Service or an Ingress and then add an annotations block to its metadata. Inside that block you include functional declarations about the path traffic will be served on, the routing logic, the API version, or other key-value pairs configuring how that Service or Ingress will work. It’s like adding a bunch of footnotes onto the service.
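To make this concrete, here is a rough sketch of routing configured through an annotation. It assumes the `getambassador.io/config` annotation key and the `getambassador.io/v3alpha1` API version; the Service name, Mapping name, and prefix are placeholders, and the exact fields accepted may differ by Edge Stack version.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: quote                        # hypothetical service name
  annotations:
    # The whole routing definition lives inside a single string value.
    getambassador.io/config: |
      ---
      apiVersion: getambassador.io/v3alpha1
      kind: Mapping
      name: quote-backend
      hostname: "*"
      prefix: /backend/
      service: quote
spec:
  selector:
    app: quote
  ports:
    - port: 80
      targetPort: 8080
```

Note that Kubernetes only sees an opaque string here, so the routing rules are invisible to anything that does not parse the annotation itself.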

As configuration has grown more complex, with more options and specs available, these footnotes over time have grown longer and more detailed. This led to the creation of a cleaner approach to configuration using Custom Resource Definitions, or CRDs.

What is a CRD?

A custom resource is an extension of the Kubernetes API. A Custom Resource Definition (CRD) registers a new Kind with the API server, which then stores and serves objects of that Kind just like built-in resources. Many core Kubernetes functions are now built using custom resources, making Kubernetes more modular.

Edge Stack configuration is based on Custom Resource Definitions such as Mappings, Hosts, Listeners and Modules. For example, you create a Mapping by declaring a resource of kind Mapping and specifying how you want your routing logic to work. You specify the path and also any mapping rules you want to implement, like regex matches, path rewrites, timeouts and retries, as well as things like associating your Mapping with a particular Host. The Mapping is a modular resource that controls all the aspects of how routing should happen.
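A minimal sketch of a Mapping as a standalone resource might look like the following. The field names assume the `getambassador.io/v3alpha1` Mapping schema, and the names and values are placeholders, so check them against the reference for your Edge Stack version.

```yaml
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: quote-backend       # hypothetical Mapping name
spec:
  hostname: "*"             # match requests for any host
  prefix: /backend/         # path this Mapping serves
  rewrite: /                # path rewrite applied before forwarding
  service: quote            # upstream Kubernetes Service (placeholder)
  timeout_ms: 4000          # request timeout in milliseconds
  retry_policy:
    retry_on: "5xx"         # retry when the upstream returns a 5xx
    num_retries: 3
```

Everything about this route sits in one object with its own name, which is what makes it easy to list, edit, and version separately from the Service it points to.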

Similarly, a Host resource specifies the hostname by which Edge Stack will be reachable, how TLS certificates should be handled, and handling for secure and insecure requests. A Listener CRD declares the ports Edge Stack API Gateway should listen on and which traffic protocol to listen for. There are a lot of customizable variables available for traffic routing and management, and Edge Stack allows you to define and maintain these specifications via CRDs.
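As a rough sketch, a Listener and a Host could be declared as below. The field names assume the `getambassador.io/v3alpha1` schemas; the hostname, port, and Secret name are placeholders for illustration.

```yaml
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
  name: https-listener      # hypothetical name
spec:
  port: 8443                # port Edge Stack listens on
  protocol: HTTPS           # traffic protocol to listen for
  securityModel: XFP        # decide secure vs insecure from X-Forwarded-Proto
  hostBinding:
    namespace:
      from: ALL             # accept Hosts from every namespace
---
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
  name: example-host        # hypothetical name
spec:
  hostname: api.example.com # placeholder hostname
  tlsSecret:
    name: example-tls       # placeholder Secret holding the certificate
  requestPolicy:
    insecure:
      action: Redirect      # send insecure requests to HTTPS
```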

CRDs may be new to some Edge Stack users. In fact, a common getting-started error traces back to users not having applied Edge Stack’s default CRDs, a step that is part of our installation instructions.

Annotations-as-config are unwieldy

One major issue is that routing logic has become more complex, with more options for managing your traffic flow. When that logic is attached to your Service as annotations, the ‘footnotes’ grow longer and more complicated, and they become harder to separate conceptually from the rest of your config.

Generally annotations are best used when kept simple. In our experience, it’s better not to denote the entire body/structure of more complex objects by using annotations.

Annotations are formed as key-value pairs and there are restrictions on allowable size:

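Key: Same rules as labels. The name segment is limited to 63 characters, with an optional DNS-subdomain prefix of up to 253 characters followed by a slash.
Value: There is no strict limit on the size of an individual annotation value, but the total size of all annotations (keys and values combined) must not exceed 256 KiB per object.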

This means that if you had an excessive amount of Edge Stack configuration on one resource, like a service, you might hit that limit and be forced to break the service apart. Generally speaking, having a lot of configuration in a single annotation is a bad idea.

Makes Upgrading Harder to Manage

Annotations-as-config are also harder to keep track of because they’re not modular or self-contained. When you need to update an API version, for example, you’ll have to go and track down all that logic embedded in your Services.

We’ve had many cases where users have a difficult time locating these annotations to amend them as part of an upgrade. Having v1 annotations in your configuration, for example, can cause Edge Stack version 3.x not to work. When resources are applied as annotations, it can be much more difficult to find these versioning issues and address them.

Harder to Maintain

If you applied your config as an annotation and later try to look up a resource with `kubectl get Mapping`, you won’t find it, because it was never created as a custom resource. Instead, you’d have to go find where this Mapping-like functionality was defined inside the annotations of your Service definition. When it really counts, for example in case of an outage, this can add extra time to your search and recovery efforts.

With CRDs, you can run `kubectl get mappings -A` to see all of your Mappings at a glance. And with output columns such as the path on our Mapping resource, it is much easier for developers and cluster operators to tell what config is applied and whether there are any errors on those resources.

Without this info, you might have to check application logs or some third party/external UI application to see if your config works or not.

If you ever need to edit a resource (`kubectl edit`) or output its YAML (`kubectl get <resource> -o yaml`), complex objects stored in annotations are also much harder to read, because much of the time the output is full of escape characters instead of being neatly organized with easy-to-read indentation and new lines.
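To illustrate the difference, the sketch below contrasts how the same routing rule can appear in YAML output. The first document shows annotation-based config rendered as one escaped string; the second shows the equivalent Mapping resource. Whether kubectl prints an escaped string or a block scalar depends on the string contents and your tooling, and the names are placeholders.

```yaml
# Annotation-based config: the Mapping is an escaped string inside the Service.
metadata:
  annotations:
    getambassador.io/config: "---\napiVersion: getambassador.io/v3alpha1\nkind: Mapping\nname: quote-backend\nprefix: /backend/\nservice: quote\n"
---
# CRD-based config: the same rule as structured, indented fields.
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: quote-backend
spec:
  prefix: /backend/
  service: quote
```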

Harder to Integrate

Annotation-based config is also harder to integrate with tools. For example, with GitOps tools like Argo CD, you can set up alerts when config such as custom resources fails to apply. Annotations, by contrast, will apply regardless of whether the config is valid, and you won't know until you actually test the behavior.

Harder to Validate

The validation processes are very different. With CRDs, resources are validated against their schema both before they are applied and at apply-time, so you are told right away when there is any invalid config.

With a CRD, when the resource is first applied, Edge Stack’s validation system won’t let you save invalid config. So any time you do an ‘apply’ command or an ‘edit’ command with kubectl, bad data types are caught by the Kubernetes validator that checks against the definitions provided by Ambassador.

Some specifications do need to be added into the CRDs by the developer, but setting things up this way will make your config much more transparent. This prevents some invalid config like inputting a string for a field that only accepts integers, or omitting required fields, etc.
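As a sketch of the kind of mistake apply-time validation can catch, the Mapping below puts a string in `timeout_ms`. This assumes `timeout_ms` is declared as an integer in the Mapping CRD schema; the names are placeholders.

```yaml
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: quote-backend            # hypothetical Mapping name
spec:
  hostname: "*"
  prefix: /backend/
  service: quote
  timeout_ms: "four seconds"     # invalid: a string where an integer is expected
```

With the CRDs installed, a `kubectl apply` of this manifest should be rejected rather than saved. The same mistake inside an annotation string would be stored without complaint, because Kubernetes never inspects the contents of an annotation value.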

Edge Stack will catch some invalid annotation-based config during its internal checks, but most of the safety comes from the apply-time validation I just described, which annotations do not get. If Edge Stack doesn’t catch an invalid or deprecated annotation, the problem may only be observable at runtime. In that case Edge Stack will just log an error and ignore the resource, which could cause some part of the deployment not to work.

Catch Config Errors Before Runtime

Catching errors at apply-time helps avoid runtime errors. Otherwise you’d have to test your config by applying it and either observing the results or relying on a third party/external UI application to try to check the config. If not adequately checked, this could have unintended side-effects.

For example, imagine a scenario where you’re using the `Filter`/`FilterPolicy` CRDs to configure authentication to protect your services. If you use CRDs, you can at least guarantee that because the CRDs were successfully applied, the config itself is valid.

But let’s say you used annotations to configure the `Filter`/`FilterPolicy` and applied it. There are no apply-time checks. You’d assume your services are now secure but if that `FilterPolicy` had invalid config, it might not even be in effect so your services would be exposed and unprotected. There would of course be logs showing errors, but you would need to manually check the logs after every annotation config change to see whether or not it’s working properly.
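For comparison, a sketch of the CRD-based version of this setup might look like the following. The field names assume the `getambassador.io/v3alpha1` `Filter` and `FilterPolicy` schemas, and the JWKS URL, resource names, and path are placeholders, so verify them against the reference for your Edge Stack version.

```yaml
apiVersion: getambassador.io/v3alpha1
kind: Filter
metadata:
  name: jwt-filter               # hypothetical Filter name
spec:
  JWT:
    jwksURI: "https://idp.example.com/.well-known/jwks.json"  # placeholder URL
    validAlgorithms:
      - RS256
---
apiVersion: getambassador.io/v3alpha1
kind: FilterPolicy
metadata:
  name: backend-policy           # hypothetical FilterPolicy name
spec:
  rules:
    - host: "*"
      path: /backend/*           # requests under this path must pass the filter
      filters:
        - name: jwt-filter       # refers to the Filter defined above
```

Because both objects are validated when they are applied, a malformed rule fails at apply-time instead of silently leaving the path unprotected.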

Potential Resource Drain

Edge Stack is built on the Envoy Proxy and uses Envoy for all traffic routing and proxying. Envoy is a modern L7 proxy that is used in production at companies including Lyft, Apple, Google, and Stripe.

With Edge Stack, anything that causes Envoy reconfigurations can result in higher CPU usage. This is uncommon, but annotations can sometimes apply across namespaces, which are continuously monitored along with Secrets and ConfigMaps, and as such they can be culprits of higher resource demand.

Allowed but Not Supported

Since we can’t test all Kubernetes annotations as they might conceivably be used, Edge Stack allows them as a legacy configuration option but does not recommend them. In general, CRDs are a more robust, scalable, and methodical way to manage your routing config and the logic around managing traffic to your Services. If there is a support issue with annotations our suggestion will usually be to convert them to CRDs which we can then help with.

When are annotations a good idea?

There are a few cases where annotations are a good idea, and are even required. You’ll use an annotation when using Edge Stack with an Ingress resource, and also for some configuration with a cloud provider, for example when specifying an AWS load balancer type or when putting an L7 GKE load balancer in front of Edge Stack using the Ingress-GCE resource. But these are limited situations with specific annotations that you would use for each case.
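Two of these cases can be sketched as follows. The first document assumes the `kubernetes.io/ingress.class: ambassador` annotation for routing an Ingress through Edge Stack; the second assumes the `service.beta.kubernetes.io/aws-load-balancer-type` annotation on the Edge Stack Service. Resource names, the selector label, and ports are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: quote-ingress                        # hypothetical name
  annotations:
    kubernetes.io/ingress.class: ambassador  # hand this Ingress to Edge Stack
spec:
  rules:
    - http:
        paths:
          - path: /backend/
            pathType: Prefix
            backend:
              service:
                name: quote                  # placeholder upstream Service
                port:
                  number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: edge-stack                           # placeholder Service name
  namespace: ambassador
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"  # request an AWS NLB
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: edge-stack       # placeholder selector label
  ports:
    - name: https
      port: 443
      targetPort: 8443
```

In both cases the annotation is a single, well-known key with a short value, which is the kind of use annotations handle well.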

In general, in our view, the correct way to use annotations is for things like providing operational metadata, storing reference or cataloging information like repo locations and external tools, and notes on your services for future reference and for others on your team.

I hope this provides a better understanding of why at Ambassador we’ve taken the CRD approach and how CRDs help you create a cleaner, more reliable, and more maintainable API Gateway configuration. This has been Community Corner with Cindy. Until next time!
