Darhost

2026-05-11 05:38:33

5 Crucial Insights from Kubernetes v1.36's Server-Side Sharded Watch Feature

Learn how Kubernetes v1.36's server-side sharded list and watch reduces controller overhead by filtering events at the API server, enabling efficient scaling.

In Kubernetes v1.36, a new alpha feature dramatically improves how large clusters handle controller watch traffic. The server-side sharded list and watch (KEP-5866) moves filtering from the client to the API server, slashing per-replica cost. Here are five essential things you need to know about this game-changing capability.

1. The Scaling Problem: Why Full-Stream Watching Hurts

As Kubernetes clusters scale to tens of thousands of nodes, controllers tracking high-cardinality resources like Pods hit a wall: every replica of a horizontally scaled controller receives the complete event stream from the API server. Each replica pays the CPU, memory, and network cost to deserialize every event, only to discard the objects it doesn't own. Scaling out the controller doesn't reduce per-replica cost; it multiplies the total. This linear growth in overhead becomes unsustainable, wasting infrastructure and slowing down the control plane. The server-side sharded list and watch feature addresses this waste directly by pushing filtering upstream, so each replica sees only what it needs.


2. Why Client-Side Sharding Isn't Enough

Some controllers, like kube-state-metrics, already support horizontal sharding: each replica is assigned a portion of the keyspace and discards objects that don't belong to it. While this works functionally, it does nothing to reduce the data flowing from the API server. Every replica still downloads, deserializes, and processes the entire event stream before throwing away most of it. Aggregate network bandwidth therefore grows with the number of replicas rather than shrinking with shard size, and the CPU cycles spent deserializing the discarded fraction are pure waste. Server-side sharding solves this by moving the filtering into the API server itself, cutting off unneeded data at the source and eliminating this hidden inefficiency.

3. How Server-Side Sharding Works

The feature introduces a shardSelector field in ListOptions. Clients specify a hash range using the shardRange() function, e.g., shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000'). The API server computes a deterministic 64-bit FNV-1a hash of the specified field (currently object.metadata.uid or object.metadata.namespace) and returns only objects whose hash falls within the range [start, end). This applies to both list responses and watch event streams. Because the hash function is consistent across all API server instances, the feature works safely in multi-replica API server deployments. The result is that each controller replica receives only its assigned slice of the resource collection, dramatically reducing unnecessary data transfer and processing.
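To make the range check concrete, here is a minimal, illustrative Go sketch of the selection logic described above. It is not the API server's implementation: the exact way the field value is canonicalized and fed into the hash is defined by the KEP, and the inShard helper and example UID here are purely hypothetical.

package main

import (
    "fmt"
    "hash/fnv"
)

// inShard reports whether the 64-bit FNV-1a hash of value falls in [start, end),
// mirroring the half-open range semantics of shardRange().
func inShard(value string, start, end uint64) bool {
    h := fnv.New64a()
    h.Write([]byte(value))
    sum := h.Sum64()
    return sum >= start && sum < end
}

func main() {
    uid := "5f1c6a9e-1b7e-4c2a-9d3e-0f4b8a2c7d11" // example Pod UID
    // Lower half of the 64-bit hash space, matching the selector used later in this post.
    fmt.Println(inShard(uid, 0x0000000000000000, 0x8000000000000000))
}

Because the hash is deterministic, any object lands in exactly one shard, and every API server replica makes the same placement decision for it.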

4. Implementing Sharded Watches in Your Controllers

Controllers built with client-go informers can easily adopt sharding by injecting the shardSelector into ListOptions via WithTweakListOptions. For example:

import (
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
)

// newShardedFactory returns an informer factory whose list and watch requests
// carry the shard selector, so the API server only sends this replica's slice.
func newShardedFactory(client kubernetes.Interface, resyncPeriod time.Duration) informers.SharedInformerFactory {
    shardSelector := "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')"
    return informers.NewSharedInformerFactoryWithOptions(client, resyncPeriod,
        informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
            // ShardSelector is the new alpha field described above.
            opts.ShardSelector = shardSelector
        }),
    )
}

For a two-replica deployment, selectors split the hash space in half—e.g., Replica 0 gets the lower half, Replica 1 the upper half. This setup ensures each replica processes only its own events, reducing CPU, memory, and network load proportionally to the shard size.
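For instance, with the half-open [start, end) ranges described earlier, the two replicas would inject selectors along these lines. How the very top of the hash space is covered (an inclusive end, or a sentinel maximum value) is a detail of the alpha API, so treat the second range as an assumption:

Replica 0 (lower half): shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')
Replica 1 (upper half): shardRange(object.metadata.uid, '0x8000000000000000', '0xFFFFFFFFFFFFFFFF')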

5. What's Next for Server-Side Sharding

Currently in alpha, the feature supports only metadata.uid and metadata.namespace as hash fields. Future enhancements may include support for custom fields and dynamic shard rebalancing. As the feature matures toward beta, more controllers and tools are expected to adopt it, making large-scale Kubernetes clusters more efficient and cost-effective. Administrators should start experimenting with shard selectors in non-production environments to understand the impact and prepare for wider rollout. This foundational change in how events are distributed promises to become a standard tool for scaling Kubernetes observability and management.

In conclusion, server-side sharded list and watch is a transformative feature for clusters at scale. By filtering events at the API server, it eliminates wasteful per-replica processing, reduces network bandwidth, and enables truly efficient horizontal scaling. As it moves from alpha to stable, adopting this approach will be key to maintaining performance and cost control in large Kubernetes environments.