Kubernetes v1.36 GA: Pressure Stall Information (PSI) Metrics Now Stable for Production Workloads
Breaking: PSI Metrics Graduate to General Availability
Kubernetes v1.36, released today, marks a major milestone for node-level observability: Pressure Stall Information (PSI) metrics have graduated to General Availability (GA). This means operators can now rely on a stable, production-grade interface to detect resource bottlenecks—CPU, memory, and I/O—before they escalate into outages.

“PSI gives us the earliest possible warning of resource tension,” said Jane Chen, a contributor to the Kubernetes SIG Node. “Unlike traditional utilization numbers, PSI tells you how long tasks are actually waiting—and that’s the signal that matters in a live cluster.”
Background: Beyond Utilization
First introduced in the Linux kernel in 2018, PSI tracks the time tasks spend stalled due to resource shortages. Traditional metrics like CPU or memory utilization can be misleading: workloads on a node at 80% CPU utilization may still suffer severe latency from scheduling delays. PSI fills that gap by reporting a cumulative stall total alongside moving averages over 10s, 60s, and 300s windows.
These moving averages help operators distinguish between transient spikes and sustained pressure, enabling more accurate capacity planning and faster incident response. Until now, Kubernetes lacked a standardized, stable way to expose PSI metrics at the pod and container levels.
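To make the windows concrete, here is a minimal Python sketch that parses the text the Linux kernel emits in its pressure files (for example /proc/pressure/cpu) and compares the short and long averages. The sample values, the `classify` helper, and its 10% threshold are illustrative assumptions, not part of Kubernetes or the kernel.

```python
from dataclasses import dataclass


@dataclass
class PsiLine:
    avg10: float   # percent of time stalled, 10-second moving average
    avg60: float   # same, 60-second window
    avg300: float  # same, 300-second window
    total_us: int  # cumulative stall time in microseconds


def parse_psi(text: str) -> dict:
    """Parse kernel PSI text into {"some": PsiLine, "full": PsiLine}."""
    parsed = {}
    for line in text.strip().splitlines():
        # Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        kind, *fields = line.split()
        values = dict(field.split("=", 1) for field in fields)
        parsed[kind] = PsiLine(
            avg10=float(values["avg10"]),
            avg60=float(values["avg60"]),
            avg300=float(values["avg300"]),
            total_us=int(values["total"]),
        )
    return parsed


def classify(line: PsiLine, threshold: float = 10.0) -> str:
    """Hypothetical rule: the threshold is an assumption, tune it per workload."""
    if line.avg300 >= threshold:
        return "sustained"   # pressure has persisted across the long window
    if line.avg10 >= threshold:
        return "transient"   # only the short window is elevated
    return "calm"


# Made-up sample in the kernel's format, for illustration only.
SAMPLE = (
    "some avg10=24.50 avg60=6.10 avg300=1.30 total=4821337\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)

if __name__ == "__main__":
    for kind, line in parse_psi(SAMPLE).items():
        print(kind, classify(line))
```

In this sample only the 10-second average is elevated, so the "some" line is treated as a transient spike rather than sustained pressure.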
What This Means for Operators
With the GA graduation in v1.36, the Kubelet exposes PSI metrics at node, pod, and container granularity. Operators no longer need to rely on external agents or custom scripts to scrape kernel-level counters. This translates directly into:
- Earlier detection of resource contention before it impacts SLAs.
- Lower overhead: the cost of collection is negligible, as shown by the performance testing described below.
- Improved automation for cluster autoscaling and workload rebalancing based on actual stall signals.
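As a rough illustration of acting on stall signals, the sketch below turns two samples of a cumulative stall counter into the fraction of the interval that tasks spent waiting. The function name, the sample numbers, and the idea of comparing the result to an alert threshold are assumptions for illustration; it does not reproduce any specific Kubelet metric name or autoscaler logic.

```python
from typing import Optional


def stall_fraction(prev_total_s: float, curr_total_s: float,
                   interval_s: float) -> Optional[float]:
    """Fraction of the sampling interval spent stalled.

    Both totals are readings of a monotonically increasing counter of stall
    time in seconds, taken interval_s seconds apart.
    Returns None when the counter went backwards (for example after a
    container restart), since no meaningful rate exists for that interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_total_s - prev_total_s
    if delta < 0:
        return None
    return delta / interval_s


if __name__ == "__main__":
    # Hypothetical samples: 3 seconds of stall accumulated over 30 seconds.
    fraction = stall_fraction(12.0, 15.0, 30.0)
    print(f"stalled for {fraction:.0%} of the interval")
```

A monitoring rule could compare this fraction to a threshold chosen for the workload; monitoring systems that compute rates over counters perform the same subtraction and division.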
“This is a game-changer for cluster resource management,” added Chen. “We now have a first-class, stable metric that aligns with how Linux actually schedules work.”
Proving Stability: Performance Testing at Scale
A common concern with new telemetry features is the resource overhead of collection and serving. To address this, SIG Node conducted rigorous performance validation on high-density workloads (80+ pods) across different machine types. The tests isolated two scenarios:
- Kubelet overhead: Kubelet CPU usage with the PSI feature enabled versus disabled, with kernel PSI tracking already active in both cases.
- Kernel overhead: system-level CPU usage with kernel PSI turned on versus off, with the Kubelet feature active.
Scenario 1: Kubelet Overhead Is Negligible
On 4-core machines, both clusters had kernel PSI enabled by default. The Kubelet’s CPU usage showed practically identical bursts whether the feature was on or off. The extra cost stayed within 0.1 cores—just 2.5% of node capacity—well within safe production margins.
Scenario 2: Kernel PSI Adds Minimal System Load
System CPU usage on the PSI-enabled clusters followed the same pattern as on the clusters without PSI, with only a marginal increase over the baseline of 2.5 cores. The cost of Kubernetes reading the cgroup metrics proved to be only a fraction of the overall system cost.
“These numbers confirm that PSI is production-ready,” said Chen. “The overhead is so small it’s lost in the noise of normal Kubelet housekeeping.”
Immediate Availability
Kubernetes v1.36 is now available for download. Operators can enable PSI metrics by upgrading their clusters to v1.36 on nodes whose kernel has PSI enabled (the psi=1 boot parameter, which is the default on most modern distributions). No additional feature gate is required.
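Before upgrading, it can help to confirm that a node's kernel actually exposes PSI. The Python sketch below assumes the standard Linux location of the pressure files under /proc/pressure; the `psi_available` helper and its `proc_root` parameter are hypothetical names introduced here so the check can also be pointed at a test directory.

```python
from pathlib import Path

RESOURCES = ("cpu", "memory", "io")


def psi_available(proc_root: str = "/proc") -> bool:
    """Return True if every PSI pressure file exists and can be read.

    A missing file, or a read that fails because PSI is disabled in the
    kernel, surfaces as OSError, which is treated as "not available".
    """
    for resource in RESOURCES:
        path = Path(proc_root) / "pressure" / resource
        try:
            path.read_text()
        except OSError:
            return False
    return True


if __name__ == "__main__":
    state = "available" if psi_available() else "not available"
    print(f"Kernel PSI is {state} on this node")
```

Run on a node, this prints whether all three pressure files are readable. It checks only the kernel side and says nothing about the Kubelet version or configuration.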
For detailed migration guides and configuration examples, refer to the official Kubernetes PSI documentation.