Cloud modernization is advancing. Utilization isn’t

May 5, 2026, 08:00

At Datadog, an observability and security platform for cloud applications, I work on research studies that analyze anonymized infrastructure telemetry from thousands of production environments across Kubernetes, managed container platforms and serverless services across cloud providers. The datasets span multiple cloud providers and billions of workload hours. Much of that work goes into our annual reports on container and serverless adoption, where we examine how organizations run workloads in modern cloud environments.

Over the past few years, one question kept coming up as we updated these reports: As cloud platforms become more granular and autoscaling adoption increases, does resource efficiency improve?

Going into this work, I didn’t have a formal hypothesis about utilization improving over time. But there was an implicit assumption—one that felt reasonable. As platforms became more granular and autoscaling adoption increased, resource efficiency should improve at least incrementally.

It didn’t.

When we compared successive editions of the research, including the 2023 Container Report and the 2025 State of Containers and Serverless, the answer was less straightforward than expected. The share of Kubernetes workloads running well below their requested CPU and memory levels remained broadly consistent between reports.

That persistence raises an uncomfortable question: If modernization alone doesn’t improve utilization, what does?

Rapid evolution in cloud infrastructure

Cloud environments today look markedly different from even three years ago.

In the 2023 Container Report, we found that over 65% of Kubernetes workloads were using less than 50% of their requested CPU and memory. That report examined container telemetry across thousands of production environments to understand how teams run Kubernetes workloads.

Two years later, the 2025 State of Containers and Serverless expanded the scope of the research to look at broader compute patterns, including the growing mix of containers and serverless, while continuing to analyze Kubernetes workloads.

Using the same <50% threshold for comparison, the 2025 data showed little change: even as organizations adopted newer compute models and expanded autoscaling, the overall utilization pattern remained broadly similar to 2023.

At a surface level, the modernization between those report cycles is obvious: More granular compute models, broader instance diversification, increased use of managed services and deeper abstraction.

Looking only at platform capabilities and adoption trends, this appears to be steady operational maturity, the kind often discussed in CIO.com’s own coverage of cloud strategy.

If modernization alone were enough, we would expect to see measurable improvement in utilization patterns. The data suggests otherwise.

The utilization baseline barely moved

Using the same <50% threshold for comparison, the 2025 data shows a familiar pattern. In October 2025, 72% of Kubernetes workloads were using less than 50% of their requested CPU, and 62% (vs. 65% in 2023) were using less than 50% of their requested memory.

In other words, most workloads still operate well below their provisioned capacity.

Looking even closer, the distribution becomes more pronounced. In October 2025, 57% of workloads were using less than 25% of requested CPU, and 37% were using less than 25% of requested memory.

This is not marginal inefficiency at the edges. It reflects a large share of workloads running far below their requested baseline.
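These threshold shares are straightforward to reproduce from per-workload telemetry. A minimal Python sketch, using invented sample values rather than anything from our dataset:

```python
# Share of workloads below a utilization threshold, measured relative
# to requested resources. Sample values are illustrative only.
workloads = [
    {"cpu_used": 0.2, "cpu_requested": 1.0},  # 20% utilization
    {"cpu_used": 0.9, "cpu_requested": 1.0},  # 90% utilization
    {"cpu_used": 0.6, "cpu_requested": 2.0},  # 30% utilization
    {"cpu_used": 1.5, "cpu_requested": 2.0},  # 75% utilization
]

def share_below(workloads, threshold):
    """Fraction of workloads whose CPU utilization (used / requested)
    falls below `threshold`."""
    below = sum(
        1 for w in workloads
        if w["cpu_used"] / w["cpu_requested"] < threshold
    )
    return below / len(workloads)

print(share_below(workloads, 0.50))  # 0.5  (2 of 4 below 50%)
print(share_below(workloads, 0.25))  # 0.25 (1 of 4 below 25%)
```

The same function works for memory by swapping in memory usage and requests; it is the shape of the resulting distribution, not any single workload, that produces the pattern described here.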

When I saw those updated numbers in the 2025 report, I was a little surprised. Not because I expected perfection, since cloud systems are inherently uneven, but because I expected at least some measurable drift toward tighter provisioning as platform sophistication increased.

Instead, the overall distribution remained remarkably persistent.

To be clear, this does not imply that teams are careless or that modernization efforts failed. It suggests something more structural. Utilization behaves less like a short-term tuning issue and more like a stable characteristic of how systems are configured and operated over time.

A longitudinal comparison between the 2023 and 2025 data shows that individual workloads churn, clusters scale and instance types diversify, yet the aggregate distribution remains comparatively steady. That persistence stood out more than any single annual trend.

Importantly, the longitudinal data does not explain why that persistence exists. It only shows that modernization at the platform layer does not automatically reshape the utilization distribution.

At scale, persistent underutilization also has cost implications. Even if individual workloads appear inexpensive, conservative provisioning raises the baseline against which budgets are set.

Over time, that baseline becomes normalized, shaping cloud forecasts, contract negotiations and infrastructure investment priorities.

Averages hide persistence

Infrastructure data is rarely evenly distributed; it is long-tailed.

A relatively small number of workloads drive sustained utilization. A much larger number are bursty, intermittently active or lightly used. When averaged together, the system appears stable even when individual components are dynamic.

Averaging utilization metrics can therefore be misleading. An average implies symmetry. In practice, resource usage is asymmetric. Extreme values often drive cost and capacity exposure, while the median workload remains comparatively quiet. When those extremes are averaged away, the signals that matter most are softened.
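A small Python illustration of that softening effect, with invented utilization percentages shaped like a long-tailed fleet:

```python
# Mean vs. median on a long-tailed utilization distribution.
# Values are invented CPU utilization percentages: many lightly used
# workloads, a few hot ones.
import statistics

cpu_utilization_pct = [5, 8, 10, 12, 15, 19, 20, 25, 85, 95]

mean = statistics.mean(cpu_utilization_pct)      # pulled up by the two hot workloads
median = statistics.median(cpu_utilization_pct)  # what the typical workload does

print(mean)    # 29.4
print(median)  # 17.0
```

The mean suggests moderate use; the median shows the typical workload idling below 20%; and neither number alone reveals the two workloads that actually drive cost and capacity exposure.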

Partial instrumentation adds another layer. Not every workload produces the same performance and utilization data at the same level of detail. As organizations mix legacy systems with newer managed services, visibility gaps are common. Those gaps can skew aggregate metrics and create a false sense of stability or efficiency.

CIOs encounter similar issues when interpreting other aggregate metrics such as average latency, mean time to recovery or blended cloud spend. As CIO.com has noted in discussions of meaningful metrics, aggregation can obscure operational reality.

In infrastructure, that obscurity can persist for years.

What “utilization” measures

Before interpreting the trend, it is important to clarify what these utilization metrics measure.

In Kubernetes environments, utilization is typically measured relative to requested resources rather than raw machine capacity. Requests influence scheduling and reserve capacity on a node, shaping the baseline against which utilization is measured. But they also encode human judgment. Sometimes that judgment is based on load testing. Sometimes it reflects historical spikes. Sometimes it is simply conservative.

Two teams can run similar services and choose very different request baselines. The utilization metric will faithfully reflect that configuration choice.
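A toy Python example of that configuration dependence, with invented numbers:

```python
# Identical measured usage, different request baselines: the utilization
# metric faithfully reports the configuration choice, not just the workload.
def utilization(used_cores, requested_cores):
    """Utilization relative to the requested baseline, as it is
    typically measured in Kubernetes environments."""
    return used_cores / requested_cores

usage = 0.4  # CPU cores the service actually consumes

team_a_request = 0.5  # tightly sized request
team_b_request = 2.0  # conservative request with headroom

print(utilization(usage, team_a_request))  # 0.8 -> reads as efficient
print(utilization(usage, team_b_request))  # 0.2 -> reads as underutilized
```

Same service, same load, opposite verdicts. The metric is doing its job; it is the baseline that differs.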

That is one reason I am cautious about treating utilization as a moral signal. It is a technical metric, but it is also a reflection of configuration decisions embedded over time.

Looking at it over time shows what changes and what stubbornly does not, even as platforms evolve.

Autoscaling isn’t the same as precision

One obvious question is whether autoscaling adoption should materially change these patterns.

Horizontal Pod Autoscaling (HPA) is common across Kubernetes environments and widely supported across platforms. This reflects broader ecosystem trends described in the CNCF Annual Survey.

But elasticity is not the same as precision.

Many autoscaling configurations still center on CPU and memory signals. More context-aware scaling, based on queue depth or application-level indicators, remains less prevalent. Vertical scaling is comparatively rare and often used in advisory modes rather than actively reshaping requests.

Workloads can scale up and down without necessarily altering their baseline request posture or the broader utilization distribution we observe.
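The Horizontal Pod Autoscaler's core rule, described in the Kubernetes documentation, illustrates the distinction. Sketched below in simplified form (the real controller also applies a tolerance window and stabilization logic), it moves replica count toward a target utilization, while the requests that define that utilization remain a fixed input:

```python
import math

def hpa_desired_replicas(current_replicas, current_utilization, target_utilization):
    """Simplified HPA scaling rule from the Kubernetes documentation:
    desired = ceil(current_replicas * currentMetric / targetMetric).
    Utilization here is measured against requests; the controller
    changes the replica count, never the requests themselves."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# CPU at 30% of requested, target 60%: scale in
print(hpa_desired_replicas(4, 30, 60))  # 2
# CPU at 90% of requested, target 60%: scale out
print(hpa_desired_replicas(4, 90, 60))  # 6
```

Either way, the per-pod request baseline is untouched: the fleet scales elastically while every replica continues to carry the same conservative headroom.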

Enabling elasticity is straightforward. Sustaining precision over time is much harder.

Technical debt doesn’t disappear with new platforms

Another pattern that surfaced in the Container Reports is version lag. In both the 2022 and 2023 editions, a significant share of Kubernetes clusters were running versions approaching end-of-life even as newer releases were widely available.

Production systems rarely upgrade at the same pace as new platform capabilities are released. End-of-life versions persist. Premium support tiers extend. Older runtimes remain embedded even when more efficient versions are available.

Upgrades compete with feature delivery. Stability is prioritized. Risk is managed conservatively.

Version adoption does not directly determine utilization levels. But it reflects a broader dynamic: Configuration and upgrade decisions change more slowly than platform capabilities. When analyzed at scale, that inertia becomes visible. New tools layer onto existing systems, but earlier configuration assumptions often remain intact.

In practice, modern platforms often inherit older provisioning choices.

Capability is not outcome

Seeing the same utilization pattern persist across report cycles shifted my thinking. It was not about Kubernetes, serverless or autoscaling in isolation. It was about separating capability from outcome.

Cloud platforms today offer far more granularity than they did a few years ago. We can allocate resources in smaller increments. We can autoscale pods and nodes. We can mix execution models. We can diversify architectures.

None of that automatically changes the empirical shape of infrastructure usage.

Modernization creates new possibilities, but it does not automatically change how resources are used.

Across report cycles, it became clear that architecture was evolving faster than the underlying usage patterns.

That distinction has significant implications for how infrastructure performance—and investment decisions—are interpreted.

When platform evolution isn’t enough

If multiple years of visible platform evolution do not materially shift the utilization baseline, the constraint likely extends beyond feature availability.

What makes this pattern interesting is not that utilization is low in any single snapshot. It is that it remains low even as surrounding variables change. Platform capabilities evolve. Adoption curves shift. Workload composition becomes more heterogeneous. Yet the aggregate distribution remains comparatively stable.

That stability suggests something important: Modernization changes what is possible, but it does not automatically change how systems are configured or revisited over time.

For CIOs and senior technology leaders, the implication is not to pursue the next abstraction layer. It is to examine the decision frameworks that shape provisioning, headroom and risk tolerance year after year.

Cloud platforms will continue to evolve quickly. Whether utilization patterns change will depend less on new capabilities and more on how deliberately organizations revisit the assumptions embedded in their configurations.

This article is published as part of the Foundry Expert Contributor Network.



Neglecting the cloud? Good luck with AI

April 16, 2026, 07:01

An advanced cloud strategy is essential for most organizations to fully benefit from AI. Unfortunately for the majority of CIOs, their organizations still lack the cloud proficiency and investment to drive advanced deployments. Decisions to invest significantly in AI at the expense of cloud operations may only be making things worse.

According to a report from NTT DATA, just 14% of organizations are “cloud evolved,” the optimal level for AI success, with cloud-led innovation accelerating business transformation and cloud-native services embedded in core strategies.

Another 34% of senior IT decision-makers surveyed for the NTT DATA report consider their cloud approach “mature,” the next level down from evolved and defined as having broad and strategic cloud use across business units, with strong governance, best practices, and scalable workloads.

That leaves more than half of organizations behind the cloud curve for AI effectiveness, with more than a quarter simply “cloud enabled” and nearly a quarter being cloud novices.

Regardless of where CIOs find themselves on the cloud maturity curve, forgoing cloud investments to fund AI projects can be dicey.

Nearly nine in 10 IT leaders (88%) are worried that a lack of cloud investment at their organizations will put their AI, cloud-native, and modernization initiatives at risk. Despite AI driving more cloud use, 84% of survey respondents say their cloud spending has been flat over the past year.

Robbing Peter to pay Paul

The survey suggests that as organizations reallocate money for AI pilots, they’re neglecting the cloud, an essential piece of the AI puzzle, says Charlie Li, president and global head of cloud and security at NTT DATA.

“The cloud side is not getting the money,” he says. “The frustration is, ‘In order to do AI, I’ve got to spend money here, and I have no money here, but I’ve been throwing in a lot of money for AI. So I end up wasting a lot of money doing a bunch of pilots.’”

Some customers of NTT DATA have the budget to run dozens of AI pilots, but CIOs have not gotten any new money to spend on cloud services, Li adds.

“The CIO is sitting there going, ‘Wait a minute, I’ve got to do these things in the cloud side in order to be able to do AI, but I have no money to do it,’” he says.

NTT DATA sees cloud services as essential for AI development because of the computing power it requires, Li says.

“You need humongous scales of data and processing power,” he explains. “Those two things are what’s really led to the advent of the gen AI trend that we actually see today. You cannot do this with 100 servers sitting in your own data center.”

An evolved cloud strategy is also needed to run successful AI projects because of the data maturity that it enables, Li adds. “If you don’t have a mature cloud strategy or implementation, your data is still all over the place,” he says. “If you’ve got junk data, if you have poor data governance, none of your trained models are going to be accurate.”

No AI on messy clouds

Other cloud and AI experts agree that the cloud often plays a huge role in successful AI deployments. The link is quite direct, says Adnan Masood, chief AI architect at digital transformation solutions provider UST.

“I’ve yet to see an AI program scale cleanly on top of a messy cloud estate,” he says. “Teams can get a demo running that way, sure. Production is where weak data governance, brittle integrations, poor observability, and runaway compute costs show up.”

The NTT DATA survey, with respondents concerned about a lack of cloud spending, is what Masood sees in the market.

While on-premises AI projects can work in limited circumstances, particularly for highly regulated companies, most organizations benefit from a cloud approach, he adds.

“Enterprises can get some AI footing without a strong cloud strategy — usually a contained assistant, internal search layer, or a narrow automation flow — but the odds of scaling it cleanly are low,” Masood says. “In practice, on-prem only works when the management layer is mature — AI-ready data, orchestration, model serving, observability, security, cyber-recovery, and governance — and that is where most enterprises are still behind.”

Full cloud maturity can be the difference between deploying AI as a pilot and AI as an operational system, adds Quais Taraki, CTO at AI and data platform provider EnterpriseDB.

“Cloud maturity alone is not fully sufficient, but companies that are further along in the process of cloud maturity tend to have more modern data architecture, better governance, stronger interoperability across environments, and infrastructure that can actually support production-scale workloads without falling apart under real concurrency and data volume demands,” he says.

Cloud spending doesn’t necessarily lead to cloud maturity, however. Some companies that have invested heavily in the cloud also struggle with moving AI pilots into production, because they don’t have the data architecture to support the real-time, multi-environment workloads that production AI requires, Taraki says.

“Moving workloads to the cloud does not simplify an architecture that was fragmented before you moved it,” he adds. “What we see consistently is that cloud investment helps when it creates a more unified, flexible, and governed foundation where data and AI can operate together without silos.”

The wrong cloud investments

Cloud investment can hinder AI, however, when it creates increased operational complexity, when it includes pricing models that make advanced analytics and AI unpredictable to scale, and when vendor dependencies constrain how they respond to new requirements, Taraki adds.

“When the data and system governance architecture of the cloud environments are fragmented, teams spend too much time moving data across systems, absorbing unpredictable cost, and managing operational friction that compounds the moment AI becomes agentic and starts acting on data rather than just querying it,” he says.
