
AI, Cyberwarfare, and Autonomous Weapons: Inside America’s New Military Strategy

The Pentagon is integrating AI into military operations, transforming cybersecurity, targeting, and command systems into a unified warfare architecture.

May 2026 marks a turning point in the evolution of modern warfare: the convergence of artificial intelligence, cybersecurity, and conventional military power is no longer theoretical. It is becoming an operational reality.

The Pentagon has signed agreements with major technology companies, including OpenAI, Google, Microsoft, Amazon, and SpaceX, to integrate advanced AI models into classified military networks. The stated goal is clear: transform the United States into an “AI-first” military force capable of maintaining decision superiority across every battlefield domain.

Under this strategy, AI is no longer treated as a laboratory tool or analytical assistant. It is moving directly into the military chain of command, intelligence analysis, logistics, targeting, and operational planning. More than 1.3 million Department of Defense employees are already using the GenAI.mil platform, cutting processes that once took months down to just days.

The Pentagon’s doctrine reflects a major cultural shift: code and combat are no longer separate domains. Cybersecurity itself is now considered a combat capability. The ability to deploy, secure, update, and operate AI models inside classified environments has become part of national defense infrastructure.

The contracts signed with technology providers include “lawful operational use” clauses, requiring vendors to accept any use considered legitimate by the Pentagon, including autonomous weapons systems and intelligence operations. This raises profound ethical and geopolitical questions.

At the same time, the U.S. military is pushing for deep integration across defense systems. Through the Army’s new “Right to Integrate” initiative, manufacturers of missiles, drones, radars, and sensors are being asked to open their software interfaces so AI agents can connect systems in real time. The inspiration comes largely from Ukraine, where open APIs allowed rapid battlefield integration between drones, sensors, and fire-control systems.

However, this transformation creates a dangerous paradox: the same openness that enables speed and flexibility also expands the attack surface. Every API, cloud platform, and AI integration point can potentially become an entry point for sophisticated adversaries such as China, Russia, or state-sponsored APT groups.

A compromised AI-enabled military ecosystem could allow attackers to inject false sensor data, manipulate targeting systems, degrade drone communications, study operational decision patterns, or even hijack autonomous weapons platforms. In this context, software vulnerabilities and supply-chain weaknesses are no longer merely IT problems; they become military objectives.

Washington is also increasingly concerned about the cyber risks posed by advanced AI models themselves. According to reports, the White House is considering new oversight mechanisms for frontier AI systems capable of autonomously discovering software vulnerabilities or automating cyberattacks at scale. Officials fear that uncontrolled deployment of such models could lead to mass exploitation of critical infrastructure, financial systems, or global supply chains.

The strategic implications extend beyond military technology. Major cloud providers such as Amazon, Microsoft, and Google are gradually becoming part of the American defense architecture. Civilian digital infrastructure is evolving into a structural extension of military power.

This raises difficult questions for Europe and Italy. In a world where most cloud, AI, and cybersecurity infrastructures are controlled by American companies, what does technological sovereignty really mean? Sovereignty is no longer just about producing chips or funding startups. It is about controlling the digital infrastructure that supports national defense, determining who can update AI systems operating on classified networks, and deciding who sets the operational rules of software during crises.

The United States, Israel, and China are already integrating AI into military doctrine at high speed. Europe risks remaining trapped between regulation and technological dependence unless it develops its own industrial capabilities, operational autonomy, and independent evaluation frameworks.

The message coming from Washington is unmistakable: the future of strategic power will depend on who controls AI models, data, interfaces, and software-driven operational systems. In modern warfare, software has become a battlefield domain, and the speed of code deployment increasingly matters as much as firepower itself.

A more detailed analysis is available in Italian here.


Pierluigi Paganini


Your CEO just got AI FOMO. Here are 6 tips on what to do next.

Every CIO I know has had some version of this conversation: their CEO comes back from a golf trip with their buddy, or a conference with peers, and is told AI is about to automate everything at their company, from HR to marketing and finance. No humans in the loop, just AI. The CEO then calls an all-hands Monday morning, and the CIO is suddenly on the hook to make it all happen.

The instinct for CEOs to chase unsubstantiated claims is understandable, since they’re responding to competitive pressure. But that leaves CIOs responsible for closing the gap between ambition and reality. Making AI work in an organization with decades of accumulated process, permission frameworks, and cultural inertia is very different from deploying it in a demo.

The best response isn’t to push back on the ambition, but to redirect it. Translate the CEO’s vision into an honest map of what has to happen for the organization to get there, including the infrastructure, governance, and training. That helps convert the knee-jerk compulsion to move faster into a concrete plan that leadership can get behind.

Here’s what CIOs should actually be focused on to get where their CEOs want them to go, regardless of what’s discussed on the links.

1. Start where AI can build its own credibility

The hype machine wants you to climb Everest on day one. Instead, identify the repetitive tasks where AI can prove itself on familiar ground — the workflows your team already knows well, where results are easy to verify and the bar for trust is attainable.

The goal is the Eureka moment when a skeptic on your team sees a real result and becomes a believer. Those moments compound. When someone has seen AI make their work easier in a context they understand, they’re more likely to help you move things forward. You can’t force that change, but you can engineer the conditions for it.

2. Models will commoditize. Context will not.

Every few months, a new model claims to be smarter, faster, and cheaper than the last one. Don’t be distracted by that race. The lasting advantage in enterprise AI doesn’t come from which model you’re running; it comes from the quality, governance, and semantic clarity of the data feeding it. Enterprises that invest in consistent business definitions, well-structured data, and clear lineage will outperform those that don’t, regardless of which model is in fashion. Context is your competitive moat. Focus on building that.

3. Nail down the permissions

In a world of dashboards, you know exactly what data will appear on a given page, so you can set permissions in advance for who can access it. In an AI world, the system can generate outputs that were never pre-designed. So how do you determine who has the right to see a result that was never anticipated?

Before deploying any agent that acts on someone’s behalf, such as filing a request, surfacing payroll data, or populating a record, first determine whether your existing permissions and access control frameworks can handle outputs that were never planned for. Most can’t. This is a prerequisite for what your CEO is asking for: the unglamorous infrastructure work that determines whether your AI is trustworthy in production. It needs to happen before you scale, not after.
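To make the idea concrete, here is a minimal sketch of what an output-level gate might look like: the answer is released only if every data domain it draws on falls within the requester’s entitlements. Everything here — the domain tags, the entitlements store, the function names — is hypothetical, not a reference implementation.

```python
# Sketch of a post-generation permission gate. The entitlements store and
# the domain-tagging step are illustrative assumptions.

AI_OUTPUT_DENIED = "Response withheld: it draws on data outside your entitlements."

def touched_domains(source_records: list[dict]) -> set[str]:
    """Tag the answer with the data domains of the records it was built from."""
    return {record["domain"] for record in source_records}   # e.g. {"payroll"}

def user_entitlements(user_id: str) -> set[str]:
    """Look up which domains this user may see (hypothetical store)."""
    entitlements = {"jdoe": {"hr"}, "cfo": {"hr", "payroll", "finance"}}
    return entitlements.get(user_id, set())

def gate_output(user_id: str, answer: str, source_records: list[dict]) -> str:
    """Release an AI-generated answer only if every domain it touches is entitled."""
    if touched_domains(source_records) - user_entitlements(user_id):
        return AI_OUTPUT_DENIED    # one un-entitled domain blocks the whole answer
    return answer

print(gate_output("jdoe", "Q3 payroll grew 4%", [{"domain": "payroll"}]))
# -> Response withheld: it draws on data outside your entitlements.
```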

4. Build an editing culture, not a writing one

For decades, engineers, analysts, and operations teams have been trained to write code, build reports, and define new processes. AI upends that. The skill now is editing — auditing what the system produces, catching what it got wrong, and knowing where to push back.

The truth is most people aren’t naturally good at editing because they’ve never had to be. That’s a skills gap that needs to be closed early on. Invest in helping engineers, analysts, and managers develop the judgment to evaluate AI outputs, not just generate them. Editing must become a core enterprise competency.

5. Measure behavior change, not tool adoption

Login data is a vanity metric. If your engineers are accessing AI coding tools but aren’t changing how they build, you haven’t adopted anything. The metric that makes more sense is productivity output. In agile terms, a team that completes 20 story points per sprint should hit about 28 with AI, not because the tools are magic, but because the repetitive work gets faster. If you’re not seeing that, you’re measuring the wrong thing. Pay attention to output, not usage metrics.
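A back-of-the-envelope sketch makes the contrast measurable. The numbers below are illustrative, not benchmarks; the point is that the uplift calculation uses completed story points, not logins.

```python
# Illustrative comparison: tool logins vs. actual throughput change per team.

def throughput_uplift(points_before: float, points_after: float) -> float:
    """Relative change in completed story points per sprint."""
    return (points_after - points_before) / points_before

teams = {
    # team: (AI tool logins per week, points before AI, points after AI)
    "platform": (250, 20, 28),   # heavy usage and real behavior change (+40%)
    "billing":  (300, 20, 21),   # even heavier usage, almost no change (+5%)
}

for name, (logins, before, after) in teams.items():
    print(f"{name}: {logins} logins/wk, uplift {throughput_uplift(before, after):+.0%}")
```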

6. Reframe your organization’s relationship with failure

The instinct to de-risk everything made sense when software deployments were expensive and slow to reverse. AI works differently. The outputs are probabilistic, the iteration cycles are fast, and being overly cautious can cost valuable time. CIOs need to give teams permission to experiment in ways that feel uncomfortable by traditional enterprise standards, all while building the feedback loops that make fast failure safe. That culture shift has to be modeled from the top.

FOMO isn’t going away

CEOs will keep getting pulled into cycles of urgency and FOMO, and that pressure will keep landing on CIOs. The organizations that make real progress will be the ones that redirect that energy into infrastructure that makes AI trustworthy, measurement systems that show what’s working, and cultural changes that make adoption stick. That’s the agenda that’ll move your organization forward.

AI sprawl: Why your productivity trap is about to get expensive

I have seen this movie before.

A decade ago, at Tesla, our Finance team faced a data crisis. We had information scattered across accounting, supply chain and delivery systems, all disconnected, all using different structures. The engineering team was rightfully focused on Full Self-Driving (FSD) and manufacturing. So, we did what productivity-hungry teams always do: We built our own solution. We taught ourselves Structured Query Language (SQL), normalized the data with creative IF-THEN logic and created our own reporting database.

It worked beautifully. Until it became a governance nightmare. The VP of Engineering hated our siloed system with embedded business logic. We eventually handed it over to IT, but not before our workaround forced the company to finally resource a proper data team.

The pattern is always the same: Productivity-hungry teams build workarounds faster than the organization can govern them, and by the time leadership notices, the workarounds have become the infrastructure.

That was more than a decade ago. The pattern took years to unfold.

Today, I am watching the exact same dynamic play out in insurance and industries across the board, but compressed into months, not years. AI adoption is sprawling across organizations, led by the same productivity-hungry individuals, but without central platforms or governance. Leadership has not created space for safe experimentation, so adoption spreads like a city without a highway system. The difference? Back then, we were building SQL databases. In 2026, we are building AI agents. And the cost of fragmentation is exponentially higher.

What is AI sprawl?

AI sprawl is what happens when the cost of building AI drops faster than an organization can govern it. Teams spin up models, agents and automations independently. Each one works in isolation. None of them connect. The result is fragmented data, drifting decisions and intelligent systems that quietly get abandoned.

It happens because execution has become cheap. Large Language Model (LLM) APIs, no-code tools and cloud infrastructure have made spinning up AI trivially easy. A claims team builds an automation to speed adjudication. Underwriting builds a model to assess risk. Customer service deploys a chatbot. Each initiative delivers local value. No single project looks like a problem.

But collectively, they create an ungovernable landscape.
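To see why sprawl is so cheap to create, consider how little code a working departmental “AI tool” now takes. The sketch below calls the public OpenAI chat-completions REST endpoint; the model name and prompt are illustrative. Note what is absent — logging, shared data definitions, governance — which is exactly the point.

```python
# A complete departmental AI tool in ~20 lines: easy to build, invisible to IT.

import os
import requests

def adjudication_summary(claim_text: str) -> str:
    """Summarize a claim for an adjuster via the OpenAI chat-completions API."""
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-4o-mini",   # illustrative model choice
            "messages": [
                {"role": "system", "content": "Summarize this claim for an adjuster."},
                {"role": "user", "content": claim_text},
            ],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```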

Over the past 18 months, the GenAI acceleration intensified what IDC calls the GenAI scramble: scattered, fragmented and sometimes redundant applications launched by business-led initiatives without central oversight. Many organizations have fallen into what researchers describe as a productivity trap: Focusing on short-sighted value generation instead of scalability, which limits their ability to create reusable capabilities across departments.

AI sprawl is everywhere

A major property and casualty carrier recently invited us to speak with their innovation leadership about implementing process automation. We spoke with more than 10 key stakeholders across multiple lines of business and found more than a dozen different POCs and local solutions across claims intake, underwriting and fraud detection.

Six of them were solving overlapping problems. None shared data infrastructure. Two had been abandoned months earlier but were still running and still being billed.

This is not an outlier. It is the norm.

AI sprawl persists because it is insidious, hiding in plain sight unless you look for it. Business units move fast, build independently and solve immediate problems. IT discovers shadow AI only when something breaks, when an audit is triggered or when a vendor renewal surfaces a tool nobody knew existed. And the symptom multiplies with every innovative team in the organization.

The 4 hidden costs of sprawl

AI sprawl creates costs that compound over time, many of which are not visible in any single budget line. It results in a dangerous cascade of failures:

  1. Governance becomes impossible. Companies cannot govern what they cannot see. When AI systems scatter across departments, audit trails fragment. Bias monitoring becomes inconsistent. Explainability standards vary by team.
  2. Scaling stalls. Disconnected systems cannot integrate. Every new initiative starts from scratch instead of building on shared infrastructure.
  3. Maintenance and redundant spending multiply. Teams that built AI to accelerate their work end up spending most of their time maintaining it. One carrier reported that 60% of their AI engineering capacity was devoted to maintaining existing tools rather than building new capabilities. Meanwhile, teams unknowingly pay for overlapping capabilities because nobody has a complete view of AI spending.
  4. Talent drains away. The best AI engineers want to solve hard problems. When they are cornered into spending their time maintaining fragmented infrastructure, they walk out the door.

Why traditional governance fails

Seventy percent of large insurers are investing in AI governance frameworks. Yet only 5% have mature frameworks in place. This gap is not about commitment or resources. It is about a category mistake.

For the last two decades, enterprise software governance worked because the software itself worked a certain way. Systems were point solutions. A claims platform did claims. A policy admin system did policy admin. Each tool had a clear owner, a defined scope and a predictable boundary. Governance could wrap around the edges, through access controls, audit logs, change management, vendor reviews, because the edges were visible. We governed the perimeter because the perimeter was the product.

AI is not a point solution. It is foundational technology, closer to electricity or a database than to a piece of software. It does not sit inside a defined boundary; it flows across every process, every decision and every department that touches data. And because it flows, it cannot be governed at the perimeter.

This is why carriers applying the old playbook keep running in place. Policy documents, oversight committees and compliance checklists were designed to govern systems that stood still. AI does not stand still. It is built, modified, retrained and extended by the same teams it is meant to serve, often in the same week. By the time a governance committee reviews it, three more versions exist somewhere else in the organization.

The failure is not that carriers are governing AI badly. It is that they are governing it as if it were software, when it’s actually infrastructure. Infrastructure requires a different discipline: Shared foundations, common standards and the assumption that everyone will build on top of it. You do not govern electricity by reviewing each appliance. You govern it by standardizing the grid.

Until carriers make that shift, their frameworks will keep maturing on paper while sprawl compounds underneath.

3 questions every insurance CIO should be able to answer

If the failure of traditional governance is a category mistake, the first job of leadership is to check which category they are actually operating in. These three questions are not meant to produce tidy answers. They are meant to reveal whether you are still governing AI as software when you should be governing it as infrastructure.

1. Are you governing AI at the perimeter, or at the foundation?

Look at your current AI governance artifacts, such as the policies, the committees, the review processes. Are they designed to wrap around individual tools after they are built, or to set shared standards that every tool must be built on top of? Perimeter governance asks, “Is this specific model compliant?” Foundational governance asks, “Does every model in this organization inherit the same definitions, the same lineage and the same guardrails by default?” If your governance only kicks in at review time, you’re still treating AI like software. You’re already behind.

2. If you standardized one thing across your entire organization tomorrow, what would create the most leverage and why haven’t you?

Every carrier has a list of things they know should be standardized but have not been. Shared definitions for core entities. Common ways of handling unstructured inputs. A single source of truth for how decisions get logged. The question is not which item belongs at the top of the list; most CIOs already know. The question is what has been blocking the standardization: Is it political, budgetary, or organizational? Because that blocker, whatever it is, is also what is letting sprawl compound. Governance frameworks cannot fix foundational decisions that keep being deferred.

3. When a new AI initiative launches next quarter, what will it automatically inherit from what already exists?

This is the real test. In a point-solution world, every new system is built fresh and governance is applied afterward. In a foundational world, every new system inherits shared standards, shared definitions, shared oversight before a single line of code is written. If the honest answer is “it will inherit nothing, and we will govern it after the fact,” then you do not have an AI governance problem. You have an AI foundation problem, and no amount of policy will close the gap.
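One way to picture the difference is as inheritance in code: in a foundational model, a new initiative starts from shared defaults rather than being reviewed into compliance later. The sketch below is illustrative; the field names and storage locations are assumptions, not a prescribed schema.

```python
# Sketch of governance-by-inheritance: every new AI initiative starts governed.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class GovernanceBaseline:
    """Shared standards inherited before a single line of code is written."""
    entity_glossary: str = "s3://governance/shared-entity-glossary-v3"  # assumed path
    decision_log_sink: str = "audit.decisions"   # single source of truth for decisions
    pii_redaction: bool = True
    human_escalation_required: bool = True

@dataclass
class AIInitiative:
    name: str
    owner: str
    baseline: GovernanceBaseline = field(default_factory=GovernanceBaseline)

# A new project cannot silently opt out: it inherits the baseline by construction.
fraud_triage = AIInitiative(name="fraud-triage", owner="claims")
assert fraud_triage.baseline.pii_redaction
```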

The uncomfortable truth is that most carriers will answer these questions honestly and discover they are still operating from the old playbook. It is a signal that the work to be done is not more governance, but different governance, the kind that assumes AI is the ground floor, not the top floor.


OpenAI and Anthropic eye the systems integrators’ turf as enterprise AI competition moves into implementation

Through joint ventures and acquisition talks, OpenAI and Anthropic are expanding into professional services, moving closer to the implementation role traditionally held by systems integrators.

According to a Reuters report on the 5th, joint ventures linked to the two AI companies have been discussing acquisitions of service firms that help enterprises adopt AI, and OpenAI’s side has reportedly made significant progress in three of those negotiations.

Both companies are also moving to expand their ranks of engineers and consultants as enterprise customers shift generative AI from experimentation into production.

Anthropic, meanwhile, announced plans to establish a new enterprise AI services company backed by investments from Blackstone, Hellman & Friedman, and Goldman Sachs. The company aims to help mid-sized enterprises apply Claude to core business operations.

Anthropic said its applied AI engineers will work with the new company’s engineering teams to identify use cases, build custom systems, and provide long-term customer support.

Behind the services push: the race for enterprise AI leadership begins in earnest

For CIOs, the heart of this shift is whether AI vendors are starting to displace the roles traditionally played by consulting firms, systems integrators (SIs), and managed service providers. The move signals the model makers’ determination to take greater control of enterprise AI implementation, while also underscoring that SIs remain essential to large-scale deployments.

The push reflects a problem many CIOs already face: AI pilots are quick to launch, but turning them into secure, reliable production systems takes months of integration and process work.

“Enterprise IT deployments have traditionally been consulting- or advisory-led,” said Faisal Kawoosa, founder and chief analyst at consulting firm Techarc. “To accelerate the kind of adoption that generates real revenue, AI companies have to align with established enterprise frameworks and go-to-market models.”

“AI companies currently sit at the top of the value chain, and they intend to stay in the driver’s seat rather than be relegated to mere IT suppliers,” Kawoosa added.

“This shift could lead to a restructuring of enterprise AI as a whole,” said Deepika Giri, head of AI, big data, and analytics research at IDC Asia/Pacific. “AI model companies are moving beyond being platform providers to actively designing the entire AI value chain. By extending into implementation, consulting, and managed services, they aim to engage more closely with enterprises’ actual outcomes rather than simply supplying technology.”

Kawoosa pointed to technological uncertainty and the risk of a diminished role as reasons some IT services firms have been cautious about AI, observing that “AI companies are seizing the initiative amid shifting go-to-market strategies.”

Lower deployment risk, but deeper lock-in concerns

Buying services directly from an AI model company can make initial deployment considerably easier.

“Enterprises get tighter integration and access to specialized talent, which can reduce deployment risk in the short term,” said Tulika Sheel, senior vice president at Kadence International.

But that convenience can become a long-term burden, observers caution.

“Dependence can deepen across the entire stack, from models to data pipelines to workflows,” Sheel said. “Over time, lock-in hardens, making it difficult to switch vendors without major disruption.”

Neil Shah, vice president and partner at Counterpoint Research, said AI model companies “are positioning themselves as one-stop providers for enterprise customers, tightening the coupling between usage-based business models, applications, and services.”

“Controlling the application and service layers not only binds enterprises to their ecosystems, it also gives them firsthand insight into customers’ needs, problems, and ways of working, which they can use to optimize their models,” he added.

IDC’s Giri argued that lock-in is not inevitable, but stressed that avoiding it requires deliberate architectural choices early on.

“Modular architectures can gradually abstract away the model layer, but avoiding lock-in takes intentional design decisions,” Giri said. “Otherwise, enterprises risk becoming dependent not just on a specific model but on the entire stack, including data pipelines, workflows, and governance frameworks.”

The trend also shows that enterprise AI still requires a great deal of implementation work.

“Generative AI platforms are powerful, but supporting real business processes demands deep integration with internal data, workflows, and governance systems,” Sheel noted. “That means a gap still exists between model capability and real-world deployment.”

For CIOs, the implication is that the question is no longer simply which AI model performs better, but also who will lead implementation and operations once that model is embedded in enterprise systems.

The AI agent management battle in the multicloud era: Microsoft and Google strategies diverge

Microsoft and Google are strengthening their AI agent controls so that enterprise IT organizations can keep pace with tools that access corporate data and carry out tasks across a range of business applications.

On the 1st, Microsoft made Agent 365 generally available to enterprise customers. The service helps organizations discover, manage, and secure AI agents, and notably covers agents operating not only in Microsoft environments but also across third-party SaaS, cloud, and on-premises environments.

Google, on the 4th, announced an AI control center for Workspace. The feature focuses on providing a central, unified view of AI usage, security settings, data protection policies, and privacy safeguards.

The timing of these announcements reflects a change in how enterprises use AI. Many companies are no longer lingering in the chatbot-testing stage; they are moving in earnest to deploy agents that access corporate systems and perform work on users’ behalf.

That change also affects how CIOs and CISOs view the AI agents inside their companies.

“By placing agent controls alongside identity, access, data, and workload management, vendors are positioning AI governance as an operational domain jointly owned by IT and security,” said Biswajeet Mahapatra, principal analyst at Forrester. “For CIOs, that means AI agents must be managed like any other digital workforce, with lifecycle management, cost visibility, and integration into service management frameworks.”

The CISO’s role is expanding as well. Beyond traditional model risk and data-leakage response, security leaders now need mechanisms to continuously control the behavior of increasingly autonomous agents and to minimize the impact when risks materialize.

“AI governance is emerging as a core component of every AI-driven enterprise application,” said Lian Jye Su, chief analyst at Omdia. “As deployments expand beyond pilots to enterprise-wide adoption, governance must be built in from the earliest stages of AI development.”

Where Microsoft and Google differ

Microsoft’s Agent 365 and Google’s AI control center address similar governance problems, but their starting points differ.

“Considering that enterprises are adopting AI ever more aggressively across multicloud and hybrid IT environments, the two approaches are complementary,” said Omdia’s Su. “Each is optimized for the AI workloads of its own environment, so companies that have invested heavily in one vendor will find the native AI governance experience far smoother.”

Forrester’s Mahapatra read the difference as a matter of platform scope rather than governance maturity: Microsoft views AI agents as enterprise actors to be managed across the organization, while Google tends to focus more on how AI operates within collaboration data and user content.

“Because the two approaches cover different control domains, it is hard to call them outright competitors,” Mahapatra said. “But unless an enterprise standardizes on both ecosystems at once, it is equally hard to call them fully complementary.” He added, “Over time, each model will become more tightly coupled to its vendor’s productivity and data platforms, increasing the risk that AI governance decisions become subordinate to a particular vendor choice rather than to enterprise architecture strategy.”

Pareekh Jain, CEO of Pareekh Consulting, offered a more neutral view. “The two approaches are complementary and competitive at the same time,” Jain said. “Especially for enterprises that use both Microsoft and Google, AI governance is likely to become more closely tied to each vendor’s underlying platform.”

The remaining risks

The new controls help enterprises see their AI agents more clearly, but analysts note that they do not resolve the larger risks: shadow AI, third-party integrations, and accountability for autonomous actions.

Jain pointed out that shadow AI agents can still emerge through developer tools, browser extensions, local assistants, SaaS copilots, and unauthorized tool integrations. Third-party integrations, he added, may also spread faster than security vetting can keep up.

“Audit logs show what happened, but they cannot always explain why an autonomous agent chose the action it did,” Jain said.

As a result, when an agent takes an action that creates business or security risk, enterprises face hard questions about control and accountability. Better logs, in other words, do not automatically resolve questions of responsibility or control.

Forrester’s Mahapatra noted that the biggest gaps are likely to arise outside the native platforms. Shadow agents created through low-code tools, external APIs, and SaaS applications can bypass central controls and operate with excessive or inherited permissions, he explained.

“Third-party integrations extend an agent’s range of activity, but visibility into its subsequent actions and data propagation often is not secured to the same degree,” Mahapatra said. “When agents chain across multiple systems, auditability is uneven, making it hard to distinguish intent from outcome, and accountability remains unclear when an autonomous agent causes real business or security impact.”

The experts’ shared view, in the end, is that the native controls Microsoft and Google provide help, but they cannot fully cover the entire AI agent landscape. Enterprises that combine multicloud, a variety of SaaS, development platforms, and browser-based AI assistants will need to build a governance framework that extends beyond any single vendor’s console.

From AI investment to innovation: What it takes to deliver real business impact

As organizations continue to invest heavily in AI, many CIOs are still working to understand how those investments translate into measurable business impact. At the center of that challenge is a shift in how AI is approached, from isolated experimentation to enterprise-wide execution. In this conversation, Jeff Baker, Technology Managed Services Lead at PwC, shares how organizations can move beyond early-stage use cases and begin realizing meaningful outcomes.


Jeff Baker, Technology Managed Services Lead at PwC

CIO.com: Many CIOs are investing in AI but haven’t necessarily seen a return on that investment yet. What does it take to move from investment to actual innovation?

Jeff Baker: A couple of things. I don’t think a lot of our clients are thinking big enough about the impact of AI and some of the possibilities that are out there. One of the things we’re encouraging them to do is move it out of that experimental phase or the back office or cottage industry and really start teaming up with the business directly to find more impactful ways to use the technology that have a business outcome, not just a cool technology showcase.

There are a lot of skunkworks projects out there that look fun but aren’t necessarily hitting the bottom line from an impact standpoint. The more we can team the AI engineers with people inside the business who are asking for the technology, the more you’re going to see meaningful outcomes.

CIO.com: You’ve said that AI requires structural change, not just experimentation. What’s the most important operational shift CIOs should make?

Jeff Baker: I think about AI in two basic categories. There’s what I call citizen-led AI. We’re getting a lot of really cool tools into the hands of people at firms, and they’re doing interesting things with it. They’re organizing their inboxes and creating chat programs that respond to RFPs, and other “day in the life” tasks.

On the other side, there are more durable, agentic-type models that have a lot more business impact but require more investment. That’s where strong teaming between IT and the business is important to define what the outcome should be.

There’s also a lot of sophistication that comes with that. Is it durable? Is it secure? Are you thinking about bias? How are you curating it? Who owns the ongoing management and observability of those agents once they’re deployed?

Security and data management become critical. The agents are only as good as the data they’re based on. In many cases, companies need to clean up their data before these agents can be effective. And finally, this should be collaborative. These agents are not isolated. They’re going to work across the organization with other humans and other agents to help drive outcomes.

CIO.com: You’ve said AI-driven managed services differ from traditional models. How so, and where do CIOs get it wrong?

Jeff Baker: The difference for us, what we call Managed Services 2.0, is that it’s AI-first. It’s focused on business outcomes.

It’s not just about deploying a team to work tickets and hit service levels. It’s about improving business outcomes over time. We’re seeing efficiency gains of about 20% in the first year and up to 50% over five years with clients who allow us to use AI appropriately.

Where it can get tricky is in how these services are purchased. In an RFP process, procurement teams often try to normalize key elements across vendors. But that can flatten the innovation that providers are trying to bring to the table.

CIO.com: Looking ahead 3 to 5 years, what will separate organizations that succeed with AI from those that remain stuck in pilot mode?

Jeff Baker: It comes down to focusing on the business outcomes. What are you trying to achieve with technology, people, and your organization?

And then, in some ways, you have to get out of the way of the agents. They think differently than humans do. I see too many companies trying to treat agentic systems like a traditional business process automation exercise.

Instead, you should focus those agents on outcomes and allow them to operate in the way they’re designed to. That’s where you’re going to see a bigger impact.

To learn more about PwC managed services, click here.

The inference imperative: Why running AI is harder than building it

Enterprises have made significant progress in building artificial intelligence capabilities. Access to models, tools, and platforms has expanded rapidly, lowering the barrier to entry for experimentation. Yet many organizations are discovering that building AI is only the first step. Running it at scale is where the real challenge begins.

The difficulty is not in creating models, but in operationalizing them.

As AI moves from pilot to production, it must integrate into complex enterprise environments. These environments include fragmented data systems, legacy infrastructure, and distributed workflows that were not designed to support AI-driven execution. What works in a controlled experiment often breaks down under real-world conditions.

Data is one of the most significant constraints. AI systems rely on consistent, high-quality, and context-rich data. In most enterprises, data is spread across multiple platforms and lacks a unified structure. Without a shared understanding of what data represents, models struggle to produce reliable outputs. More importantly, business teams cannot act on those outputs with confidence.

This challenge becomes more pronounced as organizations attempt to scale AI across use cases. Each new deployment introduces additional complexity, from data integration and governance to security and compliance. Without a strong foundation, these factors slow progress and increase operational risk.

Running AI also requires a different operating model. Traditional approaches to cloud and application management are often reactive, relying on manual processes and ticket-driven workflows. These models are not designed to support the continuous monitoring, iteration, and optimization that AI systems require.

Organizations that treat AI as an isolated capability often encounter friction at this stage. Models may perform well in testing, but struggle to deliver consistent value once deployed. This disconnect between development and operations limits the return on AI investments.

In contrast, organizations that succeed with AI focus on how it is run, not just how it is built. They align data, infrastructure, and operations around AI-driven execution. This includes creating unified data environments, embedding governance into workflows, and enabling real-time access to information.

Automation plays a critical role in this transition. Managing AI systems at scale involves monitoring performance, maintaining data quality, and responding to changing conditions. Embedding automation into these processes helps reduce manual effort and improve consistency. Over time, this enables organizations to operate AI systems more efficiently and with greater reliability.
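As a small illustration of what that automation can look like in practice, consider a recurring health check that watches a deployed model and triggers remediation before anyone files a ticket. The thresholds and action names below are assumptions for the sketch, not recommended values.

```python
# Sketch of an automated model health check (thresholds are illustrative).

from dataclasses import dataclass

@dataclass
class ModelHealth:
    p95_latency_ms: float
    drift_score: float    # e.g., population-stability index on input features
    null_rate: float      # share of records failing data-quality checks

def evaluate(health: ModelHealth) -> list[str]:
    """Return the automated actions this snapshot should trigger."""
    actions = []
    if health.p95_latency_ms > 800:
        actions.append("scale_out_inference")
    if health.drift_score > 0.2:
        actions.append("open_retraining_job")
    if health.null_rate > 0.05:
        actions.append("quarantine_upstream_feed")
    return actions

print(evaluate(ModelHealth(p95_latency_ms=950, drift_score=0.25, null_rate=0.01)))
# -> ['scale_out_inference', 'open_retraining_job']
```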

The shift toward AI-first operating models is becoming more pronounced. In these environments, intelligence and automation are embedded into how systems are designed and operated. This allows organizations to move from reactive processes to more proactive and predictive operations. As a result, they can reduce operational overhead, improve delivery speed, and better support AI-driven innovation. 

This evolution is also being driven by increasing business expectations. Leadership teams expect AI to deliver measurable outcomes tied to efficiency, speed, and resilience. However, these outcomes depend on the ability to run AI effectively across the enterprise. Without the right operating model, even advanced AI capabilities will struggle to deliver consistent value.

At the same time, AI-native organizations are setting a new benchmark. They can deploy and scale AI more quickly because their environments are built with automation and integration at the core. This allows them to iterate faster and respond more effectively to changing conditions.

For established enterprises, the path forward requires a shift in focus. Building AI capabilities remains important, but it must be matched with investments in data foundations, operating models, and automation. This is what enables AI to move beyond experimentation and deliver real business outcomes.

The takeaway for CIOs and technology leaders is clear: the success of AI initiatives depends less on the models themselves and more on the systems that support them. Organizations that prioritize how AI is run will be better positioned to scale, adapt, and realize the full value of their investments.

Continue building your AI strategy with a practical, execution-focused framework. Check out the AI Action Playbook to learn about the five stages of enterprise AI maturity.

Why the future of software is no longer written — it is architected, governed and continuously learned

We are entering a decade where software is no longer just an enabler of business — it is the primary mechanism through which intelligence is created, scaled and monetized across the enterprise.

For CIOs, this is not another technology cycle. This is a leadership inflection point.

Across boardrooms, investor discussions and strategic planning sessions, the conversation is shifting rapidly:

  • From “How fast can we build software?”
  • To “How intelligently can we design, govern and scale decision systems?”

This is a fundamental reframing of the CIO mandate.

The organizations that recognize this shift early will not just move faster — they will compound intelligence faster, creating asymmetric advantage in markets where speed alone is no longer sufficient.

The following perspective must therefore be read not as a technology trend, but as a strategic operating model shift for CIOs entering 2026 and beyond.

The next inflection point: Software development is no longer about code

Over the past two decades, software development has evolved through predictable phases — manual coding, agile acceleration, cloud-native scaling and DevOps automation. But as we enter 2026, that trajectory is no longer linear.

We are now witnessing a structural break.

Generative AI and agentic systems are not simply accelerating development — they are redefining the very nature of software creation, ownership and accountability.

This shift mirrors the broader transformation outlined in the CIO 3.0 paradigm, CXO 3.0: How intelligent leadership will redefine enterprise value, where technology leadership has moved from operating systems to architecting enterprise intelligence itself.

In software development, this translates into a fundamental question for boards, CIOs, CTOs, CISOs and chief AI officers (CAIOs): Are we still building software or are we now orchestrating intelligence systems that build themselves?

What makes this transition particularly consequential is that it is already happening quietly but decisively.

Across high-performing organizations:

  • AI-generated code is already contributing meaningfully to production systems
  • Development cycles are compressing from weeks to days — and in some cases, hours
  • Decision-making is increasingly embedded directly into software systems rather than layered on top

Yet, in many enterprises, governance, accountability and operating models have not kept pace.

This gap between capability acceleration and governance maturity is where both the greatest opportunity and the greatest risk now reside.

2 forces reshaping software development in 2026

1. AI across the full software development lifecycle (SDLC)

Generative AI has moved beyond coding assistance into end-to-end lifecycle orchestration, consistent with broader enterprise AI adoption trends where organizations are embedding AI across multiple functions (McKinsey, The state of AI in 2025: Agents, innovation and transformation):

  • Planning & Design → AI-driven requirements synthesis, architecture generation
  • Development → Code generation, refactoring, pattern enforcement
  • Testing → Autonomous test case creation and validation
  • Deployment → Intelligent CI/CD pipelines with adaptive optimization
  • Maintenance → Self-healing systems, anomaly detection, auto-remediation

The developer is no longer just a coder. The developer is becoming a curator of intent, constraints and outcomes.

The compression of the SDLC

What historically required:

  • Weeks of design
  • Months of development
  • Iterative testing cycles

Can now be orchestrated through multi-agent AI systems operating in parallel.

This introduces a new dynamic: Software development is no longer a sequential process — it is becoming a continuously adaptive system.

For CIOs, this means:

  • Traditional governance checkpoints may become bottlenecks
  • Legacy approval workflows may inhibit innovation velocity
  • Organizational design must evolve alongside technical capability

2. Intensifying competition in AI coding ecosystems

The competitive landscape is accelerating rapidly, particularly across ecosystems led by:

  • Microsoft (GitHub Copilot, Azure AI)
  • Google (Gemini, Vertex AI, developer tooling)
  • Apple (on-device AI, developer ecosystem integration)

Events like Google I/O and Microsoft Build are no longer just developer conferences—they are strategic battlegrounds for control over the future of software creation.

The stakes are clear:

  • Whoever controls the AI development stack controls the next generation of digital economies
  • Whoever defines the developer experience defines the innovation velocity of entire ecosystems

Platform gravity is becoming strategic gravity

The implication for CIOs is profound.

Choosing a development ecosystem is no longer a tooling decision — it is a strategic alignment decision that determines:

  • Data gravity
  • Talent alignment
  • Innovation velocity
  • Long-term vendor dependency

In effect: Your AI development platform choice is becoming your enterprise’s innovation ceiling.

From SDLC to IDLC: The rise of the Intelligent Development Lifecycle

Traditional SDLC frameworks are becoming obsolete.

In their place, a new paradigm is emerging: The Intelligent Development Lifecycle (IDLC)

This is not simply an evolution — it is a redefinition of how software is conceived, built and governed.

Key characteristics of IDLC:

  • Intent-driven development: Developers define what and why, not just how
  • Agentic execution: AI agents perform multi-step development tasks autonomously
  • Continuous learning loops: Systems improve based on real-time feedback and usage patterns
  • Embedded governance: Compliance, security and auditability are built into execution (NIST AI Risk Management Framework)
  • Decision-centric architecture: The primary output is not code — it is decision capability

IDLC as a leadership operating model

IDLC is not just a development methodology.

It is an enterprise operating model for intelligence creation.

It changes:

  • How teams are structured
  • How accountability is defined
  • How value is measured

For CIOs, adopting IDLC means shifting from:

  • Managing delivery pipelines
  • To governing decision supply chains

The emerging reality: Developers as intelligence orchestrators

As AI agents take over repetitive and even complex coding tasks, the developer role is undergoing a profound transformation.

From:

  • Writing code line by line
  • Debugging manually
  • Managing environments

To:

  • Designing system intent
  • Governing AI agents
  • Ensuring ethical and secure outcomes
  • Orchestrating multi-agent collaboration

This is not a reduction in developer relevance.

It is an elevation of developer responsibility.

Talent transformation is now a CIO priority

This shift introduces a critical challenge:

Most current developer skill models are not aligned to this future state.

CIOs must now proactively invest in:

  • AI-native engineering skills
  • Prompt and intent engineering
  • Model governance literacy
  • Cross-disciplinary collaboration

Because the future developer is not just technical — they are decision designers.

The CXO convergence: Why this is no longer just a CTO conversation

The transformation of software development is not confined to engineering teams.

It now sits at the intersection of four critical leadership domains, reflecting the broader evolution of CIOs into strategic business leaders shaping enterprise outcomes (State of the CIO):

CIO: The intelligence architect

  • Aligns AI-driven development with enterprise strategy
  • Ensures scalability and integration across platforms
  • Drives value realization from software investments

CTO: The innovation orchestrator

  • Defines architecture patterns for AI-native development
  • Leads platform engineering and developer experience
  • Drives competitive differentiation

CISO: The trust enforcer

  • Ensures secure AI-generated code
  • Governs data lineage and model integrity
  • Mitigates risks from autonomous systems

CAIO: The intelligence governor

This convergence reflects a broader reality: Software development is no longer a technical function — it is an enterprise risk, value and governance function.

Introducing a new framework: SAFE-AI DevOps

To navigate this transformation, enterprises require a disciplined, Board-ready approach.

SAFE-AI DevOps Framework (Secure, Adaptive, Federated, Explainable AI Development Operations)

This is a next-generation operating model for AI-driven software development.

1. Secure by Design (S)

  • AI-generated code must meet zero-trust security principles
  • Continuous vulnerability scanning integrated into AI pipelines
  • Secure prompt engineering and model access controls

CISO-led mandate: Trust is the new runtime environment

2. Adaptive Intelligence (A)

  • Systems learn and evolve continuously
  • AI models adapt to changing requirements and environments
  • Feedback loops drive improvement across lifecycle

CIO-led mandate: Learning velocity is the new productivity metric

3. Federated Development (F)

  • Multi-agent collaboration across distributed environments
  • Integration across cloud, edge and on-prem ecosystems

CTO-led mandate: Scale innovation without losing control

4. Explainable Execution (E)

  • Every AI-generated decision must be traceable
  • Audit trails for code generation and deployment

CAIO-led mandate: Explainability is the new compliance baseline

5. AI-Native DevOps (AI)

  • Autonomous CI/CD pipelines
  • Predictive deployment optimization
  • Self-healing systems and automated incident response

Cross-CXO mandate: Automation is no longer optional — it is foundational

The competitive battlefield: Ecosystems, not tools

The next phase of competition is not about individual tools.

It is about ecosystem dominance, as hyper-scalers invest heavily in AI infrastructure, platforms and developer ecosystems (McKinsey Global Tech Agenda 2026).

Key battlegrounds:

  • Developer platforms
  • Model ecosystems
  • Data gravity
  • AI infrastructure

As highlighted in an earlier CIO.com perspective, infrastructure itself is becoming a strategic intelligence decision, not just an operational one.

The risk dimension: AI-generated code is not inherently safe

While productivity gains are undeniable, risks are escalating:

  • Hallucinated code vulnerabilities
  • Licensing and IP violations
  • Model bias and ethical concerns
  • Regulatory exposure (EU AI Act, NIST AI RMF)

This creates a new category of risk: AI Development Risk

This requires structured governance aligned with emerging regulatory and risk frameworks (NIST AI Risk Management Framework).

Blockchain and quantum: The next convergence layer

As we move beyond 2026, two additional forces will reshape AI-driven development:

Blockchain

  • Immutable audit trails for AI-generated code
  • Smart contracts governing software execution

Quantum Computing

  • Breakthroughs in optimization and cryptography

Together with AI, they form a converging intelligence stack that will redefine software engineering, consistent with broader enterprise transformation trends toward intelligent systems.

Boardroom implications: What investors and directors must understand

The shift to AI-driven development is not just technical — it is financial.

Research shows AI delivers the greatest impact when integrated into enterprise strategy rather than siloed initiatives (BankInfoSecurity: C-Suite Leaders Must Rewire Businesses for True AI Value).

Key board-level questions:

  • How much of our software is AI-generated?
  • What governance exists for AI-generated decisions?
  • How do we ensure security and compliance at scale?
  • What is our dependency on external AI ecosystems?
  • How does this impact enterprise valuation?

Because the reality is: Software is no longer a cost center — it is a capital engine.

The new metrics: Measuring success in AI-driven development

Traditional metrics are insufficient.

Old metrics:

  • Lines of code
  • Development velocity
  • Bug counts

New metrics:

  • Decision throughput
  • AI-assisted productivity ratio
  • Model governance maturity
  • Security incident reduction
  • Time-to-intelligence (TTI)

The leadership mandate for 2026 and beyond

The transformation of software development demands a new leadership mindset.

Three defining mandates for 2026:

  1. Architect intelligence, not just applications
  2. Govern AI as an enterprise asset
  3. Align ecosystems with strategy

The future of software is a leadership decision

As we look ahead to 2026 and beyond, one reality becomes undeniable: The future of software development will not be decided by developers alone.

It will be shaped by:

  • CIOs who architect intelligence
  • CTOs who orchestrate innovation
  • CISOs who enforce trust
  • CAIOs who govern AI responsibly
  • Boards that understand the strategic implications

Because in this new era, code is no longer the product. Intelligence is. And the organizations that learn fastest will not just build better software — they will redefine entire industries.


When AI writes code, it joins the software supply chain

AI tools designed to assist developers are no longer staying in the background. They are starting to shape what actually gets built and deployed.

They open pull requests.

They modify dependencies.

They generate infrastructure templates.

They interact directly with repositories and CI/CD pipelines.

At some point, this stops being assistance.

It becomes participation.

And participation changes the problem.

When assistance becomes participation

The shift from generative to agentic behavior is the inflection point.

Earlier tools operated inside a tight loop. A developer prompted. The system suggested. The developer reviewed. Nothing moved without human intent.

That boundary is eroding.

Newer systems propose changes, update libraries, remediate vulnerabilities and interact with development pipelines with limited human intervention. They don’t just accelerate developers. They begin to shape the artifacts that move through the software supply chain — code, dependencies, configurations and infrastructure definitions.

That makes them something different.

Not tools.

Participants.

And once something participates in the supply chain, it inherits the same question every other participant does:

How is it governed?

A simple scenario

Consider a common pattern already emerging in many environments.

An AI system identifies a vulnerable dependency.

It opens a pull request updating the library.

A workflow triggers automated tests.

The change is promoted into a staging environment.

Four steps.

No human review.

No explicit governance checkpoint.

Each step is individually valid. Nothing looks wrong in isolation.

But taken together, they create something fundamentally different: A system that can change enterprise software without human intent being re-established at any point. Research from Black Duck found that while 95% of organizations now use AI in their development process, only 24% properly evaluate AI-generated code for security and quality risks.

This is autonomous change propagation across the software supply chain.
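What would closing the gap look like? One option is a promotion gate that re-establishes human intent whenever a change originates from an AI identity. The sketch below is a minimal illustration; the pull-request fields and the roster of machine authors are assumptions.

```python
# Sketch of a checkpoint that restores human intent for AI-authored changes.

AI_IDENTITIES = {"dep-bot", "code-agent"}   # known machine authors (illustrative)

def may_promote(pr: dict) -> bool:
    """Allow promotion to staging only when AI-authored changes carry human sign-off."""
    ai_authored = pr["author"] in AI_IDENTITIES
    human_approved = any(r not in AI_IDENTITIES for r in pr["approvals"])
    if ai_authored and not human_approved:
        return False       # passing tests is not a substitute for human intent
    return pr["tests_passed"]

pr = {"author": "dep-bot", "approvals": [], "tests_passed": True}
assert may_promote(pr) is False   # the four-step chain above now stops here
```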

The “human-in-the-loop” fallacy

Many organizations rely on a “human-in-the-loop” (HITL) requirement as a safety mechanism for AI-generated code.

At low volumes, this works.

At scale, it breaks.

When an AI system generates dozens of pull requests in a short window, review becomes a throughput problem, not a control. The cognitive load of validating machine-generated logic exceeds what a human can realistically govern.

What remains is not oversight, but a checkpoint.

And checkpoints without effective review are not controls.

The governance gap

Most governance models assume a stable truth: Humans are the primary actors.

Controls tied identity to individuals, approvals to intent and audit trails to accountability.

Even automation systems are treated as extensions of human intent — predictable, bounded and deterministic.

AI systems break that model.

They can generate new logic, act on it and propagate changes across systems. Yet in most environments, they are still governed as if they were static tools.

That mismatch is the gap.

Machine identity is no longer what it was

One way to see this clearly is through identity.

Every interaction an AI system has — repository access, pipeline execution, API calls — requires credentials. In practice, these systems operate as machine identities.

But they are not traditional machine identities.

A service account executes predefined logic. Its behavior is known in advance. Its risk is bounded by what it was configured to do.

An AI-driven system is different. It generates the logic it then executes.

It can propose new code paths, interact with new systems and trigger actions that were not explicitly predefined at the time access was granted.

That is a category change.

Not just a new identity type, but a new attack surface: Identities that can generate the behavior they are authorized to execute.

The World Economic Forum has identified this class of non-human identity as one of the fastest-growing and least-governed security risks in enterprise AI adoption.
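A sketch helps show what “bounded” can mean in practice: the agent may generate whatever steps it likes, but each step is checked against an explicit authority scope before it executes. The identities, action names and dispatcher below are illustrative assumptions.

```python
# Sketch of runtime authority scoping for generative machine identities.

AUTHORITY = {
    # identity -> the actions it may take, regardless of what it generates
    "code-agent": {"repo:read", "repo:open_pr"},
    "dep-bot":    {"repo:read", "repo:open_pr", "pipeline:run_tests"},
}

class AuthorityError(PermissionError):
    pass

def dispatch(action: str, payload: dict) -> None:
    """Hypothetical executor for permitted actions."""
    print(f"running {action} with {payload}")

def execute(identity: str, action: str, payload: dict) -> None:
    """Refuse any generated behavior that falls outside the granted scope."""
    if action not in AUTHORITY.get(identity, set()):
        raise AuthorityError(f"{identity} is not authorized for {action}")
    dispatch(action, payload)

execute("code-agent", "repo:open_pr", {"title": "bump requests to 2.32"})
# execute("code-agent", "pipeline:deploy", {}) would raise AuthorityError
```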

Measuring exposure before solving it

Most organizations already track access-related metrics. Those metrics were designed for human-driven systems.

They are no longer sufficient.

If AI systems are participating in the software supply chain, organizations need to measure where and how that participation introduces risk.

A few signals matter immediately:

  • AI-generated artifact footprint: What portion of code, dependencies or infrastructure definitions in production originates from AI-assisted processes?

  • Authority scope of AI systems: What systems can these identities access — and what actions can they take across repositories and pipelines?

  • Autonomous change rate: How often are changes introduced and propagated without explicit human review?

  • Cross-system interaction surface: How many systems does a single AI workflow touch as part of normal operation?

  • Auditability of AI-driven actions: Can changes be traced cleanly to a system, workflow and triggering context?

These are not abstract concerns. They are measurable.

And until they are measured, they are not governed.
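As a starting point, two of these signals fall out of commit metadata almost for free. The records and the roster of AI identities below are illustrative; the point is that both numbers are computable today.

```python
# Sketch: AI artifact footprint and autonomous change rate from commit metadata.

commits = [
    {"author": "jdoe",       "human_reviewed": True},
    {"author": "code-agent", "human_reviewed": False},
    {"author": "code-agent", "human_reviewed": True},
    {"author": "dep-bot",    "human_reviewed": False},
]
AI_IDENTITIES = {"code-agent", "dep-bot"}   # illustrative roster

ai_commits = [c for c in commits if c["author"] in AI_IDENTITIES]
footprint = len(ai_commits) / len(commits)                        # AI-generated share
autonomous = sum(not c["human_reviewed"] for c in ai_commits) / len(ai_commits)

print(f"AI artifact footprint:  {footprint:.0%}")    # 75%
print(f"Autonomous change rate: {autonomous:.0%}")   # 67%
```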

The regulatory imperative

This is not just a technical shift. It is a governance and liability shift.

As regulatory expectations evolve — from AI accountability frameworks to cybersecurity disclosure requirements — organizations are increasingly responsible for explaining and controlling automated decisions inside their environments.

If an AI-driven change introduces a vulnerability or leads to a material incident, “the system generated it” will not be an acceptable answer.

Accountability will still sit with the enterprise.

That raises the bar: Governance must extend to how autonomous systems act, not just how they are accessed.

The architecture gap

[Diagram: AI systems operate horizontally across systems, while governance remains vertical. Credit: Puneet Bhatnagar]

The issue is not that any one control is missing.

It is that AI systems operate across the seams of systems designed to govern within their own boundaries.

Repositories enforce code controls.

Pipelines enforce deployment controls.

Identity systems enforce access controls.

Security tools enforce policy checks.

Each works as designed.

But AI systems move across all of them.

They read from one system, generate changes, trigger another and influence a third. Authority is exercised across systems, while governance remains within them.

That is the architectural gap.

A different governance model

Most organizations will respond to this shift by trying to extend existing access controls. That instinct is understandable — and insufficient.

The problem is no longer just who or what can access a system. It is how control is maintained when authority can generate new actions dynamically.

This requires a different model of governance.

One that treats software systems as actors whose behavior must be bounded, observed and continuously evaluated across workflows — not just permitted or denied at a point of access. Governance becomes less about static permissions and more about controlling the shape and impact of actions across systems.

That is the shift.

Conclusion

The conversation around AI in software development often focuses on productivity.

But as AI systems begin to participate in producing and modifying enterprise software, the more important question becomes governance.

AI is not just accelerating the software development lifecycle. It is becoming part of the software supply chain itself.

And that changes the problem.

The challenge for CIOs is no longer just managing developers, tools or pipelines. It is understanding and governing the authority that software systems exercise across them.

Because in a world where software can act on behalf of the enterprise, governance is no longer just about access.

It is about authority — what systems are allowed to do, and how that authority is controlled and measured over time.


CIOs warn that the talent shortage is holding back enterprise AI

A shortage of expertise has stalled AI initiatives at many organizations, as limited knowledge of the technology has constrained professionals’ ability to realize AI’s potential.

According to CIO.com’s State of the CIO 2026 survey, a lack of in-house talent was the top challenge IT teams faced in implementing AI strategies over the past 12 months, cited by 40% of respondents.

Ha Hoang, CIO of cyber resilience provider Commvault, says the shortage is especially acute in roles at the intersection of AI and cybersecurity. Cybersecurity companies, she argues, need people who can understand data and operations and translate risk insights into business decisions.

Moreover, vendors such as Commvault also need engineers and analysts who know how to secure AI models, safeguard training data, and detect AI-related threats such as prompt injection and model poisoning.

“As AI-driven automation transforms IT and security operations, CIOs and CISOs will need professionals who can interpret, tune, and manage AI systems, not just monitor alerts,” Hoang says. That is why she believes “we will need fewer siloed specialists and more AI-fluent generalists who can evolve at the same pace as the technology.”

Deep experience required

Part of the problem is a scarcity of people who understand AI’s potential and can anticipate where AI technologies are headed, adds Anand Srinivasan, chief strategy officer at o9 Solutions, provider of an AI-powered enterprise planning platform.

“The challenge is not simply a shortage of AI experts, but a deeper structural gap between how companies are organized and what modern AI makes possible,” he says. “Most large organizations still operate through hierarchical, siloed decision-making models designed for stability and scale, not for speed and adaptability.”

The most critical skills gap lies not only in building AI systems but also in rethinking how decisions are made and executed across the enterprise, according to Srinivasan. AI can unlock major gains in agility and adaptability, he adds, but only if decision-making capabilities let organizations turn strategy into action faster and with less risk.

Srinivasan cites ice hockey legend Wayne Gretzky to illustrate the problem: “Skate to where the puck is going, not where it has been.” The AI puck is moving very fast, he notes, and AI expertise is a constantly moving target.

“Traditional machine learning skills are being rapidly displaced by the needs of generative AI, agentic AI, and AI governance,” he says, adding: “Workers with AI skills now command significant salary premiums over peers in the same roles who lack those capabilities.”

Beyond the challenges of a fast-evolving technology, there is a problem of shallow AI knowledge, says AJ Sunder, CIO and chief product officer at Responsive, a provider of strategic response management software. Plenty of available candidates have some AI knowledge, he suggests, but many lack a deeper understanding of how to deploy it to meet business needs.

“There is certainly a shortage of people who can build reliable, secure, and scalable AI systems for production environments,” he adds. “This abundance of AI-aware talent, combined with the scarcity of people who can translate that into working AI applications, creates a huge problem in filtering out the noise.”

Sunder readily admits that finding workers with that level of experience has been a challenge for Responsive, though the company has been fortunate to find some outside talent.

And it doesn’t end there: “The kind of AI problems we solve requires experience handling content at massive scale, with all the complexities of messy enterprise data. There aren’t many people with enough experience to solve the type of problems we tackle at the scale we operate,” he says.

Hands-on training

Responsive has prioritized internal training to build expertise in-house, with internal teams driving the learning initiatives, Sunder explains. The AI-focused company had a head start, having concentrated on the technology before the current wave.

He adds: “We’ve been fortunate to have talented people who quickly recognized the pace of AI and the value of hands-on learning, experimentation, trial and error, and unlearning in order to learn anew. That allowed all of us, collectively, to learn, share, and teach one another.”

The company also builds teams by pairing AI specialists with business-domain experts rather than placing them in isolated groups, Sunder says. Responsive has likewise invested aggressively in AI tools that let a broader pool of engineers contribute to AI-driven features without deep machine learning backgrounds.

As he puts it, “not everyone needs to be an AI expert from day one.”

He even questions the need for more external AI training programs, arguing there may already be too many.

“Some structured training is needed to bring most, if not all, team members to a baseline level of knowledge, and that already exists. Beyond that, unstructured learning, hands-on exercises, and building useful solutions that go beyond hello-world tutorials are far more effective than any long-form training program, mainly because of how quickly things evolve,” he concludes.

Commvault also focuses on internal training and on reskilling current employees, Hoang says, and is even exploring partnerships with universities and cybersecurity bootcamps.

“The hardest skills to find are those that combine security fundamentals with AI model governance or automation tooling,” she says. “Many professionals are strong on one side of the equation, but not both.”

Companies also need to be flexible in how they think about AI expertise, she adds.

Hoang leaves this closing thought: “Many organizations still rely on rigid job descriptions that overweight years of experience or specific certifications, while candidates have transferable skills but lack the exact title or hands-on time with particular tools. Forward-thinking CIOs are rethinking recruitment, prioritizing aptitude and a learning mindset over narrow experience.”

It took 4 years to master ‘The Knowledge.’ AI just collapsed it in a software update

In London, becoming a licensed cab driver used to require passing an exam called “The Knowledge.” Candidates spent three to four years memorizing 25,000 streets, 100,000 landmarks and thousands of optimal routes. Neuroscience researchers at University College London found that cabbies who passed had measurably enlarged hippocampi from the cognitive load.

GPS made the entire achievement irrelevant in a single software update. Not gradually. Not partially. A driver on their first day with a nav app could match a cabbie who had studied for four years. The skill did not get cheaper. It stopped mattering.

That same structural collapse just happened to cyberattack expertise.

The skill floor fell through the floor

For two decades, the most dangerous attack techniques were gated by skill and time. Adversary-in-the-middle phishing, polymorphic malware, living-off-the-land scripting, autonomous exploit development — nation-state groups ran these operations because they alone had practitioners who could execute them.

AI removed the gate. The same way GPS never taught anyone cartography — it made cartography optional.

IBM X-Force quantified one dimension: AI generates convincing phishing lures in five minutes versus sixteen hours for an experienced human operator. That’s a 192x reduction in time cost for a single task. Multiply it across reconnaissance, lure generation, payload evasion and exploit development, and you get a capability transfer from specialized actors to anyone motivated enough to open a Telegram channel. CrowdStrike’s 2026 Global Threat Report documented the result: An 89% year-over-year surge in AI-augmented attacks, alongside a 29-minute average eCrime breakout time — 65% faster than 2024.

Three techniques show how complete the collapse was.

Adversary-in-the-middle phishing once required an operator who understood reverse proxy architecture, SSL certificate management and session token mechanics. Platforms like Tycoon 2FA packaged all of that into a browser dashboard with tiered pricing and customer support. The required skill dropped to “credit card and intent.” The result: 40,000 AiTM incidents daily across Microsoft environments, and 84% of compromised accounts had MFA enabled. The authentication was genuine. The theft happened after it succeeded.

AI spear phishing once required a skilled analyst spending two to four hours per target. AI automated the entire pipeline — LinkedIn scraping, lure generation, style-matching — producing messages with zero grammatical errors that reference real projects and mimic specific colleagues. A 2025 campaign targeted 800 accounting firms simultaneously with emails referencing each firm’s specific state registration details and hit a 27% click rate. Running 800 firm-specific, research-backed campaigns at once was previously not operationally feasible below nation-state level.

Autonomous exploit development may be the starkest case. Anthropic’s Mythos model demonstrated fully autonomous discovery and exploitation of unknown vulnerabilities — independently finding a 17-year-old remote code execution flaw in FreeBSD’s NFS server that human researchers had missed for years. Cost: under $20,000. That replaced months of nation-state research effort.

Eight major attack categories show the same pattern across 2025 and 2026 data. The skill that gated each attack stopped being required.

The auto-tune problem

Auto-tune didn’t make singers cheaper to hire. It made pitch control irrelevant. A tone-deaf performer with the plugin produces the same output as a conservatory graduate. The listener cannot tell the difference.

That’s the detection problem in one sentence.

Traditional defenses work by finding a signal: A known malicious hash, a grammar error in the lure, a failed authentication attempt. AI lets attackers strip those signals out. AiTM removes failed logins. AI-generated lures remove grammatical errors. Polymorphic malware removes stable code signatures. Automated reconnaissance removes advance warning entirely — it runs in public data sources the target cannot monitor.

The attack that succeeds now is the one designed to look completely normal. Pattern-matching fails when the patterns have been intentionally removed.

The architecture was built for a world that no longer exists

The defense stack most organizations run rests on three assumptions that held for two decades and are now false.

First, that sophisticated attacks are rare. They’re not — volume now scales to commodity levels. Second, that attacks contain detectable quality signals. They don’t — the absence of awkward phrasing or mismatched domains isn’t exculpatory. It’s the attack working as designed. Third, that human investigation speed is fast enough. A 29-minute breakout time and a 21-second average time-to-click leave no margin for a 15-minute triage cycle.

These weren’t bad assumptions when architects made them. But the architecture built on top of them doesn’t degrade gracefully when they fail. It fails structurally.

What still works — and why

The controls that survive share one trait: They depend on properties attackers cannot strip from the signal.

FIDO2 security keys bind authentication cryptographically to the legitimate origin domain. When an AiTM proxy intercepts the flow, the challenge comes from the proxy’s domain. The key refuses to sign. No AI-generated polish changes the domain mismatch at the cryptographic layer. Deploy it for all privileged accounts and disable fallback to phishable MFA methods — Proofpoint has already documented FIDO2 downgrade attacks in Microsoft Entra.
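The reason the origin binding survives AI-polished lures is visible in the verification step itself. Here is a minimal sketch of the server-side origin check in a WebAuthn assertion, with an illustrative relying-party origin (full verification also validates the signature over the authenticator data, omitted here):

import base64
import json

EXPECTED_ORIGIN = "https://login.example.com"  # illustrative relying-party origin

def check_client_data(client_data_b64: str, expected_challenge: str) -> bool:
    """One slice of WebAuthn assertion verification. The browser embeds the
    origin it actually connected to in clientDataJSON, and the security key
    signs over a hash of that structure, so a proxy's domain cannot be
    swapped out without breaking the signature."""
    padded = client_data_b64 + "=" * (-len(client_data_b64) % 4)
    client_data = json.loads(base64.urlsafe_b64decode(padded))
    if client_data.get("type") != "webauthn.get":
        return False
    if client_data.get("challenge") != expected_challenge:
        return False  # replayed or foreign challenge
    return client_data.get("origin") == EXPECTED_ORIGIN  # AiTM proxy fails here

The decision rests on a property the attacker cannot strip: the browser records the origin it actually connected to, and the proxy’s domain fails the comparison.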

But hardware controls address only the front door. The deeper fix is a different detection philosophy: Reasoning about what the attacker is trying to accomplish rather than what the attack looks like. In January 2026, a mid-market financial firm caught an active AiTM operation before any payment moved. Their pipeline correlated an email click, a new-IP authentication and an inbox rule creation within a 90-second window — flagging the sequence as a single credential-theft operation. Their legacy email gateway evaluated the same email and generated no alert. SPF, DKIM and DMARC all passed. The link resolved to a legitimate SharePoint domain. The difference wasn’t a better product. It was a better question: One system asked what the email looked like; the other asked what the attacker was trying to accomplish.

That’s the architecture shift — from “does this match a known threat pattern” to “is this sequence of actions consistent with credential theft, regardless of what the initial email looked like.” Most SOCs present those as four unrelated alerts triaged by different analysts. The attacker’s operational logic is more coherent than the defender’s detection pipeline.

The capability transfer is permanent

London didn’t rebuild its transportation system assuming most drivers still couldn’t navigate. It accepted the collapse and adapted. The cabbies who survived stopped competing on memorization and shifted to what GPS couldn’t replicate: Judgment, local knowledge, reading the situation in real time.

The security equivalent is the same pivot. Stop competing on pattern recognition — the skill AI just made irrelevant for both sides — and shift to what attackers cannot automate away: Understanding what normal looks like inside your specific organization, connecting signals across kill chain stages, and reaching a verdict at machine speed.

The Knowledge took four years to master. One software update made it obsolete. The question for security leaders isn’t whether the same thing happened to APT tradecraft. The data says it did. The question is whether your architecture still assumes it didn’t.

This article is published as part of the Foundry Expert Contributor Network.


US government agency to safety test frontier AI models before release

The Center for AI Standards and Innovation (CAISI), a division of the US Department of Commerce, has signed agreements with Google DeepMind, Microsoft, and xAI that would give the agency the ability to vet AI models from these organizations and others prior to their being made publicly available.

According to a release from CAISI, which is part of the department’s National Institute of Standards and Technology (NIST), it will “conduct pre-deployment evaluations and targeted research to better assess frontier AI capabilities and advance the state of AI security.”

The three join Anthropic and OpenAI, which signed similar agreements almost two years ago during the Biden administration, when CAISI was known as the US Artificial Intelligence Safety Institute.

An August 2024 release about those agreements indicated that the institute planned to provide feedback to both companies on “potential safety improvements to their models, in close collaboration with its partners at the UK AI Safety Institute (AISI).”

Microsoft said Tuesday in a blog post that the latest agreement, and others like it, are essential to building trust and confidence in advanced AI systems. As AI capabilities advance, it said, so too must the rigor of the testing and safeguards that underpin them.

A shift toward proactive security

Fritz Jean-Louis, principal cybersecurity advisor at Info-Tech Research Group, said the CAISI agreements signal a shift toward proactive security for agentic AI by enabling government-led testing of advanced models before and after deployment.

This should, he said, “help strengthen visibility into autonomous behaviors while accelerating the development of standards to mitigate risks. By combining early access, continuous evaluation, and cross-sector collaboration, the initiative pushes the industry toward security-by-design for increasingly autonomous AI systems.”  

However, added Jean-Louis, “there are a few potential hurdles to consider, for example: how would intellectual property be protected under this approach? Regardless, I believe this is a positive step for the industry.”

Executive order ‘taking shape’

Following the announcement from CAISI, a report published Wednesday indicated that the White House is preparing an executive order that would create a vetting system for all new artificial intelligence models, key among them Anthropic’s Mythos.

Bloomberg reported, “the directive is taking shape weeks after Anthropic revealed that its breakthrough Mythos model was adept at finding network vulnerabilities and could pose a global cybersecurity risk.”

Significant change in policy direction

Carmi Levy, an independent technology analyst, said, “it is patently obvious that this week’s announcement establishing the Center for AI Standards and Innovation as the testing ground for frontier AI models is directly linked to the potential executive order that would lead to a vetting system for AI models.”

It isn’t coincidental, he said, “that the announcements were made in rapid succession, and it reinforces the growing urgency for governments in the US and elsewhere to tighten partnerships with key AI vendors to maximize AI-related security and minimize the potential for systemic risk.”

This latest flurry of activity from Washington marks a significant shift in policy direction from an administration that up until recently had been following a more laissez-faire approach to regulation, Levy pointed out.

Concerns around Anthropic’s Claude Mythos model, and the relative ease with which it could discover and exploit vulnerabilities in digital systems, “might have helped shift the federal government’s position on AI-related regulation, particularly around the renewed push to enforce standards for AI-related deployments across government infrastructure,” he said.

AI vendors like Google, Microsoft, and xAI, Levy added, “must walk a political highwire of sorts as they balance the need to release models into the marketplace in a timely, cost-effective manner with increasingly defined rules around AI-related cybersecurity and safety. The industry can’t afford a scenario where vendors themselves make up the rules as they go along.”

At the same time, he said, the recent showdown between Anthropic and the Pentagon illustrates why the vendors might be forgiven for viewing the federal government’s growing interest in AI testing and regulation with at least a certain degree of caution.

According to Levy, “while the administration’s efforts to centralize testing and oversight should streamline the go-to-market process for vendors and accelerate the development of best practices around frontier model development, the political overtones of recent government-industry partnerships cannot be ignored.”

The CIO’s shift from ‘technology manager’ to ‘value designer’: the perspective Japan’s CIOs need as AI adoption stalls

Noyuri Mima
Learning environment designer and learning scientist; professor emerita at Future University Hakodate. Member of the Science Council of Japan (26th and 27th terms) and auditor of the University of Electro-Communications. Studied computer science at the University of Electro-Communications, education at Harvard University’s graduate school, and cognitive psychology at the University of Tokyo’s graduate school. Works across computing, education, and cognitive science to foster communication, talent development, and network building. Has also served as a visiting researcher at the MIT Media Lab, a member of the NHK Board of Governors, and a visiting researcher at UC Berkeley’s AI lab and its Center for Human-Compatible AI. Former deputy director of Miraikan, the National Museum of Emerging Science and Innovation.

The real reason AI adoption lags in Japan

While generative AI use advances worldwide, most Japanese companies have yet to adopt it in earnest. Employees may have tried ChatGPT at work, but in the vast majority of companies the technology has not taken root across the organization.

The picture in the US is different. According to the American Psychological Association’s 2025 “Work in America” survey, 47% of workers use AI on the job at least once a month. Thirty percent feel they will be left behind if they don’t use it, and 38% worry their own roles could become unnecessary. The findings show that AI has already penetrated deep into everyday work, bringing expectations of greater efficiency along with psychological and organizational effects such as anxiety and internal divides. The report concludes that AI adoption is inseparable from institutional design that takes people’s anxieties and expectations as its starting point.

So what lies behind Japan’s slow AI adoption? What Mima has observed through lectures and conversations with companies is a structural problem, not a lack of ability or technical skill. First, the purpose of digital transformation is not sufficiently shared within the organization before AI is introduced. Second, frontline staff are reluctant to change their work routines, and vendors, dragged along by that reluctance, operate on the principle of not proposing anything the client doesn’t want. Evaluation systems that make preserving the status quo the rational choice reinforce both.

Examples that stop at inessentials, such as publishing statistics as PDFs or digitizing medical records, are endless. The root cause, Mima argues, is design philosophy. Consider Tesla, the EV-only carmaker: software updates arrive frequently, and it is not unusual for the display to be overhauled overnight. Having driven a Tesla while at UC Berkeley’s AI research institute, Mima calls this “a specification that would not be readily accepted in Japan.” If the design philosophy of Japan’s auto industry is certainty, safety, and a high degree of finish, Tesla’s philosophy of rapid evolution, continuous improvement, and post-launch fixes is its exact opposite. “This is not about which is better or worse; it is a difference in design philosophy,” Mima explains.

Seen in this light, AI’s technical nature and Japan are a poor match. “AI is inherently incomplete. It is a technology that keeps being updated and gets more accurate every time you use it. That does not fit the Japanese mindset of deploying something only once it is finished,” Mima says. Japan, she continues, has a structure that makes AI adoption hard to advance.

In the AI era, technical judgments are value judgments: the CIO as ‘value designer’

Even so, CIOs must deploy AI. And AI adoption, Mima believes, adds a new role to the CIO’s job.

Technical judgment has always been one of the CIO’s key responsibilities. But AI adoption also brings decisions about what to automate, which roles to redefine, and which skills to treat as valuable. These are not purely technical judgments; they are value judgments.

Amazon offers an emblematic example of how much value judgments matter. When the company tried to introduce AI into hiring, it trained the system on its historical hiring record, which skewed heavily male, and the result unfairly downgraded women (the rollout was ultimately scrapped). “The choice of training data is itself part of the AI adoption decision,” Mima says. What is technically possible and what is organizationally sound do not coincide, and the premises of those judgments are ultimately set by IT engineers and by the CIO who makes the management call.

Mima puts it plainly: “The CIO is a technology manager and, at the same time, the organization’s value designer.”

In thinking about value, Mima first points to the view that optimization and desirability do not coincide. Quantifiable indicators such as efficiency and cost reduction tend to take priority, but “the moment you make the quantifiable your yardstick, the kinds of excellence that cannot be quantified slip through,” she says.

What is being tested is the very act of deciding what counts as value. That is why Mima argues ethics must sit at the core. Ethics here is not a set of rules to obey but a framework for judgment in situations where values collide and no single answer is settled. Grounding the organization in the question of what it considers good is, Mima says, what is now being asked of CIOs.

A three-layer model of engineering ethics

From this, Mima proposes a “three-layer model of engineering ethics” as the framework a CIO should apply, as value designer, to AI adoption. The first layer is the responsibility of foresight: mapping the impact on jobs before deployment and making the transition plan visible. The second is accountability: sharing the purpose, scope of impact, and limits of the deployment across the organization. The third is the organizational responsibility of care.

On that third layer in particular, Mima says, “the responsibility of care is not emotional consideration; it is a management responsibility to prevent the erosion of human capital.” Institutionally guaranteeing skill transitions for people affected by AI adoption is not kindness but a duty to protect talent as a management resource. “With a shrinking, aging population and a deepening talent shortage, moving people affected by AI adoption into other work is a management obligation the organization must fulfill,” she says. Simply ordering adoption from the top down will provoke backlash from employees anxious that their jobs may disappear. Facing that anxiety and laying out a transition path as an organization is the essence of care as institutional design, and it cannot be realized without close collaboration with HR.

Mima also argues that the ethics engineers carry needs one more perspective. Traditionally, engineering ethics has been discussed from three angles: technology’s impact on society, responsibility as a member of an organization, and professional accountability. In the AI era, she says, the perspective of the everyday user is needed too. At a time when people around the world are marrying AI companions, and subscription services let you “meet” deceased family members, those who develop and deploy technology are themselves among the people it affects. “Judgment as a user, as someone affected, is now needed as a fourth ethic.”

Designing talent development for the AI era

The AI era also demands a rethink of talent. As the US survey cited above made clear, employees feel anxious about AI adoption. Beyond worries about employment itself, changes in how work is done and what it consists of bring career discontinuity. Talent development, too, must be designed on an ethical foundation.

“The traditional yardsticks of work efficiency and cost reduction alone cannot capture the impact on people,” Mima says. As AI-era indicators she cites skill-transition completion rates, redeployment success rates, organizational AI maturity, and psychological safety: whether employees can feel that adoption is safe for them. From efficiency to sustainability; that shift is the core of redesigning talent-development KPIs.

“Talent development in the AI era is not about teaching tool operation; it is about designing structures for continuous learning on the premise of change,” Mima says. Cultivating an ethic of embracing change matters as well. For that reason, she argues that “upskilling,” adding new skills on top of existing ones, is a more important lens than “reskilling.”

Keep asking why

As noted at the outset, Japan’s slow AI adoption is rooted in structural problems of culture, institutions, and design philosophy. But not adopting AI is not an option. Where to strike the balance is, Mima says, “where it all hinges.” She is not pessimistic: “hope lies within a hair’s breadth.”

“Reading the air,” “setting the stage,” “sensing what goes unsaid”: the relationship-minded sensibility rooted in Japanese society resonates deeply with Mima’s concept of Humane Learning Design (HLD). HLD is a philosophy of learning built on posing questions, engaging in dialogue, and making responsible judgments attuned to context and relationships. The word “humane” carries connotations of compassion and humanity that go beyond “human.” An ethical sensibility that tends the space between people rather than optimizing individuals also overlaps with the core of the ethics of care.

That relational sensibility, however, is a hair’s breadth from mere deference, Mima cautions. “If you read the air too much and failure becomes impermissible, you end up changing nothing.” The frontline understands the problems best, yet its voice doesn’t reach the top; that structure is a pathology common to every organization. The task is to draw on relational sensibility while staying open to change. “We need an open culture in the sense of permitting change, of accepting that relationships will evolve,” Mima says.

If the two can be reconciled, Mima believes Japan can bring a distinctive strength to the AI era. “Japan’s relational knowledge could be something we present to the world as originally Japanese.”

Finally, what Mima stresses to CIOs is the question of why. “Discussion of how to use AI and what to use it for is well under way. But the why is not being discussed: why use it, what is better left unused, what should remain on the human side.”

Mima calls the capacity to face that question “AI Readiness”: not skill in using the tools but a state of preparedness to confront the judgments. Embedding it in the organization, she says, is the most important role demanded of the CIO as value designer. “The meaning and impact of using it, and the judgment not to use it. Those judgments lead to building a society that makes the most of people.”

Anthropic’s financial agents expose forward-deployed engineers as new AI limiting factor

When financial tech vendor FIS announced its new AI agent for detecting financial crimes on Tuesday, it made much of the team of forward-deployed engineers (FDEs) from Anthropic embedded to make it happen. FIS is just one of the dozen or so companies working with Anthropic on developing agents for financial services using new connectors and so-called “ready-to-run” templates Anthropic announced the same day.

Enterprise CIOs are increasingly paying for the services of AI vendors’ FDEs, given their own data quality issues and the complexity of working with AI models.

But how and why such teams are brought in can make the difference between whether the enterprise is helped to get to the next AI level or becomes a hostage to never-ending consulting costs. 

FIS listed the Bank of Montreal (BMO) and Amalgamated Bank as the first two companies to deploy its agent, which it said will compress anti-money-laundering investigations from hours to minutes, assembling evidence across a bank’s core systems and surfacing the riskiest cases for review with full auditability and traceability of decisions. “Anthropic’s Applied AI team and forward-deployed engineers (FDEs) are embedded with FIS to co-design the Financial Crimes AI Agent and transfer knowledge so FIS can build and scale additional agents independently over time,” it said.

Aman Mahapatra, chief strategy officer for Tribeca Softtech, a New York City-based technology consulting firm, suggests CIOs follow the money when evaluating similar work with AI vendors. 

“The structurally interesting thing about the FIS-Anthropic model is who actually pays the FDE cost. This is the question CIOs should be asking but mostly are not,” Mahapatra said.

The cost of FDEs could put some AI projects in jeopardy, according to a recent report by Alex Coqueiro, a senior director analyst with Gartner. He predicted that by 2028, “70% of enterprises will be forced to abandon agentic AI solutions from FDE-led engagements because of high vendor costs and lack of internal skills to evolve them independently.”

Service, not software

He argued that the problem is not entirely the fault of the AI vendor. Many IT operations don’t put in the necessary preparatory work to clean their data and make it AI-friendly. Internal corporate politics and personalities are another critical factor.

“The domain experts most critical to FDE success have the strongest incentive to undermine it. An expert who perceives the FDE as capturing their expertise for agentic automation will give the official process instead of the real one, and the AI agent built on it will fail on the exact edge cases they chose not to mention,” Coqueiro said in the report. “Flat FDE effort across successive deployments is the signal that an engagement has produced a dependency, not a capability. When effort does not decrease as use cases mature, the organization is paying consulting rates for operations it should own.”

In the case of FIS’s work with Anthropic, said Mahapatra, “BMO and Amalgamated are not writing direct checks to Anthropic for forward-deployed engineers at quarterly consulting rates. FIS is absorbing the FDE engagement and amortizing it across its banking customer base.”

That approach, he said, “is meaningfully better economics than direct Anthropic engagements where each bank funds its own embedded engineering team to redesign the same context boundaries, shadow autonomy controls, and the jailbreak resistance testing in isolation.”

Mahapatra said much of this problem stems from how generative and agentic AI have been marketed. The original ROI thesis, he said, was that AI enables enterprises to do more with fewer people, but that was “a marketing pitch that was never going to survive contact with regulated banking workflows.”

Nik Kale, a member of the Coalition for Secure AI (CoSAI) and of ACM’s AI Security (AISec) program committee, said that he sees FIS’s presentation of its work with Anthropic as “a concession that frontier AI isn’t a product yet. CIOs thought they were buying software. They’re actually buying a professional services engagement. That changes the cost model, the dependency model and the governance model for every enterprise AI deployment.”

Kale said the statement’s wording gives a clue about the agentic strategy. 

“The FIS release says every agent decision is traceable and auditable. True statement, wrong sentence. The harder question isn’t auditing what the agent decided. It’s deciding which decisions are the agent’s to make in the first place. Banks have decades of decision-rights frameworks. They don’t translate cleanly to agent harnesses built by someone else’s engineers,” Kale said. “The CIO test is simple: after the forward-deployed team leaves, can your organization still operate, monitor, challenge, and safely modify the agentic workflow? If the answer is no, it’s not mature yet. It may be a successful implementation project, but it’s not yet an enterprise capability.”

Justin Greis, CEO of consulting firm Acceligence and former head of the North American cybersecurity practice at McKinsey, agreed with Kale.

Human judgment pretending to be process

“The bigger risk isn’t the cost of these engagements. It’s the dependency they can create. Spending a few hundred thousand dollars to get something into production isn’t the issue,” Greis said. “Ending up with a system that only the vendor can operate, extend, or even fully understand is where things start to break down.”

The problem with some of these consulting arrangements is not that they hide IT deficiencies as much as they enable AI shortcuts.

Enterprises paying FDE teams “do not undermine the ROI case for agentic AI. They undermine the lazy version of the ROI case. That distinction matters,” said Sanchit Vir Gogia, chief analyst at Greyhound Research. “For the past two years, too much of the enterprise AI narrative has been sold as a tidy labor-reduction story. Buy the model. Automate the work. Reduce the people. Capture the savings. It is neat, board-friendly, and deeply incomplete. Large enterprises are not collections of clean tasks waiting to be automated. They are collections of exceptions, legacy systems, fragile integrations, access controls, undocumented workarounds, compliance obligations, and human judgement pretending to be process. Forward deployed engineers are the invoice for making AI real. That is not transformation. That is dependency with better stationery.”

Another FDE concern is the inevitable conflict of interest that can exist where the AI vendor that is being paid to fix the complexity is also the vendor that created much of that complexity in its model.

Carmi Levy, an independent technology analyst, said the business case can undermine enterprise objectives. “If AI agents are supposed to autonomously create, deploy, and manage super-capable workflows at all levels of the organization, their very capability threatens the future viability of vendors who have long attached lucrative support contracts to those very same deployments. If the FDE is going to be engaged to work alongside customers to make their AI agents come alive, where is the incentive for AI vendors to build agentic systems that are so capable that they don’t require ongoing support? The FDE business model influences up-front model design, and it’s entirely possible that AI platforms are being deliberately designed to require persistent FDE support.”

Agentic AI for marketing: Reimagine end-to-end customer experiences

Agentic AI represents the next phase of marketing performance, enabling organizations to connect insights, decisions, and execution across the customer experience. As customer journeys become more complex and expectations rise, enterprises need systems that can operate across data, content, and workflows in a coordinated way.

Generative AI has dramatically sped up how marketing teams produce content. Work that once required long cycles can now be completed in hours, enabling teams to support more channels, more formats, and more personalization than ever before. But as content volume increases, a deeper challenge has become clear.

Creating more content is not the same as delivering better customer experiences. Many leaders now recognize that while generative AI speeds up creation, it is not enough to accelerate the marketing and customer experience workflows required to meet today’s customer demands. The coordination, decisioning, and execution work that surrounds content remains complex and manual, shifting the bottleneck from creation to experience delivery.

This gap is fueling the adoption of agentic AI, which represents the next stage of value creation. AI agents can understand goals, make context-aware decisions, and assist with the complex steps required to bring one-to-one customer experiences to life, allowing teams to reduce manual effort, respond to changes faster, and shift their focus from operational tasks to strategic direction.

The momentum is significant: agentic AI is expected to create $450–650 billion in annual value by 2030.

What is agentic AI and how does it work?

Agentic AI refers to intelligent systems composed of agents that can reason, act, and coordinate work in real-time. These agents can understand goals, take initiative, monitor dashboards, trigger workflows, and collaborate across functions while keeping people in control through oversight and approvals.
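Read literally, that description is a control loop. The following is a minimal sketch, with oversight expressed as an approval gate; every name, metric, threshold, and action below is an illustrative assumption, not Adobe’s implementation:

APPROVAL_REQUIRED = {"pause_campaign", "shift_budget"}  # high-impact actions gated on a person

def read_signals():
    # Stand-in for a real dashboard or metrics API.
    return {"campaign_ctr": 0.004}

def decide(metrics):
    # Goal-aware decision: protect campaign performance.
    return "pause_campaign" if metrics["campaign_ctr"] < 0.01 else "no_action"

def act(action, approved_by=None):
    # People stay in control: impactful actions wait for sign-off.
    if action in APPROVAL_REQUIRED and approved_by is None:
        return f"queued {action!r} for human approval"
    return f"executed {action!r}"

metrics = read_signals()
action = decide(metrics)
print(act(action))                       # queued 'pause_campaign' for human approval
print(act(action, approved_by="cmo"))    # executed 'pause_campaign'

The design point is that autonomy and oversight are not opposites: the agent closes the loop on low-impact actions and queues high-impact ones for sign-off.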

Read the full guide: What is agentic AI?

What is the difference between agentic AI and generative AI?

Generative AI speeds up and scales the creation of content, concepts, and ideas, while agentic AI goes further by helping teams execute the work around that content by planning, deciding, optimizing, and coordinating actions across systems. Both work best when paired together across marketing operations.

Read the full guide on generative AI vs agentic AI.

Adobe is uniquely positioned to shape this next chapter by applying agentic intelligence to the areas where it creates the most enterprise value. Instead of treating AI as a series of point tools, Adobe connects agents across the full marketing lifecycle and provides a unified platform with real-time data and governance as the foundation, enabling organizations to move from task-level automation to coordinated, end-to-end experience performance.

This guide explores the practical path to scaling agentic AI for the enterprise with Adobe, revealing the core capabilities that define an enterprise-ready platform, why a foundation of trusted and governed data is non-negotiable, and how Adobe has designed agentic tools to manage complex, end-to-end workflows. You will discover exactly where our agents deliver high-impact value across the full marketing lifecycle and understand when and how you can extend this unified system for custom business solutions.


Interest in agentic AI is rising quickly, with two out of five organizations already investing significantly in this space, and a similar number of organizations in early testing or proof-of-concept stages. As more teams explore agentic AI, the question becomes what enterprises need to deploy agentic AI successfully at scale.

For agentic AI to support real customer experience work, it needs a strong, unified foundation. Teams must have access to reliable customer signals, clear understanding of content and context, and a shared view of what is happening across marketing and experience operations. When information is scattered or workflows are fragmented, AI can only handle narrow tasks in some pockets of the organization.

Agentic AI adoption is accelerating.

  • 40% of organizations are investing significantly in agentic AI.
  • 44% of organizations are in early testing or proof-of-concept stages.

When customer data, content knowledge, and operational insights are connected, AI agents can contribute to the full journey. Three qualities become especially important for organizations to adopt as they move forward.

1. Transparent oversight: Teams understand how decisions are made, where intervention is needed, and how agent-driven actions lead to outcomes.

2. Unified operational context: Planning, activation, personalization, and optimization align around the same view of customers, content, and journeys.

3. Business-level adaptability: Organizations can expand and refine agentic use cases as strategies evolve and new opportunities emerge.

Together, these qualities help organizations use agentic AI in a way that feels dependable, coordinated, and aligned with business goals. They create an environment where decision-making becomes faster and more consistent, enabling teams to shape customer experiences with greater relevance and precision.

To read the full guide, visit here.
