Why your data infrastructure — not your AI model — will determine whether Agentic AI scales
Nearly all the business media coverage of AI focuses on the eye-popping sums being deployed into data center infrastructure that drives the “compute” coveted by leaders in the AI industry. That “compute” provides the raw processing power required to train, build, and run AI systems. Think of it as the engine behind the technology. The tech community is expected to invest more than $750 billion in data centers this year alone. Estimates for total cumulative spend on the humming warehouses reach over $7 trillion by 2030. Such mind-boggling numbers, and the circular financing arrangements used to drum up the necessary capital, have understandably generated buzz about a potential bubble comparable to the dot-com era.
The development of data centers is a must if we want to capture the productivity gains that AI promises. Overinvestment, though, could not only have a chilling effect on AI’s rapid integration into the global economy but also lead to a calamitous outcome for financial markets. All the vigorous debate is warranted. However, not enough attention is paid to the other kind of infrastructure required to scale AI for highly productive, enterprise-agentic deployments—data infrastructure. Data and databases must be organized, checked for accuracy, and made easily accessible so that an AI agent can both locate a specific data point and use it to complete actual tasks across myriad systems without constant supervision.
Agentic AI has increasingly attracted attention over the past year, and for good reason. Systems that can reason, plan, and execute across complex enterprise workflows represent a genuine shift in what software can do. But the enthusiasm has outrun the evidence. Two-thirds of enterprises have experimented with AI agents, yet fewer than one in ten have scaled them to the point that they measurably change the cost base, revenues, or earnings. The public conversation remains fixated on what these systems can do in demonstrations, not on the conditions required to deploy them at scale.
The gap matters because agentic systems are not an incremental extension of prior AI. When AI drafts an email or helps write code, internal data barely matters. An agentic system, by contrast, does not just answer a question about an invoice—it locates the invoice in the Enterprise Resource Planning (ERP) system, matches it against the purchase order in procurement, and triggers payment, all without human direction. Its usefulness depends entirely on reaching across the systems where enterprise data actually lives. That is the central difficulty, and it is largely an infrastructure problem.
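To make that concrete, here is a minimal sketch of the invoice task in code. Everything in it is invented for illustration: the record layouts, the identifiers, and the in-memory dictionaries standing in for an ERP and a procurement platform. The shape of the work is the point: three steps, each crossing a system boundary.

```python
from dataclasses import dataclass

# Invented in-memory stand-ins for two separate enterprise systems. In a
# real deployment, each would be a distinct platform with its own schema,
# authentication, and failure modes.
ERP = {"INV-1001": {"po_number": "PO-77", "amount_cents": 125_000}}
PROCUREMENT = {"PO-77": {"approved_amount_cents": 125_000}}

@dataclass
class PaymentOrder:
    invoice_id: str
    amount_cents: int

def settle_invoice(invoice_id: str) -> PaymentOrder:
    """Locate the invoice, match it against its purchase order, trigger payment."""
    invoice = ERP[invoice_id]                  # boundary 1: the ERP system
    po = PROCUREMENT[invoice["po_number"]]     # boundary 2: the procurement system
    if invoice["amount_cents"] != po["approved_amount_cents"]:
        raise ValueError(f"{invoice_id} does not match {invoice['po_number']}")
    return PaymentOrder(invoice_id, invoice["amount_cents"])  # boundary 3: payments

print(settle_invoice("INV-1001"))
```

The few lines of logic are trivial. The difficulty in practice is that each boundary crossing depends on the systems agreeing about identifiers, amounts, and formats, which is precisely what heterogeneous enterprise environments do not guarantee.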
Agentic AI pilots obscure this. They succeed precisely because they are contained: a narrow, clean slice of data, a single system, and none of the integration complexity that characterizes real enterprise environments. When organizations move from pilot to deployment, the containment breaks, and the agentic system encounters data held across platforms that are maintained by different teams, governed by different standards, and often unable to communicate consistently with one another. What looked like a capability problem is revealed to be a data infrastructure problem—a failure of accessible, consistent, and usable data across systems—and no amount of model improvement will solve it. Eighty percent of companies cite data limitations as the primary obstacle to scaling AI, a figure that has proven stubborn even as the models themselves have leaped forward.
The organizations best positioned to move past the pilot phase are not those with the most advanced tools but those that built the infrastructure to support them before needing it. What follows examines why that infrastructure is so consistently underestimated, how readiness varies across sectors, and what firms and policymakers need to do differently.
Why Data Infrastructure Is the Binding Constraint
The data dependency of Agentic AI is obvious. What is less obvious and frequently neglected is cross-system operability—the capacity of platforms built on different architectures and governed by different standards to communicate reliably enough to carry an autonomous decision from start to finish. Data infrastructure, in this sense, is less about storage than about translation.
The distinction is compounded by the fact that most enterprise systems were never built to be spoken to by other systems in the first place. Procurement, clinical, billing, and network management tools were each designed to excel within their own domains, not to interoperate across them. Companies, then, cannot simply layer Agentic AI on top of their existing infrastructure; they must first complete the lower-profile work of standardizing data across systems. Without that foundation, even the most capable AI model has nothing coherent on which to act.
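What that standardization work looks like at the ground level is unglamorous. The sketch below, with invented field names and records, shows the same supplier invoice as two systems might hold it, and the canonical schema that reconciles them before any agent touches the data.

```python
from datetime import date

# Invented examples of one record as two different systems might hold it.
billing_record = {"vendorName": "ACME CORP", "invDate": "11/03/2025", "amt": "1,250.00"}
erp_record = {"supplier": "Acme Corp.", "invoice_date": "2025-11-03", "amount_cents": 125000}

def normalize_billing(rec: dict) -> dict:
    """Map the billing system's conventions onto one canonical schema."""
    month, day, year = rec["invDate"].split("/")
    return {
        "supplier": rec["vendorName"].title().rstrip("."),
        "invoice_date": date(int(year), int(month), int(day)).isoformat(),
        "amount_cents": int(float(rec["amt"].replace(",", "")) * 100),
    }

def normalize_erp(rec: dict) -> dict:
    """The ERP is closer to canonical, but still needs its own mapping."""
    return {
        "supplier": rec["supplier"].rstrip("."),
        "invoice_date": rec["invoice_date"],
        "amount_cents": rec["amount_cents"],
    }

# Only after normalization can an agent safely treat the two as the same record.
assert normalize_billing(billing_record) == normalize_erp(erp_record)
```

Multiply that by every field, every pair of systems, and every edge case, and the scale of the lower-profile work becomes clear.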
Industries encounter this challenge in different forms. In real estate and financial services, workflows involve sensitive personal and financial information, and regulatory exposure slows integration even when the technology is ready. In travel, healthcare, and telecommunications, critical data sits across booking platforms, clinical records, and billing systems that were never designed to interoperate. In supply chain and logistics, enterprise software is often decades old and lacks modern APIs, making integration so costly that it erodes much of the value automation was meant to create. In each case, the limiting factor is not the agent’s intelligence but the coherence of the environment in which it must operate.
Two sectors illustrate what a serious response looks like. Manufacturing is furthest along on the foundations, with 57% of manufacturers having deployed cloud and data analytics—pulling data out of isolated machine-level systems into consolidated environments where it can be standardized. Retail has taken a different route, standardizing around an emerging protocol layer: Model Context Protocol for shared context, Agent-to-Agent Protocol for coordination, Agent Payment Protocol for secure transactions, and unified commerce APIs. In other words, retailers are building a translation layer on top of what they already have. Manufacturing is deepening the substrate, while retail is wrapping it. Both approaches reflect the same underlying recognition that the work to be done is not on the models but on the environment in which they operate.
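The “wrap” approach can be sketched as a generic adapter pattern. The code below is not any particular protocol’s real SDK; the class names, actions, and stubbed responses are all invented. What it illustrates is the idea shared by such translation layers: each system, however old, is exposed through the same narrow interface, so an agent can call all of them uniformly without the systems themselves changing.

```python
from typing import Any, Protocol

class SystemAdapter(Protocol):
    """The uniform surface an agent sees, whatever sits underneath."""
    name: str
    def call(self, action: str, **params: Any) -> dict: ...

class LegacyInventoryAdapter:
    name = "inventory"
    def call(self, action: str, **params: Any) -> dict:
        # In practice: a thin shim over a decades-old system. Here, a stub.
        if action == "check_stock":
            return {"sku": params["sku"], "on_hand": 12}
        raise NotImplementedError(action)

class CommerceAPIAdapter:
    name = "commerce"
    def call(self, action: str, **params: Any) -> dict:
        if action == "create_order":
            return {"order_id": "ORD-1", "sku": params["sku"], "qty": params["qty"]}
        raise NotImplementedError(action)

REGISTRY: dict[str, SystemAdapter] = {
    a.name: a for a in (LegacyInventoryAdapter(), CommerceAPIAdapter())
}

def agent_step(system: str, action: str, **params: Any) -> dict:
    """One uniform entry point: the translation layer, not the model."""
    return REGISTRY[system].call(action, **params)

stock = agent_step("inventory", "check_stock", sku="SKU-9")
if stock["on_hand"] > 0:
    print(agent_step("commerce", "create_order", sku="SKU-9", qty=1))
```

The design choice is that the translation burden lands in the adapters, written once per system, rather than in the agent, which would otherwise need bespoke logic for every platform it touches.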
A third pattern is also emerging, in which firms rely on AI vendors themselves to supply the connective layer. Anthropic’s enterprise rollout of Cowork plug-ins and connectors for Google Workspace, FactSet, DocuSign, and other systems is the clearest example. The model provider, rather than the customer, performs the cross-system integration. Cowork lowers the barrier to entry but raises a different set of questions about governance, vendor concentration, and whether outsourced connectivity matches the durability of native infrastructure. The trade-off between building, wrapping, and renting will define the next phase of enterprise adoption.
Such an environment makes the competitive dynamic more complicated than it first appears. On the surface, vendor-supplied connectivity democratizes access. A midsize firm can plug into Claude’s enterprise connectors or Microsoft’s Copilot Studio without building its own integration infrastructure. But the firms best positioned to use that connectivity effectively remain those with the cleanest underlying data, the strongest governance, and the leverage to negotiate custom integrations and enterprise-level terms. Smaller firms running heterogeneous legacy systems get further than they would have a year ago, but their ceiling is lower and their dependence on vendor decisions is higher. Data infrastructure is not merely a near-term obstacle to scaling Agentic AI. It is the mechanism that concentrates the benefits of Agentic AI within the firms already ahead.
The Readiness Picture
Most organizations are not close to ready. Only 7% describe their data as “completely ready” for AI; fewer than a quarter have a data strategy at all; and 63% either lack AI-suitable data management or are unsure whether they have it. The gaps are not independent of one another. Without a data strategy, storage does not get optimized; without optimized storage, centralization stalls; without centralization, governance cannot scale. The 80% who cite data limitations as their primary obstacle are not pointing to a single deficiency but to a stack of them.
The downstream consequences are also measurable. Ninety-five percent of AI pilots fail to reach production, and 56% of CEOs report no financial return from AI investment. Both numbers track readiness more closely than model choice or vendor selection. When the data environment cannot support reliable agentic behavior, pilots stall and spending becomes sunk cost; firms that push forward anyway accumulate governance debt and brittle integrations that only become harder to remediate. Readiness scores are an early indicator of which organizations will convert AI investment into lasting capability and which will first have to revisit foundational work they assumed was behind them.
Industries sit at very different points on this curve, and treating them as though they do not leads to misallocated investment and unrealistic timelines.
Finance, IT, and telecommunications are scaling now. Their data infrastructure is mature, their governance frameworks are established, and the tasks most amenable to agentic automation are already well defined. Firms like JPMorgan Chase and Goldman Sachs in banking, Microsoft and Google Cloud in enterprise technology, and Verizon in telecommunications are past the threshold and into early execution.
Manufacturing is in a rapid catch-up phase. Federal data shows 58% year-over-year growth in AI adoption, and the cloud and analytics foundation discussed earlier is genuine. The persistent constraint is the divide between operational technology—the embedded systems that run physical production—and information technology; the two were designed around incompatible protocols and are harder to integrate than typical enterprise IT. Where data flows cleanly, results are striking: a tier-one automotive supplier halved routine test-case preparation time using a multi-agent R&D system, and a truck OEM reported a 40% increase in order intake within six months of deploying autonomous prospecting agents.
Retail is executing the most deliberate catch-up strategy of any sector. Rather than wait on legacy modernization, retailers are standardizing around the agentic protocol layer described earlier—a faster path to interoperability than rebuilding underlying systems. The open question is whether that layer delivers the data quality and governance scaled deployment requires, or only connectivity. Retail will be a key signal for sectors weighing the wrap-versus-deepen choice. Walmart is the clearest test case. In late 2025, the retailer began offering roughly 200,000 products through ChatGPT’s Instant Checkout, powered by OpenAI and Stripe’s Agentic Commerce Protocol. Within months, in-chat conversion rates ran at about one-third of those on Walmart.com. The company pivoted to embed its Sparky shopping assistant directly into ChatGPT and Gemini, keeping the discovery protocol but routing checkout back to the store site—an early signal that connectivity is not the same as conversion.
Professional services and healthcare are mixed. Worker-level AI use is high because personal generative AI tools operate on an individual’s own data and do not cross system boundaries, but enterprise infrastructure is uneven. The integration barriers that personal AI sidesteps are exactly the ones agentic systems confront. Thomson Reuters’ CoCounsel legal research agent is live in more than 20,000 law firms, including most of the Am Law 100, and EY has scaled 150 AI agents to support 80,000 tax professionals. In healthcare, Kaiser Permanente has rolled out Abridge’s ambient documentation platform across 40 hospitals and more than 600 medical offices, yet 62% of hospitals still report data silos across their Electronic Health Record, labs, pharmacy, and claims systems—the structural barrier that keeps similar orchestration from scaling.
Construction, accommodation, and transportation face the longest path. Core work in these sectors is still largely undigitized, meaning the data on which Agentic AI would act does not broadly exist in an accessible form. The bottleneck is not AI readiness but foundational digitization, which has to come first.
Different starting points produce a K-shaped divergence. Firms and industries with mature data architectures can adopt agentic capabilities incrementally and at low marginal cost because the foundational work is already done. Those without that foundation face a step-change in time, effort, and capital before meaningful deployment is possible. Within industries, large enterprises absorb the cost of modernization while smaller operators—in real estate, for instance—are priced out of early adoption. The result is not a lag in timing but a structural gap in competitive capacity, and since infrastructure investment compounds, the gap widens rather than closes on its own.
Implications
For CEOs: sequencing, not procurement
The most common mistake organizations make is treating Agentic AI adoption as a procurement decision. Tools are selected, budgets approved, and pilots launched before anyone has audited the data systems on which those tools will depend. The result is reliably disappointing, not because the technology fails but because the conditions for its success were never established.
The correction is sequencing. Data infrastructure assessment should precede tool selection, not follow it. Before choosing an agentic platform, an organization needs a clear picture of where its data lives, how it moves across systems, where governance gaps exist, and which workflows span enough system boundaries to be viable candidates for deployment. The assessment is not glamorous and rarely surfaces in vendor conversations, but it is the highest-leverage input into whether a deployment will scale.
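As a purely illustrative sketch, not a prescribed methodology, here is what recording that audit might look like, scoring candidate workflows on the dimensions named above. Every field, threshold, and example record is an assumption made for the sake of the illustration.

```python
from dataclasses import dataclass

@dataclass
class WorkflowAudit:
    name: str
    systems_touched: int     # how many platform boundaries the workflow crosses
    data_locatable: bool     # do we know where every input lives?
    governance_gaps: int     # unowned datasets, missing access rules, no audit trail
    schema_consistent: bool  # do the systems agree on shared fields?

    def deployment_ready(self) -> bool:
        """Illustrative threshold: viable candidates cross real boundaries,
        but only over located, governed, consistent data."""
        return (
            self.systems_touched >= 2
            and self.data_locatable
            and self.governance_gaps == 0
            and self.schema_consistent
        )

audits = [
    WorkflowAudit("invoice matching", 3, True, 0, True),
    WorkflowAudit("customer onboarding", 4, False, 2, False),
]
for a in audits:
    print(a.name, "->", "candidate" if a.deployment_ready() else "fix data first")
```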
Governance readiness is the other half of the problem. Only around 20% of firms currently have mature AI governance models in place. For the remaining 80%, scaling agentic systems without first embedding frameworks around data access, decision accountability, and audit trails is not a shortcut; it is deferred cost. Governance debt compounds faster than technical debt because it accumulates through every decision the agent makes, and remediating it usually requires reconstructing a record that was never properly kept. Firms that scale first and govern later will spend the next cycle unwinding what they deployed.
For policymakers: getting ahead of concentration
If Agentic AI’s gains concentrate in large, data-mature firms within already-digitized sectors—and readiness data suggests they will—the consequences extend well beyond corporate competition. Market concentration accelerates, regional inequality widens, and labor displacement falls hardest on workers in industries already losing ground. Anthropic’s Economic Index has flagged that AI usage is already concentrated in prosperous regions and automation-ready sectors, with existing inequalities poised to widen rather than narrow.
The Inter-American Development Bank’s 2025 assessment adds a useful distinction for policymakers outside the AI frontier. Building new models is not the same as deploying existing ones productively. A country unlikely to produce frontier systems can still capture significant value from adopting them, provided its data interoperability and governance clear a minimum threshold. For most developing economies—and many regions within developed ones—the highest-return intervention is not AI development but the infrastructure that makes adoption viable.
The policy challenge, then, is not to slow adoption among firms that are ready but to close the gap for those that are not. The OECD identifies four concrete levers: targeted skills programs for small and medium-sized enterprises, where half of small firms report insufficient AI skills; public financing for data infrastructure; broadband and foundational digital infrastructure in regions where core work has not yet been digitized; and standardized governance frameworks that reduce compliance costs for smaller organizations. None of these are exotic. They are standard industrial policy tools applied to a problem whose economics—high fixed costs and compounding returns to scale—resemble those of earlier infrastructure transitions.
The question is whether policymakers will treat the gap as urgent while it is still narrow enough to close.
Data infrastructure is the threshold condition for Agentic AI, but it is not the destination. Organizations that treat infrastructure as the finish line will end up with capable systems and no clear sense of how to use them well—responsibly, competitively, or in ways their customers will accept. The CEOs and policymakers who succeed will be those who treat infrastructure, governance, and the social license for autonomous systems as a single, integrated problem rather than a sequential checklist. The work is harder that way. However, the triad approach is also the only one that scales.
The challenge of getting disparate systems to communicate is overcome as much by awareness and intention as by money. As Rudyard Kipling advised in his 1889 poem “The Ballad of East and West”:
Oh, East is East, and West is West, and never the twain shall meet,
Till Earth and Sky stand presently at God’s great Judgment Seat;
But there is neither East nor West, Border, nor Breed, nor Birth,
When two strong men stand face to face, though they come from the ends of the earth!
Sadly, however, Kipling has been widely misunderstood by those who quote only the 14 words of the opening line and overlook his optimism about overcoming cultural communication barriers. Surely, commercial barriers can be transcended even more easily.
This article is part two of a four-part series from the Yale Chief Executive Leadership Institute (CELI) on the state of Agentic AI adoption across industries and sectors. The research is designed to help CEOs understand the current and expected pace at which agentic systems are being deployed—and the strategic decisions that pace forces on them. Over the past six months, CELI researchers analyzed hundreds of company materials and industry analyses and conducted dozens of conversations with senior technology leaders across the U.S. The industries analyzed include Financial Services, Consumer Packaged Goods, Food & Beverage, Healthcare, Insurance, Manufacturing, Professional Services, Real Estate & Housing, Retail, Supply Chain & Logistics, Telecommunications, and Travel & Hospitality, as well as the public sector. The series examines four implications of the findings: labor market effects, data infrastructure readiness, governance and regulatory policy, and customer experience.
With research contribution from Dan Kent, Holden Lee, Johan Griesel, Andrew Alam-Nist, Peter Yu, Yevheniia Podurets, Jasmine Garry, and Christian Ruiz Angulo
The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.