AI networking success requires deep, real-time observability

Opinion
Aug 28, 2025 | 5 mins

Preparing networks for AI traffic? Don't neglect network observability. It's not just a technical upgrade; it's a predictor of success.

As enterprises adopt AI applications, network infrastructure teams are scrambling to optimize their networks for AI traffic, from the data center and the cloud to the WAN edge. Most projects are focused on transforming data center networks and accelerating AI traffic across the WAN. However, there is another crucial piece of the puzzle that project leaders should consider: network observability.

Only 47% of enterprises believe their network observability tools are fully prepared to monitor and manage AI traffic, according to Enterprise Management Associates' (EMA) research report, Readying Enterprise Networks for Artificial Intelligence. This finding should serve as a bright red flag for any AI project leader. AI workloads are notoriously sensitive to latency, packet loss, and congestion. They generate unpredictable bursts of traffic and require seamless connectivity across data centers, clouds, and edge environments. Without deep, real-time visibility into network performance, AI training and inference jobs will fail.

Network observability is key to AI readiness

EMA based its new research on a survey of 250 IT professionals currently engaged in preparing network infrastructure and operations for AI projects. The report shows that companies with fully prepared observability tools are five times more likely to expect success with their AI networking strategies. These organizations tend to have:

  • An AI center of excellence guiding strategy
  • Significant IT budget allocations for AI
  • Fewer concerns about compliance and privacy risks

In short, observability isn't just a technical upgrade; it's a predictor of strategic success.

Where visibility matters most

This research also found that AI workloads are distributed across hybrid architectures, residing in private data centers, public clouds, and edge computing environments. EMA believes that end-to-end network observability will be essential to the management of AI networks.

Our survey found that most network teams are trying to improve observability of AI networks by focusing on four general domains of their networks. The biggest priorities are improved network visibility in public cloud networks and the cloud interconnects that provide connectivity between enterprise networks and their cloud providers. Our research also found that enterprises are looking beyond the big three hyperscalers: AWS, Azure, and Google. They're also placing AI workloads with emerging GPU-as-a-service providers, which will have less mature mechanisms in place for supporting network observability, posing a challenge to visibility.

Most research participants also told us they need to improve visibility into their data center network fabrics and WAN edge connectivity services.

The need for real-time data

Observability of AI networks will require many enterprises to optimize how their tools collect network data. For instance, most observability tools rely on SNMP polling to pull metrics from network infrastructure, and these tools typically poll devices at five-minute intervals. Shorter polling intervals can adversely impact both network performance and tool performance.

Sixty-nine percent of survey participants told EMA that AI networks require real-time infrastructure monitoring that SNMP simply cannot support. Real-time telemetry closes visibility gaps. For instance, AI traffic bursts that create congestion and packet drops may last only seconds, an issue that a five-minute polling interval would miss entirely. To achieve this level of metric granularity, network teams will have to adopt streaming network telemetry. Unfortunately, support of such technology is still uneven among network infrastructure and network observability vendors due to a lack of industry standardization and a perception among vendors that customers simply don't need it. Well, AI is about to create a lot of demand for it.
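To make the polling gap concrete, here is a minimal sketch in Python. It assumes a hypothetical 100 Gbps link that carries a steady 20 Gbps and is saturated by a single five-second burst. All of the numbers are illustrative, not survey data. Counter-based polling effectively reports the average rate over the polling interval, so the sketch compares that average with what per-second samples would show.

```python
from statistics import mean

# All values below are hypothetical, chosen only to illustrate the effect.
LINK_CAPACITY_GBPS = 100.0   # assumed link speed
BASELINE_GBPS = 20.0         # assumed steady-state load
BURST_GBPS = 100.0           # assumed load during the burst (link saturated)
BURST_START_S = 120          # burst begins two minutes into the interval
BURST_LENGTH_S = 5           # burst lasts five seconds
POLL_INTERVAL_S = 300        # five-minute polling interval


def per_second_utilization():
    """Return one throughput sample (Gbps) per second of the interval."""
    samples = []
    for second in range(POLL_INTERVAL_S):
        in_burst = BURST_START_S <= second < BURST_START_S + BURST_LENGTH_S
        samples.append(BURST_GBPS if in_burst else BASELINE_GBPS)
    return samples


samples = per_second_utilization()

# What a five-minute poll of an octet counter reveals: one averaged rate.
polled_average = mean(samples)

# What per-second streaming samples reveal: the peak and its duration.
streaming_peak = max(samples)
seconds_saturated = sum(1 for s in samples if s >= LINK_CAPACITY_GBPS)

print(f"Five-minute average: {polled_average:.1f} Gbps")
print(f"Per-second peak:     {streaming_peak:.1f} Gbps")
print(f"Seconds saturated:   {seconds_saturated}")
```

Under these assumptions the five-minute average comes out at roughly 21 Gbps, which looks like a healthy link, while the per-second view shows five seconds at line rate, exactly the window in which drops would occur.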

In parallel to the need for granular infrastructure metrics, 51% of respondents told EMA that they need more real-time network flow monitoring. In general, network flow technologies such as NetFlow and IPFIX can deliver data in near real time, with delays of seconds or a couple of minutes depending on the implementation. However, other technologies are less timely. In particular, the VPC flow logs generated by cloud providers do not offer the same data granularity. Network teams may need to turn to real-time packet monitoring to close cloud visibility gaps.

Smarter analysis for smarter networks

Network teams also need their network observability tools to be smarter about AI networks. For example, 59% want their tools to identify AI applications in network traffic. This will allow them to monitor AI application performance, optimize the network for AI traffic, and detect rogue AI adoption.

Many are also looking for advanced analytical capabilities tuned to AI traffic. Forty-six percent want tools that can predict and analyze AI traffic congestion, and 42% want anomaly detection tuned to AI traffic patterns. Finally, 34% want tools that can analyze traffic patterns across entire GPU clusters.

These capabilities will help network teams anticipate problems before they impact AI application performance, a necessity in environments where milliseconds matter.
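As a rough illustration of what anomaly detection on traffic samples can look like, the sketch below flags any sample that deviates sharply from a rolling baseline. It is a generic rolling z-score check, not a description of how any observability product works. The function name, window size, threshold, and traffic values are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(samples, window=10, threshold=3.0):
    """Return indexes of samples that deviate from the rolling baseline.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` samples.
    """
    history = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(samples):
        # Only judge a sample once a full window of history exists.
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(index)
        history.append(value)
    return anomalies


# Hypothetical per-second throughput in Gbps with one sudden burst.
traffic = [20, 21, 19, 20, 22, 20, 21, 19, 20, 21, 95, 20, 21]
print(detect_anomalies(traffic))
```

A detector tuned to AI traffic would need a baseline that treats the regular bursts of a training job as normal, which a fixed window like this one does not attempt.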

Observability is not optional

AI is redefining what networks must do and how network teams should manage them. Investments in real-time, intelligent, and comprehensive network observability will determine a network team's success at supporting AI adoption. As AI workloads grow in complexity and scale, effective observability will be the difference between innovation and failure.

Shamus McGillicuddy is the research director for the network management practice at Enterprise Management Associates. He has been covering the networking industry for more than 12 years as an analyst and journalist.

Prior to joining EMA, Shamus was the news director for TechTarget's networking publications. He led the news team's coverage of all networking topics, from the infrastructure layer to the management layer. He has published hundreds of articles about the technology and competitive positioning of networking products and vendors. He was a founding editor of TechTarget's website SearchSDN.com, a leading resource for technical information and news on the software-defined networking industry.

Shamus holds a BA in English and Urban Studies from Vassar College and an MS in Journalism from Boston University.