GPYOU: Building and Operating your AI Infrastructure with Juniper Networks

GPYOU – Building and Operating Your AI Infrastructure

AI infrastructure is a critical but complex domain, and IT organizations face pressure to deliver results quickly. Juniper Networks presents Juniper Apstra as a solution that streamlines the management of AI data centers with proven designs. Kyle Baxter emphasizes the necessity of a robust network foundation for AI and ML workloads and highlights the shortcomings of traditional network management tools, which often overwhelm users with data and make it hard to pinpoint root causes and resolve issues efficiently.

Juniper addresses these challenges by offering a comprehensive solution built on the Apstra platform. This platform features a contextual graph database, intent-based networking, and a vendor-agnostic design approach. Combined with Mist AI and the Marvis Virtual Network Assistant, Juniper aims to provide a holistic view of the data center, moving away from managing individual switches to focusing on delivering desired outcomes. This approach simplifies the complex network, allowing for precise identification of root causes, related symptoms, and impacted applications or training jobs.

The presentation focuses on managing the training side of AI and ML clusters. It highlights Apstra's global capabilities to manage various data center networks, including back-end, storage, and inference networks, for large enterprises. Juniper offers designs and flexibility to manage any network design using a single tool. The key takeaways are the ability to design, deploy, and assure network operations, utilizing Juniper's leading switching portfolio and security solutions. This aims to provide a streamlined, efficient, and reliable AI infrastructure management solution.

Presented by Kyle Baxter, Head of Apstra Product Management, Juniper Networks. Recorded live in Santa Clara, California, on April 23, 2025, as part of AI Infrastructure Field Day.

You’ll learn

  • How Juniper addresses common AI networking challenges

  • What the rest of the GPYOU series will cover

Who is this for?

Network Professionals, Business Leaders

Transcript

0:00 so my name is Kyle Baxter I head up the

0:03 Apstra product team today we're going to

0:06 be talking about GPYOU and how you are

0:10 going to be able to manage and deploy

0:13 your AI

0:15 infrastructure so we're going to walk

0:17 you through your AI journey we're going

0:20 to give you a quick intro on what is

0:23 Apstra for those that don't know and

0:26 then we'll get into how you use Apstra

0:29 to design your AI data center how you

0:35 manage your data center at scale for

0:38 deployment and how you operate your data

0:43 center and so let's get into a little

0:46 intro on what is

0:50 Apstra so I think everybody will agree

0:53 here that you can't build a great

0:56 building without a great foundation so

0:58 if you have a weak foundation you're

1:00 going to have problems and the network is

1:03 the foundation in AI and ML training and

1:06 inference jobs the problem is the

1:10 network is the problem there's always

1:13 lack of insight how do you know why

1:15 your training jobs are slow why are you

1:17 getting

1:18 congestion how do you deploy faster when

1:21 there's business needs to go deploy you

1:24 know new fabric or expand faster how do

1:26 you get that speed and get that reliably

1:28 um and then how do you prevent outages

1:32 and make sure your network is operating

1:35 reliably this is where Juniper comes

1:38 in um because the network is complex

1:42 there's so many elements that make up a

1:45 network and when you look at you know

1:47 spines leaves and all the links and

1:49 we're amplifying that

1:52 with AI when we're talking about 800 gig

1:55 networks and in the future 1.6 and

1:57 beyond that are going to get some crazy

1:59 speeds that if anything goes wrong it's

2:01 going to go wrong fast and the problem

2:04 with most network management tools out

2:07 there is they flood you with data which

2:10 is nice but then you're looking for the

2:12 needle in the haystack you're just

2:13 looking through all these data points

2:15 trying to figure out what means what and

2:18 that's what we've done differently at

2:21 Juniper we've brought together our

2:24 technology together to build something

2:26 different and something better and so we

2:30 brought Apstra which brings the

2:33 industry's only contextual graph

2:35 database with intent-based networking and

2:38 designs with a vendor-agnostic approach

2:41 we've combined that with Mist and the AI

2:44 there to bring the industry's only

2:46 AI native platform and Marvis virtual

2:49 network assistant to the data center and

2:52 so we want to look at the data center a

2:54 little differently rather than as a

2:56 collection of individual switches that

2:57 you're configuring one by one we want to

2:59 look at it holistically as a solution on

3:03 how we can deliver the right

3:06 outcomes and we do that by cutting

3:09 through the complexity instead of seeing

3:13 all these random events and dots and

3:15 trying to figure out what's going on

3:16 Apstra cuts through that complexity by

3:19 bringing context to that data we can

3:22 pinpoint exactly what's the root cause

3:25 that you need to address and fix but

3:27 what are the related symptoms that

3:29 you can ignore that you don't need to

3:32 worry about that you can shift all that

3:33 aside and then what are the impacted

3:36 applications or training jobs because of

3:39 that issue that we have pinpointed we

3:41 can do all of that for you and we'll

3:43 walk through a lot of that today and

3:47 what this delivers is a complete

3:49 solution from design deploy assure that

3:54 can run on our industry-leading

3:57 um switches and switching portfolio with

4:00 the integrated solution from security

4:02 that we talked about in another session

4:05 and be able to bring that to you

4:10 and we're going to focus a lot on

4:13 the training side of AI and ML clusters

4:16 today but I want to make sure it's clear

4:19 that Apstra can manage more than that

4:21 we've been managing data centers

4:24 all over the globe for some of the

4:26 largest enterprises out there that we

4:29 can do not only the backend networks

4:30 that we'll talk about today but storage

4:32 networks inference networks or any kind

4:34 of other data center network we have the

4:37 designs we have the flexibility to manage

4:39 any design and you can use one tool and

4:42 we'll see that here in uh in a later

4:44 session we talk about operating on how

4:47 we can use one tool to manage multiple

4:50 different networks
