Mansour Karam, GVP of Products, Data Center, Juniper Networks

Seize the AI Moment with Juniper Networks

Cloud Field Day 20: Seize the AI Moment

Mansour Karam sets the table for Cloud Field Day 20 with a talk about how AI is the next great tech transformation, why the data center sits at the heart of it, and how Juniper’s AI data center solution is best positioned to help customers achieve their AI goals.

You’ll learn

  • How the AI data center sits at the intersection of AI for networking and networking for AI

  • How Apstra and Marvis VNA for Data Center help operate and manage AI data center networks

  • Why the network is the most critical part of the AI data center

Who is this for?

Network Professionals, Business Leaders

Host

Mansour Karam
GVP of Products, Data Center, Juniper Networks

Transcript

0:10 I am Mansour Karam and the global VP for data center here at Juniper and

0:17 uh I know it's a bit early but I hope you're as excited as I am I mean that's

0:22 just been a feeling of excitement these days that's happening pretty much every day there is always new stuff

0:29 happening every day when was the last time we had 30,000 people cheer for a

0:34 new ASIC I mean the last time I remember was with x86 possibly like the x86

0:42 and then we had the 286 we were excited and 386 and then when Pentium came out I know I'm aging myself but you know

0:50 that's the feeling I'm getting here it feels like it feels like that same level

0:55 of uh of excitement and indeed every morning you wake up there is something new and exciting that

1:00 you can use like I've installed GPT-4o on my phone and I've become kind of

1:05 dependent on it I ask it questions every day and I didn't know that thing existed just a couple of months ago

1:12 I went to LinkedIn to look at my blog that I'd written back in January I'm sure all of you guys read my blog in

1:18 preparation for this and that was the blog I

1:24 wrote when we launched our AI-native platform and what I saw was like this little square on the side and that was

1:32 LinkedIn kind of nicely summarizing the blog for me and I read it and it

1:38 felt like I couldn't have written it any different or any better even like the

1:43 highlighting the fact that we were the first to launch an 800 gig platform it was right there written by AI I

1:51 mean it's just one of those things where every day there is something exciting and it feels like we should be all uh

1:59 surprised that we are surprised but at the same time we shouldn't be I mean that's just been that kind of incredible

2:04 cycle of innovation that we've had since we've had Moore's Law 50 years ago or

2:11 60 years ago I can't count to this day where you know

2:17 Moore predicted that we would be doubling compute power every two years

2:23 in fact what happened is that we kind of doubled every 18 months and what it means is incredible innovations and

2:28 once in a while amazing crazy breakthroughs and I was very

2:35 grateful that since I came to the US back in 1995 and that was in September

2:41 of 1995 which was actually the same month Netscape went IPO um I've been

2:47 witnessing some of these inflection points some of these great cycles of innovation starting with of course the

2:53 internet you know every company trying to go online that was the web Revolution and then the cloud Revolution when these

2:59 services became a lot more sophisticated and then the mobile Revolution we always remember Steve Jobs holding the iPhone

3:06 introducing the iPhone it became clear at that point that everything we do we can do it just from our phone and then

3:13 after that we had the digital uh transformation it was accelerated by the pandemic where we realized not only can

3:20 we do everything from our phone but now we can do it from the comfort of our own homes and all the way to today where

3:26 we're seeing this massive great AI Revolution and so and bringing it back to networking uh look at the evolution

3:33 of Ethernet during this period when I was at Stanford in 95 100 meg was kind

3:39 of like the screaming latest exciting state-of-the-art and then one gig kind of

3:45 fueled the web revolution and then 10 gig fueled the cloud revolution and then after that we had 40 gig 100 gig just I

3:53 remember it was a year or two ago we were sitting with analysts saying hey 400 gig is arriving but are there use cases

4:01 for 400 gig you know maybe it's going to take longer for adoption of 400 gig

4:06 and then we get AI now we skip 400 gig we go all the way to 800 gig that's shipping today and we're talking about

4:13 1.6 and 3.2 terabits per second and so if you're like me kind of at the

4:18 intersection of networking and AI or like Juniper I mean of course Juniper was founded in 97 as part of the web

4:26 revolution we've been experts at networking for many many years and then since the acquisition of Mist and

4:33 Apstra Juniper realized the importance of AI I would say you know very early and we all became AI experts so if

4:41 you're at the intersection of networking and AI it's an exciting place to be because really AI at the core of it the

4:47 engine driving AI is the data center and the core of that data center is the network the network is kind of the

4:54 neural network connecting all of the components that make up these AI

4:59 clusters but not only that AI is also transforming the way we operate these

5:06 networks and so again if you're at the intersection of networking and AI it's a really exciting place to be and it's

5:14 the right time to kind of revolutionize networking and lead the AI

5:19 networking wave and hopefully what you'll see today is a lot of examples of how we're doing that and in fact you

5:26 know back to that blog that's when we launched our AI native platform which was kind of

5:32 like one big step in this direction and so this AI native platform is quite

5:37 ambitious but at the end it starts with being AI experts really understanding

5:44 first of all what data one has to collect and at Juniper we've been collecting data for many many years and

5:50 we've become experts at knowing what specific data one has to collect and then becoming experts at what models to

5:57 feed the data into in order to provide the right insights to our network operators or to automate

6:05 the network in the most powerful way but it's also about building the right

6:10 infrastructures to support AI workloads so having the right secure infrastructure for AI and I'd say it's

6:18 pretty all-encompassing you know we play in many domains all the

6:23 way from campus to data center which is the focus today to the WAN and AI

6:29 native spans all the way from client to cloud and I'm going to

6:35 talk about these two pillars today AI for networking and networking for AI AI for networking is how we're

6:41 leveraging AI to transform how we operate networks that's kind of like the

6:47 first pillar and the second pillar is networking for AI which is what type of

6:52 infrastructure networking infrastructure is needed to deliver on these AI use

6:59 cases and so let's start with AI for networking for data center in the

7:06 context of data center for Juniper the journey started with the acquisition of Apstra and I was the co-founder of

7:13 Apstra we started Apstra back in 2014 we celebrated recently our 10-year

7:19 anniversary and we've been part of Juniper for more than three years now and what Apstra set itself to do is

7:27 to deliver this powerful automation of data center networks end

7:33 to end the entire life cycle from designing to building to deploying and to operating data center networks and we

7:40 did it with this innovative approach that we invented called intent-based

7:46 networking we'll get a bit more into that but before I describe the technology what I want to say is that

7:54 let me just tell you some of the outcomes we're seeing with our customers starting with I would like to focus on reliability first the reason we

8:01 started Apstra is we wanted an automation tool an automation solution

8:07 that Network operators could trust software that they could trust will run

8:13 their networks with them with utmost reliability like an autopilot on a plane

8:18 you tell it to go to 10,000 ft and then it will deterministically go to 10,000 ft you know it's going to work in fact

8:25 if you build your software right you should get much better reliability an order of magnitude more reliability than

8:32 if you were doing things manually and indeed that is the outcome we're seeing with our customers an order of magnitude

8:38 Improvement in reliability that's number one number two automation should let you

8:44 run a lot faster not just more reliably but also a lot faster and so we've seen

8:49 deployment times upgrade times which have been accelerated 10-fold we've had

8:55 customers upgrade their global networks worldwide across

9:00 multiple vendors just by a click of the button within one morning and then

9:06 last but not least we want to empower these Network operators so they can do a

9:11 lot more with less we have network teams that maybe are just like a

9:17 handful of folks running networks this is a bank that I'm thinking about in Europe running a network that is

9:23 supporting thousands and thousands of users and these are stories that we see in and out and so

9:31 that is the type of outcome we see with our AI based solutions with Apstra as

9:38 part of us automating data centers 10x

9:43 outcomes Mansour you talked about experience first

9:49 I think was on one of the earlier graphics you showed and then here you have operations first which made me

9:55 suddenly wonder if you mean operator experience first or if you mean application user experience first a

10:03 great question in fact so when we say experience first and this is kind of

10:08 like a Juniper slogan I would say we're focusing on both we're saying

10:13 it is about the network operator so it's the experience of the network operator you know so we don't want a network

10:19 operator to be spending way too much time and then doing things with the

10:24 worry of and exposure to risk of making errors right so like

10:30 we're improving the network operators' lives by giving them solutions that allow them to run very fast but

10:37 doing it very reliably but also at the end the network is there to your point to support the users and the apps and we

10:44 want the user experience and the users using those applications we want

10:49 their experience to also improve dramatically yeah I don't claim to have a huge audience but anybody

10:56 who listens to me for 10 seconds knows that um I think that the functioning of the

11:01 network and its management is fundamental if not foundational to the user

11:07 experience absolutely I mean the network touches every component and we'll talk about that a bit more and

11:13 it's so frequently ignored I mean by the lay person let's say the lay tech person it's taken for granted

11:21 or not thought of I agree it's one of those things networking is one of those things you only remember your

11:28 network when things go wrong right otherwise you're taking it for granted this is why we say you know the

11:33 best thing for the network is that you know people don't know about it because then you know it's working well right

11:40 and so I completely agree with you the network is critical to making the user experience and the app

11:47 experience be what it needs to be yeah absolutely and so back to Apstra and

11:54 just want to talk a bit more about intent-based networking and where we're taking this technology uh the idea

11:59 behind intent-based networking was that it's a layer of software the way you interact with it is that you tell it

12:05 what you want like for example I'd like to add a new rack or I'd like to add a new virtual network or

12:12 I'd like to add a security policy and then what the software does is that it makes it happen 100% reliably it does it

12:21 by configuring the devices of course but it's not just configuring the devices it's collecting all of the

12:27 telemetry the state from your network building a whole representation in the distributed data store that captures all

12:33 of the relationships in a graph and then by having all of this knowledge it can pre-validate whether a

12:40 configuration change is going to deliver on the right outcome and then it can test post deployment

12:47 whether or not a change that you've made indeed at the end delivered on your outcome so it's a closed loop

12:53 system it's a closed loop system that is collecting Telemetry and interacting with your network in real time does it

13:01 have a rollback to that then thank you because everything is described in software every time you make a change

13:07 we're able to keep a snapshot of the entire network if you're familiar with Juniper's rollback function that's on a

13:14 per device basis this is a rollback function at the entire network level

13:20 the entire network and not only can you roll back to where the network was prior to this change you just made you

13:27 can roll back to whichever version of your network you want you know if you want to roll back to the way your

13:32 network was on Tuesday a week ago it's a click of a button so how backward

13:38 compatible could you get so like if you had an

13:44 update how do I put this if you have an update that all of a sudden

13:49 these backups just can't work because you have to roll back the software to make the backups work yes is there a plan

13:58 for that or no well so essentially this is a great question so what I'll say is with Apstra it used to be

14:05 in the past let's say 20 years ago that when you built automation you were still manually interacting with your

14:11 automation it was like through the web or something of course we have that with Apstra you can do this you have

14:16 a web interface it's easy to use etc but really it's an API layer

14:21 everything you do is through the APIs and the type of thing you just described is orchestrated through the API this is

14:27 why I have here Terraform and ServiceNow and what you're going to see through the demos today is how we've leveraged

14:33 those technologies to essentially implement some of the workflows you're describing and so you would have let's

14:39 say you're making a change because you want to upgrade your applications so you could use something like ServiceNow and

14:45 you could orchestrate all of that and then there is a component of that which is the network that's going along with it and let's say your network is

14:51 upgraded and you're done with that upgrade but the application team is not ready and so even though everything was

14:58 successful on the networking side you have to roll back because your application guys haven't tracked with you

15:04 and that's all orchestrated through something like ServiceNow and that's the importance of having this API layer
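
To make the network-wide rollback described above concrete, here is a minimal sketch in Python. It is an illustration only and not the Apstra API: the `NetworkHistory` class, its methods, and the device names are all hypothetical. It assumes one full snapshot of every device is recorded per commit, so an orchestrator can return the entire network to any earlier version in a single call.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class NetworkHistory:
    """Hypothetical store of whole-network snapshots (not the Apstra API)."""
    current: dict = field(default_factory=dict)    # device name -> config dict
    versions: list = field(default_factory=list)   # one full snapshot per commit

    def commit(self, changes: dict) -> int:
        """Apply per-device changes, then snapshot the ENTIRE network."""
        for device, config in changes.items():
            self.current.setdefault(device, {}).update(config)
        self.versions.append(deepcopy(self.current))
        return len(self.versions) - 1              # version id of this commit

    def rollback(self, version: int) -> None:
        """Restore every device to the state recorded at `version`."""
        self.current = deepcopy(self.versions[version])


history = NetworkHistory()
v0 = history.commit({"leaf1": {"vlan": 10}, "leaf2": {"vlan": 10}})
v1 = history.commit({"leaf1": {"vlan": 20}})               # network upgrade, step 1
v2 = history.commit({"leaf2": {"vlan": 20, "mtu": 9000}})  # network upgrade, step 2

# The application team is not ready, so the whole network goes back to v0.
history.rollback(v0)
print(history.current)
```

Storing a full copy per commit is the simplest possible design; a real system would more likely store diffs, but the interface an external orchestrator sees (commit returns a version, rollback takes a version) would look similar.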

15:10 and that's also why what's really important here is that you want this API layer to be vendor agnostic you want a

15:18 loosely coupled architecture between your choices southbound and your choices

15:23 northbound and with Apstra even though we're part of Juniper we are still and we will continue to be multivendor for

15:30 exactly that reason you don't want your choice of ServiceNow or Terraform in

15:36 the northbound and all of the implementation you've done to be locked into the choice of the vendor that

15:43 you've chosen southbound okay and so part of that technology is giving you

15:48 that loose coupling that is so critical when you have an API layer like you've described so my two questions

15:56 um there you mentioned that you know you're talking to this solution almost in a natural language is that what

16:02 you're using to say I want to deploy some portion of the network or is that something that's you

16:09 predicted my next slide okay I mean a second question is are

16:15 you simulating the network change before it's made or are you checking the network change after it's made to see if

16:20 it's still viable or are you doing both I'm loving these questions so we do both

16:26 you know how I said we have a distributed data store in fact we have two of them right so there is one which is for

16:32 the staging of your changes and we have one that is for your active network

16:38 unless you're making changes they're exactly equal to each other they're mirrors but then when you

16:44 start making changes you're making changes in the staging area and

16:50 when you're making changes in the staging area this is where all of the pre-validation happens it's making sure

16:56 that in the context of everything it knows and the context of the intent you have the changes you're proposing make

17:03 sense and it will not let you commit if it doesn't compute if these things don't

17:09 compute if the pre-validation doesn't pass and when the pre-validation passes it will allow you to commit and

17:15 then you press commit and hopefully today in the demos you're going to see some of this when you press commit the state from the staging

17:23 data store goes into the active state at which point the

17:28 configurations are pushed at which point the telemetry is collected and the tests are run on your real network to post-validate

17:35 that indeed the network change happened as you expected yeah great

17:42 questions and so can I jump in there to finish that thought how

17:48 much control does the operator see and have over each step in that process

17:57 you know we don't want our AI overlords just like rolling through here and taking over so when

18:04 you're at that same point when you're about to commit you have

18:10 like a whole description to every last detail of every change that's occurring

18:16 in your network and you can go and review these yourself in fact we have customers that you know they have like a

18:22 step there where a human goes and you know as part of the process they review everything and it's only when they

18:28 review everything that they press commit right and then again after you commit you have an ability to roll back when

18:34 you roll back you have a chance you have an opportunity to say okay well actually I want to do this again

18:40 but I want to do this extra change so you have tons of flexibility in how you do this and I could go in there and say

18:46 these changes I'm I'm confident in the system's ability to detect that it's correct and I'm going to just whatever

18:52 it says is okay these require an operator to say yes you can continue
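
The staged-commit workflow discussed in this exchange (a staging copy that mirrors the active network, pre-validation before commit, an optional human approval step, and post-validation against telemetry) can be sketched roughly as follows. This is a toy model under stated assumptions, not how Apstra is implemented: the `Fabric` class, its method names, and the two VLAN checks are hypothetical stand-ins for the much richer validation the talk describes.

```python
from copy import deepcopy


class Fabric:
    """Toy model of a staged-commit workflow; all names here are hypothetical."""

    def __init__(self, intent: dict):
        self.active = deepcopy(intent)    # what the live network is running
        self.staging = deepcopy(intent)   # mirrors active until a change is staged

    def stage(self, key: str, value) -> None:
        self.staging[key] = value         # changes only ever touch staging

    def diff(self) -> dict:
        """Every pending change as (old, new), for a human or a policy to review."""
        return {k: (self.active.get(k), v)
                for k, v in self.staging.items() if self.active.get(k) != v}

    def prevalidate(self) -> list:
        """Return a list of errors; an empty list means the change 'computes'."""
        errors = []
        vlans = self.staging.get("vlans", [])
        if len(vlans) != len(set(vlans)):
            errors.append("duplicate VLAN id")
        if any(not 1 <= v <= 4094 for v in vlans):   # valid 802.1Q VLAN id range
            errors.append("VLAN id out of range")
        return errors

    def commit(self, approved: bool) -> None:
        errors = self.prevalidate()
        if errors:
            raise ValueError(f"pre-validation failed: {errors}")
        if not approved:
            raise PermissionError("change requires operator approval")
        self.active = deepcopy(self.staging)   # staging becomes the active state

    def postvalidate(self, telemetry: dict) -> bool:
        """After the push, compare observed state with the committed intent."""
        return all(telemetry.get(k) == v for k, v in self.active.items())


fabric = Fabric({"vlans": [10, 20]})
fabric.stage("vlans", [10, 20, 20])
print(fabric.prevalidate())          # the duplicate is caught before any commit

fabric.stage("vlans", [10, 20, 30])
print(fabric.diff())                 # what a reviewer would see before approving
fabric.commit(approved=True)
print(fabric.postvalidate({"vlans": [10, 20, 30]}))
```

The `approved` flag is where an organization's policy would plug in: low-risk changes could pass `True` automatically, while others wait for a person to review the diff.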

18:59 yeah I mean at the end it becomes very much based on the policies that you as an organization set up right so we can

19:05 let it go as fast as we want and stop at the critical places where we're worried

19:11 about going wrong exactly for example when I showed ServiceNow as one example like we have some customers that use

19:17 things like Slack where there is also like you know communication that's going on with the various teams through Slack

19:23 and there are protocols in terms of like who has to approve etc that is going on as part of the

19:30 workflow right and then in terms of change windows and things that we're required to maintain those can be

19:36 programmed in so we make sure operators don't kick things off outside of legal change windows yeah so again

19:43 this is done through the API I would say we have tons of role-based access controls so like we have different users

19:49 and we have very granular control in terms of what they can access

19:55 right these specific users so you have super admins you have admins you have like the folks that are only doing design and

20:01 then you can be very granular in those authorizations but then using the

20:06 APIs you can also restrict like access to the platform yeah but you know actually with change

20:13 windows one thing I'll say is there was one customer just an anecdote that had this change window on Saturday

20:18 through the whole weekend and they started deploying on Friday and then they called me on Friday afternoon I

20:24 mean clearly they were a bit I would say you know they were very excited and emotional and they were like Mansour

20:30 I just want to call you because I had reserved the whole weekend to do this but then it's Friday noon I'm done you

20:36 know going home and spending the weekend with my family so maybe you know sometimes change Windows are for the

20:41 weak no I'm kidding I love that you're talking about

20:49 network DevOps a lot of the methodologies around continuous delivery and well continuous deployment

20:55 in particular would say there was no such thing as a change window we deploy changes including Network changes yeah

21:01 no I mean I was joking I would always recommend change windows but I mean in fact with

21:07 technology like this you can do things a lot faster the reason you have change windows is because you need to

21:12 coordinate amongst multiple teams right so that's part of you know so I'm not

21:18 recommending no change windows so I was going to just I skipped the

21:24 slide so one of the things we're doing well as I mentioned we are you know using intent-based networking to operate

21:31 your network day in and day out right um and so like this is the tool that you

21:37 use to deterministically get outcomes from your network make changes like an autopilot

21:42 on a plane as I mentioned but then in the process you're collecting all of this data and wouldn't it

21:47 be nice if you took all this data and sent it somewhere in the cloud in some AIOps layer that is not just

21:54 collecting data from the data center but also from your campus from your WAN from your SD-WAN and this way you have like

22:02 one repository for all of the state like as I mentioned collecting the right data and having the ability to then

22:08 apply models on this data and then this is where the chatbot can come in and then you have an ability to through a

22:14 conversational interface interact with your network and also this is you know

22:20 AI is probabilistic and so these things are very complementary deterministically you're making the right changes and

22:26 operating the network but then you can provide probabilistic insights in terms of hey you know I'm giving you

22:32 visibility I believe you know like if you're having an issue it's probably in that realm right so having that

22:40 deterministic approach combined with that probabilistic AI approach is extremely powerful and this is essentially part of

22:47 what we launched in January and we've delivered and we started delivering in the first half of this year okay and

22:54 we haven't stopped there the third aspect and we delivered that in Q4

22:59 of last year is application

23:05 awareness and the first step towards that is I'd say the most comprehensive

23:10 flow telemetry available on any platform but that's now part of Apstra

23:17 part of that same distributed data store you can now collect

23:23 application-level telemetry and combine it with network

23:29 state data and this way as a network operator not only can you

23:34 answer questions about whether or not your network is working well but you can do that in the context of every specific

23:41 app in fact that's really like the job of the network operator when the application team comes to them and says

23:46 hey my application is not working well you want to have the ability to tell them whether or not it's your fault it's

23:54 because of the network or not mean time to innocence again so it's never the network

24:00 it's never the network it's DNS or BGP yeah so here's the summary

24:07 right for AI for networking we started with intent-based networking deterministic control we're adding AI-driven

24:13 probabilistic insights and then we're combining this with application awareness and I dare any competitor to

24:21 have a solution that is as comprehensive as ours with the outcomes that we are able to deliver to our customers and one of

24:27 the proof points I'd say is the number of customers that are upgrading to our premium licenses we have standard advanced

24:34 and premium licenses and 68% of our customers in Q1 purchased premium

24:41 licenses to me this tells me the amount of value our customers are

24:46 getting with our solutions okay so that was AI for networking so let's now

24:53 talk about networking for AI essentially the role of the networking infrastructure we

25:00 need for AI workloads well before you move on we did most of the talking in

25:05 that first section about the deploy side do you have a similar story you could quickly share on the troubleshooting

25:12 side of those implementations where this data actually raises issues and how

25:18 the operators interact with that yeah no absolutely so well hopefully some of the demos will show

25:24 you that but we're collecting telemetry on a continuous basis and you can

25:29 with Apstra collect any telemetry that you want that you can otherwise

25:35 access and so once the telemetry is in this tool you can run any analytics

25:42 pipeline against this telemetry it's completely custom so of course we have a bunch of things that are turnkey in

25:48 the solution uh that are there out of the box but then you can customize this to your heart's content and so if there

25:56 are any areas like for example if for a given application you're seeing

26:01 packet loss on a port or you're interested in seeing if there's going to be any packet loss on a port then you can

26:07 get alarms and events for those specific occurrences and so that goes

26:12 a long way in helping you uh troubleshoot so that's number one and number two we're now shipping all of the

26:18 data into the cloud and there if we see any patterns that could indicate some sort of trouble kind of lurking in the

26:25 future like for example for transceivers we collect the

26:30 state for these transceivers and we know a signature of a transceiver that is about to fail right so we don't wait for

26:38 the transceiver to fail to give you an alarm so that you go and replace the transceiver we notify

26:45 you prior to the transceiver failing so that now you have an ability to go and replace your transceivers before they

26:51 fail okay so that's all um something I can select for automatically at the

26:56 beginning which ones of these alarms we're going to raise and push up to the NOC operations for

27:03 escalation and they're all prepackaged based on your look at

27:09 the data from other sources correct yes just to be clear Apstra supports Arista Cisco

27:17 Dell any of these other solutions as well yes absolutely they manage absolutely and we have many

27:23 customers that use them across uh these different vendors yes and is it just

27:29 limited to that list or kind of anything you can key on with an API so we have a whole if you just actually

27:36 probably if you ask GPT-4 it'll tell

27:42 you well it's all documented publicly and there is a validated

27:49 and qualified list of devices which operating systems etc

27:54 we're very detailed in specifying which platforms we support yeah
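
As a rough illustration of the predictive transceiver alerting mentioned a little earlier in this exchange, the sketch below fits a straight line to recent optical receive-power readings and projects when they would cross a failure floor. The talk does not say what the actual failure signature is, so a steady decline in receive power is purely an assumption here, and `slope`, `samples_until_failure`, and the dBm values are hypothetical.

```python
from statistics import mean
from typing import Optional


def slope(samples: list) -> float:
    """Least-squares slope of evenly spaced samples (units per sample)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_bar, y_bar = (n - 1) / 2, mean(samples)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(samples))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


def samples_until_failure(rx_dbm: list, floor_dbm: float) -> Optional[float]:
    """Project how many more samples until rx power reaches floor_dbm.

    Returns None when the trend is flat or improving, so no alert is raised.
    """
    trend = slope(rx_dbm)
    if trend >= 0:
        return None
    return max(0.0, (floor_dbm - rx_dbm[-1]) / trend)


degrading = [-3.0, -3.5, -4.0, -4.5, -5.0]   # assumed readings, one per interval
healthy = [-3.0, -3.0, -3.0, -3.0, -3.0]

print(samples_until_failure(degrading, floor_dbm=-8.0))
print(samples_until_failure(healthy, floor_dbm=-8.0))
```

An alert raised while the projection is still several intervals out is what gives an operator time to replace the part before it fails; a production model would presumably use more signals than a single linear trend.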

28:02 so how far can you go in like orchestrating these different vendors and connecting them together so I'm talking

28:08 for instance like VXLAN GPO stuff like that where I will be able to like connect like different vendors together

28:15 are you able to build like a multi-site multi data center sort of fabric

28:21 across multiple different vendors you know so you know I'm a networking guy you know I've been in networking my entire career

28:27 and wasn't that the promise of networking that you should have that ability to connect these hey back here in the real

28:34 world isn't that why we've spent so much time on standards

28:40 like BGP and EVPN-VXLAN and this is why I feel so strongly about this vendor

28:45 agnostic thing right like why would you throw all of that away by bringing in a management solution that only works with

28:52 one vendor and locks you in completely we might as well have implemented a completely proprietary version of BGP

28:59 why do we need BGP if you're going to do that right I mean so absolutely

29:05 you can do that to the extent that the standards allow it right so you

29:11 know we've seen for example EVPN-VXLAN is a good example everyone strayed away a bit and it wasn't really

29:16 consistent I think we're seeing a bit more consistency but like to me we're

29:22 limited by the ability for the standards to allow for that right so that's essentially the limiting factor but if

29:29 the standards allow it then there is nothing in the solution that will prevent you so let's say like

29:34 you know I'm a client and I'm acquiring like you know companies and my data center is now Cisco I'm going to acquire

29:41 another company they are doing Arista and I'm going to onboard Arista into Apstra can I then like literally

29:49 use Apstra to sort of like connect these two data centers well you're describing exactly you know like the use case that

29:54 one of our customers has used Apstra for yeah and Apstra runs

30:00 in the cloud or runs on prem no so Apstra runs on prem and there is a really good reason for that by the way the session is running longer than we

30:06 originally expected but I'm really excited about it because of all of the questions that I'm getting and I

30:11 would like to encourage them so this is good but uh what I'll say is with

30:17 sorry repeat the question running in the

30:22 cloud there is a reason why we're running Apstra on prem the reason is we need to

30:28 hit the highest levels of reliability remember reliability is the number one goal we require an out-of-band network

30:34 for that reason you want to be as close to the network as you are the last thing you want is you're trying to make a

30:40 change and you can't access your network right having said that the AIOps layer I described is in the cloud because

30:46 that's more about providing you insights that's more optional right if you lose access to that that's not going

30:53 to affect your network okay so that's the answer okay so let's talk

30:58 about networking for AI let me motivate this by you know so we talked about the

31:04 role of networking and how it's so critical for users and applications well in the context of AI clusters that is

31:11 even more the case the network is critical in AI clusters if you think of

31:17 an AI cluster it's a really complex machine it's made of like all these different components different layers of

31:23 memory SRAM DRAM flash uh attached storage different kinds of compute you

31:30 have CPUs you have GPUs you have accelerated compute different types of networks you have the front end the back

31:36 end the storage and in fact in one of the sessions Praful is going to dig um into the details of that you have the

31:42 scale up the scale out Network you have all of these components and they all have to work in this harmonious balanced

31:50 way if you want to deliver on an optimal job completion time job

31:55 completion time is the measure the critical measure where these clusters are

32:00 measured it's the the how long it takes you to train a model right and if you're

32:07 familiar with some of these models they can train they sometimes take months to train but you know what if your GPUs are

32:14 underutilized or like your model is your your AI cluster is out of balance then

32:19 you can add many many more months to that and these GPUs that you spent like $35,000 to

32:25 $50,000 for each of these they are going to run completely underutilized which would be sad right and so the network is

32:34 critical to making sure that you get optimal performance uh to reduce to to

32:40 kind of reduce your job completion time to a minimal to to a minimum it's kind of like the system of highways and

32:46 streets like if there is congestion there if you're not load balancing your traffic the right way if you're having

32:52 accidents I mean that's you know my analogy for packet loss right then then

32:58 essentially uh you're you're dead in the water but then you may ask well you know that's true like the network is really

33:04 critical but like what does it have to do with Juniper isn't aren't these networks specialized isn't it like something like InfiniBand uh that is

33:11 running in these uh in these clusters and to me uh it's it's actually you know

33:16 uh déjà vu in so many ways because you know being in networking for a long for

33:21 a long time we've had these battles you know this is not the first time we see this right I remember back in 2008

33:28 whether it was oil and gas or uh or high frequency trading you know there was a

33:35 InfiniBand came in but at the end you know you know the saying never bet against the US I think in networking

33:41 the equivalent would be never bet against ethernet right even um Metcalfe had

33:47 to go and eat his words literally because he expected the collapse of ethernet

33:53 which never never happened he's by the way one of the co-inventors of ethernet uh and so and the reason

34:01 is pretty clear I mean ethernet is the most uh widely deployed technology where

34:07 you have most of the in most of the Investments it has the widest ecosystem as I mentioned it's really critical to

34:13 these to get these AI clusters working well and critical to that is visibility debuggability you have hundreds of

34:19 management tools uh with with ethernet you can go and tap on the the link and

34:24 you will know exactly uh what's going on on that link you know this frame format has been understood for decades and and

34:31 decades uh you also have a massive ecosystem it works with any GPU I know

34:36 Nvidia today has the lion's share of the market but for specific use cases uh you

34:42 may get much better performance with other GPUs and we've seen customers deploy you know kind of many different

34:48 options including AMD Intel and newcomers like Cerebras where for some use cases you get you know orders of

34:54 magnitude difference in performance and InfiniBand only works with Nvidia whereas ethernet is going to work across

35:01 the board and last but not least because again it's ubiquitous technology and there is so much investment at the end

35:07 the pro the price plummets much faster in fact today the total cost of ownership for ethernet is 50 time 55

35:15 times lower than with InfiniBand Mansour um so Ultra Ethernet is being

35:21 developed um particularly for this kind of high speed

35:26 the highest speed connectivity and the latency issues in the systems that you were talking about um but are you talking

35:33 about that because Ultra ethernet initially is pretty targeted in terms of

35:38 its use cases right so are you speaking broadly about ethernet as as it exists

35:44 right now or are you signaling towards Ultra ethernet with no so I'm talking

35:51 about ethernet broadly and Ultra Ethernet is yet another standardization effort on top of ethernet we saw RDMA

35:58 back in the uh over ethernet back in the days yeah RoCE etc so like there is

36:04 always going to be improvements to ethernet and that has been part of the journey with ethernet and yes uh you

36:11 know we are part of the Ultra Ethernet ultra low latency ethernet Consortium uh

36:16 and we expect there will be some good innovations coming out of there and I suspect Praful is going to talk a bit more about it in the session today so

36:24 just give you an example with Juniper of like the breadth and the performance of these ethernet solutions

36:30 with Juniper we both have Broadcom based solutions we're partners with Broadcom and we deliver some of these uh

36:36 solutions to market and then we have our our custom uh solutions a wide range

36:41 just looking at Broadcom for a second here just over the last 12 years look at the kind of incredible improvements in

36:47 performance back in 2010 we had 64 ports of 10 gig 640 gigs in one chip at the

36:53 time it felt like an incredible amount if you look at today we have 64 ports of

36:58 800 gig 51.2 terabits per second of performance in just one chip I mean this

37:06 is kind of the level of performance you get with ethernet but you're not stuck just with that you can if you want

37:11 deeper buffers if you want more programmability which you're going to need in some use cases then you have solutions for that and that's just in

37:19 the context of Juniper you have other vendors plenty other vendors out there and so hopefully I've convinced you that

37:24 ethernet is kind of like the right solution and in fact even Nvidia in their earnings call a couple of

37:31 weeks ago admitted that ethernet is going to be even for them a multi-billion dollar market and so this

37:37 is a really important market for us to go after and we're indeed very focused on it and this is what analysts are

37:44 saying uh Dell'Oro on the left they're predicting that this

37:50 backend network will be growing at 65% year to year and you can see the impact on

37:56 the total addressable market on the right here this is 650 Group you can see how it's turbo boosting the total market

38:03 for Ethernet from $22 billion today to $32 billion in just a few years and so

38:10 of course with Juniper we haven't stood idle uh since actually you know I got in the role here uh back in September we've

38:17 really accelerated all of our efforts towards kind of delivering solutions for AI clusters and remember that 51.2

38:24 terabit per second chip we were the first to announce and that's what I was describing back in January in the blog

38:30 but now it's shipping and it's being deployed in some of the largest clusters on the planet today and of course that's

38:37 just one switch amongst many switches we have a whole portfolio whether you have use cases that require that type of

38:43 performance or different types of use cases we have all of those switches for our customers that's just one aspect of

38:50 the of the of the strategy the other aspect is operations right as I

38:57 mentioned you need to fine-tune these clusters you need deep visibility into these clusters

39:03 and of course we have Apstra right and Apstra is extremely valuable in the

39:09 context of AI clusters for two reasons one is that as I mentioned you can collect any telemetry you want and run

39:16 any tests you want if you care about visibility to understand exactly what's going on you need that capability and

39:23 and Apstra provides it uniquely the second is that as I mentioned these comp these systems

39:29 are complex and you talked about ultra low ethernet uh Consortium and there is lots of new features and knobs you have

39:35 in these in these networks and you want an ability to tune these networks to get

39:41 optimal performance and you may want to do this dynamically in real time and in fact one of the demos today is going to

39:46 show you how you can use Apstra in order to tune these networks based on

39:52 what you see so essentially a closed loop system leveraging Apstra to fine-tune the network for optimal

39:58 performance so that is also something that we have uniquely and that we're delivering as part of our solution and

40:04 last but not least and you're going to actually have the chance today to visit our lab here we are becoming as I

40:10 mentioned AI experts and in that context we've built a whole Lab here for AI

40:15 we're learning every day in this lab this all all of the folks you're going to see here they've spent so much time

40:21 in the lab including myself I love to visit and spend time in that lab and

40:27 uh we are learning with our customers our customers are spending a week at a time with us tuning their networks

40:33 tuning their models in in the in the lab and we are sharing these learnings with our customers and packaging them in what

40:40 we call validated designs so we're making it our business to deliver end-to-end performance not just think of

40:47 the network as one component but figuring out how to deliver these end-to-end solutions for our customers um

40:56 and we're packaging them in these validated designs if you have a large LLM we know like this is the net this is

41:02 the the the cluster you need this is the cluster design you need you have a smaller uh network maybe you want it for

41:08 for for to to to find to fine-tune a model that's uh a EV validated design a

41:15 self-driving network model that's EV validated design inference EV validated design think of them as blueprints for

41:22 these different use cases turnkey validated designs so that you don't have to choose between either I have to

41:29 assemble it all myself because I'm choosing the components or I get a turnkey solution you can get a turnkey

41:34 solution while getting all of the flexibility of being GPU agnostic or

41:40 vendor agnostic and so I'm going to leave you with this uh this is a a survey that

41:46 Barclays did uh it has been you know it's it's been conducting this survey over many many years and in fact it was

41:53 Michael Dell that shared this on his uh on his Twitter uh on his Twitter feed um many years ago uh uh when you asked

42:02 CIOs if they were thinking of repatriating their workloads only 43% said that they were if you ask this

42:08 question today 83% are saying they want to repatriate their workloads into the private cloud into on prem and there are

42:16 two reasons for this one is with AI data is becoming of primordial importance and

42:24 customers are a lot more hesitant to ship away this data in areas where they

42:30 lose control of it and it becomes super expensive if one day they would have to think about bringing that data back and

42:36 so they are increasingly building private clusters for their AI uh for

42:42 the AI workloads that's number one and number two with solutions like we've

42:47 described today like Apstra like AI for networking it has been it is much easier

42:53 than ever before to manage and operate uh private clouds um and so actually one

43:00 of the demos today will be about how you can manage a a private Cloud as simply

43:06 as you would a public cloud and so for those two reasons we're seeing this massive interest with CIOs in building

43:14 out their own private data centers and so it's a really uh interesting time it's an exciting time we're seeing

43:20 progress every day we're now deployed in more uh than 70 countries and we're

43:26 seeing more and more opportunities every day in this in these very exciting times and so that's
