Juniper & Broadcom: Why Ethernet Always Wins
In this session from our Seize the AI Moment virtual event, Juniper COO Manoj Leelanivas and Charlie Kawwas, President of Broadcom, talk AI market trends, customer use cases, and common questions. They also share their vision of how Juniper and Broadcom are addressing the needs of the market.
You’ll learn
What networking hardware enterprise customers need for their AI infrastructure
What future innovations Broadcom and Juniper have in store
Transcript
0:00 [Music]
0:07 Host: Before we dive in, tell us a bit about yourselves, guys. Let's start with you, Ram.

0:13 Ram Velaga: My name is Ram Velaga. I run the switching and routing Ethernet business at Broadcom, and for the last couple of years I've really been fighting the Ethernet-versus-InfiniBand battle.

0:25 Host: Excellent. Ray?

0:28 Ray Mota: I'm Ray Mota, CEO and principal analyst at ACG. We focus primarily on the service provider space and the large enterprise, doing a lot of work on research but on economic modeling as well.

0:37 Host: Ethernet. It's been a hot topic lately.

0:39 Ray Mota: It has indeed.

0:42 Host: So let's jump right in. Ram, I'm going to start with you. At your most recent AI virtual event you talked about InfiniBand versus Ethernet, but I think you really set the context properly by talking about the importance of the network, especially in today's AI clusters, AI workloads, and AI applications. Can you tell us a bit about that?

1:05 Ram Velaga: Yeah, sure. If you think about AI, the first thing you have to understand is that it's a distributed computing problem. What I mean by a distributed computing problem is that you cannot take an AI workload and run it on one GPU, no matter how big your GPU is. Somebody today can come and say they have the fastest GPU; two years from now they can come and say they have something even faster. But the reality is that any particular GPU or accelerator is only as big as what a TSMC can build, or the advanced packaging you can do, or the fastest HBM you can put on it. What you really need for a machine learning or AI workload is many tens of thousands of these GPUs all acting together as if they're one very, very large computer. To do that, all of them have to be tied together; all of them have to be networked together. That's what we mean when we say this is a distributed computing problem. When you have a distributed computing problem, the network is what ties all of this together, and the network becomes the computer. Anyone can build the fastest GPU, but if they cannot tie all of these GPUs together to act as one large computer, the whole thing falls apart. That's what we mean by "the network is the computer." And if the network is the computer and you want to build the best network out there, there's nothing like Ethernet.

2:30 Host: There was a finding from Meta that I thought was really insightful, which you highlighted in your talk. Can you tell us what that finding was, about how the network is critical for making sure these GPUs don't sit idle?

2:46 Ram Velaga: I don't know if you've seen this presentation. It was presented by Meta, I believe, about a couple of years ago at OCP. What they showed were the different workloads they had, the different kinds of recommendation models, and how much time was spent in the network, with traffic going back and forth between the GPUs. It varied anywhere from 20% to almost 57%. What that means is that somewhere between 20% and almost 60% of the time, the GPUs are sitting idle, waiting for traffic to be shuffled between the different GPUs. Now think about it: these GPUs typically sell for between $20,000 and $30,000, depending on how favored a customer you are with the vendor, and sometimes more, a lot more. Then you take those and put together 100,000 of them. You do the math: that's anywhere between two and three-plus billion dollars in GPUs. If these things are sitting idle, that's a pretty expensive affair.

3:52 Host: We're talking about a billion dollars, right? Like at 30%?

3:53 Ram Velaga: Yeah, and if you say 50% of the time you're sitting idle, you have about a billion and a half sitting idle.
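To make that back-of-the-envelope math concrete, here is a minimal sketch in Python. The cluster size, price range, and idle fractions are the figures quoted in the conversation; everything else is just arithmetic, not measured data.

```python
# Idle-capital math from the conversation above. Inputs are the quoted
# figures (Meta's 20%..57% network-wait range, rough GPU street prices),
# not measurements.

gpu_count = 100_000                 # cluster size discussed above

for gpu_price in (20_000, 30_000):  # rough per-GPU price range, USD
    fleet_cost = gpu_count * gpu_price
    print(f"Fleet cost at ${gpu_price:,}/GPU: ${fleet_cost / 1e9:.1f}B")
    for idle_fraction in (0.20, 0.50, 0.57):
        # Capital effectively parked while GPUs wait on the network.
        idle_capital = fleet_cost * idle_fraction
        print(f"  {idle_fraction:.0%} idle -> ${idle_capital / 1e9:.2f}B sitting idle")
```

At $30,000 per GPU and 50% idle time, that is the "billion and a half sitting idle" mentioned above.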
4:01 Host: And in that same study, Meta actually built two different clusters, one with Ethernet. Maybe we can talk now about Ethernet versus InfiniBand.

4:08 Ram Velaga: Look, two years ago, if you talked to anybody, they said that if you were building a GPU cluster for AI and machine learning, nothing other than InfiniBand would work. I remember it: everywhere you went, it was "if it's not InfiniBand, it's not going to work," and I was sitting there scratching my head saying, that's not true. Today, when you look at it, seven out of the top eight largest clusters in the world are built on Ethernet. There is one last remaining one built on InfiniBand, but my take is that a year or a year and a half from now, that one will also be based on Ethernet. So what's happened over this two-year period? Initially you get a solution that's all purpose-built by the vendor, who says: look, you cannot touch any of this; you've got the GPU, you've got these cables, you've got the switches, all of this is pre-engineered by us, and if you touch it, the thing is not going to work. There's a lot of fear, uncertainty, and doubt, based on which customers who are in a rush to deploy these systems will just take it as somebody gives it to them. But as customers start to deploy them, they start to find out that, number one, operationally InfiniBand is very different from Ethernet, and number two, it has a tendency to literally break down quite a bit more than Ethernet, because Ethernet is built under the notion that it's going to be very scalable, reliable, and so on. So customers have gone through these experiences and said: look, I have to actually benchmark InfiniBand versus Ethernet to see if it's worth the hassle of maintaining this InfiniBand, which is very fragile. And they started to test. Meta put out this paper; they ran over 24,000-plus GPUs, tested both, and found that Ethernet was pretty good, in many cases very comparable in performance to InfiniBand, but with the operational ease and reliability you expect out of Ethernet. More and more benchmarks have been done across the industry, and that's why the industry has moved on.

6:11 Ray Mota: Yeah, and I think there's history here too. In the past, when I was a CTO, I loved ATM technology, because of the slicing and dicing of data; we had triple play back then. But I remember designing trading floors and things like that, and my boss came over and said, "I want you to try these broker workstations with Ethernet." I'm like, "Ethernet? Are you kidding me? We have ATM everywhere." But then our CFO came in: hey, it's $1,200 a NIC card for 25-Meg ATM, and it's $69 for 100-Meg Ethernet. So I learned the economics part of network design. Initially I was concerned about the architecture of Ethernet, but it just kept getting better: better speed and better efficiency. So I learned early on never to bet against Ethernet.

7:05 Host: Right. The ubiquity of Ethernet.

7:07 Ray Mota: The ubiquity.

7:08 Host: So from an economic standpoint, maybe you want to comment a bit more? I know that you did a study recently on the economics of Ethernet.

7:15 Ray Mota: For those who don't know, what we do is this: we have a software platform that's kind of like a digital twin, but it does economic simulation modeling of any architecture versus any architecture, any technology versus any technology, or any use case or application. What we did in this particular case was model Ethernet against InfiniBand. We used a similar architecture on both sides, with spine-leaf topologies, and a server environment with the DGX servers, I think the H100s. So we had the server environment, and then we had a compute network: InfiniBand in one case, and the Juniper QFX switches in the other, with the interconnections ranging between 400 and 800 gig. Honestly, some of the findings: from a capex perspective, it's about 55%, because even the ports on InfiniBand itself are twice as expensive as on Ethernet.

8:14 Host: 50% cheaper for Ethernet compared to InfiniBand.

8:17 Ray Mota: 50% cheaper; less than half the cost. Then we looked at some of the other parts, from the switching cost to the equipment cost that a lot of people forget: cables and optics, which add up over time, and there are different requirements there. And then the second part is the opex: how much does it cost to manage this environment? This is where we modeled intent-based automation with Apstra, to ask how we simplify it. So the overall TCO came out to about 56% savings over a three-year time frame.
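As a rough illustration of how a three-year TCO comparison like the one Ray describes can be structured, here is a sketch in Python. The function and all inputs (port counts, prices, opex) are made-up placeholders, not ACG's model or its data; the only figures taken from the conversation are the roughly 2x InfiniBand port-cost premium and the reported ~55% capex and ~56% TCO savings.

```python
# Skeleton of a three-year TCO comparison in the spirit of the study described
# above. All inputs are illustrative placeholders, NOT ACG's actual model data.

def three_year_tco(port_cost, port_count, cables_and_optics, annual_opex, years=3):
    """Capex (switch ports plus cabling/optics) plus cumulative opex."""
    capex = port_cost * port_count + cables_and_optics
    return capex + annual_opex * years

# Hypothetical inputs for a spine-leaf fabric at 400/800G port speeds.
# The ~2x port-cost gap mirrors the conversation; the lower Ethernet opex
# assumes intent-based automation of the kind Ray mentions.
infiniband = three_year_tco(port_cost=2_000, port_count=4_096,
                            cables_and_optics=2_500_000, annual_opex=1_200_000)
ethernet = three_year_tco(port_cost=1_000, port_count=4_096,
                          cables_and_optics=1_500_000, annual_opex=500_000)

print(f"InfiniBand three-year TCO: ${infiniband:,}")
print(f"Ethernet three-year TCO:   ${ethernet:,}")
print(f"Ethernet savings:          {1 - ethernet / infiniband:.0%}")
```

With these placeholder inputs the savings land near 50%, in the same ballpark as the ~56% figure from the study; the structure (half-price ports, cheaper cabling and optics, automation-driven opex) is the point, not the exact numbers.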
8:48 Host: Wow. So we're talking about the same performance, better reliability, and more than half the cost saved. But then, to get to that performance, I think the other critical factor is management, right? Debuggability, observability, the ecosystem of tools. Knowing what's going wrong in the cluster, or fine-tuning a cluster to make sure the parameters are all set properly from a networking standpoint, is also very important. Knowing what's going on on the link, on the wire, is important. And for Ethernet, there are hundreds and thousands of tools out there. Of course, we built one, Apstra, now part of Juniper, which works across all types of vendors, but there are many, many such tools on the market. I suspect that's also a factor compared to InfiniBand; I don't know how many tools there are to go and debug or provide observability into InfiniBand. I suspect that was a factor in your study?

9:54 Ray Mota: That was a major factor. We focus mostly on tangible benefits, but there are a lot of intangible benefits associated with it. When there's only one vendor, you're at the mercy of their timeline and their priorities, not yours, so that's a challenge. The other part is the number of skill sets available out there to support something that's set up this way. I talked about just the cost of the equipment; look at the cost of the skill sets you have to acquire. Normally, when you interview an engineer, you don't even ask if they have Ethernet skills, right? So I think those are the intangibles people aren't thinking about: how much do the skill sets to maintain it cost? Because that adds up in operational costs. More importantly, I'm concerned about business continuity, because, realize, we're talking about AI, but some of these models could be used for high-performance computing, or whatever parallel processing requires that type of environment. So there's a variety of use cases on top of AI.

10:58 Host: I mean, can you imagine: an organization has multiple networks, and the more commonality there is across these networks, the better, in terms of leveraging the workforce and the expertise. There are also some security aspects, right, Ram?

11:13 Ram Velaga: Yeah, so let's talk about security, specifically when you think about AI coming into the enterprises. What do enterprises have that's really differentiated? Their own customer data, their own analytics, how their whole business runs. A lot of that is very proprietary, and some of it is sensitive enough that they're not necessarily going to feel comfortable putting it outside their own premises. So they start to build their private AI cloud, so to say. When they start building it, there's a lot of data that goes back and forth between what is stored in their cold storage and active storage and these GPUs, which do the data analysis, crunch out coefficients, and push them back out. What this means is that it has to come under the natural security policies you already have inside your enterprise: how things are stored and secured, who you give access to what, and so forth. If you build with something like InfiniBand, which doesn't have this notion of access controls and security, you're going to be building completely different islands, and the whole idea of building something private to your enterprise and moving data back and forth is not going to work. This is where having everything on Ethernet, with one common fabric, one common set of policies, and one common set of access controls, makes all of this so much easier.

12:30 Ray Mota: I'm actually glad you brought that up, because not enough people are talking about the security aspects of this. I always say security is only as strong as your weakest link, and the more distributed your security is, the harder it is to manage and the more opportunities there are for penetration. You don't want these pockets where people don't have an understanding of what's going on in that area. So I always say: no security, no business.

12:56 Host: Excellent. We've said that Ethernet provides performance similar to InfiniBand. Maybe tell us about Ultra Ethernet. There's an effort, right? I think it's about improving the scalability of RDMA specifically. Do you want to tell us a bit about the purpose of the Ultra Ethernet Consortium and how it's going to scale Ethernet even further?

13:20 Ram Velaga: While Ethernet today does everything InfiniBand does, toe to toe, with better performance, much higher reliability, and less than half the cost, we're all thinking ahead about how you improve RDMA. Not Ethernet; how do you improve RDMA? That's where the UEC has come up with a bunch of improvements to RDMA. They allow it to do multipathing, so you don't assume there's just one path between point A and point B; there are multiple paths. Then you have efficient retransmits: if packet four got dropped but five, six, and seven got transmitted, we go back and retransmit only packet four, rather than retransmitting five, six, and seven as well. It's built on the assumption that your underlying fabric might actually fail, whereas InfiniBand and RDMA said the underlying fabric will not fail. You build resiliency knowing that failures will happen. Otherwise it's like building a skyscraper in San Francisco and saying there will be no earthquakes. That's a ridiculous assumption to make; assume there are earthquakes and retrofit the buildings. And that's what Ultra Ethernet does.
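Here is a toy sketch in Python of the selective-retransmit idea Ram just described, contrasted with resending the whole window. This is illustrative logic only, in the spirit of SACK-style loss recovery; the helper name is made up for illustration, and none of this is the actual UEC wire protocol.

```python
# Toy illustration of selective retransmit as described above: if packet 4 is
# lost but 5, 6, and 7 arrive, resend only packet 4. Not the UEC wire protocol.

def packets_to_retransmit(sent, acked):
    """Return only the sequence numbers the receiver never acknowledged."""
    return sorted(set(sent) - set(acked))

sent = [1, 2, 3, 4, 5, 6, 7]
acked = [1, 2, 3, 5, 6, 7]                   # packet 4 was dropped in the fabric

print(packets_to_retransmit(sent, acked))    # -> [4]

# A go-back-N scheme, by contrast, resends everything from the loss onward:
first_loss = min(set(sent) - set(acked))
print([p for p in sent if p >= first_loss])  # -> [4, 5, 6, 7]
```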
14:24 Host: Excellent. And there's a large ecosystem around it, with multiple topology options, right? You can involve the NIC, or you may just work within the confines of the switches.

14:33 Ram Velaga: Correct. Exactly, yeah.

14:36 Host: Excellent. All right, gentlemen, that was insightful and lots of fun.
14:43 [Music]