Episode 14: Beyond GPUs: Broadcom is Driving the AI Revolution

In this episode of The Q&AI Podcast, Bob chats with Ram Velaga of Broadcom about their pivotal role in the AI world. They explore the rise of custom AI accelerators, the GenAI boom, and how Broadcom enables massive AI clusters. They also discuss AI's practical applications within Broadcom, build vs buy strategies, the future of AI, and the networking revolution it's driving, both in data centers and at the edge.
You’ll learn
About the trend of large cloud providers developing their own in-house accelerators for AI workloads
Internal use cases for AI, including customer support, software development, and silicon design
Strategies for leveraging existing LLMs while protecting proprietary data
Host
Bob Friday

Guest speakers
Ram Velaga, Senior Vice President and General Manager, Core Switching Group, Broadcom
Transcript
Bob: Hello and welcome to another episode of Bob Friday Talks. Today I am joined by a very special guest, here for another round after the AIDC Seize the Moment event: Ram Velaga, Senior Vice President and General Manager of the Core Switching Group at Broadcom. We're going to touch upon a wide range of topics today, from general AI trends to networking for AI.
And from there, we're going to dive into a conversation around AI and its impact on various sectors of the networking industry, such as economics and stability. Ram, welcome. To start with, I was reading an article about how Broadcom is actually a big AI player for investors. You think of NVIDIA GPUs, but you really don't think of Broadcom being a big AI player. So maybe you can give us a little background on where Broadcom fits into the AI ecosystem.
Ram: Hey, good morning. Thanks for having me here. You're right, generally people don't think of us as an AI player because we don't have a GPU. Broadcom doesn't make a GPU.
Broadcom doesn't necessarily make storage, so to say. However, what Broadcom does do is, if you look at the general AI market today, maybe close to 70 percent of the GPU consumption is the large clouds. The clouds I'm referring to are Google, Meta, Amazon, Microsoft, Alibaba, ByteDance, Tencent, and so on and so forth.
And when these large cloud customers are deploying their GPUs, there is a tendency for them to also develop their own in-house custom accelerators. If you look at Google, you've heard about their TPU, which is one of their in-house accelerators. Similarly, Meta has their in-house accelerators, and you've probably heard about other companies building their own in-house accelerators.
That's where Broadcom comes in: when customers have their own IP and they want to build their accelerators, Broadcom provides a whole suite of IP, whether it is SerDes or being able to provide a platform where we can integrate their high-speed HBMs and so on, and take their design to production.
So, that's one way we play in the accelerator market. But more importantly, if you think about it, AI is all about very large-scale distributed computing. And when you're doing large-scale distributed computing, you have to be able to connect all these accelerators together. So, the network becomes a very important element in this AI play, and Broadcom does a very comprehensive job in networking: our switching and routing chips, our optics capabilities, our NIC capabilities,
all of our PCIe switching capabilities. All of these go into helping build these large AI clusters.
Bob: Okay, so it's not all about GPUs. You're seeing a lot of enterprises and businesses building special accelerators for different use cases out there, which is kind of interesting. What use cases are you seeing out in the market right now that are requiring these accelerators?
Ram: Yeah. So today, when you look at these large capexes of 20, 30, 40-plus billion dollars that are happening at these companies, building these very, very large AI systems, it's because they're all racing for what you call artificial general intelligence, right?
You have seen ChatGPT-4, and you've seen versions of it called Gemini from Google, you've seen Llama versions from Meta, Mistral, and so on and so forth. So, you have multiple different large language models all trying to get towards what's called AGI, or what's perceived as AGI. To build these, the general feeling is that you may have to build data centers which have many hundreds of thousands of accelerators in a data center, or up to a million accelerators in a collection of data centers. So today, almost all the investment is going into these foundational large language models.
Bob: Okay, so this is really all about Gen AI and LLMs. And I think even here at Juniper, right? You see almost every business now either has a customer-facing initiative or internal initiatives to use Gen AI and LLMs to increase the efficiency of their business.
Curious to see, inside of Broadcom, where have you found operational efficiency by actually leveraging Gen AI and AI?
Ram: Yeah, look, I would say it's probably too early to show that there is significant operational efficiency from using Gen AI capabilities inside the enterprise.
But that said, this is definitely the time to start investigating the art of what might be possible. One of the places that, for example, my team is starting to look at is customer support. Oftentimes similar questions come into our customer support team, and then they have to go look up documents that we have on how our chips work and how they interact with the customers' systems, and try to come back to them with responses.
And what we're doing now is we've taken one of these large language models, which is available via the cloud with one of our partners, and we've started to fine-tune or train it with our internal data, so that when customers' questions come in through our chatbot, we're allowing the chatbot to answer the questions.
Now, we haven't actually enabled it with customers yet, but we are having our customer service team, our technical assistance team, use the chatbot to see the kinds of answers that we are getting. And what we're finding is that 60 to 70 percent of the time it actually gives a pretty good answer.
However, there's this 20 to 30-plus percent of the time where the answer is not what we would give. So, we are going through this process because until we get to a point where we are comfortable with almost all the answers the chatbot is giving, we obviously won't fully enable it. But we can see some improvements in internal productivity, where there is another filter being applied, which is our humans applying a filter
to the answers these machines are giving. So that's just one example. There are other examples too. I mean, when we think about Broadcom, it is as much a software company, in terms of the number of software engineers we have, as it is a silicon company. And we're using the tools available, what you could refer to as Copilot or other equivalent tools.
Or Vertex AI and so on and so forth from Google. And we're using those tools to see whether we can make our software engineers more productive. And last, but definitely not least, building silicon is increasingly a very, very hard job, and getting silicon right the first time is imperative if you're not going to spend another 20, 30 million dollars doing re-spins.
We're starting to look at: can you use AI to check our RTL code? Can you use AI to improve what we call our design verification process before we do tape-out? So, there are multiple different fronts. None of them is a slam dunk yet, but you know, it's worth probing to see where this leads. And that's what we're doing now.
Bob: The customer support use case is one dear to my heart. That's one we're working on. Also, maybe you want to share a little bit, because I've just gone through this exercise: there's kind of this build versus buy. Are you headed down the path of buying the solution off the shelf, or is this more internal, see if we can build it ourselves, using RAG or some other techniques to search through your documents?
Ram: Yeah. So, what I would say is, if you think about AI and how the infrastructure for AI is built, there is, first, a large language model. On top of that, you're doing fine-tuning of that model with data specific to you. And also, oftentimes, remember, this data is proprietary to you, and you don't necessarily want to put this data out there in the public domain, right?
And after you fine-tune with the data, you're trying to figure out how you start making it available for this engagement. So clearly, we're not going to be out there building our own large language model. It's an expensive affair. So we're going to take an available large language model, likely from one of these cloud providers or somebody who's developing these large language models, and decide whether we're going to run it inside our private data centers or inside the cloud.
But either way, what is important is the data that we're going to fine-tune and train it with. That is very proprietary data to us, and we want to make sure we do it in a way that we don't leak that data out. And that is a decision that each company will make independently, on what makes the most sense for them.
So there are very efficient ways of doing it, rather than trying to build your own large language models, which I don't think most enterprises will do.
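To make the approach Ram describes more concrete, here is a minimal sketch of retrieval-augmented generation over internal documents. The sample documents, the toy embedding, and the generation step are all placeholder assumptions rather than Broadcom's actual pipeline; a real deployment would call a hosted or privately run LLM where the stub is marked.

```python
# Minimal sketch of retrieval-augmented generation (RAG) over internal docs.
# The embedding and generation steps are stubbed so the example runs
# standalone; in practice they would call a hosted or private LLM endpoint.

from collections import Counter
import math

# Hypothetical internal support documents (placeholders, not real content).
DOCS = [
    "Chip X supports 512 SerDes lanes at 100G PAM4.",
    "To reset the switch, power-cycle and reload the firmware image.",
    "HBM integration requires the co-packaged interposer reference design.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
    # Here you would call your chosen LLM (cloud-hosted or in a private
    # data center) with `prompt`; the proprietary data stays in a context
    # you control rather than being used to train a public model.
    return prompt  # returned for inspection in this sketch

if __name__ == "__main__":
    print(answer("How many SerDes lanes does Chip X support?"))
```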
Bob: Yeah, I think we're all on this journey right now. I think you, Juniper, Broadcom, I think we're all on the same journey.
Ram: Yes.
Bob: So, thinking about where we are in this AI journey, if I put up AI singularity, Terminator, AI passing the Turing test, what do you think? Is this a 50-year journey? A 20-year journey? Where are we in this journey to AI actually doing tasks on par with humans?
Ram: Well, I wish I knew. I'd be making a lot more money. But the gentleman who coined the term singularity, in his book on the singularity, does talk about these being asymptotically increasing capabilities that achieve escape velocity before you know it.
So any projection based on historical development that this might be much further out is usually underestimating how quickly these things develop, right? And I think that's why, when you hear the CEOs of these very large mega-scale companies talking about their massive capex, they say the risk of not investing is much higher than the capital cost of investing and being in the game. And generally, when I speak with people who are involved in these investments right now, they say we're probably at least two generational models away from what they believe might be the end state.
And each generational model is 5 to 10x more compute capacity than the previous one. So, if I assume that the investment for the next generational model is going to start sometime in the summer of next year, you've got another 18 months for the first phase of the first generation of that model.
You add another 18 months for the next generation of the models, and you're talking about three years from the summer of next year. So, we are probably in at least a four-year journey of this investment cycle before we know whether we are on the right path.
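As a rough illustration of the timeline arithmetic above, here is a tiny calculation assuming the next build-out starts about twelve months out, each generation takes roughly 18 months, and each generation needs 5 to 10x the compute of the previous one; all of these are Ram's rough estimates, not firm figures.

```python
# Rough arithmetic behind the "at least a four-year journey" estimate.
# Assumed inputs (taken from the rough figures above, not firm data):
GENERATIONS_REMAINING = 2                 # "two generational models away"
MONTHS_PER_GENERATION = 18                # per generation/phase
COMPUTE_GROWTH_LOW, COMPUTE_GROWTH_HIGH = 5, 10   # 5-10x per generation
MONTHS_UNTIL_START = 12                   # build-out starts roughly next summer

total_months = MONTHS_UNTIL_START + GENERATIONS_REMAINING * MONTHS_PER_GENERATION
print(f"Investment cycle: ~{total_months} months (~{total_months / 12:.0f} years)")

# Compute needed, relative to today's generation, after two more generations.
low = COMPUTE_GROWTH_LOW ** GENERATIONS_REMAINING
high = COMPUTE_GROWTH_HIGH ** GENERATIONS_REMAINING
print(f"Relative compute after {GENERATIONS_REMAINING} generations: {low}x to {high}x")
```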
Bob: Yeah, so I don't know if you've played around with these new LLM agent frameworks. It does seem like we're entering a new software programming paradigm, right? We're getting to the point now where we're using these LLMs to actually solve tasks: I'll give one access to a bank account, ask it to optimize a website. So, you can kind of see we're getting to the point where we can build solutions now; once you give an AI access to your bank account, it's kind of amazing what these things can start doing.
But back to your distributed computing point: do you see a new software programming paradigm coming, AI programming versus cloud programming? How do you see our software programming paradigm changing here?
Ram: Well, first, I will never give AI access to my bank account. Two, I don't know if I can speak enough about changes to the software model and paradigms and so on and so forth. However, what I do have very strong opinions on is how the infrastructure for AI needs to be built, right? And oftentimes people will compare cloud computing to AI machines.
And what I'd like to point out is, when you look at cloud computing, cloud computing is a very different paradigm than AI. Because in cloud computing, it's about virtualization. If you think about cloud computing, how did cloud computing come about? You had a CPU that had far more horsepower than any particular application could use.
And then you said, hey, how do I make more use of this CPU? So, I'm going to virtualize the CPU so I can run multiple different applications on it, right? And then you start to worry about how to enforce isolation, so that one virtual machine is not talking to another virtual machine and there's no leakage of information from one to the other.
How do I increase utilization, and so on and so forth. So, in cloud computing, because it was about increasing the efficiency of the CPU, generally the networks were not that stressed. Yes, you built large mega-scale data centers and there was a lot of east-west traffic, but the amount of bandwidth on the network was only the amount of bandwidth that you had per CPU, which was probably 25 gigs at some point, then 50 gigs, barely pushing 100 gigs.
But if you look at machine learning, it's a completely different issue. No one GPU can run a machine learning application on its own. Especially if you think about these large language models, you need many thousands of GPUs, many hundreds of thousands of GPUs, to be attached together with the network to look like they're one large machine, right?
And the other thing that you find is that on each of these machines, these accelerators, the amount of bandwidth coming out is no longer 50 gigs or 100 gigs. It's 400 gigs, 800 gigs. And on some of the roadmaps that you see, there is up to 10 terabits coming out of each of these accelerators. So, networking is going to go through a paradigm shift with regards to how large these networks are going to get.
And the network is going to become fundamental to how these accelerator clusters are built. And that's why I think Juniper is in an awesome place, to be at the center of what I call the network is the computer. And eventually, obviously, it might change the software paradigms in terms of how the software programmer interacts with the machine.
What value does a software programmer add versus what value does a large language model already abstract away? I'm definitely not the guy to speak about that, but I see a paradigm shift coming in how networks are going to be utilized and needed.
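To put those bandwidth numbers side by side, here is a back-of-the-envelope comparison of aggregate fabric bandwidth using the per-node rates mentioned above; the node counts are illustrative assumptions, not real deployment sizes.

```python
# Back-of-the-envelope aggregate bandwidth: virtualized cloud vs. AI cluster.
# Per-node rates come from the figures quoted above; node counts are
# illustrative assumptions only.

def aggregate_tbps(nodes: int, gbps_per_node: int) -> float:
    """Total injection bandwidth the fabric must carry, in terabits/sec."""
    return nodes * gbps_per_node / 1000

cloud = aggregate_tbps(nodes=100_000, gbps_per_node=50)    # typical CPU servers
ai = aggregate_tbps(nodes=100_000, gbps_per_node=800)      # AI accelerators

print(f"Cloud data center  : {cloud:,.0f} Tbps")   # ~5,000 Tbps
print(f"AI training cluster: {ai:,.0f} Tbps")      # ~80,000 Tbps
print(f"Ratio: {ai / cloud:.0f}x more fabric bandwidth for the same node count")
```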
Bob: Well, I mean, I think it's clear, like you said. It's what Juniper calls networking for AI, right?
Ram: Yeah.
Bob: We definitely have the x86 front of the house, and now we're going to have this GPU back of the house. And that networking infrastructure is definitely going through a paradigm shift, with 800-gig, very high-speed connections in between all these GPU clusters to move data around. Now, the other paradigm shift we talked a little bit about before the show was around AI use cases: are they going to extend outside the data center? Are you seeing the need for bigger, faster networks to handle AI use cases that run in these data centers but reach outside them?
Ram: No, look, that's what people today call the 600-billion-dollar question. Okay, to give you a rough idea of how this 600-billion-dollar number comes about: NVIDIA says roughly 100 to 150 billion is their annualized run rate. Let's assume you're spending 150 billion on GPUs. You're probably spending at least half that much on building the data center infrastructure and the software and everything that goes around it.
So now you're talking about 300 billion a year of spend. Now, if you assume the people who are building these data centers, at 300 billion dollars of spend, are hoping to get at least 50 percent margins on their business, they've got to generate about 600 billion dollars a year in revenue. Right? So, there had better be a lot of applications coming in to sustain the 600 billion dollars of revenue at the user level.
Right. And clearly, for this math to add up, these users have to be able to extract value eventually. These are users sitting in the home, or users sitting in the enterprise, who are trying to access this intelligence that is probably being fine-tuned at the data centers but eventually being delivered to them either at home or at their place of work.
Or let's say somebody is out in the field, looking at a wind turbine or looking at an HVAC system, trying to figure out how to repair it. All of this information has to be delivered to the user, and so the networks are what connect them, from within the data centers, over the service provider, eventually to the last mile towards the edge.
And I would say you will find use cases. We will figure it out in the next year or two, but at the end of it, it's going to push networks.
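For anyone who wants to follow the arithmetic behind the 600-billion-dollar question, here is the same calculation written out; all inputs are the rough annualized estimates quoted above, not reported financials.

```python
# The "$600 billion question" arithmetic, step by step.
# All inputs are rough annualized estimates quoted in the conversation above.

gpu_spend = 150e9          # ~$150B/yr on GPUs (upper end of the quoted run rate)
other_spend = 150e9        # data centers, software, etc.; the total is rounded
total_spend = gpu_spend + other_spend     # ~$300B/yr, the figure quoted above

margin_target = 0.5        # operators hoping for at least 50% margins
# Revenue R such that margin = (R - cost) / R = 0.5  =>  R = cost / (1 - margin)
required_revenue = total_spend / (1 - margin_target)

print(f"Total annual spend : ${total_spend / 1e9:,.0f}B")        # ~$300B
print(f"Required AI revenue: ${required_revenue / 1e9:,.0f}B")   # ~$600B
```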
Bob: I don't know if you've tried the Apple Vision Pro or Meta's headset. I think the use case that's going to drive these bigger pipes is going to be around augmented or virtual reality. What I saw coming down the pipe with augmented reality is that remote worker use case, and that definitely is going to require bigger pipes to handle those cases where you're doing a remote augmented reality session out in the field somewhere.
Ram: Yeah, no, I agree. Look, I think sometimes the word remote has a bad connotation. Everybody's like, oh, is that working from home, working remotely? I would actually change it to the field worker use case, right? People are in the field. You cannot have a robot today that is as capable and agile as a human. But you may have humans in the field solving problems, and you could actually feed them information that makes them far more productive, right? We will see a lot of those applications now.
Bob: Now, Broadcom has been a big part of my career. I'm in the networking business; we've got Broadcom inside our access points, switches, routers, everywhere. We talked about accelerators going out to the edge. So, we have these large data centers being built with big GPU clusters to train and run big Gen AI models.
But we also have this move out to the edge, where Broadcom has a big play. What use cases do you see happening at the edge that are going to drive this? Am I going to be bringing all my video back to some data center, or am I going to be doing more and more of this AI out towards the edge of the network somewhere?
Ram: Here, if you look specifically at the equipment that we build and deploy at the edge, one simple use case is the set-top box. Broadcom is in the set-top box business, and we build these set-top box chips. We are embedding neural engines inside our set-top box chips where you are able to do things along the lines of security and troubleshooting, where a cable provider might otherwise have had to send a truck out to replace and troubleshoot things.
You're able to actually improve the performance for the end user. Those are things we can do on the set-top box, whether that is security, as I was saying, or troubleshooting, being able to look at your link health and so on and so forth. And I'm pretty confident that even in the Wi-Fi space, a lot of it is about getting your Wi-Fi connections right,
your signal correction happening. You'll also start to see some of these AI neural engines going right into the chips to improve the user experience and make it more secure and more available.
Bob: Yeah, well, I can tell you, working with my healthcare customers, there are definitely visions of doing Wi-Fi radar at the edge for fall prevention. I've seen a lot of video at the edge now, where they basically want to use a lot more computer vision. I definitely see computer vision processing moving towards the edge, because I'm not sure I want to bring all that video traffic back to the data center. So, I think that's probably a good example of where we're going to start seeing Broadcom at the edge becoming more and more relevant to this AI adventure.
Ram: True. I think, look, more likely than not, some of the video might still be delivered from the data center, but the latency with which you deliver that video, and then being able to actually do any localization or, as you were saying, augmentation of it specific to that particular location or that user, is definitely where the silicon comes into play.
Bob: Well, Ram, I want to thank you for joining this episode of Bob Friday Talks. And maybe for our audience, any last words on where you see this heading five years from now?
Ram: Oh, five years from now. I hope we will look back and say it was an outstanding run. And I think this is a generational opportunity, especially for those of us who are in the networking business.
Right? A few years ago, if we built a 50-terabit switch or a 100-terabit switch, we would be looking at it and saying, who's the customer? Now we have customers knocking on our door saying, hey, when is your next switch? When is your next switch? So, I see this as a tremendous opportunity for Juniper, for Broadcom, and everybody in the networking business, because the network matters, we are at the heart of distributed computing, and we are at the beginning of a long cycle for distributed computing.
Bob: I think what I tell people is that internet networking is on par with power and electricity; you have to choose what you want. But anyway, Ram, I want to thank you for coming today. It's been great having you. And I want to thank everyone for joining Bob Friday Talks, and I look forward to seeing you on the next episode.