The Q&AI: Predicting User Experience - Inside the Marvis Large Experience Model


In this episode of The Q&AI Podcast, host Bob Friday is joined by Kumar, Product Manager for Marvis®, and Navraj, Director at Juniper Networks, to discuss how Juniper’s Marvis Large Experience Model (LEM) is revolutionizing the way enterprises monitor and troubleshoot collaboration application performance.
Originally designed to correlate Zoom and Teams data with network metrics, LEM has evolved into a generalized, AI-driven system that can predict poor user experiences—even without third-party labels. The conversation explores the model’s architecture, real-world customer use cases, and how its integration with tools like Marvis Minis is accelerating the shift toward intelligent, self-driving networks.
You’ll learn
Targeted troubleshooting: LEM pinpoints whether collaboration app issues stem from the network, client, or app, eliminating guesswork.
AI-driven predictions: Trained on massive datasets, LEM forecasts video/audio quality using Juniper-only telemetry, even without third-party labels.
Root cause clarity: Uses Shapley values to break down contributing factors like Wi-Fi, WAN, or client-side issues.
Who is this for?
Network administrators, IT operations teams, and anyone responsible for monitoring and troubleshooting collaboration application performance.
Host
Bob Friday
Guest speakers
Kumar, Product Manager for Marvis, and Navraj, Director, Juniper Networks
Transcript
Bob: Hello and welcome to another episode of Q&AI. Today I am joined in the studio by Kumar, Product Manager for Marvis, and Navraj from Marvis Development, and in this episode we are going to be talking about the Marvis Large Experience Model. Welcome, Kumar. Welcome, Navraj. Kumar, maybe we'll start with you. The Large Experience Model: what exactly is the customer value proposition? What problem are we trying to solve with this Large Experience Model?
Kumar: So, for customers across any vertical, whether it's enterprise, school districts or, let's say, higher education, one of the applications most susceptible to any network change is collaboration apps. Usually you see people browsing, doing their chat on the internet, and everything seems to be going well. Users only complain when they are on an audio or video call and they are screen sharing. There will be complaints here and there saying, hey, I was on a call, my video froze, my desktop share didn't work, and these are the harder problems to get after.
So, what we figured out for this kind of problem is that everybody is now in the cloud. All these collaboration apps like Zoom and Teams have a cloud, and we can ingest data through a cloud-to-cloud integration. That now helps us put a lens exactly on those bad user experience minutes and understand, hey, for any such bad user experience, was the Juniper Mist-deployed network the problem, or are we not the problem? So we're able to get to that point very easily. Yeah.
Bob: Okay, so there's an old saying that there are only two problems in Wi-Fi: either I can't connect, or it sucks. So the Large Experience Model is working on this post-connection "it sucks" problem, right, trying to understand why your network's having problems. Navraj, maybe a little bit for the audience: if I go under the hood technically, what am I going to find? It's not like I've just got some application data and some network data. Exactly what are you doing here?
Navraj: So, the problem was laid out quite well. One is we have data from Zoom and Teams. It tells us how the call is going, and those metrics are latency, jitter, and loss. Then the question is: can we predict it? And can we predict it with our Juniper Mist data? And the answer is, well, let's try it. So, we developed a model, this Large Experience Model. We have millions of data points.
Bob: Millions or tens of millions?
Navraj: Hundreds of millions, actually, so getting close to a billion now. So, we have a great deal of data, all within the cloud. We can then see: can we predict the video collaboration experience just based on our Juniper Mist data? The answer is yes. And how do we know that? We can tell from the statistics of the model, and then we have our first success story.
We have a model, but then what do we do with it? What people want to know is: what are the problems we are facing? And that goes into the whole idea of, can we explain the dominant issues in the model? That's what we have done in the Large Experience Model.
Bob: Okay, so we have a model here that can actually predict someone's Zoom or Teams user experience, and my understanding is there's some Shapley, some SHAP method, that basically starts to look at the features and how they contribute to these bad user minutes.
Navraj: Exactly. So we have employed the Shapley algorithm, which essentially tells us which of our Juniper Mist features is responsible. Is it a client issue? Is it a wireless issue? Is it a WAN issue? And it could be a combination of them. That's what Shapley does: it says which of the features or parameters that we're inputting from our network is most dominant.
Bob: Yeah, my understanding is that in the first-generation Mist SLEs we were actually calculating all the bad user minutes, either pre-connection or post-connection, trying to correlate those with some network features, and giving you a probability for each feature. My understanding of Shapley is that we're actually looking at the contribution margin of these network features to the actual latency and jitter, right?
Navraj: Yes. One of the main differences between the two approaches is that the earlier, classical SLEs take into account each variable one at a time, whereas Shapley and machine learning approaches are multivariate, so we can see which contributes most altogether. We have the indications of all the different network parameters, and that's, I think, one of the things that gives it the additional power. Then we can get a ranking for all of the features altogether rather than a pairwise ranking.
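To make that multivariate Shapley ranking concrete, here is a minimal sketch using the open-source shap package with a gradient-boosted model on synthetic data. The feature names and labels are illustrative placeholders, not the actual LEM telemetry schema.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical per-minute network telemetry; names are illustrative only.
features = ["wifi_retry_rate", "client_rssi", "ap_client_count", "wan_jitter_ms"]
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((1000, len(features))), columns=features)
# Synthetic "bad user minute" labels driven mostly by AP load and WAN jitter.
y = (0.6 * X["ap_client_count"] + 0.4 * X["wan_jitter_ms"]) > 0.55

model = GradientBoostingClassifier().fit(X, y)

# TreeExplainer computes per-feature Shapley contributions for each minute.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# One multivariate ranking across all features at once, rather than the
# one-variable-at-a-time correlation of classical SLEs.
ranking = pd.Series(np.abs(shap_values).mean(axis=0), index=features)
print(ranking.sort_values(ascending=False))
```

The single ranking across all features at once is what distinguishes this from the pairwise view of the classical SLEs.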
Bob: Yeah, I think this is the big differentiation. Large models being trained on large data sets is what's really disrupting the industry right now. Kumar, last year, big announcement: the Teams and Zoom model, the Large Experience Model.
Kumar: That's correct.
Bob: What are you bringing to us this year? What have you done for us lately?
Kumar: Yeah, you're right in that sense. Last year we solved the problem of, hey, whenever we have third-party label data from Zoom and Teams, we are able to root-cause which feature is contributing the most, right? What's the network function that's causing the issue? This year, thanks to Navraj and team, and Prashant, who is ingesting the data as well, we have gotten to a point where, if someone is not ready or not able to give us labeled collaboration minutes, such as Zoom and Teams inference, the model can still predict which users, and around which access points, will have a bad user experience, right?
So the model is now at a generalized point of view where, without having a third party label data, we can still just from our network metrics, through full stack, will be still be able to predict as to like which section of the building or which section of the floor or around APs, right, will users have bad user experience?
We have gotten to that point, and Navraj can add more details, where we have compared against the label data, running the model without feeding it those labels, and the precision is actually pretty close. So, we are now able to predict, without any third-party labels, what the user experience will be for any collaboration application.
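A minimal sketch of that label-free comparison, assuming synthetic data and scikit-learn in place of the production pipeline: train on telemetry from sites that share Zoom/Teams labels, predict on a site that shares none, then check precision against the withheld labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score

rng = np.random.default_rng(0)

def make_site(n=5000):
    """Synthetic per-minute Juniper-side telemetry plus a hidden quality label."""
    X = rng.random((n, 4))          # e.g. retries, RSSI, AP load, WAN jitter
    y = (X[:, 2] + X[:, 3]) > 1.1   # synthetic "bad minute" labels
    return X, y

X_train, y_train = make_site()      # sites with cloud-to-cloud Zoom/Teams labels
X_new, y_hidden = make_site()       # site that cannot share its Teams data

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
pred = model.predict(X_new)

# y_hidden is used only to score the predictions, mimicking the comparison
# between labeled runs and label-free runs of the model.
print("precision on the unlabeled site:", precision_score(y_hidden, pred))
```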
Bob: So, I know that Mist has always been customer driven, and I know this feature, the Large Experience Model, came from one of Mist's biggest customers a couple of years ago. Navraj, I know you get customer calls all the time. Maybe a little bit about the story of where this Large Experience Model came from, and what was the customer problem you were actually trying to solve originally?
Navraj: Actually, I wasn't here at the time, but I think Kumar can answer that question much better than me.
Kumar: Yeah, thank you. So, post-pandemic, when users started returning to work, there was a unique usage pattern that changed. People come into the office, they don't go to a conference room, they sit at their desks with video on, audio on, sharing their desktops. Picture 12 people sitting across a couple of floors, everybody doing audio and video together. At this moment there are some hiccups that happen, and users complain.
Prior to this cloud-to-cloud integration of Zoom and Teams collaboration minutes, there was no way an administrator could figure it out. All one could do was send feet on the ground to sit around the same area and try to reproduce it whenever it happened again. With this integration, we thought, hey, every enterprise already has BI dashboards of collaboration minutes, whether Zoom or Teams.
Why don't we bring that user-minutes inference into our data, map it to our network data, and add more detail: did the Juniper Mist-deployed network have anything to do with the bad user experience or not? That's how the problem started and, frankly, there are a lot of problems we solved. I'll go into specific user examples, but do you want to add anything with respect to the root-cause side of things?
Navraj: Oh, I think you hit the nail on the head. When we have all of this data, and the data has a history of everything that happens, and we're able to relate that to network features, with such a large amount of data we're able to see all the problems and identify them, even when we don't have the third-party data. So that's the general picture. We're finding interference issues, roaming issues, a lot of client issues, and also WAN issues.
Bob: Okay, so this is the generalized model, the big new innovation I keep hearing about. We now have a generalized Large Experience Model that can actually solve problems at customer sites where we don't even have Zoom or Teams data.
Kumar: Yeah. For example, recently we were tasked by a customer who said, hey, I can't share my Teams data, but is there some way you can help us? There have been a lot of complaints about Teams calls in the building. And when we put the generalized model to work, what it came back and indicated was, hey, there has been a lot of interband roaming, and interfloor roaming as well, happening, right?
And the customer said, yes, that's what we have been seeing; what do we do, how do we fix this? So then the model said, hey, if we turn off 2.4 GHz for corporate and move all the corporate users to 5 GHz, the simulation predicts fewer bad user minutes. That's what the customer did, and we were able to solve the problem.
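A hedged sketch of that kind of what-if simulation: re-score the same minutes with the 2.4 GHz clients moved to 5 GHz and compare the predicted bad user minutes. The model, features, and synthetic data below are illustrative stand-ins, not the production simulation.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
n = 10_000
X = pd.DataFrame({
    "band_24ghz": rng.integers(0, 2, n),        # 1 = client on 2.4 GHz
    "cochannel_interference": rng.random(n),
    "ap_client_count": rng.random(n),
})
# Synthetic truth: 2.4 GHz interference drives the bad minutes.
y = (0.5 * X["band_24ghz"] * X["cochannel_interference"]
     + 0.3 * X["ap_client_count"]) > 0.45

model = GradientBoostingClassifier().fit(X, y)

baseline = model.predict(X).sum()
what_if = X.copy()
what_if["band_24ghz"] = 0            # simulate steering everyone to 5 GHz
steered = model.predict(what_if).sum()
print(f"predicted bad minutes: {baseline} -> {steered}")
```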
Bob: So, this is like the needle-in-the-haystack tool. You have that hard problem with multiple issues going on. You're combining network features from the client side; it's not like Zoom and Teams is bringing in all the client features. You've got all the wireless features, the LAN features, and you're bringing in the WAN features. Navraj, what is the most interesting needle-in-the-haystack story you've had to work on using your new tool?
Navraj: The needle in the haystack. There have been quite a few. You know, whenever you develop a tool, there's always a lot of skepticism. People are wondering, how can this be done? Many of our experts are very hands-on; they want to see the problem. So, one issue that was raised last year involved a very important customer for one of our network admins.
Calls were just dropping and they did not know what the problem was, but it was affecting their business. So, I ran the model and said there were too many clients on the access point, and the network admin was very skeptical. He said, this person is a VIP, they have their own access point; what you're saying does not make sense.
Bob: This is our university customer, right?
Navraj: Exactly, yes. So he didn't believe us. He said, no, it has to be interference; I mean, there's a signal happening there. But he wanted to test it for himself. So, he set things up and saw that students were connecting to this access point from outside, transiently connecting and disconnecting, and that was affecting the call.
Bob: This is the lunchtime student story. Lunchtime students walking by the Provost's office were disrupting his Zoom and Teams calls.
Navraj: Exactly, yeah. So that gave me a lot of confidence, because everyone was always skeptical, but that's the way to win people over. Also, with the story Kumar mentioned about the interband roaming: while we were studying it, someone else was looking at it as well. He was always skeptical about these measures (he's our traditionalist, at least), but then he became supportive of it too. That's something that always gives me gratification. These experts have been looking at these networks for years; they know their networks, they know where the problems are. But he was happy when he was able to see it.
Bob: Kumar, your needle-in-the-haystack story? You deal with customers on a daily, weekly basis.
Kumar: Yeah, there is one particular one I always like to mention. There was this one customer that had problems every Tuesday and Thursday. It turns out a lot more users come to work on Tuesday and Thursday, and when that happens their total bandwidth usage on the site goes beyond their WAN capacity, the pipe that they have for the WAN.
The model pointed to the WAN and indicated, hey, this has something to do with the WAN: the total number of users plus the total upload/download capacity of the WAN. We were able to show that through the SHAP root cause, and the customer then upgraded their circuit to a larger pipe. Since then they haven't had the problem.
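The arithmetic behind that finding is easy to sketch: flag the days where aggregate user demand exceeds the WAN circuit. The capacity and per-user numbers below are illustrative.

```python
# Back-of-the-envelope check behind the Tuesday/Thursday story: flag days
# where aggregate demand exceeds the WAN circuit. Numbers are illustrative.
wan_capacity_mbps = 500
avg_mbps_per_user = 2.5   # rough per-user demand during video calls

for day, users in {"Mon": 80, "Tue": 260, "Wed": 120, "Thu": 240}.items():
    demand = users * avg_mbps_per_user
    status = "OVER CAPACITY" if demand > wan_capacity_mbps else "ok"
    print(f"{day}: {demand:.0f} Mbps of {wan_capacity_mbps} Mbps ({status})")
```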
Bob: Navraj, in another episode we were talking about Marvis Minis, and I hear there's a better-together story: Marvis Minis and the Large Experience Model. Two ships in the night, or is something going on here?
Navraj: That's the nice thing about machine learning algorithms: you can combine information seamlessly. So, we are now working on putting Minis data in and seeing if we can have that Minis magic in LEM as well. And we have a new switch we are all very proud of, with the announcement just a few months ago. We are ingesting data from the switch too, so we can get more information to point to those problems as well. So, yeah, I'm very excited about the future and seeing how much more we can push the rock up the hill.
Bob: For all data scientists, I know it's all about data: the more data you have, the better. So, it sounds like we have wireless data coming at us, and now it sounds like Minis is going to start bringing in better visibility on the WAN side.
Kumar: On Minis in particular: there was one small problem whenever we used to point to the WAN. We are a little blind on the WAN; unless the gateway is ours, like SSR or SRX, we don't have much data on the WAN. Now with Minis, specifically application Minis, where we are able to run network performance tests, we get latency, loss, and jitter.
Imagine that data being fed into this Large Experience Model, where now we will be able to indicate not just the WAN but where exactly in the WAN, right? Because it has the latencies and packet losses across all hops, it will be able to indicate, hey, this section of the path is beyond our deployed network, somewhere in the internet, and which hop is the problem. We'll be able to get to that level.
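A minimal sketch of that per-hop localization, assuming Minis-style path metrics: compute the latency added at each hop and report whether the worst hop sits inside or beyond the deployed network. The hop names and values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Hop:
    name: str
    rtt_ms: float
    inside_network: bool   # True if within the Juniper-deployed network

# Hypothetical path measurements from a Minis-style probe.
path = [
    Hop("access-switch", 1.0, True),
    Hop("wan-gateway", 3.0, True),
    Hop("isp-edge", 8.0, False),
    Hop("isp-core", 62.0, False),   # big latency jump here
    Hop("zoom-cloud", 65.0, False),
]

# Latency added at each hop is the RTT increase over the previous hop.
deltas = [(path[i], path[i].rtt_ms - path[i - 1].rtt_ms)
          for i in range(1, len(path))]
worst, added = max(deltas, key=lambda d: d[1])
where = "inside the deployed network" if worst.inside_network else "beyond it, in the internet"
print(f"worst hop: {worst.name} (+{added:.0f} ms), {where}")
```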
Bob: Maybe in wrapping this up: we've got Minis, we've got the Large Experience Model, and a GenAI conversational interface. It feels like it's all coming together, Kumar. What's got you excited looking forward? Are we at the brink of self-driving yet?
Kumar: We are almost there, and I think a lot of our actions are self-drivable. This year we have gotten to a point where, with all of this data, let me give one simple example: sometimes, with a lot of users, let's say there is an all-hands going on, there is a wireless capacity issue. We are now able to identify that, and if the headroom of the wireless capacity gives us an opportunity to either increase the bandwidth or, let's say, turn on dual 5 GHz, we will be able to do that on the fly, of course with the user's permission.
Once the user gives permission, we'll be able to self-drive that aspect, going beyond whatever their RF template was, with permission. Similarly, we have a lot more self-driving actions coming with this kind of data, both from Minis and the Large Experience Model.
Bob: And Navraj, I know you've spent the last couple of years bringing the generalized Large Experience Model to life. Looking forward, what in Marvis's future excites you?
Navraj: I like the idea of justifying the actions we are going to take. Our customers want to be informed; they want to make informed decisions, just like we do. So, for the case Kumar mentioned, we can say: we can improve your bad minutes by 20% if you make that change.
Giving that information lets customers say, oh wow, this is an improvement we're willing to act on, let's do it. And with that, we can get to the self-driving aspect of it: we will see a whole history of customers approving our actions, and we'll gain that confidence.
Kumar: Just to add to that: there is a before and after, right? Whenever we make a recommendation or take an action, there is before-and-after evidence to prove whether that change worked in favor or against. So we'll be able to show the before and after for every one of these points.
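A minimal sketch of that before-and-after check, with illustrative counts; a production version would also test whether the change is statistically significant.

```python
# Compare bad-minute rates in the windows before and after a recommended
# change was applied. All counts here are illustrative.
def bad_minute_rate(bad: int, total: int) -> float:
    return bad / total

before = bad_minute_rate(bad=480, total=12_000)   # week before the change
after = bad_minute_rate(bad=190, total=11_500)    # week after the change

improvement = (before - after) / before * 100
print(f"bad minutes: {before:.1%} -> {after:.1%} ({improvement:.0f}% improvement)")
```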
Bob: Okay, well, I want to thank you both. It sounds like this generalized Large Experience Model is going to be a key disruption in the industry. Kumar, thank you for joining us. Navraj, thank you for joining us. And thank you to the audience for joining us for this episode of Q&AI. We look forward to seeing you in the next episode.