Bob Friday; Chief AI Officer, Juniper Networks

Episode 24: Agentic AI and the Future of Intelligence with Richard Socher

The Q&AI | AI & ML
Aug 08, 2025


In this episode of The Q&AI Podcast, Bob Friday sits down with Richard Socher—renowned AI researcher, founder of MetaMind and you.com, and pioneer of prompt engineering—to explore the evolution and future of agentic AI. From early neural networks and transformer models to enterprise-ready AI agents, Richard shares his journey through academia, startups, and big tech. Tune in as Bob and Richard unpack the real-world impact of AI agents, the challenges of enterprise adoption, and the path toward superintelligence.


You’ll learn

  • Richard’s AI journey: From Stanford research to founding MetaMind and you.com, and shaping the future of enterprise AI

  • The rise of agentic AI: Differentiating knowledge agents from action agents and exploring their real-world applications

  • Looking ahead to AGI and superintelligence: Predictions, challenges, and the evolving definition of intelligence

Who is this for?

  • Network Professionals

  • Business Leaders

Host

Bob Friday
Chief AI Officer, Juniper Networks

Guest speakers

Richard Socher
Cofounder and CEO of you.com

Transcript

Bob: Hello and welcome to another episode of Q&AI. I'm Bob Friday, and we have the special honor of having Richard Socher here, an industry-leading AI expert and the co-founder of two companies, MetaMind and you.com. Welcome, Richard, it is a true honor to have you here.

Richard: Thanks for having me.

Bob: Your background is amazing. Maybe give the audience a little bit of it. How did you go from Germany to being the co-founder of two AI companies?

Richard: Yeah, I originally came from Germany for my PhD at Stanford and worked on neural networks for natural language processing and computer vision. It was pretty broad at the time because it was a very nascent field. I pushed a lot of that forward and then started a company called MetaMind.

We eventually got acquired by Salesforce, where I became chief scientist and eventually executive vice president, running most of the AI efforts there. Then in 2020, I started you.com, because back in 2018 we had invented prompt engineering, in a paper cited by the early GPT papers from OpenAI. In 2021, I also started a venture fund to structure my angel investing a little bit better. I co-founded it with several other amazing people, and it has done phenomenal things. And that's where we're at today.

Bob: So, you are a PhD? What year was this now?

Richard: I started in 2009 and graduated in 2014.

Bob: Okay, so this was the very early days of natural language processing, and I'm always interested in the origin story of a startup. So, you're doing your PhD, and somewhere along the way you had the vision for MetaMind, that you were going to start this company from scratch. Scary, jumping off the cliff.

Richard: It was. I had actually worked for almost a decade at that point to get a coveted faculty position at one of the top 10 universities, and I ended up getting it. And then I had a couple of VCs, namely Khosla Ventures, say, hey, I'm going to give you some funding. Don't you want to take some of this research, which at the time already had many thousands of citations, and make it into a startup? And I thought, maybe I'll try it for a year and then I'll probably still become faculty.

And during that year, I was both allowed to teach at Stanford and still doing some interesting research at MetaMind. So I ended up realizing maybe I can do the teaching on the side one quarter a year, still do some research in industry, and also start shipping this work as real products to real users and have that immediate impact, versus the long-term impact that research can have. And so, I ended up not taking my faculty job.

Bob: Okay, so you were faced with going down the academic path to a tenured professorship, or really jumping into this industrial startup adventure.

Richard: That's right.

Bob: So, the original vision for MetaMind. How did you really engage with customers back then, or how did you come up with the purpose of MetaMind?

Richard: Yeah, it was such a different world back then. You had to explain why AI might be useful, what classification is, and why it'd be great to just drag and drop some images or text documents into the browser and get some knowledge about them back. And we had a first paper in 2016 called Ask Me Anything.

It was a paper with attention mechanisms that gave one large neural network the ability to answer all kinds of different questions, and we eventually pushed that idea even further, with more complex questions, in 2018, after the acquisition, at Salesforce. But, yeah, we were doing all kinds of interesting research and trying to explain it to users. Most of them had no idea what a neural network was or whether it mattered to their companies. Now, of course, it's a completely different world.

Bob: Okay, so this is the early days of transformers and Attention Is All You Need. How big were the models you were playing with? Was it the size that kind of held you back?

Richard: Definitely. Yeah, the size was definitely a problem. Also, the reason that transformer paper is called Attention Is All You Need is that they showed you don't need a recurrent layer. A recurrent network is a kind of neural network that is much harder to train efficiently on GPUs, so it was just slower in many different ways and required more engineering to really scale up. We had a lot of attention mechanisms, but we all still made the mistake of thinking we needed that recurrent layer to keep very long context.

And the transformer paper is beautiful, though it has this quadratic blow-up. But you can see how, even with quadratic complexity in the input size, you can do amazing things with transformers.

Bob: Okay, so 2018, the very early days of attention and transformers, trying to sort out how to solve this problem. And somewhere along the way, you decided to sell the company, which is always another big part of the startup adventure.

Richard: Oh, sorry. Yeah, we had already sold the company in 2016. So from 2016 to 2020, I was at Salesforce.

Bob: But you know, I'm always interested in what made you sell. You were two years into the adventure, and you decided it was better to sell than stay at the plate.

Richard: Yeah, it was kind of interesting, because we had built such powerful platform technology, and then we realized that, because the world wasn't quite ready yet for all these different use cases, you had to go very, very deep into one vertical to just start selling it.

We didn't, unfortunately, have the funding abilities to just keep doing interesting research and build larger and larger general tools. That was always the hope; that's why it was called MetaMind. It was going to be this general-purpose platform. But with fundraising for AI back then, and with our team, we just couldn't do what OpenAI, for instance, was able to do, where they build robotic hands and play Dota games, and after each of these projects they're able to raise a couple hundred million dollars more.

That wasn't the case for us in 2015 and 2016. And we also realized that the technology is so powerful in the hands of a company that has, no pun intended, a strong enough salesforce to actually explain it and bring it into enterprise use cases. So there was just a lot of synergy, and Marc has been a phenomenal mentor, friend, and investor of mine, so it was a natural fit.

Bob: Okay. So, you kind of realized you were a little ahead of the market, and it was going to take a lot more money to stay at the plate. So you went to Salesforce, and I think I went through that same experience; I sold a small company to Cisco. Was Salesforce your first big-company venture?

Richard: It was. Yeah, I mean, I was an intern at Siemens Corporate Research before, back in Germany and in Princeton, New Jersey, but yeah, it was my first full-time job.

Bob: And big-company culture. You have worked in both of these worlds, and they are slightly different. Any words of wisdom for people making the transition from a small startup to a big company?

Richard: Yeah, I think a lot of people assume it's bad to work at a big company after coming from a small startup, but I actually had a phenomenal time. There are a lot of very smart, very kind, hardworking people at Salesforce. And a lot of people just assume the bigger the company, the more politics there are.

On the one hand, I'd say, put three people in a room and you're going to get four different opinions on anything, right? So I think that is also true at Salesforce. But ultimately, people just differ on what the right thing is, and you have to learn how to navigate the organization, understand who the real decision makers behind the scenes are, and then see how you can have the most impact, which I enjoyed a lot. We had built out an amazing research team, and we had a lot of freedom in that pure research world.

We published a ton of papers, we invented prompt engineering, and we built the first large language model for protein generation, so not just understanding the folding of a protein, but creating completely new kinds of proteins in biology. And we had one paper that still hasn't really had its time in the sun, which was the AI Economist.

It was a reinforcement learning problem where you had an AI economist and a bunch of intelligent agents that would interact in an economy: buy products, get resources, build houses, trade, and get taxed. Then you basically said the overall system should maximize productivity multiplied by equality, and you could simulate millions and millions of years of taxation, test out tax structures and subsidy structures, and see which would create the best outcomes. That paper was small, just a couple dozen agents. If that kind of work could have the hundreds of millions of dollars that we see for LLMs, we could simulate economies much more realistically and then find out the best taxation and subsidization schemes.

Bob: This is like the early agentic AI you were working on.

Richard: That's right. We published that in 2017 and 2018 already.

Bob: Okay. So, you were already using agents, but you weren't using foundational large language models. You were probably using smaller models.

Richard: Yeah, we actually trained a lot of large language models, just not as large as they are now. We had the best language models; Stephen Merity and I published several papers where we really nudged language modeling forward as a task, but we unfortunately didn't train them as large as OpenAI did.

Bob: Okay, so back then, billions of weights? Where were you at in 2018?

Richard: Like hundreds of millions of weights.

Bob: Because right now we're at what? 500 billion or something.

Richard: Those are the really large ones. You can get pretty good results with 7 billion or so, but yeah, the really cutting-edge models are several hundred billion.

Bob: Okay so it sounds like you're at Salesforce having some fun.

Richard: Shipping real products, doing research. Yeah, it was fun.

Bob: Okay, so the whole professorship thing was in the rear-view mirror.

Richard: Yeah, I had been teaching at Stanford for three or four years at that point, and the class kind of blew up to like 600 students, and it was a lot of work. When I first taught the class, no one in the world was teaching neural networks for NLP anywhere. There was still an official NLP class taught by Chris Manning, one of my two advisors, and I was teaching the neural nets for NLP class on the side. After two years, those two classes were merged, because all the best models were becoming neural network based. After four years, in that quarter of 600-plus students, it was, I think, the most popular class at Stanford. But it was a little bit too much work, and by then you could take neural nets for NLP classes everywhere in the world, so I figured that was less impact.

Bob: So, I take it, if I look under the hood of all that Salesforce Einstein work, you're in there somewhere.

Richard: Yeah, we started the whole Einstein thing.

Bob: I'll find you under there, underneath the hood. That's all your work. I take it.

Richard: Yeah, it was a lot of fun. We introduced Einstein at Dreamforce.

Bob: Yeah, so now, somewhere along the way you got your first agents going, and somewhere along the way came you.com. Was you.com kind of like MetaMind version 2, or was it something new?

Richard: It definitely started as something different. So, in 2018 we had invented prompt engineering, and prompt engineering back then didn't look like what it looks like right now. It was much simpler. It's just the idea that instead of pre-training just word vectors or just an encoder, you want to pre-train the whole system. And to pre-train the whole system, the main idea was that you have to make the task, the question you have about that piece of text, part of the input.

And so that paper in 2018, co-published with Bryan McCann, Caiming Xiong, and others, introduced the idea that you can ask one model all kinds of different questions in NLP, and you'd get an answer no matter what you ask.

What's the translation of this sentence into French? What's the summary of this paragraph? Who is the president of Ukraine that is mentioned somewhere in this document? And all these different kinds of questions.

What's the sentiment of this tweet? All of these would just be inputs to one model, together with some context, and out comes an answer. And that paper was also very famously rejected. I talked about this in a TED Talk. For reviewers at the time, it was just unfathomable that you could have one model that does all of these things, because there are thousands of PhDs and professors who have each worked on just one of those tasks for years. To downgrade a task to, oh, it's just one more question you ask that one model, was very unpopular at the time, but it did inspire a few other folks.
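
The core idea, that the task itself becomes part of the model's input, can be sketched in a few lines. The function name and formatting below are invented for illustration; they are not the paper's actual code:

```python
def frame_task(question: str, context: str) -> str:
    """Turn any NLP task into one uniform input: the task itself is
    posed as a natural-language question alongside the text."""
    return f"question: {question} context: {context}"

# The same single model would receive every one of these as plain input.
prompts = [
    frame_task("What is the translation of this sentence into French?",
               "The house is blue."),
    frame_task("What is the sentiment of this tweet?",
               "I love this product!"),
    frame_task("Who is the president mentioned in this document?",
               "... a meeting with the president took place ..."),
]
for p in prompts:
    print(p)
```

Adding a new task then means writing a new question, not training a new model, which is what made the single-model formulation so unusual to reviewers at the time.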

Bob: There's a theme here; I can see you're a little ahead of your time. I've learned with startups, right, you've got to make sure you're not too far ahead of the market.

Richard: You're absolutely right. That's the biggest transition from academic to startup founder. As an academic, if you're ahead of your time, you're a visionary. In a startup, if you're ahead of your time, your company might just be dead, so you need to be at the right time. And now we have a strong sales team, and we built agents before they were as popular as they are now. That's why our customers have already built over 90,000 agents on our platform to automate workflow processes on you.com. We were running several years ahead, so it's been working out really well in recent quarters.

Bob: Yeah, maybe for the audience here, that's a good point. You might give a description of you.com and its mission, relative to, I know Perplexity is out there and there's a whole bunch of different players.

Richard: Yeah, a lot of copycats out there now. So, we were the first to connect a search engine, which we already had and had been building for a few years, to large language models, so that those large language models had citations, had accurate answers, and could be up to date. No one can retrain an LLM every five minutes when something happens, right? You have to have that search.

A lot of people underappreciate how important that is. So we're now selling seat licenses as a company: you just go to you.com and sign up. Over 50% of Fortune 500 companies already have paying accounts on you.com, plus millions of users. But our sales team is now more and more focused on helping companies in the enterprise context search over all their internal data and web data and merge the two, both as APIs and as a UI on you.com.

Bob: Are you chasing more of the enterprise space or more of the consumer?

Richard: We're fully in the enterprise space. We realized that the consumer space is going to be dominated by Google. Most normal consumers don't have complex enough questions every day to have a big need to switch. If in your average week you ask, what's the weather tomorrow, is it going to rain, what's the score of the game, who's the president of France, what's the price of this stock, if those are your five questions that week, you don't really need to switch from Google. But if you're a knowledge worker with really complex needs, you're going to want to use you.com, because we're actually accurate and up to date.

Bob: So, you're focused on enterprise documents, specialized documents, where I need your help to help me make that document more accessible.

Richard: That document, or the entire database of petabytes of documents inside your company. We work with the DPA, the German press agency and the largest press agency in Europe. We work with insurance companies. We work with hedge funds. We also work with consumer companies that have hundreds of millions or billions of users, providing them with our APIs to get search results on the web, on news, and elsewhere, as well as full LLM answers.

Bob: So, you think the big guys, Google and Meta, are going to win the consumer business. You don't think the little startups are going to challenge those guys?

Richard: I mean, already now you see a lot of Google's challengers declining in traffic, and it's very hard to monetize. You have an early adopter crowd, but normal folks are seeing more and more AI answers on Google itself in the consumer space. I think the killer app for this is enterprise productivity. Understanding a lot of different types of context, bringing lots of different types of information, document sources, data lakes, and so on into one place, and then helping you automate certain workflows with really good answers. That's the best use case for LLMs.

Bob: Yeah. So, the other thing our audience is really curious about, since we have you here, and it's not very often we have an industry-leading global AI researcher, is this agentic AI. When I talk to a lot of my enterprise customers, they ask, what is the difference? How do I know this is agentic AI? I've looked at Andrew Ng's definition, so I'd be interested: what's your definition of agentic AI?

Richard: Yeah, so agentic AI is generally AI that can take actions. If you look one level lower, all the exciting models today are neural sequence models. They're large neural networks for which you have large training sets consisting of lots of sequences. That sequence is a list of amino acids in proteins for biology, a sequence of notes for music, a sequence of programming tokens in Python, or English language tokens for something like ChatGPT and you.com.

It doesn't really matter to these models what kind of sequence it is. And if that sequence includes the ability to take actions, like click on this button, then fill out this form with this text, and you have enough of those kinds of sequences, then you can train these models, either in environments or with feedback from humans, to basically take those actions for people. Now, I would differentiate between knowledge agents and action agents, just a definition I came up with to show what's reality right now and what will come very soon. Knowledge agents are here.

We have, like I said, over 90,000 agents created on you.com by our customers, and these knowledge agents will also take certain kinds of actions. They say, oh, I should make a search here on the internet and get more information. Or, maybe I should write code here and then even run that code to get you real numbers, not a bunch of fake statistics that the LLM might hallucinate, but real numbers based on your CSV files or your database. Those are knowledge agents, and they work, and they're here. We have beautiful customer quotes from folks saying, work that used to take me days now takes me hours; work that used to take me hours now takes me minutes. So that is a reality.

Now, the action agents that take irreversible actions, like booking a flight, are a lot of times just a little bit ahead of their time. They're not quite ready for prime time, in many cases because we haven't seen enough sequences for training. On the internet we can download billions and billions of sequences of English text, so we get very, very good at producing and understanding English text. We don't have billions and billions of sequences of people saying, oh, I go to Expedia.com and I click on these buttons, and then this dropdown menu, and then I think a little bit, and I check my email and my calendar and verify whether I can catch this flight, and so on. Because you don't have enough of those sequences, it's harder to automate these tasks as well as knowledge tasks.
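
A toy sketch of that knowledge-agent versus action-agent split: the model emits a stream of plain text and occasional action tokens, and the surrounding system executes reversible knowledge actions while holding irreversible ones for a human. The token format and tool names below are invented for illustration; real agent frameworks differ:

```python
from typing import Callable

# Reversible "knowledge" actions: safe to execute automatically.
REVERSIBLE_TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"<results for {q!r}>",
    "run_code": lambda src: f"<output of {src!r}>",
}
# Irreversible "action agent" territory: gated behind a human.
IRREVERSIBLE_TOOLS = {"book_flight"}

def step(token: str) -> str:
    """Handle one model output token: plain text passes through,
    reversible tool calls execute, irreversible ones are held."""
    if not token.startswith("ACTION:"):
        return token
    name, _, arg = token[len("ACTION:"):].partition(" ")
    if name in IRREVERSIBLE_TOOLS:
        return f"[held for human approval: {name} {arg}]"
    return REVERSIBLE_TOOLS[name](arg)

print(step("The revenue figures are:"))
print(step("ACTION:run_code sum(revenues)"))
print(step("ACTION:book_flight SFO->JFK"))
```

The design choice mirrors the point above: knowledge actions can be retried or discarded, so they can run unattended today, while irreversible actions still need a human in the loop.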

Bob: Now, when I've looked at these agents, it looks like they all have something in common: there's a prompt with context, and there's some large language model consuming that prompt, either to do knowledge or action. Is that the right high-level description of what an agent looks like?

Richard: Yeah, you need to prompt a lot of these agents well, and these agents often also get access to tools. In our case, tools are things like searching the web, writing code, and executing that code, and in the future, both on you.com and, I'm sure, many other places, you'll have even more actions. Now there's even a whole protocol, maybe relevant for the networking nerds: MCP, the Model Context Protocol. With it, these agents can more easily access other tools that help them do certain things, like book a flight, and they have very specific ways to communicate, which makes them even more powerful.

Bob: Now, in your mind, does each agent have its own specialized, trained LLM, or are they using a generic one? There's kind of what I call large foundational LLMs to solve problems, and then you have these smaller specialized language models to solve different problems.

Richard: Yeah, so a lot of times you just use one general-purpose large language model as the base model for an agent. You give that agent, in the prompt, access to certain tools and instructions on what it should do, and then the neural sequence model will produce English tokens and, from time to time, action tokens. So the big difference is that the prompt needs to really include the possibility of taking those actions, and then, of course, there's the system around the LLM.

A lot of times people think of the LLM as the whole thing, but you can more correctly think of the LLM as the CPU on the motherboard. There still needs to be some RAM, which is the context that you put into the prompt. You still need a hard drive, which is retrieval-augmented generation over your entire database, stuff that we do a lot of at you.com. You still need an Ethernet connection to the internet so you get information from the web. All of these things need to be connected to that central CPU that is the LLM.
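
The motherboard analogy can be sketched as a minimal retrieval-augmented generation step: a retriever (the "hard drive") selects documents, which get packed into the prompt context (the "RAM") handed to the LLM (the "CPU"). The keyword-overlap scoring below is a deliberately naive stand-in for real retrieval:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query
    and return the top k (a toy stand-in for a real retriever)."""
    qwords = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(qwords & set(d.lower().split())))
    return scored[:k]

def build_context(query: str, docs: list[str]) -> str:
    """Pack retrieved documents plus the question into one prompt,
    the 'RAM' that gets handed to the LLM 'CPU'."""
    chunks = retrieve(query, docs)
    return "\n".join(chunks) + f"\n\nQuestion: {query}"

docs = [
    "Q3 revenue grew 12% year over year.",
    "The cafeteria menu changes on Mondays.",
    "Revenue guidance for Q4 was raised.",
]
print(build_context("What was the revenue growth?", docs))
```

In a production system the retriever would use embeddings or a search index over the whole document store, but the wiring, retrieve then stuff into the prompt, is the same.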

Bob: Yeah, I often describe agentic AI to people as a new software programming paradigm: non-linear and non-deterministic, for dealing with large language models. It's non-linear because it looks like people are connecting these agents in graphs, and it's not totally predictable how they're going to traverse the graph. Where do you see us in the journey of leveraging agentic AI? What do you think, a year old? How old do you think agentic AI is right now?

Richard: Yeah, it depends on how you define it. At Salesforce, even in 2017 and 2018, we had the first agents that could pick up a phone and have a conversation, and they would give you answers and do certain things, like find a replacement package. So agents go way back, but thanks to large language models they've become much more general, and now you actually have to restrict them back. Sometimes you ask your car mechanic's agent to write you a Python script, and because it's a general-purpose LLM, it will write you a Python script. But should a car shop really write you Python scripts? It's actually overhired.

So, I think there are sort of three stages in general. One is just simple co-pilots, like autocomplete: it'll do a little bit of work, taking what would have been a few more minutes down to seconds. Now we're already starting to get into the second phase, where you take longer workflows and give them to an AI. Things that would have taken hours now take a few minutes. We see this with ARIA, our advanced research agent on you.com, for instance, where it will take five to 10, maybe even 15 minutes, but then it writes you a 20-page report. It's like hiring a consultant; in fact, a lot of consultants are our users.

And the third one, which we're not quite at yet, is when you actually hand off entire tasks to an AI coworker, and it will go out and maybe come back a few days later having done the work of months. To do that well, we have to learn to onboard these AI agents better, and, like you said, they're not deterministic. You're going to have to onboard them almost like people. Even if you took Einstein, one of the smartest people in the world, and said, all right, now go answer all these emails, he would have to ask, well, how do you want to answer this email? What are the choices I should consider? What are the trade-offs? So you need to onboard these agents more and more similarly to how you'd onboard a human.

Bob: Now here's a topic I'm going through myself right now. It's one thing when I'm dealing with public documents; I can pretty much send that data anywhere. But when I start dealing with enterprise customer data, where do you think the enterprises are? They trust Microsoft for their email, but are they going to trust Microsoft or OpenAI's ChatGPT with that data? Or do you think those large language models, or smaller versions of them, need to be brought in-house?

Richard: We in particular often deal with searching over the internet, and it's very hard to bring the entire internet in-house. So, a lot of the queries that come to us stay with us. They're not shared, there's no data sharing, there's no selling of your data the way there is when you use Google, and there's no advertising or any of that stuff. But it's hard to keep all of that within your own on-prem infrastructure, and so that is a big part.

And then, of course, you have to be SOC 2 compliant and think about single sign-on, enterprise security, and so on, which a lot of the players in the consumer space don't do as well. We really focus on that, so we're allowed and able to work with hedge funds and insurance companies. The NIH just became a customer of you.com. So we have some very complex use cases where the accuracy of the answers really matters. You don't want a rough, hallucinated, maybe-correct-maybe-not answer.

Bob: And where do you think we are on the journey of enterprises trusting AI? I deal with a lot of customers right now, and I think they're okay with a human in the loop helping them gather knowledge. But the action part seems to be the next phase of the adventure. I don't know if you've done the Waymo or Uber self-driving thing. People are fine jumping in that car; they trust it, I'm not sure why, to get them from A to B. But they're still a little hesitant to let an AI assistant touch their network or take action in their own network.

Richard: Right, yeah, it will take some time. I think people have to test but verify, and it doesn't help that there are a lot of quick prototypes out there that look, on the surface, like, oh, that's like you.com, why do I really need to pay for this? We actually have several customers who started with their own internal version of a you.com-like interface, because you can just slap a web interface together and connect it to an OpenAI API; it's not that hard. But then they realize it's like 70% accurate. You can cherry-pick five examples and say, look, it's perfect, but when people really use it, there's no adoption. They don't know how to use it; they don't know how to prompt.

So a big part of what we're doing at you.com for our bigger customers is giving role-specific training to all their employees, and we've seen a huge improvement in adoption of these tools after AI-powered, personalized training for each role. At the beginning of that training, the user says, this is my role, I'm a VP of back-end engineering, or I'm the CEO, and the training then becomes unique, with examples for that role. That is one of many steps you need to take to get people comfortable over time, and eventually there's going to be a lot of co-adaptation.

People are going to adapt, just like with early speech recognition. You wouldn't say, oh, just get me from here to there; you'd say, directions to, then the place, and try to speak very clearly. That kind of co-adaptation of humans and AI, we're going through a major new phase of it. Kids growing up now are already learning how to scream at their Alexa to make stuff happen, and it's a little bit scary sometimes.

Bob: Yeah, I kind of wonder, as we get into this new kind of non-linear, non-deterministic software programming, is it going to take a whole new generation of programmers to get used to this technique? Because I think you and I are used to linear programming: we write our function, we write a unit test, and we're done.

Richard: Yeah, we're also an investor in Windsurf at AIX Ventures, so we do think all of programming is going to be very different. I also invested in Marimo, which is a notebook tool; I don't know if you know Jupyter notebooks, which data scientists often program in. It's like Jupyter notebooks, but on steroids: your data gets loaded right away, you can spawn massive numbers of machines, everything is just done for you, and you can focus on the really important bits and get much more efficiency and productivity. We're going to see more and more of that.

I think right now programming is getting better and better. If you're an average or below-average programmer and you do a lot of things that are quite repetitive but still need a little bit of thinking, AI can already massively 10x you. If you're one of the best programmers in your field, then maybe AI is not yet as good, but even then you'll just say, write unit tests for this remaining stuff. And this is a general thing: any domain where you can verify and/or simulate, AI will dominate and eventually get better than humans. Programming you can verify. It's similar to math: you can let the AI explore a bunch of potential options and then verify every step, and so, eventually, AI is going to write proofs that humans have not yet come up with.
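
The "anything you can verify, AI will dominate" point can be made concrete with a tiny propose-and-verify loop: generate candidate programs and keep only those that pass a checker. The candidates below are hard-coded stand-ins for model outputs:

```python
# Hard-coded stand-ins for model-generated implementations of abs().
candidates = [
    lambda x: x,                    # wrong for negative inputs
    lambda x: -x,                   # wrong for positive inputs
    lambda x: x if x >= 0 else -x,  # correct
]

def verify(fn) -> bool:
    """A unit-test-style checker: the verifiable signal that lets
    search separate correct programs from wrong ones."""
    cases = [(-3, 3), (0, 0), (5, 5)]
    try:
        return all(fn(x) == y for x, y in cases)
    except Exception:
        return False

# Only candidates that pass every check survive.
passing = [fn for fn in candidates if verify(fn)]
print(len(passing))    # → 1
print(passing[0](-7))  # → 7
```

The same shape, propose many options and keep what the verifier accepts, is what makes programming and math such friendly domains for AI compared with tasks where no cheap check exists.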

Bob: Okay, Richard, are you a wine connoisseur?

Richard: No.

Bob: Okay, because I think this could be an all-day session over some wine tasting. Maybe to wrap up this episode: I think you were quoted last year, in 2024, as saying that we were five years away from superhuman AI. Any changes in your prediction, or how close are we to this general Terminator?

Richard: So, I think there's AGI, and then we have to separate that from superintelligence. Depending on how you define AGI, we've either already been there for a while, or we're going to get there in three years or so. There are lots of different definitions, but basically you want to say that it's as general as a human: you can start onboarding it, it will do many tasks for you with reasonable feedback back and forth, and it can generally solve lots of different kinds of problems the way a human would. Then we have AGI.

I think the next level, which I and a lot of other folks are thinking about now, is superintelligence: AI that is better than not just one human, which is not that hard. Most humans can't translate 50 languages, right? And just like a calculator could do better multiplication than humans, AI is already superhuman in that aspect. Superintelligence is being better than all humans working together on a task, and I think we're going to make an immense amount of progress toward that.

I think, just like with AI, the goalposts will continuously shift. Once we solve a problem, it's just, oh, that's just speech recognition, that's just Siri on your phone, that's just a chess app on your phone. All of those used to be the cutting edge, the thing where, if we got there, we'd have real AI. We'll see that moving goalpost several times with superintelligence too. But again, anything you can simulate, you can verify, and there AI will make a huge amount of progress in the next few years.

Bob: The Turing test. Do you think anyone's passed the Turing test yet?

Richard: The Turing test has already clearly been won; even before LLMs, there were cases where it was won. Now, I think the Turing test is best run in reverse: you ask questions that are so hard a human couldn't answer them, and that's how you know it's an AI. Like, hey, write me this Python script that does this really complex thing with sorting lists, and if 10 seconds later you have a perfect answer, you know that was not a human. Humans can't type that quickly.

Bob: You get the AI to look a little dumber.

Richard: Exactly, that's how much things have changed.

Bob: Okay, well, Richard, it's been a pleasure. I cannot tell you how much of a pleasure it's been to have you here today.

Richard: Thank you, thanks for having me.

Bob: And thank you everyone for joining us and look forward to seeing you on the next episode of Q&AI.
