Project CAIRaoke could help Meta, formerly Facebook, build better assistants in AR glasses and VR headsets.
Chatting with an AI assistant can feel frustrating, especially when the conversation doesn't flow naturally or the assistant doesn't understand a command.
Meta, formerly Facebook, has been working on a solution. On Wednesday, the social media giant unveiled a new project that aims to improve AI assistants. Called Project CAIRaoke, the effort could also help Meta build better assistants in augmented reality glasses and virtual reality headsets.
The tech giant has been betting big on the metaverse, virtual spaces where people will be able to work, socialize and shop. AR and VR are both crucial parts of Meta's vision of the future, even though the metaverse is still largely hypothetical. This also isn't the first time the social media giant has pushed people to join immersive online spaces, and it's unclear if the hype around the metaverse will eventually fizzle out. Building the metaverse also comes with the same problems Meta has struggled to combat on social media, including privacy and harassment. Still, Project CAIRaoke is just one example of how Meta is trying to design the technology needed to build the metaverse.
Meta CEO and co-founder Mark Zuckerberg views the metaverse as the successor to the mobile internet. Instead of staring at a screen, people will feel like they're present with another person in a virtual space, he says.
"That's going to require advances across a whole range of areas, from new hardware devices to software to building and exploring worlds. And the key to unlocking these advances is AI," Zuckerberg said at a virtual event Meta held on Wednesday.
Zuckerberg showcased how assistants could work in a virtual world through a demonstration of an AI concept called Builder Bot.
In the demo, an avatar of Zuckerberg asks Builder Bot to create a scene of a park but changes his mind and asks to go to a virtual beach instead. He then asks the bot to add clouds, an island and even a hydrofoil, something Zuckerberg has been photographed riding in the physical world.
"As we advance this technology further you'll be able to create nuanced worlds to explore and share experiences with others with just your voice," he said.
In the future, Meta says, models created with Project CAIRaoke will make it possible for people to refer back to an earlier conversation, change topics or even use gestures without confusing an AI assistant. One day, a person wearing a pair of AR glasses will be able to ask an assistant, "What goes with these pants?" The assistant will then be able to respond, "Here's a shirt in your favorite color, red."
"It can see what you see from your first-person perspective, hear what you hear and, most importantly, understand the context of the situations you are in," said Alborz Geramifard, a senior research manager at Facebook AI.
Creating smarter assistants could raise privacy concerns, an issue Meta said it's keeping in mind as it builds new products. Facebook, though, has a poor track record when it comes to protecting user privacy and has been plagued by scandals involving allegations that the company collects data without a user's consent. Meta said it's trying to give users more details about how its AI system works.
In a video, Meta describes Project CAIRaoke as an end-to-end AI model that combines the four models (natural language understanding, dialog state tracking, dialog policy management and natural language generation) typically used in assistants today. "With our new approach, dialogs are much more robust because they're able to make decisions by looking at the full range of information in a single place," Geramifard wrote in a blog post about the project. While some assistants are programmed to look for certain words and phrases, Project CAIRaoke is meant to better understand context and to recognize different phrases used to say the same thing.
Meta has already used the model created in Project CAIRaoke to improve its Portal video chat device. If you ask Portal to set a reminder for 6:30 and the assistant asks you to clarify whether that's in the morning or night, you don't need to repeat the entire command.
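For readers curious how the conventional four-stage design handles that reminder exchange, here is a toy sketch in Python. It is not Meta's code and does not represent Project CAIRaoke, which Meta describes as a single end-to-end model replacing these separate stages. Every name in it (DialogState, understand, track_state, choose_action, generate, respond) is hypothetical, and the keyword matching is exactly the kind of brittle word-spotting the article says newer models aim to move beyond.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DialogState:
    """What the assistant remembers between turns (hypothetical structure)."""
    intent: Optional[str] = None
    slots: dict = field(default_factory=dict)


def understand(utterance: str) -> dict:
    """Stage 1, natural language understanding: extract intent and details."""
    text = utterance.lower()
    parsed = {}
    if "remind" in text:
        parsed["intent"] = "set_reminder"
    time_match = re.search(r"\b(\d{1,2}:\d{2})\b", text)
    if time_match:
        parsed["time"] = time_match.group(1)
    if "morning" in text:
        parsed["period"] = "in the morning"
    elif "night" in text:
        parsed["period"] = "at night"
    return parsed


def track_state(state: DialogState, parsed: dict) -> DialogState:
    """Stage 2, dialog state tracking: merge new details into what is known."""
    if "intent" in parsed:
        state.intent = parsed["intent"]
    for key in ("time", "period"):
        if key in parsed:
            state.slots[key] = parsed[key]
    return state


def choose_action(state: DialogState) -> str:
    """Stage 3, dialog policy: decide what to do next given the state."""
    if state.intent != "set_reminder":
        return "fallback"
    if "time" not in state.slots:
        return "ask_time"
    if "period" not in state.slots:
        return "ask_period"
    return "confirm"


def generate(action: str, state: DialogState) -> str:
    """Stage 4, natural language generation: turn the action into a reply."""
    if action == "ask_time":
        return "What time?"
    if action == "ask_period":
        return "In the morning or at night?"
    if action == "confirm":
        return f"Reminder set for {state.slots['time']} {state.slots['period']}."
    return "Sorry, I didn't catch that."


def respond(state: DialogState, utterance: str) -> str:
    """Run one user turn through all four stages in order."""
    track_state(state, understand(utterance))
    return generate(choose_action(state), state)


if __name__ == "__main__":
    state = DialogState()
    print(respond(state, "Set a reminder for 6:30"))
    print(respond(state, "Morning"))
```

Because the state object persists between turns, the one-word follow-up "Morning" is enough to complete the request. In this sketch each stage only sees what the previous stage hands it, which is the limitation Geramifard's quote about "looking at the full range of information in a single place" points to.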
Building better assistants isn't the only project Meta has been working on. The company announced two projects that could help people access online content in their own language. One project includes creating a translation system that can learn every language, and another involves building a universal speech translator.
Meta says that improving translations is an important part of building the metaverse because people in these virtual spaces will be socializing with others who speak different languages and reside in different countries.
"As we think about creating and building towards the metaverse collectively, we need to prioritize everyone being able to access new technology, which requires translation for billions of people around the world," said Angela Fan, a research scientist at Facebook AI Research Paris. Fan said Meta aims to improve translation so English is no longer the default language.
Meta estimates that around 2 billion people, or about 25% of the world's population, speak a language that doesn't have a translation system available.