Meta's Ray-Ban Glasses Added AI That Can See What You're Seeing
I tried these Ray-Ban AI glasses, and they're pretty wild. Here's how the experimental feature works.
Scott Stein, Editor at Large
"Hey, Meta. Look at this and tell me which of these teas is caffeine-free."
I spoke these words as I wore a pair of Meta Ray-Bans at the tech giant's New York headquarters. I was staring at a table with four tea packets, their caffeine labels blacked out with a marker. A little click sound in my ears was followed by Meta's AI voice telling me that the chamomile tea was likely caffeine-free. It was reading the labels and making judgments using generative AI.
I was testing a feature that's rolling out to Meta's second-generation Ray-Ban glasses starting today -- a feature that Meta CEO Mark Zuckerberg had already promised in September when the new glasses were announced. The AI features, which can use the glasses' cameras to capture images and interpret them with generative AI, were originally slated to launch in 2024. Meta has introduced them more quickly than I expected, although the early-access mode is still very much a beta. The same update also adds Bing-powered search to the glasses' already-available voice capabilities, so Meta's Ray-Bans are gaining new abilities fast.
The demo wowed me because I had never seen anything quite like it -- or rather, I had only seen pieces of it. Google Lens and other on-phone tools already use cameras and AI together, and Google Glass had some translation tools a decade ago. That said, the easy, hands-free way Meta's glasses invoke AI to identify things in the world around me feels pretty advanced. I'm excited to try it a lot more.
Multimodal AI: How it works right now
The feature has limits right now. It can only recognize what you see by taking a photo, which the AI then analyzes. You can hear the shutter snap after making a voice request, and there's a pause of a few seconds before a response comes in. The voice prompts are also wordy: Every voice request on the Meta glasses needs to start with "Hey, Meta," and then you need to follow it with "look and" (which I originally thought needed to be "Hey, Meta, look at this") to trigger the photo-taking, immediately followed with whatever you want the AI to do. "Hey, Meta, look and tell me a recipe with these ingredients." "Hey, Meta, look and make a funny caption." "Hey, Meta, look and tell me what plant this is."
Each request triggers a shutter snap, and then a few-second pause while the AI reads the image and interprets it. It's similar to how a phone-based AI camera app might work, except on your face and voice controlled.
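To make the flow above concrete, here's a minimal sketch of how a prompt-triggered capture pipeline like this could work. All the names here are hypothetical stand-ins -- Meta hasn't published how its glasses actually implement this -- but the logic mirrors the described behavior: a wake phrase, a "look" trigger that fires the camera, and the photo plus the rest of the request handed to a multimodal model.

```python
WAKE_PHRASE = "hey, meta"      # assistant's wake phrase
CAPTURE_TRIGGER = "look"       # word that triggers a photo capture

def handle_voice_request(transcript, capture_photo, analyze_image):
    """Hypothetical sketch: parse a voice request, snap a photo when
    the 'look' trigger is present, and send photo + instruction to
    a multimodal model. capture_photo and analyze_image are stand-ins
    for the camera and the AI backend."""
    text = transcript.lower().strip()
    if not text.startswith(WAKE_PHRASE):
        return None  # not addressed to the assistant
    rest = text[len(WAKE_PHRASE):].lstrip(" ,")
    if not rest.startswith(CAPTURE_TRIGGER):
        return None  # a plain voice command; no camera involved
    instruction = rest[len(CAPTURE_TRIGGER):].strip()
    if instruction.startswith("and "):
        instruction = instruction[len("and "):]
    photo = capture_photo()                    # the shutter snap
    return analyze_image(photo, instruction)   # the few-second pause

# Toy stand-ins for the camera and the model:
answer = handle_voice_request(
    "Hey, Meta, look and tell me what plant this is",
    capture_photo=lambda: b"jpeg-bytes",
    analyze_image=lambda img, q: f"analyzing photo for: {q}",
)
```

This also shows why the prompts feel wordy: the trigger word has to be parsed out of the transcript before the camera ever fires, so every camera request carries the full "Hey, Meta, look and..." preamble.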
Every AI response, and the photo it looked at, is stored in the Meta View phone app that pairs with the glasses. I like this, because it's a visual/written record for later, like memory-jogging notes. I could see wandering somewhere and asking questions, using this as some form of head-worn Google search for my eyes, while shopping or who knows what.
It could also serve assistive purposes. I wore a test pair of Meta glasses that didn't have my prescription, and I asked the AI what I was looking at. Answers vary in detail and accuracy, but they can give a heads-up about what's in front of you. It knew I was showing it my glasses, which it said had bluish-tinted lenses (blue-black frame, so pretty close).
Sometimes the glasses can hallucinate. I asked about fruit in a bowl in front of me, and it said there were oranges, bananas, dragonfruit, apples and pomegranates. It was correct, except for the pomegranates. (There were none of those.) When I was asked to have it caption a big stuffed panda in front of a window, it came up with some cute lines, but one was about someone being lonely and looking at a phone, which didn't match the scene.
I looked at a menu in Spanish and asked the glasses to show me spicy dishes. It read off some dishes and translated key ingredients for me, but I asked again about dishes with meat and it read everything back in Spanish.
The possibilities are wild and fascinating, and possibly incredibly useful. Meta admits that this early launch will be about discovering bugs and helping evolve the way the on-glasses AI works. I found there were too many "Hey, Meta, look" moments, but that process might change over time. When you're in the middle of an image analysis, direct follow-up questions can work without saying "look" again, though I'm sure results will vary.
The future of wearable AI is getting interesting
This AI, which Meta calls "multimodal AI" because it combines camera input and voice chat, is a precursor to future AI that the company plans to build -- one that mixes many more forms of input, including additional sensory data. Qualcomm's AI-focused chipset on Meta's new Ray-Bans already seems ready to take on more. It's also a process that Meta plans to make more seamless over time.
Meta CTO Andrew Bosworth told me in September that while the glasses now need a voice prompt to activate and "see" so that they don't burn through battery life, eventually they'll "have sensors that are low power enough that they're able to detect an event that triggers an awareness that triggers the AI. That's really the dream we're working towards." Meta is also already researching AI tools that blend multiple forms of sensory data together, ahead of more advanced future wearables.
Right now, know that this is an early-access beta. Meta is using anonymized query data to help improve its AI services during this phase, which may concern people wanting more privacy. I don't know the specific opt-in details yet, but more discrete controls over sharing data look like they may be in place once the final AI features launch, likely next year. The early-access opt-in isn't even available to everyone yet, but according to Meta, more people should get the option over time.
This all reminds me of what Humane is aiming for with its wearable AI Pin, a device I haven't even seen in person yet. While Humane's product is expensive and needs to be worn on clothing, Meta's glasses are $300 and already on store shelves. As watches, VR headsets and smart glasses all evolve their AI capabilities, things could get very different for the future of wearable tech and its level of assistive awareness.
It's clear that a new frontier of wearable AI products is already forming, and Meta's glasses are getting there first.
Editors' note: This story was updated on 12/13 to clarify how the voice prompts work and make a correction. We said "take a look at this" in our original story, but the actual working phrases are "look" or "look at this."
Editors' note: CNET is using an AI engine to help create some stories. For more, see this post.