Speaker 1: End up powering a new set of transformative experiences. Today we have some exciting new progress to share about the future of AI assistants that we're calling Project Astra.
Speaker 1: For a long time we've wanted to build a universal AI agent that can be truly helpful in everyday life. Our work making this vision a reality goes back many years. It's why we made Gemini multimodal from the very beginning. [00:00:30] An agent like this has to understand and respond to our complex and dynamic world just like we do. It would need to take in and remember what it sees so it can understand context and take action, and it would have to be proactive, teachable, and personal so you can talk to it naturally without lag or delay. While we've made some great strides in developing AI systems that can understand multimodal information, getting response time down to something conversational is a difficult engineering challenge. [00:01:00] Building on our Gemini model, we've developed agents that can process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this for efficient recall. We've also enhanced how they sound with a wider range of intonations. These agents better understand the context you're in and can respond quickly in conversation, making the pace and quality of interaction feel much more natural. Here's [00:01:30] a video of our prototype, which you'll see has two parts. Each part was captured in a single take in real time.
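[Editor's note: the idea of merging video and speech input into a timeline of events and caching it for recall can be illustrated with a minimal sketch. This is not the Project Astra implementation, which the talk does not describe in detail. The `Event` and `Timeline` names, the text strings standing in for encoded frames, and the keyword-based recall are all assumptions made for illustration.]

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    t: float        # seconds since the session started
    modality: str   # "video" or "speech"
    content: str    # stand-in for an encoded frame or a speech transcript


class Timeline:
    """Hypothetical bounded, time-ordered cache of video and speech events."""

    def __init__(self, max_events: int = 1000):
        # A deque with maxlen evicts the oldest event once the cache is full.
        self._events = deque(maxlen=max_events)

    def add(self, event: Event) -> None:
        # Assumes events arrive in chronological order.
        self._events.append(event)

    def recall(self, keyword: str):
        """Return the most recent event mentioning keyword, or None."""
        for event in reversed(self._events):
            if keyword.lower() in event.content.lower():
                return event
        return None


timeline = Timeline()
timeline.add(Event(12.0, "video", "desk with glasses next to a red apple"))
timeline.add(Event(20.5, "speech", "what neighborhood do you think I'm in?"))
timeline.add(Event(21.0, "video", "street view through a window"))

# Later question: "Do you remember where you saw my glasses?"
hit = timeline.recall("glasses")
```

[In this sketch, recall walks the cached timeline from newest to oldest, so a later question can be answered from an earlier video event without reprocessing any frames.]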
Speaker 2: Okay, let's do some tests. Tell me when you see something that makes sound.
Speaker 3: I see a speaker, which makes sound.
Speaker 2: What is that part of the speaker called?
Speaker 3: That is the tweeter. It produces high-frequency sounds.
Speaker 2: [00:02:00] Give me a creative alliteration about these.
Speaker 3: Creative crayons color cheerfully. They certainly craft colorful creations.
Speaker 2: What does that part of the code do?
Speaker 3: This code defines encryption and decryption functions. It seems to use AES-CBC encryption to encode [00:02:30] and decode data based on a key and an initialization vector (IV).
Speaker 4: That's right.
Speaker 2: What neighborhood do you think I'm in?
Speaker 3: This appears to be the King's Cross area of London. It is known for its railway station and transportation connections.
Speaker 2: Do you remember where you saw my glasses?
Speaker 3: Yes, I do. Your glasses were on the desk near a red [00:03:00] apple.
Speaker 2: What can I add here to make this system faster?
Speaker 5: Adding a cache between the server and database could improve speed.
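[Editor's note: the suggestion of a cache between the server and the database can be sketched as a read-through cache. The system shown in the demo is not specified, so `SlowDatabase`, `ReadThroughCache`, the key format, and the 60-second expiry are all hypothetical choices made for illustration.]

```python
import time


class SlowDatabase:
    """Hypothetical stand-in for a real database; counts how often it is queried."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.queries = 0

    def get(self, key):
        self.queries += 1
        return self._rows.get(key)


class ReadThroughCache:
    """Serves repeated reads from memory and falls back to the database on a miss."""

    def __init__(self, db, ttl_seconds=60.0, clock=time.monotonic):
        self._db = db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no database query
        value = self._db.get(key)  # miss or expired entry: query the database
        self._entries[key] = (value, now + self._ttl)
        return value


db = SlowDatabase({"user:1": "Ada"})
cache = ReadThroughCache(db)

first = cache.get("user:1")   # miss: goes to the database
second = cache.get("user:1")  # hit: served from memory
```

[Here the second read never reaches the database, which is where the speed-up would come from. The expiry time bounds how stale a cached value can become.]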
Speaker 2: What does this remind you of?
Speaker 5: Schrödinger's [00:03:30] cat.
Speaker 2: Alright. Give me a band name for this duo.
Speaker 5: Golden Stripes.
Speaker 2: Nice. Thanks, Gemini.