Google Teases an Even Better Version of Gemini, but You'll Have to Wait for It

No, Gemini 1.0 Ultra, announced just last week, isn't obsolete. At least not yet.

Lisa Lacy Lead AI Writer
Lisa joined CNET after more than 20 years as a reporter and editor. Career highlights include a 2020 story about problematic brand mascots, which preceded historic name changes, and going viral in 2021 after daring to ask, "Why are cans of cranberry sauce labeled upside-down?" She has interviewed celebrities like Serena Williams, Brian Cox and Tracee Ellis Ross. Anna Kendrick said her name sounds like a character from Beverly Hills, 90210. Rick Astley asked if she knew what Rickrolling was. She lives outside Atlanta with her son, two golden retrievers and two cats.
Expertise Technology, AI, Advertising, Retail
A person accesses Gemini Advanced on a mobile device.
Photo Illustration by Pavlo Gonchar/SOPA Images/LightRocket via Getty Images

Another day, another generative AI update.

Google's AI subsidiary DeepMind has previewed Gemini 1.5 Pro, an upgraded model for Gemini, the Google chatbot formerly known as Bard. Gemini got its new name less than a week ago along with the release of the premium, paid version, Ultra, which Google called "our largest and most capable state-of-the-art AI model."

Gemini 1.5 Pro is the latest evolution of Google's chatbot, which also recently gained the ability to generate images from text.

Gemini 1.5 Pro can ingest video, images, audio and text in order to answer questions, and it boasts multiple advantages over its predecessors -- but most of us can't get our hands on it yet. In a call with press on Wednesday, DeepMind announced it's giving access to developers and enterprise customers first.  

Oriol Vinyals, vice president of research at Google DeepMind and co-lead of Gemini, called this a "research release" for "an audience that understands the technology really well."

"When you create a new model -- and especially when we unlock some new capabilities -- I think it makes sense to see what creative minds … can do with the model to understand what will this model [do], how will this matter to users ultimately?" Vinyals added.

DeepMind will "roll it out slowly" to regular Joes and Janes via a wait list.

The limited release of Gemini 1.5 Pro comes amid a flurry of activity in a sector projected to reach $1.3 trillion in revenue by 2032. Meanwhile, ChatGPT maker OpenAI has released its GPT-4 Turbo large language model and allows anyone to create custom AI apps for its app store. Microsoft intends to add a dedicated key on Windows 11 laptops and PCs to launch its AI tool, Copilot.

Better performance, new architecture and a longer context window

Gemini 1.5 Pro is "as capable as" the Gemini 1.0 Ultra model, which Google announced on Feb. 8. The 1.5 Pro model has a win rate -- the share of benchmarks on which it outperforms the model it is compared with -- of 87% against 1.0 Pro and 55% against 1.0 Ultra. So 1.5 Pro is essentially an upgraded version of the best model available now.
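For readers who want to see how a win rate like this is tallied, here is a minimal Python sketch. It assumes a win means scoring strictly higher on a benchmark both models were tested on. The benchmark names and scores are hypothetical placeholders, not Google's results, and the function name `win_rate` is ours.

```python
def win_rate(new_scores, old_scores):
    """Share of shared benchmarks on which the new model scores strictly higher."""
    shared = [name for name in new_scores if name in old_scores]
    if not shared:
        raise ValueError("no benchmarks in common")
    wins = sum(1 for name in shared if new_scores[name] > old_scores[name])
    return wins / len(shared)


# Hypothetical scores (higher is better); these are NOT real Gemini numbers.
new_model = {"bench_a": 71.0, "bench_b": 64.5, "bench_c": 80.2, "bench_d": 55.0}
old_model = {"bench_a": 69.3, "bench_b": 66.0, "bench_c": 78.9, "bench_d": 50.1}

rate = win_rate(new_model, old_model)
print(f"win rate: {rate:.0%}")
```

A win rate near 50%, like the 55% against 1.0 Ultra, means the two models trade wins about evenly. A figure like 87% means one model leads on most tests.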

Research advances quickly. "We get these sorts of breakthroughs and new model versions every few months or so," Vinyals said. "When a model is more capable, what you would then try to [do is] make sure it basically can be a drop-in replacement from the previous generation in the sense that it's more capable, so hopefully it can do what it was doing already, but better."

According to Vinyals, 1.5 Pro is also "very efficient" thanks to its architecture, which answers a question by zeroing in on the "experts" in that particular subject rather than seeking the answer from every possible source.
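Google's own announcement describes this design as a mixture-of-experts architecture, in which only the most relevant "expert" sub-networks are activated for a given input. The toy Python sketch below illustrates the general routing idea only. The experts, gate scores and the `route` function are hypothetical stand-ins, and none of it reflects how Gemini is actually built.

```python
import math


def softmax(values):
    """Turn raw scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_scores, experts, x, top_k=2):
    """Run only the top_k highest-scoring experts and blend their outputs."""
    ranked = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([gate_scores[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen


# Hypothetical "experts": tiny functions standing in for sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
# Hypothetical gate scores for one input; in a real model a learned layer produces these.
gate_scores = [0.1, 2.0, 2.0, -1.0]

output, chosen = route(gate_scores, experts, 3.0)
print(chosen, output)
```

The efficiency comes from skipping work: with `top_k=2`, two of the four experts never run for this input.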

Finally, 1.5 Pro has a long context window, which means it can ingest up to 1 million tokens at once. That is roughly equal to 1 hour of video, 11 hours of audio, 30,000 lines of code or 700,000 words.
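Those figures imply about 0.7 words per token, which allows a back-of-the-envelope check of whether a document would fit. The Python sketch below assumes that ratio holds on average. Real tokenizers vary by language and content, so treat the result as a rough estimate. The function names are ours, not part of any Google API.

```python
CONTEXT_TOKENS = 1_000_000   # context limit cited in the article
WORDS_PER_WINDOW = 700_000   # the article's word equivalent for that limit


def estimate_tokens(text):
    """Rough token estimate from a whitespace word count (assumes ~0.7 words per token)."""
    words = len(text.split())
    # Integer ceiling division avoids floating-point rounding surprises.
    return (words * CONTEXT_TOKENS + WORDS_PER_WINDOW - 1) // WORDS_PER_WINDOW


def fits_in_context(text, limit=CONTEXT_TOKENS):
    """True if the estimated token count is within the context limit."""
    return estimate_tokens(text) <= limit


sample = "one two three four five six seven"
print(estimate_tokens(sample), fits_in_context(sample))
```

By this estimate, a 90,000-word novel would use about 129,000 tokens, a small fraction of the window.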

"The longer and the more complex questions and your interactions are, the longer the context gets that the model needs to be able to deal with," Vinyals said. 

Gemini 1.5 Pro in action

So, for example, you can feed 1.5 Pro the Apollo 11 transcript and ask the AI to find funny moments. Or you can share a rudimentary drawing and ask the model to find the moment the sketch depicts, such as Neil Armstrong's "One small step for man" quote.

"That's an example of uploading a very long document that you might not have had the time to read and then really interact with it in this very interesting way," Vinyals said.

Gemini 1.5 Pro users can also ask the model to find specific moments within a video, including silent Buster Keaton films, via text and image.

Finally, Gemini 1.5 Pro can "operate in many languages," including Spanish.

After being fed a grammar book and a dictionary of Kalamang, a language from western New Guinea with fewer than 200 speakers, 1.5 Pro was able to translate a sentence to English.

"It takes a few seconds to process and voila, you now have an expert of this language," Vinyals said.

But Gemini 1.5 Pro is still subject to common challenges like hallucinations.

"The model will sometimes fail and it's a work in progress for the whole community to get these models better," Vinyals said. "But, of course, they're incredibly useful and you just need to understand their limitations."

What about Ultra 1.0?

Less than a week ago, Google announced Gemini Advanced, a new "experience" that provides access to its Ultra 1.0 AI model for consumers willing to pay $20 a month.

Does that mean Google just made Ultra 1.0 obsolete? Not according to Vinyals. "We are some time away from getting 1.5 Pro out."