Microsoft's Bing doesn't have an equivalent to Apple Siri or Google Now, but that doesn't stop company officials from downplaying the value of the two most popular digital personal assistants.
"Siri and Google Now have a fairly shallow understanding of the world," said Stefan Weitz, senior director for Bing. The voice-controlled, interactive services are good for finding restaurants and movies and for driving directions, Weitz explained, but if you want to buy a new hard drive or figure out the circumference of Lake Erie, they start to fail.
Apple's Siri draws on apps and dozens of data sources, such as Yelp and Wolfram Alpha, to answer basic queries and perform tasks, such as scheduling a meeting or sending a tweet. Google Now goes a step further than Siri. With permission, it scans your e-mail, calendar, and other activities to anticipate your information needs, such as telling you how long the drive home from work will take or alerting you to changes in your airline reservation.
Weitz seems content to hold back Microsoft's version of a digital assistant and let Apple and Google have the field to themselves for now. At some future date, he expects Microsoft to leapfrog Siri and Google Now.
"We have had internal debates about when to ship something. We could come out with something now like them, but it wouldn't be state of the art. It's too constrained to be an agent now," Weitz said. "We are not shipping until we have something more revolutionary than evolutionary."
For Weitz, revolutionary is trying to "recreate the physical planet inside of Bing," mapping every square inch of it, creating a semantic model inclusive of everything from a Kleenex box and Justin Bieber to the Empire State Building and Monrovia.
At the core of Microsoft's work to create a state-of-the-art Bing digital assistant is Satori, a knowledge repository of more than a billion objects digested in the past 3.5 years, Weitz said. Like Google's Knowledge Graph, Satori catalogs entities and the associated data and relationships among them. They both crawl the Web and utilize existing data sources, such as Freebase (owned by Google) and Wikipedia, to build their repositories of semantically rich encoded information that can help answer questions in milliseconds, rather than deliver a page of links.
For example, Satori has cataloged 1.8 million bottles of wine. "With a bottle of wine, you have any number of characteristics -- color, vintage, grapes, amount of rain in the region. There could be 10,000 characteristics, and they are all over the place on the Web," Weitz said. Satori reassembles the disparate pieces in a central location.
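To make the idea of reassembling scattered characteristics concrete, here is a minimal sketch in Python. It is not Satori's design, which Microsoft has not published in this detail. The `EntityStore` class, its method names, the entity identifiers, and the "first source wins" merge rule are all assumptions made for illustration.

```python
from collections import defaultdict


class EntityStore:
    """Toy entity catalog (hypothetical; not Microsoft's actual Satori API)."""

    def __init__(self):
        # entity id -> {attribute name: (value, source it came from)}
        self._attributes = defaultdict(dict)
        # entity id -> set of (predicate, other entity id)
        self._relations = defaultdict(set)

    def add_facts(self, entity_id, facts, source):
        """Merge facts found at one source into the entity's central record.

        Assumed merge rule: the first source to supply an attribute wins.
        """
        record = self._attributes[entity_id]
        for name, value in facts.items():
            record.setdefault(name, (value, source))

    def relate(self, subject, predicate, obj):
        """Record a relationship between two entities."""
        self._relations[subject].add((predicate, obj))

    def describe(self, entity_id):
        """Return the merged attributes for an entity, without provenance."""
        record = self._attributes.get(entity_id, {})
        return {name: value for name, (value, _source) in record.items()}

    def related(self, subject, predicate):
        """Return the entities linked to subject by predicate, sorted."""
        links = self._relations.get(subject, set())
        return sorted(obj for pred, obj in links if pred == predicate)


store = EntityStore()
# Two made-up Web sources each know part of the picture for one bottle.
store.add_facts("wine:example-2009", {"color": "red", "vintage": 2009},
                source="retailer-page")
store.add_facts("wine:example-2009", {"grapes": "merlot", "color": "ruby"},
                source="review-site")
store.relate("wine:example-2009", "produced_in", "region:example-valley")

print(store.describe("wine:example-2009"))
print(store.related("wine:example-2009", "produced_in"))
```

Because the pieces end up in one record, a question about the bottle becomes a single lookup instead of a search across pages. A real system would also need to resolve conflicting values and decide which sources to trust, which this sketch sidesteps.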
"Satori is a self-learning system that is running every day and learning more, adding 28,000 DVDs of content every day," he said. "It's mind-blowing how much data we have captured over the last couple of years. The line would extend to Venus and you would still have 7 trillion pixels left over."
Satori's computation engine is supported by more than 50,000 nodes in a Microsoft compute cloud.
State of the art for Weitz is being able to perform complex "path chaining" scenarios, such as planning a vacation. As if in conversation with a travel agent or a knowledgeable friend, the digital assistant should be able to answer queries in real time, anticipate different paths or branches the conversation could take, and understand the concepts, context, and variables -- such as weather, seat preference, hotel availability, and exchange rate -- to complete a transaction.
In addition, with natural user interfaces and augmented reality, users could gesture or point to an object, such as a painting or building, and the personal assistant would say or display relevant information about the object. Moreover, in anticipation of next questions, it would have the probable answers pre-cached.
"We want to get you deeper into the conversation, not through rule sets but because we understand the thing you are talking about it. For that, there is bunch of work around predictive models for conversation," Weitz said.
Bing search and Windows already are using Satori's knowledge repository. Bing presents at-a-glance answers, "snapshots" about the people, places, and things in search results. It also "autosuggests" as users type to help disambiguate queries and get to answers more quickly. Google Search offers similar features using its Knowledge Graph.
Local Scout in Windows Phone 8 offers personalized recommendations for coupons, restaurants, music, and other entities, based on a user's location, prior searches, and recommendations from Facebook friends.
At Build 2013 last month, Microsoft announced its Entity API, which will enable developers to tap into Satori and build search into their apps with voice, optical character recognition, and language translation. However, the Entity API only works with Windows 8, 8.1, and Xbox One.
Perhaps Microsoft's go-slow approach reflects the lessons of Bob, a virtual assistant introduced in 1995 to help users navigate Windows and perform tasks with Microsoft applications. Microsoft Bob wasn't able to provide sufficient utility to users and lived a brief, much-ridiculed life, disappearing from the world in early 1996.
"Microsoft Bob didn't have ubiquitous connectivity or access to the world of knowledge; it couldn't get smarter," Weitz said. "Now we have the computing power and the digital footprints people and objects are leaving, and an amazing level of context. It's the first time we have all the necessary ingredients to truly build agents that really work.
"You will see some things now in our products and more in the future. There are teams working hot and heavy on this right now," he said.
In the meantime, Apple's Siri and Google Now, which are integrated into the vast majority of mobile devices sold, are continuing to learn more about you and the world, and becoming more conversational. Even a revolutionary Bing agent -- and it's not clear how Microsoft could outsmart Lord Google in search -- will have a hard time leapfrogging Siri and Google Now if iOS and Android continue to rule the mobile world.