

Chatty app from Facebook helps the sight-impaired 'see'

The Visual Q&A app answers questions about a photo and is just one of the artificial intelligence advances being developed by Facebook.

Mike Schroepfer discusses recent developments by the Facebook AI Research team.

Stephen Shankland/CNET

DUBLIN -- Facebook could soon tell you what you're looking at. The social network has developed an app that invites those with impaired vision to ask questions about photos and have the answers read to them.

Speaking here today at the annual Web Summit Internet conference, Facebook Chief Technology Officer Mike Schroepfer showed off the company's latest advances in artificial intelligence, including the experimental photo app. Known by Facebook as Visual Q&A, the app lets people ask questions about a picture to form an idea of what it depicts, even when they can't see it.

When presented with a picture of a friend's baby, for instance, you might ask, "Where is the baby?" or "What is the baby doing?" The app would then announce aloud that the baby is in the kitchen, say, or that she's eating cereal.

Here's the demo video of the app, which is still in development:

Visual Question and Answering Demo

Earlier this year, we showed some of our work on natural language understanding - specifically, a system called Memory Networks (MemNets) that can read and then answer questions about short texts. In this demo of a new system we call VQA, or visual Q&A, MemNets are combined with our image recognition technology, making it possible for people to ask the machine what's in a photo.

Posted by Facebook Engineering on Tuesday, 3 November 2015


In a blog post to accompany the presentation, Schroepfer outlined a number of recent developments made by the Facebook AI Research, or FAIR, team. They include improved image recognition designed to help a computer segment an image and therefore see what's actually in it, for example seeing where the outline of a person ends and differentiating them from other people or the background.

The Faceboffins have also scaled up a technology that helps neural networks develop "a short-term memory" and answer questions as a human might. The technology is called Memory Networks (aka MemNets). By combining MemNets with image recognition, the computer can then answer questions about an image.

Why does Facebook want to do all this? For a start, the Menlo Park, California-based company is testing an artificial intelligence-driven personal assistant called Facebook M that could rival Apple's Siri, Microsoft's Cortana and Google Now on Android. And Facebook also wants its AI to recognise what's in photos so the world's largest social network can fine-tune what shows up in your newsfeed.

Schroepfer also discussed how Facebook is teaching AI to predict things, showing a simple demonstration in which the computer attempted to predict whether a stack of blocks would topple. He joked that a lot of work had gone into "teaching artificial intelligence how to play Jenga."

During his speech Schroepfer delivered an update on Facebook's three-pronged set of next-generation technologies. As well as advancing artificial intelligence, the company's 10-year plan hinges on connecting the world and developing virtual reality.

The Aquila drone is one of the ways Facebook intends to connect areas of the globe currently without Net access. The autonomous plane has the wingspan of a 737 jet. Storing power from the sun, a network of Aquila drones is designed to stay 60,000 to 90,000 feet in the air for three months at a time, beaming Internet data to one another with precisely calibrated lasers.

Schroepfer brought an Aquila engine pod on stage with him at Web Summit. Despite standing taller than the Facebook CTO, the pod is "lighter than a MacBook," he said.

The full-size Aquila is set to be tested soon. A small-scale version was tested earlier this year.