AI beats humans in Stanford reading comprehension test

Alibaba and Microsoft put their AI to the test this month, literally. And their scores bested ours, but barely.

Zoey Chong Reporter


Jan. 16, 2018 1:21 a.m. PT

Yes, AI-based systems are becoming as smart as us, if not smarter.
STR/AFP/Getty Images

AI systems created by Chinese tech giant Alibaba and by Microsoft have tied for first place on the Stanford Question Answering Dataset (SQuAD) this month, beating the human score for Exact Match (providing exact answers to questions). Alibaba and Microsoft announced the news separately on Monday.

According to the SQuAD website, it is a machine reading comprehension dataset comprising questions about a set of Wikipedia articles. The answer to each question is usually a segment of text from the corresponding reading passage.

The leaderboard on SQuAD's website shows Exact Match (EM) scores of 82.44 for Alibaba and 82.65 for Microsoft, which puts both in first place. Both are higher than the human score of 82.304.
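For readers curious how an Exact Match figure like these is produced, here is a minimal Python sketch. It assumes the score is the percentage of questions where the predicted answer equals one of the reference answers after light normalization (lowercasing, dropping punctuation and English articles). The function names and the sample questions are hypothetical, and this is not the leaderboard's own scoring code.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the prediction equals any reference answer after normalization."""
    return any(normalize(prediction) == normalize(ref) for ref in references)


def em_score(predictions: dict[str, str], answers: dict[str, list[str]]) -> float:
    """Percentage of questions whose predicted answer is an exact match."""
    hits = sum(
        exact_match(predictions.get(qid, ""), refs)
        for qid, refs in answers.items()
    )
    return 100.0 * hits / len(answers)


# Hypothetical data, for illustration only.
sample_answers = {
    "q1": ["gravity", "the force of gravity"],
    "q2": ["1066"],
}
sample_predictions = {
    "q1": "Gravity.",   # matches "gravity" after normalization
    "q2": "in 1066",    # extra word, so not an exact match
}
sample_em = em_score(sample_predictions, sample_answers)
```

On this made-up pair of questions one answer matches and one does not, so `sample_em` comes out at 50.0. The strictness is the point: an answer that is nearly right but carries an extra word earns nothing under Exact Match.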

The results may not surprise some, since AI-based systems have already proven formidable: Google's AlphaGo defeated Go world champion Ke Jie last year. Such systems are also expected to move into hospitals and act as our assistants, and Alibaba founder Jack Ma has predicted that AI-powered robots will head companies in 30 years.

But not everyone agrees on how intelligent AI-based systems really are. Just over three months ago, Chinese researchers published a study saying AI-based systems are no smarter than a six-year-old. Last year, a Chinese robot called AI-MATHS sat a version of the maths paper in China's college entrance exams and failed to beat the national average. The robot's developers explained that it was unable to comprehend certain words, which cost it marks.

Luo Si, Chief Scientist of Natural Language Processing (NLP) at Alibaba iDST, commented:

"It is our great honor to witness the milestone where machines surpass humans in reading comprehension. That means objective questions such as 'what causes rain' can now be answered with high accuracy by machines. We are especially excited because we believe the technology underneath can be gradually applied to numerous applications such as customer service, museum tutorials and online responses to medical inquiries from patients, decreasing the need for human input in an unprecedented way."

"We are thrilled to see NLP research has achieved significant progress over the year. We look forward to sharing our model-building methodology with the wider community and exporting the technology to our clients in the near future," Si added.

CNET has reached out to Microsoft for a comment.
