Twitter to start labeling deepfakes but sets high bar for removal

Manipulated media would get pulled down only if it'd likely lead to serious harm.

Queenie Wong Former Senior Writer

Twitter shared more details Tuesday about how it'll handle manipulated media. 

Twitter will start labeling manipulated videos such as deepfakes in March, a step that comes as social media companies move to adopt policies to address misinformation on their platforms. The policy, which goes into effect on March 5, will make it easier for users to spot deepfakes but won't lead to their removal unless the content is likely to cause serious harm, such as a threat to someone's physical safety or privacy.

The policy will affect photos, videos and other media that Twitter finds to be "significantly and deceptively altered or fabricated." The new policy applies to deepfakes -- artificial intelligence-powered videos that make it seem like people are doing or saying something they didn't -- and media altered with simple editing software. 

"Our goal is really to provide people with more context around certain types of media they come across on Twitter and to ensure they're able to make informed decisions around what they're seeing," Del Harvey, Twitter's vice president of trust and safety, said during a call to explain the policy.

The new rules show that Twitter, like other social networks, is trying to combat disinformation ahead of the 2020 US elections, while balancing concerns over free speech. Lawmakers and the US intelligence community are worried that deepfakes could be used to meddle in elections in the US and those of its allies. New rules could help social media companies fend off criticism that they aren't doing enough. 

Twitter will examine whether content has been edited in a manner that changes its composition or timing, and whether images or audio have been added or removed. The company will also consider whether a user shares the media in a deceptive way, resulting in confusion or misunderstanding. For example, media is shared in a deceptive manner when the user falsely claims it depicts reality, Twitter said.

Twitter unveiled a draft policy for manipulated media in November. 

Twitter said altered and deceptive content that was likely to impact public safety and cause physical harm would likely be removed. The company might also show a warning to users before they share or like a tweet, and reduce the spread of the content on Twitter by preventing it from being recommended. 


In this chart, Twitter explains how it will take action against manipulated media. 


Different approaches

Social media companies have responded differently to misleading videos. In May, a video of House Speaker Nancy Pelosi was doctored to make it seem as if she were slurring her words. YouTube, which has a policy against "deceptive practices," took the video down, though Twitter didn't. Facebook provided information from fact-checkers and slowed the spread of the video.

Twitter's new rules mean that media like the Pelosi video would likely be labeled, but not removed. The policy isn't retroactive.

"Since the video is significantly and deceptively altered, we would label it under this policy. Depending on what the tweet sharing that video says, we might choose to remove specific tweets," said Yoel Roth, who heads site integrity at Twitter.

It was less clear how Twitter would have approached a selectively edited video, such as one of Democratic presidential candidate Joe Biden that falsely suggested he made racist remarks. A Twitter spokeswoman said the Biden video, which attracted more than 1 million views, may have been labeled if the rules were already in effect. The anonymous Twitter user who posted the edited video said the clip was part of "a humorous thread of out-of-context Biden gaffes and verbal stumbles."

Roth said that selective editing would be covered under the new policy but acknowledged that determining what is satire is very challenging for the company.

"We need to try and get as much context as we can about the interactions on Twitter and a lot of times we're sort of an outside party to a conversation that's happening on our service," he said. 

Other social media companies have similar policies for dealing with deepfakes and manipulated media, but some critics say these rules don't go far enough. Facebook said in January that it would ban deepfake videos, but the policy had an exception for parody, satire or videos that were solely edited to omit or change the order of words. 

On Monday, Google-owned YouTube said that it would remove "technically manipulated or doctored" videos and content that try to mislead people about when and where to vote or that pose "a serious risk of egregious harm."

Twitter decided to move forward with its draft rules after getting more than 6,500 responses from people worldwide. People opposed to the removal of all altered media, the company said, raised concerns about censoring speech and freedom of expression.