How encryption could stop personal data exposures on the cloud

The cloud leaks your data like crazy.

Laura Hautala

What do a Peruvian movie theater chain and a payment service for US cannabis dispensaries have in common? Unsecured databases. In separate incidents this month, sensitive customer data from Cineplanet in Peru and THSuite in the US were found exposed on cloud servers without password protection. Identity theft experts say the global trend of exposures is just as concerning as hackers stealing data outright.

To ease the problem, database software makers have been trying to make security easier for cloud database managers. On Monday, Kenn White, a security principal at database software maker MongoDB, will describe a new technique, called field level encryption, to make data safer on the cloud. The research will be presented at the Enigma Conference in San Francisco.

Field level encryption works by scrambling data before it's sent to a cloud database and unscrambling it when the data is retrieved. The promise of the product is to protect the contents of a cloud database, even if bad guys access it. The feature has been available on MongoDB's open-source product since December, as well as for customers of the company's corporate products.
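The scramble-before-sending idea can be sketched in a few lines of Python. This is an illustration only, not MongoDB's implementation: the cipher is a toy keystream built from the standard library's HMAC so the example runs without extra packages, and the names `encrypt_field`, `decrypt_field` and `cloud_db` are hypothetical. A real deployment would rely on a vetted, authenticated cipher.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return out[:length]


def encrypt_field(key: bytes, plaintext: str) -> bytes:
    """Scramble one field on the client; a fresh random nonce is prepended."""
    data = plaintext.encode("utf-8")
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_field(key: bytes, blob: bytes) -> str:
    """Unscramble a field after it comes back from the database."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


# The key stays on the client; the "cloud" only ever holds ciphertext.
client_key = secrets.token_bytes(32)
cloud_db = []  # hypothetical stand-in for a remote collection

record = {"name": "Pat Example", "ssn": "000-00-0001"}  # made-up data
stored = {"name": record["name"], "ssn": encrypt_field(client_key, record["ssn"])}
cloud_db.append(stored)

# Retrieval: fetch the opaque bytes, then decrypt locally.
recovered_ssn = decrypt_field(client_key, cloud_db[0]["ssn"])
```

Anyone who reads `cloud_db` without `client_key` sees only random-looking bytes in the `ssn` field, which is the protection the feature promises if the database itself is exposed.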

MongoDB's new feature comes as more and more companies move user data to cloud servers, rather than run their own costly data centers. In April, Gartner projected that cloud computing would be a $214 billion industry by the end of 2019. That was up more than 17% from 2018, when it was $182 billion.

Companies have rushed to the cloud without understanding all of the security implications. Many companies have left countless databases exposed, revealing personal data that has included records from drug rehab centers. A database containing details about who lives in 80 million US households was left unprotected in 2019, as was data on Facebook users and the anticipated salaries of job seekers.

The seemingly endless exposures -- the result of a failure to password-protect a database -- have inspired an army of security researchers who hunt down countless exposed databases containing Social Security numbers, passwords, personal histories and other details that shouldn't be accessible to just anyone with an internet connection.

Data on the cloud should be password-protected by default, says Chris Vickery, a security researcher who looks for database exposures at UpGuard. Often, though, it isn't.

"There's so many different platforms out there these days," Vickery said. "From one to the other, you're going to have varying levels of default security."

Sometimes the person setting up the cloud database will inadvertently turn off password protection, said White, the MongoDB executive.

Maximum security storage

MongoDB's field level encryption might encourage some companies that don't use the cloud to consider it. Big companies are wary of putting financial or health information on the cloud because exposures of that information carry high penalties in the US. In some cases, companies aren't legally permitted to share the data with cloud providers in the first place.

Field level encryption could change that because companies wouldn't be sharing the data. Instead, they'd share a string of incomprehensible characters that can only be unscrambled with an encryption key stored on the companies' machines. MongoDB has already signed up Apervita, a processor of medical and prescription data, to use the feature. 

MongoDB dedicated 24 engineers to the project, which took two years. Its open-source software is popular -- it's been downloaded more than 80 million times -- because it can be used to build virtual databases that run on lots of platforms, including Windows and Linux machines. It's compatible with the processors in laptops and mobile phones, and it's interoperable with more than a dozen programming languages.

The widespread use of MongoDB created a challenge for the engineers. They had to build a feature that lets users store and search for encrypted data, and that operates seamlessly with all of the hardware, operating systems and programming languages supported by MongoDB. White called it "a crazy amount of combinations."

Making it usable

Field level encryption tackles a paradox. Database managers want to store their data in an unreadable format, but they also want to be able to find specific pieces of information in the database with a simple search query. For example, someone might want to look up health care patients by their Social Security numbers, even if those numbers are stored as random characters.

To make this possible, field level encryption lets database managers encrypt a search term on their machine and send it to the database as a query. The database compares the encrypted search term with the encrypted values it's storing and then sends back the matching record.
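A rough sketch of that matching step, under stated assumptions: it works only if the same value always encrypts to the same bytes, so the toy cipher below derives its nonce from the value itself. The cipher, the `CloudCollection` class and all the data are hypothetical stand-ins written for this illustration, not MongoDB's code.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive a separate key for each purpose from the one client key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return out[:length]


def encrypt_deterministic(key: bytes, plaintext: str) -> bytes:
    """Equal plaintexts give equal ciphertexts, so equality search works."""
    data = plaintext.encode("utf-8")
    nonce = hmac.new(_subkey(key, b"nonce"), data, hashlib.sha256).digest()[:16]
    stream = _keystream(_subkey(key, b"stream"), nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(_subkey(key, b"stream"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


class CloudCollection:
    """Hypothetical server: stores and compares opaque bytes, holds no key."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        self._docs.append(doc)

    def find_equal(self, field, ciphertext):
        return [d for d in self._docs if d[field] == ciphertext]


client_key = secrets.token_bytes(32)  # never sent to the server
patients = CloudCollection()
for name, ssn in [("A", "000-00-0001"), ("B", "000-00-0002"), ("C", "000-00-0003")]:
    patients.insert({"patient": name, "ssn": encrypt_deterministic(client_key, ssn)})

# The client encrypts the search term; the server matches bytes it cannot read.
query = encrypt_deterministic(client_key, "000-00-0002")
matches = patients.find_equal("ssn", query)
found_ssn = decrypt(client_key, matches[0]["ssn"])
```

The server never sees "000-00-0002" in readable form. It only learns that one stored value equals the query, which is the trade-off that makes searching possible.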

This approach only works with specific kinds of data. Attackers could break the encryption when a database stores information that only has a relatively small number of potential values, like gender or state codes, by spotting repeating patterns throughout the data set. Field level encryption also isn't useful for long text entries, like notes in a patient's medical chart, because you can't always search for individual words.
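The repeating-pattern weakness is easy to demonstrate. In this sketch, a keyed hash stands in for any scheme that maps equal values to equal ciphertexts; the key, the state counts and the attacker's "public ranking" are all invented for illustration.

```python
import hashlib
import hmac
from collections import Counter

# Fixed key so the sketch is reproducible; the attacker never sees it.
key = b"k" * 32


def deterministic_token(value: str) -> str:
    """Stand-in for deterministic encryption: same input, same output."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


# Made-up column of state codes with an uneven distribution.
states = ["CA"] * 5 + ["TX"] * 3 + ["WA"] * 2 + ["VT"] * 1
encrypted_column = [deterministic_token(s) for s in states]

# The attacker sees only the encrypted column, but can count repeats...
ciphertexts_by_frequency = [t for t, _ in Counter(encrypted_column).most_common()]

# ...and line them up with an assumed public ranking of states by population.
public_ranking = ["CA", "TX", "WA", "VT"]
guess = dict(zip(ciphertexts_by_frequency, public_ranking))

recovered = [guess[token] for token in encrypted_column]
```

Here `recovered` matches the original column even though the key was never used in the attack. With only a handful of possible values, counting is enough. Fields like account numbers resist this because nearly every value is unique.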

Still, for data like account numbers, passwords and government ID numbers, field level encryption protects data and maintains a usable database.

Most importantly, White said, it's simple to set up. Database managers turn it on with a one-time configuration change when they set up the database.

"That's really powerful," he said.