Startup packs all 16GB of Wikipedia onto DNA strands to demonstrate new storage tech

Biological molecules will last a lot longer than the latest computer storage technology, Catalog believes.

Stephen Shankland Former Principal Writer
Startup Catalog has stored all 16GB of English-language Wikipedia on DNA contained in this vial.

Computer storage technology has moved from wires with magnets to hard disks to 3D stacks of memory chips. But the next storage technology might use an approach as old as life on Earth: DNA. Startup Catalog announced Friday that it has crammed all of the text of Wikipedia's English-language version onto the same genetic molecules our own bodies use.

It accomplished the feat with its first DNA writer, a machine that would fit easily in your house if you first got rid of your refrigerator, oven and some counter space. And although it's not likely to push aside your phone's flash memory chips anytime soon, the company believes it's useful already to some customers who need to archive data.

DNA strands are tiny and tricky to manage, but the biological molecules can store data other than the genes that govern how a cell becomes a pea plant or a chimpanzee. Catalog uses prefabricated synthetic DNA strands that are shorter than human DNA, but it uses a lot more of them so it can store much more data.

Relying on DNA instead of the latest high-tech miniaturization might sound like a step backward. But DNA is compact and chemically stable, and because it's the foundation of Earth's biology, it's arguably less likely to become obsolete than the spinning magnetized platters of hard drives or the CDs that are disappearing today the way floppy drives already vanished.

Who's in the market for this kind of storage? Catalog has one partner to announce: the Arch Mission Foundation, which is trying to store human knowledge not just on Earth but elsewhere in the solar system -- like on Elon Musk's Tesla Roadster that SpaceX launched into orbit. Beyond that, Catalog isn't ready to say who its other customers might be or whether it'll charge for its DNA writing service.

Catalog's DNA writing machine can write data at a rate of 4 megabits per second, but the company hopes to make it at least a thousand times faster.

"We have discussions underway with government agencies, major international science projects that generate huge amounts of test data, major firms in oil and gas, media and entertainment, finance, and other industries," the company said in a statement.

Catalog, based in Boston, built its own device that writes data directly into DNA at 4 megabits per second. Optimizations should triple that rate, letting people record 125 gigabytes in a single day -- about as much as a higher-end phone can store.
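The arithmetic behind those figures can be checked with a short sketch. It assumes decimal units (1 megabit = 1,000,000 bits, 1 gigabyte = 1,000,000,000 bytes) and a write rate sustained around the clock; the function names are illustrative, and the time estimate for 16GB is a back-of-the-envelope figure, not something Catalog has reported.

```python
# Back-of-the-envelope check of the quoted write rates.
# Assumptions: decimal units and a constant, uninterrupted write rate.
BITS_PER_MEGABIT = 1_000_000
BYTES_PER_GIGABYTE = 1_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def gigabytes_per_day(megabits_per_second: float) -> float:
    """Convert a sustained write rate in Mbit/s to gigabytes per day."""
    bits_per_day = megabits_per_second * BITS_PER_MEGABIT * SECONDS_PER_DAY
    return bits_per_day / 8 / BYTES_PER_GIGABYTE


def hours_to_write(gigabytes: float, megabits_per_second: float) -> float:
    """Hours needed to write a payload at a sustained rate in Mbit/s."""
    bits = gigabytes * BYTES_PER_GIGABYTE * 8
    return bits / (megabits_per_second * BITS_PER_MEGABIT) / 3600


current_rate = gigabytes_per_day(4)       # roughly 43 GB per day
tripled_rate = gigabytes_per_day(3 * 4)   # roughly 130 GB per day
wikipedia_hours = hours_to_write(16, 4)   # roughly 9 hours for 16GB

print(f"{current_rate:.1f} GB/day now, {tripled_rate:.1f} GB/day tripled")
print(f"16GB at 4 Mbit/s: about {wikipedia_hours:.1f} hours")
```

Under these assumptions, tripling 4 megabits per second gives just under 130 gigabytes a day, in line with the roughly 125 gigabytes cited.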

Reading the data back relies on conventional DNA sequencing products already for sale in the biotechnology market. "We think this whole new use case for sequencing technology will help [drive] down cost quite a bit," Catalog said, arguing that the computing business is a potentially much larger market.

Chief Executive Hyunjun Park and Chief Technology Innovation Officer Nathaniel Roquet founded Catalog in 2016. At the time, Park was an MIT postdoc and Roquet was a Harvard graduate student.

Catalog uses an addressing system that lets customers work with large data sets. And even though DNA stores data in long sequences, Catalog can read information stored anywhere using molecular probes. In other words, it's a form of random-access memory like a hard drive, not sequential access like the spools of magnetic tape you might remember from the heyday of mainframe computers a half century ago.
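The difference between the two access styles can be illustrated with a toy sketch. This is only an analogy built from ordinary Python containers: the address format, record contents and function names are made up for illustration, and nothing here reflects how Catalog actually encodes data or how its molecular probes work.

```python
# Toy model only: Python containers stand in for storage media.
# Addresses and payloads are hypothetical.
records = [(f"addr-{i:04d}", f"payload {i}") for i in range(1000)]


def sequential_read(tape, address):
    """Scan from the start, as a tape must; return (payload, items scanned)."""
    for scanned, (addr, payload) in enumerate(tape, start=1):
        if addr == address:
            return payload, scanned
    raise KeyError(address)


# An address-keyed lookup table: any record can be fetched directly.
index = dict(records)


def random_access_read(store, address):
    """Fetch one record by its address without touching the others."""
    return store[address]


payload, scanned = sequential_read(records, "addr-0999")
print(f"sequential: {payload!r} after scanning {scanned} records")
print(f"random access: {random_access_read(index, 'addr-0999')!r}")
```

In the sequential case the cost of a read grows with how far into the data the record sits; with addressing, it does not.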

Although DNA data can be disrupted by cosmic rays, Catalog argues that it's a more stable medium than the alternatives. After all, we've got DNA from animals that went extinct thousands of years ago. How much do you want to bet that the USB thumb drive in your desk drawer will still be useful even 25 years from now?

Originally published on June 29.
Update, July 2: Clarifies founders' backgrounds.