Mozilla eyes hassle-free PDFs on the Web

A Firefox project uses Web standards such as HTML5, JavaScript, and SVG to show PDF files, an effort to make them easier and safer to use.

Stephen Shankland Former Principal Writer
The W3C's new HTML5 logo (Credit: W3C)

PDF files have long been an awkward fit with the Web, but a new project from the developers of Firefox shows how online PDFs are changing for the better.

For years, the only way to view them was with viewer software from Adobe Systems, which created the Portable Document Format in the 1990s. Clicking a link to a PDF often meant a wait as the software loaded, followed by an alien interface, framed within the browser window, that meant actions like searching and printing were different. It's faster today, but PDFs still don't feel like native Web documents.

But PDF has become an international standard, and now PDFs are becoming less obstreperous. Google started indexing PDF content and showing PDFs in search results years ago, helping to ensure their utility on the Internet. And browsers have begun handling them better, too.

Google's Chrome, for example, added a PDF reader directly into the browser so that Adobe Reader, Mac OS X's Preview, or other third-party applications aren't required. (Well, except in cases where Chrome's plug-in isn't up to snuff; happily, it now sometimes warns you when a PDF has elements it can't handle.) Chrome is tackling the performance issue, too, making a PDF reader plug-in that uses the Native Client software technology.

Now Mozilla has begun a project of its own called pdf.js: a PDF reader that uses Web technology, not native software, to render PDFs in the browser. Eventually it will be built directly into Firefox, said programmer Andreas Gal in a blog post last week.

Thus, while Google is working on native-code PDF abilities--software tailored for a specific processor--Mozilla is working on an approach that uses the browser's engine instead.

"We intend to use pdf.js to render PDFs 'natively,' within Firefox itself. Our most immediate goal is to implement the most commonly used PDF features so we can render a large majority of the PDFs found on the web. We believe we can reach that point in less than 3 months (the entire code so far is less than one month old, and it already renders a large set of PDF features).

"Initially we will make a Firefox extension available to interested users that enables inline PDF rendering using pdf.js, but our ultimate goal is of course shipping pdf.js with Firefox. This will result in a substantial usability but also security improvement for our users. pdf.js uses only safe Web languages and doesn't contain any native code pieces attackers could exploit."

Indeed, security has been a problem for PDF reading on the Web. Adobe's widely used free Reader software needs regular attention as new security vulnerabilities are uncovered, some of them zero-day problems that emerge before a patch is ready. Browser technology is by no means immune to security problems, but Web applications don't get the same privileges granted to native software, so that makes attacks harder.

The project uses JavaScript, the programming language of Web pages and Web applications, to interpret the PDF coding. Gal, notably, has been involved for years in improving Firefox's JavaScript execution speed. Another Web standard in use is the HTML5 Canvas technology for two-dimensional drawing.
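To give a rough feel for what "interpreting the PDF coding" can mean, here is a minimal sketch in TypeScript. It is not pdf.js's actual code. It reads a few standard PDF page-drawing operators (`m`, `l`, `re`, `S`, `f`, `BT`, `Td`, `Tj`) and turns them into Canvas-style drawing calls. The names `DrawTarget`, `RecordingTarget`, and `interpretContentStream` are hypothetical, invented for this illustration, and the sketch assumes simple input with no nested or escaped parentheses in strings.

```typescript
// A minimal sketch, NOT pdf.js's actual code. All names are hypothetical.

// The handful of Canvas 2D methods this sketch needs. A browser's
// CanvasRenderingContext2D offers methods with these names.
interface DrawTarget {
  moveTo(x: number, y: number): void;
  lineTo(x: number, y: number): void;
  rect(x: number, y: number, w: number, h: number): void;
  stroke(): void;
  fill(): void;
  fillText(text: string, x: number, y: number): void;
}

// Stand-in target that records calls, so the sketch runs without a browser.
class RecordingTarget implements DrawTarget {
  calls: string[] = [];
  moveTo(x: number, y: number): void { this.calls.push(`moveTo(${x},${y})`); }
  lineTo(x: number, y: number): void { this.calls.push(`lineTo(${x},${y})`); }
  rect(x: number, y: number, w: number, h: number): void {
    this.calls.push(`rect(${x},${y},${w},${h})`);
  }
  stroke(): void { this.calls.push("stroke()"); }
  fill(): void { this.calls.push("fill()"); }
  fillText(text: string, x: number, y: number): void {
    this.calls.push(`fillText(${text},${x},${y})`);
  }
}

// PDF page content is postfix: operands come first, then the operator.
// Note: PDF's y axis points up and Canvas's points down; a real renderer
// must flip coordinates. This sketch passes them through unchanged.
function interpretContentStream(stream: string, target: DrawTarget): void {
  // Tokens are either "(string literals)" or runs of non-space characters.
  const tokens: string[] = stream.match(/\([^)]*\)|[^\s()]+/g) ?? [];
  const operands: (number | string)[] = [];
  let textX = 0;
  let textY = 0;

  for (const token of tokens) {
    if (token.charAt(0) === "(") {          // string operand, e.g. (Hello)
      operands.push(token.slice(1, -1));
      continue;
    }
    if (token.charAt(0) === "/") {          // name operand, e.g. /F1
      operands.push(token);
      continue;
    }
    const value = Number(token);
    if (!isNaN(value)) {                    // numeric operand
      operands.push(value);
      continue;
    }

    // Anything else is treated as an operator that consumes the operands.
    const nums = operands.filter((o): o is number => typeof o === "number");
    switch (token) {
      case "m": target.moveTo(nums[0], nums[1]); break;
      case "l": target.lineTo(nums[0], nums[1]); break;
      case "re": target.rect(nums[0], nums[1], nums[2], nums[3]); break;
      case "S": target.stroke(); break;
      case "f": target.fill(); break;
      case "BT": textX = 0; textY = 0; break;               // begin text
      case "Td": textX += nums[0]; textY += nums[1]; break; // move text position
      case "Tj": target.fillText(String(operands[0]), textX, textY); break;
      default: break; // operators this sketch does not support are skipped
    }
    operands.length = 0;
  }
}
```

In a browser, the object returned by a canvas element's `getContext("2d")` could be passed as the target in place of `RecordingTarget`. A real PDF reader also has to handle fonts, compression, images, and many more operators, none of which this sketch attempts.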

For a look at how the project compares with other PDF rendering software, check the screenshots below.

Canvas is fast, something Mozilla likes given the sour sentiments that often arise at the prospect of loading a PDF. But it's got drawbacks, too, said Chris Jones in a blog post. For one thing, it's a low-level interface that doesn't easily let people select text. For another, high-quality printing is hard.

To get around those drawbacks, Mozilla also might use a PDF renderer using another Web technology, Scalable Vector Graphics (SVG). The idea is to render a quick version using Canvas, then swap in a more elaborate SVG-based version after it's been created, Jones said, mentioning that other approaches are possible, too.
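To make the SVG half of that idea concrete, here is a minimal TypeScript sketch, not Mozilla's implementation. It serializes a few page items into SVG markup. The point is that text ends up as a real `<text>` element in the document, which browsers can generally let people select, rather than as pixels painted onto a Canvas. The `PageItem` type and the `escapeXml` and `toSvg` functions are hypothetical names invented for this illustration.

```typescript
// A minimal sketch, NOT Mozilla's implementation. All names are hypothetical.

// Two kinds of page content this sketch understands.
type PageItem =
  | { kind: "line"; x1: number; y1: number; x2: number; y2: number }
  | { kind: "text"; x: number; y: number; content: string };

// Escape characters that have special meaning in XML text content.
function escapeXml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build one SVG document string from the page items.
function toSvg(items: PageItem[], width: number, height: number): string {
  const body = items.map((item) => {
    if (item.kind === "line") {
      // SVG path data: M = move to, L = line to.
      return `<path d="M ${item.x1} ${item.y1} L ${item.x2} ${item.y2}" stroke="black" fill="none"/>`;
    }
    // Text stays text in the markup instead of becoming pixels.
    return `<text x="${item.x}" y="${item.y}">${escapeXml(item.content)}</text>`;
  });
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">${body.join("")}</svg>`;
}
```

Under the approach Jones describes, markup like this would replace the quick Canvas rendering once it is ready. How that swap would actually be scheduled is not something this sketch covers.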

To gauge progress, people can open a Web-based version of pdf.js showing a 2009 research paper about JavaScript that Gal and others wrote. Ordinarily I'd include a parenthetical warning to readers that the link leads to a PDF, but in this case, it leads to an ordinary Web page that shows a PDF.

Mozilla hopes pdf.js will improve people's experience with PDFs but ultimately help phase out the technology, too.

"It's important to note that we're not trying to promote PDF to a first-class web citizen like HTML5 is," Gal said. "Instead we hope that a browser-native PDF renderer written on the Web platform allows Web technologies to subsume PDF."

Perhaps the work will make PDF fade into the background. But people use PDF for its advantages in formatting flexibility, archiving information in a standard file format, and sharing documents across a variety of operating systems and programs.

It seems possible to me, therefore, that Mozilla's work to make PDFs easier and safer to use on the Web might actually strengthen the technology's position.

Mozilla's pdf.js project, just over a month old, uses Web technology such as JavaScript and HTML5 to process and display PDF files. This example shows it's still rough, but programmers expect more polish in the next three months. (Screenshot by Stephen Shankland/CNET)
Google's Chrome has a built-in PDF reader. Here's how it shows the same area of the same PDF file on Mac OS X. (Screenshot by Stephen Shankland/CNET)
Adobe's Acrobat Pro software can be used to create, edit, and view PDF files. Here's how it shows the same section of the PDF on Mac OS X. (Screenshot by Stephen Shankland/CNET)