Postini: Google's take on e-mail security
At the search giant's e-mail security unit, globally distributed automated systems and Zero-Hour software keep spam and viruses out of inboxes.
MOUNTAIN VIEW, Calif.--The computer security industry historically borrows military defense concepts to combat digital threats, literally creating war rooms where experts follow attacks in progress on huge screens with phones ringing off the hook.
Not so at Google's Postini e-mail security service provider unit. Instead, computerized systems monitor 3 billion messages per day that flow in and out of customer systems and pass through Postini's thousands of machines in data centers around the U.S. and in Europe before hitting the Internet. The Postini system is highly automated, distributed, and scalable, characteristic of all of Google's operations.
Google's Gmail antispam efforts are separate from those of Postini, although the two follow similar computerized operations and the teams have started to integrate their processes.
Postini represents Google's commercial push into e-mail security, offering a subscription-based service to more than 50,000 customer companies and organizations and more than 15 million business users. In addition to protecting e-mail from spam and viruses, Postini offers compliance and archiving services.
Sentinels and canaries
About 35 members of the Postini Site Reliability Engineering team have access on their machines to a dashboard that shows the number of transactions per second the Postini service is handling, as well as the messages-per-minute rate and graphs of the error percentage rate obtained from a test system known internally as "Sentinel," according to Craig Croteau, who leads the group.
The Sentinel system has devices on dedicated pipes into the Internet running daemon software (automated programs that run in the background) that routinely sends out test messages to gauge the performance of the flow through the Postini infrastructure. If there is a problem with a round-trip test message, indicating possible congestion, it will show up on the dashboard.
"It's a canary in the system," a tiny data stream that serves as an early warning system so potential issues can be stopped before they become major problems, Croteau said.
The Sentinel system posts the information to a database that feeds into the dashboard, one of several data collection engines whose output is superimposed on the dashboard. Traffic monitors generate message rate graphs while the system extrapolates rates from live log scraping. Telemetry, or remote measurement and reporting, is served up in multiple views.
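For readers who want to picture how a round-trip probe turns into a dashboard number, here is a minimal Python sketch. It is not Postini's code: the `ProbeResult` type, the 5-second limit, and the sample data are all assumptions made up for illustration. It simply treats a test message as failed if it never returns or returns too slowly, then reports the share of failures in a window.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ProbeResult:
    """Outcome of one hypothetical round-trip test message."""
    sent_at: float                # seconds, when the probe was sent
    received_at: Optional[float]  # seconds, or None if it never came back


def probe_failed(result: ProbeResult, max_round_trip_s: float) -> bool:
    """A probe fails if it was lost or exceeded the allowed round-trip time."""
    if result.received_at is None:
        return True
    return (result.received_at - result.sent_at) > max_round_trip_s


def error_percentage(results: Sequence[ProbeResult], max_round_trip_s: float) -> float:
    """Percentage of probes in the window that were lost or slow."""
    if not results:
        return 0.0
    failures = sum(probe_failed(r, max_round_trip_s) for r in results)
    return 100.0 * failures / len(results)


# Illustrative window of four probes (values are invented).
window = [
    ProbeResult(0.0, 1.2),
    ProbeResult(10.0, 11.0),
    ProbeResult(20.0, None),  # lost in transit
    ProbeResult(30.0, 38.5),  # came back, but too slowly
]
print(error_percentage(window, max_round_trip_s=5.0))  # 2 of 4 probes failed
```

A real system would presumably feed this figure into a time series rather than print it, but the principle of a small, steady test stream acting as the canary is the same.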
Postini uses multiple fail-over sites, and if a potential problem is detected, the customer message flow is moved to a backup system. Unlike typical cloud hosting providers, Postini's subscription service does not store the customer's data on its servers but provides the protection services as the data passes through the Postini gateway.
In traditional network operation centers someone sitting in front of a screen notices a rise in error rates or some other problem, then conducts triage and follows a set work-flow procedure for dealing with events, according to Croteau.
"There's a built-in lag," he said. "It can take minutes, 15 minutes, to do something," especially if the worker is out of the office on a pager.
"If you want high, high up-time, you need to take action immediately in the face of a service degradation," Croteau said. "Our team looks at the dashboard, but our key is we let computers take action" without needing a human to have to make a decision first.
Asked about the potential for the computerized system to assume too much control, Croteau said: "I don't think it's HAL-like, actually. Humans are responsible for application debug and event analysis."
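The idea of letting computers act first can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of Postini's actual logic: the site names, the error threshold, and the `choose_site` function are hypothetical. The rule shown is simply "stay on the primary site while it is healthy, otherwise move traffic to the first healthy backup."

```python
from typing import Dict, Sequence


def choose_site(error_pct: Dict[str, float], primary: str,
                backups: Sequence[str], max_error_pct: float) -> str:
    """Pick the site that should carry traffic, with no human in the loop.

    A site with no measurement is treated as unhealthy (assumed 100% errors).
    """
    if error_pct.get(primary, 100.0) <= max_error_pct:
        return primary
    for site in backups:
        if error_pct.get(site, 100.0) <= max_error_pct:
            return site
    # No healthier option exists; keep traffic where it is and leave the
    # analysis to the engineers.
    return primary


# Hypothetical readings: the primary is degraded, the first backup is fine.
readings = {"site-a": 12.0, "site-b": 0.5, "site-c": 0.2}
print(choose_site(readings, "site-a", ["site-b", "site-c"], max_error_pct=2.0))
```

The design choice worth noting is that the function only decides where traffic goes. Debugging and event analysis stay with people, which matches the division of labor Croteau describes.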
In addition to the automation, engineers have playbooks, or rules guidelines, to follow if something goes wrong. The playbooks explain how to attack a problem and what to do in case of specific types of events.
Asked what might prompt his alarm to go off in the middle of the night, Croteau said that might happen as a result of a regional network outage or if an anomalous event stresses the system, such as a poor interaction with messaging payload and scanning binaries. "For us, the most challenging item would be something involving a legitimate payload," he said.
"Antispam is not about identifying spam; it's about identifying good mail," said Croteau.
To identify and block spam and viruses, the automated Postini system looks for key words or phrases that indicate a message is an ad or something dangerous, and examines the structure of the e-mail message and its headers, said Kevin Lund, a software engineer who developed a lot of the code the Postini system runs.
The system scores each message on numerous combinations of criteria, assigning a weight to each and then comparing the score to those in a database of several hundred thousand message types that have been flagged as good or bad from Postini honey pots and customer spam reports. The system identifies and blocks more than 99 percent of the spam campaigns, according to Lund.
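Weighted scoring of this kind can be illustrated with a short Python sketch. It is a deliberate simplification and not Postini's method: the three rules, their weights, and the fixed threshold are invented for the example, and a plain threshold stands in for the comparison against a database of known message types.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
# Each rule: (description, weight, test function). All three are hypothetical.
Rule = Tuple[str, int, Callable[[Message], bool]]

RULES: List[Rule] = [
    ("suspicious phrase in body", 3, lambda m: "act now" in m["body"].lower()),
    ("missing Message-ID header", 2, lambda m: not m.get("message_id")),
    ("subject in all caps", 1, lambda m: m["subject"].isupper()),
]


def score(message: Message) -> int:
    """Sum the weights of every rule the message triggers."""
    return sum(weight for _, weight, test in RULES if test(message))


def looks_like_spam(message: Message, threshold: int = 4) -> bool:
    """Flag the message when its total score reaches the assumed threshold."""
    return score(message) >= threshold


spam = {"subject": "FREE OFFER", "body": "Act now to claim", "message_id": ""}
ham = {"subject": "Lunch on Friday?", "body": "Does noon work?",
       "message_id": "<1@example.com>"}

print(score(spam), looks_like_spam(spam))
print(score(ham), looks_like_spam(ham))
```

No single rule is decisive here. An all-caps subject alone scores 1 and passes, which is the point of weighting many weak signals instead of blocking on any one of them.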
"We're rolling out little corks to plug the dikes," as part of a quick filtration process, then adding the data to the database for re-calibration, Lund said.
To block fresh spam attacks not covered by existing heuristic technologies, and viruses not covered by existing signature databases, Postini relies on proprietary Zero-Hour technology to identify new outbreaks that show up in the traffic patterns and quarantine them for later rescanning.
Customers can also create and build out their own white lists of message senders they trust and blacklist others they don't trust. It takes an average of 150 milliseconds for a message to be scanned by the antivirus engines that Postini licenses from McAfee and Authentium.
I asked Lund whether the problem of spam has been solved to satisfaction.
"If you can't bear to get a spam a day, then it's still a problem. It depends on your tolerance level," he said. "It's still a resource drain. You have to pay someone to get your e-mail workable. It takes money and resources to keep spam at bay."
Personally, I get maybe one spam message in my personal Gmail account every two weeks or so, which is tolerable, but I end up removing dozens of spam messages each day from my Outlook inbox at work, which is not tolerable.
"We take (spam) seriously, but we're not on some crusade," Lund said.
Lund, the technologist, would appear to be more laid back about the anti-spam mission than Scott Petry, who founded Postini in 1999 and now leads the group as a product management director at Google. During an interview, Petry animatedly drew a diagram on a whiteboard to illustrate how spam directly impacts a company's bottom line.
Basically, good protection can't mask the fact that spam volumes are rising as spammers continue to take advantage of economies of scale and are able to send exponentially more spam to more targets at virtually no additional cost.
Spam was a mere annoyance in e-mail's early years in the early 1990s. The tipping point for the industry hit in 2002 when spam reached 40 percent to 50 percent of all messages. Estimates now put it as high as 90 percent of all e-mail, with virus-related messages ranging from 15 percent to 50 percent of the total, according to Postini.
To keep up with the rising spam tide, companies are forced to buy more hardware to handle the increased storage and bandwidth consumption. As spam volumes rise and fall, companies can find themselves lacking capacity or with an excess, a waste of money and resources that could be directed elsewhere. Then there's the loss of productivity from end users wasting precious time cleaning junk out of their inboxes, not a negligible factor based on my own Outlook experiences.
Spam volumes were at a peak in November before the McColo ISP was shut down, prompting an estimated 70 percent drop in spam volumes practically overnight. Within about four months, the spam spigot was flowing again as spammers found new hosting providers for their operations.
With Postini's subscription model ($12 or $25 per user per year depending on the type of service), companies don't have to plan ahead and wrestle with spam volatility; they let Google do it for them, just as people pay a fee for Internet access or cable service.
Folded into Google, Postini is attracting bigger customers in more areas of the world, and in particular, is looking to leverage Google's sales channel and infrastructure to expand in Asia Pacific and Latin America, Petry said.
Q2 spam rises
The latest report from Postini on spam trends shows that despite law enforcement efforts to shut down spammers--like Sigourney Weaver blasting away the tenacious alien parasite in "Alien"--they just keep coming back.
In June, the FTC shut down an Internet service provider known as 3FN for hosting spam and botnets. Volumes dropped 30 percent immediately, but have since climbed back up 14 percent, according to Postini's second-quarter spam trends report due out on Wednesday.
Overall, the second-quarter spam levels are 53 percent higher than in the first quarter and 6 percent higher than in the same quarter a year ago.
Postini found that one attack alone, on June 18, unleashed 50 percent of a typical day's spam volume in just two hours. The attack featured an e-mail that looked like a legitimate newsletter from CNN but which had malicious links and images in it, said Amanda Kleha, a product marketing manager at Google. Postini's filters detected more than 11,000 variants of that spam during the attack, which enabled spoofing of the "from" field so that distribution lists were hit especially hard.
Spammers seem to be resurrecting old techniques, according to Postini's report. For instance, there was a rise during the quarter in image spam, basically advertisements with an image that can include malicious links and which are large in size. Postini also detected a resurgence in payload viruses, or e-mails with attachments containing viruses. Volumes of those types of messages rose to their highest level in nearly two years as spammers continued efforts to grow their botnets.
Meanwhile, spammers are still trying to exploit the public's interest in current events, such as using spam with subject lines and content related to the death of Michael Jackson.
Last year, Postini detected a huge bump in the amount of spam, possibly reflecting successful efforts to create armies of spam-sending compromised PCs that form botnets, Kleha speculated.
Google's global reach and its reliance on metrics and automation help provide its Postini unit with firepower and counter-attack capabilities to limit the number of spam-related casualties.
"At Google we can take advantage of the network effects with the traffic and interaction in the system," Lund said. "We can spot broader patterns" and use machine learning.