Terabytes of data are streaming through dedicated fiber-optic links between laboratories and universities globally in preparation for the world's largest particle accelerator, the Large Hadron Collider, being switched on in August at CERN in Geneva, Switzerland.
The Large Hadron Collider Computing Grid (LCG), a super-high-bandwidth network, will channel about 15 petabytes--15 million gigabytes--of data from the LHC to about 5,000 scientists in 500 institutions every year for at least 10 years.
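To put the yearly volume in perspective, the figures above can be turned into an average data rate. The following is a rough sketch only: it assumes decimal units (1 petabyte = 1 million gigabytes, as the article states) and assumes the data is spread evenly over a 365-day year, which real accelerator traffic would not be.

```python
# Rough arithmetic on the yearly data volume quoted in the article.
# Assumption: traffic is averaged evenly over a 365-day year.

GIGABYTES_PER_PETABYTE = 1_000_000   # decimal units, as in the article
MEGABYTES_PER_GIGABYTE = 1_000

yearly_petabytes = 15
yearly_gigabytes = yearly_petabytes * GIGABYTES_PER_PETABYTE

seconds_per_year = 365 * 24 * 60 * 60

# Average rate needed to move the whole yearly volume once.
average_megabytes_per_second = (
    yearly_gigabytes * MEGABYTES_PER_GIGABYTE / seconds_per_year
)

print(f"{yearly_gigabytes:,} GB per year")
print(f"about {average_megabytes_per_second:.0f} MB/s on average")
```

On these assumptions the average works out to a little under 500 megabytes per second, before any copies are made for the other sites on the grid.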
The LCG will allow researchers to tap into the distributed processing power of almost 100,000 CPUs, crunching through vast amounts of data from the detectors and speeding their hunt for clues about the fundamental nature of the universe.
Rutherford Appleton Laboratory (RAL), near Oxford, England, has a 10-gigabit-per-second connection to CERN, equivalent to 1,250 megabytes per second in each direction, that will pipe in almost-raw data from the collider via GridPP, the U.K. part of the LCG.
Andrew Sansum, tier one manager at RAL, said its connection with CERN is about 1,000 times faster than the download speeds on a home broadband connection.
It may be less than two decades before commercial networks catch up: "Video and other media services are going to push the speed of consumer network connections up as the demand is going to be huge," Sansum said. "We were at today's speed of about 10Mbps about 10 to 15 years ago, so you could take that as a precedent for how long it will take for the commercial networks to catch up with us today."
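The speed comparisons here rest on simple unit arithmetic, sketched below. The 10Gbps link speed and the roughly 10Mbps home broadband speed come from the article; the implied yearly growth factor is an illustration of Sansum's precedent (a thousandfold increase over 10 to 15 years), not a forecast, and it assumes steady compound growth.

```python
# Unit arithmetic behind the link-speed comparisons in the article.

BITS_PER_BYTE = 8
MEGABITS_PER_GIGABIT = 1_000  # decimal networking units


def gigabits_to_megabytes_per_second(gigabits_per_second):
    """Convert a line rate in Gbps to megabytes per second."""
    return gigabits_per_second * MEGABITS_PER_GIGABIT / BITS_PER_BYTE


link_gbps = 10   # RAL's connection to CERN
home_mbps = 10   # "today's speed" for consumers, per Sansum

link_megabytes_per_second = gigabits_to_megabytes_per_second(link_gbps)
speedup_over_home = link_gbps * MEGABITS_PER_GIGABIT / home_mbps

# Assumption: steady compound growth. A 1,000-fold increase spread
# over 10 or 15 years implies these yearly multipliers.
growth_if_10_years = speedup_over_home ** (1 / 10)
growth_if_15_years = speedup_over_home ** (1 / 15)

print(f"{link_megabytes_per_second:.0f} MB/s, {speedup_over_home:.0f}x home broadband")
print(f"yearly growth factor: {growth_if_15_years:.2f} to {growth_if_10_years:.2f}")
```

In other words, consumer speeds would need to grow by roughly 60 to 100 percent a year to follow the same path.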
RAL and other "tier one" sites across the world in the LCG will shape the mass of data from the LHC into chunks that can be usefully analyzed by physicists and pass it on to hundreds of "tier two" universities and laboratories in their respective countries.
"The LHC experiment would not be possible without the power and throughput of the LCG. CERN has not got the capacity to solely process the vast amount of data on site. The tier one sites will be busy refining the data and enhancing the software that analyses it, growing the processing operations of the grid," Sansum said.
"Our role," he said, "is to make sure that those physicists are getting the most useful and relevant data. Grid technology is transforming the way that experiments are being carried out. Ten years ago these institutions were working on their own; now they work closely together."
Sansum said RAL and the GridPP are prepared for the LHC going live. "We have run it up to 250Mbps to 300Mbps each way sustained over several days so far. We are in the final shakedown at the moment and seem to be in good shape to face the challenges the LHC will throw at us," he said.
There are bound to be surprises around the corner, he acknowledged. "The biggest challenge is for the software to work out which of the 200 or so tier two sites has which data. You need to be able to move vast amounts of data from site to site, check it has all got there, flag up any problems and correct those immediately--it quickly gets immensely complicated," Sansum said.
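The bookkeeping problem Sansum describes (knowing which site holds which data, checking that a transfer arrived intact and retrying when it did not) can be illustrated with a toy sketch. This is not the LCG's actual software: the `ReplicaCatalog` class, the `transfer` function, the site and dataset names, and the choice of SHA-256 checksums are all hypothetical, chosen only to show the general idea.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Fingerprint a payload so both ends can compare it (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


class ReplicaCatalog:
    """Hypothetical record of which sites hold a verified copy of each dataset."""

    def __init__(self):
        self._sites_by_dataset = {}

    def register(self, dataset, site):
        self._sites_by_dataset.setdefault(dataset, set()).add(site)

    def sites_holding(self, dataset):
        return sorted(self._sites_by_dataset.get(dataset, set()))


def transfer(dataset, payload, destination, catalog, send, max_attempts=3):
    """Send a payload, verify it on arrival, and retry if it is corrupted.

    `send` stands in for the network: it takes bytes and returns what arrived.
    Returns the number of attempts used; raises if every attempt fails.
    """
    expected = checksum(payload)
    for attempt in range(1, max_attempts + 1):
        received = send(payload)
        if checksum(received) == expected:
            # Only record the replica once the copy is verified.
            catalog.register(dataset, destination)
            return attempt
        print(f"{dataset}: checksum mismatch on attempt {attempt}, retrying")
    raise RuntimeError(f"{dataset}: transfer to {destination} failed")


def make_flaky_link():
    """Simulated link that truncates the first transfer, then behaves."""
    calls = {"count": 0}

    def send(payload):
        calls["count"] += 1
        return payload[:-1] if calls["count"] == 1 else payload

    return send


catalog = ReplicaCatalog()
attempts_used = transfer(
    "run-0001", b"example detector events", "tier2-example",
    catalog, make_flaky_link(),
)
print(attempts_used, catalog.sites_holding("run-0001"))
```

Even in this toy form, the catalog is updated only after a copy has been verified, which hints at why the real task "quickly gets immensely complicated" across some 200 sites.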
A wide range of projects are already tapping into the vast number-crunching capabilities and fat pipes of the GridPP during its downtime, including efforts to search for antimalarial drugs, combat avian flu and run an image search engine.
Various other grid projects around the world analyze weather data or support scientific and academic collaborations, but none matches the scale and sustained throughput of the LCG.
Grid technology will continue to grow in use, according to Sansum, linking up diverse data, such as climate information and localized cancer rates, and offering insight and driving scientific progress forward in ways never before possible.
Nick Heath of Silicon.com reported from London.