Collecting the genomic sequences of various strains of the influenza A virus, as well as the coronavirus that causes Severe Acute Respiratory Syndrome, has helped in the fight against outbreaks around the globe in recent years.
Using genetics, geography, and phylogenetic trees to map how different strains of pathogens evolve and mutate helps researchers predict hot spots where diseases are most likely to reemerge.
Today, the hope is that Supramap, a Web-based application that runs in parallel on computing systems at the Ohio Supercomputer Center and Ohio State University, will better enable researchers to map the spread of disease among different hosts and to track mutations.
Users can submit raw genetic sequences and see a phylogenetic tree of strains of pathogens. That tree is projected onto the Supramap globe, viewable via Google Earth. Each branch in the evolutionary tree is both geolocated and time-stamped. Pop-up windows and branch colors reveal exactly how pathogens infect new hosts and mutate over space and time.
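To make the idea of a geolocated, time-stamped tree node concrete, here is a minimal sketch of how one such node could be written as KML, the format Google Earth reads. This is not Supramap's actual output. The function names, the field layout, and the sample values are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def branch_placemark(name, lon, lat, when, host):
    """Build one KML Placemark for a tree node (hypothetical field layout)."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    # The description is what Google Earth shows in the pop-up window.
    ET.SubElement(pm, "description").text = f"Host: {host}"
    # TimeStamp lets Google Earth's time slider show or hide the node.
    ts = ET.SubElement(pm, "TimeStamp")
    ET.SubElement(ts, "when").text = when
    # KML coordinates are ordered longitude,latitude,altitude.
    point = ET.SubElement(pm, "Point")
    ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    return pm

def build_kml(nodes):
    """Wrap a list of node dictionaries in a KML document string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    for node in nodes:
        doc.append(branch_placemark(**node))
    return ET.tostring(kml, encoding="unicode")

# Placeholder values only, not real isolate data.
example = build_kml([
    {"name": "isolate_A", "lon": 10.0, "lat": 20.0,
     "when": "2005-01-01", "host": "avian"},
])
```

Saving the returned string to a file with a .kml extension and opening it in Google Earth should show a single dated placemark. A real tree would also need line segments for the branches and styles for the branch colors.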
"Supramap does more than put points on a map--it is tracking a pathogen's evolution," says Daniel A. Janies, co-author of the paper and an associate professor at Ohio State University. "We package the tools in an easy-to-use Web-based application so that you don't need a Ph.D. in evolutionary biology and computer science to understand the trajectory and transmission of a disease."
Janies and colleagues tested Supramap with location and genetic data on the H5N1 ("avian") virus. They were able to see a diversity of viral strains from birds and mammals in China, Russia, the Middle East, Africa, and Europe as they spread west over a four-year period. The resulting tree revealed that, based on 239 sequences of the gene "polymerase basic 2," shifts between hosts correlate with a specific mutation that allows avian viruses to adapt to mammalian hosts.
Janies cites sharing as much data as possible as key to better controlling the spread of infectious diseases:
"There are many efforts by governments and nongovernmental organizations to encourage sharing of raw genomic information, especially for pathogens, but the raw genetic information still needs interpretation, and we are sharing our know-how and even our computers so that this can happen. We aim for our tools to inform decisions about potential global hot spots for the emergence of diseases from animals and areas of drug resistance."
H1N1, for instance, is still around. Users can view its spread from a global perspective or zoom in as far as street level as they enter individual cases and specific strains of infection.
Registering takes 30 seconds, and creating projects is pretty straightforward. To start, click on the project name and upload data files in plain text format with Unix line breaks. Sequence files must be in FASTA format.
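Files edited on Windows often carry carriage returns, so it can help to check a file before uploading it. The following is a minimal sketch of such a check, not part of Supramap itself. The function names are hypothetical, and treating "plain text" as UTF-8 is an assumption.

```python
def check_upload(data):
    """Return a list of problems found in a candidate FASTA upload (bytes)."""
    problems = []
    # Unix line breaks are a bare LF; any CR means Windows or old Mac endings.
    if b"\r" in data:
        problems.append("contains carriage returns; convert to Unix line breaks")
    try:
        text = data.decode("utf-8")  # assumption: plain text means UTF-8
    except UnicodeDecodeError:
        problems.append("not decodable as plain text")
        return problems
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # A FASTA file starts with a header line that begins with '>'.
    if not lines or not lines[0].startswith(">"):
        problems.append("first non-blank line is not a FASTA header ('>')")
    return problems

def to_unix(data):
    """Convert CRLF and bare CR line breaks to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Reading the file in binary mode, for example with open(path, "rb").read(), keeps the original line breaks intact so the check sees them.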
One file can be used for each locus and multiple files can be used. The first taxon in the file will be considered the outgroup. The outgroup will be used to root the tree. The choice of the outgroup taxon is up to the user. In the case of temporal series of isolates of pathogens, the outgroup is often the oldest isolate. In natural sciences, the outgroup is often selected because it is outside of the set of interest, termed the ingroup. If the outgroup is related to but not a member of the ingroup, then these two groups share a more ancient common ancestor than that shared by the ingroup. Rooting on an ancestor more ancient than the ancestor of the ingroup provides a baseline from which the branching pattern and polarities of changes within the ingroup can be elucidated.
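Because the first taxon in the file becomes the outgroup, record order matters. Here is a minimal sketch, using only the Python standard library, of reading a FASTA file and moving a chosen record to the front. The function names and the 60-character line width are illustrative assumptions, not anything Supramap requires.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs, keeping file order."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def put_outgroup_first(records, outgroup_name):
    """Return a new list with the named record moved to position 0."""
    for i, (name, _) in enumerate(records):
        if name == outgroup_name:
            return [records[i]] + records[:i] + records[i + 1:]
    raise ValueError(f"no record named {outgroup_name!r}")

def write_fasta(records, width=60):
    """Serialize records to FASTA text with Unix line breaks."""
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"
```

For a temporal series, the name passed to put_outgroup_first would typically be that of the oldest isolate. When one file is used per locus, the same reordering would presumably need to be applied to each file.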
The Supramap project was supported by the U.S. Army Research Laboratory and Office, the Defense Advanced Research Projects Agency, Ohio State University, the Google.org fund of the Tides Foundation, and the American Museum of Natural History.