Web-based app integrates genetic sequences of pathogens submitted by users with geographic data points so researchers can map spread of diseases, track mutations.
Elizabeth Armstrong Moore
Using genetics, geography, and phylogenetic trees to map how different strains of pathogens evolve and mutate helps researchers predict hot spots where diseases are most likely to reemerge.
Today, the hope is that Supramap, a Web-based application that runs parallel programs on computing systems at the Ohio Supercomputer Center and Ohio State University, will better enable researchers to map the spread of disease among different hosts, as well as to track mutations.
Users can submit raw genetic sequences and see a phylogenetic tree of strains of pathogens. That tree is projected onto the Supramap globe, viewable via Google Earth. Each branch in the evolutionary tree is both geolocated and time-stamped. Pop-up windows and branch colors reveal exactly how pathogens infect new hosts and mutate over space and time.
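Google Earth reads KML files, so a geolocated, time-stamped node of the tree can be pictured as a KML placemark. The sketch below is only an illustration of that idea, not Supramap's actual output: the isolate names, coordinates, dates, and the helper functions `branch_placemark` and `build_kml` are all hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def branch_placemark(name, lon, lat, when):
    """Build one KML Placemark with a location and a time stamp."""
    placemark = ET.Element("Placemark")
    ET.SubElement(placemark, "name").text = name
    # TimeStamp/when lets Google Earth place the node on its time slider.
    time_stamp = ET.SubElement(placemark, "TimeStamp")
    ET.SubElement(time_stamp, "when").text = when
    # KML coordinates are written as longitude,latitude,altitude.
    point = ET.SubElement(placemark, "Point")
    ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    return placemark


def build_kml(nodes):
    """Wrap a list of node dictionaries in a KML document string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    document = ET.SubElement(kml, "Document")
    for node in nodes:
        document.append(branch_placemark(**node))
    return ET.tostring(kml, encoding="unicode")


# Hypothetical nodes: made-up names, places, and dates for illustration only.
nodes = [
    {"name": "isolate_A", "lon": 114.17, "lat": 22.32, "when": "2004-01-15"},
    {"name": "isolate_B", "lon": 37.62, "lat": 55.75, "when": "2005-08-01"},
]
kml_text = build_kml(nodes)
print(kml_text)
```

Saving `kml_text` to a `.kml` file and opening it in Google Earth would show two dated points; a real tree would also need lines connecting ancestors to descendants.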
"Supramap does more than put points on a map--it is tracking a pathogen's evolution," says Daniel A. Janies, co-author of the paper and an associate professor at Ohio State University. "We package the tools in an easy-to-use Web-based application so that you don't need a Ph.D. in evolutionary biology and computer science to understand the trajectory and transmission of a disease."
Janies and colleagues tested Supramap with location and genetic data on the H5N1 ("avian flu") virus. They were able to see a diversity of viral strains from birds and mammals in China, Russia, the Middle East, Africa, and Europe as they spread west over a four-year period. The resulting tree, based on 239 sequences of the gene "polymerase basic 2," revealed that shifts between hosts correlate with a specific mutation that allows avian viruses to adapt to mammalian hosts.
Janies cites the sharing of as much data as possible as key to better controlling the spread of infectious diseases:
There are many efforts by governments and nongovernmental organizations to encourage sharing of raw genomic information, especially for pathogens, but the raw genetic information still needs interpretation, and we are sharing our know-how and even our computers so that this can happen. We aim for our tools to inform decisions about potential global hot spots for the emergence of diseases from animals and areas of drug resistance.
H1N1, for instance, is still around. Users can view it from a global perspective or zoom in as far as street level as they enter instances and specific strains of infection.
Registering takes 30 seconds, and creating projects is pretty straightforward. To start, click on the project name and upload data files in plain text format using Unix line breaks. Sequence files must be in FASTA format.
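Files saved on Windows often carry carriage returns that are not Unix line breaks. As a rough sketch of how a file could be prepared before upload, the Python below converts line endings and runs a loose FASTA sanity check. The function names and the sample isolates are hypothetical, and the check is far less strict than whatever validation Supramap itself may apply.

```python
def to_unix_line_breaks(text):
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def looks_like_fasta(text):
    """Loose check: starts with a '>' header, and no header lacks a sequence."""
    lines = [line for line in text.split("\n") if line.strip()]
    if not lines or not lines[0].startswith(">"):
        return False
    # Pair each line with the next one; a trailing ">" sentinel catches
    # a header on the very last line.
    for current, following in zip(lines, lines[1:] + [">"]):
        if current.startswith(">") and following.startswith(">"):
            return False
    return True


# Hypothetical two-record file saved with Windows line endings.
raw = ">isolate_A\r\nACGTACGT\r\n>isolate_B\r\nACGTTCGT\r\n"
clean = to_unix_line_breaks(raw)
print(looks_like_fasta(clean))
```

In FASTA, each record is a header line beginning with `>` followed by one or more lines of sequence, which is all the check above looks for.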
Use one file per locus; multiple files can be uploaded. The first taxon in the file is treated as the outgroup, which is used to root the tree. The choice of the outgroup taxon is up to the user. In the case of a temporal series of pathogen isolates, the outgroup is often the oldest isolate. In the natural sciences, the outgroup is often selected because it lies outside the set of interest, termed the ingroup. If the outgroup is related to but not a member of the ingroup, then the two groups share a more ancient common ancestor than the one shared by the ingroup alone. Rooting on an ancestor more ancient than the ancestor of the ingroup provides a baseline from which the branching pattern and polarities of changes within the ingroup can be elucidated.
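Because the outgroup is simply whichever taxon comes first, record order in the file matters. The sketch below reads FASTA text in file order and splits off the first record, as a way to double-check which taxon would root the tree. It is a minimal illustration with hypothetical isolate names and sequences, not Supramap's own parser.

```python
def read_fasta(text):
    """Return a list of (taxon_name, sequence) pairs in file order."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequences may wrap across several lines; collect them all.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_outgroup(records):
    """Treat the first record as the outgroup and the rest as the ingroup."""
    if not records:
        raise ValueError("no FASTA records found")
    return records[0], records[1:]


# Hypothetical temporal series, with the oldest isolate listed first.
fasta_text = """>isolate_1997
ACGTACGT
ACGT
>isolate_2003
ACGTTCGT
>isolate_2005
ACGATCGT
"""
outgroup, ingroup = split_outgroup(read_fasta(fasta_text))
print("outgroup:", outgroup[0])
print("ingroup:", [name for name, _ in ingroup])
```

To root the tree on a different taxon, move that record to the top of the file before uploading.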