IPlant innovates biodata analysis
Just managing the files on your computer can be difficult enough — now imagine sifting through and analyzing the sequences of thousands of plant genomes.
One tool biology researchers can turn to is iPlant.
The iPlant Collaborative started five years ago at the UA and has just received its second five-year round of funding. The goal of the project is to give biologists of any level of expertise access to cutting-edge technology, high-level supercomputing, software analysis and metadata.
“This kind of approach is essential to getting researchers of all disciplines to work together using large data sets and computational power that is going to be necessary to solve large-scale problems,” said Naim Matasci, an analyst at the BIO5 Institute, which houses iPlant.
Before the technological revolution and 21st-century biology, it was hard to produce even small amounts of DNA sequence data.
“Informatics has always served biology, since we started sequencing,” said Matasci. “It’s the scale that’s new.”
The data now being produced at that scale, however, has yet to yield much new knowledge. Instead, today’s biologists are left with an overabundance of data to work with and analyze, and this is where iPlant thrives.
“It acts as an intermediary between non-computer-savvy biologists and the tools,” said Matasci.
IPlant gives scientists the tools necessary to compare the genetics of a plant and the aspects of its environment in order to see how the phenotype, or the expression of the genes of an organism, will change. The system is powerful enough that it can analyze how one gene can affect one particular organism in one precise setting, making predictive biology a reality.
Software like this is giving even the smallest labs the chance to take their research to heights never before imagined.
“Before, maybe one or two steps would be computational,” said Matasci. “You would turn off your computer, go to your desk, look at your data … think about it hard enough and then write something down.”
With this new cyberinfrastructure, scientists can analyze and compare the genes of hundreds of thousands of plant species.
“All sorts of other ways to observe biological systems are becoming much more quantitative,” said Eric Lyons, a scientific developer on the iPlant project.
Without access to high-level computers, biologists would end up with massive amounts of data and nothing to do with it.
“We’re trying to understand the structure, evolution and dynamics of the genome. … [Biologists] need to use computational tools to make sense of it, as well as having the appropriate types of cyberinfrastructure to manage the life cycle of their data,” Lyons said. “Reusing data that people have already generated, and using them in a more creative way than the original author had intended, makes science happen faster than ever.”
Recently, though, there has been a pushback against big science: Many people think that science funding should be spread among many smaller labs.
IPlant may not exist five years from now, according to many of the researchers, but its basic idea will live on through a network of data and collaboration that lets the biology community expand its research to new horizons.
“We are not one group of scientists doing big science,” Matasci said. “What we are doing is really allowing the smaller labs to pool resources to create a loose consortium.”