With an estimated 350,000 plant species on Earth, one of the greatest challenges facing ecologists is quantifying plant diversity and understanding its relationship to plant survival. To make matters more difficult, even with all of the observation data collected by independent botanists and researchers over the last 500 years, it has been nearly impossible to organize these data without the appropriate computational power.
Recent access to big data and supercomputing resources, however, is revolutionizing the field. The resulting research has recently been published in a series of papers in journals including Ecology, Ecography, Ecology and Evolution, and the Proceedings of the National Academy of Sciences.
The Botanical Information and Ecology Network (BIEN) is an international group that brings together collectors of botanical data, ecologists and computer scientists worldwide to categorize and analyze plant species in the Americas. Using high-performance computing (HPC) and data resources through the iPlant Collaborative and the Texas Advanced Computing Center (TACC), the researchers combined data ranging from some of the very earliest plant collections to modern-day herbarium specimens, ecological surveys and measurements of plant traits, in order to answer important questions about plant diversity.
The group found that there are approximately 120,000 plant species in all of North and South America. But mapping species richness and identifying its hotspots requires computationally intensive geographic range estimates. These methods can precisely track where plant species tend to grow and develop. They also give ecologists the ability to document continental-scale patterns of species diversity that show where any species of plant can be found. Such maps have never been made before.
“As you move along a gradient, such as a gradient in latitude or temperature or precipitation, we see changes in diversity,” said Brian Enquist, principal investigator of the BIEN group and professor at the University of Arizona. “As ecologists, an important question we want to understand is what determines the total number of species we see in a given location.”
A firm grasp of plant diversity also allows ecologists to understand how diversity is related to species’ survival or functioning in different environments and how species diversity can influence the functioning of ecosystems. The prevailing school of thought is that locations with fewer species have lower ecosystem functioning. According to Enquist, ecologists commonly think of species diversity and ecosystem functioning as similar to investing in stocks.
“In general, you want to diversify your portfolio, so that you invest in a lot of different things,” Enquist said. “We have the same analogy in thinking about ecosystems. If there are fewer ways species can make a living by having a smaller number of traits or functions, it’s thought that ecosystems are less resilient. They tend to be more susceptible to big changes and crashes.”
High performance ecology
To investigate these claims further, the BIEN group first had to run algorithms that incorporate botanical records and observation data from numerous ecological and museum sources for each species. In 2011, the group turned to the iPlant Collaborative, a biotechnology project that allows researchers to access HPC and to manage their data.
“Running geographic range estimation algorithms one species at a time is not that challenging, but once you scale it up to more than 100,000 species, it becomes computationally limiting with our standard set of resources,” Enquist said.
Working with iPlant gave the group access to TACC’s Stampede supercomputer, one of the most powerful in the world, enabling them to dramatically scale up their algorithms and workflow.
With supercomputing, the BIEN group could for the first time generate and store geographic range estimates for plant species in the Americas. These data and the corresponding geographic ranges are now available via the group’s geoportal. After analyzing these data, the researchers were able to map and visualize the plant diversity of the New World. The group found that there is no relationship between the diversity of species and the range of ecological functioning, overturning a popular theory in ecology.
“We’ve discovered something that no one has hit upon,” Enquist said. “Not only are we able to visualize the distribution of plant diversity for the first time, but our findings have also completely changed the way we think about the potential functioning of ecosystems and the role that biodiversity has to play.”
Martha Narro, senior projects coordinator for iPlant, facilitated the group’s interactions with resources at iPlant and TACC.
“The effort it took for the BIEN group to pull all of this data together by networking with scientists in North, Central, and South America is an incredible accomplishment,” Narro said. “It shows what can happen if researchers collaborate to accomplish something they could never do with their own small datasets or by working piecemeal.”
These findings advance basic science and public understanding of the wealth of ecological diversity in North and South America.
From theory to the masses
One practical application of the BIEN group’s efforts is the new smartphone application, Plant-O-Matic, which was developed using TACC resources. The free app offers users the opportunity to explore and discover plants found in a localized region. “Even if you are in the middle of the Amazon, the top of a mountain in Colorado, or your backyard, we can deliver a personalized plant species guide to you,” Enquist said.
The group is also examining the distribution of plant species in light of climate change. By integrating data from their geographic range estimates with available climate change predictions from the Intergovernmental Panel on Climate Change (IPCC), the BIEN group developed the Web site Forest Forecasts. The site provides an interactive visualization of the best- and worst-case scenarios of how distinct species and forests will be affected by climate change up to the year 2080.
Enquist was recently invited to speak at the Aspen Ideas Festival to unveil the first of these visualizations and to show how access to high-performance computing gives researchers new ways to visualize and personalize the potential impacts of climate change. Although Forest Forecasts is still in its infancy, the group plans to use these data to inform policymakers and the public of the effects of climate change on forest diversity in the western United States.
These findings are the result of the group’s first attempts to analyze plant data. Enquist anticipates making additional waves in the field with future data outputs and high-impact research papers. But the researchers have only reached the tip of the iceberg of ecological understanding.
Said Enquist: “We only have rudimentary knowledge in terms of what sets diversity of species and functioning. As ecologists and evolutionary biologists we’re still at the starting point and just getting our first glimpses.”
The iPlant Collaborative is a federation of the University of Arizona, Texas Advanced Computing Center, Cold Spring Harbor Laboratory, and the University of North Carolina Wilmington. iPlant is funded by the National Science Foundation under award numbers DBI-0735191 and DBI-1265383.