Green digitization: Botanical collections data answer real-world questions
Even as botany has moved firmly into the era of "big data," some of the most valuable botanical information remains inaccessible for computational analysis, locked in physical form in the orderly stacks of herbaria and museums. Herbarium specimens are plant samples collected from the field that are dried and stored with labels describing species, date, and location of collection, along with other information such as habitat descriptions. The detailed historical record these specimens keep of species occurrence, morphology, and even DNA provides an unparalleled data source to address a variety of morphological, ecological, phenological, and taxonomic questions. Now efforts are underway to digitize these data and make them easily accessible for analysis.
Two symposia were convened to discuss the possibilities and promise of digitizing these data, first at the Botanical Society of America's 2017 annual meeting in Fort Worth, Texas, and again at the XIX International Botanical Congress in Shenzhen, China. The proceedings of those symposia have been published as a special issue of Applications in Plant Sciences; the articles discuss a range of methods and remaining challenges for extracting data from botanical collections, as well as applications for collections data once digitized. Many of the authors contributing to the issue are involved in iDigBio (Integrated Digitized Biocollections), a new "national coordinating center for the facilitation and mobilization of biodiversity specimen data," as described by Dr. Gil Nelson, a botanist at Florida State University and coeditor of this issue.
iDigBio is funded by the U.S. National Science Foundation's Advancing Digitization of Biodiversity Collections initiative, and has already digitized about 50 million herbarium specimens. According to Dr. Nelson, "A primary significance has been community building among biodiversity scientists, curators, and collections managers, and developing and disseminating recommended practices and technical skills for getting these jobs done." The challenges of digitizing these data are formidable, said Dr. Nelson, and include "developing computer vision techniques for making species determinations and scoring phenological traits, and developing effective natural language processing algorithms for parsing label data."
But as the papers in this issue show, steady progress is being made in developing methods to address these challenges. Nelson et al. (2018) and Contreras (2018) address more nuts-and-bolts issues of data management, the former discussing the need for globally unique IDs for herbarium specimens, and the latter providing a workflow for digitizing new fossil leaf collections. Botella et al. (2018) review and discuss the prospects for "computer vision" aided by deep-learning neural networks that, while in their infancy, could eventually identify species from variable images. Yost et al. (2018) offer a protocol for digitizing data on phenology (the timing of events such as flowering or fruiting) from herbarium specimens.
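As a rough illustration of the globally unique ID idea discussed by Nelson et al. (2018), the sketch below attaches a random UUID to a specimen record using Python's standard library. The record fields, the "guid" key, and the helper names are hypothetical and are not taken from the paper, which may recommend a different identifier scheme.

```python
import uuid


def mint_specimen_guid() -> str:
    """Return a random (version 4) UUID as a string."""
    return str(uuid.uuid4())


def attach_guid(record: dict) -> dict:
    """Return a copy of the record with a GUID, keeping any existing one."""
    updated = dict(record)  # leave the caller's record untouched
    if "guid" not in updated:
        updated["guid"] = mint_specimen_guid()
    return updated


# Hypothetical specimen record; the field names are assumptions.
record = {"catalog_number": "EX-0001", "collected": "1835-06-12"}
tagged = attach_guid(record)
```

Because an existing identifier is never overwritten, running the helper again on an already tagged record leaves its ID unchanged, which is the property that lets a digital record stay linked to one physical specimen.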
These digitization methods can help unlock valuable herbarium data to address a range of questions. James et al. (2018) discuss how digitized herbarium specimens can be used to show how plant species have responded to global change, for example by using location and time data to model shifts in range. Cantrill (2018) discusses how the Australasian Virtual Herbarium database has been used for ecological and other research. Thiers and Halling (2018) extend the applications to the fungal world, showing how herbarium data can be used as a baseline to determine the distribution of macrofungi in North America. Furthermore, digitization efforts can have a real payoff in public perception; Dr. Nelson sees an "increasing presence of biodiversity data and museums in the popular press, which has raised the profiles of herbaria and other collections for the general public." Along these lines, von Konrat et al. (2018) show how digital herbarium data can be used to engage citizen scientists.
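To make the range-shift idea mentioned above concrete, the sketch below groups specimen records by collection decade and averages their latitudes. It is a deliberately crude summary rather than a species distribution model, and it is not the method used by James et al. (2018); the function name and the sample records are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_latitude_by_decade(records):
    """Average collection latitude per decade from (year, latitude) pairs."""
    by_decade = defaultdict(list)
    for year, latitude in records:
        by_decade[(year // 10) * 10].append(latitude)
    # Sort so that decades appear in chronological order.
    return {decade: mean(lats) for decade, lats in sorted(by_decade.items())}


# Made-up (year, latitude in degrees north) pairs, not real specimen data.
records = [(1835, 40.0), (1838, 41.0), (2011, 42.0), (2015, 43.0)]
summary = mean_latitude_by_decade(records)
```

A rising mean latitude across decades would only hint at a shift; collecting effort varies over time and place, so any real analysis would need to account for sampling bias.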
Through centuries of painstaking collection and cataloguing, botanists have created a unique and irreplaceable bank of data in the tens of millions of herbarium specimens worldwide. But converting a dried, pressed plant specimen with a handwritten label from 1835 into a format that you can fit on a USB stick is no small trick. Using creative thinking, sophisticated methodology, and hard work, these scientists are bringing the valuable information locked in herbarium specimens into the digital age.
The Applications in Plant Sciences special issue "Green digitization: Online botanical collections data answering real-world questions" is available online at: https://onlinelibrary.wiley.com/toc/21680450/6/2
Soltis, P. S., G. Nelson, and S. A. James. 2018. Green digitization: Online botanical collections data answering real-world questions. Applications in Plant Sciences 6(2): e1028. https://doi.org/10.1002/aps3.1028
Articles in the issue:
Botella, C., A. Joly, P. Bonnet, P. Monestiez, and F. Munoz. 2018. Species distribution modeling based on the automated identification of citizen observations. Applications in Plant Sciences 6(2): e1029. https://doi.org/10.1002/aps3.1029
Cantrill, D. J. 2018. The Australasian Virtual Herbarium: Tracking data usage and benefits for biological collections. Applications in Plant Sciences 6(2): e1026. https://doi.org/10.1002/aps3.1026
Contreras, D. L. 2018. A workflow and protocol describing the field to digitization process for new project-based fossil leaf collections. Applications in Plant Sciences 6(2): e1025. https://doi.org/10.1002/aps3.1025
James, S. A., P. S. Soltis, L. Belbin, A. D. Chapman, G. Nelson, D. L. Paul, and M. Collins. 2018. Herbarium data: Global biodiversity and societal botanical needs for novel research. Applications in Plant Sciences 6(2): e1024. https://doi.org/10.1002/aps3.1024
Nelson, G., P. Sweeney, and E. Gilbert. 2018. Use of globally unique identifiers (GUIDs) to link herbarium specimen records to physical specimens. Applications in Plant Sciences 6(2): e1027. https://doi.org/10.1002/aps3.1027
Thiers, B. M., and R. E. Halling. 2018. The Macrofungi Collection Consortium. Applications in Plant Sciences 6(2): e1021. https://doi.org/10.1002/aps3.1021
von Konrat, M., T. Campbell, B. Carter, M. Greif, M. Bryson, J. Larraín, L. Trouille, et al. 2018. Using citizen science to bridge taxonomic discovery with education and outreach. Applications in Plant Sciences 6(2): e1023. https://doi.org/10.1002/aps3.1023
Yost, J. M., P. W. Sweeney, E. Gilbert, G. Nelson, R. Guralnick, A. S. Gallinat, E. R. Ellwood, et al. 2018. Digitization protocol for scoring reproductive phenology from herbarium specimens of seed plants. Applications in Plant Sciences 6(2): e1022. https://doi.org/10.1002/aps3.1022
Applications in Plant Sciences (APPS) is a monthly, peer-reviewed, open access journal focusing on new tools, technologies, and protocols in all areas of the plant sciences. It is published by the Botanical Society of America, a nonprofit membership society with a mission to promote botany, the field of basic science dealing with the study and inquiry into the form, function, development, diversity, reproduction, evolution, and uses of plants and their interactions within the biosphere. APPS is available as part of the Wiley Online Library.
For further information, please contact the APPS staff at [email protected]