We've got tapeworms and scabies! And reproducible research
Two new research papers on scabies and tapeworms, published today in the Open Access journal GigaScience, also include a collaboration with protocols.io. This collaboration showcases a new way to share scientific methods that allows scientists to better repeat and build on these complicated studies of difficult-to-study parasites. It also highlights a new means of writing research papers with citable methods that can be updated over time, keeping work clear, consistent, and current.
Parasitology remains a complex field given the often extreme differences between parasites, which all fall under the umbrella definition of an organism that lives in or on another organism (the host) and derives nutrients at the host's expense. Published today in the Open Access journal GigaScience are articles on two parasitic organisms: the scabies mite and the tapeworm Schistocephalus solidus. Beyond their shared subject, both papers present their Methods in a new way that serves to improve reproducibility. Here the authors take advantage of protocols.io, an open access repository of scientific methods and a collaborative protocol-centered platform. New mechanisms for presenting scientific research are a must to improve reusability of scientific information.
Currently, the most common way of presenting methods in articles is in extremely brief paragraphs or as supplemental downloadable PDF files. The result is often incomplete or non-discoverable methodology: a serious problem given that methodology is key for scientists to properly build on scientific discovery. The parasitology articles published today are the first two studies to showcase the seamless integration of clear, detailed, and complete methodology descriptions into the manuscript submission and publication process. The protocols.io platform enables researchers to submit their methods in a standard format, with no space limitations, that can be directly linked to any article simply through a citable link. These methods can also be searched online and, best of all, can be versioned, allowing for adaptations in future work. Not only does this give the research community easy access to detailed methods, it also means authors don't have to continually rewrite methods for every paper that uses them. They can simply cite and credit the 'recipe' in protocols.io.
It seems fitting that the complexity of making scientific reporting reproducible is demonstrated in papers that capture the complexity of parasitic organisms, and, in these cases, parasites that require many different complicated experimental steps and unusual computational pipelines to study them.
In the first study, researchers from the Walter and Eliza Hall Institute of Medical Research and four other Australian institutions studied the genome of the human scabies parasite collected from remote disadvantaged and indigenous communities in Northern Australia, where up to 25% of adults and 50% of children acquire scabies infections each year. Scabies infections are linked to bacterial skin infections and rheumatic fever. As a consequence, children with scabies do less well, and this is a contributing factor to indigenous Australians having significantly reduced life expectancy and among the highest rates of rheumatic heart disease in the world.
Until now, studying this species has been challenging. Because the mites are fractions of a millimeter in size, the researchers needed to collect about 1000 mites per sample to obtain enough DNA for next generation sequencing. In addition to the complications of collecting and pooling the mites, their tiny size also meant the researchers had to deal with contamination from the mites' gut contents. All of these variables can make it difficult to describe clearly how conclusions are derived and how the research can be built on. The lead author Anthony Papenfuss, discussing the challenges of communicating this work, stated: "Writing clear and accurate descriptions of the wet lab and bioinformatics methods is a challenge at the best of times. It is especially hard when the design is complex and requires iterative exploratory analysis using multiple tools. It necessitates great care and time consuming refinement of the text. I think documenting the methods using protocols.io will make this much easier."
In the second paper, researchers from the Institut de Biologie Intégrative et des Systèmes and University of Leicester studied the molecular biology of the parasitic tapeworm Schistocephalus solidus. Although S. solidus has served as an emblematic study system in parasitology for two centuries, it has an extremely complicated life cycle with multiple developmental stages and host species (parasitizing crustaceans, fish and birds). As a consequence, while much is known about its morphology and physiology, knowledge of which genes are used at each stage of infection has been comparatively lacking. The work here includes recreating the different host conditions and collecting living worms from the different life-cycle stages to collect RNA and produce a transcriptome gene catalogue. First author François-Olivier Hébert explained: "Describing such a long process of field sampling, experimental infections in the lab using multiple hosts and, of course, the complementary bioinformatic analyses, was one of the greatest challenges in this paper". With the new integrated data and method publishing pipeline aiding this, the authors added: "We were able to achieve that by making all of our homemade scripts, programs and datasets freely available to the public through GigaScience, GigaDB and protocols.io. They represent essential complementary platforms that allowed us to respect our vision of a reproducible science".
Mofiz E. et al., Genomic resources and draft reference assemblies of the human and porcine scabies mites, Sarcoptes scabiei var. hominis and var. suis. GigaScience. 2016. DOI: 10.1186/s13742-016-0129-2 http://dx.doi.org/10.1186/s13742-016-0129-2
Mofiz, E; Holt, D; Seemann, T; Currie, B, J; Fischer, K; Papenfuss, A, T (2016): Draft genome assembly using parasitic mite population NGS DNA sample from mites extracted from host wound environment. Protocols.io. http://dx.doi.org/10.17504/protocols.io.exwbfpe
Hebert FO. et al., Reference transcriptome for the parasite Schistocephalus solidus: insights into the molecular evolution of parasitism. GigaScience. 2016. DOI:10.1186/s13742-016-0128-3 http://dx.doi.org/10.1186/s13742-016-0128-3
Hebert, F.O.; Grambauer, S.; Barber, I.; Landry, C.R.; Aubin-Horth, N. (2016): Protocols for "Reference transcriptome sequence resource for the study of the Cestode Schistocephalus solidus, a threespine stickleback parasite.". Protocols.io. http://dx.doi.org/10.17504/protocols.io.ew9bfh6
Notes to News Writers:
1. Publication in GigaScience includes storage of relevant associated data in the journal's affiliated database, GigaDB, where every dataset is provided with a digital object identifier (DOI). This makes it possible to cite and track data in standard scientific literature, which serves as a strong incentive for researchers to more rapidly release expensive and work-intensive datasets for community use. For these two papers, the datasets are:
Mofiz, E; Holt, D; Seemann, T; Currie, B, J; Fischer, K; Papenfuss, A, T (2016): The data for: Genomic resources and draft reference assemblies of the human and porcine scabies mites, Sarcoptes scabiei var. hominis and var. suis. GigaScience Database. http://dx.doi.org/10.5524/100198
Hebert, F, O; Grambauer, S; Barber, I; Landry, C, R; Aubin-Horth, N (2016): Reference transcriptome sequence resource for the study of the Cestode Schistocephalus solidus, a threespine stickleback parasite. GigaScience Database. http://dx.doi.org/10.5524/100197
Additionally, by citing data, and now protocols, readers of scientific papers can more easily find and access the specific information used in a paper. As always with genomic data, where there is a community-approved data repository, and also to provide the most extensive availability of these data to the community, the sequence reads for the two projects have been submitted to the SRA with BioProjects PRJEB12428 and PRJNA304161.
2. GigaScience is co-published by BGI, the world's largest genomics organization, and BioMed Central, the world's largest open-access publisher. The journal covers research that uses or produces 'big data' from the full spectrum of the life sciences. It also serves as a forum for discussing the difficulties of and unique needs for handling large-scale data from all areas of the life sciences. The journal has a completely novel publication format: one that integrates manuscript publication with complete data hosting and analysis tool incorporation. To encourage transparent reporting of scientific research as well as enable future access and analyses, it is a requirement of manuscript submission to GigaScience that all supporting data and source code be made available in the GigaScience database, GigaDB (http://gigadb.org), as well as in their publicly available repositories. GigaScience will provide users access to associated online tools and workflows, and has integrated a data analysis platform, maximizing the potential utility and re-use of data. Follow GigaScience on Twitter @GigaScience and on Facebook https://www.facebook.com/GigaScience/, and keep up-to-date with our blog http://blogs.biomedcentral.com/gigablog/.
3. Protocols.io was conceived in 2012 by geneticist Lenny Teytelman and computer scientist Alexei Stoliartchouk to facilitate science communication and rapid sharing of knowledge. The protocols.io platform is a free open access service for academic and industry scientists to record and share detailed up-to-date protocols for research. It provides an open access hub for scientists to share improvements and corrections to science methods. The company is located in Berkeley, California. Follow @ProtocolsIO on Twitter.