The standardization, portability, and reproducibility of analysis pipelines are a well-known problem within the bioinformatics community. Bioinformatic analysis pipelines are often designed for on-premise execution, which inevitably leads to a level of customisation and integration that is only applicable to the local infrastructure. More notably, the software required to run these pipelines is also tightly coupled to the local compute environment, which undermines pipeline portability and the reproducibility of the ensuing results, both of which are fundamental requirements for the validation of scientific findings. Here we introduce nf-core, a framework that provides a community-driven platform for the creation and development of best-practice analysis pipelines written in the Nextflow language. Nextflow has built-in support for pipeline execution on most computational infrastructures, as well as automated deployment using package-management and container technologies such as Conda, Docker, and Singularity. Therefore, key obstacles in pipeline development such as portability, reproducibility, scalability and unified parallelism are inherently addressed by all nf-core pipelines. Furthermore, to ensure that new pipelines can be added seamlessly and that existing pipelines are able to inherit up-to-date functionality, the nf-core community is actively developing a suite of tools that automate pipeline creation, testing, deployment and synchronization. The peer-review process during pipeline development ensures that best practices and common usage patterns are applied and that pipelines therefore adhere to community guidelines. Our primary goal is to provide a community-driven platform for high-quality, well-documented and reproducible bioinformatics pipelines that can be utilized across various institutions and research facilities.
For historic individuals, outward appearance and other phenotypic characteristics often remain unresolved. Unfortunately, images or detailed written sources are scarce in many cases. Attempts to study historic individuals with genetic data have so far focused on hypervariable regions of mitochondrial DNA and, to some extent, on complete mitochondrial genomes. To elucidate the potential of in-solution genome-wide SNP capture methods - as now widely applied in population genetics - we extracted DNA from the 17th century remains of George Bähr, the architect of the Dresdner Frauenkirche. We were able to show that the remains are of male origin, exhibit sufficient DNA damage, and derive from a single person, and are thus likely authentic. Furthermore, we were able to show that George Bähr had light skin pigmentation and most likely brown eyes. His genomic DNA also points to a Central European origin. We see this analysis as an example of the prospects that new in-solution SNP capture methods offer for historic cases of forensic interest, using methods well established in ancient DNA (aDNA) research and population genetics.
In Scientific Reports,
Egypt, located on the isthmus of Africa, is an ideal region to study historical population dynamics due to its geographic location and documented interactions with ancient civilizations in Africa, Asia, and Europe. In particular, in the first millennium BCE Egypt endured foreign domination, leading to growing numbers of foreigners living within its borders and possibly contributing genetically to the local population. Here we analyze mtDNA and nuclear DNA from mummified humans recovered from Middle Egypt that span around 1,300 years of ancient Egyptian history, from the Third Intermediate to the Roman Period. Our analyses reveal that ancient Egyptians shared more ancestry with Near Eastern populations than present-day Egyptians do, who received additional Sub-Saharan admixture in more recent times. This analysis establishes ancient Egyptian mummies as a genetic source to study ancient human history and offers the perspective of deciphering Egypt’s past at a genome-wide level.
In Nature Communications,
The automated reconstruction of genome sequences in ancient genome analysis is a multifaceted process. Here we introduce EAGER, a time-efficient pipeline that greatly simplifies the analysis of large-scale genomic data sets. EAGER provides features to preprocess, map, authenticate, and assess the quality of ancient DNA samples. Additionally, EAGER comprises tools to genotype samples in order to discover, filter, and analyze variants. EAGER combines state-of-the-art tools for each step with new complementary tools tailored for ancient DNA data, within a single integrated solution in an easily accessible format.
In Genome Biology,