Various next-generation sequencing (NGS)-based strategies have been used successfully in recent years to trace the origins and evolution of infectious agents, to investigate the spread and transmission chains of outbreaks, to facilitate the development of rapid and effective molecular diagnostic tests, and to contribute to the search for treatments and vaccines. The ongoing COVID-19 pandemic poses one of the greatest global threats in modern history and has already imposed severe social and economic costs. The development of efficient and rapid methods to reconstruct the genomic sequence of SARS-CoV-2, the etiological agent of COVID-19, has been fundamental for designing molecular diagnostic tests and for devising effective strategies to mitigate the spread of the pandemic.
As the number of available sequences testifies, diverse sequencing methods and approaches can be applied to SARS-CoV-2 genomes. However, each technology and sequencing approach has its own advantages and limitations. Researchers from the University of Bari provide a brief but comprehensive account of currently available platforms and methodological approaches for the sequencing of SARS-CoV-2 genomes. The researchers also outline the current repositories and databases that provide access to SARS-CoV-2 genomic data and associated metadata. Finally, they offer general advice and guidelines for the appropriate sharing and deposition of SARS-CoV-2 data and metadata, and suggest that more efficient and standardized integration of current and future SARS-CoV-2-related data would greatly facilitate the struggle against this new pathogen. The researchers hope that their ‘vademecum’ for the production and handling of SARS-CoV-2-related sequencing data will contribute to this objective.
Overview of the properties of different approaches for SARS-CoV-2 genome sequencing
(A) Violin plot of the size of SARS-CoV-2 genome assemblies obtained through different sequencing approaches. Assembly size, in Knt (kilonucleotides), is reported on the x-axis. (B) Violin plot of the sequencing depth (log10 of the total number of sequenced bases) obtained by different sequencing approaches. (C) Profile of normalized coverage levels across the SARS-CoV-2 genome, as obtained from different sequencing approaches. Coverage profiles were calculated on 300 non-overlapping genomic windows of 100 nt each. For every sequencing approach, the coverage profile was estimated from a subset of 100 distinct records available from public repositories of raw sequencing data. Coverage values were normalized using upper-quartile normalization and then averaged across records for every data point (genomic window).
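The coverage-profile calculation described for panel (C) can be illustrated with a minimal Python sketch. It is not the researchers' actual pipeline. It assumes that each record has already been reduced to a list of per-base depth values (for example, from a depth-reporting tool), that the upper quartile is the 75th percentile with linear interpolation computed over all windows (including zero-coverage ones), and that a final window shorter than 100 nt is averaged over the bases it contains. All function names are hypothetical.

```python
WINDOW_SIZE = 100   # nt per window, as stated in the caption
N_WINDOWS = 300     # number of non-overlapping windows, as stated in the caption


def window_means(depth, window_size=WINDOW_SIZE, n_windows=N_WINDOWS):
    """Mean depth in consecutive, non-overlapping windows of one record."""
    means = []
    for i in range(n_windows):
        chunk = depth[i * window_size:(i + 1) * window_size]
        if not chunk:          # ran past the end of the genome
            break
        means.append(sum(chunk) / len(chunk))
    return means


def upper_quartile(values):
    """75th percentile with linear interpolation (an assumed convention)."""
    ordered = sorted(values)
    pos = 0.75 * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def normalize(profile):
    """Divide every window value by the record's upper quartile."""
    uq = upper_quartile(profile)
    if uq == 0:
        raise ValueError("upper quartile is zero; record has too little coverage")
    return [value / uq for value in profile]


def average_profile(depth_per_record, window_size=WINDOW_SIZE, n_windows=N_WINDOWS):
    """Average the normalized window profiles of several records."""
    profiles = [
        normalize(window_means(depth, window_size, n_windows))
        for depth in depth_per_record
    ]
    # zip(*profiles) groups the values of the same window across records
    return [sum(column) / len(column) for column in zip(*profiles)]


if __name__ == "__main__":
    # Two toy "records" of 400 nt each, using 4 windows of 100 nt
    record_a = [10] * 200 + [20] * 200
    record_b = [4] * 400
    print(average_profile([record_a, record_b], window_size=100, n_windows=4))
```

Normalizing each record by its own upper quartile before averaging keeps deeply sequenced records from dominating the profile, so the resulting curves reflect the shape of coverage along the genome, not the total sequencing depth.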