In amplicon-based NGS runs, a mixture of known bacteria, a so-called mock, is often included as one of the samples.
This sample can be used to tune processing pipelines and to calculate error rates.
The latter assumes that the exact marker gene sequence of each strain in the mock is known.
However, even the same bacterial strain can show variants with respect to its database sequence.
Indeed, amplicons can also be used to confirm species and to uncover new sequence variants.
NGS-eval is a web server tailored to calculate sequencing error rates as well as to detect new variants.
This server can also be used to correct the mock reference sequences.
You upload your (trimmed) reference sequences and the (processed) reads of your mock sample.
A mock sample typically contains a few thousand reads, which are processed in several minutes.
The output supports interactive plotting of the errors for each reference sequence.
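To make the idea of per-reference error counts concrete, here is a minimal Python sketch that tallies substitutions, insertions and deletions from a single gapped pairwise alignment of a read against its reference. It is an illustration only and not NGS-eval's actual implementation: the function name `tally_errors`, the use of `-` as the gap character, and the definition of the error rate as errors divided by alignment columns are all assumptions made for this example.

```python
from collections import Counter


def tally_errors(aligned_ref: str, aligned_read: str) -> dict:
    """Count errors in one gapped pairwise alignment (hypothetical helper).

    Both strings must have the same length and use '-' for gaps.
    """
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned strings must have equal length")

    counts = Counter()
    per_position = Counter()  # 0-based reference position -> number of errors
    ref_pos = -1              # index of the last reference base seen so far

    for r, q in zip(aligned_ref, aligned_read):
        if r != "-":
            ref_pos += 1
        if r == "-" and q == "-":
            continue  # column without information
        if r == "-":
            kind = "insertion"      # extra base in the read
        elif q == "-":
            kind = "deletion"       # reference base missing from the read
        elif r.upper() != q.upper():
            kind = "substitution"
        else:
            counts["match"] += 1
            continue
        counts[kind] += 1
        # Insertions are attributed to the preceding reference base
        # (or to position 0 if they occur before the first base).
        per_position[max(ref_pos, 0)] += 1

    errors = counts["substitution"] + counts["insertion"] + counts["deletion"]
    columns = errors + counts["match"]
    return {
        "counts": dict(counts),
        "per_position": dict(per_position),
        "error_rate": errors / columns if columns else 0.0,
    }


result = tally_errors("ACGT-ACGT", "ACCTTAC-T")
print(result["counts"], result["per_position"], result["error_rate"])
```

Summing `per_position` over all reads mapped to one reference would give the kind of per-position error profile that can then be plotted.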
The example output shows the results for a mock sample, in which 16S rDNA (V4) amplicon reads have been mapped to the 17 reference 16S rDNA sequences of the bacteria in the sample.
This example contains 43,707 (chimera-filtered) forward reads from a MiSeq 250-nt paired-end run (FASTQ format, gzipped size: 4.7 MB).
You can reproduce this output using the example data.
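Before uploading, it can be useful to check how many reads a gzipped FASTQ file contains. The following Python sketch does this under the assumption that every record spans exactly four lines (no wrapped sequence or quality lines) and that no record has an empty sequence. The function name `count_fastq_reads` and the file name in the usage comment are hypothetical.

```python
import gzip


def count_fastq_reads(path: str) -> int:
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        # Ignore blank lines, e.g. a trailing newline at the end of the file.
        n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


# Example usage (hypothetical file name):
# print(count_fastq_reads("mock_forward_reads.fastq.gz"))
```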
May, A., Abeln, S., Buijs, M.J., Heringa, J., Crielaard, W. and Brandt, B.W. (2015).
NGS-eval: NGS Error analysis and novel sequence VAriant detection tooL.
Nucleic Acids Research 43:W301-W305.
May, A., Abeln, S., Crielaard, W., Heringa, J. and Brandt, B.W. (2014).
Unraveling the outcome of 16S rDNA-based taxonomy analysis through mock data and simulations.