Welcome to MEGARes: an Antimicrobial Database for High-Throughput Sequencing
The MEGARes database contains sequence data for approximately 4,000 hand-curated antimicrobial resistance genes, accompanied by an annotation structure optimized for use with high-throughput sequencing. The acyclic annotation graph of MEGARes allows accurate, count-based, hierarchical statistical analysis of resistance at the population level, much like microbiome analysis. It is also designed to serve as a training database for building statistical classifiers (Figure 1).
When should I use MEGARes?
Use MEGARes for population-level profiling or population comparison of antimicrobial resistance (count-based analyses, similar to microbiome analysis). MEGARes can also be used to construct sequence classifiers, e.g. naive Bayes or hidden Markov models. For users who wish to predict protein function and functional mutations in their sequencing data, we recommend a database suited to functional genomics, such as the Comprehensive Antibiotic Resistance Database (CARD).
What distinguishes MEGARes from other databases?
MEGARes is designed for the computational analysis of large-scale sequencing data (on the order of terabytes), with an emphasis on speed and statistical accuracy for count-based analysis and for the construction of sequence classifiers:
- Sequences are annotated in a biologically meaningful way that preserves within-group nucleotide similarity.
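- The annotation graph contains no cycles: each sequence is assigned to a single node at each level of the hierarchy. Counts can therefore be summed up the hierarchy without counting any sequence twice, which keeps the count-based analyses commonly performed in population-level profiling accurate (Figure 1).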
- The annotation graph contains only three hierarchical levels, which maximizes the number of representative sequences for each annotation node. This is designed to work well for the construction of statistical classifiers. The annotation levels are:
  - Class: the major antimicrobial chemical class, e.g. betalactams, aminoglycosides
  - Mechanism: the biological mechanism of resistance, e.g. penicillin binding protein
  - Group: the gene- or operon-level group for that sequence, e.g. SHV betalactamase, MCR-1
- All sequence metadata has been formatted to work well with the majority of bioinformatics software. Sequence headers contain no whitespace or non-compliant symbols.
- All sequences and annotations have been hand-curated using a multi-factorial approach. See the manuscript for more details.
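The three-level hierarchy described above lends itself to simple count aggregation. The following is a minimal sketch, not the AmrPlusPlus implementation. It assumes a hypothetical pipe-delimited header layout of `accession|class|mechanism|group`; the accessions, the `mechanism_*` and `group_*` labels, and the counts are made up for illustration, and the function names `parse_header` and `aggregate_counts` are not part of MEGARes. Check the headers in the release you download before adapting it.

```python
from collections import Counter


def parse_header(header):
    """Return (class, mechanism, group) from a pipe-delimited header.

    ASSUMPTION: the layout accession|class|mechanism|group is hypothetical
    and used here only for illustration.
    """
    header = header.lstrip(">")
    # Headers are expected to contain no whitespace.
    if any(ch.isspace() for ch in header):
        raise ValueError(f"header contains whitespace: {header!r}")
    fields = header.split("|")
    if len(fields) != 4:
        raise ValueError(f"expected 4 pipe-delimited fields: {header!r}")
    _accession, amr_class, mechanism, group = fields
    return amr_class, mechanism, group


def aggregate_counts(read_counts):
    """Sum per-sequence read counts at the class, mechanism and group levels.

    Each sequence contributes to exactly one node per level, so the totals
    at every level add up to the same number of reads.
    """
    levels = {"class": Counter(), "mechanism": Counter(), "group": Counter()}
    for header, n in read_counts.items():
        amr_class, mechanism, group = parse_header(header)
        levels["class"][amr_class] += n
        # Keys include the parent labels so nodes stay distinct per branch.
        levels["mechanism"][(amr_class, mechanism)] += n
        levels["group"][(amr_class, mechanism, group)] += n
    return levels


# Made-up read counts per reference sequence (illustrative only).
example_counts = {
    "acc1|betalactams|mechanism_A|group_A1": 10,
    "acc2|betalactams|mechanism_A|group_A2": 5,
    "acc3|betalactams|mechanism_B|group_B1": 3,
    "acc4|aminoglycosides|mechanism_C|group_C1": 7,
}

levels = aggregate_counts(example_counts)
for level_name, counter in levels.items():
    print(level_name, dict(counter), "total:", sum(counter.values()))
```

Because every sequence maps to one node per level, the totals printed for the three levels are identical; a mismatch would indicate that a sequence had been counted under more than one node.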
Citation for MEGARes and AmrPlusPlus:
Lakin, S.M., Dean, C., Noyes, N.R., Dettenwanger, A., Spencer Ross, A., Doster, E., Rovira, P., Abdo, Z., Jones, K.L., Ruiz, J., Belk, K.E., Morley, P.S., Boucher, C. (2016)
MEGARes: an antimicrobial database for high throughput sequencing. Nucleic Acids Res., 45. DOI: 10.1093/nar/gkw1009