Multi-Locus VNTR Analysis (MLVA)

Multi-Locus VNTR Analysis (MLVA) is a molecular typing method that subtypes microbial isolates based on Variable Number Tandem Repeats (VNTRs), i.e. tandem repeats whose copy number varies between isolates. A VNTR typically exhibits a large range of copy numbers, even among highly related bacterial strains. For a selected set of tandem repeats, copy number analysis reveals relationships at a micro-evolutionary level.

In practice, VNTR loci are selected that are sufficiently discriminatory for the organisms studied and that complement one another, and conserved primers are designed outside the tandem repeat for each VNTR. Thus, the size in base pairs of each PCR amplicon is the size of the tandem repeat plus the offsets at both ends.

VNTR copy number analysis

Knowing the repeat size, the copy number can easily be calculated as

Copy number = (amplicon size - sum of the offsets at both ends) / repeat size
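The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the BioNumerics implementation: the function name, the parameter names and the example values are assumptions, and `offset_bp` stands for the combined length of both flanking regions.

```python
def copy_number(amplicon_size_bp: float, offset_bp: float, repeat_size_bp: float) -> int:
    """Estimate the repeat copy number from a measured amplicon size.

    offset_bp is the combined length of the flanking regions at both ends
    (hypothetical parameter name, chosen for this sketch).
    """
    if repeat_size_bp <= 0:
        raise ValueError("repeat size must be positive")
    # Subtract the flanking sequence, divide by the repeat size,
    # and round to the nearest whole number of repeats.
    return round((amplicon_size_bp - offset_bp) / repeat_size_bp)


# Illustrative values only: a 9 bp repeat with 120 bp of total flanking sequence.
print(copy_number(amplicon_size_bp=210.4, offset_bp=120, repeat_size_bp=9))
```

Rounding absorbs small sizing errors from the instrument; larger systematic deviations need the tolerance and mapping mechanisms discussed further on.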

To reduce cost, several VNTRs are sometimes pooled, i.e. they are labeled with the same dye and loaded as a mixture in the same capillary of a capillary sequencer. A condition is that the pooled VNTR PCR products have size ranges that do not overlap. For example, using 4 dyes and 2 non-overlapping VNTRs per dye, 6 VNTRs can be determined per capillary run, because one of the dyes is reserved for the reference marker set used for size calculation.
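The non-overlap condition for VNTRs sharing a dye can be checked before a pooling scheme is adopted. The sketch below is only an illustration: the VNTR names and size ranges are invented, and the helper `overlapping_pairs` is hypothetical.

```python
from itertools import combinations


def overlapping_pairs(size_ranges):
    """Return pairs of VNTR names whose (min_bp, max_bp) size ranges overlap.

    size_ranges maps a VNTR name to its expected amplicon size range
    for VNTRs that share the same dye in the same pool.
    """
    clashes = []
    for (name_a, (lo_a, hi_a)), (name_b, (lo_b, hi_b)) in combinations(size_ranges.items(), 2):
        # Two closed intervals overlap when each starts before the other ends.
        if lo_a <= hi_b and lo_b <= hi_a:
            clashes.append((name_a, name_b))
    return clashes


# Invented example: three VNTRs labeled with the same dye.
same_dye = {"VNTR_A": (100, 220), "VNTR_B": (250, 400), "VNTR_C": (380, 520)}
print(overlapping_pairs(same_dye))  # [('VNTR_B', 'VNTR_C')]
```

An empty result means the VNTRs can share a dye; any reported pair would have to be moved to a different dye or pool.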

VNTR pooling

Multilocus VNTR analysis in BioNumerics

The BioNumerics software offers a fully automated workflow for multi-locus VNTR analysis, starting from raw capillary sequencer chromatogram files or preprocessed peak tables (Applied Biosystems and Beckman). The MLVA setup must first be entered in the database. This involves entering the pooling strategy: a pool is a mix of VNTR amplification products loaded together in the same capillary. The pool definition includes the dyes used and, optionally, the compatible VNTRs with non-overlapping size ranges. Thus, each VNTR is defined by a pool, a dye and (optionally) a size range. The size range is defined by the repeat length, the offset and the copy range, so the software knows exactly within which size range it should look for a specific VNTR. Note that the copy range is only essential when different VNTRs are pooled with the same dye.
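How a size range follows from the repeat length, offset and copy range can be made concrete with a small data structure. This is a sketch under assumptions: the class `VntrDefinition`, its field names and the example values are hypothetical and do not reflect how BioNumerics stores its settings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VntrDefinition:
    """One VNTR as described in the text: pool, dye and an optional size range."""
    name: str
    pool: str
    dye: str
    repeat_bp: int   # length of one repeat unit
    offset_bp: int   # combined flanking sequence at both ends
    copy_range: Optional[Tuple[int, int]] = None  # (min copies, max copies)

    def size_range(self) -> Optional[Tuple[int, int]]:
        """Expected amplicon size range in bp, or None if no copy range is set."""
        if self.copy_range is None:
            return None
        min_copies, max_copies = self.copy_range
        return (self.offset_bp + min_copies * self.repeat_bp,
                self.offset_bp + max_copies * self.repeat_bp)


# Invented example values.
vntr = VntrDefinition("VNTR_A", pool="P1", dye="FAM",
                      repeat_bp=9, offset_bp=120, copy_range=(1, 15))
print(vntr.size_range())  # (129, 255)
```

Leaving `copy_range` unset mirrors the optional case in the text, where a VNTR is the only one labeled with its dye and no size window is needed.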

VNTR settings

VNTR file name parsing

In case of raw chromatogram files (AB, Beckman), the software can automatically parse pool, dye and strain information from the file names, using a parsing string defined by the user.
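The idea of extracting strain, pool and dye from a file name can be illustrated with a regular expression. This is not the BioNumerics parsing-string syntax, which is not described here; the naming scheme `<strain>_<pool>_<dye>.fsa`, the pattern and the function name are all assumptions made for the sketch.

```python
import re

# Hypothetical naming scheme: <strain>_<pool>_<dye>.fsa
FILE_NAME_PATTERN = re.compile(
    r"^(?P<strain>[^_]+)_(?P<pool>[^_]+)_(?P<dye>[^_.]+)\.fsa$"
)


def parse_run_file_name(file_name: str) -> dict:
    """Split a run file name into strain, pool and dye fields."""
    match = FILE_NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"file name does not follow the expected scheme: {file_name!r}")
    return match.groupdict()


print(parse_run_file_name("ST042_P1_FAM.fsa"))
# {'strain': 'ST042', 'pool': 'P1', 'dye': 'FAM'}
```

File names that do not match raise an error rather than being guessed at, which keeps mislabeled runs out of the analysis.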

In case of GeneMapper or Beckman peak table files, this information is automatically parsed from the tab-delimited peak table (see example below).

Peak table example

Robust and reliable approach, independent of instrument type

VNTR mapping

Due to differences in instruments, dyes, and capillary columns, measured VNTR amplicon sizes often deviate somewhat from the theoretical sizes derived from copy numbers. Therefore, a tolerance can be entered in bp. The tolerance should always be less than RepeatSize/2; otherwise a measured size could match two adjacent copy numbers. In case of small repeat sizes, calculated copy numbers may be systematically different when compared between different instrument types. To solve this problem, BioNumerics offers the possibility to create VNTR maps, i.e. a mapping from observed size to real size per VNTR and per copy number. Such VNTR maps ensure compatibility between different instruments, protocols, dyes, columns etc.
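The difference between a tolerance-based call and a map-based call can be sketched as follows. Both functions, their names and all numeric values are assumptions made for illustration; the map is modeled here simply as a dictionary from copy number to the size typically observed on one instrument.

```python
def call_copy_number(observed_bp, repeat_bp, offset_bp, tolerance_bp):
    """Call a copy number from the theoretical sizes, or return None if out of tolerance."""
    if not 0 <= tolerance_bp < repeat_bp / 2:
        raise ValueError("tolerance must be non-negative and less than repeat size / 2")
    copies = round((observed_bp - offset_bp) / repeat_bp)
    expected_bp = offset_bp + copies * repeat_bp
    if copies < 0 or abs(observed_bp - expected_bp) > tolerance_bp:
        return None  # unresolved: no theoretical size within tolerance
    return copies


def call_with_map(observed_bp, size_map, tolerance_bp):
    """Call a copy number from an instrument-specific map {copy number: observed size}."""
    best = min(size_map, key=lambda copies: abs(size_map[copies] - observed_bp))
    if abs(size_map[best] - observed_bp) > tolerance_bp:
        return None
    return best


# Invented example: 9 bp repeat, 120 bp offset, fragments run about 3.5 bp short.
print(call_copy_number(206.9, repeat_bp=9, offset_bp=120, tolerance_bp=2.0))  # None
instrument_map = {9: 198.0, 10: 206.5, 11: 215.0}
print(call_with_map(206.9, instrument_map, tolerance_bp=2.0))  # 10
```

In this example the peak is unresolved against the theoretical sizes but resolves cleanly once the systematic shift of the instrument is captured in the map.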

Fully automated copy number analysis

Once the settings for VNTRs and parsing have been entered, the software can automatically process thousands of MLVA runs, creating reports that list unresolved VNTRs, multiple peaks found, and any other problems. Reports can display the deviation from the expected value on a green-to-red scale (below, upper image), as left/right deviation in blue and red, respectively (center image), or as errors and warnings only (bottom image).

VNTR report

Distribution plots for individual VNTRs are also very useful to assess the deviation from theoretical copy numbers and to define or refine VNTR mappings.

VNTR histogram

A myriad of analysis tools

The resulting VNTR information is stored in integer-type character sets in which each VNTR represents one character. VNTR data can be analyzed as categorical characters (each different copy number is a different allele) or as quantitative characters. In the latter case, the larger the difference between copy numbers, the less related the organisms are considered to be. Population modelling networks can be calculated using the finest and most comprehensive cluster analysis application available today, applying micro-evolutionary criteria as priority rules and indicating the significance support of each branch. The Minimum Spanning Tree algorithm applied to VNTR data in BioNumerics has proven invaluable for epidemiological studies and population genetics of bacterial populations.
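The two ways of treating copy numbers described above lead to different distances between isolates. The sketch below uses a simple mismatch count for the categorical case and a sum of absolute differences for the quantitative case; these are common choices assumed here for illustration, not necessarily the coefficients BioNumerics applies, and the profiles are invented.

```python
def categorical_distance(profile_a, profile_b):
    """Number of VNTR loci at which the copy numbers differ (each value is an allele)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same VNTR loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def quantitative_distance(profile_a, profile_b):
    """Sum of absolute copy number differences across all VNTR loci."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same VNTR loci")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


# Invented MLVA profiles: copy numbers at four VNTR loci.
isolate_1 = [10, 4, 7, 3]
isolate_2 = [10, 5, 7, 9]
print(categorical_distance(isolate_1, isolate_2))   # 2
print(quantitative_distance(isolate_1, isolate_2))  # 7
```

The categorical view treats the change from 3 to 9 copies the same as the change from 4 to 5, whereas the quantitative view weights it six times more heavily.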

More information can be found in the PDF brochure of the MLVA plugin.
