DiversiLab genotyping

The DiversiLab™ System (bioMérieux) is an automated DNA fingerprinting system based on the rep-PCR electrophoresis technique. The system is particularly useful for bacterial genotyping and is used for source tracking of hospital and community infection, contamination and epidemiological surveillance.

In collaboration with bioMérieux, Applied Maths has developed a plugin for import and analysis of DiversiLab patterns. The web-based DiversiLab software from bioMérieux can export sets of patterns as XML files, which can then be imported into BioNumerics. BioNumerics stores the patterns as fingerprints in the database, which makes it possible to build libraries of thousands of patterns and to analyze DiversiLab patterns in BioNumerics' rich analysis environment. Importantly, DiversiLab patterns can be analyzed in combination with other techniques, yielding a more reliable picture of the genomic relationships between the organisms under study.

Normalization and correction of Diversilab patterns

In the DiversiLab system, traces are automatically normalized between two external marker peaks (one at the top and one at the bottom of each trace). Because of random migration within wells, which is inherent to the Agilent microfluidics chips, a significant shift can still remain between the peak positions of very similar patterns. Software for cluster analysis of DiversiLab patterns therefore needs to apply an additional correction between patterns, based on (subjective) global similarities between the patterns themselves rather than on dedicated reference lanes. Since there is no global standard on which this correction can be based, it has to be carried out on-the-fly for a selected set of patterns and thus cannot be saved to the database.
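As a rough illustration of what normalizing a trace between two marker peaks means, the sketch below linearly resamples a trace so that the two markers land on the first and last points of the output. This is not the DiversiLab or BioNumerics implementation; the function name `normalize_between_markers`, its parameters, and the assumption that the marker positions are already known are all hypothetical.

```python
import numpy as np

def normalize_between_markers(trace, lower_idx, upper_idx, n_out=200):
    """Resample `trace` so the two marker peaks map to the ends of the output.

    Hypothetical helper: `lower_idx` and `upper_idx` are the (assumed known)
    positions of the two marker peaks in the raw trace.
    """
    trace = np.asarray(trace, dtype=float)
    # Evenly spaced sampling positions running from one marker to the other.
    src = np.linspace(lower_idx, upper_idx, n_out)
    # Linear interpolation of the raw trace at those positions.
    return np.interp(src, np.arange(trace.size), trace)
```

A purely linear rescaling of this kind cannot remove lane-specific distortions between the markers, which is why an additional correction between patterns is still needed.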

Why BioNumerics produces the most reliable clusterings

From the above, it is clear that the correction applied to the DiversiLab patterns is of crucial importance for obtaining reliable clusterings. Therefore, the BioNumerics DiversiLab plugin provides smart on-the-fly normalization algorithms for automatically aligning the DiversiLab patterns within a comparison. All settings and parameters can be tuned by the user, and BioNumerics offers various visual comparison tools to evaluate the results of auto-correction. The following algorithms are available:

  1. Non-linear shift with fixed edges: A non-linear shift is performed on the curve to obtain the highest correlation between the trace and its weighted average. The extremes of the curve are fixed as anchor points, so that a global stretch/compression is not possible. Since the extremes of the curves correspond to the marker peaks used for the normalization performed by the DiversiLab software, this alignment can be seen as meaningful. However, it is frequently observed that the distortion of DiversiLab patterns is not maximal in the center of the traces, so that bands towards the edges might still not be well aligned.
  2. Global shift with linear stretch/compression: The curve is aligned to its weighted average by means of a global shift and a linear stretch/compression factor. The extremes of the curves are not used as fixed anchor points. This alignment has two degrees of freedom and is therefore slower than the non-linear shift with fixed edges. It will usually result in a better alignment of all major bands in the patterns. However, because of the greater freedom, some “over-correction” might occur when the settings are not chosen carefully.
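The second algorithm can be pictured with a minimal sketch: a brute-force search over a global shift and a linear stretch factor, keeping the pair that gives the highest Pearson correlation with a reference curve. This is only an illustration under stated assumptions, not the BioNumerics algorithm. The function names `warp` and `align_global`, the grid search, and the use of linear interpolation are all hypothetical choices, and the reference curve (for example an average of the traces in the comparison) is assumed to be supplied by the caller.

```python
import numpy as np

def warp(trace, shift, stretch):
    """Resample `trace` at positions stretch * x + shift (linear interpolation).

    Positions falling outside the trace take the nearest edge value.
    """
    x = np.arange(trace.size, dtype=float)
    return np.interp(stretch * x + shift, x, trace)

def align_global(trace, reference, shifts, stretches):
    """Grid search over the two degrees of freedom (shift, stretch).

    Returns (correlation, shift, stretch) for the combination whose warped
    trace correlates best with `reference`.
    """
    best = (-np.inf, 0.0, 1.0)
    for s in shifts:
        for k in stretches:
            r = np.corrcoef(warp(trace, s, k), reference)[0, 1]
            if r > best[0]:
                best = (r, s, k)
    return best
```

Restricting the ranges passed as `shifts` and `stretches` is one simple way to limit the "over-correction" mentioned above: the wider the grid, the more freedom the search has to align unrelated bands.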

In BioNumerics, the user has full control over the effect of the auto-correction. The figure below shows an overlay image in BioNumerics of grayscale pattern images (before correction) and densitometric curves (after correction).

Before and after correction in BioNumerics

Below is an example of a global shift correction with linear stretch/compression applied on Diversilab patterns in BioNumerics.

DiversiLab profiles corrected in BioNumerics