
The Dimensioning & Statistics module
Dimensioning techniques are all techniques that place the entries in a space of two or more dimensions, rather than imposing a hierarchical, bifurcating structure such as a dendrogram. All dimensioning techniques in BioNumerics are interactive: entries can be selected, added or removed directly on the plot, additional database information can be displayed as colors or labels, groupings can be related directly to discriminatory characters, etc.
Principal Components Analysis (PCA) and Discriminant Analysis
PCA starts directly from a character table to obtain groupings in a multi-dimensional space. Any combination of axes can be displayed in two or three dimensions. The advanced presentation modes of PCA produce three-dimensional graphs in an X-Y-Z coordinate system, which can be rotated in real time to enhance the perception of the spatial structures.
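As an illustration of what "starting from a character table" means, the following is a minimal sketch of PCA in plain Python. It is not BioNumerics code: the function names (`center`, `covariance`, `leading_axis`, `pca`) are hypothetical, and power iteration with deflation is used only to keep the example dependency-free. The returned fractions correspond to the share of total variance carried by each axis.

```python
import math


def center(table):
    """Subtract each character's mean so the cloud is centered on the origin."""
    n = len(table)
    means = [sum(col) / n for col in zip(*table)]
    return [[v - m for v, m in zip(row, means)] for row in table]


def covariance(centered):
    """Sample covariance matrix between characters (columns)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_axis(cov, iterations=500):
    """Power iteration: (eigenvalue, unit eigenvector) of the dominant axis."""
    p = len(cov)
    vec = [1.0 + i for i in range(p)]  # arbitrary, asymmetric starting vector
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # nothing left to explain
            break
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * cov[i][j] * vec[j] for i in range(p) for j in range(p))
    return value, vec


def pca(table, n_axes=2):
    """Return (scores, explained): entry coordinates and variance share per axis."""
    centered = center(table)
    cov = covariance(centered)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    axes, explained = [], []
    for _ in range(n_axes):
        value, vec = leading_axis(cov)
        axes.append(vec)
        explained.append(value / total if total else 0.0)
        # Deflation: remove the axis just found before searching for the next one.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(a * b for a, b in zip(row, axis)) for axis in axes]
              for row in centered]
    return scores, explained
```

For a table whose two characters are perfectly correlated, such as `[[0, 0], [1, 2], [2, 4], [3, 6]]`, the first axis carries essentially all of the variance. For real data sets, a library routine based on singular value decomposition would normally be preferred over power iteration.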
Multi-Dimensional Scaling (MDS)
Rather than starting from the data set, MDS uses the similarity matrix as input, which has the advantage over PCA that it can be applied directly to pairwise-compared banding patterns. The same graphical 3-D viewing tools are available as for PCA.
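The idea of working from a similarity matrix can be sketched as follows. This is an illustrative example only, not the BioNumerics implementation: it assumes similarities scaled to the range 0 to 1, converts them to distances as `1 - similarity` (an assumption), and refines random starting coordinates with the Guttman transform used by the SMACOF algorithm, which is one common iterative scheme. All function names are hypothetical.

```python
import math
import random


def similarity_to_distance(similarity):
    """Assumed conversion: similarity 1.0 -> distance 0.0."""
    return [[1.0 - s for s in row] for row in similarity]


def pairwise(coords):
    """Euclidean distances between all pairs of plotted entries."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def stress(target, coords):
    """Sum of squared differences between plotted and target distances."""
    d = pairwise(coords)
    n = len(coords)
    return sum((d[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def guttman_step(target, coords):
    """One SMACOF update (unit weights); it does not increase the stress."""
    n, dims = len(coords), len(coords[0])
    d = pairwise(coords)
    updated = []
    for i in range(n):
        point = [0.0] * dims
        for j in range(n):
            if i == j or d[i][j] == 0.0:
                continue
            ratio = target[i][j] / d[i][j]
            for k in range(dims):
                point[k] += ratio * (coords[i][k] - coords[j][k])
        updated.append([v / n for v in point])
    return updated


def mds(similarity, dims=2, iterations=200, seed=0):
    """Return (coords, history) where history holds the stress after each step."""
    target = similarity_to_distance(similarity)
    rng = random.Random(seed)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dims)] for _ in target]
    history = [stress(target, coords)]
    for _ in range(iterations):
        coords = guttman_step(target, coords)
        history.append(stress(target, coords))
    return coords, history
```

Passing `dims=3` yields coordinates for a three-dimensional plot. The stress history makes it easy to check that the optimization has levelled off.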
Self-Organizing Maps (SOM)
A SOM is a type of neural network that can place many thousands of entries in a two-dimensional representation, a map, according to overall relatedness. For complex data sets with large numbers of entries, SOM analysis can be the preferred grouping tool. A useful property of a SOM is that unknown entries can be placed in an existing map with very little computing time, which offers a quick and easy-to-interpret classification tool. BioNumerics was the first software to apply this technique to biological data.
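The reason why placing an unknown entry is so cheap can be seen in a small sketch: once the map is trained, classification is a single nearest-node search. The code below is a generic Kohonen map in plain Python, not the BioNumerics implementation. The function names, the learning-rate schedule and the Gaussian neighborhood are all assumptions chosen for brevity, and characters are assumed to be scaled to the range 0 to 1.

```python
import math
import random


def best_matching_unit(weights, entry):
    """Return the (row, col) of the map node closest to the entry."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, node in enumerate(row):
            dist = math.dist(node, entry)
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(entries, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map; each node holds one weight per character."""
    rng = random.Random(seed)
    dims = len(entries[0])
    weights = [[[rng.random() for _ in range(dims)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(entries)
    start_radius = max(rows, cols) / 2.0
    step = 0
    for _ in range(epochs):
        for entry in rng.sample(entries, len(entries)):  # shuffled pass
            progress = step / total_steps
            rate = 0.5 * (1.0 - progress)                # decaying learning rate
            radius = max(start_radius * (1.0 - progress), 0.5)
            br, bc = best_matching_unit(weights, entry)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    influence = math.exp(-grid_dist2 / (2.0 * radius * radius))
                    node = weights[r][c]
                    for k in range(dims):
                        node[k] += rate * influence * (entry[k] - node[k])
            step += 1
    return weights
```

After training, `best_matching_unit(weights, unknown_entry)` gives the map cell for an unknown entry without retraining, which is the quick classification step described above.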
Partition mapping
An innovative tool that contains various statistical functions to compare and map two different partitionings. For example, two different techniques may each subdivide the entries studied into groups (e.g. serotypes and sequence types), and these groups may correspond to a certain degree. The partition mapping tool analyzes the correspondences between the two partitionings and produces a number of mapping rules that define the significantly pairing groups between the two sets. The tool is very useful for analyzing the congruences and discrepancies between typing or classification techniques. It can be used to define reliable and consistent groups on the basis of multiple classification methods and to predict the groups of one technique from the results of another technique.
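One way to make "significantly pairing groups" concrete is sketched below. The statistics used by BioNumerics are not described here, so this example substitutes a simple, well-known approach: a one-sided hypergeometric test on each cell of the contingency table between the two partitionings. The function names, the rule format and the 0.05 threshold are assumptions for illustration.

```python
from collections import Counter
from math import comb


def hypergeom_tail(shared, size_a, size_b, total):
    """P(overlap >= shared) if the size_a members were drawn at random."""
    upper = min(size_a, size_b)
    hits = sum(comb(size_b, x) * comb(total - size_b, size_a - x)
               for x in range(shared, upper + 1))
    return hits / comb(total, size_a)


def mapping_rules(partition_a, partition_b, alpha=0.05):
    """Return rules 'group in A -> group in B' whose overlap is significant.

    partition_a and partition_b list the group of each entry, in the same
    entry order (for example serotype and sequence type per isolate).
    """
    total = len(partition_a)
    size_a = Counter(partition_a)
    size_b = Counter(partition_b)
    joint = Counter(zip(partition_a, partition_b))
    rules = []
    for (a, b), shared in joint.items():
        p_value = hypergeom_tail(shared, size_a[a], size_b[b], total)
        if p_value < alpha:
            rules.append({
                "from": a,
                "to": b,
                "shared": shared,
                "coverage": shared / size_a[a],  # share of group a mapped to b
                "p_value": p_value,
            })
    return sorted(rules, key=lambda rule: rule["p_value"])
```

With twelve entries split into serotypes `A` and `B` (six each), where all `A` entries and one `B` entry have sequence type `X` and the other five `B` entries have type `Y`, the sketch yields the rules `A -> X` and `B -> Y`, each with a p-value of 7/924. The single `B`/`X` entry shows up as a discrepancy rather than as a rule.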

ANOVA and MANOVA
These very useful statistical methods allow the relation between groups of entries and characters to be discovered, and the significance of such groups to be determined. The groups can be clusters derived from a dendrogram, or any user-defined selections of entries (e.g., by origin, species, serotype …). BioNumerics offers a generalized and well-documented implementation of ANOVA (Analysis of Variance) and MANOVA (Multivariate Analysis of Variance) with comprehensive statistical analysis and validation testing tools. Every plot, table or report is a hyperlink that can be viewed in detail or exported as text and HTML. To facilitate the interpretation of the various assumption tests for non-experts, the result of each test is indicated as a circle that ranges from green (OK) to red (assumption not fulfilled).
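The core quantity behind a one-way ANOVA on a single character can be sketched in a few lines. This is a textbook calculation, not the BioNumerics implementation, and the function name and input layout (a dictionary from group label to character values) are assumptions. The p-value is deliberately left out because it requires the F distribution, which the Python standard library does not provide.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for one character across groups.

    groups maps a group label (e.g. a cluster or serotype) to the list of
    values that the character takes for the entries in that group.
    """
    values = [v for group in groups.values() for v in group]
    n_total, n_groups = len(values), len(groups)
    grand_mean = sum(values) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of entries around their own group mean
    for group in groups.values():
        mean = sum(group) / len(group)
        ss_between += len(group) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in group)

    df_between = n_groups - 1
    df_within = n_total - n_groups
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For example, `one_way_anova({"c1": [1, 2, 3], "c2": [2, 3, 4], "c3": [5, 6, 7]})` gives F = 13 (up to floating-point rounding) with 2 and 6 degrees of freedom. A large F indicates that the character separates the groups well, which is the basis for marking characters as discriminatory.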

Statistical tests and charts
An easy and intuitive tool for performing a number of parametric and non-parametric statistical tests (Chi-square test, T-test, Wilcoxon signed-ranks test, Kruskal-Wallis test, ANOVA, Pearson correlation test, Spearman rank-order test…). For each input data type, the software displays the suitable tests and the available plot types.
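To show how a parametric test and its non-parametric counterpart relate, the sketch below computes the Pearson correlation coefficient and the Spearman rank-order coefficient, the latter being the Pearson coefficient applied to ranks (ties receive their average rank). The function names are hypothetical and only the coefficients are computed; significance reporting is not included.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient (parametric, measures linear relation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank-order coefficient (non-parametric, monotonic relation)."""
    return pearson(average_ranks(x), average_ranks(y))
```

For `x = [10, 20, 30, 40]` and `y = [1, 4, 9, 16]` the relation is monotonic but not linear, so `spearman(x, y)` is 1 while `pearson(x, y)` is slightly below 1. This difference is why the choice of test depends on the type and distribution of the input data.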
Features:
- Principal Component Analysis. Non-hierarchic grouping by PCA. Spatial representation of clouds of entries in user-definable X-Y-Z coordinate systems. Indication of total discrimination of axes. Real-time rotation of coordinate system to enhance perception of 3-D structures. Advanced OpenGL presentation and layout for publication. Delineation of populations using colors and/or codes. Plotting of dendrogram branches on PCA for advanced grouping comparisons and methodological validations.
- Multi-Dimensional Scaling. Non-hierarchic grouping by MDS. Iterative optimization of distances according to similarity matrix. Same presentation features as PCA.
- Self-Organizing Maps. Non-hierarchic grouping by the technique of Self-Organizing Maps (Kohonen maps), a variant of the neural network approach, extremely useful for large and complex data sets. Includes quick classification tool by placing unknown entries in existing SOM.
- Partition Mapping. Analyzes the correspondences between two partitions (classifications) and produces a number of mapping rules that define the significantly pairing groups between the two sets. The partition mapping tool is very useful for analyzing the congruences and discrepancies between typing and classification techniques and for defining reliable and consistent groups on the basis of multiple classification methods.
- ANOVA and MANOVA. Comprehensive exploratory data analysis reporting including group means, histograms, and covariance matrices. Full validity testing of the model including multivariate normality test, univariate normality tests (QQ plots and Kolmogorov-Smirnov tests of response variables, QQ plots and Kolmogorov-Smirnov tests of principal components), and homoscedasticity tests. Possibility to select useful characters directly from the Analysis of Variance report window and mark these characters as active/inactive for analysis and identification purposes. Canonical discriminants analysis for in-depth assessment of the discriminatory importance and behavior with respect to explanatory variables of each character.
- Statistics. A number of parametric and non-parametric statistical tests can be performed in an easy and intuitive environment (Chi-square test, T-test, Wilcoxon signed-ranks test, Kruskal-Wallis test, ANOVA, Pearson correlation test, Spearman rank-order test). Automatic display of available tests for each input data type. Kolmogorov-Smirnov test for normality. Clear significance reporting.