CD-NuSS

Circular Dichroism to Nucleic acids Secondary Structures

Three machine learning algorithms are used to predict the nucleic acid secondary structure type: XGBoost, nnet and kohonen. The training and testing CD spectral datasets, corresponding to 16 different nucleic acid secondary structures, were collected through a literature survey.



After training with 450 CD spectral datasets, validation was carried out with 150 CD spectral datasets and testing with 150 CD spectral datasets using the three machine learning methods. Among these methods, XGBoost shows the highest prediction accuracy, followed by nnet and then kohonen. The prediction accuracy of each method is shown below in different interactive representations.
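
The text does not say how the spectra were assigned to the three groups. As a minimal sketch only, assuming the groups are disjoint (750 spectra in total) and assigned by a seeded random shuffle, the partition could look like this. The function name `split_indices` and the seed are hypothetical and not part of CD-NuSS.

```python
import random


def split_indices(n_total, n_train, n_val, n_test, seed=0):
    """Shuffle indices 0..n_total-1 and cut them into three disjoint groups."""
    if n_train + n_val + n_test != n_total:
        raise ValueError("split sizes must sum to n_total")
    indices = list(range(n_total))
    # A fixed seed keeps the partition reproducible between runs.
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Sizes quoted in the text: 450 training, 150 validation, 150 testing.
train_idx, val_idx, test_idx = split_indices(750, 450, 150, 150)
print(len(train_idx), len(val_idx), len(test_idx))
```

A stratified split, which keeps the proportion of each of the 16 conformations equal across the groups, would be a reasonable alternative when some conformations have only a few spectra.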


Grouped bar representation

The grouped bar chart represents the prediction accuracy. The top bar chart (dark blue) represents the total number of datasets considered for testing, covering all forms of nucleic acids, and the bottom bar chart (light blue) represents the total number of predicted datasets for the same forms.
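
The two values behind each pair of bars can be derived from the true and predicted labels of the test set. The sketch below assumes that "predicted datasets" means correctly predicted ones. The helper `per_class_counts` and the three example labels are hypothetical stand-ins for the 16 conformations.

```python
from collections import Counter


def per_class_counts(true_labels, predicted_labels):
    """Return {label: (number tested, number correctly predicted)}."""
    tested = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    # Counter returns 0 for labels that were never predicted correctly.
    return {label: (tested[label], correct[label]) for label in sorted(tested)}


# Hypothetical labels, for illustration only.
bar_true = ["A-form", "A-form", "B-form", "B-form", "B-form", "Z-form"]
bar_pred = ["A-form", "B-form", "B-form", "B-form", "B-form", "A-form"]

counts = per_class_counts(bar_true, bar_pred)
for label, (n_tested, n_correct) in counts.items():
    print(f"{label}: tested={n_tested}, correctly predicted={n_correct}")
```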

Confusion matrix (16*16) representation

The confusion matrix is the 16*16 output matrix obtained after testing the datasets with the trained model. The rows and columns correspond to the 16 different forms of nucleic acids, and each diagonal element gives the number of successful predictions for that particular nucleic acid conformation. Off-diagonal elements correspond to misclassifications.
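
As an illustration of how such a matrix is built and read, the sketch below uses three hypothetical class labels instead of the 16 conformations. It assumes rows hold the true class and columns the predicted class; the text does not state which orientation CD-NuSS uses.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count (true, predicted) pairs; rows = true class, columns = predicted."""
    index = {label: i for i, label in enumerate(classes)}
    n = len(classes)
    matrix = [[0] * n for _ in range(n)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly predicted."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels, for illustration only.
classes = ["A-form", "B-form", "Z-form"]
cm_true = ["A-form", "A-form", "B-form", "B-form", "Z-form", "Z-form"]
cm_pred = ["A-form", "B-form", "B-form", "B-form", "Z-form", "A-form"]

cm = confusion_matrix(cm_true, cm_pred, classes)
for label, row in zip(classes, cm):
    print(label, row)
print("accuracy:", overall_accuracy(cm))
```

With 16 classes the same functions yield the 16*16 matrix described above, and the sum of the diagonal divided by the number of test spectra gives the overall prediction accuracy.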

Donut chart representation

The interactive donut chart represents the prediction accuracy. The user can choose any of the 16 nucleic acid conformations listed on the left side of the window, and the chosen conformation is then highlighted in the donut chart (on the right side) according to the prediction.

Test Cases