Feature Importance in the Quality of Protein Templates

Authors

  • Muhamed Adilović
  • Altijana Hromić-Jahjefendić

DOI:

https://doi.org/10.21533/pen.v9.i2.786

Abstract

Proteins are a focus of research because of their importance as biological catalysts in various cellular processes and diseases. Since the experimental study of proteins is time-consuming and expensive, in silico prediction and analysis of proteins is common. Template-based prediction is the most reliable approach, so this study aims to analyze how important the primary features of proteins are for their quality score. Statistical analysis shows that protein models with a resolution value below 3 Å or an R value below 0.25 have higher quality scores than their counterparts when each feature is compared individually. Random forest analysis, a machine learning method, also assigns resolution the highest importance, while the other features have lower but still moderate importance scores. The exception is the presence of a ligand in the protein model, which has no effect on global protein quality scores in either the statistical or the machine learning analysis.

Published

2021-04-30

Issue

Vol. 9 No. 2 (2021)

Section

Articles

How to Cite

Adilović, M., & Hromić-Jahjefendić, A. (2021). Feature Importance in the Quality of Protein Templates. Periodicals of Engineering and Natural Sciences, 9(2), 820-829. https://doi.org/10.21533/pen.v9.i2.786