Journal of Biology ›› 2025, Vol. 42 ›› Issue (5): 67-. doi: 10.3969/j.issn.2095-1736.2025.05.067
MENG Xiangbo, LI Cen, YUAN Chengwu, LIU Fufeng, LU Fuping, PENG Chong
Abstract: This study addressed the poor regularity of the secretion efficiency of heterologous proteins guided by signal peptides. Eight datasets were constructed from data on the secretion of heterologous proteins guided by signal peptides from Bacillus subtilis, and models for predicting signal peptide secretion efficiency were developed using the support vector machine (SVM) and Random Forest (RF) algorithms. By combining different datasets, sequence features, and algorithms, a total of 458 classification models and 228 regression models were built. The RF algorithm showed the best classification performance, reaching 83.21% accuracy on the α-amylase dataset. In regression analysis, RF also outperformed other methods on the α-amylase dataset, yielding a model with a coefficient of determination of 0.43. In addition, high- and low-efficiency signal peptides differed in amino acid composition and in GC3 content (the frequency of G and C nucleotides at the third position of codons): well-performing signal peptides tended to have a higher proportion of unfolded amino acids and higher GC3 content. The study thus achieved prediction of signal peptide secretion efficiency and explored the factors that affect it.
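The GC3 definition given in the abstract can be illustrated with a minimal sketch. The function name `gc3_content` and the example sequence are hypothetical and are not taken from the paper; the sketch assumes the input is an in-frame coding sequence whose length is a multiple of three, and it may differ from how the authors computed the feature.

```python
def gc3_content(cds: str) -> float:
    """Return the fraction of codons whose third nucleotide is G or C.

    Assumption: `cds` is an in-frame coding sequence (DNA or RNA)
    whose length is a non-zero multiple of three.
    """
    seq = cds.upper().replace("U", "T")  # treat RNA input like DNA
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")

    # Every third character, starting at index 2, is a codon's third position.
    third_positions = seq[2::3]
    gc_count = sum(1 for base in third_positions if base in "GC")
    return gc_count / len(third_positions)


# Hypothetical six-codon fragment: ATG AAA CAG CAA AAA CGC
example = "ATGAAACAGCAAAAACGC"
print(f"GC3 = {gc3_content(example):.2f}")  # prints "GC3 = 0.50"
```

In this fragment the third positions are G, A, G, A, A and C, so three of the six codons end in G or C. A value like this could serve as one numeric sequence feature alongside amino acid composition.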
Key words: signal peptides, secretion efficiency, support vector machine, Random Forest, Bacillus subtilis
CLC Number: Q936
MENG Xiangbo, LI Cen, YUAN Chengwu, LIU Fufeng, LU Fuping, PENG Chong. Machine learning-based prediction of secretory efficiency of signal peptides in Bacillus subtilis [J]. Journal of Biology, 2025, 42(5): 67-.
URL: http://www.swxzz.com/EN/10.3969/j.issn.2095-1736.2025.05.067
http://www.swxzz.com/EN/Y2025/V42/I5/67