Generating Anticancer Peptides Sequences Using Seq2Seq Modelling and Machine Learning Methods
- Authority: IEEE Access
- Category: Journal Publication
Cancer remains a major health threat with rising incidence and mortality rates. Despite the efficacy of chemotherapy, its lack of selectivity and associated severe side effects highlight the need for new, targeted anticancer therapies. Anticancer peptides (ACPs) have emerged as a promising alternative due to their biocompatibility, broad-spectrum anticancer activity, and unique mechanisms of action. This study presents a novel computational approach to design and identify ACPs using a multi-tier filtration system. Our method begins with peptide sequence generation via a recurrent neural network (RNN) trained on the acp740 dataset. The generated sequences then pass through three filtration tiers: Tier-1 employs three deep learning-based classifiers (ACP-DL, ACP-MHCNN, ACP-LSE) to identify potential ACPs; Tier-2 uses a nearest centroid classifier to filter out statistically less relevant sequences; Tier-3 performs a final filtration using unsupervised nearest neighbor learning based on fused feature encoding schemes (CKSAAP, k-Mer, and BPF). Experimental results demonstrate a significant improvement in identifying viable ACP candidates, with the proposed method showing a 2.21-fold higher hit rate compared to random sequence generation. Further analysis using t-SNE, PCA, and antimicrobial peptide (AMP) prediction tools confirms the robustness and effectiveness of the selected ACPs. In addition, performance comparisons show that the proposed sequence filtering technique surpasses the baseline LSTM and RNN-based sequence generation models by 2.95% and 14.11%, respectively. Complementary reverse analyses further validate the proposed sequence generation framework. The proposed computational approach offers a streamlined and economical alternative to traditional experimental methods, expediting the discovery of new ACPs and enhancing the accuracy of anticancer peptide predictions.
The relevant models, code, and results are also available on the authors' GitHub page (https://github.com/mhdshl/ACP-Seq2Seq).
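To make the Tier-2 idea from the abstract more concrete, the following is a minimal sketch of a nearest centroid filter over fused sequence features. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy peptide strings, the choice of a 1-mer composition, and the CKSAAP gap of 1 are all hypothetical, and the BPF encoding and the Tier-1 and Tier-3 steps are omitted for brevity.

```python
from itertools import product
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def kmer_composition(seq):
    """1-mer (amino acid) frequencies: a 20-dimensional vector."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def cksaap(seq, gap=1):
    """Frequencies of residue pairs separated by `gap` residues (400 dims)."""
    counts = dict.fromkeys(PAIRS, 0)
    n_pairs = len(seq) - gap - 1
    for i in range(n_pairs):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return [counts[p] / max(n_pairs, 1) for p in PAIRS]


def fused_features(seq, gap=1):
    """Concatenate the two encodings (assumed fusion: simple concatenation)."""
    return kmer_composition(seq) + cksaap(seq, gap)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest_centroid_filter(candidates, acp_examples, non_acp_examples):
    """Keep candidates whose features lie closer to the ACP centroid."""
    acp_c = centroid([fused_features(s) for s in acp_examples])
    non_c = centroid([fused_features(s) for s in non_acp_examples])
    kept = []
    for s in candidates:
        f = fused_features(s)
        if dist(f, acp_c) < dist(f, non_c):
            kept.append(s)
    return kept


# Toy strings for illustration only; these are not real ACPs or dataset entries.
acp_like = ["KKLLKKLL", "KLKLKLKL"]
non_acp_like = ["DDEEDDEE", "DEDEDEDE"]
generated = ["KKKLLLKK", "EEDDEEDD"]
selected = nearest_centroid_filter(generated, acp_like, non_acp_like)
print(selected)
```

In practice the class centroids would be computed from labelled training sequences (for example, the acp740 dataset mentioned above), and a library classifier such as scikit-learn's `NearestCentroid` could replace the hand-written distance comparison.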