CONV1D-LSTM-BASED QSAR CLASSIFICATION MODEL FOR BACE1 INHIBITORS: A COMPREHENSIVE APPROACH WITH DESALTING, PAINS FILTERING AND DRUG-LIKENESS ANALYSIS
Trianto Haryo Nugroho
Alhadi Bustamam
The search for Beta-Secretase 1 (BACE1) enzyme inhibitors as a more effective Alzheimer's therapy has become a major research focus in recent years, and in silico methods for identifying new inhibitors with minimal side effects are increasingly important. Ligand-Based Virtual Screening (LBVS) using Quantitative Structure–Activity Relationship (QSAR) methods offers a fast and cost-effective alternative to experimental assays. In this study, we propose a Conv1D-LSTM-based QSAR model as a novel approach for classifying BACE1 enzyme inhibitors: Conv1D layers encode the molecular data, and an LSTM classifies each compound as active or inactive. The model is complemented by a drug-likeness analysis based on Lipinski's Rule of Five, which evaluates the therapeutic potential of candidate molecules. The dataset comprises 711 molecular structures, of which 278 are active and 433 are inactive. In our experiments, the model achieves a classification accuracy of 79.13%, a sensitivity of 73.02%, a specificity of 83.08%, and a Matthews Correlation Coefficient (MCC) of 56.38%.
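For readers less familiar with the four reported metrics, the following is a minimal sketch of how each is conventionally derived from binary confusion-matrix counts, with active compounds taken as the positive class. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration. They are not the confusion matrix behind the figures reported above, which is not given here. Note that MCC conventionally ranges from -1 to 1, so a value quoted as a percentage corresponds to the coefficient multiplied by 100.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    Assumes each class has at least one compound (no zero denominators
    for sensitivity or specificity).
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true-positive rate (recall on actives)
    specificity = tn / (tn + fp)   # true-negative rate (recall on inactives)

    # Matthews Correlation Coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the study's results).
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the dataset is imbalanced (278 active versus 433 inactive compounds), MCC is a useful complement to accuracy: it uses all four cells of the confusion matrix and is not inflated by simply favoring the majority class.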