2020, issue 3, p. 59-73
Received 21.08.2020; Revised 15.09.2020; Accepted 23.10.2020
Published 27.10.2020; First Online 05.11.2020
The New Geometric “State-Action” Space Representation for Q-Learning Algorithm for Protein Structure Folding Problem
V.M. Glushkov Institute of Cybernetics of the NAS of Ukraine, Kyiv
Introduction. Spatial protein structure folding is an important and topical problem in computational biology. From the mathematical model of the task, it follows that finding an optimal protein conformation in a three-dimensional grid is an NP-hard problem. Therefore, reinforcement learning techniques such as Q-learning can be applied to the problem. The article proposes a new geometric “state-action” space representation that differs significantly from all alternative representations used for this problem.
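For orientation, the generic tabular Q-learning rule referred to above can be sketched as follows. This is a minimal illustration of the standard update derived from the Bellman equation, not the article's implementation: the names `q_update` and `choose_action`, the constants `ALPHA`, `GAMMA` and `EPSILON`, and the opaque state and action labels are all assumptions made for the example. The article's contribution, the geometric encoding of states and actions, is not reproduced here.

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters, chosen only for illustration.
ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.2  # exploration probability


def q_update(q, state, action, reward, next_state, next_actions):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    q maps (state, action) pairs to values; unseen pairs default to 0.0.
    An empty next_actions list marks a terminal state (no future value).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, actions, rng=random):
    """Epsilon-greedy choice: explore with probability EPSILON, else exploit."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


q_table = defaultdict(float)
q_update(q_table, "s0", "forward", 1.0, "s1", ["forward", "left"])
```

In a folding setting the reward would typically be tied to the energy of the finished conformation, but how states, actions and rewards are defined is exactly what distinguishes the representations compared in the article.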
The purpose of the article is to analyze existing state and action space representations for the Q-learning algorithm applied to the protein structure folding problem, to reveal their advantages and disadvantages, and to propose a new geometric “state-action” space representation. The existing and proposed approaches are then compared, conclusions are drawn, and possible directions for further research are described.
Results. The proposed algorithm is compared with others on 10 known chains of length 48 first proposed in . For each chain, the Q-learning algorithm with the proposed “state-action” space representation outperformed the same Q-learning algorithm with the existing alternative representations in terms of both the average and the minimal energy of the resulting conformations. Moreover, many existing representations were designed for 2D protein structure prediction. During the experiments, both the existing and the proposed representations were therefore slightly modified or extended to solve the problem in 3D, which is a more computationally demanding task.
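To make the energy comparison concrete, the sketch below evaluates a conformation on a 3D cubic lattice. It assumes the hydrophobic-polar (HP) model commonly used for such lattice benchmarks, in which each pair of H residues that are lattice neighbours but not consecutive in the chain contributes -1. The abstract does not state the energy function, so this choice, the function name `hp_energy`, and the coordinate-list input format are assumptions made for illustration.

```python
def hp_energy(sequence, coords):
    """Return the HP-model energy of a conformation on a cubic lattice.

    sequence: string of 'H' and 'P' characters, one per residue.
    coords:   list of (x, y, z) integer tuples, one per residue, in chain order.
    Chain connectivity is not verified here; only self-avoidance is checked.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    index_at = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y, z) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only in the three positive directions so each pair is counted once.
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            j = index_at.get((x + dx, y + dy, z + dz))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


# A four-residue chain folded into a square: residues 0 and 3 are in contact.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
energy = hp_energy("HPPH", square)
```

Under this assumption, a lower (more negative) value means more hydrophobic contacts, which is why both the average and the minimal energy over repeated runs are natural quality measures.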
Conclusion. The quality of the Q-learning algorithm with the proposed geometric “state-action” space representation has been confirmed experimentally, which indicates that further research is promising. Several possible directions for future research, such as combining the proposed approach with deep learning techniques, are suggested.
Keywords: Spatial protein structure, combinatorial optimization, relative coding, machine learning, Q-learning, Bellman equation, state space, action space, basis in 3D space.
Cite as: Chornozhuk S. The New Geometric “State-Action” Space Representation for Q-Learning Algorithm for Protein Structure Folding Problem. Cybernetics and Computer Technologies. 2020. 3. P. 59–73. (in Ukrainian) https://doi.org/10.34229/2707-451X.20.3.6
1. Dill K.A. Theory for the folding and stability of globular proteins. Biochemistry. 1985. 24 (6). P. 1501–1509. https://doi.org/10.1021/bi00327a032
2. Bazzoli A., Tettamanzi A.G.B. A Memetic Algorithm for Protein Structure Prediction in a 3D-Lattice HP Model. Applications of Evolutionary Computing. 2004. 3005. P. 1–10. https://doi.org/10.1007/978-3-540-24653-4_1
3. Custodio F.L., Barbosa H.J., Dardenne L.E. A multiple minima genetic algorithm for protein structure prediction. Applied Soft Computing, Elsevier. 2014. 15. P. 88–99. https://doi.org/10.1016/j.asoc.2013.10.029
4. Boscovic B., Brest J. Genetic algorithm with advanced mechanisms applied to the protein structure prediction in a hydrophobic-polar model and cubic lattice. Applied Soft Computing. 2016. 45. P. 61–70. https://doi.org/10.1016/j.asoc.2016.04.001
5. Morshedian A., Razmara J., Lotfi S. A novel approach for protein structure prediction based on estimation of distribution algorithm. Soft Computing. 2019. 23. P. 4777–4788. https://doi.org/10.1007/s00500-018-3130-0
6. Nazmul R., Chetty M., Chowdhury A.R. Multimodal Memetic Framework for low-resolution protein structure prediction. Swarm and Evolutionary Computation, Elsevier. 2020. 52. https://doi.org/10.1016/j.swevo.2019.100608
7. , Development and analysis of the parallel ant colony optimization algorithm for solving the protein tertiary structure prediction problem. Information Theories and Applications. 2014. 21 (4). P. 392–397.
8. Chornozhuk S.A. The new simulated annealing algorithm for a protein structure folding problem. Komp’uternaa matematika. 2018. 1. P. 118–124. http://dspace.nbuv.gov.ua/handle/123456789/161856
9. Jafari R., Javidi M.M. Solving the protein folding problem in hydrophobic-polar model using deep reinforcement learning. SN Applied Sciences, Springer. 2020. 2 (259). https://doi.org/10.1007/s42452-020-2012-0
10. Czibula G., Bocicor M., Czibula I. A reinforcement learning model for solving the folding problem. Int J Computational Technology Applied. 2011. 2. P. 171–182.
11. Li Y., Kang H., Ye K., Yin S. FoldingZero: protein folding from scratch in hydrophobic-polar model. Deep reinforcement learning workshop (Oral) of NIPS. 2018. https://arxiv.org/abs/1812.00967
12. Dogan B., Olmez T. A novel state space representation for the solution of 2D-HP protein folding problem using reinforcement learning methods. Applied Soft Computing, Elsevier. 2015. 26. P. 213–223. https://doi.org/10.1016/j.asoc.2014.09.047
13. Hulianytskyi L.F., Chornozhuk S.A. Genetic algorithm with new stochastic greedy crossover operator for protein structure folding problem. Cybernetics and Computer Technologies. 2020. 2. P. 19–29. https://doi.org/10.34229/2707-451X.20.2.3
14. Hulianytskyi L.F., Rudyk V.O. Protein structure prediction problem: formalization using quaternions. Cybernetics and Systems Analysis. 2013. 49 (4). P. 597–602. https://doi.org/10.1007/s10559-013-9546-8
15. Sutton R.S., Barto A.G. Reinforcement Learning: An Introduction. MIT Press, 1998. 9 (5). P. 1054. https://doi.org/10.1109/TNN.1998.712192
16. Yue K., Fiebig K.M., Thomas P.D., Chan H.S., Shakhnovich E.I., Dill K.A. A Test of Lattice Protein Folding Algorithms. Proceedings of the National Academy of Sciences. 1995. 92 (1). P. 325–329. https://doi.org/10.1073/pnas.92.1.325
ISSN 2707-451X (Online)
ISSN 2707-4501 (Print)