2020, issue 1, p. 74-82

Received 08.01.2020; Revised 28.01.2020; Accepted 10.03.2020

Published 31.03.2020; First Online 26.04.2020




MSC 68-04, 68M14, 68T45


Vadim Tulchinsky 1, Serhii Lavreniuk 1, Viacheslav Roganov 1, Petro Tulchinsky 1, Valerii Khalimendik 1 *

1 V.M. Glushkov Institute of Cybernetics, Kyiv, Ukraine

* Corresponding author


Introduction. Work in machine learning (ML) and artificial intelligence (AI) usually emphasizes the quality of classification or the accuracy of parameter estimation. When performance is the focus, it is mainly the performance of the model's training phase. However, as AI applications spread to real-world problems, ensuring high data processing performance with ready models becomes more important. By its nature, this problem is fundamentally different from model training: training involves intensive computation, whereas applying a ready model involves simple computation over large flows of data (files) arriving from the network or file system. In other words, it is a typical parallel processing task with intensive input-output.

Besides, from the application's point of view, the AI module that performs classification, evaluation, or other data processing is a "black box": the cost of developing and training the model, as well as the risk of failure, is too high to approach such tasks in a non-professional manner. Performance optimization therefore primarily involves selecting and balancing system parameters. Cloud systems, with their flexibility, manageability, and easy scaling, are ideal platforms for such tasks.

We consider in more detail the factors that affect performance, using a single but notable pattern recognition example: a subset of the ImageNet image collection [1] classified by the 50-layer deep learning neural network ResNet-50 [2].

The purpose of the paper is to experimentally investigate the factors that influence the performance of applying a ready-to-use neural network model in GPU cloud systems of various architectures.

Results. The overheads associated with microservice and distributed architectures, memory, network, batch size, and synchronous versus asynchronous interaction are estimated. The influence of the system parameters in various combinations is shown to be complex and nonlinear.
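As an illustration of how two of the factors listed above (batch size and the way data loading interacts with inference) could be compared, the following is a minimal sketch and not the authors' experimental setup. The functions `load_item` and `infer_batch` are hypothetical stand-ins for reading one image and for one forward pass of a ready model; their sleep durations are arbitrary assumptions, so the measured numbers say nothing about a real GPU cloud.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_item(i):
    """Hypothetical stand-in for reading one image from network or disk."""
    time.sleep(0.001)  # assumed I/O latency, arbitrary
    return i


def infer_batch(batch):
    """Hypothetical stand-in for one forward pass of a ready model."""
    time.sleep(0.002)  # assumed fixed per-call overhead, arbitrary
    return [x % 1000 for x in batch]  # fake "class labels"


def batches(items, size):
    """Split a list into consecutive chunks of at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_sequential(n_items, batch_size):
    """Load every item one by one, then run inference batch by batch."""
    results = []
    for idx in batches(list(range(n_items)), batch_size):
        batch = [load_item(i) for i in idx]
        results.extend(infer_batch(batch))
    return results


def run_concurrent_load(n_items, batch_size, workers=8):
    """Same pipeline, but the items of each batch are loaded by a thread pool."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for idx in batches(list(range(n_items)), batch_size):
            batch = list(pool.map(load_item, idx))  # map preserves input order
            results.extend(infer_batch(batch))
    return results


def throughput(fn, n_items, batch_size):
    """Return processed items per second for one pipeline variant."""
    start = time.perf_counter()
    out = fn(n_items, batch_size)
    elapsed = time.perf_counter() - start
    return len(out) / elapsed


if __name__ == "__main__":
    for bs in (1, 8, 32):
        seq = throughput(run_sequential, 64, bs)
        con = throughput(run_concurrent_load, 64, bs)
        print(f"batch={bs:3d}  sequential={seq:8.1f}/s  concurrent load={con:8.1f}/s")
```

Sweeping such a harness over several parameters at once, rather than one at a time, is one way to expose interactions between them; the stand-ins would have to be replaced by the real model and data source before any conclusion could be drawn.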


Keywords: machine learning; cloud technologies; GPU; system architecture; performance.


Cite as: Tulchinsky V., Lavreniuk S., Roganov V., Tulchinsky P., Khalimendik V. Factors of Performance for Application of AI Models in GPU Cloud. Cybernetics and Computer Technologies. 2020. 1. 74–82. (in Ukrainian) https://doi.org/10.34229/2707-451X.20.1.8



1. Internet image collection ImageNet. http://www.image-net.org (accessed Jan. 01, 2020).

2. He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition. In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE. 2016. https://arxiv.org/abs/1512.03385

3. Russakovsky O., Deng J., Su H., Krause J., Satheesh S., Ma S., Huang Z., Karpathy A., Khosla A., Bernstein M.S., Berg A.C., Li F. Imagenet large scale visual recognition challenge. Computing Research Repository (CoRR). Ithaca, NY, USA: Cornell University. 2014. https://doi.org/10.1007/s11263-015-0816-y

4. Mikami H., Suganuma H., U-chupala P., Tanaka Y., Kageyama Y. Massively Distributed SGD: ImageNet/ResNet-50 Training in a Flash. Machine Learning. Ithaca, NY, USA: Cornell University. 2019. https://arxiv.org/abs/1811.05233

5. Golovynskyi A., Sergienko I., Tulchinskyi V., Malenko A., Bandura O., Gorenko S., Roganova O., Lavrikova O. Development of SCIT supercomputers family at the Institute of Cybernetics of NAS of Ukraine in 2002 – 2017. Cybernetics and System Analysis. 2017. 53 (4). P. 124–129. https://doi.org/10.1007/s10559-017-9962-2

6. Khalimendik V. Porosity structure prediction from conventional sonic well logs on the base of synthetic samples computed by Prodaivoda-Maslov’s method. In proceedings of 18th International Conference on Geoinformatics – Theoretical and Applied Aspects (Kyiv, May 2019). EAGE. 2019. P. 1–5. https://doi.org/10.3997/2214-4609.201902061

7. Lavreniuk A.M., Lavreniuk S.I. Optimization of model parameters selection for big data from telecommunication company analysis. In proceedings of XIII Intern. conf. “Perspektyvy telekomunikatsiy” (PT-2019). Kyiv, Ukraine: NTUU “Igor Sikorsky Kyiv Polytechnic Institute”. 2019. P. 230–232. (in Ukrainian) http://conferenc.its.kpi.ua/2019/paper/view/15736

8. Vdovychenko R.O. Implementation of Sparse Distributed Memory for modern GPU and investigation of features of the model. Kompyuterna matematyka. 2019. 1. P. 77–84. (in Ukrainian) http://dspace.nbuv.gov.ua/handle/123456789/161936



ISSN 2707-451X (Online)

ISSN 2707-4501 (Print)




