PhD Research Seminar: GPU Algorithms for Mathematical Modelling, Variance Reduction for Policy Gradient
First talk: Development and Optimization of GPU Algorithms for Mathematical Modelling of Materials
Speaker: Yea Rem Choi, second-year PhD student, MIEM HSE
Progress in our understanding of nature requires sophisticated methods of mathematical modelling. In recent decades, parallel programming methods have been used to speed up numerical algorithms. Nowadays, CPU computing is gradually being superseded by GPU computing, which requires novel approaches to the development of parallel algorithms. In the framework of materials modelling, a wide spectrum of mathematical methods is considered, one of which is molecular dynamics. The corresponding software has been supplemented with GPU offloading using CUDA or OpenCL, or developed from scratch (e.g., using OpenACC). In order to use one of the latest types of high-performance hardware (nodes with several GPUs connected by high-bandwidth, low-latency communication links), algorithms are required in which the CPU only manages the task and all computation is performed by the GPUs. Such algorithms are the focus of my study.
Specifically, we studied an original GPU-only parallel matrix-matrix multiplication algorithm (C = αA ∗ B + βC) for servers with multiple GPUs connected by NVLink. The algorithm is implemented using CUDA. The data transfer patterns, the overlap of communication and computation, and the overall performance of the algorithm are considered. By regulating the order of command calls and the sizes of tiles, we tune uninterrupted asynchronous data transmission and kernel execution. Two cases are considered: when all the data are stored on one GPU and when the matrices are distributed among several GPUs. The execution efficiency of this new algorithm is compared with cuBLAS-XT from the NVIDIA CUDA Toolkit. Porting of the GPU algorithm to the novel AMD HIP technology is discussed.
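The blocked structure underlying such an algorithm can be illustrated with a minimal single-threaded sketch of the tiled GEMM update C = αAB + βC. The tile size and the serial loop order below are illustrative assumptions for exposition only, not the speaker's actual CUDA implementation, where each tile's transfer and multiplication would run asynchronously on a device:

```python
# Minimal sketch of a blocked (tiled) GEMM update: C = alpha*A*B + beta*C.
# In the GPU algorithm each tile would be streamed to a device and multiplied
# asynchronously; here the tiling logic itself is shown serially.

def tiled_gemm(A, B, C, alpha=1.0, beta=1.0, tile=2):
    n, k, m = len(A), len(B), len(B[0])
    # Scale C by beta first, as in the standard GEMM definition.
    for i in range(n):
        for j in range(m):
            C[i][j] *= beta
    # Accumulate alpha*A*B tile by tile over all three loop dimensions.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        s = 0.0
                        for p in range(p0, min(p0 + tile, k)):
                            s += A[i][p] * B[p][j]
                        C[i][j] += alpha * s
    return C
```

In the multi-GPU setting, the outer tile loops are what gets distributed: each device owns a subset of C-tiles, and tuning the tile size trades off transfer granularity against kernel occupancy.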
Second talk: Variance Reduction for Policy Gradient via Empirical Variance
Speaker: Maxim Kaledin, third-year PhD student, Faculty of Computer Science
One of the main issues of gradient methods in reinforcement learning (RL) is the high variance of the gradient estimator. In RL, the method of control variates is widely used to mitigate this problem and is implemented via baselines: Advantage Actor-Critic (A2C), Q-Prop, Stein's baseline, and many more. Our work is devoted to the construction of control variates based on the empirical variance. In my talk, I will describe the construction of EV methods and their experimental benefits, and present some theoretical results concerning the stability and reliability of the algorithm.
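The variance-reduction effect of a baseline can be demonstrated with a toy score-function (REINFORCE) estimator. The Bernoulli policy, the reward with a large constant offset, and the mean-reward baseline below are illustrative assumptions for this sketch, not the empirical-variance construction presented in the talk:

```python
import random

# Toy demonstration of a control-variate baseline for a score-function
# (REINFORCE) gradient estimator with a Bernoulli(p) policy.
# The reward R(a) = 10 + a has a large constant part, so subtracting a
# baseline b (here the empirical mean reward) greatly reduces the variance
# of (R(a) - b) * score without changing the estimator's expectation.

random.seed(0)
p = 0.3
N = 20000
actions = [1 if random.random() < p else 0 for _ in range(N)]

def grad_estimates(actions, baseline):
    out = []
    for a in actions:
        score = (a - p) / (p * (1 - p))   # d/dp log Bernoulli(a; p)
        reward = 10.0 + a
        out.append((reward - baseline) * score)
    return out

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

plain = grad_estimates(actions, baseline=0.0)
mean_reward = sum(10.0 + a for a in actions) / N
with_baseline = grad_estimates(actions, baseline=mean_reward)

var_plain = variance(plain)
var_baseline = variance(with_baseline)
```

Since the score function has zero mean under the policy, subtracting any action-independent baseline leaves the gradient estimate unbiased while removing the variance contributed by the constant part of the reward; choosing the baseline to minimize the empirical variance of the estimator is the direction the talk develops.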