Publication Information


Title
Japanese: 
English: Improving Performance on Replica-Exchange Molecular Dynamics Simulations by Optimizing GPU Core Utilization 
Authors
Japanese: Taisuke Boku, 杉田 昌岳, Ryohei Kobayashi, Shinnosuke Furuya, 藤江 拓哉, 大上 雅史, 秋山 泰.  
English: Taisuke Boku, Masatake Sugita, Ryohei Kobayashi, Shinnosuke Furuya, Takuya Fujie, Masahito Ohue, Yutaka Akiyama.  
Language: English 
Journal/Book Title
Japanese: 
English: Proceedings of the 53rd International Conference on Parallel Processing (ICPP2024) 
Volume, Issue, Pages: pp. 1082-1091
Publication Date: August 12, 2024 
Publisher
Japanese: 
English: Association for Computing Machinery 
Conference Name
Japanese: 
English: 53rd International Conference on Parallel Processing (ICPP2024) 
Venue
Japanese: 
English: Gotland 
Official Link: https://dl.acm.org/doi/10.1145/3673038.3673097

DOI: https://doi.org/10.1145/3673038.3673097
Abstract: While GPUs are the dominant accelerators in high-performance computing systems, their performance depends on how well the large number of cores on each device is utilized in parallel. Typically, a loop with many iterations is assigned to a device and its iterations are mapped onto the cores, so there must be enough iterations to fill the thousands of cores in a high-end GPU. Advanced GPUs such as the NVIDIA H100 provide techniques such as Multi-Process Service (MPS) and Multi-Instance GPU (MIG), which divide the GPU cores among multiple user processes, to improve core utilization even when the degree of parallelism is small. We apply MPS to a practical Molecular Dynamics (MD) simulation with the AMBER software to improve the efficiency of GPU core utilization and save computational resources. The critical issue is to analyze the core utilization and overhead when running multiple processes on a GPU device, as well as multi-GPU and multi-node parallel execution, for overall performance improvement. In this paper, we introduce a method of applying MPS to AMBER to simulate the membrane permeation process of a drug candidate peptide by a two-dimensional replica-exchange method on an advanced supercomputer with NVIDIA H100 GPUs. We applied several optimizations to the parameter settings on NVIDIA H100 and V100 GPUs and investigated their performance behavior. Finally, we found that GPU core utilization improves by up to a factor of two over a simple process assignment method, maximizing GPU utilization efficiency.
