(405d) Optimization of a Lennard-Jones Particle Monte Carlo GPU Code
2012 AIChE Annual Meeting
Computational Molecular Science and Engineering Forum
Beyond Standard Hardware: GPUs, Cloud Computing and Crowdsourcing
Wednesday, October 31, 2012 - 9:45am to 10:10am
Monte Carlo (MC) simulations of atomic particles are a pleasingly parallel problem [1], making them an ideal candidate for graphics processing units (GPUs), powerful single-instruction-multiple-data (SIMD) computing devices. GPUs offer cheaper parallel processing than CPUs, thanks to their large numbers of compute cores. In many cases, however, evaluating every pair interaction in the system is too costly, even with the parallel processing power of modern GPUs. Simulating, in the Gibbs or grand canonical ensembles, the 100,000+ atom biomolecular systems currently being explored with molecular dynamics [2, 3] is not feasible without significant reductions in the number of non-bonded interaction evaluations.
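To make that cost concrete, the following is a minimal CUDA sketch of the naive all-pairs Lennard-Jones energy evaluation that the abstract argues is too expensive at this scale; the kernel name, reduced units (epsilon = sigma = 1), and the cubic box of side L are illustrative assumptions, not details of the authors' code.

// Naive O(N^2) total energy in reduced Lennard-Jones units:
// U(r) = 4[(1/r)^12 - (1/r)^6], one thread per particle i.
// Minimum-image convention in a cubic box of side L; the factor of
// 0.5 corrects for counting each pair twice. Illustrative sketch only.
__global__ void ljEnergyAllPairs(const float3 *pos, float *energy,
                                 int n, float L)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float u = 0.0f;
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = pos[i].x - pos[j].x;
        float dy = pos[i].y - pos[j].y;
        float dz = pos[i].z - pos[j].z;
        dx -= L * rintf(dx / L);              // minimum image
        dy -= L * rintf(dy / L);
        dz -= L * rintf(dz / L);
        float r2   = dx * dx + dy * dy + dz * dz;
        float inv6 = 1.0f / (r2 * r2 * r2);   // (sigma/r)^6 with sigma = 1
        u += 4.0f * inv6 * (inv6 - 1.0f);
    }
    energy[i] = 0.5f * u;                     // each pair counted twice across threads
}

At N = 100,000, a single full evaluation of this kind touches on the order of 10^10 pair distances, which is what motivates the neighbor-list and kernel-level refinements described next.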
This work details refinements to Monte Carlo simulations performed on the GPU [4] that enable the rapid simulation of 100,000-atom systems on typical desktop workstations. Neighbor lists [5] are adapted to a form suited to the GPU's memory hierarchy [6], which consists of high-speed shared memory and registers plus low-speed global memory. By tracking only nearby molecules, a neighbor list significantly reduces the number of interactions that must be considered, which in turn reduces looping on the CUDA-core-constrained GPU. Further improvements include moving logic currently on the CPU side, such as pseudo-random number generation (PRNG), onto the GPU device, and optimizing the parallel displacement, volume swap, and particle insertion moves, all to minimize the computationally expensive transfer of data between the device and the CPU over the PCI bus. Additional speed is gained by using threads that fall idle during the tree summation of energies to calculate part of the pair interactions implied by the next random draws in the PRNG sequence; each kernel call for a given move thus performs the arithmetic for the current move and part of the next. The benefits of these improvements are highlighted by simulations of large (N > 100,000 particles) systems in the canonical and Gibbs ensembles. Results of very-large-scale simulations near the critical point [7] of a tail-corrected Lennard-Jones fluid are also presented.
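A hedged sketch of two of the ideas above: a fixed-stride neighbor list laid out column-major in global memory so that consecutive threads make coalesced reads, followed by a tree summation of per-thread energies in shared memory. The layout, the stride, and all identifiers are assumptions for illustration, not the authors' actual data structures.

// Fixed-stride neighbor list (column-major) plus a block-level tree
// summation in shared memory. Launch with a power-of-two block size and
// blockDim.x * sizeof(float) bytes of dynamic shared memory.
__global__ void ljEnergyNeighborList(const float3 *pos,
                                     const int *nbrList,   // entry k of particle i at k * n + i
                                     const int *nbrCount,  // neighbors per particle
                                     float *blockEnergy,
                                     int n, float L)
{
    extern __shared__ float sdata[];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float u = 0.0f;
    if (i < n) {
        for (int k = 0; k < nbrCount[i]; ++k) {
            int j = nbrList[k * n + i];       // coalesced read across the warp
            float dx = pos[i].x - pos[j].x;
            float dy = pos[i].y - pos[j].y;
            float dz = pos[i].z - pos[j].z;
            dx -= L * rintf(dx / L);
            dy -= L * rintf(dy / L);
            dz -= L * rintf(dz / L);
            float r2   = dx * dx + dy * dy + dz * dz;
            float inv6 = 1.0f / (r2 * r2 * r2);
            u += 4.0f * inv6 * (inv6 - 1.0f);
        }
    }
    // Tree summation: halve the number of active threads each step.
    sdata[threadIdx.x] = 0.5f * u;            // list stores both (i,j) and (j,i)
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            sdata[threadIdx.x] += sdata[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        blockEnergy[blockIdx.x] = sdata[0];   // host or a second kernel sums the blocks
}

In the scheme described in the abstract, the threads that fall idle in the upper half of each halving step of this reduction would instead be assigned pair-interaction work for the next trial move; that overlap is omitted here for clarity.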
1. Zara, S.J. and D. Nicholson, Grand Canonical Ensemble Monte Carlo Simulation on a Transputer Array. Molecular Simulation, 1990. 5(3-4): p. 245-261.
2. Freddolino, P.L., et al., Molecular Dynamics Simulations of the Complete Satellite Tobacco Mosaic Virus. Structure, 2006. 14(3): p. 437-449.
3. Sanbonmatsu, K.Y., S. Joseph, and C.-S. Tung, Simulating movement of tRNA into the ribosome during decoding. Proceedings of the National Academy of Sciences of the United States of America, 2005. 102(44): p. 15854-15859.
4. Mick, J.R., Potoff, J.J., Hailat, E., Russo, V., Schwiebert, L. GPU Accelerated Monte Carlo Simulations In the Gibbs and Canonical Ensembles. in AIChE Annual Conference. 2011. Minneapolis, MN.
5. Verlet, L., Computer "Experiments" on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules. Physical Review, 1967. 159(1): p. 98-103.
6. Wang, P. Short Range Molecular Dynamics on GPU. in GPU Tech Conf. 2006.
7. Potoff, J.J. and A.Z. Panagiotopoulos, Critical point and phase behavior of the pure fluid and a Lennard-Jones mixture. Journal of Chemical Physics, 1998. 109(24): p. 10914-10920.