@drroe drroe commented Jul 11, 2022

Version 6.11.0.

Adds a GPU version of the radial command using CUDA. Currently only atom-atom RDFs (one or two masks) are supported. The speedup can be significant: about 73x in the benchmark below, which compares the previous OpenMP code (run on an Intel Sandy Bridge, 4 cores) with the new CUDA code (run on a GTX 780):

  CPPTRAJ BENCH (3): Radial solvent-solvent benchmark.
    TOP: ../FtuFabI.tip3p.parm7  CRD: ../FtuFabI.tip3p.nc  NTRAJIN: 1
    TRAJIN_ARGS: 1 10
	/home/droe/Cpptraj/rdfgpu.cpptraj/bin/cpptraj.OMP
	Iter   Time Context_Switches  Waits      FPS
	0     91.12             5982     36    0.110
	1     91.26             6146     36    0.110
	2     91.22             7455     36    0.110
	/home/droe/Cpptraj/rdfgpu.cpptraj/bin/cpptraj.cuda
	Iter   Time Context_Switches  Waits      FPS
	0      1.26                1     92   12.518
	1      1.26               21     89   12.527
	2      1.23               22     85   12.506
  Baseline  :  91.2000
  Benchmark :   1.2500
  Speedup   :  72.96
  %Change   :  98.63 %
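
For reference, the summary lines above are consistent with the usual definitions: speedup as baseline time over benchmark time, and %Change as the relative reduction in run time. The sketch below assumes that is how the benchmark script computes them; `BenchSummary` and `summarize` are hypothetical names, not part of CPPTRAJ.

```cpp
#include <cassert>

// Hypothetical helper types, not part of CPPTRAJ or its benchmark script.
struct BenchSummary {
    double speedup;         // baseline time / benchmark time
    double percent_change;  // reduction in run time, in percent
};

// Assumed definitions; they reproduce the numbers reported above.
BenchSummary summarize(double baseline_s, double benchmark_s) {
    assert(baseline_s > 0.0 && benchmark_s > 0.0);
    BenchSummary s;
    s.speedup = baseline_s / benchmark_s;
    s.percent_change = 100.0 * (baseline_s - benchmark_s) / baseline_s;
    return s;
}
```

Calling `summarize(91.2, 1.25)` gives a speedup of about 72.96 and a change of about 98.63 %, matching the output above.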

The GPU RDF is calculated in single precision, so small numerical differences on the order of 0.0002-0.0004 can be expected in the resulting histograms.
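
When comparing GPU output against a CPU reference, an absolute per-bin tolerance is one simple way to allow for this. The following is a minimal sketch, not code from this PR: the helper names are hypothetical, and the default tolerance of 5e-4 is an assumption chosen to sit just above the range quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest absolute per-bin difference between two RDFs.
double max_abs_diff(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}

// Hypothetical helper: true if every bin agrees to within tol.
// The default tolerance is an assumption, not a value taken from CPPTRAJ.
bool histograms_agree(const std::vector<double>& reference,
                      const std::vector<double>& candidate,
                      double tol = 5e-4) {
    return max_abs_diff(reference, candidate) <= tol;
}
```

With this check, two histograms whose bins differ by 0.0003 are treated as matching, while a difference of 0.01 in any bin is flagged.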

This PR also includes a 2x speedup in the CPU code when only a single mask is specified (e.g. when calculating a water oxygen to water oxygen RDF).
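
The PR text does not say how the single-mask speedup is obtained. One standard way to get roughly 2x in this case is to visit each unordered pair once and count it twice, since the distance from i to j equals the distance from j to i. The sketch below illustrates that idea only: it is an assumption about the technique, it ignores periodic imaging and normalization, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

// Plain Euclidean distance; no periodic imaging in this sketch.
static double dist(const Vec3& a, const Vec3& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Reference version: every ordered pair (i, j) with i != j is visited.
std::vector<long> pair_counts_full(const std::vector<Vec3>& atoms,
                                   double spacing, std::size_t nbins) {
    std::vector<long> hist(nbins, 0);
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        for (std::size_t j = 0; j < atoms.size(); ++j) {
            if (i == j) continue;
            const std::size_t bin =
                static_cast<std::size_t>(dist(atoms[i], atoms[j]) / spacing);
            if (bin < nbins) hist[bin] += 1;
        }
    }
    return hist;
}

// Single-mask version: each unordered pair is visited once and counted
// twice, which halves the number of distance evaluations.
std::vector<long> pair_counts_half(const std::vector<Vec3>& atoms,
                                   double spacing, std::size_t nbins) {
    std::vector<long> hist(nbins, 0);
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        for (std::size_t j = i + 1; j < atoms.size(); ++j) {
            const std::size_t bin =
                static_cast<std::size_t>(dist(atoms[i], atoms[j]) / spacing);
            if (bin < nbins) hist[bin] += 2;
        }
    }
    return hist;
}
```

Both functions return the same raw pair counts for the same input, so the half loop changes only the cost, not the result.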

drroe added 30 commits July 6, 2022 19:32
@drroe drroe self-assigned this Jul 11, 2022
@drroe drroe merged commit ef8117f into Amber-MD:master Jul 11, 2022
@drroe drroe deleted the rdfgpu branch July 11, 2022 19:29