ref url: https://www.pugetsystems.com/labs/hpc/TitanXp-vs-GTX1080Ti-for-Machine-Learning-937/
nbody -benchmark -numbodies=256000
(1) GTX 1070 4137 GFLOP/s
(1) GTX 1080Ti 7514 GFLOP/s
(1) Titan X Pascal 7524 GFLOP/s
(1) Titan Xp 7904 GFLOP/s
--------------------------------------
In my case,
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
gpuDeviceInit() CUDA Device [0]: "Graphics Device
> Compute 6.1 CUDA device: [Graphics Device]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 1611.591 ms
= 406.654 billion interactions per second
= 8133.082 single-precision GFLOP/s at 20 flops per interaction
(The GFLOP/s ranged from 8066.468 to 8133.082)
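
The two derived figures in the output can be rechecked from the raw numbers. A minimal sketch, assuming the benchmark counts all-pairs interactions (n * n per iteration), which is consistent with the values printed above. The function name `nbody_throughput` is my own and is not part of the CUDA sample.

```python
# Recompute the nbody throughput figures from the logged values.
# Assumption: every body interacts with every body, so one iteration
# performs num_bodies * num_bodies interactions.

def nbody_throughput(num_bodies, iterations, total_ms, flops_per_interaction=20):
    """Return (billion interactions per second, GFLOP/s)."""
    seconds = total_ms / 1000.0
    interactions = num_bodies * num_bodies * iterations
    interactions_per_second = interactions / seconds
    gflops = interactions_per_second * flops_per_interaction / 1e9
    return interactions_per_second / 1e9, gflops


if __name__ == "__main__":
    # Values taken from the run above: 256000 bodies, 10 iterations, 1611.591 ms.
    bips, gflops = nbody_throughput(256000, 10, 1611.591)
    print(f"{bips:.3f} billion interactions per second")
    print(f"{gflops:.3f} single-precision GFLOP/s")
```

The results should land within rounding of the logged 406.654 and 8133.082. The last digit can differ because the logged time is itself rounded to three decimals.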
Nvidia driver: 375.26
cuda: 8.0
g/c: Titan Xp
cpu: 8700k
m/b: ga-z370-hd3
ram: dominator 3466MHz (XMP applied)
* GTX 680 showed "1606.946 single-precision GFLOP/s at 20 flops per interaction"