I bought a 64 core workstation and installed NASA fun3d with open mpi 1.6.5. Then I started to test run fun3d using 16, 32, 48 cores. However the performance of the fun3d run is bad. I got data below:
the run command is (it is for 32 core as an example)
mpiexec -np 32 --bysocket --bind-to-socket ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi --time_timestep_loop --animation_freq -1 > screen.dump_bs30
CPUs times iterations time/it
60 678s 30it 22.61s
48 702s 30it 23.40s
32 734s 30it 24.50s
16 894s 30it 29.80s
You can see using 60 cores, to run 30 iteration, FUN3D will complete in 678 seconds, roughly 22.61 second per iteration.
Using 16 cores, to run 30 iteration, FUN3D will complete in 894 seconds, roughly 29.8 seconds per iteration.
the data above shows FUN3D run using mpirun does not scale at all! I used to run fun3d with mpirun on a 8 core WS, and it scales well.
The same job to run on a linux cluster scales well.
Would you all give me some advice to improve the performance loss when I increase the use of more cores, or how to run mpirun with proper options to get a linear scaling when using 16 to 32 to 48 cores?