On 07.10.2013 at 08:45, San B wrote:
> I'm facing a performance issue with a scientific application (Fortran). It runs fast on a single node but very slowly on multiple nodes. For example, a 16-core job on a single node finishes in 1 hr 2 min, but the same job on two nodes (i.e. 8 cores per node, with the remaining 8 cores kept free) takes 3 hr 20 min. The code is compiled with ifort-13.1.1, openmpi-1.4.5 and the Intel MKL libraries: lapack, blas, scalapack, blacs & fftw. What could be the problem here?
How do you pass the list of hosts to the application? Maybe it is actually running on only one machine, and/or it can only make use of the local OpenMP threading inside MKL (yes, OpenMP here, which is bound to a single machine).
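
One way to check this from inside the job script is sketched below. It is only an illustration, assuming that LSF exports the granted slots in the LSB_HOSTS environment variable (one hostname per slot; on large allocations it may not be set) and that the threading is controlled through OMP_NUM_THREADS / MKL_NUM_THREADS. The function name slots_per_host is made up for this example.

```python
import os
from collections import Counter


def slots_per_host(lsb_hosts: str) -> Counter:
    """Count slots per host in an LSB_HOSTS-style string.

    Assumption: the string holds one whitespace-separated hostname per slot.
    """
    return Counter(lsb_hosts.split())


if __name__ == "__main__":
    # Show what the scheduler granted to this job (empty if LSB_HOSTS is unset).
    allocation = slots_per_host(os.environ.get("LSB_HOSTS", ""))
    for host, slots in sorted(allocation.items()):
        print(f"{host}: {slots} slot(s)")

    # Show the thread settings that MKL/OpenMP would pick up.
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        print(f"{var}={os.environ.get(var, '<unset>')}")
```

If this prints two hosts with 8 slots each but all processes still end up on one machine, the host list is probably not reaching mpirun.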
PS: Do you have 16 real cores or 8 plus Hyperthreading?
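
To answer that on a Linux node, a rough sketch like the following compares logical CPUs with physical cores. It assumes an x86 /proc/cpuinfo layout with "processor", "physical id" and "core id" fields; count_cores is a name chosen for this example only.

```python
from typing import Tuple


def count_cores(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (logical_cpus, physical_cores) from /proc/cpuinfo-style text.

    Assumption: entries are separated by blank lines and carry
    'processor', 'physical id' and 'core id' fields (x86 Linux).
    """
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # A physical core is identified by its socket and its core id.
        cores.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(cores)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            logical, physical = count_cores(handle.read())
        print(f"logical CPUs: {logical}, physical cores: {physical}")
    except OSError:
        print("/proc/cpuinfo not available on this system")
```

If the logical count is twice the physical count, Hyper-Threading is active and a "16 core" node may really have only 8 physical cores.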
> Is it possible to do any tuning in OpenMPI? FYI, more info: the cluster has Intel Sandy Bridge processors (E5-2670) and InfiniBand, and Hyper-Threading is enabled. Jobs are submitted through the LSF scheduler.
> Is Hyper-Threading causing any problem here?