Sorry, I should add one more thing about the 500-proc test. I ran two 500-proc jobs at the same time using SGE, and both ran fast and finished in the same time as a single run. So I think Open MPI can handle them separately very well.
As for bind-to-core, I ran mpirun --help but did not find any bind-to-core info. I only see the bynode and byslot options. Are they the same as bind-to-core? My mpirun shows version 1.3.3, but ompi_info shows 1.4.2.
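A version mismatch like this usually means the two commands are being picked up from different installations on PATH. A minimal sketch for checking that, assuming a POSIX shell; the helper names install_dir and same_install are made up for illustration and are not part of Open MPI:

```shell
# Hypothetical helper: print the directory a command resolves to on PATH.
install_dir() {
    path=$(command -v "$1") || { echo "not found: $1"; return 1; }
    dirname "$path"
}

# Hypothetical helper: succeed only if both commands are found
# and live in the same directory.
same_install() {
    a=$(install_dir "$1") || return 1
    b=$(install_dir "$2") || return 1
    [ "$a" = "$b" ]
}

if same_install mpirun ompi_info; then
    echo "mpirun and ompi_info come from the same directory"
else
    echo "mpirun and ompi_info differ (or one is missing); check PATH"
fi
```

If the directories differ, the mpirun that runs is probably not the one ompi_info describes, which would explain a missing option in the help output.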
Thanks a lot.
On Mon, Oct 4, 2010 at 9:18 PM, Eugene Loh <firstname.lastname@example.org> wrote:
Thanks for sending the mpirun run and error message. That helps.
Storm Zhang wrote:
Here is what I meant: the 500-proc results show that with 272-304 (<500) real cores, the program's running time is good, almost five times the 100-proc time. So it can be handled very well. Therefore I guess Open MPI or Rocks OS does make use of hyperthreading to do the job. But with 600 procs, the running time is more than double that of 500 procs. I don't know why. This is my problem.
BTW, how do I use -bind-to-core? I added it to mpirun's options, but it always gives me the error "the executable 'bind-to-core' can't be found". Isn't it used like this:
mpirun --mca btl_tcp_if_include eth0 -np 600 -bind-to-core scatttest
It's not recognizing the --bind-to-core option. (Single hyphen, as you had, should also be okay.) Skimming through the e-mail, it looks like you are using OMPI 1.3.2 and 1.4.2. Did you try --bind-to-core with both? If I remember my version numbers, --bind-to-core will not be recognized with 1.3.2, but should be with 1.4.2. Could it be that you only tried 1.3.2?
Another option is to try "mpirun --help". Make sure that it reports --bind-to-core.
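The help check suggested above could be scripted roughly as follows. This is only a sketch, assuming a POSIX shell; has_mpirun_option is a hypothetical helper name, and it does nothing more than search the text that "mpirun --help" prints, so it tests whichever mpirun comes first on PATH:

```shell
# Hypothetical helper: report whether the mpirun found first on PATH
# mentions the given option in its --help output.
has_mpirun_option() {
    opt="$1"
    if ! command -v mpirun >/dev/null 2>&1; then
        echo "mpirun not found on PATH"
        return 2
    fi
    # "--" stops grep from treating a pattern with a leading dash as a flag.
    if mpirun --help 2>&1 | grep -q -- "$opt"; then
        echo "supported: $opt"
    else
        echo "not supported: $opt"
        return 1
    fi
}

# "|| true" keeps a script running under "set -e" from aborting here.
has_mpirun_option bind-to-core || true
```

If this reports the option as not supported, the mpirun on PATH most likely belongs to the older installation.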