On 3 April 2009 at 03:33, Jerome BENOIT wrote:
| The above submission works the same on my clusters.
| But in fact, my issue involves the interconnection between the nodes of the clusters:
| the above examples involve no connection between nodes.
|
| My cluster is a cluster of quad-core computers:
| if in the sbatch script
|
| #SBATCH --nodes=7
| #SBATCH --ntasks=15
|
| is replaced by
|
| #SBATCH --nodes=1
| #SBATCH --ntasks=4
|
| everything is fine as no interconnection is involved.
|
| Can you test the interconnection part of the story?
Again, think about it in terms of layers. You have a problem with Slurm on top
of Open MPI.
So before blaming Open MPI, I would take Slurm out of the picture and try something like this:
~$ orterun -np 2 -H abc,xyz /tmp/jerome_hw
Hello world! I am 1 of 2 and my name is `abc'
Hello world! I am 0 of 2 and my name is `xyz'
~$
i.e., check whether the simple MPI example can be launched successfully on two nodes or not.
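
If you want to repeat that two-node check over several host pairs, something like the following rough Python sketch might help. It only builds the orterun command line used above and checks the hello-world output for missing ranks or hosts. The helper names (build_command, check_output) are made up for illustration, and the regular expression assumes the output format is exactly the one shown in the transcript above.

```python
import re

# Matches lines such as: Hello world! I am 1 of 2 and my name is `abc'
# (assumption: the hello-world binary prints exactly this format)
HELLO_RE = re.compile(r"Hello world! I am (\d+) of (\d+) and my name is `([^']+)'")


def build_command(hosts, binary="/tmp/jerome_hw"):
    """Return the orterun argument list for one process per listed host."""
    return ["orterun", "-np", str(len(hosts)), "-H", ",".join(hosts), binary]


def check_output(text, expected_hosts):
    """Return a list of problems found in the output (empty list if all is well)."""
    ranks, hosts, size = set(), set(), None
    for match in HELLO_RE.finditer(text):
        ranks.add(int(match.group(1)))
        size = int(match.group(2))
        hosts.add(match.group(3))
    if size is None:
        return ["no hello-world lines found"]
    problems = []
    # Every rank from 0 to size-1 should have reported in.
    if ranks != set(range(size)):
        problems.append("missing ranks: %s" % sorted(set(range(size)) - ranks))
    # Every requested host should have run at least one process.
    missing_hosts = set(expected_hosts) - hosts
    if missing_hosts:
        problems.append("no process on: %s" % sorted(missing_hosts))
    return problems
```

To use it, one could run the list from build_command(["abc", "xyz"]) through subprocess.run with capture_output=True and text=True, then pass the captured stdout to check_output. An empty result would suggest that plain Open MPI can reach both nodes, which points back at the Slurm layer.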
Dirk
--
Three out of two people have difficulties with fractions.