On 3 April 2009 at 03:33, Jerome BENOIT wrote:
| The above submission works the same on my clusters.
| But in fact, my issue involves interconnection between the nodes of the cluster:
| the above examples involve no communication between nodes.
| My cluster is a cluster of quadcore computers:
| if in the sbatch script
| #SBATCH --nodes=7
| #SBATCH --ntasks=15
| is replaced by
| #SBATCH --nodes=1
| #SBATCH --ntasks=4
| everything is fine as no interconnection is involved.
| Can you test the interconnection part of the story?
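For reference, the failing multi-node case presumably corresponds to a job script along these lines. Only the #SBATCH lines are taken from the mail; the launch line and binary path are assumptions:

```shell
#!/bin/sh
# 15 tasks over 7 quad-core nodes: at least two nodes must
# communicate over the interconnect, unlike the single-node case.
#SBATCH --nodes=7
#SBATCH --ntasks=15
mpirun /tmp/jerome_hw   # assumed launch line; not shown in the original mail
```

Switching to --nodes=1 / --ntasks=4 keeps all four ranks on one quad-core node, which is why that case works without exercising the interconnect.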
Again, think about it in terms of layers: you have a problem with Slurm on top
of Open MPI.
So before blaming Open MPI, I would try something like this:
~$ orterun -np 2 -H abc,xyz /tmp/jerome_hw
Hello world! I am 1 of 2 and my name is `abc'
Hello world! I am 0 of 2 and my name is `xyz'
i.e., check whether the simple MPI example can be launched successfully across two nodes at all.
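For completeness, a minimal MPI hello-world that produces output of the shape shown above would look like this (the actual source of jerome_hw is not in the thread, so this is a sketch):

```c
/* Minimal MPI hello-world: each rank prints its rank, the total
   number of ranks, and the name of the node it runs on. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks */
    MPI_Get_processor_name(name, &len);     /* hostname of this node */

    printf("Hello world! I am %d of %d and my name is `%s'\n",
           rank, size, name);

    MPI_Finalize();
    return 0;
}
```

If this runs correctly with orterun across two hosts but fails when launched through sbatch, the problem is in the Slurm layer (or the Slurm/Open MPI integration), not in Open MPI itself.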
Three out of two people have difficulties with fractions.