On 07/02/2011, at 12:36 PM, Michael Curtis wrote:
>
> On 04/02/2011, at 9:35 AM, Samuel K. Gutierrez wrote:
>
> Hi,
>
>> I just tried to reproduce the problem that you are experiencing and was unable to.
>>
>> SLURM 2.1.15
>> Open MPI 1.4.3 configured with: --with-platform=./contrib/platform/lanl/tlcc/debug-nopanasas
>
> I compiled OpenMPI 1.4.3 (vanilla from source tarball) with the same platform file (the only change was to re-enable btl-tcp).
>
> Unfortunately, the result is the same:
To reply to my own post again (sorry!), I tried OpenMPI 1.5.1. This works fine:
salloc -n16 ~/../openmpi/bin/mpirun --display-map mpi
salloc: Granted job allocation 151
======================== JOB MAP ========================
Data for node: ipc3 Num procs: 8
Process OMPI jobid: [3365,1] Process rank: 0
Process OMPI jobid: [3365,1] Process rank: 1
Process OMPI jobid: [3365,1] Process rank: 2
Process OMPI jobid: [3365,1] Process rank: 3
Process OMPI jobid: [3365,1] Process rank: 4
Process OMPI jobid: [3365,1] Process rank: 5
Process OMPI jobid: [3365,1] Process rank: 6
Process OMPI jobid: [3365,1] Process rank: 7
Data for node: ipc4 Num procs: 8
Process OMPI jobid: [3365,1] Process rank: 8
Process OMPI jobid: [3365,1] Process rank: 9
Process OMPI jobid: [3365,1] Process rank: 10
Process OMPI jobid: [3365,1] Process rank: 11
Process OMPI jobid: [3365,1] Process rank: 12
Process OMPI jobid: [3365,1] Process rank: 13
Process OMPI jobid: [3365,1] Process rank: 14
Process OMPI jobid: [3365,1] Process rank: 15
=============================================================
Process 2 on eng-ipc3.{FQDN} out of 16
Process 4 on eng-ipc3.{FQDN} out of 16
Process 5 on eng-ipc3.{FQDN} out of 16
Process 0 on eng-ipc3.{FQDN} out of 16
Process 1 on eng-ipc3.{FQDN} out of 16
Process 6 on eng-ipc3.{FQDN} out of 16
Process 3 on eng-ipc3.{FQDN} out of 16
Process 7 on eng-ipc3.{FQDN} out of 16
Process 8 on eng-ipc4.{FQDN} out of 16
Process 11 on eng-ipc4.{FQDN} out of 16
Process 12 on eng-ipc4.{FQDN} out of 16
Process 14 on eng-ipc4.{FQDN} out of 16
Process 15 on eng-ipc4.{FQDN} out of 16
Process 10 on eng-ipc4.{FQDN} out of 16
Process 9 on eng-ipc4.{FQDN} out of 16
Process 13 on eng-ipc4.{FQDN} out of 16
salloc: Relinquishing job allocation 151
So it does look very much like there is a bug of some sort in 1.4.3.
Michael