Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation fault with SLURM and non-local nodes
From: Samuel K. Gutierrez (samuel_at_[hidden])
Date: 2011-02-03 17:35:59


Hi,

I just tried to reproduce the problem you are experiencing, but was
unable to.

[samuel_at_lo1-fe ~]$ salloc -n32 mpirun --display-map ./mpi_app
salloc: Job is in held state, pending scheduler release
salloc: Pending job allocation 138319
salloc: job 138319 queued and waiting for resources
salloc: job 138319 has been allocated resources
salloc: Granted job allocation 138319

  ======================== JOB MAP ========================

  Data for node: Name: lob083 Num procs: 16
          Process OMPI jobid: [26464,1] Process rank: 0
          Process OMPI jobid: [26464,1] Process rank: 1
          Process OMPI jobid: [26464,1] Process rank: 2
          Process OMPI jobid: [26464,1] Process rank: 3
          Process OMPI jobid: [26464,1] Process rank: 4
          Process OMPI jobid: [26464,1] Process rank: 5
          Process OMPI jobid: [26464,1] Process rank: 6
          Process OMPI jobid: [26464,1] Process rank: 7
          Process OMPI jobid: [26464,1] Process rank: 8
          Process OMPI jobid: [26464,1] Process rank: 9
          Process OMPI jobid: [26464,1] Process rank: 10
          Process OMPI jobid: [26464,1] Process rank: 11
          Process OMPI jobid: [26464,1] Process rank: 12
          Process OMPI jobid: [26464,1] Process rank: 13
          Process OMPI jobid: [26464,1] Process rank: 14
          Process OMPI jobid: [26464,1] Process rank: 15

  Data for node: Name: lob084 Num procs: 16
          Process OMPI jobid: [26464,1] Process rank: 16
          Process OMPI jobid: [26464,1] Process rank: 17
          Process OMPI jobid: [26464,1] Process rank: 18
          Process OMPI jobid: [26464,1] Process rank: 19
          Process OMPI jobid: [26464,1] Process rank: 20
          Process OMPI jobid: [26464,1] Process rank: 21
          Process OMPI jobid: [26464,1] Process rank: 22
          Process OMPI jobid: [26464,1] Process rank: 23
          Process OMPI jobid: [26464,1] Process rank: 24
          Process OMPI jobid: [26464,1] Process rank: 25
          Process OMPI jobid: [26464,1] Process rank: 26
          Process OMPI jobid: [26464,1] Process rank: 27
          Process OMPI jobid: [26464,1] Process rank: 28
          Process OMPI jobid: [26464,1] Process rank: 29
          Process OMPI jobid: [26464,1] Process rank: 30
          Process OMPI jobid: [26464,1] Process rank: 31

SLURM 2.1.15
Open MPI 1.4.3 configured with: --with-platform=./contrib/platform/lanl/tlcc/debug-nopanasas
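
For reference, mpi_app here is just a trivial MPI test program; the
actual source isn't part of this thread, but something along these
lines is all it takes (compiled with mpicc and launched via
salloc/mpirun as shown above):

/* Minimal MPI test: each rank reports itself and the node it runs on.
 * This is a sketch of the kind of program used above, not the exact
 * code from the original test. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    printf("rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}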

I'll dig a bit further.

Sam

On Feb 2, 2011, at 9:53 AM, Samuel K. Gutierrez wrote:

> Hi,
>
> We'll try to reproduce the problem.
>
> Thanks,
>
> --
> Samuel K. Gutierrez
> Los Alamos National Laboratory
>
>
> On Feb 2, 2011, at 2:55 AM, Michael Curtis wrote:
>
>>
>> On 28/01/2011, at 8:16 PM, Michael Curtis wrote:
>>
>>>
>>> On 27/01/2011, at 4:51 PM, Michael Curtis wrote:
>>>
>>> Some more debugging information:
>> Is anyone able to help with this problem? As far as I can tell
>> it's a stock-standard, recently installed SLURM installation.
>>
>> I can try 1.5.1, but I'm hesitant to deploy it since that would
>> require recompiling some rather large pieces of software. Should I
>> re-post to the -devel lists?
>>
>> Regards,
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users