Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Fwd: gadget2 infiniband openmpi hang
From: Yevgeny Kliteynik (kliteyn_at_[hidden])
Date: 2011-05-29 09:23:23


Gretchen,

Could you please send a stack trace of the processes when they hang (using padb or gdb)?
Does the same problem persist at small scale (2 or 3 nodes)?
What is the minimal setup that reproduces the problem?
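
For reference, one way to collect such traces with gdb might look like the sketch below. It assumes gdb is installed on the compute node, that you have permission to attach to the hung ranks, and that you already know their PIDs; the helper names (gdb_backtrace_command, collect_backtrace) are made up for illustration and are not part of Open MPI or padb.

```python
import subprocess
import sys


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints the
    backtrace of every thread, and then exits (batch mode)."""
    return ["gdb", "-batch", "-p", str(pid), "-ex", "thread apply all bt"]


def collect_backtrace(pid):
    """Run gdb against one process and return whatever it printed.

    Assumption: gdb is on PATH and attaching to `pid` is permitted.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # Usage (hypothetical script name): python collect_bt.py <pid> [<pid> ...]
    for arg in sys.argv[1:]:
        print(f"===== PID {arg} =====")
        print(collect_backtrace(int(arg)))
```

Running this on each node against the PIDs of the hung ranks should give one backtrace per thread per process, which is usually enough to see which MPI call each rank is blocked in.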

-- YK

>
> ---------- Forwarded message ----------
> From: Gretchen <umassastrohpcc_at_[hidden]>
> Date: Mon, Mar 28, 2011 at 8:35 PM
> Subject: Re: [OMPI users] gadget2 infiniband openmpi hang
> To: users_at_[hidden]
>
>
> The gadget code hangs at the same spot (i.e., the same number of steps completed and the same section of code) when I run with --mca btl_openib_cpc_include rdmacm
> (the code is doing an MPI_Sendrecv at that point).
> Thanks,
> Gretchen
>
>
> Date: Thu, 17 Mar 2011 12:45:32 -0400
> From: Jeff Squyres <jsquyres_at_[hidden]>
> Subject: Re: [OMPI users] gadget2 infiniband openmpi hang
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <C03801DD-A057-4544-A365-F2483687926C_at_[hidden]>
> Content-Type: text/plain; charset=us-ascii
>
> Are you able to run if you use --mca btl_openib_cpc_include rdmacm?
>