- Next message: Bouguerra mohamed slim: "[OMPI users] Problem with Filem"
- Previous message: Jeff Squyres: "Re: [OMPI users] Bogus memcpy or bogus valgrind record"
- Next in thread: Jeff Squyres: "Re: [OMPI users] users Digest, Vol 1217, Issue 2, Message3"
- Reply: Jeff Squyres: "Re: [OMPI users] users Digest, Vol 1217, Issue 2, Message3"
- Maybe reply: jan: "Re: [OMPI users] users Digest, Vol 1217, Issue 2, Message3"
Thank you, Jeff Squyres. Could you suggest a method for running layer 0
diagnostics to check whether the fabric is clean? I have contacted the local
Dell office (Taiwan), but I don't think they are familiar with Open MPI, or
even with the InfiniBand module. Has anyone else seen the IB stack hang with
Mellanox ConnectX products?
Thank you again.
Best Regards,
Gloria Jan
Wavelink Technology Inc
>> I can confirm that I have exactly the same problem, also on a Dell
>> system, even with the latest Open MPI.
>>
>> Our system is:
>>
>> Dell M905
>> OpenSUSE 11.1
>> kernel: 2.6.27.21-0.1-default
>> ofed-1.4-21.12 from SUSE repositories.
>> OpenMPI-1.3.2
>>
>>
>> I can also add that it does not only affect Open MPI. If messages like
>> this one are triggered after mpirun:
>> [node032][[9340,1],11][btl_openib_component.c:3002:poll_device] error
>> polling HP CQ with -2 errno says Success
>>
>> then the IB stack hangs. You cannot even reload it; you have to reboot the node.
>>
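To see which nodes are affected before they have to be rebooted, the mpirun output can be scanned for the error line quoted above. The following is only a minimal sketch: the regular expression is inferred from that single quoted message, so the field layout may differ in other Open MPI versions, and the names `CQ_ERROR` and `nodes_with_cq_errors` are made up for this example.

```python
import re
from collections import Counter

# Pattern inferred from the one error line quoted above (an assumption, not
# taken from the Open MPI sources). \s+ is used between words so the match
# survives a line wrap in the middle of the message.
CQ_ERROR = re.compile(
    r"\[(?P<node>[^\[\]]+)\]"                          # e.g. [node032]
    r"\[\[\d+,\d+\],(?P<rank>\d+)\]"                   # e.g. [[9340,1],11]
    r"\[btl_openib_component\.c:\d+:poll_device\]\s+"  # source location
    r"error\s+polling\s+(?P<cq>\w+)\s+CQ\s+with\s+(?P<status>-?\d+)"
)


def nodes_with_cq_errors(log_text):
    """Return a Counter mapping node name to the number of CQ polling errors."""
    return Counter(m.group("node") for m in CQ_ERROR.finditer(log_text))


if __name__ == "__main__":
    sample = (
        "[node032][[9340,1],11][btl_openib_component.c:3002:poll_device] error\n"
        "polling HP CQ with -2 errno says Success\n"
    )
    for node, count in nodes_with_cq_errors(sample).most_common():
        print(node, count)
```

Nodes that show up repeatedly are the first candidates for fabric-level checks. The script only reads log text and does not touch the IB stack.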
>
>
> Something that severe should not be able to be caused by Open MPI.
> Specifically: Open MPI should not be able to hang the OFED stack.
> Have you run layer 0 diagnostics to know that your fabric is clean?
> You might want to contact your IB vendor to find out how to do that.
>
> --
> Jeff Squyres
> Cisco Systems