Open MPI User's Mailing List Archives


Subject: [OMPI users] RE : RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks
From: Sébastien Boisvert (sebastien.boisvert.3_at_[hidden])
Date: 2011-09-21 21:39:49


> I would still be suspicious -- ofud is not well tested, and it can definitely hang if there are network drops.

It hung.

> ________________________________________
> From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] on behalf of Jeff Squyres [jsquyres_at_[hidden]]
> Sent: September 21, 2011 17:09
> To: Open MPI Users
> Subject: Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks
>
> On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote:
>
>>> What happens if you run 2 ibv_rc_pingpong's on each node? Or N ibv_rc_pingpongs?
>>
>> With 11 ibv_rc_pingpong's
>>
>> http://pastebin.com/85sPcA47
>>
>> Code to do that => https://gist.github.com/1233173.
>>
>> Latencies are around 20 microseconds.
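
[The gist linked above does the real launching; the pattern it follows can be sketched as a dry-run shell function. Everything here is illustrative: the host name, base port, and the use of `echo` instead of actually backgrounding each `ibv_rc_pingpong` are assumptions, not taken from the thread.]

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would start N concurrent
# ibv_rc_pingpong clients against one server node, one pair per TCP
# sync port. Replace 'echo' with real execution (plus '&' and a
# trailing 'wait') to run the pairs for real; each client needs a
# matching "ibv_rc_pingpong -p <port>" server on the remote node.
launch_pingpongs() {
    n=$1          # number of concurrent pingpong pairs
    host=$2       # server node (hypothetical name below)
    base=18515    # ibv_rc_pingpong's default sync port
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "ibv_rc_pingpong -p $((base + i)) $host"
        i=$((i + 1))
    done
}

launch_pingpongs 11 r104-n58
```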
>
> This seems to imply that the network is to blame for the higher latency...?
>
> I.e., if you run the same pattern with MPI processes and get 20us latency, that would tend to imply that the network itself is not performing well with that IO pattern.
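
[The "same pattern with MPI processes" Jeff suggests would look roughly like the following minimal MPI ping-pong. This is an illustration, not code from the thread: the pairing scheme (rank i with rank i + size/2), message size, and iteration count are all assumptions, and it requires an even number of ranks.]

```c
/* Minimal MPI ping-pong latency sketch: the lower half of the ranks
 * exchanges small messages with the upper half, then rank 0 reports
 * the mean one-way latency. Build with mpicc, run with mpirun. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size % 2 != 0) {               /* pairing assumes an even rank count */
        if (rank == 0)
            fprintf(stderr, "needs an even number of ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    const int iters = 1000;
    char buf[64] = {0};                /* small message, like the pingpong test */
    int peer = (rank < size / 2) ? rank + size / 2 : rank - size / 2;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank < size / 2) {         /* lower half sends first */
            MPI_Send(buf, sizeof buf, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof buf, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {                       /* upper half echoes back */
            MPI_Recv(buf, sizeof buf, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof buf, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)                     /* round trip / 2 = one-way latency */
        printf("avg one-way latency: %.2f us\n",
               (t1 - t0) / iters / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}
```

If this reports latencies near the 20 us seen with ibv_rc_pingpong, the network handles the pattern fine and the problem is elsewhere; if it reports ~250 us, the IO pattern itself is stressing the fabric.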
>
>> My job seems to do well so far with ofud!
>>
>> [sboisver12_at_colosse2 ray]$ qstat
>> job-ID prior name user state submit/start at queue slots ja-task-ID
>> -----------------------------------------------------------------------------------------------------------------
>> 3047460 0.55384 fish-Assem sboisver12 r 09/21/2011 15:02:25 med_at_r104-n58 256
>
> I would still be suspicious -- ofud is not well tested, and it can definitely hang if there are network drops.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>