Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] gadget-3 locks up using openmpi and infiniband (or myrinet)
From: Jaime Perea (jaime.perea_at_[hidden])
Date: 2010-05-17 11:18:39


On Monday, 17 May 2010, Scott Atchley wrote:
> On May 16, 2010, at 1:32 PM, Lydia Heck wrote:
> > When running over gigabit using -mca btl tcp,self,sm the code runs
> > alright, which is good as the largest part of our cluster is over
> > gigabit, and as Gadget-3 scales rather well, the penalty for running over
> > gigabit is not prohibitive. We also have a myrinet cluster and on there
> > larger runs freeze. However as the gigabit cluster was available we have
> > not really investigated this until just now.
>
> Hi Lydia,
>
> I can't help with the IB issue, but I am interested in the issue running
> over MX.
>
> I found a ticket from 2007 regarding Gadget-2. The last set of mails
> indicated that the app was running. You have had a few tickets since, but
> none mentioned Gadget. Can you give me more details about the hang that
> you experienced?
>
> I have a couple of ideas that we could investigate (one in Open-MPI and the
> other in MX).
>
> Scott
>
Hello

To add a bit more noise: I gave up on Gadget2 with openmpi-gm. It always
freezes, sometimes only after days of integration, and I was not able to
find a clear trend. It is now working well with mpich-gm (thanks to the
nice Myricom folks), which means that at least I don't have a hardware
problem.

Regards

-- 
           Jaime D. Perea Duarte. <jaime at iaa dot es>
             Linux registered user #10472
           Dep. Astrofisica Extragalactica.
           Instituto de Astrofisica de Andalucia (CSIC)
           Apdo. 3004, 18080 Granada, Spain.