Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] sge tight integration leads to bad allocation
From: Eloi Gaudry (eloi.gaudry_at_[hidden])
Date: 2012-04-20 09:29:26

Those messages relate to the GM/MX components that are built into OpenMPI.
I configured OpenMPI with the --with-mx=PATH_TO_MX_INSTALL_DIRECTORY option, and the mx btl and mtl were built properly.
It seems that, upon startup, OpenMPI wrongly tries to check for/use the gm module (which is of no use here, as I'm using the mx2g Myrinet libraries) instead of probing only the mx modules.
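
One way to check whether the gm probing plays any role would be to exclude that component explicitly when launching. The following is a minimal sketch in Python, assuming the standard Open MPI convention that a leading `^` in an `--mca btl` value turns the list into an exclusion list. The helper name `build_mpirun_command` and the program path `./my_app` are hypothetical, and whether the exclusion also silences the component_find warning in 1.4.4 has not been verified here.

```python
def build_mpirun_command(program, excluded_btls=("gm",), extra_args=()):
    """Return an mpirun argument list that excludes the given BTL components.

    The helper itself is hypothetical; only the '--mca btl ^name' syntax
    is taken from Open MPI's documented MCA parameter conventions.
    """
    cmd = ["mpirun"]
    if excluded_btls:
        # A leading '^' marks the comma-separated list as components to exclude.
        cmd += ["--mca", "btl", "^" + ",".join(excluded_btls)]
    cmd += list(extra_args)
    cmd.append(program)
    return cmd


if __name__ == "__main__":
    # './my_app' is a placeholder for the real MPI executable.
    print(" ".join(build_mpirun_command("./my_app")))
```

The argument list can be passed to `subprocess.run` from inside the SGE job script. When pasting the printed command into a shell, the `^gm` value may need quoting, depending on the shell.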

I don't think those problems are related in any way to the allocation issue.

-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Reuti
Sent: Friday, 20 April 2012 15:20
To: Open MPI Users
Subject: Re: [OMPI users] sge tight integration leads to bad allocation

On 20.04.2012 at 15:04, Eloi Gaudry wrote:

> Hi Ralph, Reuti,
> I've just observed the same issue without specifying -np.
> Please find attached the ps -elfax output from the computing nodes and some sge related information.

What about these error messages:

component_find: unable to open /opt/openmpi-1.4.4/lib/openmpi/mca_btl_gm: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored) [charlie:23188] mca: base:

-- Reuti

> Regards,
> Eloi
> -----Original message-----
> From: Ralph Castain <rhc_at_[hidden]>
> Sent: Wed 04-11-2012 02:25 pm
> Subject: Re: [OMPI users] sge tight integration leads to bad allocation
> To: Open MPI Users <users_at_[hidden]>;
> On Apr 11, 2012, at 6:20 AM, Reuti wrote:
> > On 11.04.2012 at 04:26, Ralph Castain wrote:
> >
> >> Hi Reuti
> >>
> >> Can you replicate this problem on your machine? Can you try it with 1.5?
> >
> > No. It's also working fine in 1.5.5 in some tests. I even forced an uneven distribution by limiting the slots setting for some machines in the queue configuration.
> Thanks - that confirms what I've been able to test. It sounds like it is something in Eloi's setup, but I can't fathom what it would be - the allocations all look acceptable.
> I'm stumped. :-(
> <job1882.log><><pselfax.carl><pselfax.charlie><qstat-gt><qstat-j1882>
> _______________________________________________
> users mailing list
> users_at_[hidden]
