Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] btl_openib_connect_oob.c:867:rml_recv_cb error after Infini-band stack update.
From: Joshua Ladd (jladd.mlnx_at_[hidden])
Date: 2014-06-20 15:14:46


Aleksandar,

Please ensure your system administrator follows the guidelines outlined at
the link printed in the error message:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
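
For reference, the FAQ entry above concerns the locked-memory limit (`ulimit -l`, i.e. RLIMIT_MEMLOCK), which is a separate setting from the virtual-memory limit (`ulimit -v`). Below is a minimal sketch, assuming a Linux node with Python 3 available, for printing the limit a process actually sees. The helper names are illustrative only and are not part of Open MPI.

```python
import resource


def describe_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def memlock_is_unlimited():
    """True if the soft locked-memory limit is unlimited."""
    soft, _hard = memlock_limits()
    return soft == resource.RLIM_INFINITY


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("locked memory (soft):", describe_limit(soft))
    print("locked memory (hard):", describe_limit(hard))
    if not memlock_is_unlimited():
        # Assumption: a finite limit is the condition the FAQ entry warns about.
        print("soft limit is finite; memory registration may fail")
```

The limits inherited from a scheduler or daemon can differ from those of an interactive login shell, so it is probably most informative to run this inside a job on the compute nodes rather than on the login node.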

Best,

Josh

On Fri, Jun 20, 2014 at 2:56 PM, Ivanov, Aleksandar (INR) <
aleksandar.ivanov_at_[hidden]> wrote:

> Hi,
>
>
>
> Unfortunately I was not the one who updated the machine, but I can ask my
> colleagues for a specific list of the modifications made. If I understand
> you correctly, you are referring to the “ulimit” parameters. They are
> properly set; in fact, we use JMS as the job scheduler, so “ulimit -v” is
> set by the user. In my case I used 31GB per MPI process.
>
> The stack size is set to infinity.
>
> *From:* users [mailto:users-bounces_at_[hidden]] *On Behalf Of *Ralph
> Castain
> *Sent:* Friday, June 20, 2014 8:42 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] btl_openib_connect_oob.c:867:rml_recv_cb
> error after Infini-band stack update.
>
>
>
> What was updated? If the OS, did you remember to set the memory
> registration limits to max?
>
>
>
>
>
> On Jun 20, 2014, at 11:25 AM, Ivanov, Aleksandar (INR) <
> aleksandar.ivanov_at_[hidden]> wrote:
>
>
>
>
>
> Dear Sir or Madam,
>
>
>
> I am using the Open MPI 1.6.5 library compiled with IFORT / ICC 13.1.5.
> Since a recent update of our machine I have been getting MPI errors. The
> code crashes after completing approx. 24% of the total job. The same
> code and input were run before on the same machine and no such problems
> were ever observed. The actual error message is attached.
>
> I presume that the update might have introduced an incompatibility between
> the InfiniBand stack and the Open MPI library. I think that the suggested
> “out of memory problem” should not be causing the malfunction, since the
> application uses only 1GB of the total 32 GB available.
>
>
>
> I would appreciate your help and any ideas on how to resolve this issue.
>
>
>
> Thank you in advance
>
>
>
> Best Regards
>
>
>
> Aleksandar Ivanov
>
> <openmpi.log>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24685.php
>
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24687.php
>