
Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.


Subject: Re: [OMPI users] OpenMPI-1.6.1: Warning - registering physical memory for mpi jobs
From: Reuti (reuti_at_[hidden])
Date: 2012-09-05 05:21:33


Hi,

On 05.09.2012 at 06:42, San B wrote:

> OpenMPI-1.6.1 is installed on a Rocks-5.5 Linux cluster with Intel compilers and OFED-1.5.3. A sample Helloworld MPI program gives the following warning message:
>
>
> /mpi/openmpi/1.6.1/intel/bin/mpirun -np 4 ./mpi
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
> Local host: masternode
> Registerable memory: 4096 MiB
> Total memory: 32151 MiB
> --------------------------------------------------------------------------
> Greetings: 1 of 4 from the node masternode
> Greetings: 2 of 4 from the node masternode
> Greetings: 3 of 4 from the node masternode
> Greetings: 0 of 4 from the node masternode
> [masternode:29820] 3 more processes have sent help message help-mpi-btl-openib.txt / reg mem limit low
> [masternode:29820] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> The ulimit parameters are also set to unlimited:
>
> ]# ulimit -a
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size (blocks, -f) unlimited
> pending signals (-i) 278528
> max locked memory (kbytes, -l) unlimited
> max memory size (kbytes, -m) unlimited
> open files (-n) 1024
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority (-r) 0
> stack size (kbytes, -s) unlimited
> cpu time (seconds, -t) unlimited
> max user processes (-u) 278528
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
>
>
> The file /etc/security/limits.conf contains the following lines:
>
> * soft memlock unlimited
> * hard memlock unlimited
>
> But why is OpenMPI still throwing the warning message about registered memory?

These limits are not honored when a job is started by SGE; the definitions inside SGE are used instead:

`man sge_conf`, paragraph H_MEMORYLOCKED.

execd_params H_MEMORYLOCKED=unlimited
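
To check which limit a job really inherits, a small script like the one below can be run once in a login shell and once as an SGE job. This is only a sketch: the function name `report_memlock` and the output format are made up for illustration, and it assumes a shell whose `ulimit` builtin supports `-l` (bash does). On a standard SGE installation the `execd_params` line itself is normally edited in the global configuration with `qconf -mconf` and can be inspected with `qconf -sconf`.

```shell
#!/bin/sh
# Hypothetical helper: print the locked-memory limits that the current
# process actually inherited, as reported by the shell's ulimit builtin.
report_memlock() {
    soft=$(ulimit -S -l)   # soft limit, or "unlimited"
    hard=$(ulimit -H -l)   # hard limit, or "unlimited"
    echo "host=$(uname -n) memlock_soft=${soft} memlock_hard=${hard}"
}

report_memlock
```

If the value printed inside the job is lower than the one from the login shell, the limit is being imposed by the execution daemon and not by limits.conf.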

-- Reuti

http://arc.liv.ac.uk/pipermail/gridengine-users/2008-July/019722.html

> Thanks in advance