Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI fails to run with -np larger than 10
From: Gutierrez, Samuel K (samuel_at_[hidden])
Date: 2012-04-24 17:20:27


Hi,

I just wanted to record the behind-the-scenes resolution to this particular issue. For more info, take a look at: https://svn.open-mpi.org/trac/ompi/ticket/3076

It seems as if the problem stems from /tmp being mounted as an NFS space that is shared between the compute nodes.

This problem can be resolved in a variety of ways. Below are a few avenues that can help get around the "globally mounted /tmp space" issue, but others are welcome to add to the list.

o Change the place where ORTE stores its session information
-mca orte_tmpdir_base /path/to/some/local/store
For example:
-mca orte_tmpdir_base /dev/shm

**Note: the following options are only available in Open MPI v1.5.5+**

o Change where shmem mmap places its files.
-mca shmem_mmap_relocate_backing_file -1 -mca shmem_mmap_backing_file_base_dir /dev/shm

o Change the backing facility used by the sm mpool and sm BTL to posix or sysv
-mca shmem posix
-mca shmem sysv
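
As a rough illustration of the first option, here is a minimal sketch that checks whether the temporary directory sits on NFS and, if so, prints the orte_tmpdir_base option from the list above. The helper name suggest_mca_flags is hypothetical (it is not part of Open MPI), and the filesystem check assumes GNU stat.

```shell
#!/bin/sh
# Sketch only: suggest_mca_flags is a made-up helper name, not an Open MPI tool.

# Map a filesystem type name to the mpirun option suggested in this thread.
suggest_mca_flags() {
    case "$1" in
        nfs*)
            # Shared NFS mount: keep ORTE session data on node-local storage.
            printf '%s\n' "-mca orte_tmpdir_base /dev/shm"
            ;;
        *)
            # Anything else: no extra option needed for this particular issue.
            printf '%s\n' ""
            ;;
    esac
}

# Assumption: GNU stat, where "-f -c %T" prints the filesystem type name.
tmp_dir="${TMPDIR:-/tmp}"
tmp_fstype=$(stat -f -c %T "$tmp_dir" 2>/dev/null || echo unknown)

echo "filesystem type of $tmp_dir: $tmp_fstype"
echo "suggested mpirun options: $(suggest_mca_flags "$tmp_fstype")"
```

The script only prints a suggestion; it does not launch anything. The printed option would be added to the existing mpirun command line.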

Sam

On Apr 24, 2012, at 12:34 PM, Seyyed Mohtadin Hashemi wrote:

Hi,

I ran those cmd's and have posted the outputs on: https://svn.open-mpi.org/trac/ompi/ticket/3076

-mca shmem posix worked for all -np (even when oversubscribing), however sysv did not work for any -np.

On Tue, Apr 24, 2012 at 5:36 PM, Gutierrez, Samuel K <samuel_at_[hidden]> wrote:
Hi,

Just out of curiosity, what happens when you add

-mca shmem posix

to your mpirun command line using 1.5.5?

Can you also please try:

-mca shmem sysv

I'm shooting in the dark here, but I want to make sure that the failure isn't due to a small backing store.

Thanks,

Sam

On Apr 16, 2012, at 8:57 AM, Gutierrez, Samuel K wrote:

Hi,

Sorry about the lag. I'll take a closer look at this ASAP.

Appreciate your patience,

Sam
________________________________
From: users-bounces_at_[hidden] on behalf of Ralph Castain [rhc_at_[hidden]]
Sent: Monday, April 16, 2012 8:52 AM
To: Seyyed Mohtadin Hashemi
Cc: users_at_[hidden]
Subject: Re: [OMPI users] OpenMPI fails to run with -np larger than 10

No earthly idea. As I said, I'm afraid Sam is pretty much unavailable for the next two weeks, so we probably don't have much hope of fixing it.

I see in your original note that you tried the 1.5.5 beta rc and got the same results, so I assume this must be something in your system config that is causing the issue. I'll file a bug for him (pointing to this thread) so this doesn't get lost, but would suggest you run ^sm for now unless someone else has other suggestions.

On Apr 16, 2012, at 2:57 AM, Seyyed Mohtadin Hashemi wrote:

I recompiled everything from scratch with GCC 4.4.5 and 4.7 using OMPI 1.4.5 tarball.

I did some tests and it does not seem that I can make it work. I tried these:

btl_sm_num_fifos 4
btl_sm_free_list_num 1000
btl_sm_free_list_max 1000000
mpool_sm_min_size 1500000000
mpool_sm_max_size 7500000000

but nothing helped. I started out varying one parameter at a time from its default up to 1000000 (except num_fifos, which I only varied up to 100, and sm_min and sm_max, which I varied from 67 MB [default was set to 67xxxxxx] to 7.5 GB) to see what reactions I could get. When running with -np 10 everything worked, but as soon as I went to -np 11 it crashed with the same old error.
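
For anyone wanting to repeat this kind of test, the following sketch turns a "name value" list like the one above into -mca arguments for mpirun. The helper name mca_args, the hostfile path, and the ./hello_c program are placeholders chosen for illustration; the command is printed rather than run.

```shell
#!/bin/sh
# Sketch only: mca_args is a hypothetical helper, not part of Open MPI.

# Read "name value" lines on stdin and print "-mca name value ..." on one line.
mca_args() {
    args=""
    while read -r name value; do
        if [ -n "$name" ]; then
            # Append with a separating space only when args is non-empty.
            args="${args:+$args }-mca $name $value"
        fi
    done
    printf '%s\n' "$args"
}

# The parameter values tried in this message.
param_list='btl_sm_num_fifos 4
btl_sm_free_list_num 1000
btl_sm_free_list_max 1000000
mpool_sm_min_size 1500000000
mpool_sm_max_size 7500000000'

params=$(printf '%s\n' "$param_list" | mca_args)

# Print the resulting command line; paths and program are placeholders.
printf '%s\n' "mpirun -np 11 --hostfile path/hostfile $params ./hello_c"
```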

On Fri, Apr 13, 2012 at 6:41 PM, Ralph Castain <rhc_at_[hidden]> wrote:

On Apr 13, 2012, at 10:36 AM, Seyyed Mohtadin Hashemi wrote:

That fixed the issue, but it raises a big question about why this happened.

I'm pretty sure it's not a system memory issue; the node with the least RAM has 8 GB, which I would think is more than enough.

Do you think that adjusting btl_sm_eager_limit, mpool_sm_min_size, and mpool_sm_max_size can help fix the problem? (Found this at http://www.open-mpi.org/faq/?category=sm) Because compared to -np 10, the performance of -np 18 is worse when running with the cmd you suggested. I'll try playing around with the parameters and see what works.

Yes, performance will definitely be worse - I was just trying to isolate the problem. I would play a little with those sizes and see what you can do. Our shared memory person is pretty much unavailable for the next two weeks, but the rest of us will at least try to get you working.

We typically do run with more than 10 ppn, so I know the base sm code works at that scale. However, those nodes usually have 32Gbytes of RAM, and the default sm params are scaled accordingly.

On Fri, Apr 13, 2012 at 5:44 PM, Ralph Castain <rhc_at_[hidden]> wrote:
Afraid I have no idea how those packages were built, what release they correspond to, etc. I would suggest sticking with the tarballs.

Your output indicates a problem with shared memory when you completely fill the machine. Could be a couple of things, like running out of memory - but for now, try adding -mca btl ^sm to your cmd line. Should work.
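
For reference, a sketch of what such a command line could look like. The leading caret in ^sm tells Open MPI to use every BTL except sm, so shared memory is skipped. The process count, hostfile path, and program below are placeholders, and the helper name no_sm_cmd is made up for this example; it prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: no_sm_cmd is a hypothetical helper that prints an mpirun command.

# $1 = process count, $2 = hostfile, $3 = program (all placeholders).
# '^sm' is quoted in the printed command so shells that treat ^ specially
# pass it through unchanged.
no_sm_cmd() {
    printf '%s\n' "mpirun -np $1 --hostfile $2 -mca btl '^sm' $3"
}

no_sm_cmd 12 path/hostfile ./hello_c
```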

On Apr 13, 2012, at 5:09 AM, Seyyed Mohtadin Hashemi wrote:

Hi,

Sorry that it took so long to answer; I didn't get any return mails and had to check the digest for replies.

Anyway, when I compiled from scratch I did use the tarballs from open-mpi.org. GROMACS is not the problem (or at least I don't think so); I just used it as a check to see if I could run parallel jobs. I am now using the OSU benchmarks because I can't be sure that the problem is not with GROMACS.

On the new installation I have not installed (or compiled) OMPI from the official tarballs but rather installed the "openmpi-bin, openmpi-common, libopenmpi1.3, openmpi-checkpoint, and libopenmpi-dev" packages using apt-get.

As for the simple examples (i.e. ring_c, hello_c, and connectivity_c extracted from the 1.4.2 official tarball), I get the exact same behavior as with GROMACS/OSU bench.

I suspect you'll have to ask someone familiar with GROMACS about that specific package. As for testing OMPI, can you run the codes in the examples directory - e.g., "hello" and "ring"? I assume you are downloading and installing OMPI from our tarballs?

On Apr 12, 2012, at 7:04 AM, Seyyed Mohtadin Hashemi wrote:

> Hello,
>
> I have a very peculiar problem: I have a micro cluster with three nodes (18 cores total); the nodes are clones of each other, connected to a frontend via Ethernet, and run Debian squeeze as the OS. When I run parallel jobs I can use up to "-np 10"; if I go further the job crashes. I have primarily done tests with GROMACS (because that is what I will be running) but have also used OSU Micro-Benchmarks 3.5.2.
>
> For a simple parallel job I use: "path/mpirun -hostfile path/hostfile -np XX -d -display-map path/mdrun_mpi -s path/topol.tpr -o path/output.trr"
>
> (path is global) For "-np XX" with XX less than or equal to 10 it works; however, as soon as I use 11 or more the whole thing crashes. The terminal dump is attached to this mail: when_working.txt is for "-np 10", when_crash.txt is for "-np 12", and OpenMPI_info.txt is the output from "path/mpirun --bynode --hostfile path/hostfile --tag-output ompi_info -v ompi full -parsable"
>
> I have tried OpenMPI v1.4.2 all the way up to beta v1.5.5, and all yield the same result.
>
> The output files are from a new install I did today: I formatted all nodes and started from a fresh minimal install of Squeeze and used "apt-get install gromacs gromacs-openmpi" and installed all dependencies. Then I ran two jobs using the parameters described above. I also did one with OSU bench (data is not included); it also crashed with "-np" larger than 10.
>
> I hope somebody can help figure out what is wrong and how I can fix it.
>
> Best regards,
> Mohtadin
>
> <Archive.zip>

--
De venligste hilsner/I am, yours most sincerely
Seyyed Mohtadin Hashemi