
Subject: Re: [OMPI users] Heterogeneous SLURM cluster segfaults on large transfers
From: James (jamesgao_at_[hidden])
Date: 2009-09-08 15:17:11


Hi,
Sorry it took so long to respond - recompiling everything across the cluster
took a while. Without the --with-threads configure flag, it works a little
better: the same segfault still occurs, but the limit has moved up to around
21,000,000 characters instead of 16,000,000.

Any ideas?

-James

On Wed, Sep 2, 2009 at 12:55 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> Can you try without the --with-threads configure argument?
>
>
> On Aug 28, 2009, at 11:48 PM, James Gao wrote:
>
> Hi everyone, I've been having a pretty odd issue with Slurm and
>> openmpi the last few days. I just set up a heterogeneous cluster with
>> Slurm consisting of P4 32 bit machines and a few new i7 64 bit
>> machines, all running the latest version of Ubuntu linux. I compiled
>> the latest OpenMPI 1.3.3 with the flags
>>
>> ./configure --enable-heterogeneous --with-threads --with-slurm
>> --with-memory-manager --with-openib --without-udapl
>> --disable-openib-ibcm
>>
>> I also made a trivial test program:
>> #include "mpi.h"
>> #include <stdio.h>
>> #include <stdlib.h>
>>
>> #define LEN 12000000
>>
>> int main (int argc, char *argv[]) {
>>     int size, rank, i, len = LEN;
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>
>>     if (argc > 1) len = atoi(argv[1]);
>>     printf("Size: %d, ", len);
>>     char *greeting = malloc(sizeof(char)*len);
>>
>>     if (rank == 0) {
>>         for (i = 0; i < len-1; i++)
>>             greeting[i] = ' ';
>>         greeting[len-1] = '\0';
>>     }
>>     MPI_Bcast(greeting, len, MPI_BYTE, 0, MPI_COMM_WORLD);
>>     printf("rank: %d\n", rank);
>>
>>     MPI_Finalize();
>>     free(greeting);
>>     return 0;
>> }
>>
>> I run this with salloc -n 28 mpirun -n 28 mpitest on my slurm cluster.
>> At 12,000,000 characters, this command works exactly as expected, no
>> issues at all. However, beyond a certain critical limit somewhere
>> around 16,000,000 characters, the program will consistently segfault
>> with this error message:
>>
>> salloc -n 28 -p all mpiexec -n 28 mpitest 16500000
>> salloc: Granted job allocation 234
>> [ibogaine:24883] *** Process received signal ***
>> [ibogaine:24883] Signal: Segmentation fault (11)
>> [ibogaine:24883] Signal code: Address not mapped (1)
>> [ibogaine:24883] Failing at address: 0x101a60f58
>> [ibogaine:24883] [ 0] /lib/libpthread.so.0 [0x7f6c00405080]
>> [ibogaine:24883] [ 1] /usr/local/lib/openmpi/mca_pml_ob1.so
>> [0x7f6bfd9dff68]
>> [ibogaine:24883] [ 2] /usr/local/lib/openmpi/mca_btl_tcp.so
>> [0x7f6bfcf3ec7c]
>> [ibogaine:24883] [ 3] /usr/local/lib/libopen-pal.so.0 [0x7f6c00ed5ee8]
>> [ibogaine:24883] [ 4]
>> /usr/local/lib/libopen-pal.so.0(opal_progress+0xa1) [0x7f6c00eca7b1]
>> [ibogaine:24883] [ 5] /usr/local/lib/libmpi.so.0 [0x7f6c013a185d]
>> [ibogaine:24883] [ 6] /usr/local/lib/openmpi/mca_coll_tuned.so
>> [0x7f6bfc10c29c]
>> [ibogaine:24883] [ 7] /usr/local/lib/openmpi/mca_coll_tuned.so
>> [0x7f6bfc10c9eb]
>> [ibogaine:24883] [ 8] /usr/local/lib/openmpi/mca_coll_tuned.so
>> [0x7f6bfc10295c]
>> [ibogaine:24883] [ 9] /usr/local/lib/openmpi/mca_coll_sync.so
>> [0x7f6bfc31b35a]
>> [ibogaine:24883] [10] /usr/local/lib/libmpi.so.0(MPI_Bcast+0xa3)
>> [0x7f6c013b78c3]
>> [ibogaine:24883] [11] mpitest(main+0xd4) [0x400bc0]
>> [ibogaine:24883] [12] /lib/libc.so.6(__libc_start_main+0xe6)
>> [0x7f6c000a25a6]
>> [ibogaine:24883] [13] mpitest [0x400a29]
>> [ibogaine:24883] *** End of error message ***
>>
>> As far as I can tell, the segfault occurs on the root node doing the
>> broadcast. The error only occurs when I try to send across
>> heterogeneous sections of the cluster. If I communicate only between
>> homogeneous subsets, I can go as high as 120,000,000 characters
>> without issue. However, a hard "limit" seems to sit somewhere just
>> under 16,000,000 characters across the heterogeneous cluster. Any
>> ideas?
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>