Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problem with -npernode
From: David Turner (dpturner_at_[hidden])
Date: 2010-06-18 12:57:17


Hi,

On 06/17/2010 03:34 PM, Ralph Castain wrote:
> No more info required - it's a bug. Fixed and awaiting release of 1.4.3.

I downloaded openmpi-1.4.3a1r23261.tar.gz, dated June 9. It shows the
same -npernode failure as 1.4.2. Is there a newer version available for
testing?
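
When comparing runs across versions, it can help to check the output
mechanically rather than by eye. The following is only a minimal sketch in
Python, not part of the original test: the function name check_ranks is
hypothetical, and it assumes the program prints lines of the form
"myrank, icount = <rank> <count>" as in the output quoted below.

```python
import re

# Matches lines such as "myrank, icount = 5 16" (a leading ">> " is ignored).
LINE_RE = re.compile(r"myrank, icount =\s*(\d+)\s+(\d+)")


def check_ranks(output):
    """Return (ok, missing, duplicated) for the ranks found in mpirun output.

    Hypothetical helper: assumes every process prints its rank and the
    total process count on one line.
    """
    ranks = []
    sizes = set()
    for line in output.splitlines():
        m = LINE_RE.search(line)
        if m:
            ranks.append(int(m.group(1)))
            sizes.add(int(m.group(2)))
    if len(sizes) != 1:
        # No rank lines at all (e.g. the run crashed) or inconsistent counts.
        return False, [], []
    size = sizes.pop()
    missing = sorted(set(range(size)) - set(ranks))
    duplicated = sorted({r for r in ranks if ranks.count(r) > 1})
    return (not missing and not duplicated), missing, duplicated
```

A run counts as good only if every rank from 0 to count-1 appears exactly
once; a crashed run that prints no rank lines is reported as a failure.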

> On Jun 17, 2010, at 3:50 PM, David Turner wrote:
>
>> Hi,
>>
>> Recently, Christopher Maestas reported a problem with -npernode in
>> Open MPI 1.4.2 ("running a ompi 1.4.2 job with -np versus -npernode").
>> I have also encountered this problem, with a simple "hello, world"
>> program:
>>
>> % mpirun -np 16 ./a.out
>> myrank, icount = 0 16
>> myrank, icount = 2 16
>> myrank, icount = 5 16
>> myrank, icount = 7 16
>> myrank, icount = 1 16
>> myrank, icount = 4 16
>> myrank, icount = 6 16
>> myrank, icount = 3 16
>> myrank, icount = 8 16
>> myrank, icount = 9 16
>> myrank, icount = 10 16
>> myrank, icount = 12 16
>> myrank, icount = 13 16
>> myrank, icount = 15 16
>> myrank, icount = 11 16
>> myrank, icount = 14 16
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>>
>> % mpirun -np 16 -npernode 8 ./a.out
>> [c1146:15313] *** Process received signal ***
>> [c1146:15313] Signal: Segmentation fault (11)
>> [c1146:15313] Signal code: Address not mapped (1)
>> [c1146:15313] Failing at address: 0x50
>> [c1146:15313] *** End of error message ***
>> Segmentation fault
>> [c1138:26571] [[62315,0],1] routed:binomial: Connection to lifeline [[62315,0],0] lost
>> % module swap openmpi openmpi/1.4.1
>> % mpirun -np 16 -npernode 8 ./a.out
>> myrank, icount = 8 16
>> myrank, icount = 13 16
>> myrank, icount = 10 16
>> myrank, icount = 11 16
>> myrank, icount = 15 16
>> myrank, icount = 14 16
>> myrank, icount = 12 16
>> myrank, icount = 5 16
>> myrank, icount = 2 16
>> myrank, icount = 3 16
>> myrank, icount = 1 16
>> myrank, icount = 0 16
>> myrank, icount = 9 16
>> myrank, icount = 6 16
>> myrank, icount = 7 16
>> myrank, icount = 4 16
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>>
>> Compilers are PGI/10.5, OS is Scientific Linux 5.4, resource manager is
>> Torque 2.4.5. Please let me know if you need more information. Thanks!
>>
>> --
>> Best regards,
>>
>> David Turner
>> User Services Group        email: dpturner_at_[hidden]
>> NERSC Division             phone: (510) 486-4027
>> Lawrence Berkeley Lab        fax: (510) 486-4316
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Best regards,
David Turner
User Services Group        email: dpturner_at_[hidden]
NERSC Division             phone: (510) 486-4027
Lawrence Berkeley Lab        fax: (510) 486-4316