Open MPI User's Mailing List Archives

From: Ralph H Castain (rhc_at_[hidden])
Date: 2006-10-18 13:16:53


Hi Lydia

Could you confirm the version you are using? I think the version string in
your message contains a typo.

Also, could you tell us how you configured the code (the configure command
line would be nice).
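
In case it helps, here is a rough sketch of one way to collect both. It
assumes ompi_info from your installation is on your PATH and that the build
directory (with its config.log) is still around. CONFIG_LOG is only a
placeholder variable for this sketch, not an Open MPI setting.

```shell
# Version: ompi_info prints an "Open MPI:" line.
# Assumption: ompi_info is on the PATH; if not, this step is skipped.
command -v ompi_info >/dev/null 2>&1 && ompi_info | grep 'Open MPI:'

# Configure line: autoconf records the invocation near the top of config.log.
# CONFIG_LOG is a placeholder; point it at config.log in your build directory.
CONFIG_LOG=${CONFIG_LOG:-config.log}
[ -f "$CONFIG_LOG" ] && grep '^ *\$ .*configure' "$CONFIG_LOG" | head -n 1
```

The first command should print the version line, and the second the line
that was recorded when configure was invoked.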

Thanks
Ralph

On 10/18/06 11:03 AM, "Lydia Heck" <lydia.heck_at_[hidden]> wrote:

>
> I have recently installed openmpi 1.3r1212a over tcp and gigabit
> on a Solaris 10 x86/64 system.
>
> Some test codes,
> monte (a Monte Carlo estimate of pi),
> connectivity (which tests connectivity between processes and nodes) and
> prime (which calculates prime numbers),
> all examples bundled with Sun HPC,
>
> compile fine using the Open MPI versions of mpicc, mpif95 and mpic++.
>
> Sometimes the jobs run fine, but most of the time they freeze,
> leaving zombies behind.
>
> my run time command is
>
> mpirun --hostfile my-hosts -mca pls_rsh_agent rsh --mca btl tcp,self -np 14 \
> monte
>
> and I get as output
> oberon(209) > mpirun --hostfile my-hosts -mca pls_rsh_agent rsh --mca btl
> tcp,self -np 14 monte
> Monte-Carlo estimate of pi by 14 processes is 3.141503.
>
> with the cursor hanging.
>
> The process table shows
>
> oberon# ps -eaf | grep dph0elh
> dph0elh 9583 7445 7 17:45:01 pts/26 9:22 mpirun --hostfile my-hosts
> -mca pls_rsh_agent rsh --mca btl tcp,self -np 14 mon
> dph0elh 9595 9588 0 - ? 0:02 <defunct>
> dph0elh 9588 1 7 17:45:01 ?? 9:03 orted --bootproxy 1 --name
> 0.0.1 --num_procs 5 --vpid_start 0 --nodename oberon
> dph0elh 7445 6924 0 17:01:38 pts/26 0:00 -tcsh
> root 9656 4151 0 18:01:31 pts/36 0:00 grep dph0elh
> dph0elh 9593 9588 0 - ? 0:02 <defunct>
>
>
> One of the nodes offers 8 CPUs; the other nodes in the hostfile offer 2 each.
> There are a total of 14 CPUs available and, as you can see from the command
> line, I use --mca btl tcp,self.
>
> There are no other interconnects.
>
> I could not find any entry in the FAQs, except for the advice on using
> --mca btl tcp,self.
>
>
>
>
> ------------------------------------------
> Dr E L Heck
>
> University of Durham
> Institute for Computational Cosmology
> Ogden Centre
> Department of Physics
> South Road
>
> DURHAM, DH1 3LE
> United Kingdom
>
> e-mail: lydia.heck_at_[hidden]
>
> Tel.: + 44 191 - 334 3628
> Fax.: + 44 191 - 334 3645
> ___________________________________________