Open MPI User's Mailing List Archives

From: David Bronke (whitelynx_at_[hidden])
Date: 2007-03-15 15:51:09


OK, now that I've figured out what the signal means, I'm wondering
exactly what is running into permission problems. The program I'm
running doesn't use any functions except printf, sprintf, and MPI_*.
I was thinking that changes to permissions on certain /dev entries in
newer distros might cause this, but I'm not even sure which /dev
entries MPI would use.
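
For what it's worth, signal numbers and errno values are two separate
numbering schemes, so looking up 13 as an error code and looking it up as a
signal can give different answers. Here is a minimal Python sketch of the
two lookups, plus a write to a pipe whose read end is closed, which is the
usual trigger for SIGPIPE. The helper names are made up for illustration,
and the claim that both constants equal 13 assumes Linux numbering.

```python
import errno
import os
import signal


def describe_signal(number: int) -> str:
    """Return the symbolic name of a signal number, e.g. 'SIGPIPE'."""
    return signal.Signals(number).name


def describe_errno(number: int) -> str:
    """Return the symbolic errno name and its message for an error number."""
    name = errno.errorcode.get(number, "unknown")
    return f"{name}: {os.strerror(number)}"


def write_to_closed_pipe() -> str:
    """Write to a pipe with no reader and report the resulting errno name.

    Python ignores SIGPIPE by default, so the write fails with
    BrokenPipeError (EPIPE) instead of killing the process. A C program
    with the default SIGPIPE disposition would be terminated by the signal.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader left on the pipe
    try:
        os.write(write_fd, b"x")
    except BrokenPipeError as exc:
        return errno.errorcode[exc.errno]
    finally:
        os.close(write_fd)
    return "no error"


if __name__ == "__main__":
    # Symbolic constants are used instead of a hard-coded 13 because the
    # numeric values are platform-specific (assumption: Linux, where both are 13).
    print("signal", int(signal.SIGPIPE), "->", describe_signal(signal.SIGPIPE))
    print("errno ", errno.EACCES, "->", describe_errno(errno.EACCES))
    print("write to closed pipe ->", write_to_closed_pipe())
```

If this is right, "signal 13" points at a write to a closed pipe or socket
rather than directly at a permission error, although a permission problem
elsewhere could still be what caused the other end to close.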

On 3/15/07, McCalla, Mac <macmccalla_at_[hidden]> wrote:
> Hi,
> If the perror command is available on your system, it will tell
> you the message associated with the signal value. On my system
> (RHEL4U3), it is "permission denied".
>
> HTH,
>
> mac mccalla
>
> -----Original Message-----
> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
> Behalf Of David Bronke
> Sent: Thursday, March 15, 2007 12:25 PM
> To: users_at_[hidden]
> Subject: [OMPI users] Signal 13
>
> I've been trying to get OpenMPI working on two of the computers at a lab
> I help administer, and I'm running into a rather large issue. When
> running anything using mpirun as a normal user, I get the following
> output:
>
>
> $ mpirun --no-daemonize --host localhost,localhost,localhost,localhost,localhost,localhost,localhost,localhost /workspace/bronke/mpi/hello
> mpirun noticed that job rank 0 with PID 0 on node "localhost" exited on
> signal 13.
> [trixie:18104] ERROR: A daemon on node localhost failed to start as
> expected.
> [trixie:18104] ERROR: There may be more information available from
> [trixie:18104] ERROR: the remote shell (see above).
> [trixie:18104] The daemon received a signal 13.
> 8 additional processes aborted (not shown)
>
>
> However, running the same exact command line as root works fine:
>
>
> $ sudo mpirun --no-daemonize --host localhost,localhost,localhost,localhost,localhost,localhost,localhost,localhost /workspace/bronke/mpi/hello
> Password:
> p is 8, my_rank is 0
> p is 8, my_rank is 1
> p is 8, my_rank is 2
> p is 8, my_rank is 3
> p is 8, my_rank is 6
> p is 8, my_rank is 7
> Greetings from process 1!
>
> Greetings from process 2!
>
> Greetings from process 3!
>
> p is 8, my_rank is 5
> p is 8, my_rank is 4
> Greetings from process 4!
>
> Greetings from process 5!
>
> Greetings from process 6!
>
> Greetings from process 7!
>
>
> I've looked up signal 13, and have found that it is apparently SIGPIPE;
> I also found a thread on the LAM-MPI site:
> http://www.lam-mpi.org/MailArchives/lam/2004/08/8486.php
> However, this thread seems to indicate that the problem would be in the
> application (/workspace/bronke/mpi/hello in this case), but there are no
> pipes in use in this app, and the fact that it works as expected as root
> doesn't seem to fit either. I have tried running mpirun with --verbose
> and it doesn't show any more output than without it, so I've run into a
> sort of dead-end on this issue. Does anyone know of any way I can figure
> out what's going wrong or how I can fix it?
>
> Thanks!
> --
> David H. Bronke
> Lead Programmer
> G33X Nexus Entertainment
> http://games.g33xnexus.com/precursors/
>
> v3sw5/7Hhw5/6ln4pr6Ock3ma7u7+8Lw3/7Tm3l6+7Gi2e4t4Mb7Hen5g8+9ORPa22s6MSr7p6
> hackerkey.com
> Support Web Standards! http://www.webstandards.org/
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

-- 
David H. Bronke
Lead Programmer
G33X Nexus Entertainment
http://games.g33xnexus.com/precursors/
v3sw5/7Hhw5/6ln4pr6Ock3ma7u7+8Lw3/7Tm3l6+7Gi2e4t4Mb7Hen5g8+9ORPa22s6MSr7p6
hackerkey.com
Support Web Standards! http://www.webstandards.org/