Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Unable to find the following executable
From: Addepalli, Srirangam V (srirangam.v.addepalli_at_[hidden])
Date: 2010-11-20 13:28:31


Hello Tushar,
Can you send me the output of ompi_info?
Have you tried using just TCP instead of IB to narrow down the problem?
Rangam

#!/bin/sh
#PBS -V
#PBS -q wasatch
#PBS -N SWMF
#PBS -l nodes=1:ppn=8
# change to the run directory
#cd $SWMF_v2.3/run
cat "${PBS_NODEFILE}" > list_of_nodes

mpirun --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 8 /home/A00945081/SWMF_v2.3/run/SWMF.exe > run.log
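
Before forcing `--mca btl self,sm,tcp`, it may help to confirm which BTL components the installed Open MPI actually provides. The following is only a sketch: it assumes `ompi_info` prints one line of the form `MCA btl: <name> (...)` per component, and the helper name `list_btl_components` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read ompi_info output on stdin and print the name
# of each BTL component, one per line.
# Assumption: component lines look like "MCA btl: tcp (MCA v1.0, ...)".
list_btl_components() {
    awk '$1 == "MCA" && $2 == "btl:" { print $3 }'
}

# Typical use on a node where ompi_info is on the PATH:
#   ompi_info | list_btl_components
```

If `tcp`, `sm`, and `self` do not all appear in the list, the mpirun line above would probably fail for a different reason than the one being investigated.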

________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Tushar Andriyas [thugnomics28_at_[hidden]]
Sent: Saturday, November 20, 2010 12:11 PM
To: Open MPI Users
Subject: Re: [OMPI users] Unable to find the following executable

Rangam,

It does not want to run at all. Attached is the log file from the batch file run you sent.

On Sat, Nov 20, 2010 at 10:32 AM, Addepalli, Srirangam V <srirangam.v.addepalli_at_[hidden]> wrote:
Hello Tushar,
mpirun is not able to spawn processes on the allocated node. This should help:

#!/bin/sh
#PBS -V
#PBS -q wasatch
#PBS -N SWMF
#PBS -l nodes=2:ppn=8
# change to the run directory
#cd $SWMF_v2.3/run
cat "${PBS_NODEFILE}" > list_of_nodes
mpirun -np 8 /home/A00945081/SWMF_v2.3/run/SWMF.exe > run.log

Rangam

________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Tushar Andriyas [thugnomics28_at_[hidden]]
Sent: Saturday, November 20, 2010 10:48 AM
To: Open MPI Users
Subject: Re: [OMPI users] Unable to find the following executable

Hi Rangam,

I ran the batch file that you gave and have attached the error file. Also, since the WASATCH cluster is kind of small, people usually run on UINTA. So, if possible could you look at the uinta error files?
Tushar

On Fri, Nov 19, 2010 at 12:31 PM, Addepalli, Srirangam V <srirangam.v.addepalli_at_[hidden]> wrote:
Hello Tushar,
After looking at the log files you attached it appears that there are multiple issues.

[0,1,11]: Myrinet/GM on host wasatch-55 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.

Usually this occurs if there is a mismatch between the mpirun version and the MCA btl selection. I suggest the following to check whether the job actually works on a single node:

#!/bin/sh
#PBS -V
#PBS -q wasatch
#PBS -N SWMF
#PBS -l nodes=2:ppn=8
# change to the run directory
#cd $SWMF_v2.3/run
cat "${PBS_NODEFILE}" > list_of_nodes
mpirun -np 8 -machinefile list_of_nodes /home/A00945081/SWMF_v2.3/run/SWMF.exe > run.log
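
Since a version mismatch is one suspect, a quick check is whether the mpirun found on the PATH belongs to the same installation the code was built against. This is a rough sketch only: the helper name `mpirun_under_prefix` is hypothetical, and it compares plain path strings, so an installation reached through a symlink would be reported as a mismatch.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the first mpirun on PATH is
# exactly <prefix>/bin/mpirun for the given installation prefix.
mpirun_under_prefix() {
    prefix=$1
    found=$(command -v mpirun) || return 1   # no mpirun on PATH at all
    [ "$found" = "$prefix/bin/mpirun" ]
}

# Example, using the prefix quoted elsewhere in this thread:
#   mpirun_under_prefix /opt/libraries/openmpi/openmpi-1.2.7-pgi \
#       || echo "mpirun on PATH is not from the expected install" >&2
```

Running the check inside the batch job, rather than on the login node, shows what the compute node itself sees.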

Rangam

________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Tushar Andriyas [thugnomics28_at_[hidden]]
Sent: Friday, November 19, 2010 1:11 PM
To: Open MPI Users
Subject: Re: [OMPI users] Unable to find the following executable

Hey Rangam,

I tried out the batch script: the error file comes out empty, and the output file contains /home/A00945081/SWM_v2.3/run/SWMF.exe when run on a single machine, and the same with multiple machines in the run. So, does that mean that the exe is auto mounted? What should I do next?

Tushar

On Fri, Nov 19, 2010 at 10:05 AM, Addepalli, Srirangam V <srirangam.v.addepalli_at_[hidden]> wrote:
Hello Tushar,

Try the following script.

#!/bin/sh
#PBS -V
#PBS -q wasatch
#PBS -N SWMF
#PBS -l nodes=1:ppn=8
# change to the run directory
#cd $SWMF_v2.3/run
cat "${PBS_NODEFILE}" > list_of_nodes

The objective is to check if your user directories are auto mounted on compute nodes and are available during run time.

If the job returns information about SWMF.exe then it can be safely assumed that user directories are being auto mounted.
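
One way such a check could look inside the batch script is sketched below. The helper name `check_exe` is hypothetical; it only tests the node it runs on, and it uses `uname -n` to print that node's name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path is an executable regular
# file on the node where this runs, prefixed with the node name.
check_exe() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo "$(uname -n): found $1"
    else
        echo "$(uname -n): MISSING $1" >&2
        return 1
    fi
}

# Inside the batch script one could then call:
#   check_exe /home/A00945081/SWMF_v2.3/run/SWMF.exe
```

A "found" line in the job's output file would suggest the home directory is mounted on that compute node; a "MISSING" line in the error file would point the other way.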

Rangam

________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Tushar Andriyas [thugnomics28_at_[hidden]]
Sent: Friday, November 19, 2010 8:35 AM
To: Open MPI Users
Subject: Re: [OMPI users] Unable to find the following executable

It just gives back the info on folders in my home directory. Don't get me wrong, but I'm kind of new to this. So, could you type out the full command that I need to give?

Tushar

On Thu, Nov 18, 2010 at 8:35 AM, Ralph Castain <rhc_at_[hidden]> wrote:
You can qsub a simple "ls" on that path - that will tell you if the path is valid on all machines in that allocation.

What typically happens is that home directories aren't remotely mounted, or are mounted on a different location.
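
As a sketch of what qsub-ing a simple "ls" might look like: the helper below only prints a small PBS job script and submits nothing by itself. The name `make_ls_job` and the job name `lscheck` are made up for illustration, and the job as written runs `ls` only on the first node of the allocation.

```shell
#!/bin/sh
# Hypothetical helper: print a tiny PBS job script that lists one path.
# Arguments: queue name, then the path to check. Nothing is submitted.
make_ls_job() {
    queue=$1
    path=$2
    cat <<EOF
#!/bin/sh
#PBS -q $queue
#PBS -N lscheck
#PBS -l nodes=1:ppn=1
ls -l "$path"
EOF
}

# To submit (Torque's qsub reads the script from standard input when no
# script file is given):
#   make_ls_job wasatch /home/A00945081/SWMF_v2.3/run/SWMF.exe | qsub
```

The listing, or an error such as "No such file or directory", should then show up in the job's output or error file.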

On Thu, Nov 18, 2010 at 8:31 AM, Tushar Andriyas <thugnomics28_at_[hidden]> wrote:
No, it's not in the same directory as SWMF. I guess the path is the same, since all the machines in a cluster are configured the same way. How do I know if this is not the case?

On Thu, Nov 18, 2010 at 8:25 AM, Ralph Castain <rhc_at_[hidden]> wrote:
Is your "hello world" test program in the same directory as SWMF? Is it possible that the path you are specifying is not available on all of the remote machines? That's the most common problem we see.

On Thu, Nov 18, 2010 at 7:59 AM, Tushar Andriyas <thugnomics28_at_[hidden]> wrote:
Hi there,

Thanks for the prompt reply. The thing is that although mpirun is set up correctly (since a simple hello world works), when I run the main SWMF.exe executable, the cluster machines somehow fail to find the executable (SWMF.exe).

So, I have attached the sample error file from one of the runs (SWMF.e143438) and also the MAKEFILES so that you could better gauge the problem. The makefiles have Linux as the OS and pgf90 as the compiler, with mpif90 as the linker. I am using openmpi-1.2.7-pgi. The job is submitted using a batch file (job.bats) and the scheduler is Torque (I am not sure of the version, but I can see three on the machines, viz. 2.0.0, 2.2.1, and 2.5.2).

I have also attached an error file from one of the clusters (WASATCH viz SWMF.e143439) and UINTA (SWMF.e143440) with the whole path of the exe as Srirangam mentioned as follows (in the batch file).

mpirun --prefix /opt/libraries/openmpi/openmpi-1.2.7-pgi /home/A00945081/SWMF_v2.3/run/SWMF.exe > runlog_`date +%y%m%d%H%M`

I have tried both mpirun and mpiexec but nothing seems to work.

Tushar

On Wed, Nov 17, 2010 at 8:12 PM, Addepalli, Srirangam V <srirangam.v.addepalli_at_[hidden]> wrote:
Hello Tushar,
Have you tried supplying the full path of the executable, just to check?
Rangam
________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Tushar Andriyas [thugnomics28_at_[hidden]]
Sent: Wednesday, November 17, 2010 8:49 PM
To: users_at_[hidden]
Subject: [OMPI users] Unable to find the following executable

Hi there,

I am new to using MPI commands and am stuck on a problem with running a code. When I submit my job through a batch file, the job exits with the message that the executable could not be found on the machines. I have tried a lot of options, such as PBS -V and so on, but the problem persists. If someone is interested, I can send the full info on the cluster, the compiler, the Open MPI settings, and other stuff. BTW, the launcher is Torque (which you might have guessed). The code does not have a forum, so I am in a deep mire.

Thanks,
Tushar

_______________________________________________
users mailing list
users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/users
