Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Unable to find the following executable
From: Tushar Andriyas (thugnomics28_at_[hidden])
Date: 2010-11-20 13:11:57


Rangam,

It does not run at all. Attached is the log file from running the batch file
you sent.

On Sat, Nov 20, 2010 at 10:32 AM, Addepalli, Srirangam V <
srirangam.v.addepalli_at_[hidden]> wrote:

> Hello Tushar,
> mpirun is not able to spawn processes on the allocated node. This should
> help:
>
> #!/bin/sh
> #PBS -V
> #PBS -q wasatch
> #PBS -N SWMF
> #PBS -l nodes=2:ppn=8
> # change to the run directory
> #cd $SWMF_v2.3/run
> cat "$PBS_NODEFILE" > list_of_nodes
> mpirun -np 8 /home/A00945081/SWMF_v2.3/run/SWMF.exe > run.log
>
>
> Rangam
>
>
> ________________________________________
> From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of
> Tushar Andriyas [thugnomics28_at_[hidden]]
> Sent: Saturday, November 20, 2010 10:48 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Unable to find the following executable
>
> Hi Rangam,
>
> I ran the batch file that you gave and have attached the error file. Also,
> since the WASATCH cluster is kind of small, people usually run on UINTA. So,
> if possible, could you look at the UINTA error files?
> Tushar
>
> On Fri, Nov 19, 2010 at 12:31 PM, Addepalli, Srirangam V <
> srirangam.v.addepalli_at_[hidden]> wrote:
> Hello Tushar,
> After looking at the log files you attached it appears that there are
> multiple issues.
>
> [0,1,11]: Myrinet/GM on host wasatch-55 was unable to find any NICs.
> Another transport will be used instead, although this may result in
> lower performance.
>
> Usually these occur if there is a mismatch between the mpirun version and
> the MCA BTL selection. I suggest the following script to check whether the
> job actually works on a single node:
>
> #!/bin/sh
> #PBS -V
> #PBS -q wasatch
> #PBS -N SWMF
> #PBS -l nodes=2:ppn=8
> # change to the run directory
> #cd $SWMF_v2.3/run
> cat "$PBS_NODEFILE" > list_of_nodes
> mpirun -np 8 -machinefile list_of_nodes \
>   /home/A00945081/SWMF_v2.3/run/SWMF.exe > run.log
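The Myrinet/GM warning quoted above normally just means Open MPI fell back to another transport on nodes without Myrinet hardware. A minimal sketch of excluding the gm BTL on the mpirun command line follows. The build_cmd helper and the EXE variable are hypothetical names introduced only for illustration; "--mca btl ^gm" is the usual Open MPI syntax for excluding a component, but whether it resolves the launch failure in this thread is an assumption.

```shell
#!/bin/sh
# build_cmd NP EXE: print an mpirun command line that excludes the
# Myrinet/GM BTL, so Open MPI picks among the remaining transports.
# (Hypothetical helper, not part of Open MPI or Torque.)
build_cmd() {
    echo "mpirun --mca btl ^gm -np $1 $2"
}

# Path taken from the thread; override EXE for another installation.
EXE=${EXE:-/home/A00945081/SWMF_v2.3/run/SWMF.exe}

CMD=$(build_cmd 8 "$EXE")
echo "$CMD"

# Launch only inside a Torque job where mpirun is actually on the PATH.
if [ -n "${PBS_NODEFILE:-}" ] && command -v mpirun >/dev/null 2>&1; then
    $CMD > run.log 2>&1
fi
```

Printing the command before running it puts the exact launch line into the job's output file, which makes it easier to compare runs on WASATCH and UINTA.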
>
>
> Rangam
>
>
> ________________________________________
> From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of
> Tushar Andriyas [thugnomics28_at_[hidden]]
> Sent: Friday, November 19, 2010 1:11 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Unable to find the following executable
>
> Hey Rangam,
>
> I tried out the batch script. The error file comes out empty and the
> output file contains /home/A00945081/SWM_v2.3/run/SWMF.exe when run on a
> single machine, and the same with multiple machines in the run. So, does
> that mean that the exe is auto-mounted? What should I do next?
>
> Tushar
>
> On Fri, Nov 19, 2010 at 10:05 AM, Addepalli, Srirangam V <
> srirangam.v.addepalli_at_[hidden]> wrote:
> Hello Tushar,
>
> Try the following script.
>
> #!/bin/sh
> #PBS -V
> #PBS -q wasatch
> #PBS -N SWMF
> #PBS -l nodes=1:ppn=8
> # change to the run directory
> #cd $SWMF_v2.3/run
> cat "$PBS_NODEFILE" > list_of_nodes
>
>
>
>
> The objective is to check if your user directories are auto mounted on
> compute nodes and are available during run time.
>
> If the job returns information about SWMF.exe then it can be safely assumed
> that user directories are being auto mounted.
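A minimal sketch of such a check, run from inside the job, is shown here. The helper names unique_nodes and slots_per_node and the EXE variable are hypothetical and introduced only for illustration. It assumes Torque's pbsdsh is available on the compute nodes; if it is not, the check falls back to the first node only.

```shell
#!/bin/sh
# unique_nodes FILE: print each distinct host name in a Torque node file.
unique_nodes() {
    sort -u "$1"
}

# slots_per_node FILE: print "count host" per host; the count is the number
# of slots (ppn) the scheduler granted on that host.
slots_per_node() {
    sort "$1" | uniq -c | awk '{print $1, $2}'
}

# Path taken from the thread; override EXE for another installation.
EXE=${EXE:-/home/A00945081/SWMF_v2.3/run/SWMF.exe}

# PBS_NODEFILE is only set inside a running Torque job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    echo "Slots per node in this allocation:"
    slots_per_node "$PBS_NODEFILE"
    if command -v pbsdsh >/dev/null 2>&1; then
        # pbsdsh runs the command once per allocated slot, so each host
        # appears several times in the output.
        pbsdsh /bin/ls -l "$EXE"
    else
        # Fallback: checks only the node the job script itself runs on.
        ls -l "$EXE"
    fi
fi
```

If ls reports the file on every host, the home directory is visible at the same path across the allocation; an error from any host points to a mount problem on that node.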
>
> Rangam
>
>
>
> ________________________________________
> From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of
> Tushar Andriyas [thugnomics28_at_[hidden]]
> Sent: Friday, November 19, 2010 8:35 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Unable to find the following executable
>
> It just gives back the info on folders in my home directory. Don't get me
> wrong, but I am kind of new to this. So, could you type out the full
> command that I need to give?
>
> Tushar
>
> On Thu, Nov 18, 2010 at 8:35 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> You can qsub a simple "ls" on that path - that will tell you if the path is
> valid on all machines in that allocation.
>
> What typically happens is that home directories aren't remotely mounted, or
> are mounted on a different location.
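One way to turn that "ls" test into a small job script is sketched below. The check_exe function and the EXE variable are hypothetical names used only for illustration. The script separates a missing path from a file that exists but is not executable, since the two call for different fixes.

```shell
#!/bin/sh
# check_exe PATH: report whether PATH exists and is executable on this host.
# Returns 0 if usable, 1 if missing, 2 if present but not executable.
check_exe() {
    if [ ! -e "$1" ]; then
        echo "$(uname -n): MISSING $1"
        return 1
    elif [ ! -x "$1" ]; then
        echo "$(uname -n): NOT EXECUTABLE $1"
        return 2
    fi
    echo "$(uname -n): OK $1"
}

# Path taken from the thread; override EXE for another installation.
EXE=${EXE:-/home/A00945081/SWMF_v2.3/run/SWMF.exe}

# PBS_NODEFILE is only set inside a running Torque job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    check_exe "$EXE"
fi
```

Saved under a name of your choosing and submitted with qsub, this reports on the node where the job script runs. It does not by itself visit every node in the allocation.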
>
>
> On Thu, Nov 18, 2010 at 8:31 AM, Tushar Andriyas <thugnomics28_at_[hidden]>
> wrote:
> No, it's not in the same directory as SWMF. I guess the path is the same
> since all the machines in a cluster are configured the same way. How do I
> know if this is not the case?
>
>
> On Thu, Nov 18, 2010 at 8:25 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Is your "hello world" test program in the same directory as SWMF? Is it
> possible that the path you are specifying is not available on all of the
> remote machines? That's the most common problem we see.
>
>
> On Thu, Nov 18, 2010 at 7:59 AM, Tushar Andriyas <thugnomics28_at_[hidden]>
> wrote:
> Hi there,
>
> Thanks for the prompt reply. The thing is that although mpirun is set up
> correctly (since a simple hello world works), when I run the main
> SWMF.exe executable, the cluster machines somehow fail to find the
> executable (SWMF.exe).
>
> So, I have attached a sample error file from one of the runs
> (SWMF.e143438) and also the makefiles so that you can better gauge the
> problem. The makefiles have Linux as the OS and pgf90 as the compiler, with
> mpif90 as the linker. I am using openmpi-1.2.7-pgi. The job is submitted
> using a batch file (job.bats) and the scheduler is Torque (I am not sure of
> the version, but I can see three on the machines: 2.0.0, 2.2.1 and 2.5.2).
>
> I have also attached error files from the WASATCH (SWMF.e143439) and UINTA
> (SWMF.e143440) clusters, run with the full path of the exe as Srirangam
> suggested, as follows (in the batch file):
>
> mpirun --prefix /opt/libraries/openmpi/openmpi-1.2.7-pgi \
>   /home/A00945081/SWMF_v2.3/run/SWMF.exe > runlog_`date +%y%m%d%H%M`
>
> I have tried both mpirun and mpiexec but nothing seems to work.
>
> Tushar
>
>
> On Wed, Nov 17, 2010 at 8:12 PM, Addepalli, Srirangam V <
> srirangam.v.addepalli_at_[hidden]> wrote:
> Hello Tushar,
> Have you tried supplying the full path of the executable, just to check?
> Rangam
> ________________________________________
> From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of
> Tushar Andriyas [thugnomics28_at_[hidden]]
> Sent: Wednesday, November 17, 2010 8:49 PM
> To: users_at_[hidden]
> Subject: [OMPI users] Unable to find the following executable
>
> Hi there,
>
> I am new to using MPI commands and am stuck on a problem with running a
> code. When I submit my job through a batch file, the job exits with the
> message that the executable could not be found on the machines. I have tried
> a lot of options such as PBS -V and so on, but the problem persists. If
> someone is interested, I can send the full info on the cluster, the compiler
> and Open MPI settings and other details. BTW, the launcher is Torque (which
> you might have guessed). The code does not have a forum, so I am in a deep
> mire.
>
> Thanks,
> Tushar
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>