Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Running simple MPI program
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2010-10-23 12:43:53


What happens if you run with 2 hosts?

It's unusual that no indication of the actual error is shown.

Are you running exactly the same version of OMPI on both nodes?

Sent from my PDA. No type good.

On Oct 23, 2010, at 12:37 PM, "Brandon Fulcher" <minguo_at_[hidden]> wrote:

> Hi Jeff, thanks for responding.
>
> mpirun hostname returns the name of the local machine.
>
> On Sat, Oct 23, 2010 at 11:27 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
> I didn't notice if it came up earlier - are you running the same version of OMPI on each node?
>
> What happens if you try running hostname under mpirun (i.e., not an MPI app)?
>
> Sent from my PDA. No type good.
>
> On Oct 23, 2010, at 12:07 PM, "Brandon Fulcher" <minguo_at_[hidden]> wrote:
>
>> Hi Jody, thank you for the response.
>>
>> Specifying the number of processes in the manner you suggested
>> (mpirun -np 2 -hostfile hosts.txt ilk)
>>
>> does indeed succeed. All processes are launched on my local machine, which has two slots. If I change the command to:
>>
>> mpirun -np 3 -hostfile hosts.txt ilk
>>
>> it fails with the same error:
>>
>> --------------------------------------------------------------------------
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --------------------------------------------------------------------------
>>
>>
>> On Sat, Oct 23, 2010 at 10:13 AM, jody <jody.xha_at_[hidden]> wrote:
>> Hi Brandon
>> Does it work if you try this:
>> mpirun -np 2 -hostfile hosts.txt ilk
>>
>> (see http://www.open-mpi.org/faq/?category=running#simple-spmd-run)
>>
>> jody
>>
>> On Sat, Oct 23, 2010 at 4:07 PM, Brandon Fulcher <minguo_at_[hidden]> wrote:
>> > Thank you for the response!
>> >
>> > The code runs on my own machine as well. Both machines, in fact. And I did
>> > not build MPI but installed the package from the ubuntu repositories.
>> >
>> > The problem occurs when I try to run a job using two machines or simply try
>> > to run it on a slave from the master.
>> >
>> > The actual command I ran, along with its output, is below:
>> >
>> > mpirun -hostfile hosts.txt ilk
>> > --------------------------------------------------------------------------
>> > mpirun noticed that the job aborted, but has no info as to the process
>> > that caused that situation.
>> > --------------------------------------------------------------------------
>> >
>> > where hosts.txt contains:
>> > 192.168.0.2 cpu=2
>> > 192.168.0.6 cpu=1
>> >
>> >
>> > If it matters, the same output is given if I specify a remote host on the
>> > command line, such as (when I am on 192.168.0.2):
>> > mpirun -host 192.168.0.6 ilk
>> >
>> > If I run it locally, the job succeeds. This works on either machine:
>> > mpirun ilk
>> >
>> >
>> > Thanks in advance.
>> >
>> > On Fri, Oct 22, 2010 at 11:59 PM, David Zhang <solarbikedz_at_[hidden]> wrote:
>> >>
>> >> Since you said you're new to MPI, what command did you use to run the 2
>> >> processes?
>> >>
>> >> On Fri, Oct 22, 2010 at 9:58 PM, David Zhang <solarbikedz_at_[hidden]>
>> >> wrote:
>> >>>
>> >>> Your code works on my machine. It could be the way you built MPI.
>> >>>
>> >>> On Fri, Oct 22, 2010 at 7:26 PM, Brandon Fulcher <minguo_at_[hidden]>
>> >>> wrote:
>> >>>>
>> >>>> Hi, I am completely new to MPI and am having trouble running a job
>> >>>> between two cpus.
>> >>>>
>> >>>> The same thing happens no matter what MPI job I try to run, but here is
>> >>>> a simple 'hello world' style program I am trying to run.
>> >>>>
>> >>>> #include <mpi.h>
>> >>>> #include <stdio.h>
>> >>>> #include <unistd.h>   /* gethostname() */
>> >>>>
>> >>>> int main(int argc, char **argv)
>> >>>> {
>> >>>>     int rank;
>> >>>>     char hostname[256];
>> >>>>
>> >>>>     MPI_Init(&argc, &argv);
>> >>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>> >>>>     gethostname(hostname, sizeof(hostname) - 1);
>> >>>>     hostname[sizeof(hostname) - 1] = '\0';   /* ensure termination */
>> >>>>     printf("Hello world! I am process number: %d on host %s\n", rank, hostname);
>> >>>>     MPI_Finalize();
>> >>>>     return 0;
>> >>>> }
>> >>>>
>> >>>>
>> >>>> On either CPU, I can successfully compile and run, but when trying to
>> >>>> run the program using two CPUS it fails with this output:
>> >>>>
>> >>>>
>> >>>> --------------------------------------------------------------------------
>> >>>> mpirun noticed that the job aborted, but has no info as to the process
>> >>>> that caused that situation.
>> >>>>
>> >>>> --------------------------------------------------------------------------
>> >>>>
>> >>>>
>> >>>> With no additional information or errors, what can I do to find out
>> >>>> what is wrong?
>> >>>>
>> >>>>
>> >>>>
>> >>>> I have read the FAQ and followed the instructions. I can ssh into the
>> >>>> slave without entering a password and have the libraries installed on both
>> >>>> machines.
>> >>>>
>> >>>> The only pertinent thing I could find is this FAQ entry,
>> >>>> http://www.open-mpi.org/faq/?category=running#missing-prereqs but I do not
>> >>>> know if it applies, since I installed Open MPI from the Ubuntu
>> >>>> repositories and assume the libraries are set up correctly.
>> >>>>
>> >>>> _______________________________________________
>> >>>> users mailing list
>> >>>> users_at_[hidden]
>> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> David Zhang
>> >>> University of California, San Diego
>> >>
>> >>
>> >>
>> >> --
>> >> David Zhang
>> >> University of California, San Diego