Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Running simple MPI program
From: Brandon Fulcher (minguo_at_[hidden])
Date: 2010-10-23 12:37:01


Hi Jeff, thanks for responding.

mpirun hostname returns the name of the local machine.
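
For reference, the -np 2 versus -np 3 difference quoted below is what the slot counts in hosts.txt would predict if ranks fill slots host by host. The following is a minimal sketch of that arithmetic, not Open MPI's actual mapper. place_ranks is a hypothetical helper, and the sketch assumes by-slot placement and that cpu=N is read as a slot count in the same way as slots=N.

```shell
#!/bin/sh
# Sketch only: print which host each rank would land on if ranks fill
# hostfile slots in order (by-slot placement is an assumption here).
# A host with no "slots=N" or "cpu=N" field is counted as one slot.
# If NP exceeds the total slot count, the sketch simply stops; it does
# not model oversubscription.

# place_ranks HOSTFILE NP   (hypothetical helper, not part of Open MPI)
place_ranks() {
    awk -v np="$2" '
        /^[ \t]*#/ || NF == 0 { next }      # skip comments and blank lines
        {
            slots = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(slots|cpu)=[0-9]+$/) {
                    split($i, kv, "=")
                    slots = kv[2] + 0
                }
            for (s = 0; s < slots && rank < np + 0; s++)
                printf "rank %d -> %s\n", rank++, $1
        }
    ' "$1"
}
```

With the hosts.txt quoted below, place_ranks hosts.txt 2 lists both ranks on 192.168.0.2, while place_ranks hosts.txt 3 puts rank 2 on 192.168.0.6. Under these assumptions -np 3 is the first case that needs a remote launch, which fits the failure appearing only there.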

On Sat, Oct 23, 2010 at 11:27 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:

> I didn't notice if it came up earlier - are you running the same version of
> OMPI on each node?
>
> What happens if you try mpirunning hostname (i.e., not an MPI app)?
>
> Sent from my PDA. No type good.
>
> On Oct 23, 2010, at 12:07 PM, "Brandon Fulcher" <minguo_at_[hidden]> wrote:
>
> Hi Jody, thank you for the response.
>
> Specifying the number of processes in the manner you provided
> (mpirun -np 2 -hostfile hosts.txt ilk) does indeed succeed. All processes
> are launched on my local machine, which has two slots. If I change the
> command to:
>
> mpirun -np 3 -hostfile hosts.txt ilk
>
> it fails with the same error:
>
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
>
>
> On Sat, Oct 23, 2010 at 10:13 AM, jody <jody.xha_at_[hidden]> wrote:
>
>> Hi Brandon
>> Does it work if you try this:
>> mpirun -np 2 -hostfile hosts.txt ilk
>>
>> (see http://www.open-mpi.org/faq/?category=running#simple-spmd-run)
>>
>> jody
>>
>> On Sat, Oct 23, 2010 at 4:07 PM, Brandon Fulcher <minguo_at_[hidden]> wrote:
>> > Thank you for the response!
>> >
>> > The code runs on my own machine as well. Both machines, in fact. And I did
>> > not build MPI but installed the package from the Ubuntu repositories.
>> >
>> > The problem occurs when I try to run a job using two machines or simply try
>> > to run it on a slave from the master.
>> >
>> > The actual command I ran, along with its output, is below:
>> >
>> > mpirun -hostfile hosts.txt ilk
>> > --------------------------------------------------------------------------
>> > mpirun noticed that the job aborted, but has no info as to the process
>> > that caused that situation.
>> > --------------------------------------------------------------------------
>> >
>> > where hosts.txt contains:
>> > 192.168.0.2 cpu=2
>> > 192.168.0.6 cpu=1
>> >
>> >
>> > If it matters, the same output is given if I specify a remote host in the
>> > command, for example (from 192.168.0.2):
>> > mpirun -host 192.168.0.6 ilk
>> >
>> > Now if I run it locally, the job succeeds. This works from either cpu.
>> > mpirun ilk
>> >
>> >
>> > Thanks in advance.
>> >
>> > On Fri, Oct 22, 2010 at 11:59 PM, David Zhang <solarbikedz_at_[hidden]> wrote:
>> >>
>> >> Since you said you're new to MPI, what command did you use to run the 2
>> >> processes?
>> >>
>> >> On Fri, Oct 22, 2010 at 9:58 PM, David Zhang <solarbikedz_at_[hidden]> wrote:
>> >>>
>> >>> Your code works on my machine. Could be the way you built MPI.
>> >>>
>> >>> On Fri, Oct 22, 2010 at 7:26 PM, Brandon Fulcher <minguo_at_[hidden]> wrote:
>> >>>>
>> >>>> Hi, I am completely new to MPI and am having trouble running a job
>> >>>> between two cpus.
>> >>>>
>> >>>> The same thing happens no matter what MPI job I try to run, but here is
>> >>>> a simple 'hello world' style program I am trying to run.
>> >>>>
>> >>>> #include <mpi.h>
>> >>>> #include <stdio.h>
>> >>>> #include <unistd.h>   /* gethostname */
>> >>>>
>> >>>> int main(int argc, char **argv)
>> >>>> {
>> >>>>     int rank;
>> >>>>     char hostname[256];
>> >>>>
>> >>>>     MPI_Init(&argc, &argv);
>> >>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>> >>>>     /* Terminate explicitly: gethostname may not null-terminate
>> >>>>        a truncated name. */
>> >>>>     gethostname(hostname, sizeof(hostname) - 1);
>> >>>>     hostname[sizeof(hostname) - 1] = '\0';
>> >>>>     printf("Hello world! I am process number: %d on host %s\n",
>> >>>>            rank, hostname);
>> >>>>     MPI_Finalize();
>> >>>>     return 0;
>> >>>> }
>> >>>>
>> >>>>
>> >>>> On either CPU, I can successfully compile and run, but when trying to
>> >>>> run the program using two CPUs it fails with this output:
>> >>>>
>> >>>> --------------------------------------------------------------------------
>> >>>> mpirun noticed that the job aborted, but has no info as to the process
>> >>>> that caused that situation.
>> >>>> --------------------------------------------------------------------------
>> >>>>
>> >>>> With no additional information or errors, what can I do to go about
>> >>>> finding out what is wrong?
>> >>>>
>> >>>>
>> >>>>
>> >>>> I have read the FAQ and followed the instructions. I can ssh into the
>> >>>> slave without entering a password and have the libraries installed on both
>> >>>> machines.
>> >>>>
>> >>>> The only pertinent thing I could find is this FAQ entry:
>> >>>> http://www.open-mpi.org/faq/?category=running#missing-prereqs
>> >>>> but I do not know if it applies, since I installed Open MPI from the Ubuntu
>> >>>> repositories and assume the libraries are correctly set.
>> >>>>
>> >>>> _______________________________________________
>> >>>> users mailing list
>> >>>> users_at_[hidden]
>> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> David Zhang
>> >>> University of California, San Diego
>> >>
>> >>
>> >>
>> >> --
>> >> David Zhang
>> >> University of California, San Diego
>> >>
>> >
>> >
>> >
>>
>>
>
>
>
>