Open MPI User's Mailing List Archives


From: Prakash Velayutham (prakash.velayutham_at_[hidden])
Date: 2007-06-05 10:27:13


Ralph,

Ralph H Castain wrote:
> Hmmm...I think I know what may be happening. Could you send me:
>
> 1. what Open MPI version you are using?
>
Open MPI 1.2.1
> 2. any MCA parameters you might be setting in your environment (remember
> that we may be picking up some system configuration file for those)
>
How do I get these?
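
For reference, Open MPI reads MCA parameters from the `mpirun` command line (`--mca <name> <value>`), from environment variables named `OMPI_MCA_<name>`, and from parameter files such as `$HOME/.openmpi/mca-params.conf` and `<prefix>/etc/openmpi-mca-params.conf`. Running `ompi_info --param all all` should list the parameters and their current values. As a minimal sketch of the environment part only, the helper below scans a list of `NAME=value` strings for the `OMPI_MCA_` prefix. The function names `is_mca_entry` and `print_mca_entries` are made up for this illustration and are not part of Open MPI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if an environment entry ("NAME=value") names an MCA parameter,
 * i.e. if it starts with the OMPI_MCA_ prefix, and 0 otherwise. */
int is_mca_entry(const char *entry)
{
    static const char prefix[] = "OMPI_MCA_";
    assert(entry != NULL);
    return strncmp(entry, prefix, sizeof(prefix) - 1) == 0;
}

/* Prints every OMPI_MCA_* entry in a NULL-terminated list of "NAME=value"
 * strings and returns how many were found. */
int print_mca_entries(char **env)
{
    int found = 0;
    for (char **e = env; e != NULL && *e != NULL; ++e) {
        if (is_mca_entry(*e)) {
            printf("%s\n", *e);
            ++found;
        }
    }
    return found;
}
```

On a POSIX system, declaring `extern char **environ;` and calling `print_mca_entries(environ)` from `main` prints whatever MCA settings the current shell exports. It does not cover values that come from the parameter files.
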
> This isn't related to the problem, but I also note that you are spawning
> "hostname" and then trying to do MPI send/recv with it - I don't think that
> is going to work.
>
I know. I could not get another client program to start before this, so I
just wanted to check whether /bin/hostname works with the spawn.
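
For the send/recv exchange to work, the spawned executable has to be an MPI program itself: it must call `MPI_Init` and obtain the intercommunicator to the master with `MPI_Comm_get_parent`, which `/bin/hostname` never does. Below is a minimal sketch of such a child, assuming it mirrors the master's receive, reply, receive sequence with tag 0 and 50-character buffers. The message texts and the idea of building it as a separate `slave1` binary are illustrative assumptions, not something from the thread. It needs an MPI installation to build (for example with `mpicc`).

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm parent;
    MPI_Status status;
    char inbox[50];
    char reply[50] = "hello master, slave reporting";  /* illustrative text */

    MPI_Init(&argc, &argv);

    /* A process started by MPI_Comm_spawn gets an intercommunicator to its
     * parent; a process started any other way gets MPI_COMM_NULL. */
    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        fprintf(stderr, "SLAVE : not started by MPI_Comm_spawn\n");
        MPI_Finalize();
        return 1;
    }

    /* On an intercommunicator, rank 0 refers to rank 0 of the remote
     * (master) group. The order below matches the master's send, recv, send. */
    MPI_Recv(inbox, 50, MPI_CHAR, 0, 0, parent, &status);
    printf("SLAVE : message received : %s\n", inbox);

    MPI_Send(reply, 50, MPI_CHAR, 0, 0, parent);

    MPI_Recv(inbox, 50, MPI_CHAR, 0, 0, parent, &status);
    printf("SLAVE : work order received : %s\n", inbox);

    MPI_Finalize();
    return 0;
}
```

With a child like this, the master would pass the path of the compiled child binary to `MPI_Comm_spawn` instead of `/bin/hostname`.
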
> Thanks
> Ralph
>
Thanks,
Prakash
>
> On 6/5/07 4:16 AM, "Prakash Velayutham" <Prakash.Velayutham_at_[hidden]>
> wrote:
>
>> Hi,
>>
>> Sorry about that. Two lines got cut out from the program. Here is the
>> full program and error messages again. No Resource Manager involved,
>> just ssh/rsh.
>>
>> Hostfile contains
>>
>> bmi-opt2-01
>> bmi-opt2-02
>> bmi-opt2-03
>> bmi-opt2-04
>>
>> ############################
>> #include<string.h>
>> #include<stdlib.h>
>> #include<stdio.h>
>> #include"mpi.h"
>>
>> void
>> main(int argc, char **argv)
>> {
>> int tag = 0;
>> int my_rank;
>> int num_proc;
>> char message_0[] = "hello slave, i'm your master";
>> char message_1[50];
>> char master_data[] = "slaves to work";
>> int array_of_errcodes[10];
>> int num;
>> MPI_Status status;
>> MPI_Comm inter_comm;
>> MPI_Info info;
>> int arr[1];
>> int rc1;
>> MPI_Init(&argc, &argv);
>> MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>> MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>> printf("MASTER : spawning 3 slaves ... \n");
>> rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
>> MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>> printf("MASTER : send a message to master of slaves ...\n");
>> MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);
>> MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
>> printf("MASTER : message received : %s\n", message_1);
>> MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);
>> MPI_Finalize();
>> exit(0);
>> }
>> #################################
>>
>> prakash_at_bmi-opt2-01:~/thesis/CS/Samples/x86_64> mpirun -np 1 --pernode
>> --prefix /usr/local/openmpi-1.2 --hostfile machinefile ./master1
>> MASTER : spawning 3 slaves ...
>> src is (null) and orte type is 0
>> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> dss/dss_copy.c at line 43
>> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> gpr_replica_put_get_fn.c at line 410
>> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> base/rmaps_base_registry_fns.c at line 612
>> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> base/rmaps_base_map_job.c at line 93
>> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> base/rmaps_base_receive.c at line 139
>> mpirun: killing job...
>>
>> mpirun noticed that job rank 0 with PID 3532 on node bmi-opt2-01 exited
>> on signal 15 (Terminated).
>>
>> Thanks,
>> Prakash
>>
>> >>> rhc_at_[hidden] 06/03/07 9:31 PM >>>
>> Hi Prakash
>>
>> Are you sure the code you provided here is the one generating the output
>> you attached? I don't see this message anywhere in your code:
>>
>> MASTER : spawning 3 slaves ...
>>
>> and it certainly isn't anything we generate. Also, your output implies you
>> are in some kind of loop, yet your code contains only a single comm_spawn.
>>
>> Could you please clarify?
>>
>> Thanks
>> Ralph
>>
>>
>> On 6/3/07 5:50 AM, "Prakash Velayutham" <Prakash.Velayutham_at_[hidden]>
>> wrote:
>>
>>
>>> Hello,
>>>
>>> Version - Open MPI 1.2.1.
>>>
>>> I have a simple program as below:
>>>
>>> #include<string.h>
>>> #include<stdlib.h>
>>> #include<stdio.h>
>>> #include"mpi.h"
>>>
>>> void
>>> main(int argc, char **argv)
>>> {
>>>
>>> int tag = 0;
>>> int my_rank;
>>> int num_proc;
>>> char message_0[] = "hello slave, i'm your master";
>>> char message_1[50];
>>> char master_data[] = "slaves to work";
>>> int num;
>>> MPI_Status status;
>>> MPI_Comm inter_comm;
>>> MPI_Info info;
>>> int arr[1];
>>> int rc1;
>>> MPI_Init(&argc, &argv);
>>> MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>>> MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>>> rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
>>> MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>>> printf("MASTER : send a message to master of slaves ...\n");
>>> MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);
>>> MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
>>> printf("MASTER : message received : %s\n", message_1);
>>> MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);
>>> MPI_Finalize();
>>> exit(0);
>>> }
>>>
>>> When this is run, all I get is
>>>
>>> ~/thesis/CS/Samples/x86_64> mpirun -np 4 --pernode --hostfile
>>> machinefile --prefix /usr/local/openmpi-1.2 ./master1
>>> MASTER : spawning 3 slaves ...
>>> MASTER : spawning 3 slaves ...
>>> MASTER : spawning 3 slaves ...
>>> MASTER : spawning 3 slaves ...
>>> src is (null) and orte type is 0
>>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>>> dss/dss_copy.c at line 43
>>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>>> gpr_replica_put_get_fn.c at line 410
>>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>>> base/rmaps_base_registry_fns.c at line 612
>>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>>> base/rmaps_base_map_job.c at line 93
>>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>>> base/rmaps_base_receive.c at line 139
>>> mpirun: killing job...
>>>
>>> mpirun noticed that job rank 0 with PID 25447 on node bmi-opt2-01 exited
>>> on signal 15 (Terminated).
>>> 3 additional processes aborted (not shown)
>>>
>>> Any idea what is wrong with this?
>>>
>>> Thanks,
>>> Prakash
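
Separately from the spawn failure, the quoted master program passes a count of 50 to `MPI_Send` for `message_0` and `master_data`, whose arrays hold only 29 and 15 bytes including the terminator, so the send would read past the end of each buffer. One way to avoid that, sketched here with a hypothetical helper named `string_send_count`, is to derive the count from the buffer itself:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count to pass to MPI_Send for a NUL-terminated string stored in a buffer
 * of buf_size bytes: the text plus its terminator, capped at the buffer size. */
int string_send_count(const char *buf, size_t buf_size)
{
    const char *nul;

    assert(buf != NULL && buf_size > 0);
    nul = memchr(buf, '\0', buf_size);
    /* If no terminator is found inside the buffer, fall back to its full size. */
    return (int)(nul != NULL ? (size_t)(nul - buf) + 1 : buf_size);
}
```

The master's first send would then read `MPI_Send(message_0, string_send_count(message_0, sizeof message_0), MPI_CHAR, 0, tag, inter_comm);`. The receiving side can keep posting a count of 50, since a receive may match a message shorter than the count it asks for.
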