Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Network connection check
From: jody (jody.xha_at_[hidden])
Date: 2009-07-23 08:39:55


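Maybe you could call system() to ping the other machine:

    char sCommand[512];
    // build the command string; snprintf cannot write past the end of sCommand
    snprintf(sCommand, sizeof sCommand, "ping -c %d -q %s > /dev/null",
             numPings, sHostName);
    // execute the command through the shell
    int iResult = system(sCommand);

If the ping was successful, iResult will have the value 0; a nonzero value
means the ping failed or the command could not be run.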
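
For what it's worth, here is a slightly more defensive sketch of the same idea, in case the host name comes from a config file or user input. The function names (host_is_safe, build_ping_command, host_reachable) are made up for illustration and are not part of Open MPI. It assumes a POSIX system and a ping that accepts -c and -q and exits with status 0 when it gets a reply.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: accept only letters, digits, '.' and '-' so the
 * host name cannot smuggle shell metacharacters or options into the command. */
int host_is_safe(const char *host)
{
    if (host == NULL || host[0] == '\0' || host[0] == '-')
        return 0;
    for (const char *p = host; *p != '\0'; ++p) {
        if (!isalnum((unsigned char)*p) && *p != '.' && *p != '-')
            return 0;
    }
    return 1;
}

/* Hypothetical helper: write the ping command into buf.
 * Returns 0 on success, -1 if the input is rejected or buf is too small. */
int build_ping_command(char *buf, size_t len, int num_pings, const char *host)
{
    assert(buf != NULL);
    if (num_pings < 1 || !host_is_safe(host))
        return -1;
    int n = snprintf(buf, len, "ping -c %d -q %s > /dev/null 2>&1",
                     num_pings, host);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns 1 if ping exited with status 0, 0 if it exited with a failure
 * status, and -1 if the command could not be built or run at all. */
int host_reachable(const char *host, int num_pings)
{
    char cmd[512];
    if (build_ping_command(cmd, sizeof cmd, num_pings, host) != 0)
        return -1;

    int status = system(cmd);           /* wait status, not the exit code */
    if (status == -1 || !WIFEXITED(status))
        return -1;

    int code = WEXITSTATUS(status);
    if (code == 127)                    /* the shell could not run ping */
        return -1;
    return code == 0 ? 1 : 0;
}
```

In the master you would then call something like host_reachable(sHostName, 3) after the timeout and only go on to check orted and the slave process if it returns 1. Keep in mind that a failed ping does not prove the machine is down, since ICMP may be filtered by a firewall.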

Jody

On Thu, Jul 23, 2009 at 1:36 PM, vipin kumar <vipinkumar41_at_[hidden]> wrote:
>
>
> On Thu, Jul 23, 2009 at 3:03 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> It depends on which network fails. If you lose all TCP connectivity, Open
>> MPI should abort the job as the out-of-band system will detect the loss of
>> connection. If you only lose the MPI connection (whether TCP or some other
>> interconnect), then I believe the system will eventually generate an error
>> after it retries sending the message a specified number of times, though it
>> may not abort.
>
> Thank you Ralph,
>
> From your reply I realized that my earlier question did not describe the
> problem properly.
>
> I can't use blocking communication routines in my main program
> ("masterprocess") because any type of network failure (whether physical
> connectivity, TCP connectivity, or the MPI connection, as you described) may
> occur. So I am using non-blocking point-to-point communication routines and
> TEST later for completion of each Request. Once I enter a TEST loop, I test
> for Request completion until TIMEOUT. If the TIMEOUT occurs, I will first
> check whether:
>
>  1:  the slave machine is reachable (how do I do that, given that I have the
> IP address and host name of the slave machine?)
>
>  2:  if it is reachable, the programs (orted and "slaveprocess") are still
> alive.
>
> I don't want to abort my master process in case 1, in the hope that the
> network connection will come back up later. Fortunately, Open MPI doesn't
> abort any process; both processes can run independently without communicating.
>
>
> Thanks and Regards,
> --
> Vipin K.
> Research Engineer,
> C-DOTB, India