Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] (no subject)
From: Alberto Giannetti (albertogiannetti_at_[hidden])
Date: 2008-04-23 14:55:10


I am running the test program on Darwin 8.11.1 on a 1.83 GHz Intel
dual-core machine. My Open MPI install is 1.2.4.
I can't see any allocated shared memory segments on my system
(ipcs -m), although the receiver opens a couple of TCP sockets in
listening mode. It looks like my installation does not use shared
memory. Is this a configuration issue?

> a.out 5628 albertogiannetti 3u unix R,W,NB 0x380b198 0t0 ->0x41ced48
> a.out 5628 albertogiannetti 4u unix R,W 0x41ced48 0t0 ->0x380b198
> a.out 5628 albertogiannetti 5u IPv4 R,W,NB 0x3d4d920 0t0 TCP *:50969 (LISTEN)
> a.out 5628 albertogiannetti 6u IPv4 R,W,NB 0x3e62394 0t0 TCP 192.168.0.10:50970->192.168.0.10:50962 (ESTABLISHED)
> a.out 5628 albertogiannetti 7u IPv4 R,W,NB 0x422d228 0t0 TCP *:50973 (LISTEN)
> a.out 5628 albertogiannetti 8u IPv4 R,W,NB 0x2dfd394 0t0 TCP 192.168.0.10:50969->192.168.0.10:50975 (ESTABLISHED)

On Apr 23, 2008, at 12:34 PM, Jeff Squyres wrote:
> Because on-node communication typically uses shared memory, we
> currently have to poll. Additionally, when using mixed on/off-node
> communication, we have to alternate between polling shared memory and
> polling the network.
>
> Additionally, we actively poll because it's the best way to lower
> latency. MPI implementations are almost always first judged on their
> latency, not [usually] their CPU utilization. Going to sleep in a
> blocking system call will definitely negatively impact latency.
>
> We have plans for implementing the "spin for a while and then block"
> technique (as has been used in other MPI implementations and
> middleware layers), but it hasn't been a high priority.
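
The "spin for a while and then block" idea mentioned above can be sketched in plain C on an ordinary file descriptor. This is only an illustration of the general technique, not Open MPI code: the helper name spin_then_block and the spin_limit parameter are made up for this example, and poll() stands in for whatever progress check a real MPI library would perform.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any MPI library): wait until fd is
 * readable.
 *   Phase 1: check with a zero timeout up to spin_limit times. This burns
 *            CPU but notices new data with very low latency.
 *   Phase 2: give up spinning and sleep inside poll() until data arrives.
 * Returns 1 if data was seen while spinning, 0 if the call had to fall
 * back to the blocking wait, and -1 on error.
 */
int spin_then_block(int fd, long spin_limit)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    long i;

    assert(spin_limit >= 0);

    /* Phase 1: busy polling, never sleeps. */
    for (i = 0; i < spin_limit; ++i) {
        int rc = poll(&pfd, 1, 0);
        if (rc > 0)
            return 1;
        if (rc < 0 && errno != EINTR)
            return -1;
    }

    /* Phase 2: block in the kernel; the process uses no CPU while waiting. */
    for (;;) {
        int rc = poll(&pfd, 1, -1);
        if (rc > 0)
            return 0;
        if (rc < 0 && errno != EINTR)
            return -1;
    }
}
```

A larger spin_limit favors latency, a smaller one favors CPU utilization; a limit of 0 degenerates into a purely blocking wait.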
>
>
> On Apr 23, 2008, at 12:19 PM, Alberto Giannetti wrote:
>
>> Thanks Torje. I wonder what the benefit is of looping on the incoming
>> message-queue socket rather than using blocking system I/O calls like
>> read() or select().
>>
>> On Apr 23, 2008, at 12:10 PM, Torje Henriksen wrote:
>>> Hi Alberto,
>>>
>>> The blocked processes are in fact spin-waiting. While they don't
>>> have anything better to do (waiting for that message), they will
>>> check their incoming message-queues in a loop.
>>>
>>> So the MPI_Recv()-operation is blocking, but it doesn't mean that
>>> the processes are blocked by the OS scheduler.
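
For comparison with the spin-wait described above, here is a minimal sketch of a wait loop that sleeps between checks, trading some latency for lower CPU load. It is plain C with no MPI calls: wait_with_sleep, ready_fn and countdown_ready are names invented for this example. In an MPI program the check would typically be a thin wrapper around a non-blocking call such as MPI_Iprobe() or MPI_Test(); that wrapper is not shown here.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <time.h>

/* Hypothetical callback type: returns nonzero once the awaited event
 * (for example, an incoming message) is available. */
typedef int (*ready_fn)(void *arg);

/*
 * Hypothetical helper: call ready() repeatedly, sleeping sleep_ns
 * nanoseconds after every unsuccessful check instead of spinning.
 * Returns the number of sleeps taken before ready() succeeded.
 */
long wait_with_sleep(ready_fn ready, void *arg, long sleep_ns)
{
    struct timespec pause_time;
    long naps = 0;

    /* nanosleep() requires the nanosecond field to be below one second. */
    assert(sleep_ns >= 0 && sleep_ns < 1000000000L);
    pause_time.tv_sec = 0;
    pause_time.tv_nsec = sleep_ns;

    while (!ready(arg)) {
        nanosleep(&pause_time, NULL);  /* yield the CPU between checks */
        ++naps;
    }
    return naps;
}

/* Stand-in check for demonstration only: reports "not ready" until the
 * counter it points to has been decremented to zero. */
int countdown_ready(void *arg)
{
    int *remaining = arg;

    if (*remaining == 0)
        return 1;
    --*remaining;
    return 0;
}
```

The sleep interval bounds the extra latency added to each message, so it has to be chosen against how quickly the receiver needs to react.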
>>>
>>>
>>> I hope that made some sense :)
>>>
>>>
>>> Best regards,
>>>
>>> Torje
>>>
>>>
>>> On Apr 23, 2008, at 5:34 PM, Alberto Giannetti wrote:
>>>
>>>> I have a simple MPI program that sends data to processor rank 0.
>>>> The communication works well, but when I run the program on more
>>>> than 2 processors (-np 4) the extra receivers waiting for data run
>>>> at > 90% CPU load. I understand MPI_Recv() is a blocking operation,
>>>> but why does it consume so much CPU compared to a regular system
>>>> read()?
>>>>
>>>>
>>>>
>>>> #include <sys/types.h>
>>>> #include <unistd.h>
>>>> #include <stdio.h>
>>>> #include <stdlib.h>
>>>> #include <mpi.h>
>>>>
>>>> void process_sender(int);
>>>> void process_receiver(int);
>>>>
>>>>
>>>> int main(int argc, char* argv[])
>>>> {
>>>>     int rank;
>>>>
>>>>     MPI_Init(&argc, &argv);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>
>>>>     printf("Processor %d (%d) initialized\n", rank, (int)getpid());
>>>>
>>>>     /* Rank 1 is the only sender; every other rank waits in MPI_Recv(). */
>>>>     if( rank == 1 )
>>>>         process_sender(rank);
>>>>     else
>>>>         process_receiver(rank);
>>>>
>>>>     MPI_Finalize();
>>>>     return 0;
>>>> }
>>>>
>>>>
>>>> void process_sender(int rank)
>>>> {
>>>>     int i, size;
>>>>     float data[100];
>>>>
>>>>     printf("Processor %d initializing data...\n", rank);
>>>>     for( i = 0; i < 100; ++i )
>>>>         data[i] = (float)i;
>>>>
>>>>     /* The communicator size is queried but not used below. */
>>>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>
>>>>     /* Send 100 floats to rank 0 with tag 55; no other rank is sent data. */
>>>>     printf("Processor %d sending data...\n", rank);
>>>>     MPI_Send(data, 100, MPI_FLOAT, 0, 55, MPI_COMM_WORLD);
>>>>     printf("Processor %d sent data\n", rank);
>>>> }
>>>>
>>>>
>>>> void process_receiver(int rank)
>>>> {
>>>>     int count;
>>>>     float value[200];
>>>>     MPI_Status status;
>>>>
>>>>     /* Blocks until a message with tag 55 arrives from any rank. */
>>>>     printf("Processor %d waiting for data...\n", rank);
>>>>     MPI_Recv(value, 200, MPI_FLOAT, MPI_ANY_SOURCE, 55,
>>>>              MPI_COMM_WORLD, &status);
>>>>     printf("Processor %d Got data from processor %d\n", rank,
>>>>            status.MPI_SOURCE);
>>>>
>>>>     /* The buffer holds up to 200 floats; ask how many actually arrived. */
>>>>     MPI_Get_count(&status, MPI_FLOAT, &count);
>>>>     printf("Processor %d, Got %d elements\n", rank, count);
>>>> }
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> Cisco Systems
>