
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] (no subject)
From: Alberto Giannetti (albertogiannetti_at_[hidden])
Date: 2008-04-23 18:39:03


No oversubscription. I did not recompile OMPI or install it from an RPM.

On Apr 23, 2008, at 3:49 PM, Danesh Daroui wrote:
> Do you really mean that Open MPI uses a busy loop to handle incoming
> calls? That seems wrong to me, since spinning is a very inefficient
> technique for this purpose. Why don't you use blocking and/or signals
> instead? I think the priority of this task should be very high,
> because polling just wastes system resources. On the other hand, what
> Alberto reports doesn't seem reasonable to me.
>
> Alberto,
> - Are you oversubscribing one node, i.e. running your code on a
> single-processor machine while pretending to have four CPUs?
>
> - Did you compile Open MPI yourself, or install it from an RPM?
>
> The receiving process shouldn't be that expensive.
>
> Regards,
>
> Danesh
>
>
>
> Jeff Squyres skrev:
>> Because on-node communication typically uses shared memory, we
>> currently have to poll. Additionally, when using mixed on/off-node
>> communication, we have to alternate between polling shared memory and
>> polling the network.
>>
>> We also actively poll because it's the best way to lower latency.
>> MPI implementations are almost always judged first on their latency,
>> not [usually] on their CPU utilization. Going to sleep in a blocking
>> system call would definitely hurt latency.
>>
>> We have plans to implement the "spin for a while, then block"
>> technique (as used in other MPIs and middleware layers), but it
>> hasn't been a high priority.
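>>
>> To illustrate the idea, here is a minimal spin-then-block sketch on a
>> single socket descriptor (illustrative only, not Open MPI's actual
>> progress engine):
>>
>> #include <sys/ioctl.h>
>> #include <sys/select.h>
>>
>> /* Spin briefly on a cheap readiness check, then fall back to a
>>    blocking select() so the OS can put the process to sleep. */
>> static void spin_then_block(int fd)
>> {
>>     int i, nbytes = 0;
>>     fd_set rfds;
>>
>>     /* Phase 1: spin -- lowest latency if data arrives soon. */
>>     for (i = 0; i < 100000; ++i) {
>>         ioctl(fd, FIONREAD, &nbytes);   /* bytes waiting on fd? */
>>         if (nbytes > 0)
>>             return;                     /* arrived while spinning */
>>     }
>>
>>     /* Phase 2: nothing yet -- sleep in the kernel until readable. */
>>     FD_ZERO(&rfds);
>>     FD_SET(fd, &rfds);
>>     select(fd + 1, &rfds, NULL, NULL, NULL);
>> }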
>>
>>
>> On Apr 23, 2008, at 12:19 PM, Alberto Giannetti wrote:
>>
>>
>>> Thanks Torje. I wonder what the benefit is of looping on the
>>> incoming message-queue socket rather than using blocking system
>>> calls like read() or select().
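>>>
>>> Something along these lines, say, where fd would be a descriptor for
>>> the incoming message-queue socket:
>>>
>>> #include <sys/select.h>
>>>
>>> fd_set rfds;
>>> FD_ZERO(&rfds);
>>> FD_SET(fd, &rfds);
>>> /* The kernel sleeps the process (near-zero CPU) until the socket
>>>    becomes readable; the wake-up costs some extra latency. */
>>> select(fd + 1, &rfds, NULL, NULL, NULL);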
>>>
>>> On Apr 23, 2008, at 12:10 PM, Torje Henriksen wrote:
>>>
>>>> Hi Alberto,
>>>>
>>>> The blocked processes are in fact spin-waiting. While they have
>>>> nothing better to do (waiting for that message), they check their
>>>> incoming message queues in a loop.
>>>>
>>>> So the MPI_Recv() operation is blocking, but that doesn't mean the
>>>> processes are blocked by the OS scheduler.
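>>>>
>>>> Roughly, the wait looks like this (queue_has_message() being just an
>>>> illustrative name, not an actual Open MPI call):
>>>>
>>>> /* Spin-wait: the process stays runnable and burns CPU even though
>>>>    MPI_Recv() appears "blocked" to the caller. */
>>>> while (!queue_has_message(my_queue))
>>>>     ;  /* no sleep, no yield -- just check again immediately */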
>>>>
>>>>
>>>> I hope that made some sense :)
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Torje
>>>>
>>>>
>>>> On Apr 23, 2008, at 5:34 PM, Alberto Giannetti wrote:
>>>>
>>>>
>>>>> I have a simple MPI program that sends data to the rank-0
>>>>> processor. The communication works well, but when I run the program
>>>>> on more than 2 processors (-np 4), the extra receivers waiting for
>>>>> data run at over 90% CPU load. I understand MPI_Recv() is a blocking
>>>>> operation, but why does it consume so much CPU compared to a regular
>>>>> system read()?
>>>>>
>>>>>
>>>>>
>>>>> #include <sys/types.h>
>>>>> #include <unistd.h>
>>>>> #include <stdio.h>
>>>>> #include <stdlib.h>
>>>>> #include <mpi.h>
>>>>>
>>>>> void process_sender(int);
>>>>> void process_receiver(int);
>>>>>
>>>>>
>>>>> int main(int argc, char* argv[])
>>>>> {
>>>>>   int rank;
>>>>>
>>>>>   MPI_Init(&argc, &argv);
>>>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>>
>>>>>   printf("Processor %d (%d) initialized\n", rank, getpid());
>>>>>
>>>>>   /* Rank 1 sends; every other rank blocks in MPI_Recv(). */
>>>>>   if( rank == 1 )
>>>>>     process_sender(rank);
>>>>>   else
>>>>>     process_receiver(rank);
>>>>>
>>>>>   MPI_Finalize();
>>>>>   return 0;
>>>>> }
>>>>>
>>>>>
>>>>> void process_sender(int rank)
>>>>> {
>>>>>   int i;
>>>>>   float data[100];
>>>>>
>>>>>   printf("Processor %d initializing data...\n", rank);
>>>>>   for( i = 0; i < 100; ++i )
>>>>>     data[i] = i;
>>>>>
>>>>>   /* Send all 100 floats to rank 0 with tag 55. */
>>>>>   printf("Processor %d sending data...\n", rank);
>>>>>   MPI_Send(data, 100, MPI_FLOAT, 0, 55, MPI_COMM_WORLD);
>>>>>   printf("Processor %d sent data\n", rank);
>>>>> }
>>>>>
>>>>>
>>>>> void process_receiver(int rank)
>>>>> {
>>>>>   int count;
>>>>>   float value[200];
>>>>>   MPI_Status status;
>>>>>
>>>>>   /* Blocks until a matching message arrives; with -np 4, ranks 2
>>>>>      and 3 never get a message and wait here indefinitely. */
>>>>>   printf("Processor %d waiting for data...\n", rank);
>>>>>   MPI_Recv(value, 200, MPI_FLOAT, MPI_ANY_SOURCE, 55,
>>>>>            MPI_COMM_WORLD, &status);
>>>>>   printf("Processor %d got data from processor %d\n", rank,
>>>>>          status.MPI_SOURCE);
>>>>>   MPI_Get_count(&status, MPI_FLOAT, &count);
>>>>>   printf("Processor %d got %d elements\n", rank, count);
>>>>> }
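>>>>>
>>>>> For reference, this is built and launched along these lines (the
>>>>> binary name is arbitrary):
>>>>>
>>>>>   mpicc recv_test.c -o recv_test
>>>>>   mpirun -np 4 ./recv_test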
>>>>>