Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Calling MPI_Test() too many times results in a time spike
From: Ioannis Papadopoulos (giannis.papadopoulos_at_[hidden])
Date: 2010-11-29 11:40:10


Hi,

Has anyone observed similar behaviour? Is it something that I'll have to
deal with in my code, or does it indeed qualify as an issue to be looked
into?

Thanks

Yiannis Papadopoulos wrote:
> It is a quad-core machine (Intel Core2 Quad CPU Q9550, 64-bit).
> The Open MPI installation is the one that comes with Fedora 13.
>
> I tested it on a cluster of 32-bit Intel Xeon machines with Gigabit
> Ethernet and Open MPI 1.3.3; the results are:
>
> // with no timeout
> Total time: 1.00309 secs
> Estimated calls: 1116537
> MPI_Test calls: 395488
> Average: 0.0000018129 secs
> Max: 0.0002990606 secs [91608]
> Min: 0.0000008807 secs [694]
> Successful MPI_Test(): 0.0000421085 secs
>
> // with timeout
> Total time: 1.00342 secs
> Estimated calls: 1140521
> MPI_Test calls: 1004
> Average: 0.0000018511 secs
> Max: 0.0000100477 secs [1003]
> Min: 0.0000008813 secs [729]
> Successful MPI_Test(): 0.0000100477 secs
>
> I have seen similar behavior using IBM MPI on Power5+/AIX.
>
> On Fri, Nov 12, 2010 at 11:00 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
>> What kind of network did you do these tests on?
>>
>> Were both processes on the same machine or different machines?
>>
>>
>> On Nov 10, 2010, at 12:29 AM, Yiannis Papadopoulos wrote:
>>
>>
>>> Hello,
>>>
>>> I am using Open MPI 1.4.1. I have a small test case that calls
>>> MPI_Test() a very large number of times in a tight loop, and when it
>>> does I see one or two random time spikes. On the other hand, if I
>>> throttle the MPI_Test() calls with a timeout, the problem disappears.
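
The receiver's polling loop boils down to something like the sketch below
(the complete test case is in the PS further down); wait_polling is just an
illustrative name, and req is assumed to come from an earlier MPI_Irecv():

    #include <mpi.h>

    // Illustrative sketch only: poll MPI_Test() on every iteration, or,
    // when TIMEOUT is defined, only once TIMEOUT seconds have elapsed
    // since the previous poll.
    void wait_polling(MPI_Request& req)
    {
      int flag = 0;
    #ifdef TIMEOUT
      double last = MPI_Wtime();
    #endif
      while (!flag) {
    #ifdef TIMEOUT
        double now = MPI_Wtime();
        if (now - last < TIMEOUT) continue;   // throttle: skip this poll
        last = now;
    #endif
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
      }
    }
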
>>>
>>> For example, with no timeout, the results I'm getting are:
>>> Total time: 1.00008 secs
>>> MPI_Test calls: 5541425
>>> Average: 0.0000001234 secs
>>> Max: 0.0002560616 secs [4374247]
>>> Min: 0.0000000000 secs [1]
>>> Successful MPI_Test(): 0.0000090599 secs
>>>
>>> With a timeout of 0.0001 secs (i.e., successive MPI_Test() calls are simply delayed a bit):
>>> Total time: 1.00009 secs
>>> MPI_Test calls: 9946
>>> Average: 0.0000000018 secs
>>> Max: 0.0000011921 secs [98]
>>> Min: 0.0000000000 secs [1]
>>> Successful MPI_Test(): 0.0000011921 secs
>>>
>>> With a timeout of 0.01secs:
>>> Total time: 1.0001 secs
>>> MPI_Test calls: 101
>>> Average: 0.0000001936 secs
>>> Max: 0.0000028610 secs [100]
>>> Min: 0.0000000000 secs [2]
>>> Success: 0.0000028610 secs
>>>
>>>
>>> Things to notice:
>>> 1) the total time is similar
>>> 2) the number of MPI_Test() calls varies drastically
>>> 3) the average MPI_Test() call time is significantly less when a
>>> timeout is introduced
>>> 4) the call that takes the most time is not the last (successful) one,
>>> as one would expect, except when the timeout is fairly large (0.01 secs).
>>> 5) the time of the slowest MPI_Test() call is 2 orders of magnitude
>>> higher than that of the successful MPI_Test() in the non-timeout version.
>>>
>>> Does this qualify as a bug?
>>>
>>> Thanks
>>>
>>>
>>> PS The code I have used follows. I compiled it with
>>>
>>> mpic++ -O3 mpi_test_delay.cc -o mpi_test_delay_no_timeout
>>>
>>> for no timeout and
>>>
>>> mpic++ -O3 -DTIMEOUT=0.01 mpi_test_delay.cc -o mpi_test_delay
>>>
>>> for a timeout of 0.01secs.
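
(The test needs at least two processes, so the binaries would be launched
with something like mpirun -np 2 ./mpi_test_delay_no_timeout and
mpirun -np 2 ./mpi_test_delay, either on a single machine or across two
nodes.)
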
>>>
>>> // mpi_test_delay.cc
>>> #include <iostream>
>>> #include <iomanip>
>>> #include <mpi.h>
>>> #include <unistd.h>
>>> #include <vector>
>>> #include <algorithm>
>>> #include <numeric>
>>>
>>> int main(int argc, char* argv[])
>>> {
>>>   MPI_Init(&argc, &argv);
>>>
>>>   MPI_Comm comm = MPI_COMM_WORLD;
>>>   int rank = MPI_PROC_NULL;
>>>   int size = 0;
>>>   MPI_Comm_rank(comm, &rank);
>>>   MPI_Comm_size(comm, &size);
>>>
>>>   if (size < 2) {
>>>     MPI_Abort(comm, -1);
>>>   }
>>>
>>>   const std::size_t bufsize = 1024;
>>>   char buffer[bufsize];
>>>
>>>   // warm-up
>>>   unsigned int N = 0;
>>>   if (rank == 0) {
>>>     sleep(1);
>>>     MPI_Send(buffer, bufsize, MPI_BYTE, 1, 0, comm);
>>>   }
>>>   else if (rank == 1) {
>>>     MPI_Request req;
>>>     MPI_Irecv(buffer, bufsize, MPI_BYTE, MPI_ANY_SOURCE, 0, comm, &req);
>>>     for (int flag = 0; flag == 0; ++N) {
>>>       MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>>>     }
>>>   }
>>>
>>>   // measurement container
>>>   std::vector<double> time_res;
>>>   time_res.reserve(N * 1.1);
>>>
>>>   MPI_Barrier(comm);
>>>
>>>   // benchmark MPI_Test
>>>   double total_time = MPI_Wtime();
>>>
>>>   if (rank == 0) {
>>>     sleep(1);
>>>     MPI_Send(buffer, bufsize, MPI_BYTE, 1, 0, comm);
>>>   }
>>>   else if (rank == 1) {
>>>     MPI_Request req;
>>>     MPI_Irecv(buffer, bufsize, MPI_BYTE, MPI_ANY_SOURCE, 0, comm, &req);
>>>
>>> #ifdef TIMEOUT
>>>     double timeout = 0.0;
>>> #endif
>>>     for (int flag = 0; flag == 0; ) {
>>> #ifdef TIMEOUT
>>>       double ntimeout = MPI_Wtime();
>>>       if ((ntimeout - timeout) < TIMEOUT) continue;
>>>       timeout = ntimeout;
>>> #endif
>>>       double time = MPI_Wtime();
>>>       MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>>>       time = MPI_Wtime() - time;
>>>       time_res.push_back(time);
>>>     }
>>>   }
>>>   total_time = MPI_Wtime() - total_time;
>>>
>>>   MPI_Barrier(comm);
>>>
>>>   // print total time for execution, number of MPI_Test calls,
>>>   // average, max and min time for MPI_Test and how much was the final
>>>   // MPI_Test
>>>   if (rank == 1) {
>>>     double average = std::accumulate(time_res.begin(), time_res.end(),
>>>                                      0.0) / time_res.size();
>>>     std::vector<double>::iterator max =
>>>       std::max_element(time_res.begin(), time_res.end());
>>>     std::vector<double>::iterator min =
>>>       std::min_element(time_res.begin(), time_res.end());
>>>
>>>     std::cout << "Total time: " << total_time << " secs\n"
>>>               << "MPI_Test calls: " << time_res.size() << '\n'
>>>               << std::fixed << std::setprecision(10)
>>>               << "Average: " << average << " secs\n"
>>>               << "Max: " << *max << " secs ["
>>>               << std::distance(time_res.begin(), max) << "]\n"
>>>               << "Min: " << *min << " secs ["
>>>               << std::distance(time_res.begin(), min) << "]\n"
>>>               << "Success: " << time_res[time_res.size()-1] << " secs"
>>>               << std::endl;
>>>   }
>>>
>>>   MPI_Finalize();
>>>
>>>   return 0;
>>> }
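
Another way to throttle the polling, instead of checking MPI_Wtime() on every
iteration, would be to sleep briefly between MPI_Test() calls so that the
polling rank also yields the CPU. A rough sketch (not the code the numbers
above were measured with, and the 100 microsecond interval is just a
placeholder):

    #include <mpi.h>
    #include <unistd.h>   // usleep

    // Illustrative alternative: back off with a short sleep between polls.
    void wait_with_backoff(MPI_Request& req, useconds_t poll_interval_us = 100)
    {
      int flag = 0;
      MPI_Test(&req, &flag, MPI_STATUS_IGNORE);   // try once immediately
      while (!flag) {
        usleep(poll_interval_us);                 // give up the CPU between polls
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
      }
    }
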
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>

-- 
Ioannis Papadopoulos                       Voice: +1 979 862 9214
Dept. of Computer Science and Engineering  Email: ipapadop_at_[hidden]
Texas A&M University
3112 TAMU
College Station, TX 77843-3112