
Open MPI User's Mailing List Archives


From: Prasun Ratn (prasun.r_at_[hidden])
Date: 2007-10-29 04:09:31


There is nothing MPI-specific in your code snippet.
You should try to find out what is different in your
code for node 0. You have mentioned that you have
moved the root node to other nodes, so it's not machine-
specific. You might be setting up the arrays differently
on the different nodes. You should also try other
timers such as clock_gettime, gettimeofday, etc., to see
whether the results are consistent.
Also, are you running multiple threads on the same processor?
Have you tried blocking, etc.?
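As a minimal illustration (not part of the original thread), cross-checking a timed region with both MPI_Wtime() and clock_gettime() might look like the sketch below; the work loop is only a placeholder for the real computation.

/* Sketch: time the same region with MPI_Wtime() and clock_gettime()
 * and compare the two. If they disagree noticeably on some rank,
 * the timing itself (not the computation) is suspect.
 * Build with an MPI compiler, e.g. "mpicc timer_check.c -o timer_check". */
#include <mpi.h>
#include <stdio.h>
#include <time.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    struct timespec ts0, ts1;
    double w0 = MPI_Wtime();
    clock_gettime(CLOCK_MONOTONIC, &ts0);

    /* Placeholder for the real work (e.g. the FDTD update loop). */
    volatile double sum = 0.0;
    for (long i = 0; i < 10000000L; i++)
        sum += (double)i * 1e-9;

    clock_gettime(CLOCK_MONOTONIC, &ts1);
    double w1 = MPI_Wtime();

    double t_wtime = w1 - w0;
    double t_clock = (ts1.tv_sec - ts0.tv_sec)
                   + (ts1.tv_nsec - ts0.tv_nsec) * 1e-9;

    printf("rank %d: MPI_Wtime %.6f s, clock_gettime %.6f s\n",
           rank, t_wtime, t_clock);

    MPI_Finalize();
    return 0;
}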

42aftab_at_[hidden] wrote:
> Hi All,
> Thanks for the help. I don't think I have a cache issue,
> because all the processes have the same amount of data and
> access it in the same fashion. My problem is partially solved. I
> was using 2, 4, 8, 16, 32, and 64 processes for my application
> code. Now I use one extra process, i.e. 3 processes instead of 2
> and 5 instead of 4, and I force process 0 to do nothing but wait
> for the other processes to finish. This way, all processes
> except process 0 now take the same time for the code segment
> that was previously slower on process 0. So I still need help and
> will be thankful for any further suggestions.
>
> regards Aftab Hussain
>
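As a minimal sketch (editorial, not code from the thread) of the workaround described above: run with one extra process and let rank 0 sit out the compute loop, so the remaining ranks can be timed against each other. The work loop is a placeholder.

/* Sketch: rank 0 does nothing but wait; all other ranks do the work. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    if (rank != 0) {
        /* placeholder for the FDTD update loop */
        volatile double s = 0.0;
        for (long i = 0; i < 5000000L; i++)
            s += 1e-9 * (double)i;
    }
    /* rank 0 simply waits here for the workers to finish */
    MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    printf("rank %d: %.6f s\n", rank, t1 - t0);
    MPI_Finalize();
    return 0;
}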
>
> On Fri, October 26, 2007 4:13 pm, Jeff Squyres wrote:
>
>> This is not an MPI problem.
>>
>>
>> Without looking at your code in detail, I'm guessing that you're
>> accessing memory without any regard to memory layout and/or caching. Such
>> an access pattern will therefore thrash your L1 and L2 caches and access
>> memory in a truly horrible pattern that guarantees abysmal performance.
>>
>> Google around for cache effects or check out an operating systems
>> textbook; there's lots of material around about this kind of effect.
>>
>> Good luck.
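To make the cache point above concrete, here is a small editorial sketch (not from the thread): both loops touch the same elements of a row-major array stored in 1-D, but the second one strides through memory and typically runs far slower. Sizes are arbitrary.

/* Sketch: cache-friendly vs cache-unfriendly traversal of a row-major array. */
#include <stdlib.h>
#include <stdio.h>

#define NX 4096
#define NY 4096
#define A(i,j) a[(i)*NY + (j)]   /* row-major, like the macros later in the thread */

int main(void)
{
    double *a = malloc((size_t)NX * NY * sizeof(double));
    if (a == NULL) return 1;

    /* Cache-friendly: the inner index j is contiguous in memory. */
    for (int i = 0; i < NX; i++)
        for (int j = 0; j < NY; j++)
            A(i,j) = i + j;

    /* Cache-unfriendly: consecutive accesses are NY doubles apart,
     * so nearly every access misses in L1/L2. */
    double sum = 0.0;
    for (int j = 0; j < NY; j++)
        for (int i = 0; i < NX; i++)
            sum += A(i,j);

    printf("%f\n", sum);
    free(a);
    return 0;
}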
>>
>>
>>
>>
>> On Oct 26, 2007, at 5:10 AM, 42aftab_at_[hidden] wrote:
>>
>>
>>
>>> Thanks,
>>>
>>>
>>> The array bounds are the same on all the nodes, and the
>>> compute nodes are identical (SunFire V890). I have also
>>> changed the root process to be on different nodes, but the problem
>>> remains the same. I still don't understand the problem very well, and my
>>> progress is at a standstill.
>>>
>>> regards aftab hussain
>>>
>>> Hi,
>>>
>>>
>>> Please ensure that the following things are correct:
>>> 1) The array bounds are equal, i.e. "my_x" and "size_y" have the same
>>> value on all nodes. 2) The nodes are homogeneous. To check that, you could
>>> make a different node the root and run the program again.
>>>
>>> -Neeraj
>>>
>>>
>>>
>>> On Fri, October 26, 2007 10:13 am, 42aftab_at_[hidden] wrote:
>>>
>>>
>>>> Thanks for your reply,
>>>>
>>>>
>>>>
>>>> I used MPI_Wtime for my application, but even then process 0 took
>>>> longer to execute the mentioned code segment. I might be wrong, but
>>>> what I see is that process 0 takes more time to access the array elements
>>>> than the other processes. Now I don't see what to do, because this
>>>> code segment is creating a bottleneck for the timing of my application.
>>>>
>>>>
>>>> Can anyone suggest something in this regard? I would be very thankful.
>>>>
>>>>
>>>>
>>>> regards
>>>>
>>>> Aftab Hussain
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, October 25, 2007 9:38 pm, jody wrote:
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>> I'm not sure if that is the problem,
>>>>> but in MPI applications you should use MPI_Wtime() for time
>>>>> measurements.
>>>>>
>>>>> Jody
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/25/07, 42aftab_at_[hidden] <42aftab_at_[hidden]> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Hi all,
>>>>>> I am a research assistant (RA) at NUST Pakistan in the High
>>>>>> Performance Scientific Computing Lab. I am working on a parallel
>>>>>> implementation of the Finite Difference Time Domain (FDTD) method
>>>>>> using MPI. I am using the Open MPI environment on a cluster of 4
>>>>>> SunFire V890 nodes connected through Myrinet. The problem is that
>>>>>> when I run my code with, say, 4 processes, process 0 takes about
>>>>>> 3 times longer than the other three processes to execute a for
>>>>>> loop, which is the main cause of load imbalance in my code. The
>>>>>> code that is causing the problem is below. It is run by all the
>>>>>> processes simultaneously and independently, and I have timed it
>>>>>> separately from the other segments of code.
>>>>>>
>>>>>> start = gethrtime();
>>>>>> for (m = 1; m < my_x; m++) {
>>>>>>     for (n = 1; n < size_y-1; n++) {
>>>>>>         Ez(m,n) = Ez(m,n) + cezh*((Hy(m,n) - Hy(m-1,n))
>>>>>>                                 - (Hx(m,n) - Hx(m,n-1)));
>>>>>>     }
>>>>>> }
>>>>>> stop = gethrtime();
>>>>>> time = (stop - start);
>>>>>>
>>>>>> In my implementation I used 1-D arrays to realize 2-D arrays. I
>>>>>> have used the following macros for accessing the array elements.
>>>>>>
>>>>>> #define Hx(I,J) hx[(I)*(size_y) + (J)]
>>>>>> #define Hy(I,J) hy[(I)*(size_y) + (J)]
>>>>>> #define Ez(I,J) ez[(I)*(size_y) + (J)]
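For reference, a self-contained editorial sketch of the indexing scheme above; the array sizes, the data type, and the cezh constant are placeholders, and the MPI and timing parts are omitted.

/* Sketch: 1-D arrays indexed as 2-D via the macros from the post. */
#include <stdio.h>
#include <stdlib.h>

#define Hx(I,J) hx[(I)*(size_y) + (J)]
#define Hy(I,J) hy[(I)*(size_y) + (J)]
#define Ez(I,J) ez[(I)*(size_y) + (J)]

int main(void)
{
    const int my_x = 256, size_y = 256;   /* placeholder local sizes */
    const double cezh = 0.5;              /* placeholder update coefficient */

    /* calloc zero-initializes, so every run starts from identical data */
    double *hx = calloc((size_t)my_x * size_y, sizeof(double));
    double *hy = calloc((size_t)my_x * size_y, sizeof(double));
    double *ez = calloc((size_t)my_x * size_y, sizeof(double));
    if (!hx || !hy || !ez) return 1;

    /* The update loop from the post: the inner index n is the
     * fastest-varying one, i.e. unit stride through memory. */
    for (int m = 1; m < my_x; m++)
        for (int n = 1; n < size_y - 1; n++)
            Ez(m,n) = Ez(m,n) + cezh*((Hy(m,n) - Hy(m-1,n))
                                    - (Hx(m,n) - Hx(m,n-1)));

    printf("Ez(1,1) = %f\n", Ez(1,1));
    free(hx); free(hy); free(ez);
    return 0;
}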
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Can anyone tell me what I am doing wrong here? Are the macros
>>>>>> creating the problem, or could it be related to an OS issue? I will
>>>>>> be looking forward to any help, because this problem has stopped my
>>>>>> progress for the last two weeks.
>>>>>>
>>>>>> regards aftab hussain
>>>>>>
>>>>>> RA High Performance Scientific Computing Lab
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> NUST Institute of Information Technology
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> National University of Sciences and Technology Pakistan
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>>
>>
>>
>>
>>