Open MPI Development Mailing List Archives

From: Scott Atchley (atchley_at_[hidden])
Date: 2007-08-31 13:57:57


Terry,

Are you testing on Linux? If so, which kernel?

See the iperf patch for kernel 2.6.21, which addresses the issue they
had with usleep(0):

http://dast.nlanr.net/Projects/Iperf2.0/patch-iperf-linux-2.6.21.txt
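
The pattern at issue is roughly the following kind of wait loop (an
illustrative sketch in C, not the actual iperf or Open MPI code); how
much CPU the idle call actually gives up differs between kernels and
operating systems:

    #include <sched.h>
    #include <unistd.h>

    /* Illustrative busy-wait: poll until a condition is met, handing
     * the CPU back to the scheduler between polls.  The choice of idle
     * call is the whole question here. */
    static void wait_for(volatile int *flag)
    {
        while (!*flag) {
            usleep(0);          /* may return almost immediately       */
            /* sched_yield();      alternative: yield to other threads */
            /* usleep(500);        alternative: back off ~0.5 ms       */
        }
    }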

Scott

On Aug 31, 2007, at 1:36 PM, Terry D. Dontje wrote:

> OK, I have an update on this issue. I believe sched_yield is
> implemented differently on Linux and Solaris. If I change the
> sched_yield in opal_progress to usleep(500), my program completes
> quite quickly. I have sent a few questions to a Solaris engineer and
> hopefully will get some useful information.
>
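
The change described above amounts to something like the following (a
rough sketch only, not the actual opal_progress source; poll_events()
is a hypothetical stand-in for the real event-polling code):

    #include <sched.h>
    #include <unistd.h>

    /* Sketch of a progress loop; the real opal_progress() is far more
     * involved.  The point is only where the idle action sits and what
     * it is swapped for. */
    static void progress_until_something_completes(int (*poll_events)(void))
    {
        while (poll_events() == 0) {   /* nothing completed this pass */
    #ifdef BACKOFF_WITH_USLEEP
            usleep(500);               /* back off ~0.5 ms before repolling */
    #else
            sched_yield();             /* original behavior: just yield */
    #endif
        }
    }
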
> That being said, CT-6's implementation also used yield calls (note
> that this is what sched_yield reduces to on Solaris), and we did not
> see the same degradation issue as with Open MPI. I believe the reason
> is that CT-6's SM implementation does not call CT-6's version of
> progress recursively and force all the unexpected messages to be read
> in before continuing. CT-6 also has natural flow control in its
> implementation (i.e., it has a fixed-size FIFO for eager messages).
>
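
A fixed-size eager FIFO gives that kind of back-pressure essentially
for free: once the queue is full, the sender has to let progress drain
it before it can post anything else. A minimal sketch of the idea (not
the CT-6 or Open MPI data structure):

    #include <stddef.h>
    #include <stdbool.h>

    #define EAGER_SLOTS 64            /* fixed capacity: the flow-control limit */

    struct eager_fifo {
        void  *slot[EAGER_SLOTS];
        size_t head, tail;            /* head == tail means empty */
    };

    /* Returns false when the FIFO is full; the caller must make progress
     * (let the receiver drain the queue) before retrying, which is the
     * natural flow control a fixed-size queue provides. */
    static bool fifo_push(struct eager_fifo *f, void *msg)
    {
        size_t next = (f->head + 1) % EAGER_SLOTS;
        if (next == f->tail)
            return false;             /* full: sender must back off */
        f->slot[f->head] = msg;
        f->head = next;
        return true;
    }
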
> I believe both of these characteristics keep CT-6 from being
> completely killed by the yield differences.
>
> --td
>
>
> Li-Ta Lo wrote:
>
>> On Thu, 2007-08-30 at 12:45 -0400, Terry.Dontje_at_[hidden] wrote:
>>
>>> Li-Ta Lo wrote:
>>>
>>>> On Thu, 2007-08-30 at 12:25 -0400, Terry.Dontje_at_[hidden] wrote:
>>>>
>>>>> Li-Ta Lo wrote:
>>>>>
>>>>>> On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote:
>>>>>>
>>>>>>> hmmm, interesting since my version doesn't abort at all.
>>>>>>
>>>>>> Some problem with the Fortran compiler/language binding? My C
>>>>>> translation doesn't have any problem.
>>>>>>
>>>>>> [ollie_at_exponential ~]$ mpirun -np 4 a.out 10
>>>>>> Target duration (seconds): 10.000000, #of msgs: 50331, usec per msg: 198.684707
>>>>>>
>>>>> Did you oversubscribe? I found np=10 on an 8-core system clogged
>>>>> things up sufficiently.
>>>>>
>>>> Yeah, I used np=10 on a 2-processor, 2-hyper-thread system (4
>>>> hardware threads total).
>>>>
>>> Is this using Linux?
>>>
>> Yes.
>>
>> Ollie
>>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel