Open MPI User's Mailing List Archives

From: Scott Atchley (atchley_at_[hidden])
Date: 2006-12-05 19:40:33


On Dec 5, 2006, at 6:15 PM, Galen M. Shipman wrote:

> Brock Palen wrote:
>
>> I was asked by Myricom to run a test using the data reliability (dr)
>> pml. I ran it like so:
>>
>> $ mpirun --mca pml dr -np 4 ./xhpl
>>
>> Is this the right format for running the dr pml?
>>
> This should be fine, yes.
> I can run HPL on our test cluster to see if something is wrong
> with DR.
> I assume you are using GM and not MX?

He is running GM.

> Can you try running a simple ping-pong to make sure the basics work
> on this platform?
> If you have access to it, running the Intel test suite would also be
> helpful in determining if/where we have an issue.

He has run IMB compiled with -DCHECK and it did not report any errors.

>> Are there any gotchas with using the dr pml?
>> Also, if the dr pml is finding errors and is resending data, can I
>> have it tell me when that happens? Like a verbose mode?
>>
> Unfortunately, you will need to update the source and recompile. Try
> updating this file:
>
>   topdir/ompi/mca/pml/dr/pml_dr.h:245:#define MCA_PML_DR_DEBUG_LEVEL -1
>
> and change MCA_PML_DR_DEBUG_LEVEL to 0.
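
The header edit described above could be scripted roughly as follows. This is only a sketch: set_dr_debug_level is a hypothetical helper, not part of Open MPI, and it assumes the header holds a single-line #define of MCA_PML_DR_DEBUG_LEVEL as quoted above. It writes to a temporary file and moves it into place rather than relying on in-place sed options, which differ between GNU and BSD (OS X) sed.

```sh
#!/bin/sh
# Hypothetical helper (not part of Open MPI): rewrite the value of
# MCA_PML_DR_DEBUG_LEVEL in the given header file.
# Usage: set_dr_debug_level <path-to-pml_dr.h> <level>
set_dr_debug_level() {
    header=$1
    level=$2
    tmp="$header.tmp.$$"
    # Keep "#define MCA_PML_DR_DEBUG_LEVEL" and its spacing, replace
    # only the value that follows it on the same line.
    sed "s/^\(#define[[:space:]][[:space:]]*MCA_PML_DR_DEBUG_LEVEL[[:space:]][[:space:]]*\).*\$/\1$level/" \
        "$header" > "$tmp" && mv "$tmp" "$header"
}
```

For example, set_dr_debug_level topdir/ompi/mca/pml/dr/pml_dr.h 0 (topdir being the top of the Open MPI source tree, as above), followed by a recompile of Open MPI, since the level is a compile-time #define.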

The problem is that, when running HPL, he sees failed residuals. When
running HPL under MPICH-GM, he does not.

I have tried running HPCC (HPL plus other benchmarks) using OMPI with
GM on 32-bit Xeons and 64-bit Opterons. I do not see any failed
residuals. I am trying to get access to a couple of OSX machines to
replicate Brock's setup.

Scott