Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] oshmem test suite errors
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2014-02-20 10:44:03


Yes, I've added them to my Cisco MTT ini files in the ompi-svn repo. Look in cisco/mtt/usnic/usnic-trunk.ini and usnic-v1.7.ini.

All relevant sections have "oshmem" in them.

Most are copied from the Mellanox examples, but I made a few tweaks/improvements here and there. I also expect to adjust the timeouts in some of the sections for the longer-running tests (at np=32 and possibly 64) once we have a few MTT oshmem runs done.
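
To illustrate why per-test timeouts matter for tests that hang, here is a minimal Python sketch of a timeout wrapper. It is not MTT's actual mechanism; the helper name `run_with_timeout`, the result labels, and the stand-in commands are all assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and classify the outcome.

    Returns "pass" on exit code 0, "fail" on any other exit code
    (including death by signal, e.g. a segv), and "timeout" if the
    command is still running after timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the direct child before raising.
        return "timeout"
    return "pass" if result.returncode == 0 else "fail"


if __name__ == "__main__":
    # Stand-in commands (hypothetical); a real run would pass the
    # launcher command line for a test such as examples/ptp.x 10 10.
    quick = [sys.executable, "-c", "pass"]
    hang = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(quick, 5))
    print(run_with_timeout(hang, 1))
```

One caveat with this sketch: only the direct child process is killed on timeout, so a parallel launcher may need extra cleanup of its remote processes.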

On Feb 20, 2014, at 10:34 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> Could you send along the relevant mtt .ini sections?
>
>
> On Feb 20, 2014, at 7:10 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>
>> For all of these, I'm using the openshmem test suite that is now committed to the ompi-svn SVN repo. I don't know if the errors are with the tests or with oshmem itself.
>>
>> 1. I'm running the oshmem test suite at 32 processes across 2 16-core servers. I'm seeing a segv in "examples/shmem_2dheat.x 10 10". It seems to run fine at lower np values such as 2, 4, and 8; I didn't try to determine where the crossover to badness occurs.
>>
>> 2. "examples/adjacent_32bit_amo.x 10 10" seems to hang with both tcp and usnic BTLs, even when running at np=2 (I let it run for several minutes before killing it).
>>
>> 3. Ditto for "examples/ptp.x 10 10".
>>
>> 4. "examples/shmem_matrix.x 10 10" seems to run fine at np=32 on usnic, but hangs with TCP (i.e., I let it run for 8+ minutes before killing it -- perhaps it would have finished eventually?).
>>
>> ...there are more results (more timeouts and more failures), but they're not yet complete, and I've got to keep working on my own features for v1.7.5, so I need to move to other things right now.
>>
>> I think I have oshmem running well enough to add these to Cisco's nightly MTT runs now, so the results will start showing up there without needing my manual attention.
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/