MTT Users Mailing List Archives

From: Josh Hursey (jjhursey_at_[hidden])
Date: 2006-08-29 20:57:29


On Aug 29, 2006, at 6:57 PM, Jeff Squyres wrote:

> On 8/29/06 1:55 PM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:
>
>> So I'm having trouble getting tests to complete without timing out in
>> MTT. It seems that the tests time out and hang in MTT, but complete
>> normally outside of MTT.
>
> Does this apply to *all* tests, or only some of the tests (like
> allgather)?

All of the tests: Trivial and ibm. They all time out :(

>
>> Here are some details:
>> Build:
>> Open MPI Trunk (1.3a1r11481)
>>
>> Tests:
>> Trivial
>> ibm
>>
>> BTL:
>> tcp
>> self
>>
>> Nodes/processes:
>> 16 nodes (32 processors) on the Odin Cluster at IU
>>
>>
>> In MTT all of the tests time out:
>> <mtt snip>
>> Running command: mpirun -mca btl tcp,self -np 32 --prefix
>> /san/homedirs/mpiteam/tmp/mtt-scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/1.3a1r11481/install
>> collective/allgather
>> Timeout: 1 - 1156872348 (vs. now: 1156872028)
>> Past timeout! 1156872348 < 1156872349
>> Past timeout! 1156872348 < 1156872349
> [snipped]
>> &or: returning 0
>> String now: 0
>> *** WARNING: Test: allgather, np=32, variant=1: TIMED OUT (failed)
>> </mtt snip>
>>
>> Outside of MTT using the same build the test runs and completes
>> normally:
>> $ cd ~/tmp/mtt-scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/1.3a1r11481/tests/ibm/ibm/
>> $ mpirun -mca btl tcp,self -np 32 --prefix
>> /san/homedirs/mpiteam/tmp/mtt-scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/1.3a1r11481/install
>> collective/allgather
>
> Where is mpirun in your path?
>
> MTT actually drops sourceable files in the top-level install dir
> (i.e., the 1.3a1r11481) that you can source in your shell and set the
> PATH/LD_LIBRARY_PATH for that install. Can you source it and try to
> run again?

Yep, I exported the PATH/LD_LIBRARY_PATH to the one cited in the
--prefix argument before running manually.
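
A minimal sketch of that manual setup, for anyone reproducing it. The prefix is the one from the mpirun command quoted above; the bin/ and lib/ subdirectories are an assumption about the install layout, not something shown in the MTT output:

```shell
# Install prefix copied from the --prefix argument in the quoted command.
PREFIX=/san/homedirs/mpiteam/tmp/mtt-scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/1.3a1r11481/install

# Assumption: binaries live in $PREFIX/bin and libraries in $PREFIX/lib.
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show which mpirun the shell would pick up now, if any.
command -v mpirun || echo "mpirun not found in PATH"
```

If the last line prints a path outside the prefix, a different install is shadowing the one MTT built.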

>
> How long does it take to run manually -- just a few seconds, or a
> long time (that could potentially timeout)?

Just a few seconds (say 5 or so).
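
For scale, the "Timeout:" line in the MTT output quoted above gives both a deadline and a "now" value, so the window MTT allowed can be read off directly. A quick check, assuming both numbers are Unix epoch seconds and that "now" was taken at launch (the variable names here are only illustrative):

```shell
# Values copied from the quoted MTT log line:
#   Timeout: 1 - 1156872348 (vs. now: 1156872028)
start=1156872028
deadline=1156872348

# Window between launch and deadline, in seconds (assumes epoch seconds).
window=$((deadline - start))

# Rough manual runtime reported in this message.
manual_runtime=5

echo "timeout window: ${window}s; manual run: ~${manual_runtime}s"
```

Under those assumptions the window is 320 seconds, far longer than the manual run, so the test is not merely slow under MTT.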

>
> --
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems

----
Josh Hursey
jjhursey_at_[hidden]
http://www.open-mpi.org/