From: Venkat Venkatsubra
Sent: Wednesday, November 18, 2009 12:54 PM
To: 'mtt-users@open-mpi.org'
Subject: MTT trivial tests fails to complete on Centos5.3-x86_64 bit platform with OFED 1.5

 

Hello All,

 

How do I debug this problem? Attached are the developer.ini and trivial.ini files.

I can provide any other information that you need.

 

[root@samples]# cat /etc/issue

CentOS release 5.3 (Final)

Kernel \r on an \m

 

[root@samples]# uname -a

Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 10:41:14 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

 

I am running the OFED-1.5-20091029-0617 daily build.

 

I started the trivial tests using the following command:

 

[root@samples]# cat developer.ini trivial.ini | ../client/mtt --verbose -

….

….

 >> Initializing reporter module: TextFile

 *** Reporter initialized

 *** MPI Get phase starting

 >> MPI Get: [mpi get: my installation]

    Checking for new MPI sources...

    Using MPI in: /usr/mpi/gcc/openmpi-1.3.2/

 *** WARNING: alreadyinstalled_mpi_type was not specified, defaulting to

     "OMPI".

    Got new MPI sources: version 1.3.2

 *** MPI Get phase complete

 *** MPI Install phase starting

 >> MPI Install [mpi install: my installation]

    Installing MPI: [my installation] / [1.3.2] / [my installation]...

 >> Reported to text file

   /root/mtt-svn/samples/MPI_Install-my_installation-my_installation-1.3.2.html

 >> Reported to text file

   /root/mtt-svn/samples/MPI_Install-my_installation-my_installation-1.3.2.txt

    Completed MPI Install successfully

 *** MPI Install phase complete

 *** Test Get phase starting

 >> Test Get: [test get: trivial]

    Checking for new test sources...

    Got new test sources

 *** Test Get phase complete

 *** Test Build phase starting

 >> Test Build [test build: trivial]

    Building for [my installation] / [1.3.2] / [my installation] / [trivial]

 >> Reported to text file

   /root/mtt-svn/samples/Test_Build-trivial-my_installation-1.3.2.html

 >> Reported to text file

   /root/mtt-svn/samples/Test_Build-trivial-my_installation-1.3.2.txt

    Completed test build successfully

 *** Test Build phase complete

 *** Test Run phase starting

 >> Test Run [trivial]

 >> Running with [my installation] / [1.3.2] / [my installation]

    Using MPI Details [open mpi] with MPI Install [my installation]

 

During this stage the test stalls.

After about 10 minutes the test gets killed.

dmesg on the machine where the test is running shows the following output:

 

 ==========

 Dmesg output

 ==========

 Out of memory: Killed process 5346 (gdmgreeter).

 audispd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

 

 Call Trace:

  [<ffffffff800c39dd>] out_of_memory+0x8e/0x2f5

  [<ffffffff8000f2eb>] __alloc_pages+0x245/0x2ce

  [<ffffffff80012a62>] __do_page_cache_readahead+0x95/0x1d9

  [<ffffffff80215932>] sock_readv+0xb7/0xd1

  [<ffffffff80088896>] __wake_up_common+0x3e/0x68

  [<ffffffff80013401>] filemap_nopage+0x148/0x322

  [<ffffffff80008863>] __handle_mm_fault+0x1f8/0xe5c

  [<ffffffff80066b9a>] do_page_fault+0x4cb/0x830

  [<ffffffff8005dde9>] error_exit+0x0/0x84

 

Thanks!

 

Venkat