Open MPI User's Mailing List Archives

Subject: [OMPI users] CMAQ crashes with OpenMPI
From: Matthew Russell (mrussel2_at_[hidden])
Date: 2011-08-09 16:49:41


Hi,

I'm trying to run CMAQ - an air quality model developed by the US EPA - on a
Mac (Lion) using OpenMPI (1.5.3) installed with MacPorts.

I am able to run CMAQ serially, and am able to run small programs that
use OpenMPI.

I set the OpenMPI environment variables to use pgf90/pgcc (10.9) as my
compilers. I'm using PGI because some of the code I need to build is
Fortran 77 ( *sigh* ), and for some other reasons.

The error I get is:

/opt/local/lib/openmpi/bin/mpirun -v \
    -machinefile /Users/matt/cmaq/darwin11/scripts/cctm/machines8 -np 2 \
    /Users/matt/cmaq/darwin11/scripts/cctm/CCTM_e1a_Darwin11_x86_64pg
[pontus:72547] *** Process received signal ***
[pontus:72547] Signal: Segmentation fault: 11 (11)
[pontus:72547] Signal code: Address not mapped (1)
[pontus:72547] Failing at address: 0x0
[pontus:72547] [ 0] 2 libsystem_c.dylib 0x00007fff91065cfa _sigtramp + 26
[pontus:72547] [ 1] 3 ??? 0x00007fff5fbe58ab 0x0 + 140734799698091
[pontus:72547] [ 2] 4 CCTM_e1a_Darwin11_x86_64pg 0x000000010003c89b distr_env_ + 971
[pontus:72547] [ 3] 5 CCTM_e1a_Darwin11_x86_64pg 0x000000010003cbe5 par_init_ + 565
[pontus:72547] [ 4] 6 CCTM_e1a_Darwin11_x86_64pg 0x0000000100032e1b MAIN_ + 219
[pontus:72547] [ 5] 7 CCTM_e1a_Darwin11_x86_64pg 0x00000001000016f6 main + 70
[pontus:72547] [ 6] 8 CCTM_e1a_Darwin11_x86_64pg 0x000000010000163a _start + 248
[pontus:72547] [ 7] 9 CCTM_e1a_Darwin11_x86_64pg 0x0000000100001541 start + 33
[pontus:72547] [ 8] 10 ??? 0x0000000000000001 0x0 + 1
[pontus:72547] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 72547 on node
pontus.cee.carleton.ca exited on signal 11 (Segmentation fault: 11).
--------------------------------------------------------------------------
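
About all I can get out of the trace myself is the call chain: the frames
are listed innermost first, so the crash is in distr_env_, called from
par_init_, called from MAIN_. Below is a rough Python sketch that pulls that
chain out of the text. The regex and the call_chain helper are my own guess
at the frame layout shown above, not an Open MPI tool, and only the first
few frames are pasted in as sample input.

```python
import re

# One frame looks like "[ 2] 4 <module> 0x<address> <symbol> + <offset>".
# \s+ also matches a line break, so frames wrapped by the mailer still parse.
FRAME = re.compile(
    r"\[\s*(\d+)\]\s+\d+\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\S+)\s+\+\s+(\d+)"
)


def call_chain(trace):
    """Return symbol names from the outermost caller to the faulting routine."""
    frames = FRAME.findall(trace)
    # Drop the signal trampoline and frames whose module could not be resolved.
    named = [sym for _, module, _, sym, _ in frames
             if module != "???" and sym != "_sigtramp"]
    return named[::-1]  # the trace lists the innermost frame first


TRACE = """\
[pontus:72547] [ 0] 2 libsystem_c.dylib
0x00007fff91065cfa _sigtramp + 26
[pontus:72547] [ 1] 3 ???
0x00007fff5fbe58ab 0x0 + 140734799698091
[pontus:72547] [ 2] 4 CCTM_e1a_Darwin11_x86_64pg
 0x000000010003c89b distr_env_ + 971
[pontus:72547] [ 3] 5 CCTM_e1a_Darwin11_x86_64pg
 0x000000010003cbe5 par_init_ + 565
[pontus:72547] [ 4] 6 CCTM_e1a_Darwin11_x86_64pg
 0x0000000100032e1b MAIN_ + 219
"""

print(" -> ".join(call_chain(TRACE)))
```

That only tells me where the fault happened (a null address inside
distr_env_), not why.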

I don't expect anyone to know the solution from this brief error message;
however, I was wondering whether anyone has insight into how I might debug
this. I am too new to both OpenMPI and CMAQ to get much out of this
traceback.

I'm told by others in my research group that CMAQ with OpenMPI on Linux
works fine, and that the error I'm getting is very similar to the one
others got when trying this on a Mac (Snow Leopard) with ifort, before they
gave up.

OpenMPI was configured with:
configure.args --sysconfdir=${prefix}/etc/${name} \
                --includedir=${prefix}/include/${name} \
                --bindir=${prefix}/lib/${name}/bin \
                --mandir=${prefix}/share/man \
                --with-memory-manager=none

# enable build on Lion
if {${os.major} >= 11} {
        configure.compiler gcc-4.2
}

The --with-memory-manager=none option is there because I saw it fix
potentially similar problems in other postings to this mailing list. It
didn't make a difference, though.

Thanks!