Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] CMAQ crashes with OpenMPI
From: Matthew Russell (mrussel2_at_[hidden])
Date: 2011-08-10 11:55:23


Ack, that's a very good point. I made sure to compile all my
other dependencies (NetCDF, IOAPI) with PGI, but I overlooked that one.
 I'll admit that even after years of working with these models, I'm still
never sure when I can and can't mix binaries compiled with different
compilers. I used certain flags that should make my PGI binaries compatible
with GNU, but I'm never completely sure.
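
A quick way to see which compiler an Open MPI install is actually tied to is to ask the wrapper compilers themselves. The sketch below relies on the wrappers' `--showme:command` option; the function name `show_wrapper_compiler` is made up for this example, and the output depends entirely on the local install.

```shell
# Print the underlying compiler that an Open MPI wrapper invokes.
# show_wrapper_compiler is a hypothetical helper name, not part of Open MPI.
show_wrapper_compiler() {
  wrapper=$1
  if command -v "$wrapper" >/dev/null 2>&1; then
    # --showme:command asks the wrapper for its back-end compiler command
    printf '%s -> %s\n' "$wrapper" "$("$wrapper" --showme:command)"
  else
    printf '%s -> not found on PATH\n' "$wrapper"
  fi
}

# Check the C and Fortran wrappers that a CMAQ build would likely use.
for w in mpicc mpif77 mpif90; do
  show_wrapper_compiler "$w"
done
```

If the wrappers report gcc or gfortran while the model itself is built with pgcc/pgf90, that is the kind of compiler mix discussed above.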

It was the OpenMPI version that came with MacPorts; I avoided the default one
on OS X because it does not include a Fortran compiler.

I'll try building OpenMPI from source again. I had trouble with orte-clean*,
but I can probably get that working or consult the compilation mailing list.
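
As a rough sketch, a source build that keeps everything on PGI would pass the compilers to configure explicitly. The helper below only prints the command rather than running it. The install prefix and the helper name are placeholders, `pgf77` is an assumption (only pgf90 and pgcc are mentioned in this thread), and nothing here addresses the orte-clean link error.

```shell
# Dry run: print a configure command that selects the PGI compilers.
# ompi_pgi_configure_cmd and the prefix are hypothetical; pgf77 is assumed.
ompi_pgi_configure_cmd() {
  prefix=$1
  printf './configure --prefix=%s CC=pgcc F77=pgf77 FC=pgf90\n' "$prefix"
}

# Show the command for an example install location under $HOME.
ompi_pgi_configure_cmd "$HOME/openmpi-pgi"
```

The C++ compiler is left at configure's default here, so a C++ dependency could still pull in a different toolchain.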

Thanks for your input!

* Undefined symbols for architecture x86_64:
  "_orte_odls", referenced from:
      _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
ld: symbol(s) not found for architecture x86_64

On Tue, Aug 9, 2011 at 5:00 PM, Doug Reeder <dlr1_at_[hidden]> wrote:

> Matt,
>
> Are you sure you are building against your MacPorts version of Open MPI and
> not the one that ships w/ Lion? In the traceback, items 4-9 end w/
> x86_64pg, which comes from the PGI compiler. You said you are using pgf90
> and pgcc, but in the configure input it looks like gcc is being used on Lion.
>
> Doug Reeder
> On Aug 9, 2011, at 1:49 PM, Matthew Russell wrote:
>
>
> Hi,
>
> I'm trying to run CMAQ - an air quality model developed by the US EPA - on
> a Mac (Lion) using OpenMPI (1.5.3) installed with MacPorts.
>
> I am able to run CMAQ in parallel, and am able to run small programs that
> use OpenMPI.
>
> I set the OpenMPI environment variables to use pgf90/pgcc (10.9) as my
> compilers. I'm using PGI because some of the code I need to build is
> Fortran 77 ( *sigh* ), and for some other reasons.
>
> The error I get is:
>
> /opt/local/lib/openmpi/bin/mpirun -v -machinefile
> /Users/matt/cmaq/darwin11/scripts/cctm/machines8 -np 2
> /Users/matt/cmaq/darwin11/scripts/cctm/CCTM_e1a_Darwin11_x86_64pg
> [pontus:72547] *** Process received signal ***
> [pontus:72547] Signal: Segmentation fault: 11 (11)
> [pontus:72547] Signal code: Address not mapped (1)
> [pontus:72547] Failing at address: 0x0
> [pontus:72547] [ 0] 2 libsystem_c.dylib 0x00007fff91065cfa _sigtramp + 26
> [pontus:72547] [ 1] 3 ??? 0x00007fff5fbe58ab 0x0 + 140734799698091
> [pontus:72547] [ 2] 4 CCTM_e1a_Darwin11_x86_64pg 0x000000010003c89b distr_env_ + 971
> [pontus:72547] [ 3] 5 CCTM_e1a_Darwin11_x86_64pg 0x000000010003cbe5 par_init_ + 565
> [pontus:72547] [ 4] 6 CCTM_e1a_Darwin11_x86_64pg 0x0000000100032e1b MAIN_ + 219
> [pontus:72547] [ 5] 7 CCTM_e1a_Darwin11_x86_64pg 0x00000001000016f6 main + 70
> [pontus:72547] [ 6] 8 CCTM_e1a_Darwin11_x86_64pg 0x000000010000163a _start + 248
> [pontus:72547] [ 7] 9 CCTM_e1a_Darwin11_x86_64pg 0x0000000100001541 start + 33
> [pontus:72547] [ 8] 10 ??? 0x0000000000000001 0x0 + 1
> [pontus:72547] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 72547 on node
> pontus.cee.carleton.ca exited on signal 11 (Segmentation fault: 11).
> --------------------------------------------------------------------------
>
> I don't expect anyone to know the solution from this brief error message;
> however, I was wondering whether anyone has insight into how I might debug
> this. I am too new to both OpenMPI and CMAQ to get much out of this
> traceback.
>
> I'm told by others in my research group that CMAQ with OpenMPI on Linux
> works fine, and that the error I'm getting is very similar to the error
> others got when trying this on a Mac (Snow Leopard) with ifort.. before they
> gave up...
>
> OpenMPI was configured with:
> configure.args --sysconfdir=${prefix}/etc/${name} \
> --includedir=${prefix}/include/${name} \
> --bindir=${prefix}/lib/${name}/bin \
> --mandir=${prefix}/share/man \
> --with-memory-manager=none
>
> # enable build on Lion
> if {${os.major} >= 11} {
> configure.compiler gcc-4.2
> }
>
> The --with-memory-manager=none option is there because I saw it fix
> potentially similar problems in other postings to this mailing list. It
> didn't make a difference, though.
>
> Thanks!
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>