Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] CMAQ crashes with OpenMPI
From: Matthew Russell (mrussel2_at_[hidden])
Date: 2011-08-10 11:55:23


Ack, that's a very good point. I made sure to compile all my
other dependencies (NetCDF, IOAPI) with PGI, but I overlooked that one.
I'll admit that even after years of working with these models, I'm still
never sure when I can and can't mix binaries compiled with different
compilers. I used certain flags that should make my PGI binaries compatible
with GNU, but I'm never completely sure.

It was the OpenMPI version that came with MacPorts; I avoided the default one
on OS X because it does not include a Fortran compiler.
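
For anyone checking the same thing, here is a minimal sketch of how one might see which Open MPI install is first on the PATH and which compilers its wrappers would call. It assumes the usual Open MPI wrapper names (mpicc, mpif90), the wrappers' --showme option, and the OMPI_CC / OMPI_FC override variables; the report_tool helper is just a name made up for this example.

```shell
#!/bin/sh
# Sketch only: report which MPI tools are picked up from PATH and what
# the Open MPI wrapper compilers would invoke. report_tool is a
# hypothetical helper, not part of Open MPI.

report_tool() {
    # Print the resolved location of a command, or a note if it is missing.
    tool="$1"
    if resolved=$(command -v "$tool" 2>/dev/null); then
        echo "$tool: $resolved"
    else
        echo "$tool: not found on PATH"
    fi
}

# Which binaries win on PATH (e.g. MacPorts vs. the system install)?
for t in mpirun mpicc mpif90; do
    report_tool "$t"
done

# Wrapper overrides, if any were exported in this shell.
echo "OMPI_CC=${OMPI_CC:-unset}"
echo "OMPI_FC=${OMPI_FC:-unset}"

# --showme prints the underlying compiler command without running it.
for w in mpicc mpif90; do
    if command -v "$w" >/dev/null 2>&1; then
        echo "$w wraps: $("$w" --showme 2>&1)"
    fi
done
```

If the "wraps:" lines show gcc or gfortran while the rest of the stack was built with pgcc/pgf90, that would be the mismatch Doug is pointing at.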

I'll try building OpenMPI from source again. I had trouble with orte-clean*,
but I can probably get that working or consult the compilation mailing list.

Thanks for your input!

* Undefined symbols for architecture x86_64:
  "_orte_odls", referenced from:
      _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
ld: symbol(s) not found for architecture x86_64

On Tue, Aug 9, 2011 at 5:00 PM, Doug Reeder <dlr1_at_[hidden]> wrote:

> Matt,
>
> Are you sure you are building against your macports version of openmpi and
> not the one that ships w/ lion? In the traceback, are items 4-9, the ones
> that end w/ x86_64pg, from the pgi compiler? You said you are using pgf90
> and pgcc, but in the configure input it looks like gcc is being used on lion.
>
> Doug Reeder
> On Aug 9, 2011, at 1:49 PM, Matthew Russell wrote:
>
>
> Hi,
>
> I'm trying to run CMAQ - an air quality model developed by the US EPA - on
> a Mac (Lion) using OpenMPI (1.5.3) installed with MacPorts.
>
> I am able to run CMAQ in parallel, and am able to run small programs that
> use OpenMPI.
>
> I set the OpenMPI environment variables to use pgf90/pgcc (10.9) as my
> compiler. Using PGI because some of the code I need to build is fortran 77
> ( *sigh* ), and for some other reasons.
>
> The error I get is:
>
> /opt/local/lib/openmpi/bin/mpirun -v -machinefile
> /Users/matt/cmaq/darwin11/scripts/cctm/machines8 -np 2
> /Users/matt/cmaq/darwin11/scripts/cctm/CCTM_e1a_Darwin11_x86_64pg
> [pontus:72547] *** Process received signal ***
> [pontus:72547] Signal: Segmentation fault: 11 (11)
> [pontus:72547] Signal code: Address not mapped (1)
> [pontus:72547] Failing at address: 0x0
> [pontus:72547] [ 0] 2 libsystem_c.dylib 0x00007fff91065cfa _sigtramp + 26
> [pontus:72547] [ 1] 3 ??? 0x00007fff5fbe58ab 0x0 + 140734799698091
> [pontus:72547] [ 2] 4 CCTM_e1a_Darwin11_x86_64pg 0x000000010003c89b distr_env_ + 971
> [pontus:72547] [ 3] 5 CCTM_e1a_Darwin11_x86_64pg 0x000000010003cbe5 par_init_ + 565
> [pontus:72547] [ 4] 6 CCTM_e1a_Darwin11_x86_64pg 0x0000000100032e1b MAIN_ + 219
> [pontus:72547] [ 5] 7 CCTM_e1a_Darwin11_x86_64pg 0x00000001000016f6 main + 70
> [pontus:72547] [ 6] 8 CCTM_e1a_Darwin11_x86_64pg 0x000000010000163a _start + 248
> [pontus:72547] [ 7] 9 CCTM_e1a_Darwin11_x86_64pg 0x0000000100001541 start + 33
> [pontus:72547] [ 8] 10 ??? 0x0000000000000001 0x0 + 1
> [pontus:72547] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 72547 on node
> pontus.cee.carleton.ca exited on signal 11 (Segmentation fault: 11).
> --------------------------------------------------------------------------
>
> I don't expect anyone to know the solution from this brief error message,
> however I was wondering if anyone has insight on how I might debug this? I
> am too new to both OpenMPI and CMAQ to be served that well from this
> traceback.
>
> I'm told by others in my research group that CMAQ with OpenMPI on Linux
> works fine, and that the error I'm getting is very similar to the error
> others got when trying this on a Mac (Snow Leopard) with ifort.. before they
> gave up...
>
> OpenMPI was configured with:
> configure.args --sysconfdir=${prefix}/etc/${name} \
> --includedir=${prefix}/include/${name} \
> --bindir=${prefix}/lib/${name}/bin \
> --mandir=${prefix}/share/man \
> --with-memory-manager=none
>
> # enable build on Lion
> if {${os.major} >= 11} {
> configure.compiler gcc-4.2
> }
>
> The --with-memory-manager is there because I saw it fix potentially
> similar problems in other postings to this Mailing list. It didn't make a
> difference though.
>
> Thanks!
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users