Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-01-22 13:25:50


On Jan 22, 2007, at 11:53 AM, Axel Schweiger wrote:

> Thanks for your reply. Yes, POP 1.2 is dead w.r.t. development, but our
> application still uses it. The 1.2 to 2.0 transition
> involves a lot of physical differences, and for a while at least we are
> stuck with 1.2.

Gotcha.

> Can't say if there is a bug that was fixed, since there was a lot of
> re-engineering going to 2.0. But I do know that POP 1.2 works
> fine with the MPICH MPI implementation. Wouldn't you expect that bad
> parameters would produce the same error with MPICH?

Usually, but not always. The exceptions mostly involve C codes, but
it can happen in Fortran as well. Specifically, different run-time
behaviors of MPI implementations can sometimes result in a code that
runs under one MPI and not under another, typically (but not always)
because the code makes some assumptions or violates the standard in
some way.

I see in OMPI's MPI_CART_SHIFT, we only return the "bad communicator"
error if we get an invalid communicator or an intercommunicator. Are
you familiar with the POP code at all to be able to dive into it to
see where the problem is actually occurring?

> Thanks much
> Axel
> Jeff Squyres wrote:
>> Looking at the web page for POP
>> (http://climate.lanl.gov/Models/POP/index.shtml), it looks like POP 1.2
>> is pretty ancient. I gather from your text that later versions work ok
>> ("POP 2").
>>
>> My first guess -- knowing nothing about the POP code itself -- is
>> that there is a bug in the POP 1.2 code such that it is passing a bad
>> parameter to MPI_CART_SHIFT, and that later versions (POP 2) fixed
>> the problem.
>>
>> Do you know if this is the case?
>>
>>
>> On Jan 19, 2007, at 8:06 PM, Axel Schweiger wrote:
>>
>>
>>> I am having a problem running POP 1.2 (Parallel Ocean Program) with
>>> Open MPI version 1.1.2 compiled with PGI 6.2-4 on RH EL-4 Update 4
>>> (configure result attached).
>>>
>>> The error is as follows:
>>>
>>> mpirun -v -np 4 -machinefile node18.dat pop
>>> [node18:11220] *** An error occurred in MPI_Cart_shift
>>> [node18:11221] *** An error occurred in MPI_Cart_shift
>>> [node18:11221] *** on communicator MPI_COMM_WORLD
>>> [node18:11221] *** MPI_ERR_COMM: invalid communicator
>>> [node18:11221] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> [node18:11220] *** on communicator MPI_COMM_WORLD
>>> [node18:11220] *** MPI_ERR_COMM: invalid communicator
>>> [node18:11220] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> 3 additional processes aborted (not shown)
>>>
>>> The application runs fine with MPICH 1.2.6 and other applications
>>> (POP 2) run fine with OpenMPI
>>>
>>> Any suggestions?
>>>
>>> Thanks
>>>
>>> <configure_pgi_ext.log.gz>
>>> <axel.vcf>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems