Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] torque pbs behaviour...
From: Klymak Jody (jklymak_at_[hidden])
Date: 2009-08-11 09:11:12


On 10-Aug-09, at 8:03 PM, Ralph Castain wrote:

> Interesting! Well, I always make sure I have my personal OMPI build
> before any system stuff, and I work exclusively on Mac OS-X:

I am still finding this very mysterious....

I have removed all the OS X-supplied libraries, recompiled and
installed Open MPI 1.3.3, and I am *still* getting these warnings
when running ompi_info:

[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_dash_host" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_hostfile" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_localhost" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_xgrid" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: rcache "mca_rcache_rb" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored

So, I guess I'm not clear how the library can be an issue...
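
If it helps narrow things down, something like the following should
show where ompi_info is coming from and what is actually sitting in
the component directories (paths assume the /usr/local/openmpi prefix
from above; as far as I understand, ~/.openmpi/components is the other
default place old DSOs could be hiding):

  # which ompi_info is first on the PATH, and what does it link against?
  which ompi_info
  otool -L /usr/local/openmpi/bin/ompi_info

  # components installed by the new 1.3.3 build
  ls /usr/local/openmpi/lib/openmpi/

  # any leftover per-user components (the other default component path, I believe)
  ls ~/.openmpi/components/ 2>/dev/null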

I *do* get another error when running the gcm that I do not get with
simpler jobs - hopefully this helps explain things:

[xserve03.local][[61029,1],4][btl_tcp_endpoint.c:486:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[61029,1],3]

The mitgcmuv processes are running on the xserves and using
considerable resources! They open STDERR/STDOUT, but nothing is
flushed into them, including the few print statements I've put in
before and after MPI_INIT (as Ralph suggested).

On 11-Aug-09, at 4:17 AM, Ashley Pittman wrote:

> If you suspect a hang then you can use the command orte-ps (on the
> node where the mpirun is running) and it should show you your job.
> This will tell you if the job is started and still running or if
> there was a problem launching.

/usr/local/openmpi/bin/orte-ps
[saturna.cluster:51840] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:51840] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored

Information from mpirun [61029,0]
-----------------------------------

      JobID |   State | Slots | Num Procs |
 ------------------------------------------
  [61029,1] | Running |     2 |        16 |

         Process Name | ORTE Name      | Local Rank |   PID | Node     | State   |
        ---------------------------------------------------------------------------
    ../build/mitgcmuv | [[61029,1],0]  |          0 | 40206 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],1]  |          0 | 40005 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],2]  |          1 | 40207 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],3]  |          1 | 40006 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],4]  |          2 | 40208 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],5]  |          2 | 40007 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],6]  |          3 | 40209 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],7]  |          3 | 40008 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],8]  |          4 | 40210 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],9]  |          4 | 40009 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],10] |          5 | 40211 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],11] |          5 | 40010 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],12] |          6 | 40212 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],13] |          6 | 40011 | xserve04 | Running |
    ../build/mitgcmuv | [[61029,1],14] |          7 | 40213 | xserve03 | Running |
    ../build/mitgcmuv | [[61029,1],15] |          7 | 40012 | xserve04 | Running |
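
Given those PIDs, I suppose one way to see where a rank is actually
sitting (e.g. whether it is still stuck inside MPI_INIT) would be to
sample or backtrace one of them directly on the compute node. A rough
sketch, using the rank-0 PID from the table above ('sample' is the OS X
command-line profiler, and gdb would have to be present on the xserves):

  # sample the rank-0 process on xserve03 for 5 seconds and save the call stacks
  ssh xserve03 'sample 40206 5 -file /tmp/mitgcmuv-40206.txt'

  # or grab a one-shot backtrace of every thread with gdb
  ssh xserve03 'gdb -p 40206 -batch -ex "thread apply all bt"'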

Thanks, Jody