Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] IBCM error
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-07-13 07:57:53

Brad and I did some scale testing of IBCM and occasionally saw this
error. It seemed to happen more often as we increased the number of
processes on a single node.

I talked to Sean Hefty about it, but we never figured out a definitive
cause or solution. My best guess is that there is something wonky
about multiple processes simultaneously interacting with the IBCM
kernel driver from userspace; but I don't know jack about kernel
stuff, so that's a total SWAG.

Thanks for reminding me of this issue; I admit that I had forgotten
about it. :-( Pasha -- should IBCM not be the default?
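
For anyone reading the quoted log below: errno=22 is EINVAL ("Invalid
argument") on Linux, which only says that the listen request was rejected,
not why. Below is a minimal sketch for decoding such values, assuming a
Linux/glibc system. format_errno is a hypothetical helper name used for
illustration; it is not part of Open MPI or libibcm.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an Open MPI or libibcm function): write a
 * readable form of an errno value, e.g. the 22 from the log, into buf.
 * Returns the value returned by snprintf(): the number of characters
 * that the full message needs, excluding the terminating NUL.
 */
int format_errno(int err, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    /* strerror() maps the numeric code to the C library's message text. */
    return snprintf(buf, len, "errno=%d (%s)", err, strerror(err));
}
```

Calling format_errno(22, buf, sizeof buf) and printing buf shows the
message text for the code reported by ib_cm_listen.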

On Jul 13, 2008, at 7:08 AM, Lenny Verkhovsky wrote:

> Hi,
> I am getting this error sometimes.
> /home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 100 -hostfile /home/USERS/lenny/TESTS/COMPILERS/hostfile /home/USERS/lenny/TESTS/
> [witch24][[32428,1],96][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c:769:ibcm_component_query] failed to ib_cm_listen 10 times: rc=-1, errno=22
> Hello world! I'm 0 of 100 on witch2
> Best Regards
> Lenny.

Jeff Squyres
Cisco Systems