Specifically, jobs are never being handed to other nodes. Running
mpirun -mca btl openib,self -np 20 /bin/hostname
returns the same hostname 20 times, even if I specify -bynode as an
argument. The nodes are Debian systems with two dual-core processors
each, and they have the most recent OpenFabrics user and kernel packages
from openfabrics.org installed. I'm running a 2.6.18 kernel. My subnet
manager is on my switch, which is a Cisco SFS 7000. Also, as I
mentioned earlier, everything is OK when I am using IPoIB, but switching
to verbs is giving me a lot of problems.
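In case it helps anyone reproduce this, here is a minimal sketch for tallying the hostnames printed by that test run, so the "same hostname 20 times" symptom shows up as a single count. The function name tally_hosts and the script name tally_hosts.py are hypothetical and not part of Open MPI; the script only assumes one hostname per line of output.

```python
from collections import Counter
import sys


def tally_hosts(output: str) -> Counter:
    """Count how often each hostname appears, one hostname per line."""
    # Blank lines are skipped; surrounding whitespace is ignored.
    return Counter(line.strip() for line in output.splitlines() if line.strip())


if __name__ == "__main__":
    # Read the mpirun output from stdin and print a count per hostname.
    counts = tally_hosts(sys.stdin.read())
    for host, n in counts.most_common():
        print(f"{n:3d} {host}")
    if len(counts) == 1:
        print("all ranks ran on a single node")
```

It could be used as: mpirun -mca btl openib,self -np 20 /bin/hostname | python tally_hosts.py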
Output from ibv_devinfo:
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
With the obvious exception of node_guid and sys_image_guid, this is
the same across all of the nodes. I'm also attaching config.log and the
output from ompi_info --all.
ulimit -l reports unlimited.
Jeff Squyres wrote:
> Can you be more specific about what problems you're seeing?
> Note that the rdma mpool is the plugin that is used by the openib btl;
> there is no openib mpool (there used to be, but its functionality got
> generalized and put into the "rdma" plugin).
> On Feb 19, 2008, at 12:35 PM, jessie puls wrote:
>> jessie puls wrote:
>>> Hi all,
>>> I'm having problems getting openmpi to work correctly using verbs on
>>> some systems. It's been working using openib for quite some time, but
>>> I need to get it working using verbs for some research I'm doing.
>> This would make a whole lot more sense if I'd typed it correctly.
>> It's been working using ipoib.
>>> All seems to be good on the openib side of things. ibv_devinfo and
>>> ibv_devices return device information, and the devices are listed as
>>> active on each node. Also, all hosts are visible to each other
>>> (ibhosts shows a full list).
>>> The problem I see with openmpi is that I have the openib btl, but not
>>> the openib mpool, and when looking at the contents of ompi/mca/mpool/
>>> I don't see openib there (sm and rdma are both listed, and ompi_info
>>> shows they've been included in the build). Any help would be
>>> appreciated.
>>> users mailing list