While getting openmpi up and running on a new cluster, I ran into an
error about both of my IB switches being set to the same subnet-gid.
Looking around on the hosts that run the opensm daemon, I confirmed
this in the /var/log/osm-ib[0-1].log files. I had given up trying to
find it with ibstat, which showed the values to be different, at least
in the second part of the GID.
Before I look into how to actually change this value for the opensm
daemon, I have a question.
Since both of my hosts are connected to each switch, how do I instruct
openmpi to use port0? I'm trying to use port0 as the MPI network and
port1 as the storage network. Is there something I need to add
somewhere to force connections only to some default-subnet-gid?