On 15/12/2010 18:45, Ralph Castain wrote:
> It looks like all the messages are flowing within a single job (all
> three processes mentioned in the error have the same identifier). Only
> possibility I can think of is that somehow you are reusing ports - is
> it possible your system doesn't have enough ports to support all the
It seems every worker node has a range of almost 30k ports available:
> ssh r33i0n0 cat /proc/sys/net/ipv4/ip_local_port_range
AFAIK this is the only way I can get information about it.
Are these 30k ports enough?
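
As a rough aid for the question above, here is a minimal Python sketch that turns the content of ip_local_port_range into a port count. The function names and the sample values are hypothetical, chosen only for illustration; the real values are whatever the command above prints on each node.

```python
def parse_port_range(text):
    """Parse the two whitespace-separated integers of ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def port_count(low, high):
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


if __name__ == "__main__":
    # Hypothetical sample content, NOT the values read from the cluster.
    sample = "32768\t61000\n"
    low, high = parse_port_range(sample)
    print(low, high, port_count(low, high))
```

On a node, the same functions could be fed the file content directly, e.g. open("/proc/sys/net/ipv4/ip_local_port_range").read().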
The question is: does OpenMPI open ports from every node towards every
other node?
If so, I could understand why it runs out of ports when
I increase the number of nodes.
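
To make that reasoning concrete, here is a back-of-the-envelope Python sketch comparing an all-to-all connection pattern with a neighbours-only one. It is only arithmetic under stated assumptions (one connection per peer, a fixed number of processes per node); it does not describe what OpenMPI actually does, and all names and numbers are hypothetical.

```python
def peers_full_mesh(n_procs):
    """Peers per process if every process connects to every other one."""
    return n_procs - 1


def connections_per_node(procs_per_node, peers_per_proc):
    """Connections held by one node, assuming one connection per peer."""
    return procs_per_node * peers_per_proc


def fits_in_range(connections, available_ports):
    """True if the connection count stays within the available port count."""
    return connections <= available_ports


if __name__ == "__main__":
    # Hypothetical job: 4096 processes, 8 per node, ~28k ports per node.
    n_procs, procs_per_node, ports = 4096, 8, 28233
    mesh = connections_per_node(procs_per_node, peers_full_mesh(n_procs))
    # Neighbours-only pattern: 8 neighbours plus rank 0, i.e. 9 peers.
    sparse = connections_per_node(procs_per_node, 9)
    print(mesh, fits_in_range(mesh, ports))
    print(sparse, fits_in_range(sparse, ports))
```

Under these assumptions the all-to-all count grows linearly with the job size, while the neighbours-only count stays constant per node.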
But is there a way (an MCA parameter?) to prevent OpenMPI from opening
so many ports?
Indeed, apart from the rank 0 node, every MPI process needs to
communicate with ONLY
its 8 (nearest) neighbour nodes. So there should be a switch somewhere
to open a port ONLY when needed, but I did not find it in the ompi_info output.
Which one is it?
Thanks, Best, G.